What does HIT Scraper, Turkmaster, or your other favorite scripts lack?

Discussion in 'Scripts & Resources' started by thetroll, Jun 29, 2017.

  1. thetroll

    thetroll New Member

    Messages:
    9
    Ratings:
    +7 / 0 / -0
    I'm working on a new sort of project, which I plan to eventually share with the community, but I'm also looking for some ideas. My idea draws on the ideas found in scripts such as HIT Scraper and Turkmaster, but I'm also looking for other ideas to enhance the worker experience.

    A short list of examples:

    - Better CAPTCHA notifications in Turkmaster - I cannot tell you how many times I've had PANDAs running in TM, and an hour has passed before I realize that my PANDAs aren't running because I hit a CAPTCHA 45 minutes ago. TM doesn't always alert you when you hit a CAPTCHA (for example, when the active tab is not on MTurk).

    - Auto-clearing watchers from Turkmaster - If you set a lot of watchers in TM, and you don't keep up with them (clearing them when they're caught, clearing at the end of each day, etc.), clearing all those watchers can be a pain.

    - Better filtering for HIT Scraper - HIT Scraper is great, and there are very few things I can think of to improve on it. However, one thing is the filtering of HITs like private error HITs for surveys. How many times have you heard HS ding, and gone to grab that new HIT, only to find out it was just a HIT uploaded for 'Worker with AXXXXXXXXX only'?
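
    A minimal sketch of the filtering idea in the last item, assuming a HIT's title and description are available as plain text. The names `is_restricted_hit` and `filter_hits` are hypothetical, and the worker ID pattern (an "A" followed by 9 to 14 uppercase letters or digits) is a guess rather than anything HIT Scraper is known to do.

```python
import re

# Assumption: a worker ID starts with "A" and continues with uppercase
# letters and digits. The length range is a guess, not a documented rule.
WORKER_ID = re.compile(r"\bA[0-9A-Z]{9,14}\b")

# Assumption: private HITs also use wording such as "only" or "private".
KEYWORDS = re.compile(r"\b(only|private)\b", re.IGNORECASE)


def is_restricted_hit(title: str, description: str = "") -> bool:
    """Return True if the text looks like a HIT meant for one worker."""
    text = f"{title} {description}"
    return bool(WORKER_ID.search(text)) and bool(KEYWORDS.search(text))


def filter_hits(hits):
    """Drop HITs that look private; each HIT is a dict with a 'title' key."""
    return [
        hit for hit in hits
        if not is_restricted_hit(hit["title"], hit.get("description", ""))
    ]
```

    A long all-caps word starting with "A" could trigger a false positive, so a real filter would probably want a per-user whitelist on top of this.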


    I've only included examples of HIT Scraper and Turkmaster because they're a couple of the only scripts I use that really help with finding work. I've already begun implementing a lot of the ideas I have into my project, but this is where you guys come in.

    If you have any ideas on how to improve the scripts that you use, I would really love to hear them.

    P.S. I've avoided adding too much information about my project for now, so as not to build up any unnecessary hype. When the time is right, I'll reveal more information, but for right now, I'm just looking to get ideas and feedback from my fellow MTurkers, to see how you feel your MTurk or script experience could be improved. :)

    P.P.S. I'm posting this on various other forums, and Reddit. I'm looking for a wide variety of problems/solutions, so if you see this elsewhere, that's why. ;)
     
  2. Randomacts

    Randomacts Well-Known Member

    Messages:
    23,700
    Gender:
    Male
    Ratings:
    +40,676 / 78 / -6
    I'm going to go ahead and pretend that I don't believe that your name is on the nose for a few seconds.

    Hit Scraper and Turkmaster are basically dead and most people are using http://mturksuite.com/. Some people use overwatch and pandacrazy as well.

    lol @Kadauchi
     
    • LOL x 1
  3. thetroll

    thetroll New Member

    Messages:
    9
    Ratings:
    +7 / 0 / -0
    I'm not very social in the MTurk community, and so, while I've been off in my own little world, I'm seeing that a lot has passed me by. And here I thought my project was going to be something worthwhile lol

    Anyways, I've started looking into these scripts, and aside from recreating and building upon some of the features in those scripts (seriously, kudos to the creators of MTurk Suite and PandaCrazy, I like some of the new features we haven't seen before), I've actually found a pretty efficient method of finding suggestions for my project.

    Thanks for responding though, I can now see why I've gotten so few responses.
     
  4. Randomacts

    Randomacts Well-Known Member

    Messages:
    23,700
    Gender:
    Male
    Ratings:
    +40,676 / 78 / -6
    I mean you are free to make a new script and the stuff we have now isn't perfect. The main issue you likely are having is your name and people are going to have a hard time taking you seriously (if you even are being serious here) with that name.

    /r/mturk seems to still love hitscraper and turkmaster for whatever reason so what you have would be an upgrade for them most likely.
     
    • Useful / Informative x 1
  5. thetroll

    thetroll New Member

    Messages:
    9
    Ratings:
    +7 / 0 / -0
    To be honest, the name is more of a nickname I've acquired from IRL friends, than an actual indicator of my online behavior, though I completely understand your point. I am actually being serious, though; I've been working at this for months now, as a personal project to help step up my game, but I've been feeling generous and wanting to give back to the community (and maybe, just maybe, give people a reason to want to actually talk to me, since I rarely seem to get the time of day otherwise, whether I'm posting under an account named 'thetroll', or under a not-so-silly name).

    As for HS and TM, they were my initial draws of inspiration for this project, but now that I've discovered these other tools, I'm re-thinking how this project is going to go, and it could be months before it's finished enough to release publicly (what with the constant learning, script-breaking bugs, tons of new feature ideas to implement, and of course, work/family life, to top it off). I'm sure once it's finally out there, anyone who has used HS or TM will be able to clearly see the influence they've had, but in an effort to stay even marginally relevant in today's Turking world, it looks like I've got quite a mountain of features and coding to climb.

    Thanks for your time, Randomacts. Since you're likely right about my username, coupled with the fact that this idea is probably not going to be taken seriously as of right now, I'll just let this thread peacefully wither, and perhaps I'll return in the coming months with news of the project when it's closer to the alpha/beta testing phase(s) (and perhaps under a new username, if it'll really make that much of a difference).

    On a completely unrelated note: I was actually on /r/MTurk, browsing, and stopped on a comment you left there before randomly jumping on here, not even realizing that the comment I stopped on was posted by the very same person that responded to my thread on here. Just a funny coincidence.
     
    • Like x 2
  6. Randomacts

    Randomacts Well-Known Member

    Messages:
    23,700
    Gender:
    Male
    Ratings:
    +40,676 / 78 / -6
    One weird quirk about this forum is that people don't actually venture outside of the daily thread. Or at least most people don't so that severely limits the amount of people that you will see that will respond. It is a thing that a bunch of us have been trying to get people to move past but :dunno: people are stuck in their ways I guess hah.

    Now I guess I should quickly break down why the meta of mturk has moved past HS and TM as viable ways to work, imho at least. And this does not include the detailed performance difference compared to MTS (Hit finder / Hit Catcher) and Pandacrazy.

    HS, while it has a lot of searching options, does not have a log for you to search through old hits and get the panda links from them. It also only runs on www.mturk.

    TM, as far as I am aware, actually has a memory leak in the most recent version on up-to-date browsers (although apparently some older version didn't have this issue). It also has the issue of only running on www.mturk. One side note is that the 'search' feature that people like in TM is objectively a waste of the page request error (PRE) pool. (Oh and www.mturk and worker have their own pools so that is why the meta changed when worker was added)

    Hit finder is able to run on worker and hit catcher will be soonish. Panda crazy is also able to run on worker as well.

    One innovation that is fairly recent in the turking world is TTS (text to speech) alerts, and that was introduced through Overwatch. This allows the worker to basically passively search for stuff on the include list and know, just from hearing the alerts and with no visual, exactly what hit they grabbed or what hit just went up from their include list.

    Honestly overwatch still has the best implementation of TTS but MTS's current setup is serviceable and it is still being actively worked on so it will get better over time.

    As far as the username goes, it won't make a difference once you are no longer a new user and have proven (as it seems that you have to me) that you aren't an actual troll, but you can message @ChrisTurk if you would like to have your name changed and he can do that without you having to make a new account.

    Okay this made me laugh a bit. Honestly the community at /r/mturk is pretty bad and I'm happy that I have been able to rescue some people from it. That community seems to be happy with horrible wages and is far behind the meta on scripts so that is just leaving them behind to earn even less money.

    Oh and if you don't know what I mean when I say meta this would likely be useful for you.
     
    • Useful / Informative x 1
  7. thetroll

    thetroll New Member

    Messages:
    9
    Ratings:
    +7 / 0 / -0
    I know what you mean. I was thinking about posting this there originally, but given my lack of transparency on the project, and the current state of the project (which is, far from ready for any kind of public release) I figured it was best to post it here, so as not to get a bunch of hype going and/or just avoiding stirring the pot (previous experiences with other forum communities in the past have taught me that it's generally not a good idea to start making announcements and bold statements unless/until you've got something to back it). Then again, I'm sure I could've found a more tactful way to go about it posting in the Daily HITs threads, but I wanted to provide some context, so I figured my novel was better suited here for now.

    THIS. Thank you so much, this is extremely helpful for me, as I can now do some testing with TM to see how its performance compares with my script's. Of course, I don't plan to have any memory leaks in my script, so that would eliminate the issue people face with TM, but then again, who plans for memory leaks anyways? I guess we'll see how they stack up against one another, performance-wise, when the time comes.

    As for HS, that was actually another of the things I found bothersome about it in the first place, and I've already "sorta" addressed it (I have a logging system, but this was actually written to keep track of HITs caught with one of the multiple scripts that makes up my project). With what you said in mind, that's another great idea for me to try to implement (that is, a user-friendly database of recent/new HITs found on the scraper, for easy PANDA-ing).
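
    A rough sketch of what such a searchable HIT log might look like, using SQLite from Python's standard library. The table layout and the function names `open_log`, `record_hit`, and `search_log` are illustrative assumptions, not how my project or any existing script actually stores HITs.

```python
import sqlite3
import time


def open_log(path=":memory:"):
    """Open (or create) the log; the schema here is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        " group_id TEXT PRIMARY KEY,"
        " requester TEXT, title TEXT, reward REAL, last_seen REAL)"
    )
    return conn


def record_hit(conn, group_id, requester, title, reward, seen_at=None):
    """Insert a HIT, or refresh its last_seen time if already logged."""
    seen_at = time.time() if seen_at is None else seen_at
    conn.execute(
        "INSERT OR REPLACE INTO hits VALUES (?, ?, ?, ?, ?)",
        (group_id, requester, title, reward, seen_at),
    )
    conn.commit()


def search_log(conn, text):
    """Return logged HITs whose title or requester contains the text,
    most recently seen first."""
    pattern = f"%{text}%"
    rows = conn.execute(
        "SELECT group_id, requester, title, reward FROM hits"
        " WHERE title LIKE ? OR requester LIKE ? ORDER BY last_seen DESC",
        (pattern, pattern),
    )
    return rows.fetchall()
```

    The group ID returned by a search is the piece a PANDA would be built from; how that link is actually assembled is left out here on purpose.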

    When you say "worker", are you talking about the new MTurk worker platform? I'm assuming so, but given my relatively limited knowledge of the new platform, and MTurk in general lately, I feel I should ask for clarification. You said that mturk and worker have their own pools. Are you saying there are some HITs available on one platform, but not the other? If so, very strange, I'll have to look into it more, as I was under the impression that Amazon was simply trying to give MTurk a new look/feel.


    Oh ffs. I mean, that's REALLY awesome, don't get me wrong, but this oughta be a treat trying to implement something like that into this (and while I know it's obviously not required, I can see it becoming the new norm for scripts in the same vein of these newly created, all-in-one script suites). I suppose if I can pull it off, I'll implement it, and if not, well, -1 point to my project's relevancy, hehe.

    See, the funny thing is, when I originally planned on posting this, I first went to MTG, since I'd had an account there for a while (under the same username and profile pic as here). Then I learned that MTG is pretty much dead, and I discovered this forum shortly thereafter, after seeing someone else mention that most of that forum's userbase migrated to either here or MTC.

    A random aside about my username: along with the fact that it's my IRL nickname, I also use the name ironically for the very reason that I expect people to expect me to be a random internet troll, when in reality, I'm typically not (barring the occasional meme war on Facebook :p ). I suppose that's probably not the brightest idea when trying to post something serious like this, but then again, shattering expectations is always fun, so maybe I'll run with it. Either way, thanks for the advice.

    Hah, excellent. Also, I'm inclined to agree with you, as I've felt the same way about that portion of Reddit since I learned of it when I started Turking a few years ago. However, be that as it may, (for me) it's a much more organized and ideal place to get ideas on what problems people face when Turking, since it:

    A) seems to generally consist of a lot of newbie-ish workers, which I feel in turn leads to more questions being asked, rather than expert Turkers like those who frequent the forums, who are too busy hamming on work or figuring out the answers themselves to ask questions.

    B) I don't have to swim through a sea of GIFs on the Daily HIT threads, where people are actually posting at, unlike on the forums (you're the only person to respond to me out of the 4 forums I posted on; /r/MTurk was the only other place where I got any feedback - my trollish name aside, posting in the not-Daily HITs thread on those forums is definitely the issue). While I may still have to go through pages of threads and comments there, I can at least get an idea of what a thread will contain before I spend the time looking through it.

    The irony of this whole thing is, while I thought directly asking for feedback would garner more specific and detailed answers (aside from the helpful information you've enlightened me with), instead, searching through posts people make about their issues with 'x script' or general complaints about MTurk, scripts, and extensions has provided a very rich pool of ideas.

    I guess necessity breeds ingenuity (or at the very least, forces me to not be lazy) :emoji_smirk:
     
  8. NBadger

    NBadger Well-Known Member

    Messages:
    6,390
    Gender:
    Female
    Ratings:
    +19,673 / 9 / -0
    You can only make a certain number of page requests per second (technically I think it's one per 0.7 seconds or so) on mturk. The www. version and the worker version have separate PRE pools, so you can get more page requests by using both sites (1 per 0.7 seconds on each). Many people have their scraper/OW/HF set on worker, and their panda script set on www, or vice versa. The hits are all the same across the board. Click on PRE to read the wiki page on it. :)
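
    To make the two-pool idea concrete, here is a small sketch of per-site request pacing. The `RequestPool` class is hypothetical, and the 0.7 second interval is just the rough figure quoted above, not an official limit.

```python
import time


class RequestPool:
    """Paces requests to one site so they are at least min_interval apart."""

    def __init__(self, min_interval: float = 0.7, clock=time.monotonic):
        self.min_interval = min_interval  # assumed spacing, in seconds
        self.clock = clock                # injectable so it can be tested
        self.next_allowed = float("-inf")

    def wait_time(self) -> float:
        """Seconds until this pool may send another request."""
        return max(0.0, self.next_allowed - self.clock())

    def try_acquire(self) -> bool:
        """Claim a request slot if one is free; return False otherwise."""
        now = self.clock()
        if now < self.next_allowed:
            return False
        self.next_allowed = now + self.min_interval
        return True


# One pool per site: spending a slot on "www" does not touch "worker".
pools = {"www": RequestPool(), "worker": RequestPool()}
```

    With the scraper drawing from one pool and the panda script from the other, neither has to wait on the other's requests.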


    Also regarding daily threads, it's often useful to go ahead and post your idea as a thread of its own, like you did here, and then link to it in the daily thread so more people see it. I spotted this the other day and linked to it on the daily, and then RA replied to it from there. ;)
     
  9. thetroll

    thetroll New Member

    Messages:
    9
    Ratings:
    +7 / 0 / -0
    Wow! That's really awesome, actually. Thanks for the heads up on the info on that, I feel like knowing that now, this script will be a lot more powerful than I originally anticipated.

    Noted. I honestly don't understand why I never thought of that in the first place? Ahh well, live and learn. :)
     
    • Like x 2