Hit_scraper with the hit export script added, CUZ IT'S MORE CONVENIENT! Here are a few guides: one on mturkforum.com and one on mturkgrind.com.
v1.4 update!
Big update here: I added a lot of cosmetic changes and other functionality. Let me know if there are any issues; new functions will probably be buggy until they're taken care of.
Features added:
- Ability to block by title and requester (so you can block individual hits you've done)
- Ability to view only certain requesters with an Include list (requesters must be added to the list individually for the moment; if there's a desire I'll add a button like the blocklist's)
- Ability to have the scraper make a "ding" noise when it finds new work.
- Tied in with HitDB, so clicking the R/T at the end will show you the work you've done for that requester (only for green items; might not work on Firefox)
- Added A-Z sort
- Added inverse sort
- Added checkbox for "Correct For Skips" (mouse over the checkbox to see what it does, or try it out! On by default; will change to off by default if necessary).
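
To illustrate how the blocklist, Include list, and sort options above could fit together, here is a minimal sketch. It is not the script's actual code: the function names, the `{ requester, title }` hit shape, and sorting by title are all assumptions made for the example.

```javascript
// Hypothetical sketch: names and data shapes are illustrative, not taken from the script.

// Compare names case-insensitively and ignore stray whitespace.
function normalize(s) {
  return s.trim().toLowerCase();
}

// Drop a hit if its requester or title is blocked. If the include list
// is non-empty, keep only hits from requesters on that list.
function filterHits(hits, blocklist, includeList) {
  const blocked = new Set(blocklist.map(normalize));
  const included = new Set(includeList.map(normalize));
  return hits.filter((hit) => {
    const requester = normalize(hit.requester);
    const title = normalize(hit.title);
    if (blocked.has(requester) || blocked.has(title)) return false;
    return included.size === 0 || included.has(requester);
  });
}

// Sort a copy of the hits A-Z by title (an assumption; the script may
// sort on another field). Pass inverse = true to flip the order.
function sortHits(hits, inverse = false) {
  const sorted = [...hits].sort((a, b) => a.title.localeCompare(b.title));
  return inverse ? sorted.reverse() : sorted;
}
```

Under these assumptions, `sortHits(filterHits(hits, blocklist, []), true)` would give the unblocked hits in Z-A order.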
Cosmetics:
- Re-organized a bit of the header section with some | characters to separate things
- Added some helpful "status" messages to explain some things a bit (i.e. why it's scraping more than the pages you told it to)
- Moved the status messages to below the header
Subtle:
- Made it pull the blocklist every time you run it so you can have multiple instances and they'll work together properly.
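
A rough sketch of why re-reading the blocklist on every run lets multiple instances cooperate: each instance reads from shared storage rather than from a copy cached at startup. The key name, the JSON format, and the helper names are assumptions for illustration, and an in-memory stand-in replaces the browser's `localStorage` so the example runs anywhere.

```javascript
// Hypothetical key name; the real script may store its list differently.
const BLOCKLIST_KEY = "scraper_blocklist";

// Read the blocklist fresh from storage. `storage` is anything with
// getItem/setItem, such as the browser's localStorage.
function loadBlocklist(storage, defaults) {
  const raw = storage.getItem(BLOCKLIST_KEY);
  if (raw === null) return [...defaults];
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed) ? parsed : [...defaults];
  } catch (e) {
    // Corrupt data: fall back to the defaults instead of failing the run.
    return [...defaults];
  }
}

// Add a requester by re-reading the stored list first, so blocks made
// in another instance are not overwritten.
function blockRequester(storage, defaults, name) {
  const list = loadBlocklist(storage, defaults);
  if (!list.includes(name)) list.push(name);
  storage.setItem(BLOCKLIST_KEY, JSON.stringify(list));
  return list;
}

// In-memory stand-in for localStorage so the sketch runs outside a browser.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(key, String(value)); },
  };
}
```

If two instances share the same storage and each calls `loadBlocklist` at the start of a run, a requester blocked in one shows up in the other on its next run.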
TODO:
- Save everything to localStorage so you won't have to set it up individually each time
- Add capability for multiple export templates (so you can have one scraper for a bunch of sites)
- Make it easier to theme (Add a table with colors you can edit and such)
v1.4.1 changes:
- Initial theming support. All the color values are now at the top of the code, with descriptors, so they can be changed easily.
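
As a sketch of what "color values at the top of the code" can look like, the example below groups colors in one commented object and builds the CSS from it. The key names, color values, and CSS selectors are placeholders, not the ones the script uses.

```javascript
// Hypothetical theme table: keys, colors, and selectors are placeholders.
const THEME = {
  background: "#ffffff",    // page background
  text: "#000000",          // default text color
  newHit: "#d9ead3",        // highlight for newly found hits
  blockButton: "#cc0000",   // color of the BLOCK button
};

// Build one CSS string from the theme, so editing a color above is the
// only change needed to restyle the page.
function buildCss(theme) {
  return [
    `body { background: ${theme.background}; color: ${theme.text}; }`,
    `.new-hit { background: ${theme.newHit}; }`,
    `.block-button { color: ${theme.blockButton}; }`,
  ].join("\n");
}
```

A darker variant could then be tried without touching the rest of the code, for example `buildCss({ ...THEME, background: "#222222", text: "#eeeeee" })`.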
Older update logs:
Updated to fix an issue with the export not getting the proper quals for the proper hit.
Updated so it wouldn't clobber the normal hit export script
Updated to fix a bug, and now the requester list is case insensitive.
Added description as mouseover text for title link. Hold the mouse over the title to see it.
1.3.0.10: Added ability to block requesters dynamically, and revert to the blocklist set in the code. Default blocklist contains:
"oscar smith", "Diamond Tip Research LLC", "jonathon weber", "jerry torres", "Crowdsource", "we-pay-you-fast", "turk experiment", "jon brelig"
To clear any of those from the default, just remove them from the code (line 18, remove the " marks and comma as well). To add a requester to the block list, click the "BLOCK" button next to their name. To reset to default, click "Reset blocklist" at the top.
1.3.0.11: Added a line (line 24) that lets you change the hit-export symbol to whatever text you'd like.
1.3.0.12: Changed "Reset blocklist" so it now shows a confirm dialog in case you misclick.
1.3.0.13: Fixed an error that occurred with hits that have no TO data.
1.3.0.14: Initial method of editing the existing blocklist to add/remove requesters manually. I'd like a better way of doing it, so expect that to be coming.
1.3.0.15: Added "hits available" to default template per request.
1.3.1.0: Major release because of all the changes so far. This one has logical updating of the block list. What's that mean? It means when you click "Edit Blocklist" you'll get a textarea you type in. Remove requesters, add requesters, whatever you'd like. Then just click save and it saves.
1.3.1.1: Updated with Miku's new API link.
1.3.1.2: Fixed "Correct For Skips" to accurately reflect the pages you select.
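
For readers curious how the "Edit Blocklist" textarea (1.3.1.0) and the confirmed "Reset blocklist" (1.3.0.12) might work, here is a minimal sketch. The default names are the ones listed under 1.3.0.10; everything else is an assumption for illustration, including the function names, the one-requester-per-line format, and passing in `confirmFn` in place of the browser's `window.confirm`.

```javascript
// Defaults quoted from the 1.3.0.10 entry above.
const DEFAULT_BLOCKLIST = [
  "oscar smith", "Diamond Tip Research LLC", "jonathon weber", "jerry torres",
  "Crowdsource", "we-pay-you-fast", "turk experiment", "jon brelig",
];

// Turn textarea text into a clean list. Assumes one requester per line;
// blank lines and case-insensitive duplicates are dropped.
function parseBlocklistText(text) {
  const seen = new Set();
  const result = [];
  for (const line of text.split(/\r?\n/)) {
    const name = line.trim();
    const key = name.toLowerCase();
    if (name === "" || seen.has(key)) continue;
    seen.add(key);
    result.push(name);
  }
  return result;
}

// Inverse of parseBlocklistText: the text to show in the textarea.
function formatBlocklistText(list) {
  return list.join("\n");
}

// Reset only if the user confirms. `confirmFn` stands in for
// window.confirm so the sketch also runs outside a browser.
function resetBlocklist(current, confirmFn) {
  return confirmFn("Reset blocklist to default?")
    ? [...DEFAULT_BLOCKLIST]
    : current;
}
```

In a browser, the save button would call `parseBlocklistText(textarea.value)` and store the result, while the reset button would call `resetBlocklist(current, window.confirm.bind(window))`.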