warriors
2. I have a little experience with Amazon: scraping the star reviewers for books, but that wasn't very complex. I recently designed a scraping tool for TaoBao, a Chinese wholesaler: 30 brands, each with between 2,000 and 70,000 products. Before that I built a project-data scraping system for 3 different websites, 10,000 projects each, with 15 data points to scrape per project. I also built a system for the shipping industry that scraped vessel tracking data for 50 ships per day.
3. IP blocking is a constant problem. Public sites usually don't bother with it, but commercial ones just block you flat out, especially ones like Amazon that sell access through an API. I have tried IP rotation but didn't have a good experience with it: too flimsy and unreliable, too much maintenance. I use some masking techniques with browser and device type; that can fool the server for a while but is also not reliable. The best solution I have found so far is Google Apps. All requests go through bigG, and no one blocks bigG.
4. Heard about it, but didn't have the need so far.
5. No, I haven't done that. But we will figure it out.
6. Individual, I don't have any other job. I could dedicate up to 5 hours a day to your project.
7. That is a funny request. Any employer requests complete confidentiality with these things, so no, I can't send you any examples of serious scraping jobs :)
My proposal is to use Google Apps for the whole system. It will definitely be the most reliable and the cheapest solution!
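To illustrate the browser and device masking mentioned in answer 3, here is a minimal Python sketch of rotating the User-Agent header between requests. It is only an illustration under stated assumptions, not the code from any of the jobs above: the names `USER_AGENTS`, `build_request` and `fetch` are hypothetical, and the User-Agent strings are example values. As noted in answer 3, this kind of masking may fool a server for a while but is not reliable on its own.

```python
import itertools
import urllib.request

# Hypothetical pool of browser/device profiles (example strings only).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]

# Cycle through the pool so consecutive requests present different profiles.
_profiles = itertools.cycle(USER_AGENTS)


def build_request(url: str) -> urllib.request.Request:
    """Build a request with the next User-Agent in the pool (nothing is sent)."""
    headers = {
        "User-Agent": next(_profiles),
        "Accept-Language": "en-US,en;q=0.9",
    }
    return urllib.request.Request(url, headers=headers)


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Send the masked request and return the raw response body."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read()
```

Calling `fetch(url)` repeatedly walks through the pool in order, so each request presents itself as a different browser or device.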