I require a crawler/spider with the following features.

## Project Requirements
1. Database creation.
2. A routine for entering the websites to be crawled.
3. A means of tracking which site each record came from, and of deleting or archiving a record once it has been removed from the original site.
4. Multithreaded crawling.
5. The ability to recognise data fields and store them in the appropriate tables.
6. The ability to identify when we are being blocked and then to go through proxies if required (a list of proxies is needed).
7. Needs to be able to get around [login to view URL] telling it that it can't crawl.
8. Needs to be able to be slowed down if it is being firewalled because sites are rejecting too many queries.
9. A means of identifying whether the same record appears on two sites.
10. A means of reporting its success, whether it is being blocked, and updates.
11. The ability to follow links to other linked sites and run the routine again.
12. Must use open source software.

It is a condition of accepting the project that the programmer must assign copyright to us for this and any subsequent work performed for us. All work must be done on our servers. Payment is by escrow.
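To make requirements 1, 3 and 9 above more concrete for bidders, here is a minimal sketch of one way the storage side might look. It is only an illustration, not a prescribed design: the table names, column names, and function names are all hypothetical, SQLite is assumed purely because it is open source and ships with Python, and the "same record" test is assumed to be an exact match on normalised field values.

```python
import hashlib
import sqlite3

# Hypothetical schema: one row per crawled site, one row per record found on it.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sites (
    id  INTEGER PRIMARY KEY,
    url TEXT UNIQUE NOT NULL
);
CREATE TABLE IF NOT EXISTS records (
    id            INTEGER PRIMARY KEY,
    site_id       INTEGER NOT NULL REFERENCES sites(id),
    source_url    TEXT NOT NULL,
    content_hash  TEXT NOT NULL,
    last_seen_run INTEGER NOT NULL,
    archived      INTEGER NOT NULL DEFAULT 0,
    UNIQUE (site_id, source_url)
);
"""


def open_db(path=":memory:"):
    """Create the database and its tables (requirement 1)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_site(conn, url):
    """Register a site to crawl and return its id."""
    conn.execute("INSERT OR IGNORE INTO sites (url) VALUES (?)", (url,))
    return conn.execute("SELECT id FROM sites WHERE url = ?", (url,)).fetchone()[0]


def fingerprint(fields):
    """Hash lower-cased, whitespace-normalised fields so equal records match."""
    canon = "|".join(
        f"{key}={' '.join(str(value).lower().split())}"
        for key, value in sorted(fields.items())
    )
    return hashlib.sha256(canon.encode("utf-8")).hexdigest()


def upsert_record(conn, site_id, source_url, fields, run):
    """Store a record with its origin and the crawl run that saw it."""
    digest = fingerprint(fields)
    cur = conn.execute(
        "UPDATE records SET content_hash = ?, last_seen_run = ?, archived = 0 "
        "WHERE site_id = ? AND source_url = ?",
        (digest, run, site_id, source_url),
    )
    if cur.rowcount == 0:  # not seen before on this site
        conn.execute(
            "INSERT INTO records (site_id, source_url, content_hash, last_seen_run) "
            "VALUES (?, ?, ?, ?)",
            (site_id, source_url, digest, run),
        )


def archive_missing(conn, site_id, run):
    """Archive records the given run no longer found on the site (requirement 3)."""
    cur = conn.execute(
        "UPDATE records SET archived = 1 "
        "WHERE site_id = ? AND last_seen_run < ? AND archived = 0",
        (site_id, run),
    )
    return cur.rowcount


def cross_site_duplicates(conn):
    """Return hashes of live records that appear on more than one site (requirement 9)."""
    rows = conn.execute(
        "SELECT content_hash FROM records WHERE archived = 0 "
        "GROUP BY content_hash HAVING COUNT(DISTINCT site_id) > 1"
    )
    return [row[0] for row in rows]
```

In this sketch each crawl pass would call `upsert_record` for everything it finds and then `archive_missing` once per site, so records that disappeared from the source are archived rather than silently kept. A real implementation would likely need fuzzier matching than an exact hash to catch near-duplicates across sites.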
## Deliverables
1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):
a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.
b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.
3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).
## Platform
I don't care what you use.