Crawler to extract keywords and more, from a website

$50-500 USD

Completed
Posted over 8 years ago

Paid on delivery
Executive summary: We need a script to extract all the text as words from given websites, sort those keywords in ascending order, remove irrelevant ones such as 'and' and 'the', and associate each keyword with the sentences in which it is mentioned.

- We're a company developing chatbots used in customer service, answering customers' questions on their website or mobile apps. Think of "Siri for customer service".
- Customers, especially smaller ones, usually don't want to dedicate time to creating questions and answers for their chatbots, so we have a high barrier to entry in terms of setup. It's not easy to sell when we expect them to dedicate time before they can start using their chatbot.

We want to overcome this hurdle by developing an automated content crawler. "Content" is basically the most frequently used keywords and the important sentences that contain these keywords. (There is no need to create questions, as keywords and the sentences associated with them are the building blocks of a chatbot.) So, we essentially need a snapshot of the website in terms of what the website is all about.

Phases are:

1- A crawler to crawl plain-text-based content from various websites submitted via a form (domain and subdomains).
2- A frequency analysis to list the most common keywords in ascending order. We need a list of stop words (irrelevant words such as 'the', 'a', etc.) so that we can strip them out before finalizing the keyword list.
3- List keywords and the sentences that contain those keywords. We might need to remove some entries from this list based on rules that make them irrelevant. We'll figure out those rules as we see some results. The remaining content will be the essence of our project.
4- Based on experience so far, we might develop a new phase to further fine-tune results. Our goal is to show the finalized content to the website owner and have them say "this exactly covers every bit of information about my business".
5- We'll feed those keywords and sentences (the answers to those keywords) to a chatbot database.
6- Last but not least, we'd like to feed in a list of websites to execute the previous 5 steps automatically and create the same output for all of them.

If you can provide a really simple proof of concept, I could accept your bid the same moment. We need to see that the results generated from this project will help us develop chatbots and increase sales. Conversely, a copy-paste message will not get you the job. It demonstrates that you're a serial bidder, not someone who is willing to go the extra mile to solve a real problem for a business.
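
As a rough illustration of phases 2 and 3 only, the frequency analysis and the keyword-to-sentence association might look something like the Python sketch below. It is not a proposed final design: the stop-word list, the naive sentence splitter, and the function names (split_sentences, tokenize, extract_keywords) are all assumptions made for the example, and the input is assumed to be plain text that a crawler (phase 1) has already collected. "Ascending order" is interpreted here as lowest frequency first, with an option to reverse it.

```python
import re
from collections import Counter, defaultdict

# Illustrative stop-word list (an assumption); a real run would use a fuller list.
STOP_WORDS = {
    "a", "an", "and", "are", "for", "in", "is", "of", "on",
    "our", "the", "to", "we", "with", "you", "your",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and return its runs of letters, digits and apostrophes."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def extract_keywords(text, stop_words=STOP_WORDS, ascending=True):
    """Return (keyword, count, sentences) tuples sorted by count, then keyword."""
    counts = Counter()
    sentences_by_keyword = defaultdict(list)
    for sentence in split_sentences(text):
        words = [w for w in tokenize(sentence) if w not in stop_words]
        counts.update(words)
        # Record each sentence once per keyword, even if the word repeats in it.
        for word in set(words):
            sentences_by_keyword[word].append(sentence)
    ordered = sorted(
        counts.items(),
        key=lambda item: (item[1], item[0]),
        reverse=not ascending,
    )
    return [(word, count, sentences_by_keyword[word]) for word, count in ordered]


if __name__ == "__main__":
    sample = (
        "We ship orders within two days. Orders over $50 ship free. "
        "Returns are accepted for 30 days."
    )
    for word, count, sentences in extract_keywords(sample):
        print(count, word, sentences)
```

The rules for dropping irrelevant entries (phase 3) are left out on purpose, since the brief says they will be worked out once real results are available.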
Project ID: 8262662

About the project

14 proposals
Remote project
Active 9 yrs ago

Awarded to:
Hi, I'm not a serial bidder and your project is very interesting. Within my bid budget, I can build a proof of concept that will cover phases 1-3 (I cannot quote 4-5, as they depend heavily on 1-3), including a database structure to store keywords and sentences. Should we discuss the UI for steps 1-3?
$499 USD in 5 days
5.0 (345 reviews)
9.0
14 freelancers are bidding on average $495 USD for this job
Hello there, I would like to help you out with this project. I have experience scraping websites, and since it is only plain text, that will be easy. Regarding the keywords, I have experience in data mining, which means I know how to handle the database in order to find the most used keywords and other important aspects, like relations between keywords, which I think will be really important for you. Hope to hear back from you soon. Thanks
$416 USD in 3 days
5.0 (111 reviews)
8.8
Hi sir, I am a scraping expert and have done many similar projects; please check my feedback and you will see. Can you give me more details? Then I will provide demo data for you. Thanks, Kimi
$565 USD in 6 days
5.0 (573 reviews)
8.4
Dear Sir, I'm delighted to let you know that I have done data scraping from many sites with PHP-cURL, Node.js, and Selenium. I scraped the data from each website and then wrote it to a MySQL database or to an Excel, CSV, or XML file. I have worked on many similar projects and have extensive experience in data mining. I have written hundreds of web scrapers which scrape millions of pages each day. I'm ready to fulfill your requirement. I can finish this task in a short time, with the best quality. I can assure 100% accuracy. Please give me the opportunity to do the work. With Kind Regards, Debdulal Roy Proshanta
$444 USD in 3 days
4.9 (106 reviews)
7.7
I want to discuss this project with you further. Let me know the most suitable time for you to schedule a meeting. Feel free to message me at any time; I am usually online 14 hours a day on this website, so you will probably get a quick response from me.
$515 USD in 12 days
4.8 (54 reviews)
7.1
Hello sir, I have read your requirements. I can do your project using a semantics feature I have already built, which automatically picks out the keywords with a high frequency of occurrence, together with a universal list of stop words and word breakers (different forms of a verb). My info: I have done scraping on a large part of the World Wide Web, including ecommerce giants (Amazon, eBay, Craigslist), news feeds, social media websites, and APIs. I develop my own tools based on client requirements, with multithreading, a bot with human behaviour, and scraping applications with document parsing. I can do PDF parsing and CAPTCHA bypass code as well. Contact me for further details or a demo.
$305 USD in 3 days
4.9 (53 reviews)
6.8
Hello Dear, I can do this for you. Please send a message in the PMB for details. Best Regards, flashsaiful
$500 USD in 6 days
4.8 (133 reviews)
6.7
Hello! I'm a web scraping expert. I use the Python Scrapy framework. My scripts can run on Windows or Linux, but Linux is preferable. I can schedule scripts on a server if required. I can scrape secured and protected sites; my crawlers can fill in login forms, emulate AJAX requests, etc. If a site blocks the IP, I can use a proxy or Tor. I can try to get past a CAPTCHA on a site in automatic or manual mode. I can export data into JSON, CSV (Excel), MySQL, or MongoDB. I have a lot of finished projects (Google scraping, Facebook scraping, Yellow Pages, webshops, and other sites with lists of items). I like your project. Message me if you have questions.
$449 USD in 3 days
4.8 (107 reviews)
6.5
Hi, I have extensive knowledge of web scraping with Ruby and 5+ years of experience building apps with Ruby on Rails. I can get this completed in 3 days. We can discuss it in detail via message. Thanks
$300 USD in 3 days
5.0 (9 reviews)
4.7

About the client

Flag of UNITED STATES
New York, United States
5.0
67
Payment method verified
Member since Mar 31, 2002

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)