
Blog Scraping - open to bidding

$30-250 USD

Closed
Posted over 9 years ago

Paid on delivery
I would like a Python script, written using Scrapy, that scrapes every post on [login to view URL] and parses the contents into a JSON file that matches this structure for each post:

{
  'post_type' : "blog_post",
  'url': '[login to view URL]',
  'post_author_twitter1': '@johnbiggs',
  'post_author1': 'John Biggs',
  'post_author_twitter2': '',
  'post_author2': '',
  'post_date': '2007-06-21',
  'post_subject': 'Writers Write "B-Logs," Get Money',
  'post_content': 'USA Today, that bastion of hard news, is covering a new fad popular....',
}

Some posts have multiple authors, some of them with matching Twitter profiles. Each author and Twitter handle needs to be parsed into its own field.
Project ID: 6452355
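The flat two-author layout requested above could be assembled along these lines. This is only a minimal sketch, not the client's or any bidder's code: the helper name `build_post_record`, the `MAX_AUTHORS` constant, and the `https://example.com/post` URL are hypothetical, and it assumes the author names and Twitter handles have already been extracted from the page.

```python
import json
from datetime import date

MAX_AUTHORS = 2  # the requested structure has exactly two author slots


def build_post_record(url, authors, post_date, subject, content):
    """Return one post as a flat dict in the requested layout.

    `authors` is a list of (name, twitter_handle) pairs; use "" for a
    missing handle. Authors beyond MAX_AUTHORS are dropped in this sketch.
    """
    record = {"post_type": "blog_post", "url": url}
    # Pad with empty pairs so every record carries both author slots.
    padded = list(authors[:MAX_AUTHORS])
    padded += [("", "")] * (MAX_AUTHORS - len(padded))
    for i, (name, twitter) in enumerate(padded, start=1):
        record[f"post_author_twitter{i}"] = twitter
        record[f"post_author{i}"] = name
    record["post_date"] = post_date.isoformat()  # YYYY-MM-DD
    record["post_subject"] = subject
    record["post_content"] = content
    return record


example = build_post_record(
    url="https://example.com/post",  # placeholder URL
    authors=[("John Biggs", "@johnbiggs")],
    post_date=date(2007, 6, 21),
    subject='Writers Write "B-Logs," Get Money',
    content="USA Today, that bastion of hard news, is covering a new fad popular....",
)
print(json.dumps(example, indent=2))
```

Padding the unused slot with empty strings keeps every record on the same set of keys, which matches the sample structure in the job description.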

About the project

19 proposals
Remote project
Active 9 yrs ago

19 freelancers are bidding on average $157 USD for this job
Hello! Although I am new to Freelancer.com, I am an experienced programmer/web scraper with a Master's degree in Computer Science. I can create the blog-to-JSON scraper you have requested. I have created similar web scraping software in the past using Python (which I would recommend using for the third party libraries such as Scrapy, BeautifulSoup and Mechanize), and will gladly provide code and previously scraped data for an example. Thank you for your consideration, and I hope to work with you soon.
$222 USD in 7 days
4.9 (43 reviews)
6.1
A proposal has not yet been provided
$231 USD in 7 days
4.8 (61 reviews)
5.9
I am a Python/Scrapy expert and am interested in your project. Please contact me to discuss the details. Thanks.
$133 USD in 3 days
5.0 (13 reviews)
4.6
Hi. I'm an experienced Python programmer and have experience with Scrapy. I am interested in taking up this job. We can discuss further details on chat. Thanks.
$166 USD in 3 days
4.9 (4 reviews)
4.4
This is Nitin. I have extensive experience scraping large volumes of data in very little time. I code in PHP, Python and Perl, and scrapers I have written are being used to scrape more than 30 million pages per day without being blocked. I would like to help you get all the data you are looking for. Please PM me if you find my bid suitable. And don't forget to check my reviews here: http://www.freelancer.com/users/1303125.html Cheers, Nitin
$222 USD in 4 days
5.0 (2 reviews)
4.4
Hi Sir, I have developed more than 70 scrapers using Scrapy and Node.js. For multiple authors it would be better to use another format: .... 'authors': [ {'post_author_twitter': '...', 'post_author': '...'}, {....} ], .... This format will work out of the box. If you still want your original format, I can create a new exporter that converts to it. Regards, Ilshat
$155 USD in 3 days
5.0 (10 reviews)
3.8
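The conversion this proposal mentions, from a nested `authors` list back to the numbered fields the client asked for, might look roughly like the sketch below. It is plain Python rather than an actual Scrapy exporter, and the function name `flatten_authors` and the sample record are hypothetical.

```python
def flatten_authors(post):
    """Convert a post with a nested 'authors' list into numbered flat fields."""
    # Copy everything except the nested list.
    flat = {key: value for key, value in post.items() if key != "authors"}
    # Number the authors from 1, matching post_author1, post_author2, ...
    for i, author in enumerate(post.get("authors", []), start=1):
        flat[f"post_author_twitter{i}"] = author.get("post_author_twitter", "")
        flat[f"post_author{i}"] = author.get("post_author", "")
    return flat


nested = {
    "post_subject": "Example post",  # made-up sample data
    "authors": [
        {"post_author_twitter": "@first", "post_author": "First Author"},
        {"post_author_twitter": "", "post_author": "Second Author"},
    ],
}
flat = flatten_authors(nested)
print(flat)
```

Unlike the fixed two-slot layout, this version emits as many numbered fields as there are authors, so records with different author counts end up with different key sets.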
Hello sir, I have experience implementing scrapers for different types of content in Python. **How can I help you?** Firstly, since TechCrunch supports RSS, I will fetch URLs and titles from the RSS feed. Secondly, using the Python requests library, I'll fetch the content and authors of each article, which is easy to parse with the BeautifulSoup library. Finally, I will write the JSON file using Python's standard library. You just need to answer a few questions: 1) An article may contain images or some kind of formatting. Do you want to save the text only? 2) How many of the most recent articles should the script fetch? Once I receive answers to these questions, I can start working on the grabber. Best, Vyacheslav
$111 USD in 2 days
5.0 (8 reviews)
3.9
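The first and last steps of this plan (reading URLs and titles from an RSS feed, then writing JSON) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the bidder's code: the function name `parse_rss_items` and the sample feed are made up, the feed is assumed to be plain RSS 2.0, and the fetching and article-parsing steps with requests and BeautifulSoup are left out.

```python
import json
import xml.etree.ElementTree as ET


def parse_rss_items(rss_text):
    """Return a list of {'post_subject', 'url'} dicts from RSS 2.0 text."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "post_subject": (item.findtext("title") or "").strip(),
            "url": (item.findtext("link") or "").strip(),
        })
    return items


# Made-up feed used only to exercise the parser.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>First post</title>
      <link>https://example.com/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/second</link>
    </item>
  </channel>
</rss>"""

posts = parse_rss_items(SAMPLE_RSS)
print(json.dumps(posts, indent=2))
```

An RSS feed normally lists only recent entries, which is why the proposal asks how many of the latest articles should be fetched.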
Hello, Can your JSON structure be adjusted in any way? We could use a JSON array for the authors when a post has more than one. If the structure can't be changed, that's fine. Also, do I need to use Scrapy? That's OK too, but I have completed similar projects before without using this framework. Thanks, Bogdan
$155 USD in 3 days
5.0 (2 reviews)
3.8
A proposal has not yet been provided
$131 USD in 3 days
4.9 (20 reviews)
3.8
Hi. I checked TechCrunch and it seems quite possible to scrape all their blog posts. Their search can be used to list all blog posts (there are fewer than 10,000 posts in total), and the rest from there is a piece of cake. This task shouldn't be very difficult, as I have successfully scraped data from websites with over 100,000 pages. The project shouldn't take long, but to be safe I marked it as taking 6 days. It will probably be done in 2 days. I am waiting for your response so I can start working.
$222 USD in 6 days
5.0 (3 reviews)
3.1
Dear potential employer. Perl/Python/Web professionals here. Please, accept this bid to have your task done nicely in a reasonable time. Thank you
$133 USD in 3 days
3.8 (1 review)
2.8
Hello, I have experience using Scrapy and can help you with parsing =) If you want, I can also make a GUI in Qt, which would look good and be cross-platform =)
$77 USD in 5 days
3.8 (1 review)
1.0
A proposal has not yet been provided
$155 USD in 3 days
0.0 (0 reviews)
0.0
Hello there, thank you for this opportunity. I am really interested in this Scrapy job and have just placed my initial bid. If you are serious, I can provide you with a demo. Please reply if you are interested too :) Regards, Dolek
$98 USD in 1 day
0.0 (0 reviews)
0.0
A proposal has not yet been provided
$277 USD in 5 days
0.0 (0 reviews)
0.0

About the client

Flag of UNITED STATES
Cambridge, United States
5.0
2
Payment method verified
Member since Feb 12, 2009

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)