AI to Extract Rate Formula from Text Description in PDF

$250-750 USD

Closed
Posted over 2 years ago

Paid on delivery
Hello. This is a unique problem. Please provide a detailed proposal that speaks to the problem; vague applications will be ignored. I'm looking for people with creative ideas.

The task is to extract a rate formula from a textual description in a PDF file. In Texas, the electricity market is deregulated. Rates are defined by a document called an Electricity Facts Label (EFL). Several examples of EFLs are attached. Each PDF describes, in words, a math formula. There are thousands of these EFLs. The Rate Formulas PDF file (attached) gives several examples of different descriptions, with a graph of the formula that results from each.

Rates are a function of usage, i.e. R(x), where x is kilowatt-hours (kWh). EFLs include a spot pricing table at 500, 1000, and 2000 kWh. It shows the rate value at those precise points, i.e. R(500), R(1000), and R(2000). This is useful for testing whether an accurate rate formula has been found.

C# source code is attached. There are two console applications:

1) PowerToChooseScraper. This program downloads all the EFLs currently in the market. Give it a target folder and it will download the PDFs there. It may have a few small bugs, but it should work for you.

2) PTC. This is old code: a first-draft attempt at a program that parses the PDFs and extracts the rate formulas. The code hasn't been touched for many years. When it was written it was looking good: not 100%, but it was getting ~65% accuracy.

I do not care whether the existing PTC code is used. I also don't care whether your work is in C# or something else, but whatever the solution, the final working version will end up in C#. If you want to use a language other than C# for developing the initial logic, I'll ask why. Using ML techniques could be a good reason.

This is a unique problem because it could be approached in a lot of ways. It could perhaps be solved with ML techniques, or perhaps with word similarity algorithms like Jaro-Winkler.
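As an illustration of the word-similarity idea, here is a minimal Python sketch of Jaro-Winkler used to map a noisy phrase from an EFL onto a canonical charge name. It is not taken from the attached PTC code. The label list KNOWN_LABELS, the helper best_match, and the 0.85 threshold are all assumptions made up for this example.

```python
from typing import Iterable, Optional, Tuple


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Two characters match only if equal and no farther apart than this.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that line up in a different order are transpositions.
    a = [c for c, m in zip(s1, matched1) if m]
    b = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(x != y for x, y in zip(a, b)) // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3.0


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for x, y in zip(s1[:4], s2[:4]):
        if x != y:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1.0 - base)


# Hypothetical canonical names; real EFL wording varies by provider.
KNOWN_LABELS = ["base charge", "energy charge", "tdu delivery charge"]


def best_match(phrase: str, labels: Iterable[str],
               threshold: float = 0.85) -> Optional[Tuple[str, float]]:
    """Return (label, score) for the closest label, or None if too dissimilar."""
    phrase = phrase.strip().lower()
    scored = [(label, jaro_winkler(phrase, label)) for label in labels]
    if not scored:
        return None
    label, score = max(scored, key=lambda pair: pair[1])
    return (label, score) if score >= threshold else None
```

For example, best_match("Enegry Charge", KNOWN_LABELS) resolves the misspelled phrase to "energy charge", while a phrase that shares no characters with any label returns None. The threshold would need tuning against real EFL text.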
The PTC code works by trying multiple approaches. It runs in a loop, stepping through methods until one of them finds a solution. The approaches attempted are all fairly rudimentary. No learning algorithms have been attempted.

I do not expect 100% accuracy, just as close as possible: ~95%. It's possible some EFLs contain human errors, where the numbers are actually wrong and don't make sense. In that case the goal is to discover that. If a solution can't be found, we want to flag the EFL for a human to review and determine what is going on. Over time we can improve the accuracy.

I'm looking for the discrete logic that processes a single PDF and outputs the rate formula, or an error code if it can't be determined. The larger infrastructure to download and process these files, store the results in a database, etc., is a separate thing outside the scope of this project.

I will be working with you directly on this. I am an expert in C# and ML, and well versed in these EFLs. I can help guide your approach.
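The loop-and-verify flow described above could look roughly like the Python sketch below. It is not the PTC code. It assumes the simplest plan shape (a fixed base charge plus a flat energy charge), reads R(x) as the average price in cents per kWh, and uses two toy regex strategies. Every name in it (RateFormula, Status, the strategy functions, the 0.1 tolerance) is hypothetical.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class RateFormula:
    base_charge: float    # dollars per billing cycle (assumed plan shape)
    energy_charge: float  # cents per kWh (assumed plan shape)

    def average_price(self, kwh: float) -> float:
        """R(x): average price in cents per kWh at the given usage."""
        return self.energy_charge + 100.0 * self.base_charge / kwh


class Status(Enum):
    OK = 0
    NO_CANDIDATE = 1   # no strategy could read a formula from the text
    SPOT_MISMATCH = 2  # formulas were read, but none matched the spot table


def strategy_energy_only(text: str) -> Optional[RateFormula]:
    """Toy strategy: treat the first 'N cents per kWh' as a flat rate."""
    m = re.search(r"(\d+(?:\.\d+)?)\s*cents per kwh", text, re.I)
    return RateFormula(0.0, float(m.group(1))) if m else None


def strategy_labeled_charges(text: str) -> Optional[RateFormula]:
    """Toy strategy: read explicitly labeled base and energy charges."""
    base = re.search(r"base charge\D*?(\d+(?:\.\d+)?)", text, re.I)
    energy = re.search(r"energy charge\D*?(\d+(?:\.\d+)?)", text, re.I)
    if base and energy:
        return RateFormula(float(base.group(1)), float(energy.group(1)))
    return None


STRATEGIES: List[Callable[[str], Optional[RateFormula]]] = [
    strategy_energy_only,
    strategy_labeled_charges,
]


def matches_spot_table(formula: RateFormula, spot_table: Dict[int, float],
                       tol: float = 0.1) -> bool:
    """True if R(500), R(1000), R(2000) agree with the EFL within tol cents."""
    return all(abs(formula.average_price(kwh) - price) <= tol
               for kwh, price in spot_table.items())


def extract_rate_formula(
        text: str, spot_table: Dict[int, float]
) -> Tuple[Status, Optional[RateFormula]]:
    """Try each strategy in turn; accept the first formula that verifies."""
    found_candidate = False
    for strategy in STRATEGIES:
        candidate = strategy(text)
        if candidate is None:
            continue
        found_candidate = True
        if matches_spot_table(candidate, spot_table):
            return Status.OK, candidate
    status = Status.SPOT_MISMATCH if found_candidate else Status.NO_CANDIDATE
    return status, None
```

With the text "Base Charge: $9.95 per billing cycle. Energy Charge: 11.5 cents per kWh." and a spot table of {500: 13.5, 1000: 12.5, 2000: 12.0}, the first strategy produces a flat 11.5 cents that fails verification, and the second is accepted. A SPOT_MISMATCH result is the case to route to human review, since it may indicate an error in the EFL itself.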
Project ID: 31562780

About the project

5 proposals
Remote project
Active 2 yrs ago

5 freelancers are bidding on average $705 USD for this job
Nice to meet you. I have read your job description and can do it perfectly. My work will include these steps:
- OCR: detect the text in the PDF files
- Extract the energy and rate info from the OCR result
- Estimate the formula using a mathematical method such as the least-squares (LS) method
Thanks
$500 USD in 7 days
5.0 (12 reviews)
5.3
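The bid above mentions a least-squares (LS) estimate without giving details. One possible reading is sketched below in Python: fit a base charge and a flat energy charge to the three spot-table points, treating each table entry as an average price in cents per kWh. The linear plan model and the function name fit_linear_rate are assumptions for illustration. Tiered plans or bill credits would not fit this model, and a large residual is the signal for that.

```python
from typing import Dict, Tuple


def fit_linear_rate(spot_table: Dict[int, float]) -> Tuple[float, float, float]:
    """Ordinary least-squares fit of total cost = energy * kWh + base.

    spot_table maps usage in kWh to average price in cents per kWh.
    Returns (energy charge in cents/kWh, base charge in dollars,
    largest absolute error in cents/kWh against the table).
    """
    xs = list(spot_table.keys())
    # Convert average prices back to total bill amounts in cents.
    costs = [kwh * price for kwh, price in spot_table.items()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_c = sum(costs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxc = sum((x - mean_x) * (c - mean_c) for x, c in zip(xs, costs))
    energy = sxc / sxx                     # slope: cents per kWh
    base_cents = mean_c - energy * mean_x  # intercept: cents per bill
    max_error = max(abs(energy + base_cents / kwh - price)
                    for kwh, price in spot_table.items())
    return energy, base_cents / 100.0, max_error
```

For a plan with a $5.00 base charge and a 10 cents per kWh energy charge, the spot table would read {500: 11.0, 1000: 10.5, 2000: 10.25}, and the fit recovers those two values with an error near zero.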
Hi, I'm Hoss. I have a PhD in engineering and 15 years of professional programming experience in different languages. I read your full description (thank you for the full explanation). The project is a challenge and I am really interested. Reading text with OCR is a simple task. My solution is to first use ML to classify the EFLs, so that EFLs that are generally similar end up in the same group (this can be done with ML and C#). In the next step, each group is trained separately and its keywords are determined. Because the EFLs within a group are similar, we can hope that the accuracy will be acceptable. The method for this step should be based on reinforcement learning. Of course, once the groups have been fully categorized and reviewed, we can make a better decision about the type of algorithm for the second stage. Thank you.
$1,000 USD in 7 days
5.0 (2 reviews)
4.3
My background is in electrical engineering. Working on electricity bills is my daily job: preparing data from electricity bills is my work. I know that if I get this work it will be a long-term relationship. Thanks and regards, Mritunjay Sinha
$500 USD in 7 days
0.0 (0 reviews)
0.0

About the client

New York, United States
0.0
0
Member since Sep 14, 2021
