"Please provide a High level approach on how you are going to do it with your Bid "
I have X num of data load jobs scheduled. Each job writes to a log file and jobs are interdependent i.e. they run either serial or parallel. I want a web page to show live Lineage and tracker system with following features -
should allow me to design dependency. I must be able to choose log files and decide the lineage which will show up in the report by having a "browse File" like functionality.
One log file will have one BOX denoting the load job.
Box should show the % of completion based on Log file [login to view URL] the load process completes, log file will be completed and the color of Box should turn green sating load completed 100%.
Once BOX 1 completes 100% it should start the same for other dependent job.
If the Log file shows any error, it should show in the corresponding Box.
Once entire workflow which consist X num of load job completes, it should make everything green and notify by email.
Design and UX can go to multiple round of review
must be Linux based. Log files are Hadoop/Hive logs
Must use Python/Django/flask