An experienced analyst is required to undertake a detailed analysis and comparison of 3 sources of similar data (each containing 6,500-7,000 records). All 3 sources report the same data with slight variations, and each has up to approximately 10 corresponding fields.
The scope of this work is as follows:
1) In source 1 and source 2, find duplicates and possible duplicates (records that are very similar but vary slightly in one field; i.e. you will need to concatenate fields or use a similar method to identify close matches).
2) Compare source 1 against source 2 and identify the records unique to each (two-way comparison).
3) Compare source 1 and 2 against source 3. Source 3 has been previously ‘cleaned’ but data is out of date.
a. Identify which records in source 1 and 2 are unique compared to source 3.
4) The results of these tasks need to be summarised in a clear presentation that describes the types of data variation found. Specific examples of each variation need to be easily identifiable, either within the data itself or in a separate Excel file (or similar).
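To illustrate the kind of approach expected for steps 1 to 3, a minimal Python sketch follows. It is not a prescribed method: the field names (`name`, `city`, `postcode`), the sample records and the 0.9 similarity threshold are all assumptions made for the example, and the real sources have up to about 10 fields.

```python
from difflib import SequenceMatcher

# Hypothetical field names, assumed for illustration only.
FIELDS = ("name", "city", "postcode")

def make_key(record):
    """Concatenate normalised field values into one comparison key."""
    return "|".join(str(record.get(f, "")).strip().lower() for f in FIELDS)

def find_duplicates(records, threshold=0.9):
    """Return (exact, near) lists of index pairs within a single source.

    exact: concatenated keys are identical.
    near:  key similarity is at least `threshold` (an assumed cut-off).
    Every pair is compared, so 7,000 records means tens of millions of
    comparisons; grouping records first (e.g. by postcode) would reduce this.
    """
    keys = [make_key(r) for r in records]
    exact, near = [], []
    for i in range(len(keys)):
        for j in range(i + 1, len(keys)):
            if keys[i] == keys[j]:
                exact.append((i, j))
            elif SequenceMatcher(None, keys[i], keys[j]).ratio() >= threshold:
                near.append((i, j))
    return exact, near

def two_way_unique(source_a, source_b):
    """Return (keys only in source_a, keys only in source_b)."""
    keys_a = {make_key(r) for r in source_a}
    keys_b = {make_key(r) for r in source_b}
    return keys_a - keys_b, keys_b - keys_a

# Made-up sample records standing in for source 1 and source 2.
source_1 = [
    {"name": "Acme Ltd", "city": "Leeds", "postcode": "LS1 4AP"},
    {"name": "Acme Ltd", "city": "Leeds", "postcode": "LS1 4AP"},   # exact duplicate
    {"name": "Acme Ltd.", "city": "Leeds", "postcode": "LS1 4AP"},  # small variation
    {"name": "Bolt & Co", "city": "York", "postcode": "YO1 7HH"},
]
source_2 = [
    {"name": "Acme Ltd", "city": "Leeds", "postcode": "LS1 4AP"},
    {"name": "Cog Works", "city": "Hull", "postcode": "HU1 1AA"},
]

exact_pairs, near_pairs = find_duplicates(source_1)
only_in_1, only_in_2 = two_way_unique(source_1, source_2)
```

The same `two_way_unique` call could serve step 3 by passing source 1 (or source 2) together with source 3 and keeping only the first returned set. Matching on exact keys alone would report "Acme Ltd." as unique to source 1, which is why the close-match pass in step 1 is worth running first.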
Excellent communication in English and the skills necessary to present findings in a format suitable for executive-level presentation are required. The task is not urgent, but it would be preferable for the outcomes to be delivered within a few days.
There is one additional requirement:
Source 3 contains a field that needs to be assigned to the corresponding records in source 1 and source 2 wherever there is an exact match.
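One possible way to carry out this assignment is sketched below in Python. The field to be copied (`region_code`), the fields that define an exact match (`name`, `postcode`) and the sample records are hypothetical; the brief does not specify them.

```python
def assign_field(cleaned, targets, match_fields, field):
    """Copy `field` from `cleaned` records onto `targets` on an exact match.

    A match means every value in `match_fields` is identical, with no
    normalisation applied. Unmatched target records receive None.
    """
    lookup = {}
    for rec in cleaned:
        key = tuple(rec[f] for f in match_fields)
        lookup.setdefault(key, rec[field])  # first occurrence in source 3 wins
    result = []
    for rec in targets:
        key = tuple(rec.get(f) for f in match_fields)
        updated = dict(rec)               # leave the input record unchanged
        updated[field] = lookup.get(key)  # None where there is no exact match
        result.append(updated)
    return result

# Made-up sample records; "region_code" is a hypothetical field name.
source_3 = [{"name": "Acme Ltd", "postcode": "LS1 4AP", "region_code": "N01"}]
source_1 = [
    {"name": "Acme Ltd", "postcode": "LS1 4AP"},
    {"name": "Acme Ltd.", "postcode": "LS1 4AP"},  # not an exact match
]

assigned = assign_field(source_3, source_1, ("name", "postcode"), "region_code")
```

Records left with `None` can then be listed separately, since they overlap with the records identified as unique in step 3.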