Register
Submit a solution
The challenge is finished.

Challenge Overview

Topcoder has a client that needs to convert a set of public data from a MySQL relational database to a SQL Server backend.  In addition, to the raw data conversion there are some data generation requirements that need to take place.

  1. The source schema and data loading script can be found in the Code Document forums as a MySQL export in the file MudLogSourceMySQL.sql

  2. The target schema is found in the MudLogOCR.ddl file also in the Code Document forums.

  3. The migration of the source data to target schema is straightforward for the following tables:

    1. IMAGE_OCR

    2. IMAGE_OCR_DEPTH

    3. IMAGE_OCR_PHRASE

  4. You should populate the IMAGE_OCR.UWI field with the 14 digit number from the filename.  For example, if the IMAGE_OCR record has a name field labelled:  43007100260000_sample.tif. The UWI for this record would be ‘43007100260000’. Please make sure these numbers are converted as strings since some may contain leading zeroes.

  5. Your script should populate the IMAGE_OCR_DEPTH_SEC table.  This isn’t difficult. Each image in our MySQL database only has one depth section which is the information that this table is capturing.  Simple interrogate the MIN Y and MAX Y values from the image from the IMAGE_OCR_DEPTH records using the smallest and the largest Y coordinates. The key both tables in the FILE URL.

  6. The data in the following tables need to be synthesized:

    1. uwi_lat_long -- Simply generate random float values which correspond to latitude and longitude reading within the continental US.  For extra credit, you can put all these wells in state of Utah where they actually exists (but really this isn’t necessary). This data is going to support an app which will display a map of the United States so they’ll need to render on that map.

    2. Well_info -- Generate random information for WELL_NAME, COUNTY_NAME, FIELD_NAME, OPERATOR_NAME, and SPUD_DATE.  For COUNTRY_NAME, you can input “UNITED STATES” and for BASIN_NAME you can list “UINTA BASIN” for all records. Extra credit here if you want to find a list of oil fields in UTAH for FIELD_NAME and/or counties in US State of Utah for COUNTY_NAME.   Please base all the names on actual English names and words -- hashed values will be disconcerting.

 
 


Final Submission Guidelines

You can accomplish this data migration and generation task in a variety of ways depending on your technology of choice.  Python 3 or Java are good options or you could make the transformations in SQL itself.  The end goal is to have a simple repeatable script/process for populating a new SQL Server database with the provided data plus the data you’ve generated.

Review style

Final Review

Community Review Board

Approval

User Sign-Off

ID: 30101621