Challenge Overview

Challenge Objectives

 

In this challenge, we are developing the backend REST API used to view the results of the prediction app, which parses and analyzes LAS files for a client in the oil and gas exploration industry.

 

In short, in this challenge we will:

 

  • Develop the REST API based on the Swagger design and the web app prototype

 

Project Background

Topcoder has been working with a client in the oil and gas exploration industry to develop a tool that assigns a particular name (and some other data elements) to a log file. In previous challenges, our community developed an application that parses and analyzes these log files, which are in the LAS text format.

 

  • The Python app runs against the LAS files and stores the data in MongoDB

  • A Java-based REST API exposes that data for consumption by the web app

  • After this challenge, we will integrate the API with the web app prototype. 

 

The web app will help users view the status and results of the prediction app.

 

The overall system will work as follows:

 

“The client will upload a file to OneDrive, and a webhook will call an endpoint in our API. The API will then run the prediction app’s command (the prediction app lives on the same server). The prediction result and metadata will be stored in MongoDB and shown in the web app for convenience.”

 

Technology Stack

Java 1.8, Spring Boot 2.1.2.RELEASE, Python 3.7, MongoDB, OneDrive API

 

Provided Assets

Please check the forum to access the following:

  • Prediction app

  • Swagger definition

  • Web app prototype

 

Individual Requirements

  • Build the REST API based on the Swagger definition

  • Unit tests with more than 80% coverage are required

  • Check the prototype; the API should fully support the web app

  • Authentication and authorization are not required; we will handle them later.

  • If anything in the prediction app needs to be updated, please let us know and we will provide the updated code.

 

Implementation notes for each API

  1. GET /dashboardStats

  • totalLasFiles: total number of LAS files

  • avgUploadToday: total number of LAS files uploaded today

  • avgParsingToday: total number of LAS files processed today

  • avgMLAlgorithmData: average number of algorithm runs per day

  • avgUploadRatio: { success: percentage of LAS files uploaded successfully, failure: percentage of uploads that failed }

  • avgParsingRatio: { success: percentage of files parsed successfully, failure: percentage of files whose parsing failed } (see the DTO sketch below)
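To make the response shape concrete, here is a minimal Java sketch of the DTO. The field names are the ones listed above; the class names and types are assumptions, and the Swagger definition remains authoritative.

```java
// Minimal sketch of the /dashboardStats response, assuming plain Jackson
// serialization of public fields. Types are assumptions.
public class DashboardStats {

    /** Shared shape of avgUploadRatio and avgParsingRatio. */
    public static class Ratio {
        public double success; // percentage of successful items
        public double failure; // percentage of failed items
    }

    public long totalLasFiles;        // total number of LAS files
    public long avgUploadToday;       // LAS files uploaded today
    public long avgParsingToday;      // LAS files processed today
    public double avgMLAlgorithmData; // average algorithm runs per day
    public Ratio avgUploadRatio;
    public Ratio avgParsingRatio;
}
```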

 
  2. GET /las

  • This is a straightforward API: fetch the details from the collection and return them

  • Check the architecture’s readme.md for more detail on filters and pagination (a sketch follows below)
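As a starting point, a minimal Spring Boot sketch of the listing endpoint with Spring Data pagination. LasMeta, LasMetaRepository, and the document fields are assumptions; the filter contract from readme.md still needs to be layered on top.

```java
import org.springframework.data.annotation.Id;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.mongodb.core.mapping.Document;
import org.springframework.data.mongodb.repository.MongoRepository;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// Assumed document class mapped to las_meta_collection.
@Document(collection = "las_meta_collection")
class LasMeta {
    @Id
    public String id;
    public String fileName;
}

// Hypothetical Spring Data repository; findAll(Pageable) comes for free.
interface LasMetaRepository extends MongoRepository<LasMeta, String> {}

@RestController
@RequestMapping("/las")
public class LasController {

    private final LasMetaRepository repository;

    public LasController(LasMetaRepository repository) {
        this.repository = repository;
    }

    // Spring maps ?page=, ?size= and ?sort= onto the Pageable automatically.
    @GetMapping
    public Page<LasMeta> list(Pageable pageable) {
        // The filters described in readme.md would be applied here as criteria.
        return repository.findAll(pageable);
    }
}
```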

 
  3. POST /las/runMlAlgorithm

  • Find the file names in las_meta_collection for the given file id(s)

  • If a file is not found, ignore it and continue with the others

  • If no file is found, throw a 404

  • For files found in the DB, check whether the file exists in the input directory (which will be a local OneDrive folder) and remove all file details from the DB

  • Call the curve-predict command of the prediction app with the found files (a sketch follows below)
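A sketch of that flow, assuming MongoTemplate access, a fileName field in las_meta_collection, and configurable properties for the input directory and the prediction app’s CLI entry point. Which collections hold the "file details" to wipe is also an assumption; the exact curve-predict invocation is in the prediction app’s readme.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.bson.Document;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.stereotype.Service;

@Service
public class RunAlgorithmService {

    private final MongoTemplate mongo;

    @Value("${las.input-dir}") // assumed property: the local OneDrive folder
    private String inputDir;

    @Value("${prediction.command}") // assumed property: the prediction app's CLI entry point
    private String predictionCommand;

    public RunAlgorithmService(MongoTemplate mongo) {
        this.mongo = mongo;
    }

    public void run(List<String> fileIds) throws IOException {
        // Resolve ids to file names; unknown ids are silently skipped.
        List<String> inDb = new ArrayList<>();
        for (String id : fileIds) {
            Document meta = mongo.findById(id, Document.class, "las_meta_collection");
            if (meta != null) {
                inDb.add(meta.getString("fileName")); // assumed field name
            }
        }
        if (inDb.isEmpty()) {
            // The controller would translate this into an HTTP 404.
            throw new FileNotFoundException("none of the given file ids were found");
        }
        // Keep only files that actually exist in the input directory, then
        // wipe their existing details before re-running the prediction.
        List<String> runnable = new ArrayList<>();
        for (String name : inDb) {
            if (new File(inputDir, name).exists()) {
                runnable.add(name);
            }
        }
        mongo.remove(Query.query(Criteria.where("fileName").in(runnable)), "las_meta_collection");
        mongo.remove(Query.query(Criteria.where("fileName").in(runnable)), "predict_collection");
        // Invoke curve-predict on the same server; exact flags are in the app's readme.
        List<String> cmd = new ArrayList<>(Arrays.asList(predictionCommand, "curve-predict"));
        cmd.addAll(runnable);
        new ProcessBuilder(cmd).inheritIO().start();
    }
}
```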

  4. GET /las/:id

  • Get the details from the collection and return them

  5. DELETE /las/:id

  • Delete all information related to that LAS file

  • Don’t delete the LAS file from the input directory

  • Delete the LAS file from the output directory (a sketch follows below)
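The keep-input/delete-output rule in a small sketch; the collection and field names are assumptions.

```java
import java.io.File;

import org.bson.Document;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;

public class LasDeleteHelper {

    /** Removes every record tied to the file, plus its output copy only. */
    public static void delete(MongoTemplate mongo, String id, String outputDir) {
        Document meta = mongo.findById(id, Document.class, "las_meta_collection");
        if (meta == null) {
            return; // the controller would map this to a 404
        }
        String fileName = meta.getString("fileName"); // assumed field name
        // Delete all information related to this LAS file.
        mongo.remove(Query.query(Criteria.where("fileName").is(fileName)), "predict_collection");
        mongo.remove(Query.query(Criteria.where("_id").is(id)), "las_meta_collection");
        // Only the output copy goes; the input-directory file stays untouched.
        new File(outputDir, fileName).delete();
    }
}
```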

  6. PATCH /las/:id

  • Updates the lnam field in the predict_collection table

  • If confidence is zero and lnam is updated, the status should be changed to “Manually Filled”

  • If confidence is not zero, the status should be changed to “Manually Corrected” (a sketch follows below)
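The status rule in code form; the labels come from the spec, while the helper and its numeric confidence parameter are illustrative assumptions.

```java
public class PatchStatusRule {

    /** Status to set after the user edits lnam, per the rules above. */
    public static String statusAfterLnamUpdate(double confidence) {
        // Zero confidence means the model produced nothing, so the user filled it in.
        return confidence == 0 ? "Manually Filled" : "Manually Corrected";
    }
}
```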

  7. GET /las/:id/export

  • Only the LAS file is needed for now

  • No need to support other types of export

  • Export should be done from the output directory (which must be configurable)

  • The output directory is the same as --output_dir in the curve-predict command (a sketch follows below)
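One possible shape for the export endpoint, assuming a las.output-dir property (pointing at the same folder as --output_dir) and a DB lookup that is left as a placeholder here.

```java
import java.io.File;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.FileSystemResource;
import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class LasExportController {

    @Value("${las.output-dir}") // assumed property; same folder as --output_dir
    private String outputDir;

    @GetMapping("/las/{id}/export")
    public ResponseEntity<FileSystemResource> export(@PathVariable String id) {
        String fileName = lookupFileName(id); // hypothetical DB lookup by id
        File file = new File(outputDir, fileName);
        if (!file.exists()) {
            return ResponseEntity.notFound().build();
        }
        // Stream the LAS file as a download.
        return ResponseEntity.ok()
                .header(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=" + fileName)
                .contentType(MediaType.APPLICATION_OCTET_STREAM)
                .body(new FileSystemResource(file));
    }

    private String lookupFileName(String id) {
        // Placeholder: resolve the file name from las_meta_collection.
        throw new UnsupportedOperationException("DB lookup omitted in this sketch");
    }
}
```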

 
  8. GET /lookup/well - unique list of `las_meta_collection.supplemental_info.wellName`

  9. GET /lookup/serviceCompany - unique list of `predict_collection.serviceCompany`

  10. GET /lookup/operator - unique list of `las_meta_collection.supplemental_info.operator`
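All three lookups can share one implementation. A sketch using MongoTemplate.findDistinct (available in the Spring Data MongoDB version shipped with Spring Boot 2.1.2); the collection and field paths are the ones listed above.

```java
import java.util.List;

import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/lookup")
public class LookupController {

    private final MongoTemplate mongoTemplate;

    public LookupController(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    @GetMapping("/well")
    public List<String> wells() {
        return distinct("las_meta_collection", "supplemental_info.wellName");
    }

    @GetMapping("/serviceCompany")
    public List<String> serviceCompanies() {
        return distinct("predict_collection", "serviceCompany");
    }

    @GetMapping("/operator")
    public List<String> operators() {
        return distinct("las_meta_collection", "supplemental_info.operator");
    }

    // Runs a MongoDB distinct query over the given collection and field.
    private List<String> distinct(String collection, String field) {
        return mongoTemplate.findDistinct(new Query(), field, collection, String.class);
    }
}
```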

  11. GET /reports/allUploads

  • Calculate this report’s data from the upload history maintained by the webhook listener

  12. GET /reports/certaintyValue

  • Number of files categorized by confidence value and grouped by weekday

  • 80 and above is "Okay as it" (green)

  • Between 50 and 79 is "Needs mapped/add" (yellow)

  • Below 50 is "TBD" (red) (a sketch of the bucketing follows below)
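The bucketing as a small helper; the thresholds and labels are from the list above, while the method itself is an illustrative assumption.

```java
public class CertaintyBuckets {

    /** Maps a confidence value to the report bucket defined above. */
    public static String bucket(int confidence) {
        if (confidence >= 80) {
            return "Okay as it";       // green
        }
        if (confidence >= 50) {
            return "Needs mapped/add"; // yellow (50-79)
        }
        return "TBD";                  // red (below 50)
    }
}
```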

 
  1. GET /reports/mLAlgorithmPrediction

  • “Correctly Predicted”: The number las files whose status is not in “Manually Filled” or “Manually Corrected”

  • “Manually Filled” and “Manually Corrected” number of files with those statuses 

  • Group the data as weekday wise.
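Both this report and certaintyValue group counts by weekday; one simple way to do that with Java 8 streams, assuming each history record carries a timestamp.

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class WeekdayGrouping {

    /** Counts records per weekday from their timestamps. */
    public static Map<DayOfWeek, Long> countByWeekday(List<LocalDateTime> timestamps) {
        return timestamps.stream()
                .collect(Collectors.groupingBy(LocalDateTime::getDayOfWeek, Collectors.counting()));
    }
}
```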

  14. GET /reports/currentHourlyTrends and GET /reports/fileParsingTrend

  • Out of scope

 
  15. Webhook listener

  • Create an endpoint to subscribe to the OneDrive webhooks: https://docs.microsoft.com/en-us/onedrive/developer/rest-api/concepts/using-webhooks

  • Whenever a file is added to the OneDrive folder, we should receive a notification, and with that info we should be able to identify the file that was added/changed

  • If a file is added/edited, run the following commands of the prediction app in sequence (check the prediction app’s readme for the full options of each command):

  1. clean

  2. curve-predict

  • The above commands save their results in Mongo collections, which should be accessible by the web API

  • If the file is added for the first time, then before running the above commands we need to save the fileName and status in las_meta_collection. The status will be “Received”

  • The listener will also need to store the following data in separate collections (a sketch of the endpoint follows below):

    • Upload history data (so that we can compute avgUploadRatio, avgUploadToday, and the allUploads report)

    • Prediction algorithm run history (so that we can compute avgParsingToday, avgMLAlgorithmData, and avgParsingRatio)
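A minimal sketch of the listener endpoint, assuming the path /webhooks/onedrive. Per the linked OneDrive documentation, creating a subscription triggers a validation request whose validationToken must be echoed back as plain text; everything past that handshake (identifying the changed file, history bookkeeping, running clean and curve-predict) is only outlined in comments.

```java
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class OneDriveWebhookController {

    @PostMapping("/webhooks/onedrive")
    public ResponseEntity<String> receive(
            @RequestParam(value = "validationToken", required = false) String validationToken,
            @RequestBody(required = false) String notification) {

        // Subscription handshake: echo the token back as plain text.
        if (validationToken != null) {
            return ResponseEntity.ok().contentType(MediaType.TEXT_PLAIN).body(validationToken);
        }

        // A real handler would now: identify the added/changed file (the
        // notification body itself carries no file details), insert fileName +
        // status "Received" into las_meta_collection on first sight, append to
        // the upload-history collection, then run `clean` and `curve-predict`
        // and record the run in the run-history collection.
        return ResponseEntity.accepted().build();
    }
}
```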

 

Let’s discuss in the forum if anything is unclear.

 


Final Submission Guidelines

  1. API Source Code

  2. Updated Architecture document

  3. A guide on how to set up OneDrive and share its local copy on the network, so that other users on the same network can upload files to that folder as a shared drive

  4. A readme covering deployment of the database, the prediction app, and the REST API

  5. A verification guide showing how the individual requirements are met

  6. The winner needs to submit a pull request after the challenge ends
