Topcoder Challenge | Topcoder Community

Challenge Overview

In a previous challenge, Topcoder members developed a Decline Curve analysis application. It works well in most situations, especially for oil production forecasting, but there are a few exceptions. In this challenge we’re hoping to improve the predictive ability of the existing application and handle some of those exceptions more gracefully. Our client would also like a few enhancements that make the application more flexible, including parameters that allow some variability in the forecasting. We also want to focus on the accuracy of our natural gas decline curves, which are more difficult to predict than the oil curves.

Here are the new requirements:

Improve the performance of the Oil Curves as suggested in our attached documentation. This can be found in the Code Document forum. An expert has identified several wells where the predictions are less than ideal. Please update the code to improve prediction performance in these (and similar) instances.
Improve the performance of the Gas Curves as suggested in our attached documentation. This can be found in the Code Document forum. An expert has identified several wells where the predictions could use some improvements. Please update the code to improve prediction performance in these (and similar) instances.
Improve outlier filtering: provide an option to filter outliers more aggressively on a rolling-time basis. Incorporate smart logic regarding downtimes/shut-ins (e.g., production dropping anomalously low and spiking high in quick succession is often indicative of operational issues and could be ignored).
Weighting of late-time data over early-time data. Incorporate a feature that allows the user to toggle weighting on data points and to specify which portions of the data to weight (e.g., the user can query a forecast that weights the last 20% of production data heavily, but can also query a forecast that keeps equal weighting on all data points).
Bullish vs. bearish forecasts: allow the user to query optimistic or pessimistic (conservative) forecasts. This logic will likely correspond with the degree of filtering in #3 above (outlier filtering). Logic may include filtering on outliers below the rolling median/average (which may result in optimistic forecasts) or filtering on outliers above the rolling median/average (which may result in conservative forecasts). These forecasts will not be evaluated in the RMSE calculations.
Remove the dependency on Water Usage and Casing Pressure from the input files. The revised input files will simply contain Well Number, Date, Daily Oil Production and Daily Gas Production columns. We’ll provide both versions of the files for now so you can execute the current codebase without issues. For final testing, however, we’ll only use the files without the Water Usage and Casing Pressure columns.
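
One possible reading of the outlier filtering, late-time weighting, and bullish/bearish requirements above, sketched in Python on the assumption that the existing codebase is Python. The function names, the `window`, `tolerance`, `mode`, `late_fraction`, and `late_weight` parameters, and their default values are all hypothetical; none of them come from the provided code.

```python
from statistics import median


def rolling_outlier_mask(rates, window=5, tolerance=0.5, mode="both"):
    """Return one boolean per point: True means the point is kept.

    Each point is compared with the median of a centered window of
    `window` points (shorter at the ends of the series). It is an outlier
    when it differs from that median by more than `tolerance`, expressed
    as a fraction of the median.

    mode="both" drops low and high outliers.
    mode="low"  drops only low outliers (may give an optimistic forecast).
    mode="high" drops only high outliers (may give a conservative forecast).
    """
    half = window // 2
    keep = []
    for i, rate in enumerate(rates):
        neighbours = rates[max(0, i - half): i + half + 1]
        reference = median(neighbours)
        too_low = rate < reference * (1 - tolerance)
        too_high = rate > reference * (1 + tolerance)
        if mode == "low":
            keep.append(not too_low)
        elif mode == "high":
            keep.append(not too_high)
        else:
            keep.append(not (too_low or too_high))
    return keep


def late_time_weights(n_points, late_fraction=0.2, late_weight=5.0):
    """Weight the last `late_fraction` of the points by `late_weight`.

    Setting late_weight=1.0 reproduces equal weighting on all points.
    """
    n_late = int(round(n_points * late_fraction))
    return [1.0] * (n_points - n_late) + [late_weight] * n_late


# A shut-in (5) followed immediately by a spike (300) in a declining series.
daily_oil = [100, 98, 96, 5, 300, 92, 90, 88, 86, 84]
neutral = rolling_outlier_mask(daily_oil, mode="both")
bullish = rolling_outlier_mask(daily_oil, mode="low")
bearish = rolling_outlier_mask(daily_oil, mode="high")
weights = late_time_weights(len(daily_oil))
```

The weights returned by `late_time_weights` are meant to be handed to whatever weighted curve fit the codebase uses; how they are applied depends on that fitting routine and is not shown here.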

Automate output file generation. Right now the application expects a test file with a series of well names and dates. Let’s add a command line parameter to the app which specifies the number of days for which output is required. The wells can be determined based on the input.
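
A minimal sketch of the command line parameter described above, again assuming a Python codebase. The flag names `--input` and `--days`, the function names, and the ISO `YYYY-MM-DD` date format are assumptions for illustration; only the `Well Number` and `Date` column names are taken from the requirements. The wells are discovered from the input rows, and each well gets the requested number of dates following its last observed date.

```python
import argparse
import csv
from datetime import date, timedelta


def forecast_dates(rows, days):
    """Map each well to the `days` calendar dates after its last observed date."""
    last_seen = {}
    for row in rows:
        well = row["Well Number"]
        day = date.fromisoformat(row["Date"])  # assumes ISO-formatted dates
        if well not in last_seen or day > last_seen[well]:
            last_seen[well] = day
    return {
        well: [last + timedelta(days=k) for k in range(1, days + 1)]
        for well, last in last_seen.items()
    }


def parse_args(argv=None):
    """Parse the hypothetical --input and --days flags."""
    parser = argparse.ArgumentParser(description="Decline curve forecast (sketch)")
    parser.add_argument("--input", required=True, help="CSV of production data")
    parser.add_argument("--days", type=int, required=True,
                        help="number of days for which output is required")
    return parser.parse_args(argv)


def main(argv=None):
    """Read the input file and print one (well, date) pair per forecast day."""
    args = parse_args(argv)
    with open(args.input, newline="") as handle:
        schedule = forecast_dates(csv.DictReader(handle), args.days)
    for well, days in schedule.items():
        for day in days:
            print(well, day.isoformat())
```

Calling `main(["--input", "production.csv", "--days", "30"])` would list 30 forecast dates per well; `production.csv` is a placeholder file name.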

Final Submission Guidelines

Please use the existing codebase provided in the forums as a starting point. You’re welcome to make any modifications necessary to improve predictive performance.
Please update documentation accordingly to explain in detail the changes you made to improve the algorithm.
1. In your documentation please thoroughly explain the additional parameters you’ve included and provide examples for their usage.
Jupyter Notebooks are welcome for explanation or exploration purposes but this should not be your only deliverable. The code will be deployed directly to a VM or Desktop environment without a local Jupyter server.
We’re providing a couple of supplementary code samples which performed better on the examples provided.
1. Submission 283025 does quite a bit better than 283032 in some of the oil prediction cases reviewed by our experts. This codebase will be included in the forum and available for your review. It should be noted that overall 283025 performs worse in terms of RMSE than 283032 on the test set provided, but it does handle certain declines more accurately.
2. Submission 282962 was our best in the previous match for forecasting gas decline curves. It had the lowest RMSE on the test set.

Scoring

In this challenge, we’ll be using the following scoring rubric. The 283032 codebase, which will be our baseline, generates RMSE and AIC calculations in the evaluate_results.csv file.

The submissions will be ranked in order of Oil RMSE. The lowest Oil RMSE averaged across all the test wells will receive the highest rank. Oil RMSE will be 10% of your score.
The submissions will be ranked in order of Gas RMSE. The lowest Gas RMSE averaged across all the test wells will receive the highest rank. Gas RMSE will be 10% of your score.
The submissions will be ranked in order of Oil AIC. The lowest Oil AIC will receive the highest rank. Oil AIC will be 10% of your score.
The submissions will be ranked in order of Gas AIC. The lowest Gas AIC will receive the highest rank. Gas AIC will be 10% of your score.
We’ll conduct a visual inspection of the test data by industry experts and we’ll rank the submissions accordingly.
15% of your score will be based on the visual inspection of the Oil Curves.
15% of your score will be based on the visual inspection of the Gas Curves.
30% of your score will be determined by the evaluation of your implementation of the new requirements above.
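
For reference, the two numeric metrics in the rubric can be sketched as follows. RMSE is standard; the AIC shown is the common least-squares form, n * ln(RSS / n) + 2k, which is an assumption here, since the baseline code that writes evaluate_results.csv may define AIC differently. The function names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between observed and forecast rates."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def aic_least_squares(actual, predicted, n_params):
    """AIC for a least-squares fit: n * ln(RSS / n) + 2 * n_params.

    Undefined when the fit is perfect (RSS == 0), because ln(0) diverges.
    """
    n = len(actual)
    rss = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return n * math.log(rss / n) + 2 * n_params


observed = [10.0, 8.0, 6.0, 4.0]
forecast = [9.0, 8.0, 7.0, 4.0]
oil_rmse = rmse(observed, forecast)
oil_aic = aic_least_squares(observed, forecast, n_params=3)
```

Lower is better for both values. In this form of AIC, each extra fitted parameter adds 2 to the score, so a more flexible curve has to reduce the residuals enough to pay for its added parameters.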

Decline Curve Analysis Enhancements Code Challenge

ID: 30088557