Challenge Overview
OVERVIEW
This challenge seeks to improve the skill of short-term streamflow forecasts (10 days). Solvers will develop and implement their methods for locations across the western United States, attempting to outperform state-of-practice streamflow forecasts.
There will be 12 monthly competitions over a year-long period, in which solvers provide real-time daily streamflow forecasts. Learn more about the year-long series here.
This is the February challenge.
INTRODUCTION
Streamflow forecasts are integral to water management. With higher-skill forecasts, water managers can likely operate facilities better during high flows, mitigate drought impacts, and achieve other improved outcomes (e.g., hydropower generation).
The growing capability of Artificial Intelligence (AI) / Machine Learning (ML) and High-Performance Computing (HPC) has begun to be applied toward generating streamflow forecasts. This has the potential to complement traditional physically based streamflow forecast methods via hybrid or purely data-driven approaches. This challenge aims to spur innovation in the field of streamflow forecasting, with emphasis on assessing the potential application of ML/AI for streamflow forecasting.
PRIZE STRUCTURE
The total sum of all prizes is $466,000.
Any bonus funds that are not awarded will be reallocated to later prize purses: 50% will go to the next quarter, and the other 50% will go towards the overall prize. Any funds not awarded during the overall bonus evaluation (when fewer than 10 competitors beat the RFC forecast score) will be redistributed to the winning positions.
TASK OVERVIEW
Your task is to produce a 10-day streamflow forecast at 6-hour intervals for the specified locations (i.e., 40 values per location). You must return the streamflow forecast in cubic feet per second. The first value should be for 18:00 on the 1st day, and the last value for 12:00 on the 11th day. All times are in UTC (also called GMT).
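The 40 forecast times can be enumerated as in the following minimal sketch. The function name forecast_timestamps is hypothetical, and the YYYY-MM-DDThh string format is borrowed from the Output File section.

```python
from datetime import datetime, timedelta
from typing import List


def forecast_timestamps(target_date: str) -> List[str]:
    """Return the 40 forecast times (UTC) for a target date given as YYYY-MM-DD."""
    # The first value is for 18:00 on the 1st day of the period.
    first = datetime.strptime(target_date, "%Y-%m-%d") + timedelta(hours=18)
    # 40 values at 6-hour steps end at 12:00 on the 11th day.
    return [(first + timedelta(hours=6 * i)).strftime("%Y-%m-%dT%H") for i in range(40)]


stamps = forecast_timestamps("2021-02-01")
print(stamps[0], stamps[-1], len(stamps))  # 2021-02-01T18 2021-02-11T12 40
```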
The quality of your algorithm will be judged by how closely the predicted streamflow matches the actual, measured values. See Scoring for details.
We will evaluate the solutions on live data.
SCHEDULE OF CHALLENGES
(Current challenge highlighted.)
Month Quarter
October 2020 Q1
November 2020 Q1
December 2020 Q1
January 2021 Q2
February 2021 Q2
March 2021 Q2
April 2021 Q3
May 2021 Q3
June 2021 Q3
July 2021 Q4
August 2021 Q4
September 2021 Q4
INPUT DATA
There is no official data set. We have prepared a list of data sources which we expect may be useful for the task. It is located here. You are also allowed to use any other data, provided it is freely available. You may also use data that may not be freely available provided: (1) they can be made available to Topcoder for scoring/validation and (2) they could be procured by the client for future use.
GROUND TRUTH DATA
In this challenge, the task is to provide the forecast for the following locations:
- LocationID, USGS Station Number, Description, Link to More Info and Data
- BNDN5, 08353000, Rio Puerco near Bernardo, NM, https://waterdata.usgs.gov/nwis/inventory?agency_code=USGS&site_no=08353000
- ARWN8, 06468250, James River above Arrowwood Lake near Kensal, ND, https://waterdata.usgs.gov/nwis/inventory?agency_code=USGS&site_no=06468250
- TCCC1, 11523200, Trinity River above Coffee Creek near Trinity Center, CA, https://waterdata.usgs.gov/ca/nwis/uv/?site_no=11523200
- CARO2, 07301500, North Fork Red River near Carter, OK, https://waterdata.usgs.gov/nwis/inventory?agency_code=USGS&site_no=07301500
- ESSC2, 06733000, Big Thompson River above Lake Estes, CO, https://dwr.state.co.us/surfacewater/data/detail_tabular.aspx?ID=BTABESCO&MTYPE=DISCHRG
- NFDC1, 11427000, North Fork American River at North Fork Dam, CA, https://waterdata.usgs.gov/nwis/inventory?agency_code=USGS&site_no=11427000
- LABW4, 09209400, Green River near La Barge, WY, https://waterdata.usgs.gov/wy/nwis/wys_rpt/?site_no=09209400&agency_cd=USGS
- CLNK1, 06847900, Prairie Dog Creek above Keith Sebilius Lake, KS, https://waterdata.usgs.gov/nwis/inventory?agency_code=USGS&site_no=06847900
- TRAC2, 09107000, Taylor River above Taylor Park, CO, https://waterdata.usgs.gov/nwis/inventory?agency_code=USGS&site_no=09107000
- NFSW4, 06279940, North Fork Shoshone River at Wapiti, WY, https://waterdata.usgs.gov/nwis/inventory?agency_code=USGS&site_no=06279940
We will use the following sources to generate ground truth for scoring:
https://nwis.waterservices.usgs.gov/ - all locations except for ESSC2
https://dwr.state.co.us/ - location ESSC2
These websites offer various services which you may find useful for generating your predictions. They contain real-time and historic streamflow values for locations across the United States.
See, e. g., this info about the sites, generated by one of the services available on the first above mentioned website.
We provide the tool to download ground truth values for the locations which will be used in this challenge.
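As an illustration only (this is not the provided download tool), a request URL for observed discharge at one of the USGS stations might be built as follows. The sketch assumes the USGS Instantaneous Values endpoint /nwis/iv/ with the query parameters format, sites, parameterCd, startDT and endDT, and the parameter code 00060 for discharge in cubic feet per second; none of these details come from this document, so check them against the USGS service documentation. The function name build_iv_url is hypothetical. ESSC2 is served by the Colorado DWR site and is not covered here.

```python
from urllib.parse import urlencode

# Assumption: path of the USGS Instantaneous Values service on the host named above.
BASE_URL = "https://nwis.waterservices.usgs.gov/nwis/iv/"


def build_iv_url(site_no: str, start: str, end: str) -> str:
    """Build a query URL for observed discharge at one USGS station."""
    query = urlencode({
        "format": "json",
        "sites": site_no,          # USGS station number, e.g. 08353000 for BNDN5
        "parameterCd": "00060",    # assumed code for discharge in cubic feet per second
        "startDT": start,          # YYYY-MM-DD
        "endDT": end,              # YYYY-MM-DD
    })
    return BASE_URL + "?" + query


url = build_iv_url("08353000", "2021-02-01", "2021-02-11")
print(url)
```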
OUTPUT FILE
In this contest you must submit a docker container which may be used to generate the forecast. The output must be created as a single CSV file. The file must be formatted as follows:
DateTime,LocationID,ForecastTime,VendorID,Value,Units
where
- DateTime is date and time for which we predict the streamflow, in YYYY-MM-DDThh format, e. g., 2021-02-01T18,
- LocationID is the location identifier,
- ForecastTime is date and time of the beginning of the 10 day period for which the prediction is being made, in the same format as the first column,
- VendorID is TC+ followed by your Topcoder handle, e. g., if your handle is Alice, you should set this to TC+Alice,
- Value is the streamflow forecast for the given location, date and time, in cubic feet per second,
- Units is CFS.
Your solution file should contain the above header line (optional), plus exactly 400 lines (mandatory): 40 lines for each of the 10 target locations.
Sample lines:
DateTime,LocationID,ForecastTime,VendorID,Value,Units
2021-02-01T18,NFSW4,2021-02-01T00,TC+your_TC_handle,8710,CFS
2021-02-02T00,NFSW4,2021-02-01T00,TC+your_TC_handle,9120,CFS
2021-02-02T06,NFSW4,2021-02-01T00,TC+your_TC_handle,9270,CFS
2021-02-02T12,NFSW4,2021-02-01T00,TC+your_TC_handle,9470,CFS
2021-02-02T18,NFSW4,2021-02-01T00,TC+your_TC_handle,9790,CFS
. . .
We provide a tool which can check your output file for format errors.
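The format rules above can also be checked locally with a few lines of Python. The following is a minimal sketch and not the official checking tool: the function name check_forecast_csv and the wording of its messages are hypothetical, and it covers only the rules stated in this section (optional header, 40 data lines per location, six columns, timestamp format, TC+ prefix, numeric value, CFS units).

```python
import csv
import io
import re

HEADER = ["DateTime", "LocationID", "ForecastTime", "VendorID", "Value", "Units"]
STAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}")  # YYYY-MM-DDThh


def check_forecast_csv(text, locations):
    """Return a list of format problems found in the CSV text (empty if none)."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if rows and rows[0] == HEADER:  # the header line is optional
        rows = rows[1:]
    problems = []
    expected = 40 * len(locations)
    if len(rows) != expected:
        problems.append(f"expected {expected} data lines, found {len(rows)}")
    for number, row in enumerate(rows, start=1):
        if len(row) != 6:
            problems.append(f"data line {number}: expected 6 columns, found {len(row)}")
            continue
        date_time, location, forecast_time, vendor, value, units = row
        if not (STAMP.fullmatch(date_time) and STAMP.fullmatch(forecast_time)):
            problems.append(f"data line {number}: timestamps must use YYYY-MM-DDThh")
        if location not in locations:
            problems.append(f"data line {number}: unknown location {location}")
        if not vendor.startswith("TC+"):
            problems.append(f"data line {number}: VendorID must start with TC+")
        try:
            float(value)
        except ValueError:
            problems.append(f"data line {number}: Value is not a number")
        if units != "CFS":
            problems.append(f"data line {number}: Units must be CFS")
    return problems


sample = (
    "DateTime,LocationID,ForecastTime,VendorID,Value,Units\n"
    "2021-02-01T18,NFSW4,2021-02-01T00,TC+your_TC_handle,8710,CFS\n"
    "2021-02-02T00,NFSW4,2021-02-01T00,TC+your_TC_handle,9120,CFS\n"
)
problems = check_forecast_csv(sample, {"NFSW4"})
print(problems)  # ['expected 40 data lines, found 2']
```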
SUBMISSION FORMAT
This match uses the "submit code" submission style. The required format of the submission package is specified in a submission template document. This document gives only requirements that are either additional to or override the requirements listed in the template.
Sample submission 1 is available as a Python package here. The same in C++ is here.
The train.sh script has no input arguments. Note that train.sh will NOT be used during the testing phase. We shall use it only in the validation phase and only for the submissions winning prizes. You are not obliged to provide train.sh in your submission. We will ask the winners to provide it with the documentation (see the “Final Prizes” section at the end of this document).
The test.sh script has this signature (which is different from the one in the template!):
test.sh <target-date> <output-file>
<target-date> is the date of the first day of the 10-day period for which the forecast is being calculated, in the YYYY-MM-DD format.
<output-file> is a file where your code must save its forecast in the format specified in the "Output file" section.
A sample call:
./test.sh 2021-02-01 /workdir/forecast.csv
This should generate the file with the forecast for the period from 2021-02-01T18:00 to 2021-02-11T12:00.
The container working directory must be /work (put the command WORKDIR /work in your Dockerfile). The external writable folder attached to your container will be named /workdir and will be empty.
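One way to honor this signature is to have test.sh pass its two arguments straight to a Python entry point. The sketch below is an assumption-laden illustration, not the sample submission: the main and predict functions and the --handle option are hypothetical, and predict returns a placeholder 0.0 that a real model must replace.

```python
import argparse
import csv
import os
import tempfile
from datetime import datetime, timedelta

LOCATIONS = ["BNDN5", "ARWN8", "TCCC1", "CARO2", "ESSC2",
             "NFDC1", "LABW4", "CLNK1", "TRAC2", "NFSW4"]


def predict(location, when):
    """Hypothetical placeholder model: replace with a real forecast in CFS."""
    return 0.0


def main(argv=None):
    parser = argparse.ArgumentParser(description="Write a 10-day streamflow forecast.")
    parser.add_argument("target_date", help="first day of the period, YYYY-MM-DD")
    parser.add_argument("output_file", help="path of the CSV file to write")
    parser.add_argument("--handle", default="your_TC_handle",
                        help="Topcoder handle (hypothetical option)")
    args = parser.parse_args(argv)

    day = datetime.strptime(args.target_date, "%Y-%m-%d")
    forecast_time = day.strftime("%Y-%m-%dT%H")  # e.g. 2021-02-01T00, as in the sample lines
    with open(args.output_file, "w", newline="") as out_file:
        writer = csv.writer(out_file, lineterminator="\n")
        writer.writerow(["DateTime", "LocationID", "ForecastTime", "VendorID", "Value", "Units"])
        for location in LOCATIONS:
            for step in range(40):  # 18:00 on day 1 through 12:00 on day 11
                when = day + timedelta(hours=18 + 6 * step)
                writer.writerow([when.strftime("%Y-%m-%dT%H"), location, forecast_time,
                                 "TC+" + args.handle, predict(location, when), "CFS"])


# Demonstration: write a forecast file into a temporary folder.
out = os.path.join(tempfile.mkdtemp(), "forecast.csv")
main(["2021-02-01", out])
```

In a container, test.sh would call the script with "$1" and "$2" so that the output lands in the path given by the test harness, for example /workdir/forecast.csv.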
Code requirements and suggestions
- The allowed time limit for the test.sh script is 30 minutes during the testing and 2 hours during the validation phase.
- The allowed time limit for the train.sh script is 7 days.
- Hardware specification. Your docker image will be built and your test.sh script will be run on a Linux AWS M4 General Purpose instance with at least 16 GB RAM, 4 vCPUs and 500 GB volume size. If you place in a prize-winning position, the validation of your train.sh script (and test.sh afterwards) will be done on an m4.10xlarge or p3.2xlarge Linux AWS instance, based on your preference. Please see here for the details of these instance types.
- We understand that the instance type and the allowed running time (30 minutes) may be very limiting. Also, dockerizing your solution may be a difficult task which may be worth doing only if you win a prize. Therefore, if you are able to calculate the forecasts on your own system, but fail to set it up for our system, you can consider doing the following:
  - Calculate the forecast on your system and upload it to a location where it can be downloaded (Dropbox, Google Drive, ...).
  - Set up your test.sh script so that it downloads the precalculated forecast.
- Sample submission 2, which uses this approach, is available here. If you do it this way, you must make sure that your precalculated forecast is available at the time when we call your test.sh script. See “Testing” section below for more details on this. Also note that if you win a prize, you will have to provide the dockerized version of your solution with an updated test.sh script which actually calculates the forecast.
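The download step of this approach could look like the following minimal sketch (it is not sample submission 2). The function name fetch_precalculated is hypothetical; a real test.sh would pass the public URL of the uploaded forecast, while the demonstration below uses a local file:// URL so that it runs without network access.

```python
import pathlib
import shutil
import tempfile
import urllib.request


def fetch_precalculated(url, output_file, timeout=60):
    """Copy the precalculated forecast stored at url to output_file."""
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(output_file, "wb") as out_file:
        shutil.copyfileobj(response, out_file)


# Demonstration with a local file standing in for the uploaded forecast.
folder = pathlib.Path(tempfile.mkdtemp())
source = folder / "uploaded.csv"
source.write_text("DateTime,LocationID,ForecastTime,VendorID,Value,Units\n")
target = folder / "forecast.csv"
fetch_precalculated(source.as_uri(), str(target))
copied = target.read_text()
```

A short timeout is worth keeping, since the final daily execution allows only 5 minutes of running time.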
TESTING
The submission phase ends at the end of the target month, but you should submit your solution at the beginning of the target month, so that it can be executed on each day of the month. You can update your solution during the month.
After you submit your solution, we first build a docker image from it. You should submit before 08:00 UTC on the day when your solution will be tested for the first time; otherwise we do not guarantee that it will be executed on that day. We will not rebuild the image every day; we will rebuild it only if you submit another solution. The build command must finish in less than 30 minutes.
We will always execute the code for a target date on that same date. For each execution, we will create a docker container, which will be destroyed after it outputs the forecast. Each day, we will start executing the solutions at 08:00 UTC (or later) and finish before 16:00 UTC. The final ranking will be based on the target dates from 2021-02-01 (executed on 2021-02-01) to 2021-02-28 (executed on 2021-02-28). We will also execute the solutions for target dates prior to the target month to test that the system is working properly, but these executions will not affect the final ranking.
Running different contest solutions at different times may favor solutions executed later (with access to more recent data). Therefore:
- The solutions will be run each day in a random order.
- The 8-hour window from 08:00 UTC to 16:00 UTC is just the worst-case estimate for running all the solutions. If we are confident that all the solutions will finish in a much shorter period of time, we will start the executions later.
- We will run the solutions more than once.
- The last execution will be between 15:00 UTC and 16:00 UTC with fewer resources (1 vCPU, 2 GB RAM, 5 minutes running time). If your solution only downloads the precalculated forecast, it should pass this execution.
- Each resulting output file will be validated against the correct output format (400 data lines, correct columns, ...). We will use the last successful (validated) execution for scoring. If your solution fails to produce a correct output file within the given time limit, you will be considered to have missed the forecast for that date (see the Scoring section below).
During testing, we will only run your test.sh script. If you place in a prize-winning position after the tests, we will run your train.sh followed by your test.sh with new target dates, to verify that your system can provide the forecast for any date in the future.
SCORING
During scoring, your solution CSV files (generated by your docker container) will be matched against expected ground truth data using the following algorithm.
If your test.sh script doesn't generate the expected output file (e.g., because it terminates with an error) or your output file is invalid, your output file will be replaced with the default forecast, which predicts for each location a single constant value: the long-term average over the 10-day period from the years 1995 - 2019. To remain eligible for the respective prizes, you may miss a forecast (including cases when your solution fails) for at most 3 target dates in a month, 9 target dates in a quarter and 36 target dates overall.
It may happen that ground truth is not known for a specific location and time (e. g., it may be impossible to measure the streamflow when the stream partially freezes). If at least 20% of the values out of the 40 values (that is, at least 8 values) for a given location and target date are unknown, the location is excluded from scoring at this target date. Note that only values with timestamps 00:00, 06:00, 12:00 and 18:00 are relevant for the scoring purposes; observed values with different timestamps do not affect the score.
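The exclusion rule can be written as a short check. This is a sketch under assumptions: the function name is_excluded is hypothetical, and unknown observations are assumed to be encoded as None or NaN, which this document does not specify.

```python
import math


def is_excluded(actual):
    """Return True if a location is excluded from scoring at a target date.

    actual holds the 40 observed values for the 00:00, 06:00, 12:00 and 18:00
    timestamps; unknown values are assumed to be None or NaN.
    """
    unknown = sum(1 for value in actual if value is None or math.isnan(value))
    # "At least 20%" of the values, written with integers: unknown / len >= 1/5.
    return 5 * unknown >= len(actual)


seven_missing = [None] * 7 + [100.0] * 33
eight_missing = [float("nan")] * 8 + [100.0] * 32
print(is_excluded(seven_missing), is_excluded(eight_missing))  # False True
```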
Otherwise your score for a single target location and a single target date is calculated as
NRNSE_W = (NRNSE_1_10(actual, predicted)+NRNSE_6_10(actual, predicted))/2,
where actual is the true observed value (streamflow) at each location and time, predicted is your algorithm's output at the corresponding location and time, and NRNSE_1_10 and NRNSE_6_10 are the Normalized Regularized Nash–Sutcliffe model efficiency coefficients calculated over all 10 days and over the last 5 days of the prediction range, respectively. To calculate NRNSE (for a given range), we first calculate the Regularized Nash–Sutcliffe model efficiency coefficient, RNSE, as
RNSE = 1 - MSE(actual, predicted) / max(Var(actual), EPS).
Here, MSE is the mean squared error, Var is the variance and EPS = 0.01 is introduced to handle the cases when the observed streamflow is constant or almost constant. Then
NRNSE = 1 / (2 - RNSE).
The value NRNSE_W_Avg is calculated as the average of NRNSE_W values over all locations and all target dates in the given evaluation period (month, quarter or year).
Finally, for display purposes your score is mapped to the [0...100] range (100 being the best), as:
score = 100 * NRNSE_W_Avg
The scoring script is available here.
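For orientation, the formulas above can be transcribed into a few lines of Python. This is a minimal sketch and not the official scoring script: it assumes that Var is the population variance of the actual values within the evaluated range, that the last 5 days correspond to the last 20 of the 40 values, and that no values are missing. The function names are hypothetical.

```python
EPS = 0.01


def mean(values):
    return sum(values) / len(values)


def nrnse(actual, predicted):
    """Normalized Regularized Nash-Sutcliffe efficiency for one range of values."""
    mu = mean(actual)
    var = mean([(a - mu) ** 2 for a in actual])  # population variance (assumption)
    mse = mean([(a - p) ** 2 for a, p in zip(actual, predicted)])
    rnse = 1 - mse / max(var, EPS)
    return 1 / (2 - rnse)


def nrnse_w(actual, predicted):
    """Average of NRNSE over all 10 days and over the last 5 days (last 20 values)."""
    return (nrnse(actual, predicted) + nrnse(actual[-20:], predicted[-20:])) / 2


actual = [100.0 + i for i in range(40)]
predicted = [a + 1.0 for a in actual]
perfect = nrnse_w(actual, actual)
score = 100 * nrnse_w(actual, predicted)  # display score for a single location and date
print(perfect)  # 1.0
```

Because RNSE never exceeds 1, the denominator 2 - RNSE is always at least 1, so NRNSE stays in the (0, 1] range; a forecast equal to the mean of the observations gives RNSE = 0 and NRNSE = 0.5.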
LEADERBOARDS
During the submission phase, the submission tab contains daily updated provisional scores for the current month, based on incoming live ground truth data. We also provide more detailed info on the scores. It is accessible in this shared folder. There, you will find these files:
- processed_dates_and_sites.csv: info on which dates have been processed so far and when the scores were last updated
- score_details.csv: individual location/site scores
- leaderboard_{period}.csv: provisional leaderboard of the {period} (month, quarter, year) which includes number of missed days for each participant and NRNSE averages.
There is also locations_rankings subfolder, which contains rankings for individual locations. These are only informal rankings.
The following benchmarks are represented by MarathonTester{1..5} handles:
- MarathonTester1: long-term average forecast; if you miss a forecast, you will receive for that test case the same score as this benchmark
- MarathonTester2: RFC forecast; you need to beat it to be eligible for quarterly/overall bonus prizes
- MarathonTester3: NWM forecast (informal benchmark)
- MarathonTester4: NCAR forecast (informal benchmark)
- MarathonTester5: UT forecast (informal benchmark)
These handles, as well as KHGMEC_{raw, bc, lstm}, are not competing for prizes.
The provisional scores are based on ground truth data downloaded at an unspecified time, which may differ for each target date. Note that ground truth data may be subject to revision, so it may change even several days after it is published. After each update, the ground truth for the preceding target dates may or may not be redownloaded.
Final scores for each month will be based on ground truth data downloaded 20 days after the end of the month. E.g., for February, the last submission window ends on March 10th; we will wait another 10 days, redownload all the ground truth data on March 20th and recalculate all the scores. We will announce the monthly winners a few days later.
GENERAL NOTES
- This match is not rated.
- Teaming is allowed. Topcoder members are permitted to form teams for this competition. If you want to compete as a team, please complete a teaming form. After forming a team, Topcoder members of the same team are permitted to collaborate with other members of their team. To form a team, a Topcoder member may recruit other Topcoder members, and register the team by completing this Topcoder Teaming Form. Each team must declare a Captain. All participants in a team must be registered Topcoder members in good standing. All participants in a team must individually register for this Competition and accept its Terms and Conditions prior to joining the team. Team Captains must apportion prize distribution percentages for each teammate on the Teaming Form. The sum of all prize portions must equal 100%. The minimum permitted size of a team is 1 member, with no upper limit. However, our teaming form only allows up to 10 members in a team. If you have more than 10 members in your team, please email us directly at support@topcoder.com and we will register your team. Only team Captains may submit a solution to the Competition. Notwithstanding Topcoder rules and conditions to the contrary, solutions submitted by any Topcoder member who is a member of a team on this challenge but is not the Captain of the team are not permitted, are ineligible for award, may be deleted, and may be grounds for dismissal of the entire team from the challenge. The deadline for forming teams is February 11th, 2021. Topcoder will prepare a Teaming Agreement for each team that has completed the Topcoder Teaming Form, and distribute it to each member of the team. Teaming Agreements must be electronically signed by each team member to be considered valid. All Teaming Agreements are void, unless electronically signed by all team members by February 18th. Any Teaming Agreement received after this period is void. Teaming Agreements may not be changed in any way after signature.
- Relinquish - Topcoder is allowing registered competitors or teams to "relinquish". Relinquishing means the member or team will compete, and we will score their solution, but they will not be eligible for a prize. Once a competitor or team relinquishes, we post their name to a forum thread labeled "Relinquished Competitors".
- Use the match forum to ask general questions or report problems, but please do not post comments and questions that reveal information about the problem itself or possible solution techniques.
- In this match you may use any programming language and libraries, including commercial solutions, provided Topcoder is able to run it free of any charge. You may also use open source languages and libraries, with the restrictions listed in the next section below. If your solution requires licenses, you must have these licenses and be able to legally install them in a testing VM (see “Final Prizes” section). Submissions will be deleted/destroyed after they are confirmed. Topcoder will not purchase licenses to run your code. Prior to submission, please make absolutely sure your submission can be run by Topcoder free of cost, and with all necessary licenses pre-installed in your solution. Topcoder is not required to contact submitters for additional instructions if the code does not run. If we are unable to run your solution due to license problems, including any requirement to download a license, your submission might be rejected. Be sure to contact us right away if you have concerns about this requirement.
- You may use open source languages and libraries provided they are equally free for your use, use by another competitor, or use by the client. If your solution includes licensed elements (software, data, programming language, etc) make sure that all such elements are covered by licenses that explicitly allow commercial use.
- If your solution includes licensed software (e.g. commercial software, open source software, etc), you must include the full license agreements with your submission. Include your licenses in a folder labeled “Licenses”. Within the same folder, include a text file labeled “README” that explains the purpose of each licensed software package as it is used in your solution.
FINAL PRIZES
In order to receive a final prize, you must do all of the following:
- Achieve a score in the top 10 according to the final test results. See the "Scoring" section above.
- In case of quarterly and overall prizes, your score must be higher than the score calculated over RFC forecasts.
- Once the final scores are posted and winners are announced, the prize winner candidates have 7 days to submit:
  - A report outlining their final algorithm, explaining the logic behind it and the steps of its approach, including the documentation on how to run the code. You will receive a template that helps you create your final report.
  - An updated dockerized version of their algorithm with working train.sh and test.sh scripts, along with any assets/materials necessary to deploy, use and train it.
- If you place in a prize-winning rank but fail to do any of the above, then you will not receive a prize, and it will be awarded to the contestant with the next best performance who did all of the above.