Challenge Overview
Problem Statement
Prize Distribution
1st place - $2,500
2nd place - $2,000
3rd place - $1,500
4th place - $1,000

Background
The client is looking to run a contest in order to better understand how trading volume affects the market prices of traded securities. Contestants will use supplied trade data to create an algorithm that will attempt to predict swap prices.

The 2010 Dodd-Frank Wall Street Reform and Consumer Protection Act (the Dodd-Frank Act) created new entities called swap data repositories (SDRs) "in order to provide a central facility for swap data reporting and collecting. Under the Dodd-Frank Act, all swaps, whether cleared or uncleared, are required to be reported to registered SDRs." As of January 2013, all registered swap dealers active in credit and interest rate trading send trade data to the public swap repository. Depending on their size and type (e.g., block trades), swap transactions must be reported within 5 to 15 minutes of execution. These developments have increased the availability of swap trade data. An extract of this data for a specified time period is supplied for this challenge.

Objective
Supply and demand in the swap market affect swap prices. Swap prices are also influenced by tenor. Tenor is the maturity of the swap measured in full years, such as 2, 3, 5, 7, 10, and 30. We are interested in using the volume of vanilla US$ / Libor spot start swap transactions of the full-year maturities to predict the prices of those same instruments over relatively short time intervals. The scoring will focus on the full-year tenors in PriceData. In SwapData, you may receive some irregular tenors such as *Y*M.

Data Description
Price Data:
Implementation
The evaluation will run in streaming mode. That is, predictions are made whenever you receive new data, and each prediction will be compared to the mid prices a short period (e.g., 5 to 10 minutes) after the latest data you have. The data will be sent in strictly chronological order. Your task is to implement two methods, update and predict, whose signatures are detailed in the Definition section below. Both methods will be called several times.

In update, you will receive new data with timestamps. More specifically, you will receive two lists of comma-separated strings (values enclosed in quotes). The columns are in the same order as in the data description.

In predict, you should return a list of predictions in the same order as the received test data. The test data has a format similar to Price Data, but without the "ABC mid" and "DEF mid" columns. Each prediction is a string containing two values separated by a comma: the predicted ABC mid and DEF mid of the specified tenor at the specified time. For example, "1.002,2.000" (without quotes) could be a prediction.

Scoring
Submissions will be scored by running the solution against different data from different time periods. Before the first call of predict, at least 2 hours of data will be given to make sure you have a reasonable volume of data to build your model. Test cases are generated as follows:
In every test case, the raw error is calculated as

rawErr = 0
for i = 1 to N do
    rawErr += (ABCTruth[i] - ABCPred[i])^2 + (DEFTruth[i] - DEFPred[i])^2

where N is the total number of predictions.

As a naive solution, we will use the average price of all previously seen data of the same tenor as the baseline. For example, to predict the ABC mid for 3Y, all 3Y ABC mid values seen so far will be averaged to give ABCPred. If no such data has been seen, the prediction will be 0. The raw error of this method serves as baseErr. The raw score will be

raw score = max(0, 1 - rawErr / baseErr)

The final score of each test case will be the raw score multiplied by 1000000.0. The score shown on the standings will be the average score over the test cases.

Requirements to Win A Prize
In order to receive a prize, you must do all of the following:
Report
Your report must be at least 2 pages long, contain at least the following sections, and use the section and bullet names below.

Your Information
This section must contain at least the following:

Approach Used
Please describe your algorithm so that we know what you did even before seeing your code. Use line references to refer to specific portions of your code. This section must contain at least the following:

If you place in the top 4 but fail to do any of the above, then you will not receive a prize, and it will be awarded to the contestant with the next best performance who did all of the above.

Additional Information
Definition
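As a rough illustration of the two methods named in the Implementation section, the following Python sketch implements the naive running-average baseline described under Scoring. It is not the official interface: the class name `RunningMeanBaseline`, the parameter names, the lack of a return value from update, and above all the column positions (`TENOR_COL`, `ABC_COL`, `DEF_COL`) are assumptions made for the example. The real column order follows the data description.

```python
import csv
from collections import defaultdict

# Assumed column positions, for illustration only.
TENOR_COL, ABC_COL, DEF_COL = 1, 2, 3


def parse_row(line):
    """Split one comma-separated line whose values are enclosed in quotes."""
    return next(csv.reader([line]))


class RunningMeanBaseline:
    """Predicts each tenor's mids as the mean of all values seen so far."""

    def __init__(self):
        self._sums = defaultdict(lambda: [0.0, 0.0])  # tenor -> [ABC sum, DEF sum]
        self._counts = defaultdict(int)               # tenor -> rows seen

    def update(self, price_rows, swap_rows):
        # This baseline ignores swap_rows; a real model would use the volumes.
        for line in price_rows:
            fields = parse_row(line)
            tenor = fields[TENOR_COL]
            self._sums[tenor][0] += float(fields[ABC_COL])
            self._sums[tenor][1] += float(fields[DEF_COL])
            self._counts[tenor] += 1

    def predict(self, test_rows):
        predictions = []
        for line in test_rows:
            tenor = parse_row(line)[TENOR_COL]
            count = self._counts.get(tenor, 0)
            if count == 0:
                # No data seen for this tenor: the baseline predicts 0.
                predictions.append("0.0,0.0")
            else:
                abc_sum, def_sum = self._sums[tenor]
                predictions.append(f"{abc_sum / count},{def_sum / count}")
        return predictions
```

Keeping running sums and counts rather than the full history means each update and predict call costs time proportional to the rows it receives, which suits the streaming evaluation.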
Notes
- This match is rated.
- The allowed programming languages are Java, Python, C++, C# and VB.
- You can include open source code in your submission, provided it is free for you to use and would be for the client as well. Code from open source libraries under the BSD or MIT licenses will be accepted. Other open source licenses may be accepted too, just be sure to ask us.
- The use of external data and pre-trained models is allowed, as long as they meet the license requirements.
- The test servers have only the default installs of all languages, so no additional libraries will be available.
- Use the match forum to ask general questions or report problems, but please do not post comments or questions that reveal information about the problem itself, possible solution techniques, or data analysis.
- You can train your solution offline based on the given files and you can hardcode data into your solution -- just remember that you can't use data from sources other than this contest.
- There are 2 test cases in the example test, 10 test cases in the provisional test, and 30 test cases in the system test.
- The time limit is 10 minutes per test and the memory limit is 1024MB.
- There is no explicit code size limit. The implicit source code size limit is around 1 MB (it is not advisable to submit code of a size close to that or larger).
- The compilation time limit is 600 seconds. You can find information about compilers that we use and compilation options here.
Examples
0)
1)
This problem statement is the exclusive and proprietary property of TopCoder, Inc. Any unauthorized use or reproduction of this information without the prior written consent of TopCoder, Inc. is strictly prohibited. (c)2020, TopCoder, Inc. All rights reserved.