Challenge Overview
Problem Statement
Prize Distribution

    Prize                      USD
    1st                        $25,000
    2nd                        $16,000
    3rd                        $12,000
    4th                        $8,000
    5th                        $5,000
    Progress prizes*           3 * $3,000
    Special prizes*
      Best POI Category        $5,000
      Best undergraduate       $5,000
      Open Source Incentives   3 * $5,000
    Total Prizes               $100,000

*see the 'Award Details and Requirements to Win a Prize' section for details

Background and motivation

Intelligence analysts, policy makers, and first responders around the world rely on geospatial land use data to inform crucial decisions about global defense and humanitarian activities. Historically, analysts have manually identified and classified geospatial information by comparing and analyzing satellite images, but that process is time consuming and insufficient to support disaster response.

The functional Map of the World (fMoW) Challenge seeks to foster breakthroughs in the automated analysis of overhead imagery by harnessing the collective power of the global data science and machine learning communities. The Challenge publishes one of the largest publicly available satellite-image datasets to date, with more than one million points of interest from around the world. The dataset contains satellite-specific metadata that researchers can exploit to build competitive algorithms that classify facility, building, and land use. See more background information about the challenge here.

Objective

Your task will be to classify the objects present in satellite images. The classification labels your algorithm returns will be compared to ground truth data, and the quality of your solution will be judged by a combination of precision and recall; see Scoring for details.

Input Files

Satellite images

Satellite images are available in a variety of formats:
You may choose any (or all) of the above formats to work with; the scene content is the same, but the number of spectral bands, the image resolution, and the level of image compression differ.

Image metadata and ground truth bounding boxes

Metadata for each image file is available in a JSON file with the same file name, except that the .tif or .jpg extension is replaced by .json. The most important pieces of metadata are the following:
Differences between training and testing data

The file structure and naming conventions of training and testing data are different. Also, the testing data has been altered in several ways to remove ground truth information and increase the difficulty of the challenge.
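As a quick illustration of working with the image metadata files described above, the following minimal Java sketch parses one metadata .json file and iterates over its bounding boxes. The field names used ("bounding_boxes", "ID", "box", "category") and the use of the org.json library are assumptions made for illustration only; check them against the actual metadata files in the dataset.

    import org.json.JSONArray;
    import org.json.JSONObject;

    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class MetadataExample {
        public static void main(String[] args) throws Exception {
            // args[0]: path to one metadata file (hypothetical example of usage)
            String json = new String(Files.readAllBytes(Paths.get(args[0])));
            JSONObject meta = new JSONObject(json);

            // Field names below are assumptions; verify against the real files.
            JSONArray boxes = meta.getJSONArray("bounding_boxes");
            for (int i = 0; i < boxes.length(); i++) {
                JSONObject b = boxes.getJSONObject(i);
                int id = b.getInt("ID");
                JSONArray box = b.getJSONArray("box");            // pixel coordinates of the box
                String category = b.optString("category", "n/a"); // removed from the test data
                System.out.println(id + " " + box + " " + category);
            }
        }
    }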
A note on training and validation data

The training dataset contains data in two folders: train and val. The contents of these two folders are similar; they were created by randomly splitting the whole training dataset into two subsets. You may use both subsets as training data.

Downloads

Input files are available for download from the fmow-full and fmow-rgb AWS buckets, as well as from the corresponding fmow-full and fmow-rgb BitTorrent files. The fmow-full dataset contains the TIFF images; the fmow-rgb dataset contains the compressed JPEG images. Both datasets include the accompanying image metadata files. A separate guide is available that details the process of obtaining the data. Note that the dataset in the fmow-full bucket is huge (~3.5 TB). The following torrent files are available:
Metadata archives that contain sample false_detection bounding boxes for the val subset of the training data:
Output Files

Your output must be a text file that describes the object classifications your algorithm makes for all of the images in a test set. You must make a prediction for each bounding box specified in the metadata files of the test set. The file should contain lines formatted like:

    <bounding_box_id>,<category>

where <bounding_box_id> is the ID of a bounding box as given in the image metadata, and <category> is the category label your algorithm predicts for that bounding box (this may be false_detection).

Some sample lines:

    3456,crop_field
    1234,false_detection
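As an illustration, here is a minimal Java sketch that writes a submission file in this format. The predictions map, its contents, and the output file name (solution.txt) are placeholders; in a real solution the map would be filled by your classifier for every bounding box in the test set.

    import java.io.PrintWriter;
    import java.util.LinkedHashMap;
    import java.util.Map;

    public class SubmissionWriter {
        public static void main(String[] args) throws Exception {
            // Hypothetical predictions: bounding_box_id -> predicted category.
            Map<Integer, String> predictions = new LinkedHashMap<>();
            predictions.put(3456, "crop_field");
            predictions.put(1234, "false_detection");

            // One "<bounding_box_id>,<category>" line per bounding box in the test set.
            try (PrintWriter out = new PrintWriter("solution.txt")) {
                for (Map.Entry<Integer, String> e : predictions.entrySet()) {
                    out.println(e.getKey() + "," + e.getValue());
                }
            }
        }
    }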
Your output must be a single file with a .txt extension. Optionally, the file may be zipped, in which case it must have a .zip extension. Your output must contain only algorithmically generated classifications. It is strictly forbidden to include manually created predictions, or classifications that, although initially machine generated, have been modified in any way by a human.

Functions

This match uses the result submission style, i.e. you will run your solution locally using the provided files as input, and produce a TXT or ZIP file that contains your answer. In order for your solution to be evaluated by Topcoder's marathon system, you must implement a class named FunctionalMap, which implements a single function: getAnswerURL(). Your function will return a String corresponding to the URL of your submission file. You may upload your file to a cloud hosting service such as Dropbox or Google Drive, which can provide a direct link to the file.

To create a direct sharing link in Dropbox, right click on the uploaded file and select Share. You should be able to copy a link to this specific file which ends with the tag "?dl=0". This URL will point directly to your file if you change this tag to "?dl=1". You can then use this link in your getAnswerURL() function. If you use Google Drive to share the link, then please use the following format: "https://drive.google.com/uc?export=download&id=" + id. Note that Google has a file size limit of 25 MB and can't provide direct links to files larger than this. (For larger files the link opens a warning message saying that automatic virus checking of the file is not done.) You can use any other way to share your result file, but make sure the link you provide opens the file stream directly and is available to anyone with the link (not only the file owner), so that the automated tester can download and evaluate it.

An example of the code you have to submit, using Java:

    public class FunctionalMap {
        public String getAnswerURL() {
            // Replace the returned String with your submission file's URL
            return "https://drive.google.com/uc?export=download&id=XYZ";
        }
    }

Keep in mind that your complete code that generates these results will be verified at the end of the contest if you achieve a score in the top 5, as described in the "Award Details and Requirements to Win a Prize" section below, i.e. participants will be required to provide fully automated executable software to allow for independent verification of the performance of your algorithm and the quality of the output data.

Scoring

A full submission will be processed by the Topcoder Marathon test system, which will download, validate and evaluate your submission file. Any malformed or inaccessible file, one that contains an invalid category label, or one that does not contain a prediction for every bounding box that belongs to the test set will receive a zero score.

First, an F-score is calculated for each object category present in the test set. We define the true positive (TP), false positive (FP) and false negative (FN) counts as follows: for each bounding box in the test set, let E be its expected (i.e. ground truth) category and G be your guessed (i.e. predicted) category. If E equals G, then the TP counter for E is incremented by one; otherwise the FN counter for E is incremented by one, and the FP counter for G is also incremented by one. Then, for each category, let

    precision = TP / (TP + FP)
    recall    = TP / (TP + FN)
    F-score   = 2 * precision * recall / (precision + recall)
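To make this counting scheme concrete, here is a rough, illustrative Java sketch of the per-category F-score calculation described above. It is not the official scorer (see the visualizer source code referenced below for the exact algorithm); in particular, since the per-category weights are given separately below, the sketch falls back to an unweighted average.

    import java.util.HashMap;
    import java.util.Map;

    public class ScoringSketch {

        // truth and guess both map bounding_box_id -> category; guess is assumed to
        // contain a prediction for every bounding box in truth (missing predictions
        // would zero the real score anyway).
        static double approximateScore(Map<Integer, String> truth, Map<Integer, String> guess) {
            Map<String, int[]> counts = new HashMap<>(); // category -> {TP, FP, FN}
            for (Map.Entry<Integer, String> e : truth.entrySet()) {
                String expected = e.getValue();
                String predicted = guess.get(e.getKey());
                counts.putIfAbsent(expected, new int[3]);
                counts.putIfAbsent(predicted, new int[3]);
                if (expected.equals(predicted)) {
                    counts.get(expected)[0]++;   // TP of the true category
                } else {
                    counts.get(expected)[2]++;   // FN of the true category
                    counts.get(predicted)[1]++;  // FP of the guessed category
                }
            }
            // Per-category F-score. The official score is a *weighted* average of these
            // values multiplied by 1,000,000; the weights are not repeated here, so an
            // unweighted average is used as a stand-in. The 'false_detection' category
            // itself is skipped, since its own F-score is ignored (see the note below).
            double sum = 0;
            int n = 0;
            for (Map.Entry<String, int[]> e : counts.entrySet()) {
                if ("false_detection".equals(e.getKey())) continue;
                int tp = e.getValue()[0], fp = e.getValue()[1], fn = e.getValue()[2];
                if (tp + fn == 0) continue;      // category not present in the test set
                double precision = (tp + fp == 0) ? 0 : (double) tp / (tp + fp);
                double recall    = (double) tp / (tp + fn);
                double f = (precision + recall == 0) ? 0 : 2 * precision * recall / (precision + recall);
                sum += f;
                n++;
            }
            return (n == 0) ? 0 : 1_000_000.0 * sum / n;
        }
    }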
Finally, your score will be the weighted average of the category F-scores calculated as above, multiplied by 1,000,000. The weights for each category are as follows:
Note that although the F-score you achieve in the 'false_detection' category is ignored in the final score calculation, false_detection -> C mismatches still increase the FP counter of category C. That is, if the true label of a bounding box is 'false_detection' but you classify it as 'park', then this error will lower the F-score you achieve in the 'park' category, and thus your overall score as well. Similarly, a C -> false_detection mismatch increases the FN counter of category C. For the exact scoring algorithm, see the visualizer source code.

Example submissions can be used to verify that your chosen approach to uploading submissions works, and also that your implementation of the scoring logic is correct. The tester will verify that the returned String contains a valid URL and that its content is accessible, i.e. the tester is able to download the file from the returned URL. If your file is valid, it will be evaluated, and detailed score values will be available in the test results. The example evaluation is based on the following small subset of the training data:

    bounding_box_id   image_id
    ---------------   --------
    144               airport_0
    30912             airport_100
    30175             park_320
    1                 prison_0
    23                single-unit_residential_0

Though recommended, it is not mandatory to create example submissions. The scores you achieve on example submissions have no effect on your provisional or final ranking. Note that during the first week of the match online scoring will not be enabled; you may make submissions, but a score of 0 will be reported. Meanwhile, you can work locally with the provided images, tools and resources. Starting 21st September, submissions will be scored normally.

Final Scoring

The top 10 competitors according to the provisional scores will be invited to the final testing round. The details of the final testing are described in a separate document. Your solution will be subjected to three tests:

First, your solution will be validated (i.e. we will check whether it produces the same output file as your last submission, using the same input files used in this contest). Note that this means your solution must not be improved further after the provisional submission phase ends. (We are aware that it is not always possible to reproduce the exact same results. For example, if you do online training, then differences in the training environments may result in a different number of iterations, meaning different models. You may also have no control over random number generation in certain 3rd party libraries. In any case, the results must be statistically similar, and in case of differences you must have a convincing explanation of why the same result cannot be reproduced.)

Second, your solution will be tested against a new set of images.

Third, the resulting output from the steps above will be validated and scored. The final rankings will be based on this score alone. Competitors who fail to provide their solution as expected will receive a zero score in this final scoring phase, and will not be eligible to win prizes.

Additional Resources
General Notes
Award Details and Requirements to Win a Prize

Progress prizes

To encourage early participation, bonus prizes will be awarded to contestants who reach a certain threshold at 3 checkpoints during the competition. The threshold for the first such prize is 600,000. Thresholds for the 2nd and 3rd such prizes will be announced later in the contest forums. At each checkpoint the prize fund ($3,000 per month) will be dispersed evenly among all competitors whose provisional score is above the threshold. To determine these prizes, a snapshot of the leaderboard will be taken on the following days: October 10, November 5, December 2.

Best POI Category

Performance in a single category or a subset of no more than 10 categories (to be identified at a later date). The highest scoring eligible submission, calculated using the unweighted average F1 if there is more than one category, will be used to award the $5,000 prize. The category or categories will be identified so as to encourage solutions capable of labeling difficult categories on which many other contestants underperform.

Best undergraduate

The highest scoring (after provisional scoring) undergraduate university student who did not win one of the main prizes is awarded $5,000.

Open Source Incentives

The top 3 contestants (after final scoring) will be given the option of winning an additional $5,000 by open sourcing their solution and publishing it on GitHub.

Final prizes

In order to receive a final prize, you must do all of the following:

Achieve a score in the top 5 according to the final test results. See the "Final Scoring" section above.

Once the final scores are posted and winners are announced, the prize winner candidates have 7 days to submit a report outlining their final algorithm, explaining the logic behind it and the steps of their approach. You will receive a template that helps in creating your final report.

If you place in a prize winning rank but fail to do any of the above, then you will not receive a prize, and it will be awarded to the contestant with the next best performance who did all of the above.

Additional Eligibility

Johns Hopkins University, Booz Allen, and Digital Globe affiliates will be allowed to participate in this challenge, but will need to forego the monetary prizes. Winners will still be publicly recognized by IARPA in the final winner announcements based on their performance. Throughout the challenge, Topcoder's online leaderboard will display your rankings and accomplishments, giving you various opportunities to have your work viewed and appreciated by stakeholders from industry, government and academic communities.
Definition

Examples
This problem statement is the exclusive and proprietary property of TopCoder, Inc. Any unauthorized use or reproduction of this information without the prior written consent of TopCoder, Inc. is strictly prohibited. (c)2020, TopCoder, Inc. All rights reserved.