SpaceNet 7: Multi-Temporal Urban Development Challenge

The challenge is finished.

The SpaceNet Challenge, round 7

Multi-Temporal Urban Development Challenge

Prize Distribution

 

Prize                     USD

1st                  $20,000

2nd                  $10,000

3rd                   $7,500

4th                   $5,000

5th                   $2,500

Top Graduate          $2,500

Top Undergraduate     $2,500

Total Prizes         $50,000

 

Challenge Overview

Satellite imagery analytics have numerous human development and disaster response applications, particularly when time series methods are involved [1]. The SpaceNet 7 Multi-Temporal Urban Development Challenge aims to improve these methods, while simultaneously advancing the state of the art in SpaceNet’s core mission of foundational mapping. SpaceNet 7 will be featured as a competition at the 2020 NeurIPS conference in December, and seeks to identify and track buildings in a time series of satellite imagery collected over rapidly urbanizing areas. Beyond its relevance for disaster response, disease preparedness, and environmental monitoring, this task poses interesting technical challenges for the computer vision community. 

The competition uses a new open source dataset of Planet satellite imagery mosaics, which includes 24 images (one per month) covering ~100 unique geographies. Each geography features significant urban change over the two-year timespan. The dataset comprises >40,000 square kilometers of imagery and exhaustive polygon labels of building footprints in the imagery, totaling over 10 million individual annotations. Challenge participants are asked to track individual building construction over time, thereby directly assessing urbanization.

SpaceNet 7 poses a unique challenge from a computer vision standpoint because of the small pixel area of each object, the high object density within images, and the dramatic image-to-image difference compared to frame-to-frame variation in video object tracking. The stakeholders of this contest believe this challenge will aid efforts to develop useful tools for overhead change detection.

CosmiQ Works’ blog, The DownLinQ, provides additional background information on this challenge, details on the SCOT metric used for scoring, and an overview of the dataset used in the challenge.

 

Input Files

 

For an overview of the dataset, see here.

 

Satellite Images

 

Imagery consists of RGBA (red, green, blue, alpha) 8-bit electro-optical (EO) monthly mosaics from Planet’s Dove constellation at 4-meter resolution. For each of the Areas of Interest (AOIs), the data cube extends for roughly two years, though the exact span varies somewhat between AOIs. All images in a data cube are the same shape: some data cubes have a shape of 1024 x 1024 pixels, while others have a shape of 1024 x 1023 pixels. Each image accordingly covers an extent of roughly 18 square kilometers.

 

Images are provided in GeoTIFF format, and there are two imagery data types:

 
  1. images (training only) - Raw imagery, in EPSG:3857 projection.

  2. images_masked (training + testing) - Imagery with unusable portions (usually due to cloud cover) masked out, in EPSG:3857 projection.
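
For illustration, here is a minimal sketch of loading one of these GeoTIFFs with the rasterio library (an assumption on our part, not part of the official tooling; the file path is hypothetical):

    import rasterio

    # Open a monthly mosaic and read it as a (bands, rows, cols) array.
    # SpaceNet 7 mosaics have 4 bands (RGBA) and are roughly 1024 x 1024 pixels.
    with rasterio.open("images_masked/global_monthly_2020_01_mosaic_L15-1281E-1035N_5125_4049_13.tif") as src:
        img = src.read()           # numpy array, e.g. shape (4, 1024, 1024)
        print(src.crs, img.shape)  # CRS should be EPSG:3857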

 

Building Footprint Labels

The location and shape of known buildings are referred to as ‘ground truth’ in this document. Building footprint labels are distributed in multiple formats for the training set:

  1. labels - This folder contains the raw building footprint labels, along with unusable data mask (UDM) labels, in EPSG:4326 projection. UDMs are caused primarily by cloud cover. Building footprint labels will not overlap with UDM areas.

  2. UDM_masks - This folder contains the UDM labels rendered as binary masks, in EPSG:4326 projection.

  3. labels_match - This folder contains building footprints reprojected into the coordinate reference system (CRS) of the imagery (EPSG:3857 projection). Each building footprint is assigned a unique identifier (i.e. address) that remains consistent throughout the data cube.

  4. labels_match_pix - This folder contains building footprints (with identifiers) in pixel coordinates of the image.

  5. CSV format - All building footprints of the whole training set are described in a single CSV file. It is possible to work with this file alone; you may or may not find additional value in the other options listed above. This file has the following format:

filename,id,geometry

 

global_monthly_2020_01_mosaic_L15-1281E-1035N_5125_4049_13,42,"POLYGON ((1015.1 621.05, 1003.7 628.8, 1001.5 625.7, 1012.9 617.9, 1015.1 621.05))"

 

global_monthly_2018_12_mosaic_L15-0369E-1244N_1479_3214_13,0,POLYGON EMPTY

 

global_monthly_2019_08_mosaic_L15-0697E-0874N_2789_4694_13,10,"POLYGON ((897.11 102.88, 897.11 104.09, 900.29 121.02, 897.11 121.02, 897.11 125.59, 891.85 125.59, 891.85 102.88, 897.11 102.88), (900.29 113.97, 900.29 108.92, 897.11 108.92, 897.11 113.97, 900.29 113.97))"

 

(The sample above contains 4 lines of text; the extra line breaks are added only for readability.)

  • filename is a string that uniquely identifies the image. See 'Notes on file names' later for details.

  • id is a unique identifier (i.e. address), an integer assigned to the building footprint that remains consistent throughout the data cube.

  • geometry specifies the points of the shape that represents the building, in Well Known Text (WKT) format. Only the POLYGON object type of the WKT standard is supported. The special POLYGON EMPTY construct is used when there are no buildings present in the image; the building id is ignored in this case. (See line #3 of the sample above for an example.) The coordinate values represent pixels; the origin of the coordinate system is at the top left corner of the image, the first coordinate is the x value (positive is right), and the second coordinate is the y value (positive is down). Note that the polygons must be closed: the first and last entries in their point list must be the same.

Usually a building is defined by a single polygon that represents the exterior boundary edge of its shape. This is sometimes not enough; see building #14 in global_monthly_2018_01_mosaic_L15-0357E-1223N_1429_3296_13 in the training set for an example of a shape that contains a hole. In such cases two (or more) polygons are needed: the first one always defines the exterior edge, and the second (third, fourth, etc.) defines the interior rings (holes). (See line #4 above for an example of the required syntax.) Note that this way of representing holes differs from how the WKT standard specifies them: the standard mandates that the points of a polygon be enumerated in an anti-clockwise order if the polygon represents a shape with positive area (exterior rings), and in a clockwise order if the polygon represents a shape with negative area (interior rings). However, what appears clockwise in the latitude/longitude based geographic coordinate system appears anti-clockwise in the image space where the y axis points downwards, so in the hope of causing less confusion we chose a simpler approach: the first polygon is always positive, the rest are always negative.
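
As an illustration of working with this format, here is a minimal sketch that reads the CSV into polygon objects in pixel space. It assumes the shapely library (our choice, not an official requirement); shapely's WKT parser does not enforce ring orientation, so the simpler hole convention described above parses without issue. The CSV file name is taken from the sample training folder layout shown later in this document.

    import csv
    from shapely import wkt

    # A sketch: group building footprints by image filename.
    footprints = {}
    with open("sn7_train_ground_truth_pix.csv") as f:
        for row in csv.DictReader(f):
            geom = wkt.loads(row["geometry"])    # handles POLYGON EMPTY as well
            if geom.is_empty:
                continue                         # no buildings in this image
            # The exterior ring and any interior rings (holes) are preserved.
            footprints.setdefault(row["filename"], {})[int(row["id"])] = geom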

 

Downloads

Input files are available for download from the spacenet-dataset AWS bucket. See this guide for details on how to set up an account to access data stored in an AWS S3 bucket. 

 

Note that the same bucket also holds data for the previous SpaceNet challenges; you need only a subset of the bucket content for this challenge: download only the files in the spacenet/SN7_buildings/tarballs directory (~10GB compressed).

This directory contains the following files:

  • SN7_buildings_train_sample.tar.gz contains a single data cube with images and labels from the training set. Use this if you want to get familiar with the data without having to download any of the large files.

  • SN7_buildings_train.tar.gz is the training set. It contains imagery and building footprint labels for 60 AOIs (areas of interest).

  • SN7_buildings_train_csvs.tar.gz contains a single CSV file that describes all building footprints for the whole training set.

  • SN7_buildings_test_public.tar.gz is the testing set that you should use to submit results during the provisional testing phase of the challenge. It contains imagery but does not contain building footprints.
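
If you prefer to script the download, a minimal sketch using boto3 follows (our own illustration, not an official tool; it assumes your AWS credentials are already configured as described in the guide linked above):

    import boto3

    # Download the small sample tarball from the spacenet-dataset bucket.
    s3 = boto3.client("s3")
    s3.download_file(
        "spacenet-dataset",
        "spacenet/SN7_buildings/tarballs/SN7_buildings_train_sample.tar.gz",
        "SN7_buildings_train_sample.tar.gz",
    )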

 

Note that individual files or smaller subsets are available from the spacenet/SN7_buildings/train/ or spacenet/SN7_buildings/test_public/ folders. Use these if you want to get familiar with the data without having to download the full sets.

 

Notes on file names

  • The format of a filename (as defined above for the footprint definition CSV file) is:
    global_monthly_<time>_mosaic_<AOI-name>
    for example:
    global_monthly_2018_02_mosaic_L15-0369E-1244N_1479_3214_13

  • <time> is a timestamp in YYYY_MM format that represents when image collection happened.

  • <AOI-name> is a unique identifier of a location. All AOI-names are 28 characters long.

  • All ids (filenames and AOI names) are case sensitive.

  • Image data is stored in files named <filename>.tif in the images, images_masked and UDM_masks folders. 
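
For example, a filename can be split into its timestamp and AOI name with a short helper. This is only a sketch based on the pattern above; the regular expression is our own, not an official parser:

    import re

    # global_monthly_<time>_mosaic_<AOI-name>, with <time> in YYYY_MM format.
    FILENAME_RE = re.compile(r"^global_monthly_(\d{4}_\d{2})_mosaic_(.+)$")

    def parse_filename(filename):
        match = FILENAME_RE.match(filename)
        if match is None:
            raise ValueError("unexpected filename: " + filename)
        timestamp, aoi_name = match.groups()
        return timestamp, aoi_name

    # parse_filename("global_monthly_2018_02_mosaic_L15-0369E-1244N_1479_3214_13")
    # returns ("2018_02", "L15-0369E-1244N_1479_3214_13")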

 

Output file

Your output must be a CSV file with a format identical to that of the building footprint definition files.

  • filename,id,geometry

 

Your output file may or may not include the above header line. The rest of the lines should specify the buildings your algorithm extracted, one per line.

The required fields are:

  • “filename” is a string that uniquely identifies the image within a test set.

  • “id” specifies the unique identifier of the individual building.

  • “geometry” specifies the points of the shape that represents the building you found, in pixel coordinates. 

Your output must be a single file with .csv extension. The file must not be larger than 500MB and must not contain more than 4 million lines.

 

Constraints

  • A single file must contain building footprints for ALL images in the test set.

  • The file may (and typically does) contain multiple lines for the same filename.

  • Two lines with the same filename may not also have the same id. In other words, each building identifier should appear no more than once per image.

  • If you found no buildings on an image then you must use the POLYGON EMPTY construct. In this case your file must not contain other lines for the same filename.
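
To make these rules concrete, here is a minimal, non-authoritative sketch of writing such a file. It assumes per-image predictions are held as shapely polygons keyed by building id; all names in this snippet are our own:

    import csv

    def write_solution(predictions, path="solution.csv"):
        # predictions: dict mapping filename -> {building_id: shapely Polygon}
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["filename", "id", "geometry"])   # header line is optional
            for filename, buildings in predictions.items():
                if not buildings:
                    # No buildings found: a single POLYGON EMPTY line; the id is ignored.
                    writer.writerow([filename, 0, "POLYGON EMPTY"])
                    continue
                for building_id, polygon in buildings.items():
                    # polygon.wkt yields e.g. "POLYGON ((x1 y1, x2 y2, ...))";
                    # the csv module quotes it because it contains commas.
                    writer.writerow([filename, building_id, polygon.wkt])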

 

Submission format and code requirements

This match uses a combination of the "submit data" and "submit code" submission styles. The required format of the submission package is specified in a submission template document. The current document gives only those requirements that are additional to, or override, the requirements listed in the template.

  • Your algorithm must process the AOIs of the test set one by one; that is, when you are predicting building footprints you must not use information from adjacent AOIs.

  • You must not submit more often than 3 times a day. The submission platform does not enforce this limit; it is your responsibility to comply with it. Not observing this rule may lead to disqualification.

  • An exception to the above rule: if your submission scores 0, then you may make a new submission after a delay of 1 hour. 

  • The /solution folder of the submission package must contain the solution.csv file, which should be formatted as specified above in the Output file section and must list all extracted buildings from all images in the test set. 

 

Scoring

 

During scoring, your solution.csv file (as contained in your submission file during provisional testing, or generated by your docker container during final testing) will be matched against the expected ground truth data using the following method.

If your solution is invalid (e.g. if the tester tool can't successfully parse its content), you will receive a score of 0.

If your submission is valid, your solution will be scored according to the SpaceNet Change and Object Tracking (SCOT) metric. The metric is described conceptually in this article. A Python implementation of this metric is available in the Solaris GitHub repo. The score is generated by running the multi_temporal_buildings() function using default values for all keyword arguments.  (That function is defined here and relies on other functions defined here). A Java implementation of the metric is available in the visualizer source code, see the common.ScoringAlgorithm class. 

For convenience, we also describe the metric below. For the exact details of the algorithm see the Python or Java implementation.

  • For each AOI, the following steps are carried out:

    • For each month, matches are made between ground truth building footprints and solution building footprints from the solution.csv file. A pair of footprints (one ground truth and one solution) is eligible to be matched if their IOU (intersection over union) exceeds 0.25, and no footprint may be matched more than once. A set of matches is chosen that maximizes the number of matches. If there is more than one way to achieve that maximum, then as a tie-breaker the set with the largest sum of IOUs is used. The resolution of any remaining ties is implementation-dependent (this is generally infrequent).

    • A match between a ground truth footprint and a solution footprint is a “mismatch” if the ground truth footprint’s identifier (in the “id” column) was most recently matched to a different solution identifier, or vice versa.

    • A footprint is “new” if it appears in any month after the first one for the given AOI and its identifier has not been used for another footprint of the same kind (i.e., ground truth or solution) in any previous month.

    • An F1 score is calculated where all matches that are not mismatches are considered true positives, while all unmatched or mismatched ground truth and solution footprints are considered false negatives and false positives, respectively.  This is the “tracking term.”

    • An F1 score is calculated where all matches between two new footprints are considered true positives, while all new ground truth and solution footprints without a new match are considered false negatives and false positives, respectively.  This is the “change detection term.”

    • A combined score for the AOI is calculated by taking a weighted harmonic mean of the change detection term and tracking term, using a weight value beta=2 to emphasize the tracking term.

  • The final score is the average of the combined scores on all the individual AOIs.

 

When computing the score, all ground truth footprints smaller than 4 pixels are ignored.  However, solution.csv files are permitted to include footprints of any size.
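
To make the matching and combination steps more concrete, here is a minimal, non-authoritative sketch in Python using shapely. The greedy strategy below is a simplification of the match-maximizing assignment described above, and the combine() function is one F-beta-style reading of the weighted harmonic mean with beta=2; rely on the Python (Solaris) or Java implementations for exact scores.

    from itertools import product

    IOU_THRESHOLD = 0.25
    BETA = 2.0  # emphasizes the tracking term, per the description above

    def iou(a, b):
        # Intersection over union of two shapely polygons.
        union = a.union(b).area
        return a.intersection(b).area / union if union > 0 else 0.0

    def greedy_match(truth, solution):
        # truth, solution: dicts of {id: shapely Polygon} for one month.
        # Greedy highest-IOU-first matching; the official scorer instead
        # maximizes the number of matches (largest IOU sum as tie-breaker).
        candidates = sorted(
            ((iou(t, s), ti, si)
             for (ti, t), (si, s) in product(truth.items(), solution.items())),
            reverse=True,
        )
        used_t, used_s, matches = set(), set(), []
        for score, ti, si in candidates:
            if score > IOU_THRESHOLD and ti not in used_t and si not in used_s:
                used_t.add(ti)
                used_s.add(si)
                matches.append((ti, si))
        return matches

    def combine(change_term, tracking_term, beta=BETA):
        # Weighted harmonic mean of the change detection and tracking terms.
        denom = beta ** 2 * change_term + tracking_term
        return (1 + beta ** 2) * change_term * tracking_term / denom if denom > 0 else 0.0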

 

Final testing

This section details the final testing workflow. The requirements for the /code folder of your submission are specified in the submission template document; the current document gives only requirements or pieces of information that are additional to, or override, those given in the template. You may ignore this section until you start to prepare your system for final testing.

  • The signature of the train script is as given in the template:
    train.sh <data_folder>
    The supplied <data_folder> parameter points to a folder containing raw SpaceNet satellite data in the same structure as is available to you during the coding phase, with the tar.gz files already extracted. The supplied <data_folder> will be the parent folder of the folders representing AOIs. The ground truth CSV file will also be available in the same folder.

  • The allowed time limit for the train.sh script is 8 GPU-days (2 days on a p3.8xlarge with 4 GPUs). Scripts exceeding this time limit will be truncated.

  • A sample call to your training script (single line) follows. Note that folder names are for example only; you should not assume that the exact same folders will be used in testing.
    ./train.sh /data/SN7_buildings/train/
    In this sample case the training data looks like this:
    data/
      SN7_buildings/
        train/
          L15-0331E-1257N_1327_3160_13/
          L15-0357E-1223N_1429_3296_13/
          ... etc., other AOI folders
          sn7_train_ground_truth_pix.csv
 

  • The signature of the test script:
    test.sh <data_folder> <output_file>
    The testing data folder contains imagery similar to what is available to you during the coding phase. The supplied <data_folder> will be the parent folder of the folders representing AOIs.

  • The allowed time limit for the test.sh script is 12 GPU-hours (3 hours on a p3.8xlarge with 4 GPUs) when executed on the full provisional test set (the same one you used for submissions during the contest). Scripts exceeding this time limit will be truncated.

  • A sample call to your testing script (single line) follows. Again, folder and file names are for example only; you should not assume that the exact same names will be used in testing.
    ./test.sh /data/SN7_buildings/test_public/ /wdata/my_output.csv
    In this sample case the testing data looks like this:
    data/
      SN7_buildings/
        test_public/
          L15-0369E-1244N_1479_3214_13/
          L15-0391E-1219N_1567_3314_13/
          ... etc., other AOI folders

  • The verification workflow will be different from what is described in the template. All submissions will be retrained first, and final scores will be established by using the retrained models on the final test set.

  • To speed up the final testing process the contest admins may decide not to build and run the dockerized version of each contestant's submission. It is guaranteed however that at least the top 10 ranked submissions (based on the provisional leader board at the end of the submission phase) will be final tested.

  • Hardware specification. Your docker image will be built, and the test.sh and train.sh scripts will be run, on a p3.8xlarge Linux AWS instance. Please see here for the details of this instance type.

 

Additional Resources

 
  • A visualizer is available that you can use to test your solution locally. It displays satellite images, your extracted building footprints, and the expected ground truth. It also calculates detailed scores, so it serves as an offline tester.

  • See the algorithm description and source code of winning entries in previous SpaceNet contests here.

 

General Notes

 
  • This match is NOT rated.

  • Teaming is allowed. Topcoder members are permitted to form teams for this competition. After forming a team, Topcoder members of the same team are permitted to collaborate with other members of their team. To form a team, a Topcoder member may recruit other Topcoder members, and register the team by completing this Topcoder Teaming Form. Each team must declare a Captain. All participants in a team must be registered Topcoder members in good standing. All participants in a team must individually register for this Competition and accept its Terms and Conditions prior to joining the team. Team Captains must apportion prize distribution percentages for each teammate on the Teaming Form. The sum of all prize portions must equal 100%. The minimum permitted size of a team is 1 member; the maximum permitted team size is 5 members. Only team Captains may submit a solution to the Competition. Topcoder members participating in a team will not receive a rating for this Competition. Notwithstanding Topcoder rules and conditions to the contrary, solutions submitted by any Topcoder member who is a member of a team on this challenge but is not the Captain of the team are not permitted, are ineligible for award, may be deleted, and may be grounds for dismissal of the entire team from the challenge. The deadline for forming teams is 11:59pm ET on the 21st day following the date that Registration & Submission opens as shown on the Challenge Details page. Topcoder will prepare a Teaming Agreement for each team that has completed the Topcoder Teaming Form, and distribute it to each member of the team. Teaming Agreements must be electronically signed by each team member to be considered valid. All Teaming Agreements are void unless electronically signed by all team members by 11:59pm ET of the 28th day following the date that Registration & Submission opens as shown on the Challenge Details page. Any Teaming Agreement received after this period is void. Teaming Agreements may not be changed in any way after signature.
    The registered teams will be listed in the contest forum thread titled “Registered Teams”.

  • Organizations such as companies may compete as one competitor if they are registered as a team and follow all Topcoder rules.

  • Relinquish - Topcoder is allowing registered competitors or teams to "relinquish". Relinquishing means the member will compete, and we will score their solution, but they will not be eligible for a prize. Once a person or team relinquishes, we post their name to a forum thread labeled "Relinquished Competitors". Relinquishers must submit their implementation code and methods to maintain leaderboard status.

  • In this match you may use any programming language and libraries, including commercial solutions, provided Topcoder is able to run it free of any charge. You may also use open source languages and libraries, with the restrictions listed in the next section below. If your solution requires licenses, you must have these licenses and be able to legally install them in a testing VM (see “Requirements to Win a Prize” section). Submissions will be deleted/destroyed after they are confirmed. Topcoder will not purchase licenses to run your code. Prior to submission, please make absolutely sure your submission can be run by Topcoder free of cost, and with all necessary licenses pre-installed in your solution. Topcoder is not required to contact submitters for additional instructions if the code does not run. If we are unable to run your solution due to license problems, including any requirement to download a license, your submission might be rejected. Be sure to contact us right away if you have concerns about this requirement.

  • You may use open source languages and libraries provided they are equally free for your use, use by another competitor, or use by the client.

  • If your solution includes licensed software (e.g. commercial software, open source software, etc.), you must include the full license agreements with your submission. Include your licenses in a folder labeled “Licenses”. Within the same folder, include a text file labeled “README” that explains the purpose of each licensed software package as it is used in your solution.

  • External data sets and pre-trained networks are allowed for use in the competition provided the following are satisfied:

    • The external data and pre-trained network dataset are unencumbered by legal restrictions that conflict with their use in the competition.

    • The data source or data used to train the pre-trained network is defined in the submission description.

    • The external data source must be declared in the competition forum in the first 45 days of the competition to be eligible in a final solution. References and instructions on how to obtain the data are valid declarations (for instance, in the case of license restrictions). If you want to use a certain external data source, post a question in the forum thread titled “Requested Data Sources”. Contest stakeholders will verify the request, and if the use of the data source is approved then it will be listed in the forum thread titled “Approved Data Sources”.

    • Using OpenStreetMap or similar street-level data sources that explicitly describe the targeted test city is not allowed.

    • Using georeferencing of the data (utilizing the geographic coordinates) is not allowed in the testing phase.

    • Using geographically adjacent tiles to improve prediction scores is not allowed.

  • Use the match forum to ask general questions or report problems, but please do not post comments and questions that reveal information about possible solution techniques.

 

Award details and requirements to win a prize

 

Final prizes

In order to receive a final prize, you must do all the following:

 

Achieve a score in the top five according to the final system test results. See the "Final testing" section above.

 

Comply with all applicable Topcoder terms and conditions.

 

Once the final scores are posted and winners are announced, the prize winner candidates have 7 days to submit a report outlining their final algorithm, explaining the logic behind their approach and the steps it takes. You will receive a template that helps when creating your final report.

 

If you place in a prize winning rank but fail to do any of the above, then you will not receive a prize, and it will be awarded to the contestant with the next best performance who did all of the above.

 

Top undergraduate / top graduate award

The highest scoring undergraduate university student (someone actively pursuing a Bachelor’s degree), or team of such students, is awarded $2,500. The same applies to the highest scoring graduate student (someone pursuing a master’s or doctoral degree) or team of such students.

 

Both prizes are based on the rankings after the final tests. The top undergraduate / graduate prize and the final prizes are not exclusive; the same contestant / team may win a final prize and also one of these two awards. For a team to be eligible for one of these awards, all team members must be eligible for the same award.

 

Eligibility

To be eligible to win the Graduate and Undergraduate prizes, individuals must provide proof of enrollment in an accredited degree program prior to the end of the challenge.

 

Employees of (or contractors for) In-Q-Tel, Maxar Technologies (DigitalGlobe, Radiant Solutions, SSL, and MDA), Amazon Web Services (AWS), Intel, TopCoder, Capella Space, and the Geoscience and Remote Sensing Society (GRSS) of IEEE are not allowed to participate in the contest.

 

---