This article walks you through using a provided AWS key and secret to download large data sets for Topcoder challenges. Certain challenges rely on large data sets stored in S3. The AWS S3 console UI only lets you download one object at a time, which is tedious and consumes precious contest time. If you are participating in a challenge that supplies data sets via AWS S3, you will be given an AWS key, a secret, and an S3 bucket name that you can use to download the data set(s) from S3 to your local computer. You will most likely receive this information by email, but the specific details will be provided in the challenge specification or the challenge forums.
The easiest way to download the data is through the AWS CLI.
Here are links you can use to download and install the AWS CLI on your local machine:
To use the key and secret you receive, you will add them to the credentials file. In Linux / macOS, this file is ~/.aws/credentials
On Windows, this file is %UserProfile%\.aws\credentials
Edit the credentials file and add a section like this, using the key and secret you received. The text in brackets (challenge-name) is the profile name and can be anything you want. Because you may accumulate many sets of credentials if you compete in several challenges that use this process, naming each profile after its challenge keeps them easy to tell apart.
[challenge-name]
aws_access_key_id = <key>
aws_secret_access_key = <secret>
More detailed information can be found here.
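If you prefer not to edit the file by hand, the step above can be scripted. The following is a minimal sketch, not an official Topcoder or AWS tool: add_profile is a hypothetical helper name, and the optional fourth argument (a path to the credentials file) is an assumption added so the helper can be tried against a scratch file first.

```shell
# Hypothetical helper (not part of the AWS CLI): append a named profile
# to an AWS credentials file.
# Usage: add_profile <profile> <key> <secret> [credentials-file]
add_profile() {
  profile="$1"; key="$2"; secret="$3"
  file="${4:-$HOME/.aws/credentials}"

  # Refuse to add a second section with the same name.
  if [ -f "$file" ] && grep -qxF "[$profile]" "$file"; then
    echo "profile [$profile] already exists in $file" >&2
    return 1
  fi

  mkdir -p "$(dirname "$file")" || return 1
  # The leading newline keeps the new section separate from any existing text.
  printf '\n[%s]\naws_access_key_id = %s\naws_secret_access_key = %s\n' \
    "$profile" "$key" "$secret" >> "$file" || return 1
  # Restrict the file to the current user, since it holds secrets.
  chmod 600 "$file"
}
```

After loading the function into your shell, you would call it as add_profile challenge-name '<key>' '<secret>'. The AWS CLI's own aws configure --profile challenge-name command is an alternative that prompts for the same two values (plus a default region and output format).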
Once the credentials are set up, you can sync the data. The S3 bucket name that contains the data set is provided along with the key and secret.
The AWS CLI command will look something like this:
aws --profile challenge-name s3 sync s3://<bucket name> <some local path>
Where:
challenge-name is the name you used, in brackets, when adding the key and secret to the credentials file above.
<bucket name> is the S3 bucket name for the data set, provided with the key and secret.
<some local path> is the path on your local computer where the files will be downloaded.
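As an illustration of the sync step, the sketch below wraps the command in a small shell function that first previews the transfer and then runs it. download_dataset is a hypothetical name chosen for this example. The --dryrun option of aws s3 sync lists the operations that would be performed without copying anything.

```shell
# Hypothetical helper: preview, then download, a challenge data set.
# Usage: download_dataset <profile> <bucket> <local-dir>
download_dataset() {
  profile="$1"; bucket="$2"; dest="$3"

  # Make sure the destination directory exists.
  mkdir -p "$dest" || return 1

  # --dryrun lists what would be copied without transferring anything.
  aws --profile "$profile" s3 sync "s3://$bucket" "$dest" --dryrun || return 1

  # Run the real transfer. Re-running sync later only fetches
  # files that are missing or have changed.
  aws --profile "$profile" s3 sync "s3://$bucket" "$dest"
}
```

For example, download_dataset challenge-name '<bucket name>' ./data would download the data set into a data directory under the current folder. Because sync skips files that are already up to date, the same call can be repeated to resume an interrupted download.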
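Instead of downloading to your local computer, you can sync the data set to an S3 bucket in your own AWS account. Please note that you will likely incur AWS charges if you choose to use this method. Topcoder cannot reimburse you for these charges.
Follow all the configuration steps above, and also create an S3 bucket in your own AWS account using the AWS console.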
The command will be:
aws --profile challenge-name s3 sync s3://<bucket name> s3://<your bucket>
The data sets for some challenges can be very large. Please ensure that the local path you specify is on a drive with enough free space.
Your key and secret give you read-only access to the data. You can only sync from S3 to your local drive. You cannot change or add files and sync those files back to S3.
Your key and secret are yours alone. You should never share them with anyone else.
Your key and secret will be deleted at the end of the challenge, so you will lose access to the data when the challenge completes.
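To act on the storage note above before starting a long download, you could compare the bucket's total size with the free space on the target drive. This is a rough sketch under a few assumptions: check_space is a hypothetical helper name, the local directory already exists, and your credentials allow listing the bucket. It relies on aws s3 ls with --recursive --summarize, which ends its output with "Total Objects" and "Total Size" lines (the size is in bytes when --human-readable is not given), and on the POSIX df -Pk output format.

```shell
# Hypothetical helper: compare a bucket's total size with free disk space.
# Usage: check_space <profile> <bucket> <existing-local-dir>
# Returns 0 if the data fits, 1 if it does not, 2 if the size is unknown.
check_space() {
  profile="$1"; bucket="$2"; dest="$3"

  # --summarize appends "Total Objects" and "Total Size" (bytes) lines.
  need_bytes="$(aws --profile "$profile" s3 ls "s3://$bucket" --recursive --summarize \
    | awk '/Total Size:/ {print $3}')"
  [ -n "$need_bytes" ] || { echo "could not read bucket size" >&2; return 2; }

  # df -Pk reports available space in 1024-byte blocks in column 4.
  free_kb="$(df -Pk "$dest" | awk 'NR==2 {print $4}')"
  need_kb=$(( ($need_bytes + 1023) / 1024 ))

  echo "need ${need_kb} KiB, have ${free_kb} KiB free"
  [ "$need_kb" -le "$free_kb" ]
}
```

For example, check_space challenge-name '<bucket name>' . checks the drive that holds the current directory. Listing a bucket with a very large number of objects can itself take a while.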
If you would rather not use the AWS CLI, or you want an easier UI, there are utilities available that can download files from S3 buckets:
https://cyberduck.io (Windows and macOS)