Challenge Overview
In this challenge, we are going to
- create a new Kafka processor for Topcoder Marathon Match events
- use AWS SageMaker to run Marathon Match submissions
Project Background
Topcoder currently has a custom processor built for scoring marathon matches.
This processor listens to our event bus for submissions, grabs the zip file for the submission, and then turns the submission into a docker image. It then runs the image as a container and produces outputs that a scoring function processes in another container.
At the end, a score is produced. The processor then sends the score and the requisite metadata back to the review API for that submission.
We would like to adopt SageMaker to replace the current solution.
Technology Stack
- Node 10
- AWS SDK for Node.js
- Docker
Challenge Assets
- One kafka processor codebase
- Sample submission code
Individual requirements
Here is a sample of one message in Kafka:
    {
      "topic": "submission.notification.aggregate",
      "originator": "me",
      "timestamp": "2020-01-21T00:01:00.000Z",
      "mime-type": "application/json",
      "payload": {
        "resource": "review",
        "id": "f3c3f4e3-6384-4042-a093-bac85d1dda81",
        "created": "2020-01-21T00:48:26.870Z",
        "updated": "2020-01-21T00:48:26.870Z",
        "createdBy": "mL4037GVeLQOvb8TgU4KW38mlevpfp4y@clients",
        "updatedBy": "mL4037GVeLQOvb8TgU4KW38mlevpfp4y@clients",
        "status": "completed",
        "score": 100,
        "reviewerId": "b8ff717c-d098-487e-a61f-4e40c7595a65",
        "submissionId": "d01c8c8b-2a7f-463f-992b-7bc5423e1dd0",
        "scoreCardId": 30001850,
        "typeId": "55bbb17d-aac2-45a6-89c3-a8d102863d05",
        "originalTopic": "submission.notification.create"
      }
    }
1. (major) Create new kafka processor
One kafka processor is shared in the forum. Create the new processor based on this code.
- Update app.js to listen to KAFKA_SUBMISSION_AGGREGATE_TOPIC. Check that the following payload fields match these values; if any of them does not, write a log entry and return:
- "originalTopic": "submission.notification.create"
- "typeId": "55bbb17d-aac2-45a6-89c3-a8d102863d05"
- "score": 100
- "resource": "review"
- Update ProcessorService.js to follow the steps (see below)
- Please use the config package to define useful variables. Here is a list of things we would like to be configurable:
- kafka related, such as URL, cert. They already exist in the current codebase.
- KAFKA_SUBMISSION_AGGREGATE_TOPIC: default to submission.notification.aggregate if no env is provided
- submission api related config: see https://www.npmjs.com/package/@topcoder-platform/topcoder-submission-api-wrapper
- INPUT_S3_BUCKET: S3 bucket name, used by SageMaker for training data
- INPUT_FOLDER_PREFIX: folder prefix inside the input S3 bucket under which the data is stored
- OUTPUT_S3_BUCKET: S3 bucket name, used by SageMaker for uploading the model
- INSTANCE_TYPE: instance type used for SageMaker
- INSTANCE_NUMBER: number of instances used for SageMaker
- IAM_ROLE: AWS IAM role for running the SageMaker job. This is important because we don't want to use AWS root credentials.
Please add more config where applicable, and ask in the forum if you are not sure. Please keep the code clean and only keep useful files.
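The message check and the configurable values described above could be sketched roughly as follows. This is a minimal single-file illustration, not the shared codebase: in the real processor the `config` object would presumably live in `config/default.js` and be loaded through the config package. The function names `findMismatches` and `shouldProcess`, the `logger` parameter, and the default values for INSTANCE_TYPE and INSTANCE_NUMBER are assumptions made for the example.

```javascript
// Assumed layout: in the processor this object would sit in config/default.js.
const config = {
  KAFKA_SUBMISSION_AGGREGATE_TOPIC:
    process.env.KAFKA_SUBMISSION_AGGREGATE_TOPIC || 'submission.notification.aggregate',
  INPUT_S3_BUCKET: process.env.INPUT_S3_BUCKET,
  INPUT_FOLDER_PREFIX: process.env.INPUT_FOLDER_PREFIX,
  OUTPUT_S3_BUCKET: process.env.OUTPUT_S3_BUCKET,
  // The two defaults below are assumptions, not values given in the requirements.
  INSTANCE_TYPE: process.env.INSTANCE_TYPE || 'ml.m4.xlarge',
  INSTANCE_NUMBER: Number(process.env.INSTANCE_NUMBER || 1),
  IAM_ROLE: process.env.IAM_ROLE
}

// Payload values a message must carry before it is processed.
const EXPECTED_PAYLOAD = {
  originalTopic: 'submission.notification.create',
  typeId: '55bbb17d-aac2-45a6-89c3-a8d102863d05',
  score: 100,
  resource: 'review'
}

// Returns the names of payload fields that do not match the expected values.
function findMismatches (message) {
  const payload = (message && message.payload) || {}
  return Object.keys(EXPECTED_PAYLOAD).filter(
    key => payload[key] !== EXPECTED_PAYLOAD[key]
  )
}

// Logs and returns false when the message should be skipped.
function shouldProcess (message, logger = console) {
  const mismatches = findMismatches(message)
  if (mismatches.length > 0) {
    logger.info(`Ignoring message, unexpected values for: ${mismatches.join(', ')}`)
    return false
  }
  return true
}
```

In app.js, the handler subscribed to `config.KAFKA_SUBMISSION_AGGREGATE_TOPIC` would call `shouldProcess(message)` first and return early when it yields `false`.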
2. (major) Update ProcessorService.js
- Get the submission id from the message payload (payload.submissionId)
- Use https://www.npmjs.com/package/@topcoder-platform/topcoder-submission-api-wrapper to download the submission. A sample submission is provided. Its structure looks like this:
    code/
    |-- src/
    |-- Dockerfile
- Unzip the downloaded submission into a temporary folder
- Build the Docker image and upload it to AWS ECR
- image name is {submissionId}
- tag name is latest
- Train the model by calling SageMaker. Input data is read from S3 at {INPUT_S3_BUCKET + INPUT_FOLDER_PREFIX}. The output model is written to S3 at {OUTPUT_S3_BUCKET + submissionId}. Wait until the training job finishes.
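The last two steps above could be sketched as follows, assuming the AWS SDK for JavaScript v2 (`aws-sdk`), whose SageMaker client exposes `createTrainingJob` and a `trainingJobCompletedOrStopped` waiter. The helper names, the `mm-` job-name prefix, the `training` channel name, the volume size, and the runtime limit are all assumptions made for illustration; IAM_ROLE is assumed to hold a role ARN. The client is passed in as an argument so the sketch stays independent of how credentials are set up.

```javascript
// Builds the ECR image URI for a submission (image name = submissionId, tag = latest).
function ecrImageUri (accountId, region, submissionId) {
  return `${accountId}.dkr.ecr.${region}.amazonaws.com/${submissionId}:latest`
}

// Builds the request for SageMaker createTrainingJob from the config values.
function buildTrainingJobParams (submissionId, imageUri, cfg) {
  return {
    // Assumption: job names must be unique, so reprocessing may need a suffix.
    TrainingJobName: `mm-${submissionId}`,
    AlgorithmSpecification: {
      TrainingImage: imageUri,
      TrainingInputMode: 'File'
    },
    RoleArn: cfg.IAM_ROLE,
    InputDataConfig: [{
      ChannelName: 'training', // assumed channel name
      DataSource: {
        S3DataSource: {
          S3DataType: 'S3Prefix',
          S3Uri: `s3://${cfg.INPUT_S3_BUCKET}/${cfg.INPUT_FOLDER_PREFIX}`,
          S3DataDistributionType: 'FullyReplicated'
        }
      }
    }],
    OutputDataConfig: {
      S3OutputPath: `s3://${cfg.OUTPUT_S3_BUCKET}/${submissionId}`
    },
    ResourceConfig: {
      InstanceType: cfg.INSTANCE_TYPE,
      InstanceCount: Number(cfg.INSTANCE_NUMBER),
      VolumeSizeInGB: 10 // assumed value
    },
    StoppingCondition: { MaxRuntimeInSeconds: 3600 } // assumed limit
  }
}

// Starts the job and waits for it to finish.
// `sagemaker` is expected to be an AWS.SageMaker client, e.g.
//   const AWS = require('aws-sdk'); const sagemaker = new AWS.SageMaker()
async function runTrainingJob (sagemaker, params) {
  await sagemaker.createTrainingJob(params).promise()
  const result = await sagemaker
    .waitFor('trainingJobCompletedOrStopped', { TrainingJobName: params.TrainingJobName })
    .promise()
  if (result.TrainingJobStatus !== 'Completed') {
    throw new Error(`Training job ended with status ${result.TrainingJobStatus}`)
  }
  return result
}
```

The SDK waiter polls a bounded number of times, so very long training jobs may need a custom polling loop around `describeTrainingJob` instead.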
Important Notes
- Use the async/await pattern.
- Use standard as your linter. Ensure there are no lint errors in your submission.
- TypeScript is not allowed. No build or compilation of code is expected.
- No tests are needed.
Final Submission Guidelines
- code for new MM processor
- README and VERIFICATION guides should be provided in new MM processor.