Challenge Overview

In this challenge, we are going to:

  • create a new Kafka processor for Topcoder Marathon Match events
  • use AWS SageMaker to run Marathon Match submissions

Project Background

Topcoder currently has a custom processor built for scoring marathon matches.

This processor listens to our event bus for submissions, grabs the zip file for the submission, and then turns the submission into a docker image. It then runs the image as a container and produces outputs that a scoring function processes in another container.

At the end, a score is produced. The processor then sends the score, together with the requisite metadata and details, back to the review API for that submission.

We would like to adopt SageMaker to replace the current solution.

Technology Stack

  • Node.js 10
  • AWS SDK for Node.js
  • Docker

Challenge Assets

  • One existing Kafka processor codebase
  • Sample submission code

Individual requirements

Here is a sample of one message in Kafka:

{
  "topic": "submission.notification.aggregate",
  "originator": "me",
  "timestamp": "2020-01-21T00:01:00.000Z",
  "mime-type": "application/json",
  "payload": {
    "resource": "review",
    "id": "f3c3f4e3-6384-4042-a093-bac85d1dda81",
    "created": "2020-01-21T00:48:26.870Z",
    "updated": "2020-01-21T00:48:26.870Z",
    "createdBy": "mL4037GVeLQOvb8TgU4KW38mlevpfp4y@clients",
    "updatedBy": "mL4037GVeLQOvb8TgU4KW38mlevpfp4y@clients",
    "status": "completed",
    "score": 100,
    "reviewerId": "b8ff717c-d098-487e-a61f-4e40c7595a65",
    "submissionId": "d01c8c8b-2a7f-463f-992b-7bc5423e1dd0",
    "scoreCardId": 30001850,
    "typeId": "55bbb17d-aac2-45a6-89c3-a8d102863d05",
    "originalTopic": "submission.notification.create"
  }
}

1. (major) Create a new Kafka processor

An existing Kafka processor is shared in the forum. Create the new processor based on that code.

  • Update app.js to listen to KAFKA_SUBMISSION_AGGREGATE_TOPIC

    Check that the following payload fields match the values below. If any of them does not match, write a log entry and return without processing the message.

    • "originalTopic": "submission.notification.create"
    • "typeId": "55bbb17d-aac2-45a6-89c3-a8d102863d05"
    • "score": 100
    • "resource": "review"
  • Update ProcessorService.js to follow the steps (see below)

  • Please use the config package to define useful variables. Here is a list of things we would like to be configurable:

    • Kafka-related settings, such as URL and cert. They already exist in the current codebase.
    • KAFKA_SUBMISSION_AGGREGATE_TOPIC: defaults to submission.notification.aggregate if no environment variable is provided
    • Submission API related config: see https://www.npmjs.com/package/@topcoder-platform/topcoder-submission-api-wrapper
    • INPUT_S3_BUCKET: S3 bucket name, used by SageMaker for training data
    • INPUT_FOLDER_PREFIX: folder prefix within the input S3 bucket where the data is stored
    • OUTPUT_S3_BUCKET: S3 bucket name, used by SageMaker for uploading the model
    • INSTANCE_TYPE: instance type used for SageMaker
    • INSTANCE_NUMBER: number of instances for SageMaker
    • IAM_ROLE: AWS IAM role for running the SageMaker job. This is important because we don't want to use AWS root credentials.

Please add more config where applicable, and ask in the forum if you are not sure. Please keep the code clean and keep only useful files.
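The field check required in app.js could be sketched as follows. The names EXPECTED_FIELDS, findMismatches and shouldProcess are hypothetical helpers introduced for this example and are not part of the shared codebase. The logger parameter stands in for whatever logger the processor already uses.

```javascript
// Payload values a message must carry to be processed (taken from the requirements)
const EXPECTED_FIELDS = {
  originalTopic: 'submission.notification.create',
  typeId: '55bbb17d-aac2-45a6-89c3-a8d102863d05',
  score: 100,
  resource: 'review'
}

// Returns the names of the payload fields that do not match the expected values
function findMismatches (message) {
  const payload = (message && message.payload) || {}
  return Object.keys(EXPECTED_FIELDS).filter(
    key => payload[key] !== EXPECTED_FIELDS[key]
  )
}

// Returns true when the message should be handed to ProcessorService;
// otherwise logs the reason and returns false
function shouldProcess (message, logger = console) {
  const mismatches = findMismatches(message)
  if (mismatches.length > 0) {
    logger.info(`Ignoring message, unexpected values for: ${mismatches.join(', ')}`)
    return false
  }
  return true
}
```

In the Kafka message handler, the processor would return early when shouldProcess(message) is false and continue with the ProcessorService steps otherwise.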

2. (major) Update ProcessorService.js

  • Get submission id from message payload (payload.submissionId)

  • Use https://www.npmjs.com/package/@topcoder-platform/topcoder-submission-api-wrapper to download the submission

    A sample submission is provided. Its structure looks like this:

    code/
    |-- src/
    |-- Dockerfile

  • Unzip the downloaded submission into a temporary folder

  • Build the Docker image and upload it to AWS ECR

    • image name is {submissionId}
    • tag name is latest
  • Train the model by calling SageMaker. Input data is in the S3 location {INPUT_S3_BUCKET + INPUT_FOLDER_PREFIX}. The output model goes to the S3 location {OUTPUT_S3_BUCKET + submissionId}. Wait until the training job finishes.
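The last two steps can be sketched as below, assuming version 2 of the AWS SDK for JavaScript. buildImageUri, buildTrainingJobParams and trainAndWait are hypothetical helper names. The account id and region arguments, the mm- job name prefix, the channel name, the volume size and the runtime limit are assumptions chosen for illustration. The sagemaker argument is expected to be an AWS.SageMaker client created by the caller, and the image is expected to have been built and pushed to ECR beforehand.

```javascript
// ECR URI of the image built from the submission: name {submissionId}, tag latest
function buildImageUri ({ accountId, region, submissionId }) {
  return `${accountId}.dkr.ecr.${region}.amazonaws.com/${submissionId}:latest`
}

// Parameters for SageMaker CreateTrainingJob, derived from the config values
function buildTrainingJobParams (submissionId, imageUri, cfg) {
  return {
    TrainingJobName: `mm-${submissionId}`, // assumed naming scheme
    AlgorithmSpecification: {
      TrainingImage: imageUri,
      TrainingInputMode: 'File'
    },
    RoleArn: cfg.IAM_ROLE,
    InputDataConfig: [{
      ChannelName: 'training', // assumed channel name
      DataSource: {
        S3DataSource: {
          S3DataType: 'S3Prefix',
          S3Uri: `s3://${cfg.INPUT_S3_BUCKET}/${cfg.INPUT_FOLDER_PREFIX}`,
          S3DataDistributionType: 'FullyReplicated'
        }
      }
    }],
    OutputDataConfig: {
      S3OutputPath: `s3://${cfg.OUTPUT_S3_BUCKET}/${submissionId}`
    },
    ResourceConfig: {
      InstanceType: cfg.INSTANCE_TYPE,
      InstanceCount: cfg.INSTANCE_NUMBER,
      VolumeSizeInGB: 10 // assumed value
    },
    StoppingCondition: { MaxRuntimeInSeconds: 3600 } // assumed limit
  }
}

// Starts the training job and waits until it completes or stops.
// `sagemaker` is an AWS.SageMaker client (aws-sdk v2) supplied by the caller.
async function trainAndWait (sagemaker, params) {
  await sagemaker.createTrainingJob(params).promise()
  const job = await sagemaker
    .waitFor('trainingJobCompletedOrStopped', { TrainingJobName: params.TrainingJobName })
    .promise()
  if (job.TrainingJobStatus !== 'Completed') {
    throw new Error(`Training job ended with status ${job.TrainingJobStatus}`)
  }
  return job
}
```

Keeping the parameter construction in pure functions makes it possible to check the generated S3 locations without calling AWS. Only trainAndWait touches the SageMaker client.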

Important Notes
- Use the async/await pattern.
- Use standard as your linter. Ensure there are no lint errors in your submission.
- TypeScript is not allowed. No build or compilation of code is expected.
- No tests are needed.


Final Submission Guidelines

  • Code for the new MM processor
  • README and VERIFICATION guides, provided in the new MM processor

Eligible Events

  • 2020 Topcoder(R) Open

Review Style

  • Final Review: Community Review Board
  • Approval: User Sign-Off

Challenge ID: 30113997