Challenge Overview

Context

Project Context

Ubahn is an employee management system that identifies employees who are no longer working on active projects and captures their qualifications and expertise so they can be matched to other suitable projects.

 

Challenge Context

Update our existing Elasticsearch implementation to use Elasticsearch’s enrich processor feature.

 

Expected Outcome

Currently, when we fetch user data, the response already contains the enriched data; however, that enrichment is performed by the application code at read time. After this contest, we expect the data to be enriched through Elasticsearch’s enrich processor instead.

 

Challenge Details

Technology Stack

  • Node.js version 12

  • Elasticsearch version 7.7 (note that the earlier setup expected 7.4 - for this contest you must use 7.7, since the enrich feature is not available in 7.4)

 

Development Assets

Two code bases are involved here:

  • You can find the API code base here. Ensure that you are using the “develop” branch.

  • You can find the Elasticsearch processor code base here. Ensure that you are using the “develop” branch.

 

Individual Requirements

Before we dive into the requirements, let us first understand the current behaviour.

  • Start by deploying the API, following the README.md file of the API codebase. Note that the README incorrectly lists the Elasticsearch version as 6.x - the code actually targets 7.x (7.4, to be precise), and for this contest you should use 7.7.

  • After deploying, insert mock data through the npm run insert-data script.

  • Once done, make an API request to POST /skill-search/users or GET /users?enrich=true (import the Postman collection in the docs folder) to get the users along with their enriched data. Enriched data here refers to the achievements, skills, attributes, external profiles, roles and other details that are returned along with the user details in the response.

  • If you debug this workflow, you will find that the code first fetches the user details and then individually fetches the related data for each user before returning the response. The two functions involved are searchUsers() and searchElasticsearch(), present in this file. A simplified sketch of this read-time flow is shown after this list.
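
For illustration only, the current read-time enrichment pattern looks roughly like the sketch below. This is a hypothetical simplification, not the actual searchUsers()/searchElasticsearch() code; the client setup, index names (user, userskill) and field names are assumptions.

```js
// Hypothetical sketch of read-time enrichment (names are illustrative).
const { Client } = require('@elastic/elasticsearch')
const client = new Client({ node: 'http://localhost:9200' })

async function searchUsersSketch (keyword) {
  // 1. Fetch the matching user documents
  const { body: userResult } = await client.search({
    index: 'user', // assumed index name
    body: { query: { match: { handle: keyword } } }
  })
  const users = userResult.hits.hits.map(hit => hit._source)

  // 2. For every user, individually fetch the related documents and
  //    attach them to the response ("enrichment" during read)
  for (const user of users) {
    const { body: skillResult } = await client.search({
      index: 'userskill', // assumed index name
      body: { query: { term: { userId: user.id } } }
    })
    user.skills = skillResult.hits.hits.map(hit => hit._source)
    // ...same pattern for achievements, attributes, external profiles, roles
  }
  return users
}
```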

 

Now, let’s move on to the requirements of this contest:

  • We are not expecting any changes to the responses of the API requests described earlier. We are, however, changing how we gather this data.

  • Currently, the “enrichment” happens during read - we fetch the users, then fetch the related data (in the same request) and combine it all before returning the response.

  • We want to instead start using Elasticsearch’s “Enrich Processor” feature (available in Elasticsearch 7.7 and NOT in 7.4) to enrich the data.

  • Where we currently enrich the data during read, after implementing the enrich processor you will be enriching the data during write: when a user record is created or updated in Elasticsearch, the stored document is enriched at that point.

  • When we then make the same API requests as before, the data is returned exactly as before (because it is already enriched and stored in the index) and does not undergo any additional enrichment in the application code.

  • Note that the existing filters used in the API are still applicable - you are only changing how the data is enriched.

  • You can check out this example, which explains in simple steps how one goes about “enriching” the data. You can apply the same logic to our application; a sketch of the policy setup, in the spirit of that example, follows this list.
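
As a rough guide, creating and executing an enrich policy with the Node.js Elasticsearch client could look like the sketch below. The policy, index and field names (user-skill-policy, userskill, userId, the enrich_fields list) are assumptions for illustration; map them to the actual indices used by the application.

```js
// Hypothetical sketch: define and execute a "match" enrich policy.
const { Client } = require('@elastic/elasticsearch')
const client = new Client({ node: 'http://localhost:9200' })

async function setupSkillEnrichPolicy () {
  // A match policy copies the listed enrich_fields from the source index into
  // any incoming document whose lookup value matches match_field.
  await client.enrich.putPolicy({
    name: 'user-skill-policy', // assumed policy name
    body: {
      match: {
        indices: 'userskill', // assumed source index
        match_field: 'userId',
        enrich_fields: ['skillId', 'name', 'certifierId'] // assumed fields
      }
    }
  })

  // Build the policy's internal enrich index. The policy must be re-executed
  // whenever the source data changes, since enrich indices are not updated
  // automatically.
  await client.enrich.executePolicy({ name: 'user-skill-policy' })
}
```

In practice the application would likely need one policy per related entity (skills, achievements, attributes, external profiles, roles), with the processor re-executing the relevant policy when that entity’s data changes.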

 

Where does the Elasticsearch processor come into the picture?

When data is modified (created, updated or deleted) through the API, the API only carries out the action in the Amazon QLDB database. It then sends a message to the bus API, which posts the message to Kafka. This message is read by the Elasticsearch processor, which is the component that modifies the data in Elasticsearch. Think of the API as only reading data from Elasticsearch, while the processor is the one that writes the data.

  • While you will be making changes to the API, the changes are related to how we read the data.

  • The bulk of your changes would actually go into the Elasticsearch processor, where we write the data into Elasticsearch.

  • Enrich policy creation and execution, ingest pipeline creation, and use of that pipeline during insertion of data - all of these belong in the Elasticsearch processor. A sketch of these write-side pieces is shown after this list.
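
To make the write-side change concrete, the sketch below shows an ingest pipeline that uses the enrich processor, and a user document being indexed through it. Again, the pipeline, policy, index and field names are assumptions, and the real processor would wire this into its existing create/update handlers.

```js
// Hypothetical sketch of the write-side changes in the processor.
const { Client } = require('@elastic/elasticsearch')
const client = new Client({ node: 'http://localhost:9200' })

async function setupUserPipeline () {
  await client.ingest.putPipeline({
    id: 'user-enrich-pipeline', // assumed pipeline id
    body: {
      processors: [
        {
          enrich: {
            policy_name: 'user-skill-policy', // assumed policy from the previous sketch
            field: 'id', // lookup field on the incoming user document
            target_field: 'skills', // where the matched documents are written
            max_matches: 128 // allow multiple skills per user
          }
        }
        // ...additional enrich processors for achievements, attributes,
        // external profiles, roles, etc.
      ]
    }
  })
}

async function indexUser (user) {
  // Every create/update goes through the pipeline, so the stored document is
  // already enriched and the API needs no read-time lookups.
  await client.index({
    index: 'user', // assumed index name
    id: user.id,
    body: user,
    pipeline: 'user-enrich-pipeline'
  })
}
```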

 

Important Notes

  • Use standard for your lint tool

  • Use the async/await pattern

  • For the API, tests are not needed. For the processor, updating the tests is a nice-to-have.

  • You are expected to modify the existing mock data insertion scripts in the API so that they work with the new enrichment approach.

  • Both GET /users?enrich=true and POST /skill-search/users endpoints are in scope. The changes that you make should be applicable to both.

 

Deployment guide and validation document

Update the deployment guide as needed.

 

Scorecard Aid

Judging Criteria

  • Implementing the enrichment feature in both the API and Processor is a major requirement.

  • Updating the mock data and scripts is a minor requirement.

  • Reviewers will use the respective sections in the scorecard, depending on the type of requirement.



Final Submission Guidelines

You need to submit a zip file containing git patches of your changes, which will be applied against the “develop” branch. Submit two git patch files - one for the API and one for the processor. This is the recommended approach. If you face issues generating the patches, you may upload the entire code base in the zip file instead, but as far as possible, please upload ONLY the patch files.

 

ELIGIBLE EVENTS:

2021 Topcoder(R) Open

REVIEW STYLE:

Final Review:

Community Review Board

Approval:

User Sign-Off


ID: 30145434