Challenge Overview

Welcome to the Topcoder challenge. Implement this task and submit your solution before the deadline. Reviewers will score the submissions, and the 2 competitors with the highest scores (at or above the minimum passing score) will receive the prizes. Learn "How to compete" and read the task below.

Context

Project Context

Topcoder Connect is the client-facing application of Topcoder. Customers can create Topcoder projects there, and Topcoder managers and copilots pick them up from there.

Challenge Context

Topcoder Project Service is the main backend service of Topcoder Connect and is constantly under development. When we work on Topcoder Project Service, we set it up locally and usually need some demo data for testing. We already have a script for creating demo data, but it has several disadvantages that make it hard to use during local development. The main one is that it imports data through the Topcoder Project Service API, and the API has limitations, so we cannot recreate the exact data in the database. Another disadvantage is that when we use the API, data is created in the database instantly, but it takes around 10 minutes for all the data to be indexed in Elasticsearch, because indexing is done by a separate service, Project ES Processor, which receives data via Kafka streams. Please have a look at the Topcoder Project Service Architecture for details.

Expected Outcome

Two scripts: one to export the current data from the database to a file, and another to import data from a file into the database (DB) and the Elasticsearch index (ES).

Challenge Details

Technology Stack

  • Node.js

  • PostgreSQL

  • Elasticsearch

Code access

The work for this challenge has to be done in 1 repository:
- Project Service repo https://github.com/topcoder-platform/tc-project-service branch `feature/export-import` commit `23b9374fe2685b851755321d30e42c784b4e2e4e` or later.

- Please follow the Local Development guide for local setup.

- Config for local setup is provided on the forum.

Individual requirements

Export script

  • Create an npm command `npm run data:export` that exports data from the database to a JSON file.
    - By default, it should export to file `data/demo-data.json`.
    - But we should be able to define another file for export as a command argument, for example like this: `npm run data:export -- --file path/to/another/place/file-name.json`.
    - The script should create the path if it doesn't exist.
    - If the file already exists, the script should ask whether to overwrite it. Only if we type "y" or "Y" should it overwrite the file; otherwise, it should stop (a minimal sketch of this flow is shown after this list).

  • We use soft-delete when deleting records in the database, and soft-deleted records should also be exported.
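
A minimal sketch of the export flow, assuming a plain `process.argv` check for the `--file` argument and Sequelize models exported from `src/models` (the actual model list, helper names, and soft-delete mechanism in the repo may differ):

```js
// scripts/data/export.js — hypothetical sketch, not the final implementation
import fs from 'fs';
import path from 'path';
import readline from 'readline';
import models from '../../src/models'; // assumes Sequelize models are exported here

const DEFAULT_FILE = 'data/demo-data.json';
// Models to export; extend this list to support new models in the future.
const MODELS_TO_EXPORT = ['Project', 'ProjectPhase', 'PhaseProduct', 'ProjectAttachment',
  'ProjectMember', 'ProjectMemberInvite'];

// Ask the user whether an existing file may be overwritten.
function confirmOverwrite(filePath) {
  const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
  return new Promise((resolve) => {
    rl.question(`File "${filePath}" already exists. Overwrite? (y/N) `, (answer) => {
      rl.close();
      resolve(answer === 'y' || answer === 'Y');
    });
  });
}

async function exportData() {
  const fileFlag = process.argv.indexOf('--file');
  const filePath = fileFlag !== -1 ? process.argv[fileFlag + 1] : DEFAULT_FILE;

  if (fs.existsSync(filePath) && !(await confirmOverwrite(filePath))) {
    console.log('Export cancelled.');
    return;
  }
  fs.mkdirSync(path.dirname(filePath), { recursive: true }); // create the path if missing

  const data = {};
  for (const modelName of MODELS_TO_EXPORT) {
    // `paranoid: false` also returns soft-deleted records (assuming Sequelize paranoid mode)
    data[modelName] = await models[modelName].findAll({ paranoid: false, raw: true });
  }
  fs.writeFileSync(filePath, JSON.stringify(data, null, 2));
  console.log(`Exported ${Object.keys(data).length} models to ${filePath}`);
}

exportData().catch((err) => { console.error(err); process.exit(1); });
```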

Import script

  • Create an npm command `npm run data:import` that imports data from a file into the database and then indexes data from the database into the Elasticsearch index.
    - By default, it should import data from file `data/demo-data.json`.
    - But we should be able to define another file for import as a command argument, for example like this: `npm run data:import -- --file path/to/another/place/file-name.json`.
    - If the file to import doesn't exist, the script should show an error message.

  • As we export soft-deleted records to the file, we also have to import them from the file, and they should stay soft-deleted in the database.

  • After data is imported from the file to the database, all data in the database should be indexed to the Elasticsearch index (a minimal sketch of the overall flow follows this list).
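
A minimal sketch of the import flow, again assuming Sequelize models and the reusable indexing helpers `indexMetadata` and `indexProjectsRange` from `src/utils/es.js` described below (exact signatures may differ):

```js
// scripts/data/import.js — hypothetical sketch, not the final implementation
import fs from 'fs';
import models from '../../src/models';
import { indexMetadata, indexProjectsRange } from '../../src/utils/es';

const DEFAULT_FILE = 'data/demo-data.json';

async function importData() {
  const fileFlag = process.argv.indexOf('--file');
  const filePath = fileFlag !== -1 ? process.argv[fileFlag + 1] : DEFAULT_FILE;

  if (!fs.existsSync(filePath)) {
    console.error(`File "${filePath}" doesn't exist.`);
    process.exit(1);
  }
  const data = JSON.parse(fs.readFileSync(filePath, 'utf8'));

  // Insert records exactly as exported; soft-deleted rows keep their `deletedAt`/`deletedBy`
  // values, so they stay soft-deleted in the database.
  for (const [modelName, records] of Object.entries(data)) {
    await models[modelName].bulkCreate(records);
  }

  // Index everything from the database to Elasticsearch directly,
  // without Kafka or the Project ES Processor.
  await indexMetadata();
  const projectIds = (data.Project || []).map(p => p.id);
  if (projectIds.length > 0) {
    await indexProjectsRange(Math.min(...projectIds), Math.max(...projectIds));
  }
  console.log(`Imported data from ${filePath} and indexed it in Elasticsearch.`);
}

importData().catch((err) => { console.error(err); process.exit(1); });
```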

Index data

As part of the "import script", data has to be indexed from the database to the Elasticsearch index. For this purpose, we cannot reuse our current approach with Kafka + Project ES Processor: we don't want this script to depend on those services, and that way of indexing is slow. Instead, we want methods that index data directly from the database into the Elasticsearch index. We already have such code:

  • indexMetadata - a method that indexes all models of the `metadata` index.

  • project-index-create.js - this is the code of a special admin endpoint that indexes a list of projects with ids from "projectIdStart" to "projectIdEnd" into the `projects` index.
    - Extract the code from this endpoint into a reusable method `indexProjectsRange` for indexing a list of projects.
    - Use this new method in the admin endpoint and make sure the endpoint works the same as before. You can also use this method in the "import script".
    - This method requests additional user details for each project member using `getMemberDetailsByUserIds`. As we might index many projects, to speed things up, implement caching of user details inside the new reusable method `indexProjectsRange`, so that `getMemberDetailsByUserIds` is called at most once per `userId`. If we already know all the members of a project, don't call `getMemberDetailsByUserIds` at all. Such caching should also work in the existing admin endpoint (see the caching sketch after this list).

  • Note that, unlike in the database, soft-deleted records are never indexed in the Elasticsearch index; our existing code already takes this into account.
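
A minimal sketch of the member-details caching idea inside the reusable `indexProjectsRange` method, assuming `getMemberDetailsByUserIds(userIds)` returns an array of objects containing a `userId` field (the real data shapes and call sites in the repo may differ):

```js
// src/utils/es.js — hypothetical sketch of per-run caching of member details
async function enrichProjectMembers(projects, getMemberDetailsByUserIds) {
  const cache = {}; // userId -> member details, shared across all projects in this run

  for (const project of projects) {
    const userIds = project.members.map(m => m.userId);
    // Only fetch users we haven't seen yet; if every member of this project is
    // already cached, getMemberDetailsByUserIds isn't called at all.
    const missing = userIds.filter(id => !(id in cache));
    if (missing.length > 0) {
      const details = await getMemberDetailsByUserIds(missing);
      details.forEach((d) => { cache[d.userId] = d; });
    }
    project.members = project.members.map(m => Object.assign({}, m, cache[m.userId]));
  }
  return projects;
}
```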

DB Models in Scope

  • The export and import scripts (with indexing) should support the following models (a configuration sketch follows this list):
    - All models in `metadata` index, see list.
    - All models in `projects` index: Project, ProjectPhase, PhaseProduct, ProjectAttachment, ProjectMember, ProjectMemberInvite
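
A minimal sketch of how the scripts could declare the supported models per Elasticsearch index, so that adding a new model later only means extending this configuration (the `projects` model names come from the list above; the file name and structure are assumptions):

```js
// scripts/data/dataModels.js — hypothetical configuration shared by export and import
// Order matters for import: parent records must be inserted before their children.
export default {
  metadata: [
    // all models of the `metadata` index, see the list referenced above
  ],
  projects: [
    'Project',
    'ProjectPhase',
    'PhaseProduct',
    'ProjectAttachment',
    'ProjectMember',
    'ProjectMemberInvite',
  ],
};
```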

General requirements

  • We should be able to easily add new models to be supported by the export and import scripts in the future.

  • Split code into meaningful methods instead of creating big methods that do everything.

  • All the code that could potentially be used in the application should be placed inside the `src` directory.

  • The code that can only be used by the export/import scripts should be placed in the `scripts/data` directory.

  • The code for reusable methods for indexing data from the database to Elasticsearch should be placed in `src/utils/es.js`.

  • Add a new section to README.md with instructions on how to use the export and import scripts.

  • Add JSDoc for new functions (see the example after this list).

  • Lint should pass (don’t disable lint rules).
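
For illustration, a JSDoc block in the style expected for the new functions; the parameter names follow the admin endpoint described above, and the exact signature is up to the implementation:

```js
/**
 * Indexes projects with ids from `projectIdStart` to `projectIdEnd` (inclusive)
 * from the database into the `projects` Elasticsearch index.
 *
 * @param {number} projectIdStart id of the first project to index
 * @param {number} projectIdEnd   id of the last project to index
 * @param {string} [indexName]    optional Elasticsearch index name override
 * @returns {Promise<void>}       resolves when all projects are indexed
 */
async function indexProjectsRange(projectIdStart, projectIdEnd, indexName) {
  // ...
}
```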

Deployment guide and validation document

To validate the scripts, create an example file with demo data `data/demo-data.json` using the following guide:
- Use the existing command to import metadata from the DEV server, see "Import sample metadata projects" (some records fail to import, which is OK, see the correct output screenshot).
- Run Connect App locally, see "Run Connect App with Project Service locally".
- Now, using the local version of Connect App, create 2 projects using the following links: Talent as a Service and Design, Development & Deployment.
- Make sure that we have example data for all DB Models in Scope; if not, create example data for the remaining models (you may use Postman).

Write a brief validation document explaining how to verify the submission.

Scorecard Aid

  • This challenge will be scored using the Basic Code Challenge 0.0.5 scorecard.

  • The major requirement is that the scripts export and import data correctly without losing any data. It is expected that if we have some data, export it, clear the DB and ES, and import it back, we end up in the initial state.

  • Correct indexing in Elasticsearch during importing is also a major requirement.

  • The rest of the requirements can be treated as minor.

  • Code quality should also be reviewed against common best practices and the "General requirements" section.



Final Submission Guidelines

Submit a zip file that includes:

  • a git patch with the changes you've made to the code in our repository

  • a validation document in any common format: Markdown, DOC, PDF, HTML, and so on

Additionally, the winner will be required to raise a pull request to the repository after the challenge is completed.

ELIGIBLE EVENTS:

2020 Topcoder(R) Open

REVIEW STYLE:

Final Review: Community Review Board

Approval: User Sign-Off

ID: 30120906