Challenge Overview

 

Welcome to the Infant Nutrition backend API challenge. In this challenge, we aim to create a backend API for the infant nutrition dashboard tool.

Project Overview

In this project we will be:

  • Scraping retail sites for product info, ratings, reviews, nutrients and ingredients data

  • Identifying competing products across brands based on ingredients and nutrients data

  • Analyzing user reviews to identify topics, positives, and negatives for each product group and brand

  • Looking for identified items in social media posts to estimate how popular/important each one of them is

  • Providing reports that allow for drill-down per topic, brand, product group or individual product level

Technology Stack

  • NodeJS

  • Mongo

 

Assets

We’re starting a new codebase. It’s up to you to create the base code for the tool. 

A backup of the products database is available in the forums.

Individual requirements

So far we have built a scraper tool that writes product data to a Mongo collection, and a data extraction tool that adds further product attributes (review sentiments, review topics, etc.). In short, we have a collection of products saved in the Mongo database. Now we would like to modify the data structure in the database so that it captures historical data (e.g. the search rank each time the scraper is run, product price changes, etc.) and build a simple read API that exposes the data along with a few filtering and grouping options.

The current product document structure is

  • id

  • sku

  • upc

  • gtin

  • source

  • name

  • description

  • descriptionDetail

  • reviews: rating, title, date, textContent, sentiment: positive, negative, neutral, compound

  • Price

  • Ranking: keyword:rank

  • rating: overall, total, fiveStars, fourStars, threeStars, twoStars, oneStars

  • lastUpdated

  • productUrl

  • ingredients: name, amount, unit, referenceValue

  • nutrients: name, amount, unit, referenceValue

  • sentiment: positive, negative, neutral, compound

  • topics: positive, negative

You need to update the document structure to enable persisting historical values for these attributes:

  • Price

  • Ranking

  • Rating

Note: reviews already have a date field that can be used for filtering by time interval.

It is up to you whether to store the historical values as new attributes in the product document or in separate collections (each option trades off ease of implementation against query performance).
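As a rough illustration of the first option (history embedded in the product document), the sketch below appends a dated snapshot on every scraper run and refreshes the latest values. The field names priceHistory, ratingHistory and rankingHistory and the helper applyScrape are assumptions made for this example; they are not part of the existing schema.

```javascript
// Hypothetical helper: merges one scraper run into a product document.
// priceHistory / ratingHistory / rankingHistory are assumed field names.
function applyScrape(product, scrape, scrapedAt) {
  const entry = (value) => ({ date: scrapedAt, value });
  return {
    ...product,
    // Latest values stay at the top level so existing readers keep working.
    price: scrape.price,
    rating: scrape.rating,
    ranking: scrape.ranking,
    lastUpdated: scrapedAt,
    // Each run appends one dated snapshot per tracked attribute.
    priceHistory: [...(product.priceHistory || []), entry(scrape.price)],
    ratingHistory: [...(product.ratingHistory || []), entry(scrape.rating)],
    rankingHistory: [...(product.rankingHistory || []), entry(scrape.ranking)],
  };
}

// Usage: two scraper runs on consecutive days.
let product = { id: 'p1', name: 'Sample formula' };
product = applyScrape(
  product,
  { price: 24.99, rating: { overall: 4.5, total: 120 }, ranking: { 'baby formula': 3 } },
  new Date('2020-03-01T00:00:00Z'),
);
product = applyScrape(
  product,
  { price: 22.49, rating: { overall: 4.4, total: 131 }, ranking: { 'baby formula': 5 } },
  new Date('2020-03-02T00:00:00Z'),
);

console.log(product.priceHistory.map((h) => h.value));
```

In MongoDB the same idea could be expressed as a single update combining $set for the latest values with $push for the history arrays. Embedded arrays keep the product details endpoint to one read, but documents grow with every run; a separate history collection avoids that growth at the cost of an extra query or lookup.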

The following API endpoints need to be implemented

  1. /docs - serves the swagger UI for the API

  2. Product search (/search)
    Search by product name. Returns only product id, brand, name, detail. Supports pagination and filtering by brand

  3. Product details (/products/:id)
    Returns latest product details (name, description, descriptionDetail, url, ingredients, nutrients, sentiment, topics, images) and historical data for price, ranking and ratings. Images should contain only an id, not the complete image data

  4. Product images (/products/:id/images/:id)
    Returns the product's image

  5. Product reviews (/products/:id/reviews)
    Returns product reviews

  6. Brand nutrients (/brands/nutrients)
    Aggregates the nutrients data per brand and returns an array of nutrients for each brand with nutrient value = percentage of products that contain that nutrient (product.nutrients.amount>0)

  7. Brand ingredients (/brands/ingredients)
    Aggregates the ingredients data per brand and returns an array of ingredients for each brand with ingredient value = percentage of products that contain that ingredient (product.ingredients.amount>0)

  8. Brand rating statistics (/brands/ratings?startDate&endDate)
    Aggregates historical brand ratings data and returns number of new ratings for each brand in the provided time interval

  9. Brand review statistics (/brands/reviews?startDate&endDate)
    Aggregates historical brand review sentiment data and returns the number of new reviews, the number of reviews with positive, negative and neutral sentiment, and the average review sentiment for each brand in the provided time interval

  10. Products with largest price changes (/stats/priceChanges?startDate&endDate)
    Returns products with the largest price change (in percent) in the provided time interval - returns top 10 results

  11. Products with largest rating changes (/stats/ratingChanges?startDate&endDate)
    Returns products with largest overall rating change in the provided time interval - returns top 10 results

  12. Products with largest sentiment changes (/stats/sentimentChanges?startDate&endDate)
    Returns products with largest sentiment change in the provided time interval - returns top 10 results
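The "largest change" endpoints (items 10 to 12) share one pattern: take the first and last historical value inside the interval, compute the change, sort, and keep the top 10. A minimal in-memory sketch for price changes follows. It assumes a hypothetical priceHistory array of { date, value } entries on each product, and it interprets "largest" as largest absolute percentage change, which the requirements do not state explicitly.

```javascript
// Assumed input shape (hypothetical): { id, priceHistory: [{ date, value }] }.
function topPriceChanges(products, startDate, endDate, limit = 10) {
  const changes = [];
  for (const product of products) {
    const inRange = (product.priceHistory || [])
      .filter((h) => h.date >= startDate && h.date <= endDate)
      .sort((a, b) => a.date - b.date);
    if (inRange.length < 2) continue; // need two points to measure a change
    const first = inRange[0].value;
    const last = inRange[inRange.length - 1].value;
    if (first === 0) continue; // percentage change is undefined for a zero base
    changes.push({
      id: product.id,
      first,
      last,
      changePct: ((last - first) / first) * 100,
    });
  }
  // Assumption: rank by magnitude, so large drops count as much as large rises.
  return changes
    .sort((a, b) => Math.abs(b.changePct) - Math.abs(a.changePct))
    .slice(0, limit);
}

// Usage with a few sample products.
const day = (s) => new Date(`${s}T00:00:00Z`);
const sampleProducts = [
  { id: 'a', priceHistory: [{ date: day('2020-03-01'), value: 20 }, { date: day('2020-03-10'), value: 25 }] },
  { id: 'b', priceHistory: [{ date: day('2020-03-02'), value: 10 }, { date: day('2020-03-09'), value: 5 }] },
  { id: 'c', priceHistory: [{ date: day('2020-03-05'), value: 8 }] },
];
const priceChanges = topPriceChanges(sampleProducts, day('2020-03-01'), day('2020-03-31'));
console.log(priceChanges.map((r) => `${r.id}: ${r.changePct}%`));
```

For larger data sets the same computation would more likely run inside the database as an aggregation pipeline; the in-memory version is only meant to pin down the expected result for verification.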

Create a Dockerfile for the app and a docker-compose file that runs the app and starts a MongoDB instance. Also create a script that generates some demo historical data for verification.
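One possible shape for the demo data script is sketched below: it produces a daily price history per product from a seeded pseudo-random generator, so repeated runs yield identical data and verification results stay reproducible. The function names, the { date, value } entry shape and the roughly 5% daily drift are all assumptions chosen for illustration.

```javascript
// Small deterministic pseudo-random generator (linear congruential), used
// instead of Math.random() so the generated demo data is repeatable.
function makeRandom(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296; // value in [0, 1)
  };
}

// Hypothetical generator: one price entry per day, ending at endDate.
function generatePriceHistory(basePrice, days, endDate, seed = 1) {
  const random = makeRandom(seed);
  const dayMs = 24 * 60 * 60 * 1000;
  const history = [];
  let price = basePrice;
  for (let i = days - 1; i >= 0; i -= 1) {
    const date = new Date(endDate.getTime() - i * dayMs);
    // Drift by roughly +/-5% per day, rounded to cents, never below 0.01.
    const factor = 0.95 + random() * 0.1;
    price = Math.max(0.01, Math.round(price * factor * 100) / 100);
    history.push({ date, value: price });
  }
  return history;
}

// Usage: one week of demo prices for a single product.
const demoHistory = generatePriceHistory(19.99, 7, new Date('2020-03-07T00:00:00Z'));
console.log(demoHistory.length, demoHistory[0].date.toISOString());
```

The same approach extends to ratings and rankings; the resulting entries can then be written to the database in whichever structure was chosen for the historical values.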

 

What to submit

  • Submit the full source code for the API and a README with configuration, deployment and verification steps

  • Submit a Postman collection for verifying the API



 

Final Submission Guidelines

See above

ELIGIBLE EVENTS:

2020 Topcoder(R) Open

Review style

Final Review

Community Review Board

Approval

User Sign-Off

ID: 30123848