Project Overview
The goal of this project is to create a web application that can do the following:
- Extract subsidiary information from pre-defined sources for a given entity.
- Configure the web sources and search for negative news about the Entity Name, Entity Owner, Parent Name and Subsidiary names.
- Present a consolidated negative news report.
Challenge Overview
In a previous challenge we built a basic negative news search service. For this challenge we need you to enhance that service with new filtering capabilities.
General Requirements
- The service must be built as a REST API
- The service must be a microservice
- You need to clearly document the API using Swagger
- Open source libraries with Apache v2 and MIT licenses are ok; for any others you must get approval from us first.
Previous Challenge Requirements
The input to this service will just be the entity name, which will serve as the keyword to search for negative news from the following sources:
Please note these sources should be configurable instead of hardcoded.
This service needs to support filtering by date range and should support pagination.
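As a rough illustration of the date range and pagination requirement, the following is a minimal in-memory sketch. The function name `filterAndPaginate`, the `publishedAt` field, and the `from`, `to`, `page` and `pageSize` parameters are all assumptions made for this example; the challenge does not prescribe them, and a real implementation would more likely push this logic into the MongoDB query.

```javascript
// Hypothetical helper: the field name (publishedAt) and the parameter
// names (from, to, page, pageSize) are assumptions, not part of the spec.
function filterAndPaginate(items, { from, to, page = 1, pageSize = 10 } = {}) {
  const fromTime = from ? new Date(from).getTime() : -Infinity;
  const toTime = to ? new Date(to).getTime() : Infinity;

  // Keep only items published inside the inclusive [from, to] range.
  const inRange = items.filter((item) => {
    const t = new Date(item.publishedAt).getTime();
    return t >= fromTime && t <= toTime;
  });

  // Fall back to defaults when paging inputs are not positive integers.
  const size = Number.isInteger(pageSize) && pageSize > 0 ? pageSize : 10;
  const current = Number.isInteger(page) && page > 0 ? page : 1;
  const start = (current - 1) * size;

  return {
    total: inRange.length,
    page: current,
    pageSize: size,
    items: inRange.slice(start, start + size),
  };
}

// Usage with made-up sample data.
const sample = [
  { title: "A", publishedAt: "2019-01-05" },
  { title: "B", publishedAt: "2019-02-10" },
  { title: "C", publishedAt: "2019-03-15" },
  { title: "D", publishedAt: "2019-04-20" },
];

const result = filterAndPaginate(sample, {
  from: "2019-02-01",
  to: "2019-04-30",
  page: 2,
  pageSize: 2,
});
console.log(result.total, result.items.map((i) => i.title));
```

Returning `total` alongside the page lets API callers compute how many pages exist without a second request.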
Once news items are found, the service also needs to identify the category each item belongs to and return the items grouped by category. Below are the categories we need to support in this version:
- Finance
- Corporate Governance
- Sustainability
- Business
- Operational
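To make the categorization step concrete, here is a minimal sketch of a keyword-based categorizer that groups news items under the five categories above. The keyword lists, the `title` field, and the function names `categorize` and `groupByCategory` are purely illustrative assumptions; the actual keywords and matching rules are not given in this document.

```javascript
// Illustrative keyword lists only; the real lists are not given in the spec.
const CATEGORY_KEYWORDS = {
  "Finance": ["fraud", "bankruptcy", "debt"],
  "Corporate Governance": ["board", "resignation", "insider"],
  "Sustainability": ["pollution", "emissions", "spill"],
  "Business": ["lawsuit", "antitrust", "recall"],
  "Operational": ["outage", "strike", "shutdown"],
};

// Return every category whose keyword list shares a word with the text.
function categorize(text) {
  const words = new Set(text.toLowerCase().match(/[a-z]+/g) || []);
  return Object.entries(CATEGORY_KEYWORDS)
    .filter(([, keywords]) => keywords.some((k) => words.has(k)))
    .map(([category]) => category);
}

// Group items by category; an item may appear under several categories,
// and items that match no keyword are left out of the result.
function groupByCategory(newsItems) {
  const grouped = {};
  for (const item of newsItems) {
    for (const category of categorize(item.title)) {
      if (!grouped[category]) grouped[category] = [];
      grouped[category].push(item);
    }
  }
  return grouped;
}

console.log(categorize("Board member resigns amid fraud probe"));
```

Exact word matching like this misses inflected forms ("resigns" does not match "resignation"), which is one reason a keyword-only approach tends to be brittle.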
The results will need to be saved in a MongoDB database for offline / later use.
Current Challenge Requirements
All features that we implemented in the previous challenge should continue to work after this challenge is done. In addition, this challenge needs to take care of the following requirements:
- Currently we have a basic, keyword-based implementation to categorize news; we’d love to see this improved in this challenge.
- We need to apply the following filters on top of the search results:
  - Factors (i.e. categories): the list of factors should be sent as an optional parameter via the API. Please note that API callers should not send free-form strings / text; instead they should send factor ids, which are supposed to be stored in the database.
  - Sources: the list of sources should be sent as an optional parameter via the API. Please note that API callers should not send free-form source URLs; instead they should send source ids, which are supposed to be stored in the database.
  - Duration: again, API callers should send the id of the duration option.
  - Relationships: there are 4 different types of relationships that should be supported; API callers should send the ids of the chosen relationships:
    - Parent’s Name: when this filter is specified, the news search API should call the data retrieve API to get the entity’s parent name, and only include news that mentions the parent’s name.
    - Entity Owner: when this filter is specified, the news search API should call the data retrieve API to get the owners list, and only include news that mentions any owner in the list.
    - Subsidiary Name: when this filter is specified, the news search API should call the data retrieve API to get the list of subsidiaries, and only include news that mentions at least one of these subsidiaries.
- We have provided more info about these categories in the forum; you can see it after you register for the challenge.
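The relationship filters described above can be sketched as follows. Everything named here is hypothetical: the relationship ids (`rel-parent`, `rel-owner`, `rel-subsidiary`), the `api` object standing in for the data retrieve API with its `getParentName`, `getOwners` and `getSubsidiaries` methods, and the `text` field on news items. The sketch also assumes that when several relationships are selected, an item is kept if it mentions a name from any of them; the challenge text does not say whether the filters combine with "any" or "all".

```javascript
// Hypothetical relationship ids; in the real service these would be
// stored in the database and looked up from there.
const RELATIONSHIPS = new Map([
  ["rel-parent", async (api, entity) => [await api.getParentName(entity)]],
  ["rel-owner", (api, entity) => api.getOwners(entity)],
  ["rel-subsidiary", (api, entity) => api.getSubsidiaries(entity)],
]);

// Keep only news items that mention at least one name returned by the
// (hypothetical) data retrieve API for the selected relationship ids.
async function filterByRelationships(news, entity, relationshipIds, api) {
  // The filter is optional: no ids means no filtering.
  if (!relationshipIds || relationshipIds.length === 0) return news;

  const names = [];
  for (const id of relationshipIds) {
    const fetchNames = RELATIONSHIPS.get(id);
    // Reject anything that is not a known id (no free-form input).
    if (!fetchNames) throw new Error(`Unknown relationship id: ${id}`);
    names.push(...(await fetchNames(api, entity)));
  }

  // Case-insensitive substring match; drop empty or missing names.
  const needles = names.filter(Boolean).map((n) => n.toLowerCase());
  return news.filter((item) => {
    const text = item.text.toLowerCase();
    return needles.some((n) => text.includes(n));
  });
}

// Usage with a stubbed data retrieve API and made-up data.
const stubApi = {
  getParentName: async () => "Acme Holdings",
  getOwners: async () => ["Jane Doe"],
  getSubsidiaries: async () => ["Acme Labs", "Acme Logistics"],
};
const news = [
  { text: "Acme Labs fined over safety violations" },
  { text: "Jane Doe under investigation" },
  { text: "Unrelated market update" },
];

filterByRelationships(news, "Acme Corp", ["rel-subsidiary"], stubApi)
  .then((matches) => console.log(matches.map((m) => m.text)));
```

Passing the API client in as a parameter keeps the filter easy to test with a stub, as in the usage above. The same id-lookup pattern would apply to the factor, source and duration filters.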
Technology
Node.js
REST
Microservice
Swagger
Data Science
MongoDB
Final Submission Guidelines
Submit a zip file containing the following:
- Full source code
- A detailed README in markdown format on how to configure, deploy and test the code (with verification steps)