Hercules TV Web Apps News and Lifestyle Pages - RSS Content Scraper

Key Information

The challenge is finished.

Challenge Overview

A previous challenge implemented a set of REST APIs for storing and managing video assets (create, retrieve, update, delete). This challenge takes that one step further: build a simple job that runs at regular intervals, scrapes data from a configured RSS feed, and puts that data into the data store using the video REST API.

Existing API

The existing Node application and deployment details can be found in the forum.

Scraper

The scraper will be implemented as a configurable delayed job. It will run at a configurable interval and read the configured RSS feed, looking for assets added since its last run. Each new asset will be parsed and placed into the data store using the REST API.
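
The "added since the last time it ran" check could be sketched as below. This is only an illustration under assumptions: the helper name `selectNewItems` and the item shape (`title`, `link`, `pubDate`) are hypothetical, and it assumes the feed XML has already been parsed into plain objects and that the previous run time is persisted somewhere by the job.

```javascript
// Hypothetical helper: keep only feed items published after the previous run.
// `items` is assumed to be an array of { title, link, pubDate } objects already
// parsed from the RSS XML; `lastRun` is a Date, or null on the very first run.
function selectNewItems(items, lastRun) {
  const cutoff = lastRun ? lastRun.getTime() : -Infinity;
  return items
    .filter((item) => {
      const published = Date.parse(item.pubDate);
      // Items with an unparseable date are skipped rather than re-added on every run.
      return !Number.isNaN(published) && published > cutoff;
    })
    // Oldest first, so assets are created in publication order.
    .sort((a, b) => Date.parse(a.pubDate) - Date.parse(b.pubDate));
}
```

Comparing on the publication date keeps the job stateless apart from one stored timestamp. If the feed ever republishes items with unchanged dates, comparing on the item link or GUID instead may be the safer choice.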

The scraper will be configured with:

* A URL to the RSS feed 
* A category to use when adding videos
* A provider value to use when adding videos
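
One possible way to supply these settings is through environment variables, which is also how Heroku config vars reach a Node process. The sketch below is a suggestion only: the variable names (`SCRAPER_FEED_URL` and so on), the function name `loadScraperConfig`, and the 15-minute default interval are all assumptions, not part of the challenge.

```javascript
// Hypothetical environment variable names; the challenge does not define them.
function loadScraperConfig(env) {
  const config = {
    feedUrl: env.SCRAPER_FEED_URL,
    category: env.SCRAPER_CATEGORY,
    provider: env.SCRAPER_PROVIDER,
    // Minutes between runs; the default of 15 is an arbitrary assumption.
    intervalMinutes: Number(env.SCRAPER_INTERVAL_MINUTES || 15),
  };
  // Fail fast if any of the three required settings is absent or empty.
  const missing = ['feedUrl', 'category', 'provider'].filter((key) => !config[key]);
  if (missing.length > 0) {
    throw new Error('Missing scraper settings: ' + missing.join(', '));
  }
  if (!Number.isFinite(config.intervalMinutes) || config.intervalMinutes <= 0) {
    throw new Error('SCRAPER_INTERVAL_MINUTES must be a positive number');
  }
  return config;
}
```

In the job itself this would typically be called once at startup as `loadScraperConfig(process.env)`.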

Sample data

For this challenge, please target the data in one of the Wall Street Journal feeds here:

http://www.wsj.com/public/page/rss_news_and_feeds_videos.html

The category value should be "News" for the scraper, and the provider value should be "Wall Street Journal".

* The image in the description should be used as the thumbnail for the video.
* The video URL should be the URL of the *video* file itself, not the playback page that embeds it. For example: http://m.wsj.net/video/20150326/032615hubpmmidmess/032615hubpmmidmess_v2_ec664k.mp4
* The duration should be parsed from the mp4 file.
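
As a rough illustration of the thumbnail and duration requirements above, the sketch below pulls the first image URL out of an item's HTML description and reads the duration from the `mvhd` box of an MP4 held in a Node `Buffer`. The function names are hypothetical, and it assumes the relevant bytes have already been downloaded. In many MP4 files the `moov` box sits at the end of the file, so the whole file (or a suitable byte range) may be needed. A maintained MP4 or ffprobe-based library would be a reasonable alternative to hand parsing.

```javascript
// Pull the first <img src="..."> URL out of an RSS item's HTML description.
function extractThumbnail(descriptionHtml) {
  const match = /<img[^>]*\ssrc=["']([^"']+)["']/i.exec(descriptionHtml);
  return match ? match[1] : null;
}

// Find a box of the given four-character type within buffer[start, end).
// Returns the payload byte range, or null if the box is absent.
function findBox(buffer, type, start, end) {
  let offset = start;
  while (offset + 8 <= end) {
    let size = buffer.readUInt32BE(offset);
    const boxType = buffer.toString('latin1', offset + 4, offset + 8);
    let headerSize = 8;
    if (size === 1) {
      // A 64-bit "largesize" field follows the type field.
      if (offset + 16 > end) return null;
      size = Number(buffer.readBigUInt64BE(offset + 8));
      headerSize = 16;
    } else if (size === 0) {
      size = end - offset; // box extends to the end of the enclosing range
    }
    if (size < headerSize) return null; // malformed box; stop instead of looping forever
    if (boxType === type) {
      return { start: offset + headerSize, end: Math.min(offset + size, end) };
    }
    offset += size;
  }
  return null;
}

// Duration in seconds read from moov/mvhd, or null if it cannot be found.
function mp4DurationSeconds(buffer) {
  const moov = findBox(buffer, 'moov', 0, buffer.length);
  if (!moov) return null;
  const mvhd = findBox(buffer, 'mvhd', moov.start, moov.end);
  if (!mvhd || mvhd.end - mvhd.start < 1) return null;
  const version = buffer.readUInt8(mvhd.start);
  // Version 0 stores 32-bit creation/modification times, version 1 stores 64-bit ones.
  const timescaleOffset = mvhd.start + (version === 1 ? 20 : 12);
  const durationBytes = version === 1 ? 8 : 4;
  if (timescaleOffset + 4 + durationBytes > mvhd.end) return null;
  const timescale = buffer.readUInt32BE(timescaleOffset);
  const duration = version === 1
    ? Number(buffer.readBigUInt64BE(timescaleOffset + 4))
    : buffer.readUInt32BE(timescaleOffset + 4);
  return timescale > 0 ? duration / timescale : null;
}
```

The regular expression is adequate for a single well-formed `<img>` tag in a feed description. Descriptions with unusual markup would call for a real HTML parser.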

Heroku deploy

Your deployment documentation should extend the existing documentation for the Node services. It should cover how to deploy the newly created job to Heroku so that it runs at a regular interval on a dyno separate from the service.
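
One way to get a regularly running job on its own dyno is a long-lived worker process that schedules itself, declared as its own process type in the Procfile (for example a line such as `scraper: node scraper.js`, where the file name is an assumption). The sketch below shows only the scheduling part. The name `startInterval` and the `runOnce` callback are hypothetical, and Heroku Scheduler running a one-off script would be an equally valid approach.

```javascript
// Start a repeating job. A tick is skipped if the previous run is still in
// progress, so a slow feed or API cannot cause overlapping runs.
function startInterval(runOnce, intervalMs, onError) {
  let running = false;
  const tick = async () => {
    if (running) return;
    running = true;
    try {
      await runOnce();
    } catch (err) {
      onError(err); // report the failure but keep the schedule alive
    } finally {
      running = false;
    }
  };
  const timer = setInterval(tick, intervalMs);
  tick(); // run once immediately at startup
  return () => clearInterval(timer); // call the returned function to stop
}
```

A worker entry point could then call `startInterval(scrapeFeed, intervalMinutes * 60 * 1000, console.error)`, where `scrapeFeed` stands for the actual scrape-and-post routine.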

Final Submission Guidelines

Please see above

ELIGIBLE EVENTS:

2016 TopCoder(R) Open

REVIEW STYLE:

Final Review:

Community Review Board

Approval:

User Sign-Off

ID: 30054317