Challenge Overview
A previous challenge implemented a set of REST APIs for handling video assets, including storing and managing them (create, retrieve, update, delete). This challenge adds new scrapers to support Thrillist and NBC Today videos.
Existing API
The existing Node application and deployment details are in Gitlab, and the URL to the repository can be found in the forum.
Video format
The goal of this challenge is to properly "scrape" the video metadata off the configured pages, filling in the metadata for the video data structure in the existing app. For the video URL, we want either an MP4 URL or an HLS URL (.m3u8 extension). You can usually get the HLS or MP4 version of a video by switching your user agent to a mobile device like an iPhone.
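For example, a minimal sketch of that user-agent trick, assuming axios as the HTTP client (the repo may use a different one) and a deliberately loose regex over the page body:

```js
// Sketch only: fetch a page with a mobile user agent and look for an
// HLS (.m3u8) or MP4 URL in the response body. The HTTP client and the
// regex approach are assumptions -- adapt to whatever the repo uses.
const axios = require('axios');

const MOBILE_UA =
  'Mozilla/5.0 (iPhone; CPU iPhone OS 10_3 like Mac OS X) AppleWebKit/603.1.30 ' +
  '(KHTML, like Gecko) Version/10.0 Mobile/14E304 Safari/602.1';

async function findVideoUrl(pageUrl) {
  const res = await axios.get(pageUrl, { headers: { 'User-Agent': MOBILE_UA } });
  // Prefer HLS, fall back to MP4; both patterns are intentionally broad.
  const hls = res.data.match(/https?:\/\/[^"'\s]+\.m3u8[^"'\s]*/);
  if (hls) return hls[0];
  const mp4 = res.data.match(/https?:\/\/[^"'\s]+\.mp4[^"'\s]*/);
  return mp4 ? mp4[0] : null;
}
```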
Video details
Video details, like the category, sub-category, and provider, should come from the scraper configuration. Do *not* hard-code this information.
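For illustration, a configuration entry might look something like the sketch below. The field names are hypothetical; mirror whatever shape the existing scrapers' configurations already use:

```js
// Hypothetical configuration shape -- nothing here should be hard-coded
// in the scraper itself.
const thrillistConfig = {
  name: 'thrillist',
  url: 'https://www.thrillist.com/videos',
  category: 'Lifestyle',    // configurable, not hard-coded
  subCategory: 'General',   // configurable, not hard-coded
  provider: 'Thrillist',    // configurable, not hard-coded
  limit: 20,                // scraper limit (see Integration below)
  thumbnail: { maxWidth: 640, maxHeight: 360 }
};
```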
Thrillist
The Thrillist Lifestyle scraper will be configured against a URL like this:
https://www.thrillist.com/videos
We want to scrape out the individual videos, *including* the videos that show up after you click "Load More" on that page.
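"Load More" buttons usually trigger a paginated request behind the scenes. A sketch of that approach, with a placeholder ?page parameter and placeholder selectors (inspect the network tab and markup to find the real ones):

```js
// Sketch of "Load More" handling, assuming the button fires a paginated
// request. The pagination parameter and selectors are placeholders.
const axios = require('axios');
const cheerio = require('cheerio');

async function scrapeThrillist(config) {
  const videos = [];
  for (let page = 1; videos.length < config.limit; page++) {
    const res = await axios.get(config.url, { params: { page } }); // placeholder pagination
    const $ = cheerio.load(res.data);
    const cards = $('.video-card'); // placeholder selector
    if (cards.length === 0) break;  // no more pages
    cards.each((_, el) => {
      if (videos.length < config.limit) {
        videos.push({ title: $(el).find('h3').text().trim() /* plus URL, thumbnail, ... */ });
      }
    });
  }
  return videos;
}
```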
NBC Today
The NBC Today scraper will be configured against a URL like this:
http://www.today.com/video
All we want to grab here is the "Editor's Picks" list.
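A sketch of pulling just that list, with placeholder selectors since the live markup needs to be inspected:

```js
// Sketch only: extract the "Editor's Picks" list from the Today video page.
// The selector is a placeholder -- inspect the actual container for the list.
const cheerio = require('cheerio');

function extractEditorsPicks(html) {
  const $ = cheerio.load(html);
  const picks = [];
  $('section:contains("Editor\'s Picks") a').each((_, el) => {
    picks.push({ title: $(el).text().trim(), href: $(el).attr('href') });
  });
  return picks;
}
```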
Integration
These additional scrapers must integrate back into the app the same way the other scrapers do. They should be configured using the admin pages for adding and editing scrapers, and they should work with the src/feedscript.js --scraperName=... flow that the other scrapers use. All the admin should have to do is add the scraper in the admin panel and run it; the admin shouldn't need to know what the scraper is doing internally or configure each one with custom information.
In addition, make sure your scrapers work with this functionality:
* Configurable category and sub-category
* Scraper limits
* Thumbnail limits for height and width
Don't pull all videos and *then* limit the number of videos added. Only request and parse the number of videos that match the scraper limit.
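A sketch tying those requirements together, using the hypothetical config fields from the earlier sketch and treating the thumbnail limits as an upper bound (check how the existing scrapers apply them):

```js
// Sketch of limit-aware scraping: stop requesting as soon as the configured
// limit is reached, and bound thumbnails by the configured width/height.
// The scraper runs the same way as the others, e.g.:
//   node src/feedscript.js --scraperName=thrillist
function withinThumbnailLimits(thumb, config) {
  return thumb.width <= config.thumbnail.maxWidth &&
         thumb.height <= config.thumbnail.maxHeight;
}

async function run(config, fetchPage) {
  const videos = [];
  for (let page = 1; videos.length < config.limit; page++) {
    const batch = await fetchPage(page); // only request what is still needed
    if (batch.length === 0) break;       // no more results available
    for (const video of batch) {
      if (videos.length >= config.limit) break; // stop parsing at the limit
      if (!video.thumbnail || withinThumbnailLimits(video.thumbnail, config)) {
        videos.push(video);
      }
    }
  }
  return videos;
}
```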
README
Make sure the README is updated with verification steps for the new scrapers and with the configuration details needed to add them easily.
Unit tests
As with the other scrapers, unit tests are required for these new scrapers.
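A sketch of what such a test might look like, assuming Mocha and Node's assert module (match whatever framework the existing scraper tests use) and the hypothetical extractEditorsPicks helper sketched above:

```js
// Sketch only: the require path, fixture HTML, and helper name are placeholders.
const assert = require('assert');
const { extractEditorsPicks } = require('../src/scrapers/today'); // hypothetical path

describe('NBC Today scraper', () => {
  it('extracts only the Editor\'s Picks list', () => {
    const html =
      '<section><h2>Editor\'s Picks</h2><a href="/video/1">Clip</a></section>';
    const picks = extractEditorsPicks(html);
    assert.strictEqual(picks.length, 1);
    assert.strictEqual(picks[0].href, '/video/1');
  });
});
```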
Heroku deploy
Make sure the Heroku deployment information is up to date, and keep package.json up to date as well. Deployment should require nothing more than "npm install" / "npm start" locally and "git push heroku master" for Heroku.
Submission format
Your submission should be provided as a Git patch file against the commit hash mentioned in the forum. MAKE SURE TO TEST YOUR PATCH FILE!