Hercules TV Web Apps News and Lifestyle Pages - YouTube Content Scraper


Challenge Overview

A previous challenge implemented a set of REST APIs for handling video assets, including storing and managing them (create, retrieve, update, delete).  We also built a sample RSS scraper that parses data out of configured feeds and puts video assets into the data store using the video REST API.  This challenge will implement a new parser that extracts a YouTube user's videos.

Existing Code

The existing application is in GitLab and access will be provided through links in the forum.

Scraper

The scraper will be implemented as a configurable delayed job.  The job will run at a configurable interval and will read in the YouTube details, looking for assets added since the last time it ran.  Each asset will be parsed and placed into the data store using the REST API.
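
A minimal sketch of the "added since the last run" check and the interval loop, assuming each scraped item carries an ISO 8601 `publishedAt` timestamp and that the job remembers when it last ran. The names `selectNewItems`, `schedule`, and `lastRunAt` are hypothetical and do not come from the existing code.

```javascript
// Hypothetical helper: keep only items published after the previous run.
// Assumes each item has an ISO 8601 `publishedAt` string.
function selectNewItems(items, lastRunAt) {
  // First run: there is no previous timestamp, so every item counts as new.
  if (!lastRunAt) return items.slice();
  const cutoff = new Date(lastRunAt).getTime();
  return items.filter((item) => new Date(item.publishedAt).getTime() > cutoff);
}

// Hypothetical scheduler: run `job` once now, then every `intervalMs`.
// Returns the timer handle so the caller can stop it with clearInterval.
function schedule(job, intervalMs) {
  job();
  return setInterval(job, intervalMs);
}
```

In a real job the interval would come from configuration, and `lastRunAt` would be persisted so that a restarted dyno does not re-import old assets.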

The scraper will be configured with:

* The YouTube username of the channel owner, like "tastemade" (this can go in the URL parameter for the scraper)
* A category to use when adding videos
* A provider value to use when adding videos

Sample data

For this challenge, please target the data in the "Tastemade" page here:

https://www.youtube.com/user/tastemade

The category and provider values should be configurable for each parser, since different YouTube channels will have different categories and providers.  If we can easily just subclass a generic YouTubeParser class and fill in these details, that would be fine.  We won't need to change these at runtime at the moment.
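
One way the subclassing approach might look, as a rough sketch only. `YouTubeParser` is the class name suggested above. `TastemadeParser`, the `decorate` method, and the `"Food"` and `"Tastemade"` values are illustrative assumptions, not values taken from the existing application.

```javascript
// Generic parser: subclasses supply the username, category, and provider.
class YouTubeParser {
  constructor({ username, category, provider }) {
    if (!username || !category || !provider) {
      throw new Error("username, category, and provider are required");
    }
    this.username = username;
    this.category = category;
    this.provider = provider;
  }

  // Hypothetical helper: attach the configured category and provider
  // to a parsed video before it is sent to the video REST API.
  decorate(video) {
    return Object.assign({}, video, {
      category: this.category,
      provider: this.provider,
    });
  }
}

// Channel-specific parser. The category and provider are placeholder values.
class TastemadeParser extends YouTubeParser {
  constructor() {
    super({ username: "tastemade", category: "Food", provider: "Tastemade" });
  }
}
```

Because the values are fixed in the subclass constructor, adding another channel is a matter of adding another small subclass, which matches the requirement that these values do not change at runtime.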

The parser will be expected to take the username and use the YouTube API to get details about the playlists and videos belonging to that user.  It is expected that an API key will be configured in the app for YouTube.

1. Get the channel ID for the username (this only needs to be done once and must be saved in the database for quick reference):
    https://www.googleapis.com/youtube/v3/channels?part=contentDetails&forUsername=tastemade
2. Get the playlists for the channel ID:
    https://developers.google.com/youtube/v3/docs/playlists/list#try-it
3. For each playlist, get the playlist items:
    https://developers.google.com/youtube/v3/docs/playlistItems/list
4. For each playlist item, get the video details.  For "part", use "snippet,player":
    https://developers.google.com/youtube/v3/docs/videos/list#try-it
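
The four steps above map onto four YouTube Data API v3 list endpoints. The sketch below only builds the request URLs. Fetching, paging, and error handling are left out, and the helper names (`apiUrl`, `channelUrl`, `playlistsUrl`, `playlistItemsUrl`, `videosUrl`) are hypothetical. The `part` values chosen for steps 2 and 3 are assumptions, since the text above only fixes `part` for steps 1 and 4.

```javascript
const API_BASE = "https://www.googleapis.com/youtube/v3";

// Build a Data API URL for `resource`, skipping undefined params
// (for example, a missing pageToken on the first page).
function apiUrl(resource, params, apiKey) {
  const query = new URLSearchParams();
  for (const [name, value] of Object.entries(params)) {
    if (value !== undefined && value !== null) query.set(name, String(value));
  }
  query.set("key", apiKey);
  return `${API_BASE}/${resource}?${query.toString()}`;
}

// Step 1: look up the channel for a username.
const channelUrl = (username, apiKey) =>
  apiUrl("channels", { part: "contentDetails", forUsername: username }, apiKey);

// Step 2: list the playlists of a channel (50 is the API's page-size maximum).
const playlistsUrl = (channelId, apiKey, pageToken) =>
  apiUrl("playlists", { part: "snippet", channelId, maxResults: 50, pageToken }, apiKey);

// Step 3: list the items of one playlist.
const playlistItemsUrl = (playlistId, apiKey, pageToken) =>
  apiUrl("playlistItems", { part: "contentDetails", playlistId, maxResults: 50, pageToken }, apiKey);

// Step 4: fetch details for a batch of video IDs (comma-separated).
const videosUrl = (videoIds, apiKey) =>
  apiUrl("videos", { part: "snippet,player", id: videoIds.join(",") }, apiKey);
```

Batching video IDs in step 4 keeps the number of requests down, since one `videos` call can return details for many IDs at once.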

The direct mp4 file for playback doesn't appear to be easily accessible from the API, so we can link the user to the embed URL instead (http://www.youtube.com/watch?v=Z347wjtDgGE, as an example).  If you can figure out how to get direct access to the mp4 file, that would be a nice additional feature.

The thumbnail URL used should be the largest available in the list of thumbnails.
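
A small sketch of both rules, assuming the `snippet.thumbnails` object has the usual Data API shape (keys such as `default`, `medium`, `high`, each with `url`, `width`, and `height`). The function names are hypothetical, and the link format follows the example URL given above.

```javascript
// Pick the thumbnail with the largest pixel area from a thumbnails object.
// Returns null when no thumbnails are present.
function largestThumbnailUrl(thumbnails) {
  let best = null;
  for (const thumb of Object.values(thumbnails || {})) {
    const area = (thumb.width || 0) * (thumb.height || 0);
    if (!best || area > best.area) best = { url: thumb.url, area };
  }
  return best ? best.url : null;
}

// Build the playback link for a video ID, matching the example URL format.
function watchUrl(videoId) {
  return `http://www.youtube.com/watch?v=${encodeURIComponent(videoId)}`;
}
```

Comparing by area rather than by key name avoids hard-coding which sizes a given video happens to have.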

Heroku deploy

Your deployment documentation should extend the existing documentation for the Node services and should cover how to deploy the newly created job to Heroku to run at a regular interval on a separate dyno from the service.

Existing bugs

There may be a few minor bugs in the code right now - these are not your responsibility to fix, unless they block implementation of the requirements above.  It would be appreciated if you logged them as part of your submission.

Submission format

Your submission should be a Git patch file against commit hash f9090ce94db2c9f8fd7f987ccb940a5529989045.  Make sure to test your patch file before submitting! 

Deployment document


Your patch file should update the README with information about configuring and using the YouTube parser.

Final Submission Guidelines

Please see above

Review style

Final Review: Community Review Board
Approval: User Sign-Off

ID: 30054460