Challenge Overview
Watson Pattern Explorer Retrieve and Rank Flow
Project overview
The goal is to produce a proof-of-concept pattern explorer application using Node-RED. The application should take in data from a Watson Engagement Advisor instance, run it through a series of processing steps, and provide access to the processed data.
In previous contests we built a Natural Language Classifier (NLC) flow and integrated it with the base UI application, which allows a user to define new flows and trigger them with user-supplied data.
Competition Task Overview
In this contest, we will build a flow that trains and tests a Retrieve and Rank (R&R) service instance created in Bluemix.
Build an R&R node
Develop a custom node that handles communication with an instance of the Retrieve and Rank service:
http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/retrieve-rank/index.shtml
The node should be able to train a ranker using the provided data, query the trained ranker, and delete and list all rankers. Use the Watson Node SDK to implement this functionality instead of sending raw HTTP requests to the R&R service.
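As a rough sketch, the node's internals could wrap the SDK along these lines. This assumes the watson-developer-cloud npm package with its v1-era retrieve_and_rank service; verify the method names and signatures against the SDK version you install:

```javascript
// Sketch only: the Retrieve and Rank operations the custom node would wrap.
// Assumes the watson-developer-cloud npm package (v1-era API); verify
// method names and signatures against the installed SDK version.
var fs = require('fs');
var watson = require('watson-developer-cloud');

var retrieveAndRank = watson.retrieve_and_rank({
  username: '<service-username>',   // from the Bluemix service credentials
  password: '<service-password>',
  version: 'v1'
});

// Train a new ranker from a CSV of training data
retrieveAndRank.createRanker(
  { training_data: fs.createReadStream('training_data.csv') },
  function (err, ranker) {
    if (err) { return console.error(err); }
    console.log('Training started, ranker id: ' + ranker.ranker_id);
  }
);

// Query the trained ranker (answer_data is a CSV of candidate answers)
retrieveAndRank.rank(
  { ranker_id: '<ranker-id>', answer_data: fs.createReadStream('answers.csv') },
  function (err, ranked) {
    if (err) { return console.error(err); }
    console.log(ranked.answers);
  }
);

// List all rankers in this service instance
retrieveAndRank.listRankers({}, function (err, res) {
  if (err) { return console.error(err); }
  console.log(res.rankers);
});

// Delete a ranker by id
retrieveAndRank.deleteRanker({ ranker_id: '<ranker-id>' }, function (err) {
  if (err) { console.error(err); }
});
```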
Build a custom flow
Develop a custom flow implementing a simple data processing pipeline. The flow will be similar to the NLC flow from the previous contest, so you can reuse parts of it. It should consist of the following steps:
- Exposed http endpoint for triggering the flow
- Parsing input parameters
- Creating and populating a Solr collection with the supplied documents (see the sketch after this list)
- Training the Ranker with supplied ground truth data
- Testing the Ranker
- Persisting the processing results
- Sending processing status updates
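For the Solr step above, the SDK also wraps the Solr APIs. A sketch, assuming an already provisioned Solr cluster and an uploaded configuration; the createCollection and createSolrClient calls follow the v1-era SDK, and the collection name and document shape are made up for illustration:

```javascript
// Sketch: create a collection on an existing Solr cluster and index the
// supplied documents. Assumes the cluster and a Solr configuration were
// already provisioned; verify call names against the installed SDK.
var watson = require('watson-developer-cloud');
var retrieveAndRank = watson.retrieve_and_rank({
  username: '<service-username>',
  password: '<service-password>',
  version: 'v1'
});

retrieveAndRank.createCollection({
  cluster_id: '<cluster-id>',
  collection_name: 'pattern_explorer',      // hypothetical name
  config_name: '<uploaded-config-name>'
}, function (err) {
  if (err) { return console.error(err); }

  // The SDK hands back a solr-client instance bound to the collection
  var solrClient = retrieveAndRank.createSolrClient({
    cluster_id: '<cluster-id>',
    collection_name: 'pattern_explorer',
    wt: 'json'
  });

  var documents = [{ id: '1', body: 'example document' }];  // assumed shape
  solrClient.add(documents, function (err) {
    if (err) { return console.error(err); }
    solrClient.commit(function (err) {
      if (err) { console.error(err); }
    });
  });
});
```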
The endpoint should require the following parameters:
- jobID - the ID used to identify this particular processing job
- dataset - the data to use for training and testing the R&R service
- email address - the address to send results and status updates to
Save the jobID and email address in the global context so they will be available later for persisting the results and logging.
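For example, the parameter parsing and context bookkeeping could live in a single function node right after the http-in node. A sketch, assuming the payload field names from the list above; global.set is the function-node context API in current Node-RED releases, while older releases use context.global instead:

```javascript
// Function node: validate the input parameters and stash the job
// metadata in the global context for the later persistence and
// notification nodes.
var payload = msg.payload || {};

if (!payload.jobID || !payload.email || !payload.dataset) {
    // Throwing here lets the catch node handle the bad request
    throw new Error('jobID, email and dataset are required');
}

global.set('jobID', payload.jobID);
global.set('email', payload.email);

msg.dataset = payload.dataset;
return msg;
```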
Split the dataset into train and test sets, and use the train set to train a Ranker instance. A function node can be used for the split. When splitting the dataset, make sure that each document has enough labels in both the train and test sets, and that all relevance labels are well represented in both. See the Training data quality standards for more information on data quality.
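A minimal function-node sketch of the split; the ground-truth shape used here is an assumption, and a real implementation still needs the per-document and per-label coverage checks described above:

```javascript
// Function node: naive 70/30 split of the ground truth by question.
// Assumes msg.dataset.groundTruth is an array of records shaped like
// { question: '...', answers: [ { answerID: 1000, label: 3 }, ... ] }.
var questions = msg.dataset.groundTruth.slice();

// Fisher-Yates shuffle so the split is not biased by input order
for (var i = questions.length - 1; i > 0; i--) {
    var j = Math.floor(Math.random() * (i + 1));
    var tmp = questions[i];
    questions[i] = questions[j];
    questions[j] = tmp;
}

var cut = Math.floor(questions.length * 0.7);
msg.trainSet = questions.slice(0, cut);
msg.testSet = questions.slice(cut);
return msg;
```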
In a loop, check whether the training is complete. When it is done, test the Ranker with the test set and report the test accuracy as the job result. Accuracy can be defined as the percentage of questions whose 2 highest-labelled answers are returned among the top 3 ranked answers. For example, if a test question has relevance labels given as (answerID, label) pairs { (1000,1), (1200,2), (1400,3), (1600,3) }, the two answers with the top labels are 1400 and 1600. If the ranker returns the answers in the order 1200, 1600, 1400, 1000, then the question counts as ranked correctly. On the other hand, if the order was 1200, 1600, 1000, 1400, the question counts as ranked incorrectly (only one of the top 2 labels is in the top 3 results).
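For illustration, the correctness rule for a single question could be coded as below (the shapes of labels and ranked are assumptions); overall accuracy is then the fraction of test questions for which the helper returns true:

```javascript
// Returns true when the two answers with the highest relevance labels
// both appear among the top 3 ranked results.
// labels: [{ answerID: 1000, label: 1 }, ...] (ground truth)
// ranked: array of answerIDs in the order returned by the ranker
function isRankedCorrectly(labels, ranked) {
    var topTwo = labels
        .slice()
        .sort(function (a, b) { return b.label - a.label; })
        .slice(0, 2)
        .map(function (pair) { return pair.answerID; });
    var topThree = ranked.slice(0, 3);
    return topTwo.every(function (id) { return topThree.indexOf(id) !== -1; });
}

// The example from the text: the top-2 labelled answers are 1400 and 1600
isRankedCorrectly(
    [{answerID: 1000, label: 1}, {answerID: 1200, label: 2},
     {answerID: 1400, label: 3}, {answerID: 1600, label: 3}],
    [1200, 1600, 1400, 1000]);  // true: both are in the top 3
// With [1200, 1600, 1000, 1400] it returns false: 1400 is outside the top 3
```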
Deliver the processing results
The results should be delivered as an email message and persisted to the database. You can reuse this part of the flow from the NLC flow.
The email with the final results should contain the following:
- Overall accuracy (as defined earlier)
- Precision @ 10
- Discounted cumulative gain
See this Wikipedia page for a formal definition of the above measures.
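Hedged sketches of the two additional measures follow. The ranked and labelOf shapes are assumptions, and DCG is given in the classic rel_1 + sum(rel_i / log2(i)) form; the Wikipedia page also lists a 2^rel - 1 variant, so state which formulation you used:

```javascript
// Precision@10: the fraction of the first 10 ranked answers that are
// relevant (label > 0). labelOf(id) returns the relevance label for an
// answer id, or 0 when unjudged (assumed helper).
function precisionAt10(ranked, labelOf) {
    var top = ranked.slice(0, 10);
    var relevant = top.filter(function (id) { return labelOf(id) > 0; });
    return relevant.length / top.length;
}

// Discounted cumulative gain in the classic form:
// DCG = rel_1 + sum over i >= 2 of rel_i / log2(i)
function dcg(ranked, labelOf) {
    return ranked.reduce(function (sum, id, idx) {
        var rel = labelOf(id);
        var pos = idx + 1;                       // 1-based rank position
        return sum + (pos === 1 ? rel : rel / (Math.log(pos) / Math.LN2));
    }, 0);
}
```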
Send processing status updates
Add a status node to the flow to catch node status updates. Send these as notifications to the email address available in the global context. This is available in the NLC flow.
Error handling
Create a catch node in the flow to handle possible errors. Update the processing status to “Failed” if the error is fatal (breaks the flow), otherwise just log the error. This is available in the NLC flow.
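One way to split fatal from non-fatal errors is a two-output function node wired behind the catch node. A sketch: the catch node attaches the error details as msg.error, while the fatal-error test below is only a placeholder assumption to be replaced with checks matching the flow's real failure modes:

```javascript
// Function node (2 outputs) behind the catch node.
// Output 1: fatal errors -> update job status to "Failed" and notify.
// Output 2: non-fatal errors -> just log.
var err = msg.error || {};

// Placeholder test: treat ranker/Solr failures as fatal (assumption)
var fatal = /ranker|solr/i.test(err.message || '');

if (fatal) {
    msg.status = 'Failed';
    msg.payload = 'Job ' + global.get('jobID') + ' failed: ' + err.message;
    return [msg, null];
}

node.warn('Non-fatal error: ' + err.message);
return [null, msg];
```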
Verification
Deploy the flow and the UI application. Create a flow definition in the UI application and trigger it. Verify that the flow produces the required email messages and database changes. Make sure to verify the failure scenarios too (for example, an error creating the ranker service, invalid input data, etc.).
For the sample test data, you can use the Cranfield dataset available here. You will need to transform the input data to a more manageable format first; don't use the raw data. The same dataset is used in the Getting Started Guide of the R&R service, so you can reuse the data available there.
Final Submission Guidelines
The base source is available in the forums. Add the new R&R node to the flow/app/nodes directory, and add all the necessary test data in the ui/test_data directory.
Submit a zip file with all the deliverables.