
Challenge Overview

Watson Pattern Explorer Node RED Flow Contest Specification

1.     Project Overview

1.1     System Description

The goal is to produce a proof-of-concept pattern explorer application using Node RED technology. The application should take in data from the Watson Engagement Advisor instance, run it through a series of processing steps, and provide access to the processed data.

1.2     Competition Task Overview

1.2.1     NLC node

Develop a custom node that handles communication with an instance of the Natural Language Classifier (NLC) service:
 https://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/nl-classifier/
The node should be able to train a classifier using the provided data (CSV file), query the trained classifier, delete a classifier, and list all classifiers. Use the NLC REST API to implement this functionality: http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/apis/#!/natural-language-classifier
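
The four operations, plus the status check needed later in the flow, map onto a handful of REST calls. The sketch below builds a description of each request without sending it. It assumes the v1 paths and multipart field names from the NLC API reference linked above; `buildNlcRequest`, its arguments, and the shape of the returned object are hypothetical names used only for illustration, and the base URL should come from the service credentials.

```javascript
// Hypothetical helper: describes the HTTP request for one NLC operation.
// Paths and field names are assumed from the NLC v1 REST API reference;
// verify them against the service documentation before relying on them.
function buildNlcRequest(baseUrl, operation, args = {}) {
  const root = baseUrl.replace(/\/+$/, "") + "/v1/classifiers";
  const needsId = ["status", "classify", "delete"].includes(operation);
  if (needsId && !args.classifierId) {
    throw new Error("classifierId is required for operation: " + operation);
  }
  const id = encodeURIComponent(args.classifierId || "");

  switch (operation) {
    case "train":
      // Multipart form: CSV training data plus a JSON metadata part.
      return {
        method: "POST",
        url: root,
        formData: {
          training_data: args.csv,
          training_metadata: JSON.stringify({
            language: args.language || "en",
            name: args.name
          })
        }
      };
    case "list":
      return { method: "GET", url: root };
    case "status":
      return { method: "GET", url: root + "/" + id };
    case "classify":
      return {
        method: "POST",
        url: root + "/" + id + "/classify",
        json: { text: args.text }
      };
    case "delete":
      return { method: "DELETE", url: root + "/" + id };
    default:
      throw new Error("Unknown NLC operation: " + operation);
  }
}
```

Keeping the request description separate from the code that sends it lets the custom node be tested without a live classifier instance.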

1.2.2     Custom flow

Develop a custom flow of a simple data processing pipeline. It should consist of the following:

  • An exposed HTTP endpoint for triggering the flow
  • Parsing input parameters
  • Training the language classifier
  • Testing the classifier
  • Persisting the processing results
  • Sending processing status updates

The endpoint should require the following parameters:

  • jobID - the ID of the processing job, used to identify this particular job
  • dataset - the dataset to use for training and testing the NLC node
  • email address - the address to send results and status updates to
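
A function node placed after the HTTP input could validate these parameters before any training starts. The following is a minimal sketch, assuming the request body arrives as an object with the fields `jobID`, `dataset`, and `email`; only `jobID` is named by this specification, so the other two field names and the `parseJobRequest` helper itself are illustrative assumptions.

```javascript
// Hypothetical validator for the trigger request.
// Field names `dataset` and `email` are assumptions, not part of the spec.
function parseJobRequest(payload) {
  const p = payload || {};
  const errors = [];

  if (typeof p.jobID !== "string" || p.jobID.trim() === "") {
    errors.push("jobID is required");
  }
  // Deliberately loose check: something@something, no whitespace.
  if (typeof p.email !== "string" || !/^[^@\s]+@[^@\s]+$/.test(p.email)) {
    errors.push("email is missing or malformed");
  }

  const questions = p.dataset && p.dataset.questions;
  if (!Array.isArray(questions) || questions.length === 0) {
    errors.push("dataset.questions must be a non-empty array");
  } else if (!questions.every(
    (q) => q && typeof q.text === "string" && typeof q.class === "string"
  )) {
    errors.push("every question needs a string 'text' and a string 'class'");
  }

  if (errors.length > 0) {
    return { ok: false, errors };
  }
  return {
    ok: true,
    job: { jobID: p.jobID.trim(), email: p.email, questions }
  };
}
```

Returning all validation errors at once, rather than the first one, lets the endpoint answer a bad request with a single complete message.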

Save the jobID and email address in the global context so that they are available later for persisting the results and logging.
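
In recent Node-RED versions a function node reaches the global context through the `global` object and its `get` and `set` methods. The sketch below shows the idea with a small stand-in for that object so it can run outside Node-RED; `createContextStub`, `rememberJob`, and `recallJob` are hypothetical helper names, and the context keys `jobID` and `email` are assumptions.

```javascript
// Stand-in for the `global` context object available in function nodes.
// Inside Node-RED, pass the real `global` object instead.
function createContextStub() {
  const store = new Map();
  return {
    set: (key, value) => { store.set(key, value); },
    get: (key) => store.get(key)
  };
}

// Store the values that later steps (delivery, status updates) need.
function rememberJob(globalContext, job) {
  globalContext.set("jobID", job.jobID);
  globalContext.set("email", job.email);
}

// Read them back, for example in the result-delivery subflow.
function recallJob(globalContext) {
  return {
    jobID: globalContext.get("jobID"),
    email: globalContext.get("email")
  };
}
```

Because the global context is shared by all flows, a second request that arrives before the first job finishes overwrites these values. That is acceptable for a proof of concept, but worth keeping in mind during verification.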

Split the dataset into train and test questions. Use the train questions to train a language classifier instance. A function node can be used for splitting the dataset.
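
The specification does not fix a split ratio. As one possible approach, the sketch below sends every fifth question of each class to the test set (roughly an 80/20 split) and keeps the rest for training; the ratio, the per-class counting, and the name `splitDataset` are all assumptions.

```javascript
// Hypothetical splitter for a function node.
// Within each class, every `testEvery`-th question goes to the test set,
// so classes with fewer than `testEvery` questions stay entirely in training.
function splitDataset(questions, testEvery = 5) {
  if (!Number.isInteger(testEvery) || testEvery < 2) {
    throw new RangeError("testEvery must be an integer of at least 2");
  }
  const seenPerClass = new Map();
  const train = [];
  const test = [];

  for (const q of questions) {
    const n = (seenPerClass.get(q.class) || 0) + 1;
    seenPerClass.set(q.class, n);
    (n % testEvery === 0 ? test : train).push(q);
  }
  return { train, test };
}
```

Counting per class avoids a test set that contains a class the classifier never saw during training. The split is deterministic, so repeated runs with the same dataset are comparable.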

Check in a loop whether the training is complete. When it is, test the classifier with the test questions and report the test accuracy as the job result. Include the list of misclassified questions in the results.
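
In a flow, such a loop is typically a status request, a function node that decides what happens next, and a delay node that feeds back into the status request. The sketch below is one way to write the deciding function, using the Node-RED convention that a function node with several outputs returns an array with one entry per output. It assumes the status response carries a `status` field with values such as "Training" and "Available", as described in the NLC API reference; the attempt limit, the `attempt` and `failure` message properties, and the name `routeByTrainingStatus` are hypothetical.

```javascript
// Hypothetical router for a function node with three outputs:
//   output 1: classifier is ready, continue with testing
//   output 2: still training, go through a delay node and check again
//   output 3: training failed or took too long
// Status values are assumed from the NLC API reference.
function routeByTrainingStatus(msg, maxAttempts = 60) {
  const status = msg.payload && msg.payload.status;
  const attempt = (msg.attempt || 0) + 1;

  if (status === "Available") {
    return [msg, null, null];
  }
  if (status === "Training" && attempt < maxAttempts) {
    // Carry the attempt counter along so the loop can terminate.
    return [null, Object.assign({}, msg, { attempt }), null];
  }
  const reason = status === "Training"
    ? "training timed out after " + attempt + " checks"
    : "unexpected classifier status: " + status;
  return [null, null, Object.assign({}, msg, { failure: reason })];
}
```

The attempt counter bounds the loop, so a classifier that never leaves the training state ends in the failure branch rather than polling forever.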

The input dataset will be in the following format:

{
    'questions': [
        {
            'text': String,
            'class': String
        }
    ]
}
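
The NLC training data is a CSV file in which each row holds the question text followed by its class, as in the sample weather file referenced under Verification. A dataset in the format above can be converted with a few lines of code; the sketch below is one way to do it, and the helper names `csvField` and `questionsToCsv` are hypothetical.

```javascript
// Quote a CSV field only when it contains a comma, a quote, or a line break;
// embedded quotes are doubled, following the usual CSV convention.
function csvField(value) {
  const s = String(value);
  return /[",\r\n]/.test(s) ? '"' + s.replace(/"/g, '""') + '"' : s;
}

// One row per question: text first, class second, no header row.
function questionsToCsv(questions) {
  return questions
    .map((q) => csvField(q.text) + "," + csvField(q.class))
    .join("\n");
}
```

For example, the question "Is it hot, or cold?" with class "temperature" becomes the row `"Is it hot, or cold?",temperature`, where the quotes protect the embedded comma.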

1.2.3     Deliver the processing results

Develop a custom flow to deliver the results of the processing. Build this functionality as a subflow so we can reuse it to persist results after any of the processing steps. The results should be delivered as an email message. To process emails, use the SendGrid service available in Bluemix.

Additionally, persist the processing results in a Cloudant database. Create a new database object for persistence (jobResult or similar), and use the jobID from the global context to link it with the job request. Persistence should be optional, i.e. the user should be able to deliver results by email only and skip the database.
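
One way to make persistence optional inside the subflow is a function node with two outputs, one wired to the email node and one to the Cloudant node, where the second output stays silent when persistence is switched off. The sketch below assumes that the email node reads the recipient from `msg.to`, the subject from `msg.topic`, and the body from `msg.payload`; check the documentation of the SendGrid node actually used. The `persist` option, the document fields, and the name `buildDeliveryMessages` are hypothetical.

```javascript
// Hypothetical function-node body for the delivery subflow.
//   output 1: message for the email node (always sent)
//   output 2: message for the Cloudant node, or null to skip persistence
// The msg.to / msg.topic / msg.payload convention is an assumption.
function buildDeliveryMessages(job, result, options = {}) {
  const body = typeof result === "string"
    ? result
    : JSON.stringify(result, null, 2);

  const emailMsg = {
    to: job.email,
    topic: "Job " + job.jobID + ": results",
    payload: body
  };

  const dbMsg = options.persist
    ? {
        payload: {
          type: "jobResult",     // object type suggested by the spec
          jobID: job.jobID,      // links the result to the job request
          result: result,
          savedAt: new Date().toISOString()
        }
      }
    : null;

  return [emailMsg, dbMsg];
}
```

Returning `null` for an output is the usual Node-RED way to send nothing on that wire, so no extra switch node is needed to skip the database.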

The email with the final results should contain the following:

  • Overall accuracy
  • Classification matrix - class of truth on columns and class of prediction on rows, with counts in the cells
  • List of the first 20 misclassified questions
  • List of the 3 most and 3 least accurate question classes
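
All four items can be computed in a single pass over the test outcomes. The sketch below is a minimal version, assuming each outcome is an object with the question `text`, the `expected` class from the dataset, and the `predicted` class returned by the classifier; these property names and the function `buildReport` are hypothetical.

```javascript
// Hypothetical report builder.
// outcomes: [{ text, expected, predicted }, ...]
function buildReport(outcomes, limit = 20, topN = 3) {
  // matrix[predicted][expected] = count (rows: prediction, columns: truth).
  // Null-prototype objects keep arbitrary class names safe to use as keys.
  const matrix = Object.create(null);
  const perClass = new Map(); // expected class -> { total, correct }
  const misclassified = [];
  let correct = 0;

  for (const o of outcomes) {
    if (!matrix[o.predicted]) matrix[o.predicted] = Object.create(null);
    matrix[o.predicted][o.expected] = (matrix[o.predicted][o.expected] || 0) + 1;

    const stats = perClass.get(o.expected) || { total: 0, correct: 0 };
    stats.total += 1;
    if (o.predicted === o.expected) {
      stats.correct += 1;
      correct += 1;
    } else if (misclassified.length < limit) {
      misclassified.push(o);
    }
    perClass.set(o.expected, stats);
  }

  // Rank classes by accuracy, best first; ties are ordered by class name.
  const ranked = Array.from(perClass, ([name, s]) => ({
    class: name,
    accuracy: s.correct / s.total
  })).sort((a, b) => b.accuracy - a.accuracy || a.class.localeCompare(b.class));

  return {
    accuracy: outcomes.length ? correct / outcomes.length : 0,
    matrix,
    misclassified,
    mostAccurate: ranked.slice(0, topN),
    leastAccurate: ranked.slice(-topN).reverse()
  };
}
```

When the dataset has fewer than six classes, the most accurate and least accurate lists overlap, which is expected with this simple ranking.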

1.2.4     Send processing status updates

Add a status node in the flow to catch node status updates. Send these as a notification to the email address available in the global context.
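
Messages emitted by a status node carry the update in `msg.status`, including the status text and a `source` object that identifies the reporting node. A function node between the status node and the email node could turn that into a notification, as in the sketch below. The `to`, `topic`, and `payload` properties of the outgoing message follow the common email-node convention and are an assumption, as is the name `statusToEmail`.

```javascript
// Hypothetical converter from a status-node message to an email message.
// Returns null for empty updates (a node clearing its status), so that
// no email is sent for them.
function statusToEmail(msg, recipient, jobID) {
  const status = msg.status || {};
  if (!status.text) {
    return null;
  }
  const source = status.source || {};
  const nodeLabel = source.name || source.type || source.id || "unknown node";

  return {
    to: recipient,
    topic: "Job " + jobID + ": status update",
    payload: nodeLabel + ": " + status.text
  };
}
```

Inside the flow, `recipient` and `jobID` would be read from the global context, where the trigger step stored them.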

1.2.5     Error handling

Create a catch node in the flow to handle possible errors. Update the processing status to “Failed” if the error is fatal (breaks the flow), otherwise just log the error.
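
A catch node passes on the original message with an added `msg.error` object that holds the error message and a `source` describing the node that threw it. What counts as fatal is left open here; the sketch below assumes a configurable list of node types whose failure breaks the flow, which is purely an illustrative choice, as is the name `handleCaughtError`.

```javascript
// Hypothetical function-node body placed after the catch node.
//   output 1: "Failed" status update for fatal errors
//   output 2: log entry for non-fatal errors
// `fatalNodeTypes` is an assumed way to decide which errors break the flow.
function handleCaughtError(msg, fatalNodeTypes) {
  const error = msg.error || {};
  const source = error.source || {};
  const summary =
    (source.name || source.type || "unknown node") +
    ": " +
    (error.message || "unknown error");

  if (fatalNodeTypes.includes(source.type)) {
    return [{ payload: { status: "Failed", reason: summary } }, null];
  }
  return [null, { payload: summary }];
}
```

The first output can be wired to the same notification path as regular status updates, and the second to a debug or file node for logging.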

 

1.4     Verification

For verification, trigger the flow and confirm that it produces the required email messages and database changes. Make sure to verify the failure scenario too. You can use the sample NLC data for testing the classifier:

https://github.com/watson-developer-cloud/natural-language-classifier-nodejs/blob/master/training/weather_data_train.csv

1.5     Technology overview

 



Final Submission Guidelines

Submit a zip file with all the deliverables.

ELIGIBLE EVENTS:

2016 TopCoder(R) Open

Review style

Final Review: Community Review Board

Approval: User Sign-Off

ID: 30051168