Challenge Overview

Objective:
Currently, the customer team spends too much time resolving work items, each of which is created from an email received from a consumer (end customer). To serve consumers faster and better, the customer team wants to streamline work item creation by eliminating the manual categorization of emails.
Hence, the customer wants to introduce a new system that combines machine learning (ML) with the existing software systems.
With the introduction of the “E-Mail Triage App”, the volume of unsecure email handling across the overall ecosystem of software systems should be reduced. TIBCO invokes an API exposed by the “E-Mail Triage App” to send the mail content. The “E-Mail Triage App” applies the identification logic and replies with the mail category. TIBCO then creates the work item with the appropriate status in SFDC. In addition, there is a two-step feedback loop: a) a feedback capture process and b) a learning process.

This challenge is responsible for implementing req-1 (ADS 1.1.2) as defined by the architecture from this challenge: https://www.topcoder.com/challenges/30068838/?type=develop&noncache=true.


Requirement – 01: Identification Logic

The “E-Mail Triage App” needs to expose an API that will be consumed by the current TIBCO system. Through this API, TIBCO will send the extracted email in JSON format with the following fields (while architecting the solution, add further fields if required):
  • From
  • Sent
  • To
  • Subject
  • Body
 

{
   "version":"1.0",
   "source_system":"TIBCO",
   "target_system":"E-Mail Triage App",
   "res_trigger_by":"ABC1234",
   "res_trigger_dt":"2018-08-17,20:45:23",
   "email_box_content":[
      {
         "email_box_identifer":"EMAIL_BOX_IDENTIFIER_01",
         "number_of_emails":"50",
         "batch_identifier":"batch_01",
         "email_content":[
            {
               "email_identifer":"01",
               "category":"SPAM",
               "from":"xxx@bac.com",
               "sent":"2018-08-17,20:45:23 EST",
               "to":"yyy@glic.com",
               "subject":"testing purpose subject",
               "body":"this is for the purpose of generating the api interface definition and format"
            } ]
      }]
}
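
A minimal sketch of how the incoming payload could be checked before any further processing, so that a malformed request can be answered with a 400. The function name `validateTriageRequest` and the choice of which fields are mandatory are assumptions made for illustration; the sample above is the only source for the field names.

```javascript
'use strict';

// Assumption: every email entry must carry these fields as strings.
const REQUIRED_EMAIL_FIELDS = ['email_identifer', 'from', 'sent', 'to', 'subject', 'body'];

// Returns a list of human-readable problems; an empty list means the payload is usable.
function validateTriageRequest(payload) {
  const boxes = payload && payload.email_box_content;
  if (!Array.isArray(boxes) || boxes.length === 0) {
    return ['email_box_content must be a non-empty array'];
  }
  const errors = [];
  boxes.forEach((box, b) => {
    if (!box || !Array.isArray(box.email_content)) {
      errors.push('email_box_content[' + b + '].email_content must be an array');
      return;
    }
    box.email_content.forEach((email, e) => {
      REQUIRED_EMAIL_FIELDS.forEach((field) => {
        if (!email || typeof email[field] !== 'string') {
          errors.push('email_box_content[' + b + '].email_content[' + e + '].' + field + ' is required');
        }
      });
    });
  });
  return errors;
}
```

A controller could call this first and respond with status 400 and the returned list whenever the list is not empty.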


The E-Mail Triage App should parse the email body content. An email may have been replied to or forwarded multiple times; each time, one message block is added inside the body content. Each previous message block starts with “From:”.

Whenever TIBCO sends an email body, it sends the full body content, but the E-Mail Triage App needs to pick up only the first body block.

Note that in order to pick up the first body block, the app needs to exclude the email signature. If the first block turns out to be empty after the signature is removed (as in the case of a direct forward), the app needs to pick up the immediately following body block.
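
The parsing rules above can be sketched roughly as follows. This is only an illustration under stated assumptions: the signature is detected with a simple, hypothetical list of sign-off phrases, the header names skipped at the top of a quoted block are guessed from the sample message, and markers such as "[EXTERNAL]" are not stripped. A production implementation would likely need a more robust signature detector.

```javascript
'use strict';

// A quoted (replied or forwarded) block starts at a line beginning with "From:".
const BLOCK_START = /^\s*From\s*:/i;
// Assumption: header lines that may follow "From:" at the top of a quoted block.
const HEADER_LINE = /^\s*(From|Sent|To|Cc|Subject)\s*:/i;
// Assumption: a signature begins at a common sign-off line (hypothetical heuristic).
const SIGN_OFF = /^\s*(thanks|thank you|regards|best regards|kind regards|sincerely)\b[\s,.!]*$/i;

// Split the full body into blocks, opening a new block at every "From:" line.
function splitBlocks(body) {
  const blocks = [[]];
  body.split(/\r?\n/).forEach((line) => {
    if (BLOCK_START.test(line)) blocks.push([]);
    blocks[blocks.length - 1].push(line);
  });
  return blocks;
}

// Drop leading headers and blank lines, then cut the block at the first sign-off line.
function cleanBlock(lines) {
  const kept = [];
  let inHeaders = true;
  for (const line of lines) {
    if (inHeaders && (HEADER_LINE.test(line) || line.trim() === '')) continue;
    inHeaders = false;
    if (SIGN_OFF.test(line)) break;
    kept.push(line.trim());
  }
  return kept.join('\n').trim();
}

// Return the first block that still has content once its signature is removed.
function extractFirstBody(body) {
  for (const block of splitBlocks(body)) {
    const text = cleanBlock(block);
    if (text) return text;
  }
  return '';
}
```

For a direct forward whose first block contains nothing but a signature, `extractFirstBody` falls through to the next block, which matches the rule described above.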

The parsed email body content is then passed to the Watson ML online prediction endpoint. The format of the request should be as follows:

Sample Request From E-Mail Triage App to Watson Service
{
  "values": [["Could I get a quote on this client for Life, DI, dental and vision? You can throw in the worksite. Effective 10/1. What else do you need?"]]
}
The following response is received from the Watson Service:

Sample Response from Watson Service to E-Mail Triage App

{
  "fields": ["prediction"],
  "values": [["rfp"]]
}
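
As a rough sketch, the two sample payloads above could be produced and consumed with helpers like these. The function names are hypothetical, and the code assumes the response always has the `fields`/`values` shape shown in the sample; the HTTP call itself is left out.

```javascript
'use strict';

// Wrap one parsed email body in the payload shape of the sample request.
function buildWatsonRequest(bodyText) {
  return { values: [[bodyText]] };
}

// Read the predicted category out of the sample response shape.
function readPrediction(response) {
  const fields = response && response.fields;
  const values = response && response.values;
  if (!Array.isArray(fields) || !Array.isArray(values) || !Array.isArray(values[0])) {
    throw new Error('Unexpected Watson response shape');
  }
  // Locate the column named "prediction" instead of assuming it is the first one.
  const index = fields.indexOf('prediction');
  if (index === -1 || typeof values[0][index] !== 'string') {
    throw new Error('Watson response has no prediction');
  }
  return values[0][index];
}
```

Throwing on an unexpected shape lets the controller layer turn the failure into a 500 response instead of passing an undefined category back to TIBCO.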

The “values” field above contains the category, and the corresponding status should be sent back to TIBCO. The category-to-status mapping should be configurable in the system.

The service should pack the category from the Watson ML prediction together with its status and return them in the response.

Identification Logic Response Format from E-Mail Triage App to TIBCO

{
  "version": "1.0",
  "source_system": "E-Mail Triage App",
  "target_system": "TIBCO",
  "res_trigger_by": "ABC1234",
  "res_trigger_dt": "2018-08-17,20:45:23",
  "email_box_content": [
    {
      "email_box_identifer": "EMAIL_BOX_IDENTIFIER_01",
      "number_of_emails": "50",
      "batch_identifier": "batch_01",
      "email_content": [
        {
          "email_identifer": "01",
          "category": "SPAM",
          "status": "OPEN"
        }]
    }]
}
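
One possible way to assemble this response from the original request is sketched here. It assumes that the envelope fields (`version`, `res_trigger_by`, `res_trigger_dt` and the mailbox fields) are simply echoed back, as the two samples suggest, and that `results` is a hypothetical lookup from `email_identifer` to the category and status determined for that email.

```javascript
'use strict';

// request: the payload received from TIBCO.
// results: hypothetical object keyed by email_identifer, e.g. { '01': { category, status } }.
function buildTriageResponse(request, results) {
  return {
    version: request.version,
    // Source and target are swapped relative to the request.
    source_system: 'E-Mail Triage App',
    target_system: 'TIBCO',
    res_trigger_by: request.res_trigger_by,
    res_trigger_dt: request.res_trigger_dt,
    email_box_content: request.email_box_content.map((box) => ({
      email_box_identifer: box.email_box_identifer,
      number_of_emails: box.number_of_emails,
      batch_identifier: box.batch_identifier,
      // Only identifier, category and status are returned; the email text is dropped.
      email_content: box.email_content.map((email) => {
        const result = results[email.email_identifer];
        if (!result) {
          throw new Error('No result for email ' + email.email_identifer);
        }
        return {
          email_identifer: email.email_identifer,
          category: result.category,
          status: result.status,
        };
      }),
    })),
  };
}
```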

Please note that we will receive the following categories from the Watson service:
  • Non - RFQ Inquiry
  • Non - RFQ
  • RFQ
  • Spam
  • Secure
For now, each category is mapped to a status, which you need to pass back to TIBCO.
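
A configurable category-to-status lookup might look like the sketch below. The environment variable name `CATEGORY_STATUS_MAP`, the fallback status and the case-insensitive matching are all assumptions for illustration; the actual status for each category is not specified here and has to come from configuration.

```javascript
'use strict';

// Normalize category names so "SPAM", "Spam" and " spam " resolve to the same key.
function normalize(category) {
  return String(category).trim().toLowerCase().replace(/\s+/g, ' ');
}

// mapping: plain object of category -> status, e.g. parsed from a JSON config.
// fallbackStatus: hypothetical status returned for categories that are not configured.
function createStatusResolver(mapping, fallbackStatus) {
  const table = new Map();
  Object.keys(mapping).forEach((category) => {
    table.set(normalize(category), mapping[category]);
  });
  return function statusFor(category) {
    const key = normalize(category);
    return table.has(key) ? table.get(key) : fallbackStatus;
  };
}

// Assumption: the mapping is supplied as JSON in an environment variable.
const statusFor = createStatusResolver(
  JSON.parse(process.env.CATEGORY_STATUS_MAP || '{}'),
  'OPEN'
);
```

Keeping the mapping outside the code means a status can be changed without redeploying the application logic.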

Additional details about TIBCO process:

There are currently 5 mailboxes whose emails are processed by TIBCO as a batch, at 15-minute intervals. The maximum number of emails seen in one interval is approximately 50. During the batch process, TIBCO processes the emails individually, so there will be one request and one response via the “E-Mail Triage App” for each email. (Nevertheless, while architecting the solution, consider the best approach.) TIBCO will retry the “E-Mail Triage App” 3 times in the event of an error. TIBCO will have an on/off switch for invoking the E-Mail Triage App. This is to ensure the existing process is not disrupted.
Note: One or more TIBCO systems may be available. Hence, the E-Mail Triage App APIs may receive concurrent requests.
After receiving the request from the TIBCO system, the new application invokes the ML endpoint to identify the category of the email by passing the body content parsed from the received JSON request.
The identification is performed through the Watson service integration. The application responds to TIBCO with:
  • the email category
  • the status of the work item
 
TIBCO creates the work item in the SFDC system, so that the customer team only looks at the items that are marked with a status of 'Open'.
Here is a sample message that we need to parse:
 

From : ipam@d-ins.com
Sent : 2018-08-17T01:17:24PM
To : ipam@d-ins.com
Subject : Developmental Services of iowa, Inc. Life and DI RFP


[EXTERNAL]

Please see the attached RFP. The census is password protected. I will send the password under a separate email.
Let me know if you need anything further to provide your most competitive proposal.
 
Thank you
Pam Pal
Executive Account Manager
[dbg logo for email]
Over 35 Years in Business
11725 Arbor Street, Suite 240
Omaha, NE 68144
Phone 402-614-6222 ext 1
Fax 402-614-6606
ipam@d-ins.com <mailto: ipam@d-ins.com >
www.D-ins.com<https://urldefense.proofpoint.com/v2/url?u=http-3A__www.dbg-
2Dins.com_&d=DwMFAg&c=tgH_BtXR-501zcdE8-dvRgEg8aWmdQ-
cSUXqw1qnbGU&r=0sl84c83uOdfJWLnhZBiAMWD9WcIcBpj-
yJEK22W7ZA&m=pUcz8Muev9p0CieAkbBIVdzcByTLFcTTZcv_V7nNYfM&s=A2ySxBALorg_u02f
0WgkHOxq51iZFXTsr_LUt5Z3uyo&e=>
All emails pertaining to the federal and state laws should be discussed with your legal or our legal
counsel. As agents, we provide guidance and not legal advice.



Exception Handling
The functions of the JavaScript services are implemented in an asynchronous manner, i.e. virtually all functions take a callback function that is called to notify the caller of the result. If an error occurs, the callback function is called with an Error-type parameter "error" detailing the error.
Node.js/Express controllers will translate errors into HTTP status codes:
  • 400 The request could not be interpreted correctly or some required parameters were missing.
  • 404 The entity does not exist.
  • 401 The request didn't include authentication information.
  • 403 The request was forbidden because of insufficient permission.
  • 500 Something is broken on the API server side.
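
A hedged sketch of how such a translation could be done with an Express-style error-handling middleware (a function taking four arguments). It assumes that the services attach a numeric `status` property to the Error objects they pass on; that convention is hypothetical and not stated in this specification.

```javascript
'use strict';

// Status codes that are passed through to the client as they are.
const KNOWN_STATUSES = [400, 401, 403, 404];

// Express recognizes error middleware by its four-argument signature.
function errorHandler(err, req, res, next) {
  // Assumption: services set err.status; anything else is treated as a server fault.
  const status = KNOWN_STATUSES.indexOf(err.status) !== -1 ? err.status : 500;
  // Do not leak internal details for unexpected errors.
  const message = status === 500 ? 'Internal server error' : err.message;
  res.status(status).json({ message: message });
}
```

In an Express application this would typically be registered after all routes, for example with `app.use(errorHandler)`.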
     
Security
The backend REST APIs require authentication using the OAuth2 client credentials flow. Please find more details here: https://tools.ietf.org/html/draft-ietf-oauth-v2-31#section-4.4.1  The client will need to include the access token in all subsequent requests.
The REST APIs should be exposed through HTTPS.
All requests will be validated.
The app should store the credentials (username and password) used for basic authentication with Watson ML (a file may be an option). The authorization URLs for both APIs should be configurable. The services calling both APIs should pass access tokens in headers and should be able to handle 401 response codes: perform authentication, obtain a new access token and then call the actual API endpoint again.
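
The 401 handling described above could be sketched as a small wrapper. Both `getToken` and `callApi` are hypothetical functions supplied by the caller: the first authenticates and resolves to an access token, the second performs the real request with that token and resolves to an object with a `status` property. The sketch retries only once after re-authenticating.

```javascript
'use strict';

// getToken: hypothetical async function that authenticates and resolves to an access token.
function createAuthorizedCaller(getToken) {
  let token = null;
  // callApi: hypothetical async function (token) => response object with a "status" field.
  return async function call(callApi) {
    if (!token) token = await getToken();
    let response = await callApi(token);
    if (response.status === 401) {
      // The token expired or was rejected: authenticate again and retry once.
      token = await getToken();
      response = await callApi(token);
    }
    return response;
  };
}
```

The token is kept in memory between calls, so authentication only happens on the first call and after a 401.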
The authentication information, along with the token, needs to be stored securely and validated for every request. We recommend IBM Compose for MongoDB to store all this information securely in encrypted format.
We also need to store the details of every API request that the E-Mail Triage App makes to Watson, for auditing and logging purposes. We need to store the API data, the URL and the response code, as well as the request received from TIBCO.
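
One way to capture this audit data is to wrap the Watson call, as in the sketch below. All field names in the record are hypothetical, and `invokeWatson` and `saveAudit` are injected placeholder functions; `saveAudit` could, for instance, insert the record into a MongoDB collection.

```javascript
'use strict';

// context: hypothetical details about the incoming TIBCO call.
// invokeWatson: async function (request) => { status, body }.
// saveAudit: async function (record) that persists the audit record.
async function callWithAudit(context, watsonRequest, invokeWatson, saveAudit) {
  const record = {
    invokedBy: context.username,
    clientIp: context.ip,
    tibcoRequest: context.tibcoRequest,
    receivedAt: context.receivedAt,
    watsonRequest: watsonRequest,
    forwardedAt: new Date(),
  };
  try {
    const response = await invokeWatson(watsonRequest);
    record.watsonStatusCode = response.status;
    record.watsonResponse = response.body;
    return response;
  } catch (err) {
    record.error = err.message;
    throw err;
  } finally {
    // The record is written whether the Watson call succeeded or failed.
    record.completedAt = new Date();
    await saveAudit(record);
  }
}
```

Note that in this sketch a failure inside `saveAudit` would surface to the caller; whether auditing problems should fail the request is a design decision left open here.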
Scalability
There is no particular scalability requirement, and the proposed architecture does not prevent the application from being scalable.
Deployment Constraints
The app can be deployed to any environment that supports Node.js.
The application will expose REST API endpoints to be consumed by TIBCO. It will run as a Node.js/Express application.
 
Technology overview
It should be deployable in the IBM Cloud as a PaaS component.
  • JavaScript
  • JSON
  • REST
  • Backend
    • Node.js 8
    • Express 4.16
    • Passport 0.4

Note that the following items are also in scope.
  1. Security
    1. Please add a registration endpoint that allows a user to register with a username, password and system. The username/password will be used by the current /oauth/token endpoint to request a token.
    2. The /oauth/token endpoint is in scope.
    3. Each API invocation needs to be logged in the system, for example who invoked it, when, from which IP, etc.
    4. Identification Logic: we need this for audit purposes, for example the time the request was received, the time the request was forwarded, and the response status.
  2. Cache Layer: This is only needed in req-1. You should use the combination of the email's from-email, to-email, subject and timestamp as the cache key, and the response from the Watson API as the cache value. Don't call the Watson API if the key is cached.
  3. We need to use IBM Compose for MongoDB to store the user info, token and audit data.
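
The cache layer in item 2 could be sketched as follows. This is an illustration only: it assumes that the "timestamp" is the email's `sent` field, hashes the four values into a fixed-length key, and uses an in-memory `Map` that is neither bounded nor shared between processes. `predict` is a hypothetical async function that performs the actual Watson call.

```javascript
'use strict';
const crypto = require('crypto');

// Build the cache key from the from-email, to-email, subject and timestamp.
function cacheKey(email) {
  // JSON.stringify keeps the field boundaries unambiguous before hashing.
  const parts = [email.from, email.to, email.subject, email.sent];
  return crypto.createHash('sha256').update(JSON.stringify(parts)).digest('hex');
}

// cache: any object with has/get/set, here a Map.
// predict: hypothetical async function (email) => Watson API response.
async function predictWithCache(cache, email, predict) {
  const key = cacheKey(email);
  if (cache.has(key)) {
    // Cached: the Watson API is not called.
    return cache.get(key);
  }
  const prediction = await predict(email);
  cache.set(key, prediction);
  return prediction;
}
```

Because TIBCO may retry a request up to 3 times, a retried email with the same four fields would be answered from the cache instead of triggering another Watson call.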

We will provide you the working Watson API URL. 
 

Final Submission Guidelines

Submission Deliverable
- Source Code
- Detailed deployment guide and verification guide
- Updated architecture deliverable (updated Swagger file)
- Working IBM Cloud deployment for verification

ELIGIBLE EVENTS:

Topcoder Open 2019

REVIEW STYLE:

Final Review:

Community Review Board

Approval:

User Sign-Off

ID: 30069524