VA Online Memorial - Content filtering and Sentiment analysis - Ideation

Key Information

The challenge is finished.

Challenge Overview

The Department of Veterans Affairs' (VA) National Cemetery Administration (NCA) seeks to create an interactive digital experience that enables virtual memorialization of the millions of people interred at VA national cemeteries. This online memorial space will allow visitors to honor, cherish, share, and pay their respects, and will permit researchers, amateurs, students, and professionals to share information about Veterans.

The final application will likely have a comments section, and it is very important that the language used there is appropriate. Therefore, this challenge has two tasks:

  1. Filter out comments containing profanity

  2. Sentiment analysis

For both tasks you are expected to suggest an approach (write a document) and implement a simple POC demonstrating the ideas. You can use cloud APIs or implement your own algorithms, but in either case you must explain why your approach is better than the alternatives. You are not limited to specific technologies for the POC.

Note that the tool needs to support only English for now.

Profanity filtering

The tool should accept a comment as a string and output both the list of all profane words found and the comment with those words filtered out. You can suggest (or implement) a simple filter based on a local dictionary, use regex search, or use a third-party API. The choice is entirely up to you, but you must cover these points in the document:

  • Is there a feedback loop (from manual input), and is one needed at all?

  • Cases that your approach will fail to detect
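As an illustration of the input and output described above, here is a minimal sketch of the local-dictionary option. It is not a required design. The word list, the function names filter_profanity and add_flagged_word, and the masking character are all assumptions made for this example; a real solution would load a maintained dictionary or call a third-party API.

```python
import re

# Placeholder word list (assumption): mild stand-ins for a real dictionary.
PROFANE_WORDS = {"darn", "heck"}


def filter_profanity(comment, words=PROFANE_WORDS, mask="*"):
    """Return (found_words, cleaned_comment) for a single comment string."""
    if not words:
        return [], comment
    # Whole-word, case-insensitive match; longer entries are tried first.
    alternation = "|".join(
        re.escape(w) for w in sorted(words, key=len, reverse=True)
    )
    pattern = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    found = [m.group(0) for m in pattern.finditer(comment)]
    cleaned = pattern.sub(lambda m: mask * len(m.group(0)), comment)
    return found, cleaned


def add_flagged_word(word, words=PROFANE_WORDS):
    """Hypothetical feedback hook: a moderator adds a word that was missed."""
    words.add(word.lower())


if __name__ == "__main__":
    print(filter_profanity("Well, DARN it to heck!"))
    # An obfuscated spelling is not in the dictionary, so it slips through.
    print(filter_profanity("d4rn it"))
```

A plain dictionary match like this misses obfuscated spellings ("d4rn"), words split by spaces or punctuation, and words that are only offensive in context. Those are the kinds of failure cases the document should discuss, and add_flagged_word shows one simple shape a manual feedback loop could take.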

 

Sentiment analysis

Sometimes a comment won’t contain any profane words, but its overall sentiment will be very negative, which does not fit the decorum of the site. The idea here is to define several sentiment levels (good to bad) and hide, or flag for human review, any comment that falls below a threshold. Again, you can use a third-party API or suggest implementing the analysis locally, but do discuss these points:

  • How to incorporate manual feedback (i.e., someone manually flags a comment as inappropriate)

  • What the analysis does with long comments that contain both positive and negative sentiment, and how that behavior can be tweaked
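To make the threshold idea and the long-comment question concrete, here is a small sketch of a local, lexicon-based scorer that rates each sentence separately and then aggregates. Everything in it is an assumption for illustration: the toy LEXICON, the function names, the default threshold of -0.5, and the two aggregation strategies. A real solution might use a trained model or a cloud sentiment API instead.

```python
import re

# Toy lexicon (assumption): word -> sentiment weight, negative is bad.
LEXICON = {
    "grateful": 2, "love": 2, "honor": 1, "brave": 1,
    "hate": -2, "disgrace": -2, "coward": -2, "awful": -1,
}


def sentence_scores(comment):
    """Score each sentence as the mean weight of its lexicon words."""
    sentences = [
        s for s in re.split(r"(?<=[.!?])\s+", comment.strip()) if s
    ]
    scores = []
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        hits = [LEXICON[t] for t in tokens if t in LEXICON]
        scores.append(sum(hits) / len(hits) if hits else 0.0)
    return scores


def review_comment(comment, threshold=-0.5, strategy="min"):
    """Flag a comment whose aggregated score falls below the threshold.

    strategy="min" lets the single most negative sentence decide;
    strategy="mean" lets positive sentences offset negative ones.
    """
    scores = sentence_scores(comment)
    if not scores:
        return {"score": 0.0, "flag": False}
    if strategy == "min":
        score = min(scores)
    elif strategy == "mean":
        score = sum(scores) / len(scores)
    else:
        raise ValueError("unknown strategy: " + strategy)
    return {"score": score, "flag": score < threshold}


if __name__ == "__main__":
    text = "I am grateful for his service. The rest of his unit was a disgrace."
    print(review_comment(text, strategy="min"))
    print(review_comment(text, strategy="mean"))
```

With this sketch, a mixed comment is flagged under "min" but passes under "mean", so the aggregation strategy and the threshold are the two knobs for tuning how long comments are handled. Manual flags could feed back by adjusting lexicon weights or the threshold, or by serving as labeled training data if a learned model replaces the lexicon.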

 

Your POC implementation only needs to demonstrate the basic concepts. We might build on it later or start from scratch, but the document must contain enough detail for us to implement the final solution later on.



Final Submission Guidelines

Submit a document explaining the approach.
Submit the source code for the POC.

ELIGIBLE EVENTS:

2018 Topcoder(R) Open

REVIEW STYLE:

Final Review:

Community Review Board

Approval:

User Sign-Off

ID: 30060421