Topcoder Challenge | Topcoder Community

Challenge Overview

Challenge Objectives

Develop a tool to clean up, parametrize and group benefits data
Create simple reports in Excel

Project Background

Our client, Morel, wants to optimize its product offerings by consolidating its current product set.
In this challenge, we’ll build a tool that cleans the raw benefits data, parametrizes and groups related benefits, and outputs simple Excel reports. This effectively reduces the variability of the benefits data.
A parallel challenge is working on a different subset of the requirements: creating a benefits hierarchy and a consolidated product set.
Future challenges will integrate the outputs of these two challenges, improve reporting, and create the final product set recommendations.

Background

Our client offers a variety of insurance products to large organizations who opt for the one that best suits their needs. This is often achieved through customization of product details and these unique products are then administered and managed. This has resulted in the following challenges:

Variation of processes across the organization
Increased complexity in providing customer service
Reduced ability to self-serve
An inefficient process of claims payment and pricing models
Difficulty in forecasting and providing optionality to customers

The goal of Coverage Optimization is to:

Analyze the variability of parent products with reference to the unique child products formed due to benefits customizations. This analysis will be used to assist Morel in the definition of a target, consolidated product set that can represent a standard set of offerings with ‘configurable’ options.
Recommend options for this envisioned consolidated product set along with justifications. This will include impact analysis based on utilization and current state product variability.
Suggest a hierarchical and categorical benefit design with measurable reduction in variability of benefit language, resulting in standardization of offerings and an optimal benefit package
Increase simplicity of standard benefit options while suggesting configurability options for benefits
Offer an applicable algorithm/process that is easily repeatable on similarly structured but different data that
- Identifies new answers/benefits and aligns them to the suggested benefits appropriately, or
- Signals the need for a new standard benefit option

Data Description

The available dataset is not very large (~70MB). You will have access to the entire data set. Check the sample and metadata file available in the forums for a complete definition of all data fields.

Here is a definition of some terms that will help with understanding the provided data:

Benefit is coverage for various health care services
Coverage code is a unique identifier for a set of benefits provided as a group of services with actual start/end dates for the coverage
Product consists of a set of coverage codes (and hence benefits) and is used internally to align coverage codes to internal rules and procedures.

Benefits data is the core data set for this challenge. It is generated as follows. Benefits configuration is arranged in a question/answer format on a website. Each benefit has a hard-coded question and one of several types of available answers to complete it. For example, a question might be “Is this a High Deductible Health Plan?” and the user might have a choice of 2 check boxes, radio buttons, or a drop-down with Yes/No toggles. The answer then becomes the statement that combines the question and the response, such as “No, this is not a high deductible health plan.” In another example, the question might be “The out of pocket maximum is:” and the user enters “$3000”, so the answer becomes “The OPM is $3000.” Finally, the prompt might be “Enter additional comments here”, in which case the user enters free-form text and that text becomes the answer. These sets of answers are rolled up to form all of the benefits for a specific coverage code.
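Because many answers are a fixed statement with a user-entered value, one possible way to parametrize them is to replace the value with a placeholder so that answers differing only in the value share a template. The following is a minimal sketch under that assumption, not the method from the ideation document; the function name `parametrize` and the placeholder names `{AMOUNT}`, `{PERCENT}` and `{NUMBER}` are hypothetical.

```python
import re

# Hypothetical patterns and placeholder names; the challenge does not prescribe them.
_DOLLAR = re.compile(r"\$\s?\d+(?:,\d{3})*(?:\.\d+)?")
_PERCENT = re.compile(r"\d+(?:\.\d+)?\s?%")
_NUMBER = re.compile(r"\b\d+(?:,\d{3})*(?:\.\d+)?\b")


def parametrize(answer):
    """Return (template, params) for one answer string.

    Values are collected pattern by pattern (amounts, then percentages,
    then plain numbers), not strictly in left-to-right order.
    """
    params = []

    def capture(tag):
        def replace(match):
            params.append(match.group(0))
            return "{" + tag + "}"
        return replace

    text = " ".join(answer.split())  # collapse repeated whitespace
    text = _DOLLAR.sub(capture("AMOUNT"), text)
    text = _PERCENT.sub(capture("PERCENT"), text)
    text = _NUMBER.sub(capture("NUMBER"), text)
    return text, params
```

With this sketch, “The OPM is $3000.” and “The OPM is $5,000.” both map to the template “The OPM is {AMOUNT}.”, so they can be treated as one benefit with a configurable amount.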

The actual data set contains flat data records that:

List the benefits (the answer column) and question identifiers (sequence_id)
Connect benefits to coverage codes
Connect coverage codes to internal products
Give the start/end dates for the coverage

And also these useful columns:

Type_of_tag - information about the type of field presented in the software - radio button, checkbox, text input, dropdown
Value flag - populated only for records where the type of tag is text. It denotes whether the user entry field is a text field (free-form text allowed) or a numeric field, in which case the answer may contain some text that is automatically generated by the software, but the user can only enter a numeric value.
Top50_flag - This just denotes that the coverage code is for a very important client
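These columns can help decide how aggressively an answer should be cleaned. The sketch below shows one way to route a record to a cleaning strategy; the literal values `"text"` and `"numeric"` and the strategy names are assumptions for illustration, so check the metadata file for the actual encoding.

```python
def cleaning_strategy(type_of_tag, value_flag):
    """Pick a cleaning strategy for one record.

    Assumed encodings (not confirmed by the data description):
    type_of_tag == "text" marks user-entered fields, and
    value_flag == "numeric" marks fields limited to a numeric value.
    """
    tag = (type_of_tag or "").strip().lower()
    flag = (value_flag or "").strip().lower()
    if tag != "text":
        # Radio buttons, checkboxes and dropdowns have a fixed set of answers.
        return "fixed_choice"
    if flag == "numeric":
        # Software-generated statement with a user-entered number.
        return "numeric_template"
    # Open free-form text, the most variable case.
    return "free_text"
```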

Technology Stack

Python
Excel

Code access

We’re starting a new codebase, so you should create the project structure.

The winning submission of the ideation challenge is available in the forums. It contains the details of what we’re trying to build in this project. You should read that document before continuing with the individual requirements section.

Individual requirements

In this challenge we will focus on grouping the individual benefits (answer column) within each benefit_class and answer_tag.

Output of this challenge is a Python tool (CLI) that:

Reads the benefits data file
Cleans the answer data, parametrizes the answers for each benefit class and answer tag and groups the benefits as described in the ideation challenge document
Creates the output reports for each benefit class
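The three steps above could be wired together roughly as follows. This is only a skeleton under stated assumptions: the input is assumed to be a CSV file with `benefit_class`, `answer_tag` and `answer` columns, the helper names are hypothetical, whitespace and case normalization stands in for the real parametrization, and writing the Excel reports is left out because it needs a third-party library.

```python
import argparse
import csv
from collections import defaultdict


def clean(answer):
    """Placeholder cleaning step: collapse whitespace and lowercase."""
    return " ".join(answer.split()).lower()


def group_answers(rows):
    """Group raw answers by (benefit_class, answer_tag), then by cleaned text."""
    groups = defaultdict(lambda: defaultdict(set))
    for row in rows:
        key = (row["benefit_class"], row["answer_tag"])
        groups[key][clean(row["answer"])].add(row["answer"])
    return groups


def main(argv=None):
    parser = argparse.ArgumentParser(description="Group benefit answers.")
    parser.add_argument("benefits_file", help="path to the benefits data (CSV assumed)")
    args = parser.parse_args(argv)
    with open(args.benefits_file, newline="", encoding="utf-8") as handle:
        groups = group_answers(csv.DictReader(handle))
    for (benefit_class, answer_tag), members in sorted(groups.items()):
        print(f"{benefit_class} / {answer_tag}: {len(members)} group(s)")
    return 0
```

A real entry point would call `main()` from a console script or a `__main__` module, and each step would live in its own module.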

The main requirement here is parametrizing and grouping the answers, and you are NOT limited to the methods described in the ideation document (the document contains many ideas and suggestions, but it still misses some cases of benefits that could be grouped). It is up to you to improve the parametrization and grouping of the benefits as you see fit, and this is the requirement that will earn you the most points during review.

There is no objective score/metric for benefits grouping that we can use here for review so the reviews will be manual and based on the output for each of the benefit classes and suggested grouping.

In addition to the reports mentioned in the ideation document, the tool should print global statistics: the total number of unique answers across all tags and the total number of unique answers after grouping. That said, don’t try to achieve the lowest number of benefits after grouping through artificial improvements that rely exclusively on human decisions (e.g., hardcoded benefit texts).
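The global statistics could be computed along these lines. This is a small sketch that assumes the grouping step produces a mapping from each unique answer to its group identifier; the function name, the dictionary keys and the reduction ratio are illustrative additions.

```python
def global_statistics(answer_to_group):
    """Summarize variability before and after grouping.

    answer_to_group maps each unique answer (any hashable key, for example
    an (answer_tag, answer) pair) to the identifier of its group.
    """
    unique_before = len(answer_to_group)
    unique_after = len(set(answer_to_group.values()))
    if unique_before == 0:
        reduction = 0.0
    else:
        reduction = 1 - unique_after / unique_before
    return {
        "unique_before": unique_before,
        "unique_after": unique_after,
        "reduction": reduction,
    }
```

For example, three unique answers that fall into two groups give a reduction of one third.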

Pay special attention to the possibilities of grouping benefits in the “Additional Information” benefit class. This class has the highest variability in the answers, and the ideation document provides few details for parametrizing them.
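For free-form answers like these, one option worth trying is grouping by string similarity. The sketch below uses greedy clustering with `difflib.SequenceMatcher` from the standard library; it is an illustration rather than a recommended solution, and the function names and the 0.85 threshold are assumptions that would need tuning on the real data.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, drop punctuation and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]+", " ", text.lower()).split())


def cluster_free_text(answers, threshold=0.85):
    """Greedily assign each answer to the first cluster whose representative
    is similar enough; otherwise start a new cluster.

    The comparison is pairwise against cluster representatives, so the cost
    grows with answers x clusters and may need blocking on larger inputs.
    """
    clusters = []  # list of (normalized representative, members)
    for answer in answers:
        norm = normalize(answer)
        for representative, members in clusters:
            if SequenceMatcher(None, representative, norm).ratio() >= threshold:
                members.append(answer)
                break
        else:
            clusters.append((norm, [answer]))
    return [members for _, members in clusters]
```

Because the result depends on the order of the input and on the threshold, the clusters should be reviewed in the reports rather than trusted blindly.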

Create a README file with details on how to deploy and verify the tool. Unit testing is out of scope. Code style will not be a major factor, but make sure your code follows the PEP-8 guidelines and is split into modules - don’t put everything into one giant module.

Review will be a combination of internal Topcoder and client reviews.

What To Submit

Submit the full source code

Submit the build/verification documentation

Submit a short demo video and sample outputs of the tool

Final Submission Guidelines

See above

Coverage Optimization - Variability Reduction Challenge

Review style

Final Review

Approval

ID: 30090680