
Challenge Overview

Our goal is to implement and/or improve one idea from our previous ideation challenge.

Background

Market Development Funds (MDF) are monetary funds provided to the customer's qualified and authorized channel partners to subsidize the costs of marketing and sales activities that drive demand for the customer's products, services, and solutions.

 

Authorized channel partners send proposals to our customer. Once a proposal is approved, the channel partner can utilize the fund for marketing purposes and receives a full audit trail of MDF activity, such as fund requests and claim approvals. Note that the activity approval process is negotiated with a partner before the activity is submitted through the customer's CRM portal. Therefore, there is no data on activities that were rejected because their costs were deemed unreasonable.

 

This entire MDF process for partners has several stages, similar to those listed below.

  • Partner Selection and MDF pre-planning for partner

  • Partner activity planning to utilize the MDF. Planning is conducted with the assistance of a customer representative, and the cost of the planned activity has to be approved. It is at this point that the output of the model (the focus of this challenge) is used.

  • Execution of activities

  • Claim submission

  • Claim validation

Task Detail

The ask from the customer is to build a "reasonable cost model", to be used by Team B (marked by the red circle in the process diagram above) as a guideline during proposal preparation and submission by the partner. The model should be based upon existing data describing various parameters (region, market, activity type, etc.), or perhaps something else entirely.

 

The output of the “reasonable cost” model would be a plausible range for the submitted proposal’s cost.
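For illustration, here is a minimal sketch of how such a range could be derived from historical data. The column names (Region, ActivityType, Cost) and file name are hypothetical placeholders, not the real schema:

```python
import pandas as pd

def reasonable_cost_range(df, group_cols, cost_col="Cost",
                          lower_q=0.05, upper_q=0.95):
    """Per-group [lower, upper] cost range from empirical quantiles."""
    grouped = df.groupby(list(group_cols))[cost_col]
    ranges = grouped.quantile([lower_q, upper_q]).unstack()
    ranges.columns = ["lower_bound", "upper_bound"]
    return ranges.reset_index()

# Hypothetical usage; adjust names to the actual dataset.
df = pd.read_csv("mdf_activities.csv")
print(reasonable_cost_range(df, ["Region", "ActivityType"]).head())
```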

 

The challenge in this case is that while the customer has ample data available for analysis, critical data, such as a “proposal rejection reason = unreasonable/inflated price” field, are missing and will not become available. That is to say, there is no specific target column for you to predict. Instead, we are looking for an outlier detection method that works in an unsupervised or user-guided way.
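One possible unsupervised approach, offered here only as a sketch rather than a prescribed method, is an Isolation Forest over the contextual attributes plus the cost; all column and file names below are hypothetical:

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

df = pd.read_csv("mdf_activities.csv")  # placeholder file name

# One-hot encode the categorical context and keep the raw cost.
features = pd.get_dummies(df[["Region", "Market", "ActivityType"]])
features["Cost"] = df["Cost"]

# "contamination" is a user-guided knob: the assumed share of outliers.
model = IsolationForest(contamination=0.02, random_state=42)
labels = model.fit_predict(features)               # -1 = outlier, 1 = inlier
df["confidence"] = -model.score_samples(features)  # higher = more anomalous

outliers = df[labels == -1].sort_values("confidence", ascending=False)
print(outliers.head())
```

The contamination parameter encodes a prior belief about how common unreasonable costs are, which is one natural place for user guidance.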

 

Dataset

As part of the challenge, you will be provided with 20k rows of data in CSV format. Note that, in the sample data provided, all fields referring to names of companies/clients have had the real names replaced with dummy values to protect the anonymity of the data.
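A quick sanity check when first loading the data might look like the following; the file name is a placeholder:

```python
import pandas as pd

df = pd.read_csv("mdf_data.csv")  # placeholder file name
print(df.shape)      # expect roughly (20000, n_columns)
print(df.dtypes)     # which fields are categorical vs. numeric
print(df.nunique())  # dummy company/client names still group consistently
```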

 

The client likes one of the previous ideation solutions, so you may want to follow a similar idea when developing your method. If you believe there is a smarter idea, feel free to implement it; however, you should compare it with the previously proposed method using the data or some conceptual examples. Note that the Delphi Method mentioned in that solution is not feasible, as the customer will not allow "interviewing" or the participation of their experts in the development of the model.

 

Both the dataset and the previous solution can be found in the forum. Once you register for the challenge, you will be able to access them.

 


Final Submission Guidelines

Submission

Contents

We require the following in the submission:

  1. Python 3 code. It takes the data CSV file as input, along with a few parameters (see the command-line sketch after this list). The runtime environment in which the models will be tested is a 64-bit system, preferably Ubuntu ^16.0. Please make sure there are discrete unit tests, with a test data dump, that re-validate model assumptions and outcomes.

  2. Along with the code, please pack all data that you use together. If you use any additional datasets, please also include them.

  3. A demo video about your solution, including how it runs, what are the results, and what are the possible variations it can support. If you have some GUI, it will be a plus.

  4. A document including the key ideas of your method. High-level descriptions should be good enough. It would be interesting to provide a suggestion about how to evaluate the model’s output in a quantitative way.

  5. A predicted-outliers CSV file that contains the outliers detected by your method under its default settings. You may want to rank them according to confidence.
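To make the expected packaging concrete, below is a minimal, hypothetical command-line wrapper consistent with items 1 and 5. The flag names, defaults, and featurization are illustrative only, not a required interface:

```python
import argparse

import pandas as pd
from sklearn.ensemble import IsolationForest

def main():
    parser = argparse.ArgumentParser(
        description="MDF reasonable-cost outlier detection (illustrative)")
    parser.add_argument("--input", required=True, help="path to the data CSV")
    parser.add_argument("--output", default="predicted_outliers.csv")
    parser.add_argument("--contamination", type=float, default=0.02,
                        help="assumed fraction of outliers (tunable)")
    args = parser.parse_args()

    df = pd.read_csv(args.input)
    # Naive featurization: one-hot all categorical columns, keep all numerics.
    features = pd.get_dummies(df.select_dtypes(include="object")).join(
        df.select_dtypes(include="number"))
    model = IsolationForest(contamination=args.contamination, random_state=42)
    labels = model.fit_predict(features)
    df["confidence"] = -model.score_samples(features)
    df[labels == -1].sort_values(
        "confidence", ascending=False).to_csv(args.output, index=False)

if __name__ == "__main__":
    main()
```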

Judging Criteria

You will be judged on accuracy and on how well the code is organized and documented. Note that this contest will be judged subjectively by the client and Topcoder; however, the criteria below will largely be the basis for the judgement.

 

Accuracy (50%)

  • The client will compare your detected outliers and see if they are accurate.

  • If your method can provide some explainable rules about the outliers, it will be a plus (see the sketch after this list).

  • Elegance of your solution will also be considered as a criterion.
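For the explainability plus, one common trick, sketched here under the assumption that you already have the features and outlier labels from a fitted detector, is to train a shallow surrogate decision tree on those labels and read its splits as human-readable rules:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# "features" and "labels" come from a detector such as an Isolation Forest;
# the shallow tree approximates it with a few readable threshold rules.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=42)
surrogate.fit(features, labels == -1)
print(export_text(surrogate, feature_names=list(features.columns)))
```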

Feasibility (25%)

  • Your method shall provide some meaningful parameters for the client to tune the results: for example, options to turn off certain attributes, or options to accept certain ranges of attribute values (see the sketch after this list).

  • The method shall be scalable to 100k rows of data, in terms of both running time and memory cost.
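One hypothetical way to expose such knobs, with placeholder attribute names:

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

def detect_outliers(df, use_attributes=("Region", "Market", "ActivityType"),
                    cost_bounds=None, contamination=0.02):
    """Hypothetical tuning surface; attribute names are placeholders.

    use_attributes: attributes to include; drop one to "turn it off".
    cost_bounds:    optional (low, high) cost range always accepted.
    contamination:  assumed outlier fraction, controlling sensitivity.
    """
    features = pd.get_dummies(df[list(use_attributes)])
    features["Cost"] = df["Cost"]
    model = IsolationForest(contamination=contamination, random_state=42)
    flagged = pd.Series(model.fit_predict(features) == -1, index=df.index)
    if cost_bounds is not None:
        low, high = cost_bounds
        flagged &= ~df["Cost"].between(low, high)  # accepted range overrides
    return df[flagged]
```

Passing a narrower use_attributes tuple turns attributes off, while cost_bounds whitelists a cost range the client always accepts.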

Clarity (25%)

  • Please make sure your report is well-written.

  • Please make sure your code is well-structured and documented.

  • The code should be implemented using Python 3 only. The deliverable itself must be in Python ^3.7 and, if it has to use ML libraries, TensorFlow ^2.0.0.

  • Include discrete unit tests, with a test data dump, that re-validate model assumptions and outcomes.

ELIGIBLE EVENTS:

Topcoder Open 2019

Review style

Final Review: Community Review Board

Approval: User Sign-Off

ID: 30091600