Challenge Overview

Challenge Objectives

 
  • Implement an algorithm (model, training, and testing) for predicting the most likely buyer of a company on the market from a list of first-round bidders

  • Ideation challenge submissions are available as a starting point

 

Project Background

Our client, a global investment bank, is looking to build a predictive analytics algorithm to obtain characteristic insights from historic deal data taken from their CRM, matching potential closing bidders to assets based on their behavior in previous bidding processes.

 
  • The prediction algorithm will be used in the client's environment to predict the company most likely to close the deal

 

Technology Stack

 
  • The algorithm should be implemented using the Python, R, or .NET stack. If you want to use a different technology, ask for confirmation in the forums.

 

Data Description

Historic data is provided in two Excel sheets:

  • Bidder profiles - contains data about actual bids - buyer lists from previous transactions, including details such as firm size, revenue, industry, etc. for each bidding firm and the firm sold. The data also includes each bidding firm's progress through each round up to the final closing bid. This data is considered the ground truth. There is a total of 42 variables (columns).

  • Seller profiles - contains data about the firms sold in the bidding process - firm size, revenue, industry, etc. There is a total of 37 columns.

Explanations for all columns are provided in the forums.
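
For reference, here is a minimal loading sketch in Python (pandas). The file names below are placeholders - substitute the actual names of the Excel files provided in the forums:

```python
import pandas as pd

# File names are hypothetical placeholders for the two provided Excel sheets.
bidders = pd.read_excel("bidder_profiles.xlsx")   # bidder profiles, ~42 columns, one row per bid
sellers = pd.read_excel("seller_profiles.xlsx")   # seller profiles, ~37 columns, one row per firm sold

print(bidders.shape, sellers.shape)
# Inspect null rates per column before deciding how to handle missing values.
print(bidders.isnull().mean().sort_values(ascending=False).head(10))
```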

 

Prediction Requirements

 

The main goal is to predict the most likely buyer, given the first round bids. The bidding process for one company is identified by the "ENGAGEMENT_NUMBER__C" column in the data set. First round bids are the ones with ROUND__C="First"; closed deals are marked with ROUND__C="Closing".

For example - a sale of company X (engagement number 80314) had a total of 9 bids - 5 first round, 3 final round, and one closing bid. The input for the prediction would be the 5 first round bids, and the expected output is the likelihood of each of the 5 companies closing the deal.
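
A minimal sketch, in Python/pandas, of how the input and target for one engagement could be assembled from the bidder profiles (assuming the `bidders` data frame from the loading sketch above; the column names ENGAGEMENT_NUMBER__C and ROUND__C are taken from this spec):

```python
import pandas as pd

def engagement_example(bidders: pd.DataFrame, engagement):
    """Return the first-round bids (model input) and the closing bid (ground truth)
    for a single engagement."""
    deal = bidders[bidders["ENGAGEMENT_NUMBER__C"] == engagement]
    first_round = deal[deal["ROUND__C"] == "First"]    # candidate buyers to rank
    closing = deal[deal["ROUND__C"] == "Closing"]      # actual closer (ground truth)
    return first_round, closing

# Usage, with the engagement number from the example above:
# first_round, closing = engagement_example(bidders, 80314)
```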

 

Your prediction model will use historic data to predict the most likely buyers for a new company being sold, and it should have the following properties (a minimal model sketch follows the list):

  • Handles null values gracefully - there are a lot of null values in some columns of the ground truth data

  • Ranks the buyers on the list - predicts which buyer is most likely to close the deal

  • Algorithm output should show why the predicted likelihood is high/low (in terms of the different factors considered)
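
One possible shape for such a model is sketched below using scikit-learn's HistGradientBoostingClassifier (which accepts NaN values natively, so the frequent nulls do not require heavy imputation) plus permutation importance as a rough "why is this likelihood high or low" explanation. The feature matrix and labels are assumptions: X holds numeric features for first-round bids (non-numeric columns would need encoding first), and y marks whether a bid closed the deal.

```python
import pandas as pd
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.inspection import permutation_importance

def fit_ranker(X: pd.DataFrame, y: pd.Series) -> HistGradientBoostingClassifier:
    """Train a classifier on historic first-round bids; NaNs are handled internally."""
    model = HistGradientBoostingClassifier(max_iter=200, random_state=0)
    model.fit(X, y)
    return model

def rank_first_round(model, X_deal: pd.DataFrame) -> pd.Series:
    """Rank the first-round bidders of one engagement (1 = most likely closer)."""
    proba = model.predict_proba(X_deal)[:, 1]
    return pd.Series(proba, index=X_deal.index).rank(ascending=False, method="first")

def explain(model, X_val: pd.DataFrame, y_val: pd.Series) -> pd.Series:
    """Approximate per-feature influence, usable to report why a likelihood is high or low."""
    imp = permutation_importance(model, X_val, y_val, n_repeats=5, random_state=0)
    return pd.Series(imp.importances_mean, index=X_val.columns).sort_values(ascending=False)
```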

 

Scoring

Predicting the closing bidder correctly is very important, but an algorithm that predicts the actual closer as the second most likely closer is better than an algorithm that predicts that same closer as fifth, so the scoring function takes that into consideration:

Score = w * (1 - (predicted rank of actual closer - 1) / (number of first round bids - 1))

where w = 1 + (number of first round bids / n) and n = 10. The maximum score (correct prediction with 10 first round bids) is 2; the minimum score is 0 (the correct closing bidder is predicted to be last).
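
A direct Python transcription of this scoring function, useful for local validation (variable names are mine; it assumes at least two first-round bids):

```python
def bid_score(predicted_rank: int, n_first_round_bids: int, n: int = 10) -> float:
    """Score one engagement. predicted_rank is the rank (1 = most likely) that the
    model assigned to the actual closing bidder among the first-round bidders."""
    w = 1 + n_first_round_bids / n
    return w * (1 - (predicted_rank - 1) / (n_first_round_bids - 1))

# Consistent with the text: 10 first-round bids, closer ranked first -> 2.0; ranked last -> 0.0.
assert bid_score(1, 10) == 2.0
assert bid_score(10, 10) == 0.0
```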

 

We have provided the ideation challenge submissions in the challenge forums. You should use them as a starting point for the algorithm implementation. You are free to combine ideas from the submissions or add your own modifications. Here are a few points that will probably have a big effect on accuracy:

  1. Financial and strategic bidders contain different information, especially in terms of financial data. So when treating NULL values, simply dropping fields that are less than 50% filled will result in the loss of all financial data fields, because each such field is typically filled ONLY for strategic bidders OR for financial bidders; these fields are important when they are filled, so they cannot simply be dropped.

  2. Creating a separate model for each industry or industry group (as suggested by one of the ideation challenge submissions) is likely to help.

  3. The features introduced to track a buyer's bid history should be very useful for prediction (a simple sketch follows this list).
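
As an illustration of point 3, a hedged sketch of simple per-buyer history features; the buyer identifier column name (BUYER_NAME) is a placeholder, and the real column names are documented in the forums:

```python
import pandas as pd

def buyer_history_features(bidders: pd.DataFrame, buyer_col: str = "BUYER_NAME") -> pd.DataFrame:
    """Aggregate each buyer's past bidding behavior: how often they bid and how often they closed."""
    closed = (bidders["ROUND__C"] == "Closing").astype(int)
    hist = (bidders.assign(closed=closed)
                   .groupby(buyer_col)
                   .agg(n_bids=("ENGAGEMENT_NUMBER__C", "nunique"),
                        n_closed=("closed", "sum")))
    hist["close_rate"] = hist["n_closed"] / hist["n_bids"]
    return hist

# Join these back onto the first-round bid rows, ideally computed only from engagements
# earlier in time than the one being predicted, to avoid leakage.
```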

 

Submissions will be evaluated based on the following criteria:

  • Accuracy (average score) - 60%

  • Code review (quality, documentation, correct usage of libraries, etc.) - 20%

  • Documentation - 20%

 



Final Submission Guidelines

  • Code

  • Algorithm description document - an overview of the features used from the ideation challenge submissions or your own additions to the final model, plus a discussion section explaining the achieved accuracy and possible improvements

  • README file with details on how to deploy and test the submission with the provided data set

ELIGIBLE EVENTS:

Topcoder Open 2019

REVIEW STYLE:

Final Review:

Community Review Board

Approval:

User Sign-Off

ID: 30069738