DEBITRON - Data Models for compulsory debits

The challenge is finished.

Challenge Overview

CONTEXT
On a daily basis, the bank's Collections teams run a report to identify customers who are overdue on one or more products (e.g. loans, credit cards, mortgages) AND, at the same time, have enough balance in their savings or checking accounts to pay.
This list of cases (let's call it the "DailyDataBase", or DDB) varies in volume, typically from 800 to 2000 records per day. In summary, the DDB lists customers who owe a certain amount AND have enough balance at the moment the DDB is generated (the beginning of the day).
After the DDB is generated, it goes to a job executed by a Bot, which picks each case one by one, goes to the core banking system and debits the due amount from the customer's account. In this way the Bot clears the customers' debt.

We have found that about 25% to 40% of the collectable amount in the DDB cannot be processed by the Bot, simply because at the moment it picks the case, the savings or checking account no longer has enough balance (let's call those cases SD, or "Slippery Dude").
Guess what? The DDB has no specific prioritization of cases (the bank has tried simple sorts, such as by amount from largest to smallest, but with little impact).
 
CHALLENGE OVERVIEW
This contest is looking for the best data scientists and developers to create a model that can be trained on historical information and can then classify and prioritize our daily DDB based on its predictions, in a way that maximizes the daily collection and minimizes the SDs.
Please note:
  • The outcome of the challenge needs to be a trainable model in Python
  • It needs to be easily deployable on our customer's premises (please indicate any specific requirements or step-by-step instructions to facilitate the implementation of your model)
  • You can use the algorithm of your preference, as long as it contributes to the expected outcome
  • We are providing a sample data set with some dummy cases and all variables (CompiledDDBs_Challenge v3 II.xlsx, attached in the forums)
  • We are providing a data dictionary with the description of each variable (DataDictionary_Challenge_v2.xlsx, attached in the forums)
  • We encourage you to use the highlighted variables in your model (our Data Scientist's recommendation); it is up to you how you handle them in your model
  • If you are strongly convinced that you need a non-highlighted variable, please make sure you explain the reason
  • The DDB is always an Excel file (you don't need to worry about multiple sources)
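
As an illustration of what "prioritize based on prediction" could mean in practice, here is a minimal sketch that orders cases by expected collected amount (estimated probability of a successful debit multiplied by the due amount). The `Case` fields `case_id`, `due_amount` and `p_approved` are hypothetical names, not columns from the provided data set, and the probabilities are made-up values standing in for the output of whatever model you train.

```python
from dataclasses import dataclass


@dataclass
class Case:
    case_id: str       # hypothetical identifier, not a column from the DDB
    due_amount: float  # hypothetical name for the amount to be debited
    p_approved: float  # model-estimated probability that the debit succeeds


def prioritize(cases):
    """Return cases ordered by expected collected amount, highest first."""
    return sorted(cases, key=lambda c: c.p_approved * c.due_amount, reverse=True)


# Made-up example values; a real run would take p_approved from a trained model.
cases = [
    Case("A", 1000.0, 0.20),  # expected amount: 200
    Case("B", 300.0, 0.90),   # expected amount: 270
    Case("C", 500.0, 0.50),   # expected amount: 250
]

ordered_ids = [c.case_id for c in prioritize(cases)]
print(ordered_ids)
```

Note that a plain sort by amount would put case A first, whereas this ordering puts it last because its debit is unlikely to succeed. Whether expected amount is the right ranking key is a design choice left to the participant.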
 
DETAILS
We are providing a sample data set of compiled DDBs, updated AFTER the Bot processed them: you will find a BOT_JOB variable with the values APPROVED/REJECTED, which indicates whether the Bot was able to debit the amount or failed due to insufficient balance.
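
A minimal sketch of how BOT_JOB could be turned into a binary training label, and how the rejected share of the collectable amount could be measured on historical data. Only BOT_JOB and its two values come from the description above; the `DUE_AMOUNT` key and the row values are hypothetical, so check the data dictionary for the real column names.

```python
def to_label(bot_job):
    """Map BOT_JOB to 1 (debit succeeded) or 0 (insufficient balance)."""
    value = bot_job.strip().upper()
    if value == "APPROVED":
        return 1
    if value == "REJECTED":
        return 0
    raise ValueError(f"unexpected BOT_JOB value: {bot_job!r}")


# Made-up rows; DUE_AMOUNT is a hypothetical column name.
rows = [
    {"BOT_JOB": "APPROVED", "DUE_AMOUNT": 600.0},
    {"BOT_JOB": "REJECTED", "DUE_AMOUNT": 300.0},
    {"BOT_JOB": "APPROVED", "DUE_AMOUNT": 100.0},
]

labels = [to_label(r["BOT_JOB"]) for r in rows]

# Share of the collectable amount that the Bot could not debit.
total_amount = sum(r["DUE_AMOUNT"] for r in rows)
rejected_amount = sum(
    r["DUE_AMOUNT"] for r, label in zip(rows, labels) if label == 0
)
rejected_share = rejected_amount / total_amount
print(labels, rejected_share)
```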
 
EVALUATION
The model will be evaluated based on:
  • Maximization of the collected amount, as the first criterion
  • Minimization of the SDs, as the second criterion
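
The two criteria can be read as a lexicographic comparison: collected amount first, SD count as the tie-breaker. The sketch below is one possible interpretation, not the official scoring procedure. It assumes that each candidate ordering is scored on its first `top_k` cases (a hypothetical cut-off representing how many cases the Bot gets to process), and that each case is given as a `(due_amount, approved)` pair taken from historical outcomes.

```python
def score(ordered_outcomes, top_k=None):
    """Score a prioritized list of (due_amount, approved) pairs.

    Returns (collected_amount, -sd_count) so that a larger tuple is better:
    tuples compare on collected amount first, then on fewer SDs.
    """
    processed = ordered_outcomes if top_k is None else ordered_outcomes[:top_k]
    collected = sum(amount for amount, approved in processed if approved)
    sd_count = sum(1 for _, approved in processed if not approved)
    return (collected, -sd_count)


# The same three made-up cases in two different priority orders.
order_a = [(500.0, True), (200.0, False), (300.0, True)]
order_b = [(500.0, True), (300.0, True), (200.0, False)]

score_a = score(order_a, top_k=2)
score_b = score(order_b, top_k=2)
best = "order_b" if score_b > score_a else "order_a"
print(score_a, score_b, best)
```

Without a cut-off (`top_k=None`) both orderings score the same on historical data, because the recorded outcomes do not depend on the order. The cut-off is what makes the ranking matter in this simplified setup.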


Final Submission Guidelines

Submit complete source code for training and evaluating the model.
Submit a deployment guide, verification guide and sample outputs.
Submit a short document explaining your approach.

ELIGIBLE EVENTS:

2018 Topcoder(R) Open

Review style

Final Review

Community Review Board

Approval

User Sign-Off

ID: 30060893