The challenge is finished.

Challenge Overview

Data for this challenge can be downloaded here.
 

The objective of this challenge is to analyze two data sets and identify the discrepancy between them. Participants may use any models or technology to explain shoots of Unaccounted for Energy (UFE) above 1% and to provide a root cause analysis of those shoots.

The days with UFE shoots run from January 15 to February 2, 2017. During this period the measured usage was between 1% and 3% lower than the usage measured by the regional Bulk Electric Independent System Operator (ISO).

 

Data Set 1:

One data set contains hourly readings from individual meters for approximately 75 days, from December 2016 to February 15, 2017.

 

Data Set 2:  

The other data set contains hourly readings from the regional Bulk Electric Independent System Operator (ISO) for the same period as Data Set 1. It is supposed to be the combined hourly sum of all the individual meters minus Unaccounted for Energy (UFE).

 

Datasets

The following details the enclosed datasets to be used in your investigation. The record layout for the data sets is described in a PDF included with the data.

Independently-Measured Hourly Usage

This dataset may be found in the accompanying file, INDEPENDENTLY_MEASURED_HOURLY_USAGE.txt. It shows the hourly usage as measured by a different method using different, independent data from 15 January 2017 to 15 February 2017. This is considered the valid measurement of usage for this time frame and forms the basis of comparison in the investigation.

Meter-Measured Hourly Usage

This dataset may be found in the accompanying file, METER_MEASURED_HOURLY_USAGE.txt. Each record represents a single day’s worth of usage (divided by hour) for a unique location among approximately 1.45 million locations. As with the independently-measured hourly usage dataset, the date range is from 15 January 2017 to 15 February 2017. The period from 15 January 2017 to 2 February 2017 is the area of concern for the investigation, while 3 February 2017 to 14 February 2017 illustrates what is expected. Data for 15 February 2017 should be disregarded because of a data extraction issue.
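Because each meter record holds one day of hourly values for one location, the records have to be summed across locations before they can be compared with the independently measured series. The following is a minimal sketch of that aggregation, not the official record layout: the tuple shape `(location_id, date, [24 hourly values])`, the function name `hourly_totals`, and the sample values are all assumptions made for illustration. The actual layout is described in the PDF included with the data.

```python
from collections import defaultdict

def hourly_totals(records):
    """Sum per-location daily records into system-wide hourly totals.

    Assumed record shape (hypothetical): (location_id, date, [24 hourly values]).
    Returns a dict keyed by (date, hour) with the summed usage.
    """
    totals = defaultdict(float)
    for _location_id, day, hourly in records:
        if len(hourly) != 24:
            raise ValueError(f"expected 24 hourly values, got {len(hourly)}")
        for hour, usage in enumerate(hourly):
            totals[(day, hour)] += usage
    return dict(totals)

# Illustrative records only; these are not values from the challenge data.
sample = [
    ("LOC-1", "2017-01-15", [1.0] * 24),
    ("LOC-2", "2017-01-15", [2.0] * 24),
    ("LOC-1", "2017-01-16", [0.5] * 24),
]
meter_totals = hourly_totals(sample)
```

With roughly 1.45 million locations per day, the real file is probably best streamed record by record rather than loaded into memory at once. The accumulator above works with any iterable, including a generator that parses the file line by line.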

UFE Calculation:

The Unaccounted for Energy (UFE) is the percentage difference between the two sets of readings. Specifically, the sum of the hourly individual meter data minus the hourly combined reading from the ISO, expressed as a percentage difference, is considered UFE. In January 2017, an issue with UFE emerged. The days with UFE shoots run from January 15 to February 2, 2017. During this period the measured usage was between 1% and 3% lower than the usage measured by the ISO.
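The calculation above can be sketched as follows. The brief does not state which reading serves as the denominator of the percentage, so this sketch assumes the ISO reading, since the brief treats it as the valid measurement. The function names `ufe_percent` and `flag_shoots`, the `(date, hour)` keys, and the sample numbers are hypothetical. With this sign convention, meter usage below the ISO reading gives a negative UFE, so the 1% threshold is applied to the absolute value.

```python
def ufe_percent(meter_sum, iso_reading):
    """Signed UFE in percent: (meter sum - ISO reading) relative to the ISO reading.

    Using the ISO reading as the denominator is an assumption of this sketch.
    """
    if iso_reading == 0:
        raise ValueError("ISO reading is zero; percentage is undefined")
    return (meter_sum - iso_reading) / iso_reading * 100.0

def flag_shoots(meter_by_hour, iso_by_hour, threshold=1.0):
    """Return {key: ufe} for the keys where |UFE| exceeds `threshold` percent."""
    flagged = {}
    # Only compare hours that are present in both series.
    for key in meter_by_hour.keys() & iso_by_hour.keys():
        ufe = ufe_percent(meter_by_hour[key], iso_by_hour[key])
        if abs(ufe) > threshold:
            flagged[key] = ufe
    return flagged

# Illustrative numbers only; these are not values from the challenge data.
meter = {("2017-01-15", 0): 980.0, ("2017-02-03", 0): 998.0}
iso = {("2017-01-15", 0): 1000.0, ("2017-02-03", 0): 1000.0}
shoots = flag_shoots(meter, iso)
```

In this example only the 15 January hour is flagged, because its meter total is 2% below the ISO reading, while the 3 February hour differs by 0.2%.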

Challenge Game Plan & Scope

The purpose of this challenge is to analyze the data and understand the root cause of UFE shoots above 1%.

Setup

  1. We will be providing two sets of data to participants.
  2. Participants should utilize R or Python 3 to prepare any code used in the processing.
  3. Please develop a hypothesis, based on a working model, that explains the 1-3% discrepancy between the meter-measured usage and the usage measured by the ISO during the UFE shoots.
  4. Provide links to any supporting external data used.
  5. Your submission should include a summary document outlining your findings and research. The usage of graphics to explain your findings is highly encouraged.
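One possible starting point for a hypothesis is to compare the period of concern (15 January to 2 February 2017) with the period that illustrates expected behaviour (3 February to 14 February 2017), leaving out 15 February 2017. The sketch below only shows how dates might be grouped into those windows and summarised. It assumes a daily UFE percentage has already been computed; `window_of`, `mean_ufe_by_window`, and the sample values are hypothetical and are not challenge results.

```python
from datetime import date
from statistics import mean

# Window boundaries taken from the dataset description in the brief.
CONCERN_START, CONCERN_END = date(2017, 1, 15), date(2017, 2, 2)
BASELINE_START, BASELINE_END = date(2017, 2, 3), date(2017, 2, 14)

def window_of(day):
    """Label a date as 'concern', 'baseline', or None (excluded)."""
    if CONCERN_START <= day <= CONCERN_END:
        return "concern"
    if BASELINE_START <= day <= BASELINE_END:
        return "baseline"
    return None  # e.g. 15 February 2017, which the brief says to disregard

def mean_ufe_by_window(daily_ufe):
    """Average the daily UFE (percent) within each window."""
    groups = {"concern": [], "baseline": []}
    for day, ufe in daily_ufe.items():
        label = window_of(day)
        if label is not None:
            groups[label].append(ufe)
    return {label: mean(values) for label, values in groups.items() if values}

# Illustrative values only; these are not values from the challenge data.
daily_ufe = {
    date(2017, 1, 20): -2.0,
    date(2017, 1, 21): -3.0,
    date(2017, 2, 5): -0.5,
    date(2017, 2, 15): -9.0,  # excluded from both windows
}
summary = mean_ufe_by_window(daily_ufe)
```

The same grouping could be applied per location or per group of locations to see where usage in the concern window departs most from the baseline, which may help narrow down a root cause.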


Final Submission Guidelines

  1. Follow the Topcoder standard submission guidelines.

  2. Submit the solution along with the steps to create the model or solution.

  3. Include links to any hosted online solutions that can help the reviewer.

    The submissions will be evaluated subjectively by Topcoder staff based on the rigor of the models, the depth of the analysis provided, and the quality of the supporting materials. The submissions will be ranked in order of quality according to these criteria.

ELIGIBLE EVENTS: 2018 Topcoder(R) Open

Review style

Final Review: Community Review Board

Approval: User Sign-Off

ID: 30064483