Challenge Overview
Problem Statement | |||||||||||||
Prizes The best 5 performers of this contest (according to system test results) will receive the following prizes:
Background OCR of images captured with mobile phone cameras presents a number of algorithmic challenges not encountered with conventionally scanned documents. One of these is that it is very difficult to take a picture with no perspective distortion or rotation, to which OCR algorithms are typically extremely sensitive. It is therefore necessary to carry out a pre-processing step in which any perspective distortion and rotation is removed. The current IDOL OnDemand implementation of OCR relies primarily on finding the border of the document in order to compute the perspective transform. This requires that the border is in view and clearly contrasted against the background. In practice this has been found to be difficult to achieve. This challenge is to implement an algorithm to compute the transform from the text alone, without the need for a border.
Overview You will need to write an algorithm to compute the homography mapping from a supplied image onto a deskewed image, where the supplied image will contain only text and the deskewed image should be free from perspective distortion and rotation. This will be tested against a set of images containing text fragments of different sizes, ranging from whole page to a few words, in different fonts and sizes. The winner will be the entry with the best IDOL OnDemand OCR accuracy achieved on the deskewed images. The scoring section and visualizer tool provide more detail about the scoring. You can test your algorithm locally with a tool that can be downloaded here. You need to sign up for IDOL OnDemand and get an API Key, visit idolondemand.topcoder.com to learn how to get your free API Key.
Implementation and Data Your task is to implement a deskewing method, whose signatures are detailed in the Definition section below. int[] imageData contains the RGB values of all pixels in the row-first order. The size of the photo is defined by numRows and numCols. The length of imageData equals to 3 * numRows * numCols. Let y be the row and x be the column, its Red value is stored in imageData[(y * numCols + x)*3 + 0]. Its green value is stored in imageData[(y * numCols + x)*3 + 1]. And its blue value is stored in imageData[(y * numCols + x)*3 + 2]. All RGB values are integers and lie in [0, 255]. The deskewed method should return a new image whose size is the same as the original one. The format of the returned image is also a int[] of RGB values, the same as the input.
Testing and scoring There are 10 examples, 30 provisional tests, and at least 200 system tests. Please note that the example data provided contains 100 example test cases, when you perform an example submission, only the first 10 example test cases will be executed. You can test with all 100 example test cases locally. For each image, we have its ground truth text. The deskewed image is a homography mapping from the supplied image onto the deskewed image. The scoring algorithm will first run HP IDOL OnDemand OCR for the deskewed image. After that, we count the number of extracted words of 3+ characters in the output that are also found in the ground truth text. Scoring is calculated based on precision and recall for each test case as follows:
Your score for a single test case will be 1000000 * 2 * TP / ( 2 * TP + FP + FN). You can see these scores for example test cases when you make example test submissions. If your solution fails to produce a proper return value, your score for this test case will be 0. The overall score on a set of test cases is the arithmetic average of scores on single test cases from the set. The match standings displays overall scores on provisional tests for all competitors who have made at least 1 full test submission. The winners are competitors with the highest overall scores on system tests.
An offline tester/visualizer tool is available.
Special rules and conditions
| |||||||||||||
Definition | |||||||||||||
| |||||||||||||
Notes | |||||||||||||
- | Memory limit is 2048MB. Time limit is 120 seconds per test case which includes only the time spent in your code. | ||||||||||||
- | There is no explicit code size limit. The implicit source code size limit is around 1 MB (it is not advisable to submit codes of size close to that or larger). | ||||||||||||
- | The compilation time limit is 60 seconds. You can find information about compilers that we use, compilation options and processing server specifications here. | ||||||||||||
Examples | |||||||||||||
0) | |||||||||||||
| |||||||||||||
1) | |||||||||||||
| |||||||||||||
2) | |||||||||||||
| |||||||||||||
3) | |||||||||||||
| |||||||||||||
4) | |||||||||||||
| |||||||||||||
5) | |||||||||||||
| |||||||||||||
6) | |||||||||||||
| |||||||||||||
7) | |||||||||||||
| |||||||||||||
8) | |||||||||||||
| |||||||||||||
9) | |||||||||||||
|
This problem statement is the exclusive and proprietary property of TopCoder, Inc. Any unauthorized use or reproduction of this information without the prior written consent of TopCoder, Inc. is strictly prohibited. (c)2020, TopCoder, Inc. All rights reserved.