Challenge Overview
Goal
Last fall we ran the OCR Optimization Marathon Match to identify specific phrases in a set of Mud Log images from the oil and gas industry. The phrases concerned possible occurrences of hydrocarbons, such as "show", "flor", or "stn". That previous challenge may give you a better understanding of the background.
One of the keys to processing these documents is being able to interpret their depth sections. Almost none of the information in a Mud Log file is meaningful if you don't know where in the wellbore the reading was generated. The “Depth & Core” column, shown below, is the depth section we are talking about.
In this challenge, given a Mud Log image, you are going to identify the bounding boxes of the depth information sections as well as the actual depth values.
Data
We have manually annotated 30 Mud Log images. Annotations for 20 of them are released as training data, in a CSV file with one row per depth information section and the following 7 columns:
- File: the filename of the image.
- Depth Literal Value (integer): This is the OCR result of the depth value from the image.
- Depth (integer): This is the real depth value. In the Mud Log, the depth may not be fully written out. For example, after the depth 100, a depth literal value of 10 actually means a depth of 110 (see the reconstruction sketch after this list).
- X1 (integer): the left boundary of the section in pixels.
- Y1 (integer): the upper boundary of the section in pixels.
- X2 (integer): the right boundary of the section in pixels.
- Y2 (integer): the lower boundary of the section in pixels.
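The mapping from literal values to real depths can be reconstructed programmatically. The exact rule is not spelled out here, so treat the carry-over logic below as an assumption: each literal is read as the low-order digits of the smallest depth that is at least the previous depth. A minimal Java sketch:

```java
/**
 * Reconstructs a full depth from a partially written literal value.
 * Assumption (not stated in the spec): the literal gives the low-order
 * digits of the smallest depth that is >= the previous depth.
 * Pass the literal as a string so its digit count (leading zeros) survives.
 */
static long resolveDepth(long previousDepth, String literal) {
    long value = Long.parseLong(literal);
    long modulus = (long) Math.pow(10, literal.length()); // "10" -> 100
    long candidate = previousDepth - (previousDepth % modulus) + value;
    if (candidate < previousDepth) {
        candidate += modulus; // carry into the next block, e.g. after 195, "00" -> 200
    }
    return candidate;
}
```

Under this rule, resolveDepth(100, "10") returns 110, matching the example in the column description above.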
For the remaining 10 images, we only provide you the images without any annotations. You are asked to produce a CSV file following the same format.
Please note that the bounding boxes must not overlap each other, although they may touch at the edges.
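Since, as described under Evaluation, any overlap between your predicted boxes zeroes the per-image score, it is worth validating your output before submitting. A minimal Java check, assuming each box is stored as {X1, Y1, X2, Y2} and that touching edges count as zero overlap:

```java
import java.util.List;

/** Overlap area of two axis-aligned boxes {x1, y1, x2, y2}; touching edges give 0. */
static long overlapArea(int[] a, int[] b) {
    long w = Math.min(a[2], b[2]) - Math.max(a[0], b[0]);
    long h = Math.min(a[3], b[3]) - Math.max(a[1], b[1]);
    return (w > 0 && h > 0) ? w * h : 0;
}

/** True if no pair of predicted boxes has a non-zero overlap. */
static boolean noOverlaps(List<int[]> boxes) {
    for (int i = 0; i < boxes.size(); i++)
        for (int j = i + 1; j < boxes.size(); j++)
            if (overlapArea(boxes.get(i), boxes.get(j)) > 0) return false;
    return true;
}
```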
Environment
Ubuntu 16.04 / 18.04, 64-bit
Tesseract v4
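The platform provides Tesseract v4; if you drive it from Java (the required submission language), the Tess4J wrapper is one common option. This is a suggestion, not part of the challenge environment spec, and the data path below matches the Ubuntu 18.04 package layout, so adjust as needed:

```java
import java.io.File;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

public class OcrDemo {
    public static void main(String[] args) throws TesseractException {
        Tesseract tesseract = new Tesseract();
        tesseract.setDatapath("/usr/share/tesseract-ocr/4.00/tessdata"); // adjust to your install
        tesseract.setLanguage("eng");
        // Depth columns are numeric, so a digit whitelist can help; note that
        // in Tesseract 4 whitelists only apply to the legacy (non-LSTM) engine.
        tesseract.setTessVariable("tessedit_char_whitelist", "0123456789");
        String text = tesseract.doOCR(new File("mudlog.png")); // hypothetical file name
        System.out.println(text);
    }
}
```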
Evaluation
We run a separate evaluation for each image and use the average of the per-image scores as the final score. For each image, the evaluation consists of two parts:
- Can you find the sections accurately?
- Can you identify the depth accurately?
Suppose you have N predicted sections, while the annotations have M ground truth sections. We first check whether any of your predicted bounding boxes have non-zero overlaps with each other; if so, your score on this image will be 0. Otherwise, the score is calculated as follows.
First of all, the numbers of sections should be similar.
Score1 = max(0, 1 - |N - M| / M)
Second, the bounding boxes of the depth information sections should be accurate. Each ground truth bounding box is greedily matched to the unmatched predicted bounding box with the largest positive overlap; if no unmatched predicted bounding box remains, it matches nothing. The greedy matching runs from the top (smaller y) to the bottom (larger y).
Score2 = TotalOverlaps / TotalAreaOfGroundTruth
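A Java sketch of this matching and of Score2, under our reading of the description (ground truths assumed already sorted top to bottom, reusing the overlapArea helper from the Data section):

```java
import java.util.Arrays;
import java.util.List;

/**
 * Greedy matching: each ground truth (sorted top-to-bottom by Y1) takes the
 * unmatched prediction with the largest positive overlap.
 * Returns match[i] = prediction index for ground truth i, or -1 if unmatched.
 */
static int[] greedyMatch(List<int[]> truths, List<int[]> preds) {
    int[] match = new int[truths.size()];
    Arrays.fill(match, -1);
    boolean[] used = new boolean[preds.size()];
    for (int i = 0; i < truths.size(); i++) {
        long best = 0;
        for (int j = 0; j < preds.size(); j++) {
            long ov = overlapArea(truths.get(i), preds.get(j));
            if (!used[j] && ov > best) { best = ov; match[i] = j; }
        }
        if (match[i] >= 0) used[match[i]] = true;
    }
    return match;
}

/** Score2 = total matched overlap / total ground truth area. */
static double score2(List<int[]> truths, List<int[]> preds, int[] match) {
    double totalOverlap = 0, totalTruthArea = 0;
    for (int i = 0; i < truths.size(); i++) {
        int[] t = truths.get(i);
        totalTruthArea += (long) (t[2] - t[0]) * (t[3] - t[1]);
        if (match[i] >= 0) totalOverlap += overlapArea(t, preds.get(match[i]));
    }
    return totalOverlap / totalTruthArea;
}
```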
Last but not least, we consider the relative error of the Depth. Note that we are NOT comparing the depth literal values. We reuse the matching between ground truth and predicted bounding boxes established for Score2. For each ground truth bounding box, we calculate the relative accuracy:
Relative Accuracy = max(0, 1 - |PredictedDepth - TruthDepth| / TruthDepth)
When the ground truth bounding box has no matched prediction, the Relative Accuracy is defined as 0.
Score3 = average(Relative Accuracy) for all ground truth sections.
Finally, Score = (Score1 * Score2 * Score3)^(1/3), the geometric mean of the three. All three scores lie between 0 and 1; higher is better.
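Putting the pieces together, the per-image score can be sketched as follows (reusing noOverlaps, greedyMatch, and score2 from the earlier sketches; the depth arrays are assumed to be indexed in the same order as the corresponding boxes):

```java
import java.util.List;

/** Per-image score under our reading of the rules above. */
static double imageScore(List<int[]> truths, long[] truthDepths,
                         List<int[]> preds, long[] predDepths) {
    if (!noOverlaps(preds)) return 0;                      // overlapping predictions
    int m = truths.size(), n = preds.size();
    double s1 = Math.max(0, 1.0 - Math.abs(n - m) / (double) m);
    int[] match = greedyMatch(truths, preds);
    double s2 = score2(truths, preds, match);
    double s3 = 0;
    for (int i = 0; i < m; i++) {
        if (match[i] < 0) continue;                        // unmatched: accuracy 0
        double relErr = Math.abs(predDepths[match[i]] - truthDepths[i])
                        / (double) truthDepths[i];
        s3 += Math.max(0, 1 - relErr);
    }
    s3 /= m;
    return Math.cbrt(s1 * s2 * s3);                        // geometric mean
}
```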
Some KEY restrictions:
- If your methods involve any parameter tuning, it must be done on the provided training set only.
- Annotating the testing images yourself is explicitly prohibited. We will inspect your submission in depth.
Final Submission Guidelines
You are going to submit a file named “test.csv” that contains depth information only for the testing images. It must follow the same format as “train.csv”. Any format error may lead to a score of -1. Besides the CSV, you will need to submit your code and a report. Details are as follows.
- Your source code must be in Java.
- You should provide a list of dependencies and instructions about how to install them.
- You should allow us to easily change the training and testing files of the same format. For example, the training and testing file names could be a part of your code’s parameters.
- You should provide a document describing how you designed your method. For example, you can discuss the models and features you developed or tried, and justify your final choice.
- You should provide instructions for reproducing your results. If you cannot avoid some randomness, please explain why.