Challenge Overview
Challenge Objectives
- Create the database models required to store the extracted data.
- Write code that pulls the latest data from each endpoint.
- Unit test the entities using CRUD operations.
The client's commercial negotiation team has to research and review many sources, such as news and media articles, documents, and regulatory websites, to stay apprised of current activities. Manually reviewing and organizing these sources is time-consuming, so it is not done comprehensively or at regular intervals. Compiling this data in an automated repository would keep the team better informed about current activities when it meets with counterparts. The data would also let the team proactively review permit requests for new wells as they are submitted, craft a commercial solution for tie-back of projects in proximity to one of the client's existing platforms, and gain a competitive advantage over other infrastructure in the region by reaching out early to the potential customer to understand the project's needs from the earliest stages of project design.
The ultimate aim of the project is to build a repository that offers a Google-style search, so that all the information needed for a commercial deal can be found in one place. The repository will be built using techniques such as web crawling or pulling reports from the BSEE regulatory website, reading PDF documents, and NLP (natural language processing) to match search terms accurately. This release focuses on extraction from BSEE only.
Technology Stack
- Java
- Spring
- JPA
- MS SQL Server
Database Entities
Create the database entities required to store the data extracted from the following BSEE regulatory website endpoints:
- Application for Permits to Drill - https://www.data.bsee.gov/Well/APD/Default.aspx
- Application for Permits to Drill, eWell APD online query (similar to the APD query) - https://www.data.bsee.gov/Well/eWellAPD/Default.aspx
- Exploration and Development Plans - https://www.data.bsee.gov/Plans/Plans/Default.aspx
- Planned Site (delimited file download) - https://www.data.bsee.gov/Plans/Files/plandelimit.zip
- Scanned Plans - https://www.data.bsee.gov/Other/DiscMediaStore/ScanPlans.aspx
- Platform Structures - https://www.data.bsee.gov/Platform/PlatformStructures/Default.aspx
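The Planned Site entry in the list above is a zipped delimited file rather than a query page, so it can be read without scraping. The following is a minimal sketch of that step, not a definitive implementation. The class name `DelimitedZipReader`, the comma delimiter, and the UTF-8 encoding are all assumptions; the actual layout of `plandelimit.zip` should be checked before relying on any of them. The naive `split` also does not handle quoted fields that contain the delimiter.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper: turns a zipped delimited download into rows of fields. */
class DelimitedZipReader {

    /**
     * Reads every non-blank line of every file in the zip and splits it on the
     * given delimiter (a regular expression; "," is an assumed default).
     */
    static List<String[]> readRows(InputStream zipStream, String delimiterRegex) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                // Deliberately not closed here: closing the reader would close the whole zip stream.
                BufferedReader reader =
                        new BufferedReader(new InputStreamReader(zip, StandardCharsets.UTF_8));
                String line;
                while ((line = reader.readLine()) != null) {
                    if (!line.trim().isEmpty()) {
                        // Limit -1 keeps trailing empty fields so column positions stay stable.
                        rows.add(line.split(delimiterRegex, -1));
                    }
                }
            }
        }
        return rows;
    }
}
```

In a real extraction job the `InputStream` would come from an HTTP download of the zip URL, and each row would then be mapped to the corresponding entity.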
Data Extraction Code
Provide Java code that pulls the latest data from each endpoint and stores it in the MS SQL Server database.
Your code should detect new or changed information and insert or update only the new or changed records. Dropping and reloading the data for each endpoint (a truncate-and-load approach) is not acceptable.
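One possible way to meet the insert/update-only requirement is to store a content hash next to each record and compare it with the hash of the freshly extracted row. The sketch below illustrates that idea under stated assumptions: `ChangeDetector`, its `Action` enum, and the notion of a single "natural key" per record are hypothetical, and the in-memory map stands in for a key and hash column pair that a real submission would read from MS SQL Server.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical change detector: decides per record whether to insert, update, or skip. */
class ChangeDetector {

    enum Action { INSERT, UPDATE, SKIP }

    // Stand-in for the database: natural key -> hash of the last stored version.
    private final Map<String, String> storedHashes = new HashMap<>();

    /** Hashes all field values in order, so a change in any field changes the hash. */
    static String hash(List<String> fields) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (String field : fields) {
                // Null is treated as empty; the separator keeps ("ab","c") distinct from ("a","bc").
                digest.update((field == null ? "" : field).getBytes(StandardCharsets.UTF_8));
                digest.update((byte) 0x1F);
            }
            return Base64.getEncoder().encodeToString(digest.digest());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    /** Classifies a freshly extracted record and remembers its hash for the next run. */
    Action classify(String naturalKey, List<String> fields) {
        String newHash = hash(fields);
        String oldHash = storedHashes.get(naturalKey);
        if (oldHash == null) {
            storedHashes.put(naturalKey, newHash);
            return Action.INSERT;
        }
        if (!oldHash.equals(newHash)) {
            storedHashes.put(naturalKey, newHash);
            return Action.UPDATE;
        }
        return Action.SKIP;
    }
}
```

With this shape, the extraction job only issues a write for records classified as `INSERT` or `UPDATE`; unchanged rows are never touched, which avoids the truncate-and-load pattern. Choosing the natural key for each endpoint is left open here because it depends on the actual BSEE data.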
Unit Testing
Provide unit tests performing CRUD operations on the entities created.
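As a rough illustration of what a CRUD test covers, the sketch below walks one entity through create, read, update, and delete. It is framework-free so it runs on a plain JDK; an actual submission would more likely use a test framework and a JPA-backed repository. Everything named here is hypothetical: the `PlatformStructure` fields are illustrative rather than the BSEE schema, and `InMemoryPlatformRepository` only borrows the method names `save`, `findById`, `deleteById`, and `count` from Spring Data's `CrudRepository`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical entity; the fields are placeholders, not the real BSEE columns. */
class PlatformStructure {
    final String complexId;
    String areaCode;

    PlatformStructure(String complexId, String areaCode) {
        this.complexId = complexId;
        this.areaCode = areaCode;
    }
}

/** In-memory stand-in for a repository backed by the database. */
class InMemoryPlatformRepository {
    private final Map<String, PlatformStructure> rows = new HashMap<>();

    PlatformStructure save(PlatformStructure entity) {
        rows.put(entity.complexId, entity);
        return entity;
    }

    Optional<PlatformStructure> findById(String id) {
        return Optional.ofNullable(rows.get(id));
    }

    void deleteById(String id) {
        rows.remove(id);
    }

    long count() {
        return rows.size();
    }
}

class PlatformCrudCheck {

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    /** Exercises create, read, update, and delete in order; throws on the first failure. */
    static void runCrudChecks() {
        InMemoryPlatformRepository repository = new InMemoryPlatformRepository();

        // Create
        repository.save(new PlatformStructure("C-100", "GC"));
        check(repository.count() == 1, "create should add exactly one row");

        // Read
        PlatformStructure loaded = repository.findById("C-100")
                .orElseThrow(() -> new AssertionError("saved row should be readable"));
        check(loaded.areaCode.equals("GC"), "read should return the saved value");

        // Update
        loaded.areaCode = "MC";
        repository.save(loaded);
        check(repository.findById("C-100").get().areaCode.equals("MC"), "update should persist");
        check(repository.count() == 1, "update should not add a row");

        // Delete
        repository.deleteById("C-100");
        check(!repository.findById("C-100").isPresent(), "deleted row should be gone");
    }
}
```

Calling `PlatformCrudCheck.runCrudChecks()` completes silently when all four operations behave as expected. The same four steps would be repeated for each entity created for the endpoints.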
Deployment Guide and Validation Document
Make sure to provide two separate documents for validation.
A README.md that covers:
- Deployment - how to build and test your submission.
- Configuration - document all configuration used by the submission.
- Dependency installation - an up-to-date, step-by-step guide to installing the dependencies.
The validation of each requirement can be described in this document, which makes it easier for reviewers to map the requirements to your submission.