Challenge Summary
We need to design a simple tool that helps users with a database cleaning process to eliminate duplicate data following certain criteria defined by the user. It’s for the oil industry but you don’t have to actually be an expert in the field to compete. There is a mockup as a reference and design brief. Jump in now!
Best of luck.
Round 1
Submit your design for checkpoint feedback.
1. Home
2. Review Results
3. Validation Overview
- Please provide a MarvelApp Presentation (see details below).
- Make sure all pages have correct flow! Use correct file numbering. (00, 01, 02, 03).
Round 2
Submit your final design plus checkpoint feedback.
1. Home
2. Review Results
3. Validation Overview
- Please provide a MarvelApp Presentation (see details below).
- Make sure all pages have correct flow! Use correct file numbering. (00, 01, 02, 03).
The goal of this competition is to come up with the look and feel of a browser-based web application for the oil and gas industry. This application will help users clean duplicate data out of databases.
Design Problem in a Nutshell
Background
A massive amount of information related to wells is stored in a well-known service called Hadoop. This type of data store is generally used when data volumes are too big for a single disk to hold and/or when high-performance processing is needed to query and process large data sets.
Duplicate Data
Quartz has a problem with this data: many duplicated columns are getting pushed into this data store, and our clients are looking for automated ways to dedupe and clean it.
The deduping process, for the most part, is pretty simple. Duplicate columns are easily identified in the database. They have a root name with dup_0, dup_1, dup_2, etc. concatenated to the end of the column name.
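The naming convention above can be sketched in a few lines of Python. This is only an illustration, not the client's actual implementation: the exact separator between the root name and the suffix is not stated in this brief, so the pattern below assumes an optional underscore (for example "pressure_dup_0"), and the function name and sample column names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming: "<root>dup_<n>", with an optional "_" before "dup".
DUP_PATTERN = re.compile(r"^(?P<root>.+?)_?dup_(?P<index>\d+)$")


def group_duplicate_columns(column_names):
    """Map each root column name to its duplicate columns, ordered by index."""
    groups = defaultdict(list)
    for name in column_names:
        match = DUP_PATTERN.match(name)
        if match:
            groups[match.group("root")].append((int(match.group("index")), name))
    # Sort each group by the numeric suffix and keep only the column names.
    return {root: [name for _, name in sorted(dups)] for root, dups in groups.items()}


# Hypothetical column names, for illustration only.
columns = ["time", "pressure", "pressure_dup_1", "pressure_dup_0", "rate_dup_0"]
print(group_duplicate_columns(columns))
```

Columns without the suffix ("time", "pressure") are left out of the result, so each key can be read as a master column with its list of candidate duplicates.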
Solution
Our app will review the data in the columns and resolve these duplications using a set of processing rules. Basically, the application will look for the best column of data, and when there is a clear "winner" we'll copy it to the master column automatically.
However, there will be times when user intervention is needed because the tool won’t be able to resolve those cases automatically. More than one column might be right, or there might not be a column of data among the dupes provided that satisfies the minimum conditions. In this case, we'll need human intervention; this is where the design solution comes into play. The goal of this design exercise is to come up with a solution that helps the user go through the manual process of cleaning the duplicate records. A very rough mockup of what we envisioned for the review tool is in the mockups.pdf file. It is NOT TO BE COPIED, of course; it’s just a very rough reference, not optimal AT ALL.
There will be some sort of "Waiting for Review" flag for the affected records that require human intervention. An intuitive UI should allow users to review the records with conflicts.
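To make the two outcomes concrete, here is a minimal sketch of how a "clear winner" decision could work. The actual processing rules are not defined in this brief, so everything here is an assumption for illustration: the rule (prefer the column with the most non-missing values), the 0.9 minimum fill ratio, and the function name pick_winner are all hypothetical.

```python
def pick_winner(columns, min_fill_ratio=0.9):
    """Return the winning column name, or None when manual review is needed.

    `columns` maps a column name to its list of values; None marks a missing
    value. The rule and the 0.9 threshold are illustrative assumptions only.
    """
    qualified = []
    for name, values in columns.items():
        filled = sum(value is not None for value in values)
        if values and filled / len(values) >= min_fill_ratio:
            qualified.append((filled, name))
    if not qualified:
        return None  # no column satisfies the minimum conditions
    qualified.sort(reverse=True)
    if len(qualified) > 1 and qualified[0][0] == qualified[1][0]:
        return None  # more than one column might be right
    return qualified[0][1]


# Hypothetical data: one complete column and one with gaps.
print(pick_winner({"p_dup_0": [1, 2, 3, 4], "p_dup_1": [1, None, None, 4]}))
```

A return value of None corresponds to the cases described above where the record would be flagged for human review.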
Concept Design Goals
We are looking for a lightweight, simple tool. It needs to solve the problem in a straightforward way, taking into account the data insights that can be delivered to the user and what interests the user most.
Among the most sensitive considerations:
- The design must be serious, professional looking and well spaced (easy data reading).
- The design MUST NOT be cluttered or hard to read/follow.
- The design solution must be simple enough to let the user view the data, process it and then see the results.
- The review results graph must be clear to follow.
Screens Requirements
Overall
- Show hover/active states for buttons, dropdowns, breadcrumbs, errors/success states, elements with interaction, etc.
- Please suggest how to organize this content and group it into screens. We are looking forward to seeing your unique proposals, so be bold. The following screen order is just an initial suggestion; if you think the content could be organized in a different way, go for it!
1. Home
This page is the user's center of operations. Note that we don’t require a login page; however, the user will indeed be logged in, so we will need a global header with user information, logout, etc.
Statistics
We need to show a high-level report of some stats. Open to suggestions:
- How many records have been automatically processed so far?
- When?
- Trends
- Etc...
Status
- The page should display a set (table most likely) of all the available stages and the number of records associated with each stage.
- Each list/row should contain the following information (headers):
-- Well Name
-- Stage Number
-- API
-- Number of records affected
-- Dup Columns (set of conflictive/duplicate data)
-- Status
-- Action
- User should be able to sort/group the data by well and stage number.
- User should be able to select multiple items to apply actions at the same time e.g. process now.
- Each stage should have a status field: Completed, Needs Manual Review, or Not Processed.
- The Not Processed rows should have a “Process Now” button or link. After a user clicks “Process Now” there are two possible outcomes. Either the batch will be completed in a fully automated way and the status of the stage row will be changed to “Completed”, OR there will be some ambiguity in the data which will require manual intervention from an admin. In that case, the record status will be changed to “Needs Manual Review”. Design both scenarios: automatically processed (possibly needs a success message) vs. “Needs Manual Review”.
- Completed rows should have a link or icon which pops up a report.
Completed Report
List summary statistics about the processing status for a specific selected row:
- How many rows were processed.
- How many columns were in the original dataset.
- How many columns were in the final data set.
- Which columns were used as the master data for each of the duplicate column issues.
- Show a set of columns/rows for the table. The rows with the “Needs Manual Review” status should have a link which directs users to the “Review” screen below and should process the data relevant to that particular well and stage.
Data
This is not a feature/requirement, just context information. See the data.xlsx file for a reference of how a data file will be structured. Notice that the data is obfuscated; # and * stand in for numbers, dates, etc. There's no need to dig into this file too much, but if you want more context it will be useful to see which data is being analyzed by the tool.
2. Review Results
This is the cornerstone of this application, where the review process takes place.
User should be able to select a well and stage to see the records with conflicts assuming he/she came from direct navigation – OR user can land here through links from the home page (with default selected well+stage).
User can see information about where this batch comes from:
- Well name.
- Stage.
- API.
- Channel.
Features/Actions
- User must have a mechanism to see the records of the conflicting data; the design should be able to accommodate N columns in the table.
- User can see a chart with each column represented (column value vs time axis).
- User can see an overall data section with Number of affected rows, Min value, Max Value, Mean Value, Median Value, Number of Missing Values, and a Variance statistic. The main value should be plotted as a full line, the min/max values as dotted lines.
- The provided mockup is a very ROUGH draft of the situation. Please DO NOT JUST reskin that, it’s not optimal at all.
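For designers who want realistic numbers in the overall data section, the statistics listed above can be computed for a single column roughly as follows. This is a sketch under stated assumptions: missing values are represented as None, the variance shown is the sample variance (the brief does not say whether sample or population variance is intended), and the function name summarize_column is hypothetical.

```python
import statistics


def summarize_column(values):
    """Compute the overall data statistics for one column of values.

    None marks a missing value (an assumption for this sketch).
    """
    present = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "mean": statistics.mean(present) if present else None,
        "median": statistics.median(present) if present else None,
        # Sample variance needs at least two data points.
        "variance": statistics.variance(present) if len(present) >= 2 else None,
    }


# Hypothetical column with one missing value.
print(summarize_column([2.0, 4.0, None, 6.0]))
```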
3. Validation Overview
After validating conflicting records, user should be able to see a quick report of the performed actions.
It should include:
- Selected column applied to the master column.
- Chart of this selected column.
- Number of processed rows.
Branding Guidelines
- Use a logo placeholder.
- Color preferences are blues based on #006BD5, light grays, and whites. Open to suggestions, but keep blue as the primary brand color.
- Fonts open to suggestion.
- Keep things consistent. This means all graphic styles should work together.
Screen Specifications
- Desktop: 1280px width. Height as much as needed.
- Make sure your work is in a vector format, for retina scaling and high fidelity.
MarvelApp Presentation
- Request a MarvelApp prototype from me (mahestro@copilots.topcoder.com).
- Do not use the forums to request a MarvelApp prototype.
- Provide clickable spots (hotzones) to link your screens and showcase the flow of the solution.
- Provide the MarvelApp shareable link in your notes during submission upload.
Stock Artwork (Illustrations, Icons, Photography)
- Stock artwork is allowed for this challenge.
- Make sure to declare all your assets properly or you might fail screening.
Target User
- Oil and Gas industry. Tech-savvy users who use software tools as a daily part of their work.
Judging Criteria
- Is the process easy to learn and use?
- Interpretation of the user experience.
- Is the application visually appealing?
- Cleanliness of your graphics and design.
Submission & Source Files
Preview Image
Please create your preview image as one (1) 1024x1024px JPG or PNG file in RGB color mode at 72dpi and place a screenshot of your submission within it.
Submission File
Submit JPG/PNG for your submission files.
Source Files
All original source files of the submitted design. Files should be created in Adobe Photoshop or Sketch. Layers should be named and well organized.
Final Fixes
As part of the final fixes phase, you may be asked to modify your graphics (sizes or colors) or modify overall colors. We may ask you to update your design or graphics based on checkpoint feedback.
Please read the challenge specification carefully and watch the forums for any questions or feedback concerning this challenge. It is important that you monitor any updates provided by the client or Studio Admins in the forums. Please post any questions you might have for the client in the forums.