Challenge Overview
1. Project Overview
NASA has recorded over 100 terabytes of images, telemetry, models, and just about everything else one can imagine from the planetary missions of the past 30 years. The data is stored within NASA's Planetary Data System (PDS), and all of it is freely available at http://pds.nasa.gov
However, while rich in depth and breadth, the PDS holdings have developed in a disparate fashion over the years, with different architectures and formats at the various nodes. Consequently, establishing a uniform approach to accessing data within the PDS is a significant challenge.
Working with TopCoder and Harvard IQSS via the NASA Tournament Lab, PDS developed a database of meta-data that allows user-friendly access to data held at the Small Bodies Node (SBN). As the next step, we would like to extend the import and persistence module to process the Cassini ISS datasets held at the Rings Node.
2. Competition Task Overview
In this competition you will extend the current functionality of the import and persistence module so that it supports the Cassini ISS datasets. This includes extending the database to accommodate additional meta-data parameters, and modifying the API to populate the expanded database.
Raw data volumes: http://pds-challenge.seti.org/
Supplemental tables of geometric metadata: http://pds-challenge.seti.org/
2.1.1. Changes Required
The current code supports processing of raw data volumes at the SBN. However, substantially more information is necessary to support the current PDS challenge. As per the current logic DataSetProcessorImpl#doProcess
For the Cassini data used in this challenge, additional meta-data stored in supplemental tables is also required. Consequently, the code must be revised not only to accommodate the meta-data in the data label files, but also to parse the meta-data from the supplemental tables.
For each volume of data (COISS_2001, COISS_2002, COISS_2003, etc.), there are four separate supplemental meta-data files (inventory, moon, ring, and saturn). Each table file has its own label file describing the contents of the table.
Consider COISS_2001: the supplemental meta-data for the inventory of targets within the image field of view is available here - http://pds-challenge.seti.org/
Similarly, again for COISS_2001, the 'moon summary' supplemental meta-data is available here - http://pds-challenge.seti.org/
For example, line #1 of the tab file is as follows.
"COISS_2001","data/1454725799_1455008789/N1454725799_1.LBL ","S/IMG/CO/ISS/1454725799/N","RHEA","HELENE","TELESTO"
The absolute file specification path to the data product label file is then http://pds-challenge.seti.org/volumes/COISS_2xxx/COISS_2001/data/1454725799_1455008789/N1454725799_1.LBL. This label file holds the pointers to the data objects (images, in our case).
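As an illustration of the step above, the following is a minimal sketch of how one row of the inventory tab file could be split into fields and turned into the label file URL. The class name InventoryRow, its field names, and the VOLUME_ROOT constant are hypothetical and are not part of the existing module; the meaning of the third column is an assumption and should be checked against the table's label file. The sketch avoids language features newer than J2SE 1.6.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the existing module. */
final class InventoryRow {
    // Assumed volume root, inferred from the example URL in the text above.
    static final String VOLUME_ROOT =
        "http://pds-challenge.seti.org/volumes/COISS_2xxx/";

    final String volumeId;      // column 1, e.g. COISS_2001
    final String labelPath;     // column 2, path relative to the volume
    final String identifier;    // column 3, meaning assumed (see label file)
    final List<String> targets; // remaining columns, e.g. RHEA, HELENE

    private InventoryRow(String volumeId, String labelPath,
                         String identifier, List<String> targets) {
        this.volumeId = volumeId;
        this.labelPath = labelPath;
        this.identifier = identifier;
        this.targets = targets;
    }

    /** Splits a line of quoted, comma-separated fields and trims padding. */
    static InventoryRow parse(String line) {
        List<String> fields = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;          // toggle on every quote
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString().trim());
                current.setLength(0);          // start the next field
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString().trim()); // last field has no comma
        if (fields.size() < 3) {
            throw new IllegalArgumentException("Expected at least 3 fields: " + line);
        }
        List<String> targets =
            new ArrayList<String>(fields.subList(3, fields.size()));
        return new InventoryRow(fields.get(0), fields.get(1), fields.get(2), targets);
    }

    /** Builds the absolute URL of the data product label file. */
    String labelUrl() {
        return VOLUME_ROOT + volumeId + "/" + labelPath;
    }
}
```

Note that the path field in the sample row carries trailing padding inside the quotes ("...N1454725799_1.LBL "), so each field is trimmed before the URL is assembled.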
2.1.2. Source Code
SVN - https://coder.topcoder.com/tcs/clients/ntl-pds/assets/assembly/pds_projects/import_and_persistence
Please go through the implementations in gov.nasa.pds.processors.impl and gov.nasa.pds.services.impl to understand how dataset processing and persistence are performed for the SBN datasets.
Also you can go through the current architecture which is located here.
2.1.2. Database
The new datasets might need DB schema updates. Make sure that any schema changes strictly follow third normal form (3NF).
2.1.3 Application Management
Please follow the standards set by the existing code for the application management.
2.1.3.1 Transactions
The services will use Spring to manage their transactions. All modifying methods should provide transactional control.
2.1.3.2 Configuration
The converter and all service implementations will use setter injection for configuration with the use of Spring.
2.1.3.3 Persistence
This module will use JDBC queries to manage all data. It will use JdbcTemplate from the Spring framework.
2.1.3.4 Logging
The services will log activity and exceptions using the Logging Wrapper in this component.
It will log errors at ERROR level, potentially harmful situations at WARN level, and method entry/exit and input/output information at DEBUG level.
Please follow the logging pattern in the current code.
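The level conventions above can be sketched as follows. This is only an illustration: it uses java.util.logging as a stand-in for the Logging Wrapper, with SEVERE, WARNING and FINE standing in for ERROR, WARN and DEBUG. The class LoggingSketch and the method parseLineCount are hypothetical; the existing code remains the reference for the actual pattern.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical example; java.util.logging stands in for the Logging Wrapper. */
final class LoggingSketch {
    private static final Logger LOG =
        Logger.getLogger(LoggingSketch.class.getName());

    /** Parses a non-negative count, logging entry, exit, warnings and errors. */
    static int parseLineCount(String raw) {
        final String signature = "LoggingSketch#parseLineCount";
        // Method entry with input at DEBUG (FINE here).
        LOG.log(Level.FINE, "Entering {0} with input [{1}]",
                new Object[] {signature, raw});
        try {
            int value = Integer.parseInt(raw.trim());
            if (value < 0) {
                // Potentially harmful situation at WARN (WARNING here).
                LOG.log(Level.WARNING, "{0}: negative count {1}, using 0",
                        new Object[] {signature, value});
                value = 0;
            }
            // Method exit with output at DEBUG (FINE here).
            LOG.log(Level.FINE, "Exiting {0} with output [{1}]",
                    new Object[] {signature, value});
            return value;
        } catch (NumberFormatException e) {
            // Errors at ERROR (SEVERE here), then rethrown to the caller.
            LOG.log(Level.SEVERE, "Error in " + signature, e);
            throw e;
        }
    }
}
```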
2.1.3.5 Exception Handling
The Base Exception component will be used as the top-level exception for all other exceptions thrown by the components.
2.1.3.6 Scalability
With the use of object recycling and preloading during data reading, the application is quite scalable. Services such as validation preload all definitions that will be used for validation instead of accessing the database each time.
Object recycling means that each time a data file is loaded, the Table object is reused, thus minimizing the creation of new objects.
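The two ideas can be sketched as follows, assuming simplified stand-ins for the real classes. ReusableTable, TableLoader and PreloadingValidator are hypothetical names chosen for illustration, and the module's actual Table class and validation service may look quite different.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the module's Table object. */
final class ReusableTable {
    private String name;
    private final List<String[]> rows = new ArrayList<String[]>();

    /** Clears the previous content so the same instance can be refilled. */
    void reset(String newName) {
        name = newName;
        rows.clear();
    }

    void addRow(String[] row) { rows.add(row); }
    String getName() { return name; }
    int getRowCount() { return rows.size(); }
}

/** Hypothetical loader that recycles one table across all data files. */
final class TableLoader {
    private final ReusableTable table = new ReusableTable();

    /** Fills the shared table; callers must finish with it before the next load. */
    ReusableTable load(String name, List<String> lines) {
        table.reset(name);
        for (String line : lines) {
            table.addRow(line.split(","));
        }
        return table;
    }
}

/** Hypothetical validator that preloads its definitions once. */
final class PreloadingValidator {
    private final Map<String, String> definitions;

    /** Copies the definitions up front; no database access happens afterwards. */
    PreloadingValidator(Map<String, String> loadedFromDatabase) {
        this.definitions = new HashMap<String, String>(loadedFromDatabase);
    }

    boolean isKnown(String keyword) {
        return definitions.containsKey(keyword);
    }
}
```

The trade-off of recycling is that a table returned by one load call is overwritten by the next, so each table must be fully processed and persisted before the next file is read.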
3. Technology Overview
- J2SE 1.6
- Spring 3.0.5: http://www.springsource.org/
- MySQL 5.5: http://www.mysql.com/
- Apache Ant 1.8.2: http://ant.apache.org/
Final Submission Guidelines
Submission Deliverables
A complete list of deliverables can be found in the TopCoder Assembly competition Tutorial at:
http://apps.topcoder.com/wiki/display/tc/Assembly+Competition+Tutorials
- Source code and configuration files.
- Deployment guide to configure and verify the application.