Challenge Overview
In this contest, we want to create a command-line tool that parses XML files, extracts the Product data, and inserts it into the database tables.
* The command line tool should take two arguments: the input directory and output directory
* It should watch for XML files in the input directory, process them, and then move the processed files into the output directory.
* A field mapping file will be provided that describes how to map the XML fields to the database table columns.
1. A record in the Product table should be created (or updated) for each /CATALOG/PRDS/PRD element
2. Each /CATALOG/PRDS/PRD/ATRS/ATR element should be saved to the ProductSpecification table for the corresponding product
3. Each /CATALOG/PRDS/PRD/imgs/img element should be saved to the ProductImage table for the corresponding product
4. The /CATALOG/PRDS/PRD/PRCS/PRC element (of @type = CG_gs) should be saved to the ProductPrice table for the corresponding product
* Parsing errors should be properly logged
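The requirements above could be sketched roughly as follows, in Python, using only the standard library. This is an illustration under several assumptions, not the expected implementation: the function names (extract_products, process_directory) and the handle_products callback are hypothetical, the attribute names on PRD, img, and PRC are not given in this specification so they are kept as-is, the real column mapping must come from the field mapping file, and the sketch makes a single pass over the input directory instead of watching it continuously. Leaving unparseable files in the input directory is also an assumption.

```python
import logging
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path

log = logging.getLogger("product_import")  # hypothetical logger name


def extract_products(xml_path):
    """Parse one catalog file and return one dict per /CATALOG/PRDS/PRD."""
    root = ET.parse(xml_path).getroot()  # root is expected to be CATALOG
    products = []
    for prd in root.findall("./PRDS/PRD"):
        products.append({
            # PRD attribute names are not specified here, so keep them all.
            "attributes": dict(prd.attrib),
            # One (name, value) pair per ATR, destined for ProductSpecification.
            "specifications": [(a.get("nm"), (a.text or "").strip())
                               for a in prd.findall("./ATRS/ATR")],
            # One dict per img, destined for ProductImage.
            "images": [dict(i.attrib) for i in prd.findall("./imgs/img")],
            # Only PRC elements with type="CG_gs", destined for ProductPrice.
            "prices": [{**p.attrib, "value": (p.text or "").strip()}
                       for p in prd.findall("./PRCS/PRC[@type='CG_gs']")],
        })
    return products


def process_directory(input_dir, output_dir, handle_products):
    """Process each *.xml file once, then move it into output_dir.

    handle_products is a caller-supplied function that would write the
    extracted products to the database. Files that fail to parse are
    logged and left in place (an assumption, not a stated requirement).
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    processed = []
    for xml_file in sorted(Path(input_dir).glob("*.xml")):
        try:
            handle_products(extract_products(xml_file))
        except ET.ParseError as exc:
            log.error("Failed to parse %s: %s", xml_file.name, exc)
            continue
        shutil.move(str(xml_file), str(out / xml_file.name))
        processed.append(xml_file.name)
    return processed
```

A command-line entry point would pass its two arguments straight through, for example process_directory(sys.argv[1], sys.argv[2], save_products), where save_products is a hypothetical function that performs the inserts or updates. To get the "watch" behaviour, this call could be repeated on a timer or driven by a file-watching library.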
Please also create a tool that loads the data in "category worksheet.xlsx" (attached in the forum) into a new ProductCategory table.
To populate the Product Types and Marketing Product Category Names in the Product table, look up the pmoid_category number from the product's ATR, for example <ATR nm="pmoid_category">3329744</ATR>. Match it against CATEGORY_OID in the ProductCategory table, then assign the corresponding PRODUCT_TYPE and CATEGORY column values as productType and marketingProductType. Note that the ProductCategory column names should follow the naming of the existing Vertica db tables.
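The category lookup could look roughly like the sketch below. It assumes the ProductCategory rows have already been loaded into an in-memory dict keyed by CATEGORY_OID (as a string); the function name category_fields is hypothetical, and returning None for both fields when no category matches is an assumption, since the specification does not say what to do in that case.

```python
import xml.etree.ElementTree as ET


def category_fields(prd, categories):
    """Return productType and marketingProductType for one PRD element.

    categories maps CATEGORY_OID (string) to a dict holding the
    PRODUCT_TYPE and CATEGORY values from the ProductCategory table.
    """
    atr = prd.find("./ATRS/ATR[@nm='pmoid_category']")
    oid = (atr.text or "").strip() if atr is not None else ""
    row = categories.get(oid)
    if row is None:
        # No pmoid_category ATR or no matching CATEGORY_OID (assumed behaviour).
        return {"productType": None, "marketingProductType": None}
    return {
        "productType": row["PRODUCT_TYPE"],
        "marketingProductType": row["CATEGORY"],
    }
```

In the real tool the lookup would more likely be a query or join against the ProductCategory table in Vertica; the dict only keeps the example self-contained.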
Final Submission Guidelines
- Upload all your source code in a zip file.
- Provide documentation for your application. It should contain complete build, deployment, and execution instructions.
- Screen sharing video is not required for this application.
- You should use the existing code found in the GitHub repositories as the starting point for this application. The details for the GitHub repositories can be found in the Code Document forums attached to this challenge.
- This application uses the Vertica database as a persistence layer. We have a docker script which configures this database for you. The details can be found in the Code Document forums attached to this challenge.