Challenge Overview
Challenge Objectives
- Build a CLI tool that extracts data from an XML input file.
Tech Stack
You are free to choose between Python, NodeJS, and C++.
Detailed Requirements
Our client has provided us with a huge dataset in XML format that contains data exported by several different applications. We are building a CLI application that extracts the data from this dataset and exports it into organized CSV files, based on the relationships between the different entities, so that we can later use the data in an upcoming marathon match.
You will find the dataset (an XML file) attached to the challenge forum.
In this first challenge, you need to build a CLI application in one of the accepted programming languages listed in the Tech Stack section.
The CLI app must take two input parameters: the input file (XML format) and the output directory. It should load the XML file and, for each unique XML tag, create a JSON file named <tag name>.json in the output directory.
Each JSON file must contain an array listing every attribute that appears on that particular tag, with one entry per attribute in the following format:
{
  "name": <the attribute name>,
  "required": <true if the attribute appears in every occurrence of the tag, false otherwise>
}
Each XML tag represents a table name and each attribute represents a table column name.
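As an illustration of the one-file-per-tag output described above, here is a minimal Python sketch. The function name `write_tag_files` and the shape of its `schema` argument (a dict mapping each tag name to its list of attribute entries) are assumptions made for this example and are not part of the requirements. It also assumes every tag name is a valid file name, which may not hold for namespaced tags.

```python
import json
from pathlib import Path


def write_tag_files(schema, output_dir):
    """Write one <tag name>.json file per tag and return the written paths.

    `schema` is a hypothetical mapping: tag name -> list of
    {"name": ..., "required": ...} entries.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    written = []
    for tag, attributes in schema.items():
        path = out / f"{tag}.json"  # assumes the tag is a valid file name
        with path.open("w", encoding="utf-8") as fh:
            json.dump(attributes, fh, indent=2)
        written.append(path)
    return written
```

Creating the output directory when it does not exist is a design choice of this sketch; a submission could equally treat a missing directory as invalid input.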
Given the following XML tags:
<ABC attr_1="lorem" />
<ABC attr_1="lorem" attr_2="ipsum" />
<ABC attr_1="lorem" attr_3="dolor" />
The produced JSON file (ABC.json) would be:
[
  {
    "name": "attr_1",
    "required": true
  },
  {
    "name": "attr_2",
    "required": false
  },
  {
    "name": "attr_3",
    "required": false
  }
]
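The example above can be reproduced with a short Python sketch that counts how often each tag occurs and how often each attribute appears on it; an attribute is marked required when the two counts match. The function name `summarize_attributes` and the wrapping `<root>` element are assumptions made for illustration, and the sketch parses the whole document in memory, so it is only suitable for small inputs.

```python
import json
import xml.etree.ElementTree as ET


def summarize_attributes(xml_text):
    """Map each tag to a list of {"name", "required"} entries."""
    tag_counts = {}   # tag -> number of occurrences
    attr_counts = {}  # tag -> {attribute name -> occurrences carrying it}
    for elem in ET.fromstring(xml_text).iter():
        tag_counts[elem.tag] = tag_counts.get(elem.tag, 0) + 1
        seen = attr_counts.setdefault(elem.tag, {})
        for name in elem.attrib:
            seen[name] = seen.get(name, 0) + 1
    # An attribute is required when it appeared on every occurrence of the tag.
    return {
        tag: [
            {"name": name, "required": count == tag_counts[tag]}
            for name, count in attrs.items()
        ]
        for tag, attrs in attr_counts.items()
    }


# The three sample tags, wrapped in a hypothetical <root> element
# so that the text is a well-formed XML document.
SAMPLE = """<root>
  <ABC attr_1="lorem" />
  <ABC attr_1="lorem" attr_2="ipsum" />
  <ABC attr_1="lorem" attr_3="dolor" />
</root>"""

if __name__ == "__main__":
    print(json.dumps(summarize_attributes(SAMPLE)["ABC"], indent=2))
```

Attributes are listed in the order in which they were first seen, which matches the order shown in the example.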
Running the CLI without any input parameters or with invalid input parameters should print the correct usage instructions.
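One possible way to meet the usage requirement in Python is the standard `argparse` module, which prints a usage message and exits when required parameters are missing. The sketch below is only an illustration: the program name `xml-extract`, the parameter names, and the specific validity checks are assumptions, since the requirements do not define what counts as an invalid parameter.

```python
import argparse
from pathlib import Path


def parse_cli(argv=None):
    """Parse the two positional parameters; exit with usage text if invalid."""
    parser = argparse.ArgumentParser(
        prog="xml-extract",  # hypothetical program name
        description="Write one <tag name>.json file per unique XML tag.",
    )
    parser.add_argument("input_file", type=Path, help="path to the XML input file")
    parser.add_argument("output_dir", type=Path, help="directory for the JSON files")
    args = parser.parse_args(argv)  # missing parameters: usage + error, exit 2

    # Assumed validity rules: the input must be an existing file, and the
    # output path must not be an existing non-directory.
    if not args.input_file.is_file():
        parser.error(f"input file not found: {args.input_file}")
    if args.output_dir.exists() and not args.output_dir.is_dir():
        parser.error(f"output path is not a directory: {args.output_dir}")
    return args
```

Calling `parse_cli([])` makes `argparse` print a usage line and an error message to standard error and exit with status 2; `parser.error` does the same for the custom checks.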
The CLI must be able to process very large input files without failing or running out of memory.
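For large inputs, one common approach in Python is to stream the file with `xml.etree.ElementTree.iterparse` instead of loading the whole tree. The sketch below is one such approach under stated assumptions, not a required design: the generator name `iter_tag_attributes` is hypothetical. It yields each tag with its attribute names as soon as the opening tag has been read, and detaches each element from its parent once it is closed so that memory use stays roughly bounded by the nesting depth.

```python
import xml.etree.ElementTree as ET


def iter_tag_attributes(path):
    """Yield (tag, attribute names) for every element without keeping the tree."""
    ancestors = []  # currently open elements, outermost first
    for event, elem in ET.iterparse(path, events=("start", "end")):
        if event == "start":
            # Attributes are already available on the "start" event.
            ancestors.append(elem)
            yield elem.tag, tuple(elem.attrib)
        else:
            ancestors.pop()
            if ancestors:
                # Detach the finished element so it can be garbage collected.
                ancestors[-1].remove(elem)
```

The yielded pairs can be fed into per-tag counters to decide which attributes are required, so only the counters, not the parsed elements, need to stay in memory.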
You must also include a detailed README.md file with instructions on how to use the CLI application.
Should you have any doubts, feel free to ask on the challenge forum!
What to submit
Submit your solution as a zip file.