Aggregate XML Java Batch Job


Challenge Overview

This component is a new Java batch job which will read in zero or more input files from a single directory and generate a single XML file in the format of the provided XSD.

This batch job is meant to aggregate the output files of another batch job into a single XML file.

Design Approach:

The XMLAggregator has an organizationChunkSize configuration parameter: the number of organizations that are loaded into memory and processed at a time.

First, all input files are scanned once to collect every organization id. The organizations are then processed chunk by chunk: for each chunk, the input files are scanned again, and all organizations in the current chunk are loaded into memory for processing.
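
The two-phase flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the component's actual code: the input "files" are modeled as in-memory lists of {organizationId, value} pairs, and the class and method names (ChunkedAggregationSketch, collectOrganizationIds, partition, loadChunk, aggregate) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: each "file" is a list of {organizationId, value} pairs.
class ChunkedAggregationSketch {

    /** First scan: collect the distinct organization ids in first-seen order. */
    static List<String> collectOrganizationIds(List<List<String[]>> inputFiles) {
        Set<String> ids = new LinkedHashSet<>();
        for (List<String[]> file : inputFiles) {
            for (String[] entry : file) {
                ids.add(entry[0]); // entry[0] is the organization id (assumed layout)
            }
        }
        return new ArrayList<>(ids);
    }

    /** Split the ids into consecutive chunks of at most chunkSize ids each. */
    static List<List<String>> partition(List<String> ids, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += chunkSize) {
            chunks.add(new ArrayList<>(ids.subList(i, Math.min(i + chunkSize, ids.size()))));
        }
        return chunks;
    }

    /** One scan per chunk: keep only the entries that belong to the chunk's organizations. */
    static Map<String, List<String>> loadChunk(List<List<String[]>> inputFiles, List<String> chunk) {
        Map<String, List<String>> loaded = new LinkedHashMap<>();
        for (String id : chunk) {
            loaded.put(id, new ArrayList<>());
        }
        for (List<String[]> file : inputFiles) {
            for (String[] entry : file) {
                List<String> bucket = loaded.get(entry[0]);
                if (bucket != null) {
                    bucket.add(entry[1]);
                }
            }
        }
        return loaded;
    }

    /** Process chunk by chunk; only one chunk's data is held in memory at a time. */
    static List<String> aggregate(List<List<String[]>> inputFiles, int chunkSize) {
        List<String> output = new ArrayList<>();
        for (List<String> chunk : partition(collectOrganizationIds(inputFiles), chunkSize)) {
            for (Map.Entry<String, List<String>> e : loadChunk(inputFiles, chunk).entrySet()) {
                output.add(e.getKey() + "=" + e.getValue());
            }
        }
        return output;
    }
}
```

In a real job the inner loops would parse the input files from disk instead of iterating over lists, but the memory bound is the same: at most organizationChunkSize organizations are held at once.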

The organizationChunkSize parameter provides a way to trade memory against time. To save memory, set organizationChunkSize to a smaller value; to reduce run time, set it to a larger value.
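
To make the trade-off concrete, the number of input scans can be estimated from the design above: one initial scan to collect the ids, plus one scan per chunk. The helper below is a hypothetical illustration (the class and method names are not part of the component) and assumes each chunk triggers exactly one full scan.

```java
// Hypothetical helper: estimates how many times the input files are scanned.
class ScanCountSketch {

    /**
     * One scan to collect the ids, plus one scan per chunk,
     * i.e. 1 + ceil(organizationCount / chunkSize).
     */
    static long totalScans(int organizationCount, int chunkSize) {
        if (organizationCount < 0 || chunkSize < 1) {
            throw new IllegalArgumentException("count must be >= 0 and chunkSize >= 1");
        }
        // Integer ceiling division, done in long arithmetic to avoid overflow.
        long chunks = (organizationCount + chunkSize - 1L) / chunkSize;
        return 1 + chunks;
    }
}
```

For example, with 1000 organizations, a chunk size of 100 gives 11 scans, while a chunk size of 1 gives 1001 scans.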

If the user sets organizationChunkSize to 1, the input files are scanned once for every organization. In this case, the component further minimizes memory usage by writing output on the fly. The disadvantage is that the input files may be scanned many times.
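
Writing output on the fly could look something like the sketch below, which uses the standard StAX XMLStreamWriter so that each organization is written as soon as it has been loaded and nothing accumulates in memory. The element and attribute names (organizations, organization, id, item) and the class name are placeholders chosen for illustration; the real names come from the provided XSD, which is not reproduced here.

```java
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical sketch: element names are placeholders, not taken from the XSD.
class StreamingOutputSketch implements AutoCloseable {
    private final XMLStreamWriter xml;

    StreamingOutputSketch(Writer out) throws XMLStreamException {
        xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        xml.writeStartElement("organizations"); // placeholder root element
    }

    /** Writes one organization immediately, so it can be discarded afterwards. */
    void writeOrganization(String id, List<String> items) throws XMLStreamException {
        xml.writeStartElement("organization");
        xml.writeAttribute("id", id);
        for (String item : items) {
            xml.writeStartElement("item");
            xml.writeCharacters(item);
            xml.writeEndElement();
        }
        xml.writeEndElement();
        xml.flush(); // push the finished element to the underlying writer
    }

    /** Closes the root element and releases the stream writer. */
    @Override
    public void close() throws XMLStreamException {
        xml.writeEndElement();
        xml.flush();
        xml.close();
    }

    /** Small demonstration that writes two organizations to a string. */
    static String demo() {
        StringWriter out = new StringWriter();
        try (StreamingOutputSketch writer = new StreamingOutputSketch(out)) {
            writer.writeOrganization("A", Arrays.asList("1", "3"));
            writer.writeOrganization("B", Collections.singletonList("2"));
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

With a chunk size of 1, the processing loop would call writeOrganization once per scan, passing the single organization loaded in that scan.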



Final Submission Guidelines

N/A

Review style

Final Review

Community Review Board

Approval

User Sign-Off

Challenge links

ID: 30026823