Challenge Overview
Project Overview
VTARC has a requisition team that receives requests for supplies from various internal departments. The team must then research available products and vendors that meet the criteria of each request. Today this research is done primarily via Google. VTARC recently developed its own back-end search engine to make this research more automated and efficient.
On top of this new search engine sits a web crawler. It takes a list of URLs and search terms, crawls those URLs, and uses the back-end search engine to generate results that are passed to the UI.
In this competition you will re-architect this web crawler to make it faster and more robust.
Competition Task Overview
Attached to this competition you will find the current Python source code for the web crawler.
Examine this code and then devise a new architecture that meets the requirements listed below.
Detailed Requirements
- Make the application more robust and stable
- Increase the speed of crawling. The first step is to make the application multi-threaded, with each thread crawling one URL. You should also propose other methods for speeding up the crawl, if possible.
- Change the output from a simple Red/Yellow/Green status for the overall task to a detailed percentage complete.
- Define and document the interface with the new front end. This interface will include a list of URLs as well as a list of search terms. Each search term will also have an indicator for 'required', 'optional', or 'do not include'.
- Define and document the interface with the back end search engine (called ALNLP in the code).
- Define the Assembly Specification(s) to build this new system.
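To make the requirements above concrete, here is a minimal Python 3 sketch of one possible shape for the front-end request, the per-URL threading, and the percentage-complete output. It is an illustration only, not the existing crawler's design: the names `TermMode`, `SearchTerm`, `CrawlRequest`, `crawl`, and the `fetch` callable are all hypothetical, and `fetch` stands in for whatever code downloads a page and calls the back-end search engine (ALNLP), whose real interface is not described here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Optional


class TermMode(Enum):
    """Indicator attached to each search term (hypothetical names)."""
    REQUIRED = "required"
    OPTIONAL = "optional"
    EXCLUDED = "do not include"


@dataclass(frozen=True)
class SearchTerm:
    text: str
    mode: TermMode


@dataclass(frozen=True)
class CrawlRequest:
    """What the front end is assumed to send: URLs plus tagged search terms."""
    urls: List[str]
    terms: List[SearchTerm]


# Assumed signature: crawl one URL and return whatever the back end yields.
FetchFn = Callable[[str, List[SearchTerm]], object]


def crawl(
    request: CrawlRequest,
    fetch: FetchFn,
    max_workers: int = 8,
    on_progress: Optional[Callable[[float], None]] = None,
) -> Dict[str, object]:
    """Crawl every URL on a thread pool and report percentage complete."""
    results: Dict[str, object] = {}
    total = len(request.urls)
    if total == 0:
        if on_progress is not None:
            on_progress(100.0)
        return results

    workers = max(1, min(max_workers, total))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each submitted task handles exactly one URL.
        futures = {
            pool.submit(fetch, url, request.terms): url for url in request.urls
        }
        for done, future in enumerate(as_completed(futures), start=1):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # A failing URL is recorded instead of aborting the whole crawl.
                results[url] = exc
            if on_progress is not None:
                on_progress(100.0 * done / total)
    return results


if __name__ == "__main__":
    def fake_fetch(url: str, terms: List[SearchTerm]) -> List[str]:
        # Stand-in for the real download + ALNLP call; no network access.
        return [t.text for t in terms if t.mode is not TermMode.EXCLUDED]

    demo = CrawlRequest(
        urls=["http://a.example", "http://b.example"],
        terms=[SearchTerm("stapler", TermMode.REQUIRED)],
    )
    crawl(demo, fake_fetch, on_progress=lambda pct: print(f"{pct:.0f}% complete"))
```

Passing `fetch` in as a parameter keeps the threading and progress logic independent of the search engine, so the ALNLP interface can be defined and tested separately. Progress here counts finished URLs only; a finer-grained percentage (for example, per page within a site) would need additional design.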
Open Source Library
None have been identified, but please ask in the forum if you find one that
TC Components
No, as this is Python
Technology Overview
- Python 3
References
None
Documentation Provided
- REFGUI.zip
Final Submission Guidelines
Submission Deliverables
- Application Design Specification
- TCUML containing interface/class definitions, assembly diagram, sequence diagrams, etc.
- Assembly Specifications (NO COMPONENTS)
- Must provide sufficient details because this project is assembly direct
Submission Guidelines
For each member, the final submission should be uploaded to the Online Review Tool.