Challenge Overview
NASA’s Aeronautics Research Mission Directorate (ARMD) is tasked with innovating at the cutting edge of aerospace. Their work includes Innovation in Commercial Supersonic Aircraft, Ultra-efficient Commercial Vehicles and Transitioning to Low-Carbon Propulsion while also supporting the development of launch vehicles and planetary entry systems. These strategic thrusts are supported by advanced computational tools, which enable reductions in ground-based and in-flight testing, provide added physical insight, enable superior designs at reduced cost and risk, and open new frontiers in aerospace vehicle design and performance.
The advanced computational tools include the NASA FUN3D software, which solves the nonlinear partial differential equations known as the Navier-Stokes equations for steady and unsteady flow computations, including large eddy simulations, in computational fluid dynamics (CFD). Despite tremendous progress over the past few decades, CFD tools remain too slow for simulating flows around complex geometries, particularly flows involving separation and multi-physics applications (e.g. combustion). To enable high-fidelity CFD for multi-disciplinary analysis and design, the speed of computation must be increased by orders of magnitude.
NASA is seeking proposals for improving the performance of the NASA FUN3D software running on the NASA Pleiades supercomputer. The desired outcome is any approach that can accelerate calculations by a factor of 10-1000x without any decrease in accuracy and while utilizing the existing hardware platform.
How To Compete
This challenge is being supported by HeroX and Topcoder, and proposals are being accepted for 2 separate contest opportunities:
Ideation (HeroX challenge) - Ideas and approaches may include, but are not limited to, exploiting algorithmic developments in such areas as grid adaptation, higher-order methods, and efficient solution techniques for high performance computing hardware. See HeroX for full details.
Topcoder Architecture Challenge Overview
The High Performance Fast Computing Challenge is about optimizing source code to improve NASA’s FUN3D Computational Fluid Dynamics suite, so that flow analyses which previously took months to compute can be done in days or hours. We hope this challenge will identify solutions that make FUN3D 1000x faster. We think there are multiple approaches to achieving a 1000x performance improvement, and not all of them require you to be an aeronautical engineer. One approach might involve algorithmic improvements, where contestants focus on the implementation of the Navier-Stokes equations in the main step solver, written in Fortran. This subroutine is called repeatedly for each node on the grid until the flow reaches steady state. Even small improvements here can translate into large reductions in overall flow analysis time, as this code can be run trillions of times. A second approach might exploit new compiler features that the existing flow solver subroutines do not take full advantage of, shaving a few ms off each loop. The first approach might require a background in aerospace engineering, fluid dynamics or math, while the second might be submitted by a Fortran expert. Neither requires access to a large-scale multi-node cluster. Other participants might focus on the pre-processing, which divides the computations among the available cores, or on minimizing the overhead of inter-node communication and processing. These participants might be experts in domain decomposition and large cluster optimization.
Challenge Objective
The primary objective of this challenge is to collect as many actionable ideas as possible. We are calling these ideas Improvement Candidates, or ICs for short. An IC results when a contestant has obtained FUN3D, analyzed the performance bottlenecks, and identified a possible modification that might reduce the overall computational time of a flow analysis job. At the end of the challenge, we will collect all the ICs, and they will be cataloged, classified, grouped, and validated while maintaining the connection to the original submission and submitter. In order to compete in this challenge, you must be a US Citizen and at least 18 years of age.
How to Win
There are two ways to win:
Challenge Scoring
Grand Prize Scorecard $15,000 1st place, $10,000 2nd place
20% Recommendations Quality: To what extent have the Improvement Candidates convinced the reader that the solution(s) is feasible and will lead to measurable performance improvements.
60% Net Improvement: To what extent does the submitter describe and justify the net improvement factor (this is the sum of all ICs).
10% Overall Detail: High Level of detail describing the implementation of the ICs.
Improvement Candidate Prize Pool points $10,000 to be divided equally to submitters based on the following point system
1 point Base Rule:
Submission Overview
In order to identify and qualify improvements to the performance of FUN3D, competitors will naturally follow three phases of the discovery process, starting with analyzing the current software source code. Once that is complete, the participant will theorize on ideas that may lead to net performance improvements. After these improvement candidates have been identified, the contestant will attempt to prove them by some sort of demonstration. This may be in the form of a sample code snippet that outperforms the provided source, or it may be in the form of academic discussion comparing actual artifacts from FUN3D source against alternative approaches or code. We are asking that all proof, demonstration, details and supporting discussion be put into one of three categories:
DO1: (demonstrable objective 1) if it can be supported by demonstration and does not consider multiple cores.
DO2: (demonstrable objective 2) if it can be supported by demonstration and does consider multiple cores or pre-processing.
ND1: (non-demonstrable 1) if it cannot be demonstrated, results are inconclusive, it disproved an IC claim, or it was simply not tested.
The submission artifact itself should be a document in PDF, Word, or Markdown format and should, at a minimum, include the sections described below.
Section 1: AS1
1.1 Executive Summary: The executive summary contains both the count of ICs that will be discussed and an estimate of the total net improvement factor if all the ICs were adopted. This net improvement factor should be in the format “10x” and should be the final sentence in the executive summary.
1.2 Enumerated list of Improvement Candidates: A bulleted list of improvements in the format IC1, IC2, IC3 … each with a description of a sentence or two. Following the description should be the classification DO1, DO2 or ND1 inside square brackets [ ]. Optionally, you may give a relative improvement factor for each of the ICs.
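As a rough illustration of the list format in 1.2, the sketch below checks a few IC entries and totals their factors. The exact line syntax (`ICn: description [CLASS] Nx`), the function names, and the placeholder descriptions are assumptions made for this example; the challenge only specifies the IC numbering, the bracketed classification, and the “10x” factor format. The total is computed as a sum because the scorecard describes the net improvement factor as the sum of all ICs.

```python
import re

# Assumed line syntax for illustration only:
#   IC<n>: <description> [DO1|DO2|ND1] <factor>x   (factor is optional)
IC_LINE = re.compile(
    r"^IC(?P<n>\d+):\s*(?P<desc>.+?)\s*\[(?P<cls>DO1|DO2|ND1)\]"
    r"(?:\s*(?P<factor>\d+(?:\.\d+)?)x)?\s*$"
)


def parse_ic_list(lines):
    """Parse IC entries and check that they are numbered IC1, IC2, ... in order."""
    ics = []
    for expected, line in enumerate(lines, start=1):
        match = IC_LINE.match(line.strip())
        if match is None:
            raise ValueError(f"malformed IC entry: {line!r}")
        if int(match["n"]) != expected:
            raise ValueError(f"expected IC{expected}, found IC{match['n']}")
        factor = float(match["factor"]) if match["factor"] else None
        ics.append({
            "id": f"IC{expected}",
            "description": match["desc"],
            "classification": match["cls"],
            "factor": factor,
        })
    return ics


def net_improvement(ics):
    """Sum the per-IC factors, skipping entries that give no factor."""
    return sum(ic["factor"] for ic in ics if ic["factor"] is not None)


if __name__ == "__main__":
    # Placeholder entries, not real findings about FUN3D.
    example = [
        "IC1: placeholder description A [DO1] 1.5x",
        "IC2: placeholder description B [DO2] 2x",
        "IC3: placeholder description C [ND1]",
    ]
    parsed = parse_ic_list(example)
    print(f"{len(parsed)} ICs, net improvement {net_improvement(parsed):g}x")
```

A check like this can catch a skipped IC number or a missing classification before the document is submitted.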
Section 3: DO1:
Code examples and discussion to support ICs that can demonstrate objective improvement regardless of CPU cores.
Section 4: DO2:
Code examples and discussion to support ICs that can demonstrate objective improvement considering multiple CPU cores.
Section 5: ND1:
Discussion about ICs that are not demonstrable.
These three sections (3-5) should contain all the supporting evidence and discussion for your recommendations and should use the same IC number (id) that you enumerated in the section above. They may reference external files attached as sample code, and they should also contain a detailed discussion of how each IC should be implemented. You may omit a section if you have no ICs that fit into it. These sections are also used for the $10,000 prize pool, so you should not leave anything to chance.
Section 6: Open Discussion:
This section is free and open to express your opinion about anything that does not fall neatly into the other sections. You may include a bio here if you like or discuss your experience with the challenge.
Prizes
A prize purse totaling $35,000 in cash prizes is available:
1st Place $15,000
2nd Place $10,000
Qualified Improvement Candidate Prize Pool $10,000
Participation Eligibility:
The Prize is open to US Citizens, age 18 or older. If you are a NASA employee, a Government contractor, or employed by a Government Contractor, your participation in this challenge may be restricted.
Submissions must be made in English. All challenge-related communication will be in English.
To be eligible to compete, you must comply with all the terms of the challenge as defined in the Challenge-Specific Agreement, which will be made available upon registration.
Intellectual Property
Competitors who are awarded a prize for their submission must agree to grant NASA an irrevocable, royalty free, perpetual, sublicensable, transferable, and worldwide license to use and permit others to use all or any part of the solution including, without limitation, the right to make, have made, sell, offer for sale, use, rent, lease, import, copy, prepare derivative works, publicly display, publicly perform, and distribute all or any part of such solution, modifications, or combinations thereof and to sublicense (directly or indirectly through multiple tiers) or transfer any and all such rights. See the Challenge-Specific Agreement, which will be made available upon registration, for full details on intellectual property.
Registration and Submissions:
Submissions must be made online (only), via upload to Topcoder, on or before 5:00pm EST on June 29, 2017. All uploads must be in zip format. No late submissions will be accepted.
Selection of Winners:
Based on the winning criteria, prizes will be awarded per the weighted Judging Criteria section above.
Judging Panel:
The determination of the winners will be made by Topcoder based on evaluation by relevant NASA specialists.
FUN3D Software:
Description
The FUN3D software is written predominantly in Modern Fortran. The software is evolving steadily in multi-language directions for reasons other than performance. Currently, a standard computational task in the CFD area takes from thousands to millions of computational core-hours.
FUN3D is:
- Code developed by the US Government at US taxpayer expense
- A flow analysis solver written in Fortran, with other components written in C++ and Ruby
- Code which can be applied to a wide range of fluid dynamic problems
- Code with a number of features which represent leading-edge technology
- Export-controlled research code
Instructions to download
FUN3D can be downloaded at https://software.nasa.gov/software/LAR-18968-1
FUN3D Application Guidance
- FUN3D is subject to strict export controls, so only US citizens may apply for the software and compete in this challenge.
- Use only personal, non-affiliated email addresses such as Gmail or Yahoo. Company or .edu addresses should not be used, since you are competing as an individual and they imply affiliation.
- There is a question in the application that asks you to explain the purpose for which the software is to be used. You may simply put: “HeroX HPFCC Challenge”
- You should plan to install FUN3D on your own computer, NOT one provided by your company or school.
- No other individuals besides yourself should be using this software and you should answer affirmative to the question asking if it will be used in-house only.
- You should use your full name for the recipient (Company/ University) Name question.
- Once your request has been accepted, you will get an email notification informing you that you need to sign the Software User Agreement (SUA). When you log back into the NASA Software portal, you should see your request in the Pending state, with an “action” button offering an option to sign the SUA. This does not mean your application is complete. Once you sign, the final step is to verify your address.
- Once you sign the SUA, you will be sent a letter via US mail containing a passcode. Respond to the email address provided in the letter and include the passcode. After you send that email, you should receive instructions to download the software within one business day.
You will need to get started on this right away, as this approval process will take several weeks. While you are waiting for the software request to be processed you can download and read the FUN3D documentation and manual.
NASA Pleiades supercomputer:
The FUN3D software is typically run on the NASA Pleiades supercomputer. Pleiades is a distributed-memory Silicon Graphics Inc. SGI ICE cluster connected with InfiniBand® in a dual-plane hypercube technology.
Pleiades Current System Architecture
- 161 racks (11,472 nodes)
- 7.25 Pflop/s peak cluster
- 4.09 Pflop/s LINPACK rating
- 132 Tflop/s HPCG rating
- Total CPU cores: 246,048
- Total memory: 938 TB
Information about Pleiades can be found here.
Questions about this challenge should be posted to the Challenge Forum or emailed to support@topcoder.com
Questions about the FUN3D software and the Pleiades supercomputer may be emailed to NASA at: hq-fastcomputingchallenge@mail.nasa.gov
Final Submission Guidelines
The submission should be a single zip file containing a document in PDF, Word, or Markdown format and should, at a minimum, include the sections described below. You are also encouraged to submit a quick video that summarizes your findings, but this is not required.
Section Details
AS1
Architecture Analysis and Improvement Candidate Identification
Decompose the current FUN3D components and provide an overall analysis of any components that may be candidates for improvement. Improvements may consist of refactoring, replacement, or marshaling. Floating-point processing, inter-process communication, pre-processing, or any other capability that improves performance are ideal topics for consideration.
These are candidates for improvement that you will be given the chance to prove, disprove, or leave unproven in other sections. Each Improvement Candidate should be enumerated and given a designation ICn, where n is a sequential number starting at 1. This designation will be used later in the submission and will be cataloged against all submissions.
DO1
Demonstrable Improvement - omitting core consideration
The primary objective of this section is to compare existing architecture patterns used in FUN3D with alternative patterns and to demonstrate that the alternative pattern shows measurable improvement. This section is used to support and prove the theories presented for the Improvement Candidates in section AS1. The performance comparison of the two patterns may be made in the context of FUN3D source code or in standalone examples; either way, it is a demonstration of code optimization. Not all items in AS1 will have a corresponding section in DO1, but all items in DO1 will have a corresponding Improvement Candidate in section AS1. All items in this section can demonstrate some level of improvement regardless of the number of CPU cores.
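A standalone comparison of two patterns could be organized along the lines of the sketch below. It is only an illustration in Python: the two kernels are stand-ins invented for this example (not FUN3D code), and the function names are assumptions. The harness first checks that both patterns give the same result, since the challenge asks for speedups without any decrease in accuracy, and then reports the ratio of best-of-N timings.

```python
import math
import timeit


def baseline(values):
    # Stand-in for an "existing pattern": a loop-invariant term is
    # recomputed on every iteration.
    total = 0.0
    for v in values:
        total += v * math.sqrt(2.0)
    return total


def alternative(values):
    # Stand-in for an "alternative pattern": the invariant is hoisted
    # out of the loop and computed once.
    scale = math.sqrt(2.0)
    total = 0.0
    for v in values:
        total += v * scale
    return total


def speedup(old, new, data, repeats=5, number=20):
    """Return t_old / t_new after checking that both give the same answer."""
    if not math.isclose(old(data), new(data), rel_tol=1e-12):
        raise ValueError("alternative pattern changes the result")
    # Taking the minimum over several repeats reduces noise from other
    # processes on the machine.
    t_old = min(timeit.repeat(lambda: old(data), repeat=repeats, number=number))
    t_new = min(timeit.repeat(lambda: new(data), repeat=repeats, number=number))
    return t_old / t_new


if __name__ == "__main__":
    data = [float(i) for i in range(100_000)]
    print(f"measured speedup: {speedup(baseline, alternative, data):.2f}x")
```

The measured ratio depends on the machine and interpreter, so a submission would normally report the hardware, compiler or interpreter version, and problem size alongside the number.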
DO2
Demonstrable Improvement - multi-core consideration
This section is identical to DO1 except that it takes into consideration the use of multiple cores. The optimization may be in the form of domain decomposition, inter-processor communication, or step processing. This section should call out a pattern in the existing code base that utilizes multi-core processing and articulate an improvement. This section should be able to demonstrate a measurable improvement.
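To make the domain decomposition idea concrete, the sketch below splits a set of grid nodes into contiguous, near-equal blocks and reports a simple load-imbalance ratio. This is a generic illustration under stated assumptions: it does not describe how FUN3D actually partitions its grids, and the function names are invented for the example.

```python
def block_partition(n_nodes, n_parts):
    """Split node indices 0..n_nodes-1 into n_parts contiguous, near-equal ranges."""
    if n_parts <= 0:
        raise ValueError("n_parts must be positive")
    base, extra = divmod(n_nodes, n_parts)
    ranges = []
    start = 0
    for part in range(n_parts):
        # The first `extra` parts take one additional node each.
        size = base + (1 if part < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def load_imbalance(ranges):
    """Ratio of the largest part to the average part size (1.0 is perfectly balanced)."""
    sizes = [len(r) for r in ranges]
    return max(sizes) / (sum(sizes) / len(sizes))


if __name__ == "__main__":
    parts = block_partition(10, 4)
    print([len(r) for r in parts])          # [3, 3, 2, 2]
    print(f"{load_imbalance(parts):.2f}")   # 1.20
```

In a multi-core run the slowest core sets the pace, so a DO2 discussion might pair a ratio like this with a measure of how much data crosses partition boundaries.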
ND1
Non-demonstrable, inconclusive, not tested and failed improvement candidates.
Every improvement candidate (IC) in section AS1 should be elaborated in either section DO1, DO2 or ND1 (this section). This section is reserved for improvement candidates that have no evidence of improvement. The reason the improvement candidate is not demonstrable should be identified in the discussion of each item. For example, if a submitter identifies an improvement candidate that leverages a pattern using a large number of CPUs but does not have access to a cluster with a large number of CPUs, the improvement candidate is non-demonstrable. Inconclusive and failed improvement candidates also belong in this section. Failed improvement candidates are important to this challenge as well, since they identify items that will not improve the overall compute times.