NERSC Issues ‘NESAP for Data’ Call for Proposals
Deadline for Submissions is November 1
October 4, 2016
NERSC is now accepting applications for participation in the new NESAP for Data program, an extension of the popular NERSC Exascale Science Applications Program (NESAP) that was launched in 2014.
Through NESAP, NERSC has been partnering with code teams and library and tool developers to prepare and optimize their codes for the Cori manycore architecture. Like NESAP, the NESAP for Data program will pair application teams with resources at NERSC, Cray, and Intel and will last beyond the final acceptance of the Cori system. The program will be jointly managed by NERSC’s Data and Analytics Services (DAS), Data Science Engagement (DSE), and Application Performance (AP) groups.
While the initial NESAP projects involved mostly simulation codes, NESAP for Data is targeting data-intensive science applications that rely on processing and analysis of massive datasets acquired from experimental and observational sources such as telescopes, microscopes, genome sequencers, and light sources. The goal is to enable such applications to take full advantage of the Intel Xeon Phi Knights Landing (KNL) processors on Cori.
“Cori is a system that has been designed to run both data and massively parallel applications,” said Kjiersten Fagnan, group lead of the DSE Group and CIO at the Joint Genome Institute. “NESAP for Data is designed to give experimental facilities and data-intensive projects in-depth consulting and support to ensure that these workloads will run well on Cori and future systems. We anticipate these data applications will push the limits of our systems in ways we haven't before.”
For example, some experiments need to stream large amounts of data directly into memory on the Cori system. Doing this successfully will require a deep understanding of the experiment, the data processing involved, and the NERSC infrastructure.
“NERSC staff have been working with more than 20 teams for the past two years to prepare codes for Cori via the NESAP program, and we are in a unique position to facilitate a productive collaboration between HPC technology vendors like Cray and Intel and the scientists and engineers that make up our applications teams,” said Jack Deslippe, acting group lead for the AP group. “Over the last several years, NERSC has built up a team of expert performance engineers to work with and lead these collaborations. The effort is paying off with a number of early success stories porting applications to the KNL architecture.”
In this next NESAP round, NERSC is specifically targeting data-centric applications and will explore how novel aspects of the Cori architecture, including the Burst Buffer, high-bandwidth memory, and manycore processors, can accelerate data-centric workloads, Deslippe added.
Application teams in NESAP for Data will have access to the following:
- A partner from NERSC’s Data and Analytics Services group or Data Science Engagement group who will assist with code profiling and optimization. NERSC’s Application Readiness team also will provide input and consulting.
- Access to Cray and Intel resources to help with code optimization.
- Early access and significant hours on the full Cori system.
- Select application teams may be assigned a post-doctoral researcher to work on issues related to optimization and scaling on KNL.
Application teams in NESAP are responsible for:
- Working with your NERSC Application Readiness partner to produce profiling and scaling plots as well as vectorization and memory bandwidth analyses.
- Assigning someone in your group to work on optimizing, refactoring, testing, and further profiling your code to transition to the Cori node architecture.
- Producing an intermediate and final report detailing the application’s science and performance improvement as a result of the collaboration.
NERSC will use the following criteria to evaluate submissions:
- An application’s computing usage within the DOE Office of Science.
- Representation among DOE Office of Science program offices.
- The ability of the application to produce scientific advancements.
- The potential for code development and optimizations to be transferred to the broader community through libraries, algorithms, kernels, or community codes.
- Resources available from the application team to match NERSC and vendor resources.
“NESAP is a really successful program, and part of what we can take advantage of is the development of tools and expertise and knowledge that happened over the last couple of years in NESAP and apply that to the data-intensive codes,” said Rollin Thomas, a big data architect in the DAS group who is coordinating the NESAP for Data call for proposals. “We hope that people who apply will see that NESAP has generated a lot of interest and has built up a portfolio of expertise and best practices and that they really know how to approach the problem of scaling on KNL.”
About NERSC and Berkeley Lab
The National Energy Research Scientific Computing Center (NERSC) is a U.S. Department of Energy Office of Science User Facility that serves as the primary high performance computing center for scientific research sponsored by the Office of Science. Located at Lawrence Berkeley National Laboratory, NERSC serves almost 10,000 scientists at national laboratories and universities researching a wide range of problems in climate, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a DOE national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. Department of Energy.