NERSC Implements Organizational Changes to Better Address Evolving Data Environment
February 23, 2015
Contact: Jon Bashor, jbashor@lbl.gov, 510-486-5849
Sudip Dosanjh, director of the Department of Energy’s National Energy Research Scientific Computing Center, announced several organizational changes to help the center’s 6,000 users more productively manage their data-intensive research. The changes took effect Monday, Feb. 23.
NERSC’s Storage Systems Group will become part of the Services Department, with the intent of building greater collaboration with the Data & Analytics Services Group. Additionally, Katie Antypas, who heads the Services Department, is being named NERSC Deputy for Data Science.
“Data science is a cross-cutting thrust for NERSC and Katie will be responsible for organizing our work in this area and furthering our data strategy,” Dosanjh said. “This effort will require close collaboration between the Storage Systems and Data & Analytics Services groups at NERSC, in addition to other groups in NERSC, Computational Research Division and ESnet.”
Antypas said the line between storage and memory is becoming increasingly blurred and the focus now needs to be on how data is moved, managed, and analyzed on the deepening memory and storage hierarchies within the Center.
“From our users’ perspective, this approach will provide a more coherent structure and result in improved tools and capabilities to help them manage and move data between the different layers of memory and storage,” Antypas said. “When you look at the architectures coming down the road, it’s evident that the lines between memory and storage are blurring. For example, in our newest system, Cori, there will be a Burst Buffer, a layer of flash storage between the system’s memory and file system, so it just makes sense that our Storage group and our Data and Analytics group will need to work together to make it and future services a success.”
As the DOE Office of Science’s primary center for scientific computing, NERSC is also a leading data center and has become a net importer of data, much of it observational data, like that gathered by telescopes, and experimental data, including results from experiments at light sources and particle accelerators.
Last year, scientists and engineers from NERSC, ESnet and the Computational Research Division worked with researchers at other DOE national laboratories to develop a series of science data pilot projects. Two of the projects focused on combining simulation capabilities with experimental data from the Advanced Light Source at Berkeley Lab, using ESnet and supercomputers at NERSC and Oak Ridge National Laboratory. Another project sought to automate the distribution, analysis and return of cosmological simulation datasets between five supercomputing centers in order to keep up with an ever-increasing flow of data.
“Historically, NERSC has done a good job of handling data challenges in high performance computing, providing both state-of-the-art systems and specialized support services,” said Prabhat, head of NERSC’s Data & Analytics Services Group. “And now we are positioned to tackle this new class of data challenges involving simulation, experimental, and observational data. Our goal is to better utilize our existing infrastructure so we can plan and prioritize for handling these new modalities of data.”
Jason Hick, who leads the Storage Systems Group, said he’s excited by the potential to further support users as his group expands its role.
The Storage Systems Group has supported NERSC’s science data portals, or science gateways, which provide researchers access to large datasets, such as observational data from telescopes and neutrino observatories. The portals also support closer collaboration by research team members.
Hick’s team has also provided specially tuned data transfer nodes, or DTNs, for moving data as quickly as possible between sites by eliminating potential bottlenecks where the network connects to facilities.
“Usually, we associate DTNs with the hardware, but it’s really a service and we are looking for ways to improve this service to our users by abstracting the functionality they want from a specific piece of hardware and providing it from a larger resource pool,” Hick said. “This is one example of the increasing demand we’re seeing for data services and we see this new organizational model as an integrated approach to meeting those needs.”
“More than just being moves on an org chart, these changes will help us meet the evolving nature of the data our users rely on for their scientific discoveries,” Antypas said. “Not only are datasets getting larger, the synthesis of data from experiments and simulations is now a vital part of the discovery process. By being able to effectively help our users access, manage and derive meaning from their data in this changing environment, NERSC will play an even greater role in their scientific productivity.”
About NERSC and Berkeley Lab
The National Energy Research Scientific Computing Center (NERSC) is a U.S. Department of Energy Office of Science User Facility that serves as the primary high performance computing center for scientific research sponsored by the Office of Science. Located at Lawrence Berkeley National Laboratory, NERSC serves almost 10,000 scientists at national laboratories and universities researching a wide range of problems in climate, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a DOE national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. Department of Energy.