
Superfacility

Mission Statement

The Superfacility concept is a framework for integrating experimental and observational instruments with computational and data facilities. Data produced by light sources, microscopes, telescopes, and other devices can stream in real time to large computing facilities, where it can be analyzed, archived, curated, combined with simulation data, and served to the science user community via powerful computing, storage, and networking systems. Connected by high-speed programmable networking, this superfacility model is more than the sum of its parts: it enables discoveries across data sets, institutions, and domains, and makes data from one-of-a-kind facilities and experiments broadly accessible.

The NERSC Superfacility project identifies the technical and policy challenges this concept poses for an HPC center and coordinates the work to address them in partnership with target science teams. The project aims to ensure that the solutions developed are widely useful (rather than one-off engagements), extend to multiple user groups, and remain sustainable for NERSC staff to support.

Services in Development

Data Management and Sharing

We are developing and deploying tools to handle the large volumes of data generated by superfacility partners.

Data Transfer

  • Globus is our tool of choice for large data transfers. We have several optimized data transfer nodes that can access every file system at NERSC; a scripted example appears after this list.
  • We are working to offer a new interface into HPSS that eliminates much of the difficulty of bundling and uploading files.
  • A command-line tool for parallel transfers between file systems at NERSC (including HPSS) has been deployed on NERSC systems.
  • Batch system integration of data movement is being explored.
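
As an illustration, a large transfer can also be scripted with the Globus Python SDK. The sketch below is a minimal example, assuming a registered native-app client ID and placeholder endpoint UUIDs and paths; consult the Globus and NERSC data transfer documentation for the endpoints appropriate to your project.

    import globus_sdk

    # Placeholders: substitute your registered app's client ID and the
    # UUIDs of the source and destination Globus endpoints.
    CLIENT_ID = "YOUR-NATIVE-APP-CLIENT-ID"
    SRC_ENDPOINT = "SOURCE-ENDPOINT-UUID"  # e.g., an instrument's DTN
    DST_ENDPOINT = "DEST-ENDPOINT-UUID"    # e.g., a NERSC DTN

    # Interactive native-app login (one of several auth flows the SDK supports).
    auth_client = globus_sdk.NativeAppAuthClient(CLIENT_ID)
    auth_client.oauth2_start_flow()
    print("Log in at:", auth_client.oauth2_get_authorize_url())
    tokens = auth_client.oauth2_exchange_code_for_tokens(input("Auth code: ").strip())
    transfer_token = tokens.by_resource_server["transfer.api.globus.org"]["access_token"]

    tc = globus_sdk.TransferClient(
        authorizer=globus_sdk.AccessTokenAuthorizer(transfer_token)
    )

    # Describe the transfer and add one file to it
    # (directories need recursive=True in add_item).
    task = globus_sdk.TransferData(
        tc, SRC_ENDPOINT, DST_ENDPOINT, label="example transfer"
    )
    task.add_item("/data/raw/scan0001.h5", "/global/cfs/myproj/scan0001.h5")

    result = tc.submit_transfer(task)
    print("Submitted Globus task:", result["task_id"])

Because the transfer runs asynchronously between the two endpoints, the script can exit immediately and the task can be monitored later by its task ID.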

Data Discovery

  • The NERSC Data Dashboard lets you see where your data is on the Project file system.
  • A PI Dashboard is under development to allow PIs to address common issues (like permission drift) for the data they control.

Data Sharing

  • Spin, a service platform for deploying science gateways, has been successfully deployed.
  • Globus Sharing has been enabled for data on the Project file system.

The Superfacility API

We recognize that automation is an important driver for the experimental and observational facilities we work with. Automated experiment pipelines need to interact with NERSC without a human in the loop: moving data, launching compute jobs, and managing access. In response to this emerging and growing need, NERSC has developed a REST-based API for many common functions and queries on our systems. For example, a user can query NERSC center status, submit or query the status of jobs, transfer data into and within NERSC, and get information about other users in their project.
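
As a minimal sketch of what this looks like in practice, the unauthenticated center-status query below uses the base URL of one published version of the API; the version segment and response fields may change, so treat the NERSC API documentation as authoritative.

    import requests

    # Base URL for the Superfacility API; the version segment may change.
    API_BASE = "https://api.nersc.gov/api/v1.2"

    # Center status is a read-only query and requires no authentication.
    resp = requests.get(f"{API_BASE}/status")
    resp.raise_for_status()

    # Field names here reflect one version of the API's response schema.
    for system in resp.json():
        print(system["name"], "->", system["status"])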

The API is under active development; we are continually adding and refining functionality based on the needs of our partner facilities. For up-to-date information, please see the NERSC API documentation.

The Superfacility Demo Series (May 2020)

In May 2020, the Superfacility Project held a series of virtual demonstrations of tools and utilities developed at ESnet and NERSC to support the needs of experimental scientists.

May 6, 2020, noon PT

SENSE: Intelligent Network Services for Science Workflows

Xi Yang and the SENSE team

The Software-defined network for End-to-end Networked Science at Exascale (SENSE) is a model-based orchestration system that operates between the SDN layer controlling the individual networks/end-sites and science workflow agents/middleware. The SENSE system includes Network Resource Manager and End-Site Resource Manager components, which enable advanced features in the areas of multi-resource integration, real-time responsiveness, and workflow middleware interactions.

The demonstration will show the status of ongoing work to integrate SENSE services with domain science workflows, such as those envisioned for DOE Superfacility operations. A common vision for these integrations is the provisioning of SENSE Layer 2 and Layer 3 services based on knowledge of current and planned data transfers. SENSE allows workflow middleware to redirect traffic at granularities ranging from a single flow to a specific end system to an entire end site onto the desired SENSE-provisioned services. The SENSE Layer 2 services provide deterministic end-to-end resource guarantees, including the network and Data Transfer Node (DTN) elements. The SENSE Layer 3 service provides mechanisms for directing desired traffic onto a specific Layer 3 VPN (L3VPN) for policy and/or quality-of-service reasons.

SENSE (video)

SENSE (slides)

May 13, 2020, noon PT

Data Management Tools and Capabilities

Lisa Gerhardt and Annette Greiner

The PI Dashboard is a web portal that will allow PIs to address many of the common permission issues that come up when dealing with shared files on the Community File System.

GHI is a new GPFS/HPSS interface that offers the benefits of a more familiar file system interface for HPSS. Users often want to store complex directory structures or large bundles in HPSS, which can be difficult to do with the traditional HPSS access tools. GHI can move data between HPSS and the GPFS file system with a few simple commands.

NERSC has written several command-line data transfer scripts to help users integrate data transfers into their workflows. We will give a brief demo of these scripts.

Data Management Tools (video)

Data Management Tools (slides)

May 20, 2020, noon PT

Superfacility API: Automation for Complex Workflows at Scale

Gabor Torok, Cory Snavely, Bjoern Enders

The Superfacility API aims to enable the use of all NERSC resources through purely automated means using popular development tools and techniques. An evolution of its predecessor, NEWT, the newly designed API adds features to support complex, distributed workflows, such as placing future job reservations and registering API callbacks for asynchronous processes. It will also allow users to offload tedious tasks, such as large data movement, via simple REST calls.

While the Superfacility API is designed for non-interactive use, this demonstration will use a Jupyter notebook to step through a working example that calls the API to conduct a simple workflow process. The discussion will include additional information on planned API endpoints and authentication methods.
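
For a flavor of what one notebook cell in such a demonstration might contain, the hypothetical sketch below submits a batch script through a single REST call. The endpoint path, payload fields, and token handling are illustrative assumptions; the actual endpoints and OAuth flow are specified in the NERSC API documentation.

    import requests

    API_BASE = "https://api.nersc.gov/api/v1.2"  # version segment may change

    # Assumption for illustration: an OAuth access token obtained out of band.
    ACCESS_TOKEN = "..."
    headers = {"Authorization": f"Bearer {ACCESS_TOKEN}"}

    # Hypothetical job submission: POST the path of a batch script on the
    # target system; the exact route and payload fields are assumptions.
    resp = requests.post(
        f"{API_BASE}/compute/jobs/cori",
        headers=headers,
        data={"job": "/global/homes/u/user/run.sh", "isPath": True},
    )
    resp.raise_for_status()
    print(resp.json())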

Superfacility API (video)

Superfacility API (slides)

May 27, 2020, noon PT

Docker Containers and Dark Matter: An Overview of the Spin Container Platform with Highlights from the LZ Experiment

Cory Snavely, Quentin Riffard, Tyler Anderson

Spin is a container-based platform at NERSC designed for deploying science gateways, workflow managers, databases, API endpoints, and other network services to support scientific projects. Spin leverages the portability, modularity, and speed of Docker containers to let NERSC users quickly deploy pre-built software images or design their own. The underlying Rancher orchestration system provides a secure, managed infrastructure with access to NERSC systems, storage, and networks.

One project making use of Spin as part of its engagement with the Superfacility project is the LZ Dark Matter Experiment, which is preparing to operate a 10-ton, liquid-xenon-based detector a mile underground at the Sanford Underground Research Facility (SURF) in South Dakota. The collaboration of some 250 scientists and 37 research institutions is busily readying the detector and associated software and data systems.

Services that will run in Spin to support the LZ Experiment range from databases to data transfer monitoring and have been exercised during mock data challenges. In this demonstration, NERSC staff will give an overview of the Spin platform and show how a simple service is created in a few seconds. LZ staff will then describe the science of dark matter detection and give an overview of their work in Spin so far, focusing on the Event Viewer, a science gateway that allows researchers to examine significant detector events.

Docker Containers and Dark Matter (video)

Docker Containers and Dark Matter (slides)

June 3, 2020, noon PT

Jupyter

Matthew Henderson (with Shreyas Cholia and Rollin Thomas)

Large-scale “superfacility”-type experimental science workflows require support for a unified, interactive, real-time platform that can manage a distributed set of resources connected to High Performance Computing (HPC) systems. Here we demonstrate how the Jupyter platform plays a key role in this space: it provides the ease of use and interactivity of a web science gateway while giving scientists the ability to build custom, ad hoc workflows in a composable way. Using real-world use cases from the National Center for Electron Microscopy (NCEM), we show how Jupyter facilitates interactive data analysis at scale on NERSC HPC resources.

Jupyter Notebooks combine live executable code cells with inline documentation and embedded interactive visualizations. This allows us to capture an experiment in a fully contained executable Notebook that is self-documenting and incorporates live rendering of outputs and results as they are generated. The Notebook format lends itself to a highly modular and composable workflow, where individual steps and parameters can be adjusted on the fly. The Jupyter platform can also support custom applications and extensions that live alongside the core Notebook interface.

We will use real-world science examples to show how we create an improved interactive HPC experience in Jupyter, including:

  • Improvements to the NERSC JupyterHub deployment
  • Scaling up code in a Jupyter notebook to run on HPC resources through parallel task execution frameworks
  • Demonstrating the use of the Dask task framework as a backend to manage workers from Jupyter (see the sketch below)
  • Enabling project-wide workflows and collaboration through sharing and cloning notebooks and their associated software environments

We will also discuss related projects and potential future directions.
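
To make the Dask piece concrete, the sketch below shows one common pattern for scaling notebook code onto an HPC batch system with the dask-jobqueue package. The queue, account, and resource sizes are placeholders rather than a NERSC-specific recipe, and the demo's actual setup may differ.

    from dask.distributed import Client
    from dask_jobqueue import SLURMCluster

    # Placeholder resources: adjust queue, account, cores, and memory
    # to the target system's batch configuration.
    cluster = SLURMCluster(
        queue="regular",
        account="myproject",
        cores=32,
        memory="64GB",
        walltime="00:30:00",
    )
    cluster.scale(jobs=4)  # request four batch jobs' worth of Dask workers

    client = Client(cluster)  # the notebook now submits work to the cluster

    def analyze(frame_id):
        # Stand-in for a real per-frame analysis function.
        return frame_id ** 2

    # Fan tasks out to the workers and gather the results back.
    futures = client.map(analyze, range(100))
    print(sum(client.gather(futures)))

Because the cluster object lives in the notebook session, workers can be scaled up or down interactively as the analysis proceeds.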

Jupyter (video)

Jupyter (slides)

Papers and Posters Related to the Superfacility Model

Articles about the Superfacility Model

Talks about the Superfacility Model