DOE Establishes Probe Testbed for Storage-Intensive Applications
October 12, 1999
OAK RIDGE, Tenn. -- Today's scientific researchers are generating staggering volumes of data, with sources ranging from computational simulations of global and regional climate to digital instrumentation of physical experiments and satellite imagery. Add projects such as human genome mapping, with massive demands for rapid user access, and it's easy to see why strategies for optimizing data storage and retrieval are vital for research laboratories.
The Department of Energy's Oak Ridge National Laboratory (ORNL) and the National Energy Research Scientific Computing Center (NERSC) in Berkeley, Calif., are tackling the storage challenge with Probe, a newly established distributed testbed for storage-intensive applications. It combines the high-speed networking of the latest Energy Sciences Network (ESnet) technology with the R&D 100 award-winning High Performance Storage System (HPSS). Probe will have significant installations at ORNL and NERSC, providing access to researchers around the country.
The Probe testbed is available for researchers from the scientific community to perform comparative evaluations of the latest technologies in storage hardware and software. By linking the two testbed systems together over the network, researchers will be able to evaluate the effects of network latency in remote storage access and develop new protocols for effectively using distributed storage systems. The testbed will also provide a platform for the developers of new storage and networking hardware and software to test their devices in high-demand facilities.
"With the flood of genome data coming from human, model organism and microbial sequencing projects, better and faster tools that facilitate comparative genome analyses will be absolutely critical," said Marvin Frazier of DOE's Office of Biological and Environmental Research. "New physical systems and novel design configurations will be crucial to enabling these research efforts to fulfill their potential."
Researchers can modify or augment the configuration of Probe as needed, for instance, to perform comparative evaluations of equipment from various vendors or to test the throughput of a proposed configuration. Probe will be used to study strategies for exploiting wide-area, high-bandwidth networks connecting terascale data archives across the country. With a variety of network technologies installed, Probe can be used to explore new methods for high-speed transfers from storage to remote visualization systems.
"Extracting scientific understanding from petabytes of data will be a critical challenge for the Department of Energy and the U.S. scientific community in the next decade," said Dan Hitchcock of DOE's Office of Science. "The Probe testbed will enable important experiments in the technology needed to address this challenge. We believe that having this experimental testbed available to the data storage and management research community will accelerate progress in this important area."
Probe is funded by DOE. HPSS, a software system providing gigabyte-per-second throughput and petabyte-scale capacity, is a development of IBM Global Services and five DOE installations (ORNL, Los Alamos, Lawrence Livermore and Sandia National Laboratories, and NERSC). HPSS is in production use at many of the leading research institutions, supercomputer centers and universities around the world.
ORNL is a DOE facility engaged in many areas of research, including high-performance computing, networking and storage, heterogeneous distributed computing, collaborative technologies, applied mathematics and physics-based modeling and simulation. ORNL is managed by Lockheed Martin Energy Research Corporation. NERSC, established in 1974, conducts research in such fields as scientific data management, distributed computing, data-intensive computing and advanced systems architecture. ESnet is a high-speed network serving thousands of researchers at more than 30 DOE sites. It allows scientists around the country to utilize unique DOE facilities and high-performance computing resources.
About NERSC and Berkeley Lab
The National Energy Research Scientific Computing Center (NERSC) is a U.S. Department of Energy Office of Science User Facility that serves as the primary high performance computing center for scientific research sponsored by the Office of Science. Located at Lawrence Berkeley National Laboratory, NERSC serves almost 10,000 scientists at national laboratories and universities researching a wide range of problems in climate, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a DOE national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. Department of Energy.