DESI Early Data Release Holds Nearly Two Million Objects
NERSC makes the first batch of data from the Dark Energy Spectroscopic Instrument available for researchers to mine
June 13, 2023
The universe is big, and it’s getting bigger. To study dark energy, the mysterious force behind the accelerating expansion of our universe, scientists are using the Dark Energy Spectroscopic Instrument (DESI) to map more than 40 million galaxies, quasars, and stars. Today, the collaboration publicly released its first batch of data, with nearly two million objects for researchers to explore.
The 80-terabyte data set comes from 2,480 exposures taken over six months during the experiment’s “survey validation” phase in 2020 and 2021. In this period between turning the instrument on and beginning the official science run, researchers made sure their plan for using the telescope would meet their science goals – for example, by checking how long it took to observe galaxies of different brightness and by validating the selection of stars and galaxies to observe.
“The fact that DESI works so well and that the amount of science-grade data it took during survey validation is comparable to previous completed sky surveys is a monumental achievement,” said Nathalie Palanque-Delabrouille, co-spokesperson for DESI and a scientist at the Department of Energy’s Lawrence Berkeley National Laboratory (Berkeley Lab), which manages the experiment. “This milestone shows that DESI is a unique spectroscopic factory whose data will not only allow the study of dark energy but will also be coveted by the whole scientific community to address other topics, such as dark matter, gravitational lensing, and galactic morphology.”
Today the collaboration also published a set of papers related to the early data release, which include early measurements of galaxy clustering, studies of rare objects, and descriptions of the instrument and survey operations. The new papers build on DESI’s first measurement of the cosmological distance scale that was published in April, which used the first two months of routine survey data (not included in the early data release) and also showed DESI’s ability to accomplish its design goals.
Capturing the Faint Light of Distant Celestial Objects
DESI uses 5,000 robotic positioners to move optical fibers that capture light from objects millions or billions of light-years away. It is the most powerful multi-object survey spectrograph in the world, able to measure light from more than 100,000 galaxies in one night. That light tells researchers how far away an object is, building a 3D cosmic map.
“Survey validation was very important for DESI because it allowed us – before starting the main survey – to adjust our selection of all the objects, including stars, bright galaxies, luminous red galaxies, emission line galaxies, and quasars,” said Christophe Yeche, a scientist with the French Alternative Energies and Atomic Energy Commission (CEA) who co-leads the target selection group. “We’ve been able to optimize our selection and confirm our observation strategy.”
Researchers took detailed images in 20 different directions on the sky, creating a 3D map of 700,000 objects and covering roughly 1% of the total volume DESI will study. With the instrument and survey plan successfully tested, the main DESI survey is now filling in the gaps between those observations.
As the universe expands, it stretches light’s wavelength, making it redder – a characteristic known as redshift. The farther away the galaxy, the bigger the redshift. DESI specializes in collecting redshifts that can then be used to solve some of astrophysics’ biggest puzzles: what dark energy is and how it has changed throughout the universe’s history.
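For readers who want to see the arithmetic behind a redshift, here is a minimal Python sketch of the standard definition, z = (observed wavelength − rest wavelength) / rest wavelength. It is an illustration only and not DESI's processing code; the function names and the example wavelengths are assumptions chosen for this sketch. The only physical constant used is the rest wavelength of hydrogen's Lyman-alpha line, about 1215.67 angstroms.

```python
# Illustrative sketch of the redshift definition; not DESI pipeline code.
# Function names and example values are hypothetical.

# Rest-frame wavelength of the hydrogen Lyman-alpha line, in angstroms.
LYMAN_ALPHA_REST = 1215.67


def redshift(observed, rest):
    """Return z = (observed - rest) / rest for a single spectral line."""
    if rest <= 0:
        raise ValueError("rest wavelength must be positive")
    return (observed - rest) / rest


def observed_wavelength(rest, z):
    """Return the wavelength at which a line emitted at `rest` is seen."""
    return rest * (1.0 + z)


if __name__ == "__main__":
    # A line emitted at 5000 angstroms and seen at 10000 angstroms
    # has had its wavelength doubled, which corresponds to z = 1.
    print(redshift(10000.0, 5000.0))

    # Where Lyman-alpha would appear for a hypothetical quasar at z = 2.
    print(observed_wavelength(LYMAN_ALPHA_REST, 2.0))
```

In practice a survey measures redshift from many features across a whole spectrum rather than from one line, so this sketch shows only the underlying relationship.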
While DESI’s primary goal is understanding dark energy, much of the data can also be used in other astronomical studies. For example, the early data release contains detailed images from some well-known areas of the sky, such as the Hubble Deep Field.
“There are some well-trodden spots where we’ve drilled down into the sky,” said Stephen Bailey, a scientist at Berkeley Lab who leads data management for DESI. “We’ve taken valuable spectroscopic images in areas that are of interest to the rest of the community, and we’re hoping that other people will take this data and do additional science with it.”
Two interesting finds have already surfaced: evidence of a mass migration of stars into the Andromeda galaxy and incredibly distant quasars, the extremely bright and active supermassive black holes sometimes found at the center of galaxies.
“We observed some areas at very high depth. People have looked at that data and discovered very high redshift quasars, which are still so rare that basically, any discovery of them is useful,” said Anthony Kremin, a postdoctoral researcher at Berkeley Lab who led the data processing for the early data release. “Those high-redshift quasars are usually found with very large telescopes, so the fact that DESI – a smaller, four-meter survey instrument – could compete with those larger, dedicated observatories was an achievement we are pretty proud of and demonstrates the exceptional throughput of the instrument.”
Turning Data into Useful Information
Survey validation was also a chance to test the process of transforming raw data from DESI’s ten spectrometers (which split a galaxy’s light into different colors) into useful information.
“If you looked at them, the images coming directly from the camera would look like nonsense – like lines on a weird, fuzzy image,” said Laurie Stephey, a data architect at the National Energy Research Scientific Computing Center (NERSC), which processes and stores DESI’s data. “The magic happens in the processing and the software being able to decode the data. It’s exciting that we have the technology to make that data accessible to the research community and that we can support this big question of ‘what is dark energy?’”
A Breakthrough Project for Python
DESI’s early data was a unique project for NERSC. All of the experiment’s code, including the computational heavy lifting, is written in the programming language Python rather than the traditional C++ or Fortran.
“That was the first time that using pure Python was shown to be a feasible approach for a major experiment at NERSC, and since then, Python has become increasingly common in our user workload,” Stephey said.
The DESI early data release is now freely available through NERSC.
Much More to Come
There is plenty of data yet to come from the experiment. DESI is currently two years into its five-year run and ahead of schedule on its quest to collect more than 40 million redshifts. The survey has already cataloged more than 26 million astronomical objects in its science run and is adding more than a million per month.
DESI is supported by the DOE Office of Science and by the National Energy Research Scientific Computing Center, a DOE Office of Science user facility. Additional support for DESI is provided by the U.S. National Science Foundation, the Science and Technology Facilities Council of the United Kingdom, the Gordon and Betty Moore Foundation, the Heising-Simons Foundation, the French Alternative Energies and Atomic Energy Commission (CEA), the National Council of Science and Technology of Mexico, the Ministry of Science and Innovation of Spain, and by the DESI member institutions.
Kitt Peak National Observatory is a program of NSF’s NOIRLab.
The DESI collaboration is honored to be permitted to conduct scientific research on Iolkam Du’ag (Kitt Peak), a mountain with particular significance to the Tohono O’odham Nation.
About NERSC and Berkeley Lab
The National Energy Research Scientific Computing Center (NERSC) is a U.S. Department of Energy Office of Science User Facility that serves as the primary high performance computing center for scientific research sponsored by the Office of Science. Located at Lawrence Berkeley National Laboratory, NERSC serves almost 10,000 scientists at national laboratories and universities researching a wide range of problems in climate, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a DOE national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. Department of Energy.