Charlson Kim

FES Requirements Worksheet

1. Project Information - Plasma Science and Innovation Center

Document Prepared By	Charlson Kim
Project Title	Plasma Science and Innovation Center
Principal Investigator	Charlson Kim
Participating Organizations	U. Washington, U. Wisconsin-Madison, Utah State, NRL
Funding Agencies	DOE SC DOE NSA NSF NOAA NIH Other:

2. Project Summary & Scientific Objectives for the Next 5 Years

Please give a brief description of your project - highlighting its computational aspect - and outline its scientific objectives for the next 3-5 years. Please list one or two specific goals you hope to reach in 5 years.

The goal is to develop practical and accurate methods that capture the physics needed for predictability, in user-friendly codes that take full advantage of state-of-the-art computers with thousands or more processors. The new methods are being incorporated into the 3D codes, and results are compared with data from all participating experiments. The methods will be refined as needed until predictive capability is achieved for all experiments. A long-term goal is the development of design tools that will lead to the rapid and cost-effective advancement of fusion experiments and of basic plasma science in general.

3. Current HPC Usage and Methods

3a. Please list your current primary codes and their main mathematical methods and/or algorithms. Include quantities that characterize the size or scale of your simulations or numerical experiments; e.g., size of grid, number of particles, basis sets, etc. Also indicate how parallelism is expressed (e.g., MPI, OpenMP, MPI/OpenMP hybrid)

NIMROD - 3D extended MHD initial value, finite elements and spectral spatial representation, implicit time advance, with PIC and continuum methods in development. The largest fluid computations advance on the order of a dozen independent variables on millions of spatial grid points over tens of thousands of timesteps. The hybrid kinetic-MHD delta-f PIC module typically runs millions of particles for linear simulations, although simulations with hundreds of millions of particles have been successfully demonstrated. The parallelism is accomplished through MPI.

HIFI/SEL - 3D extended MHD initial value, modal finite elements in all 3 dimensions, fluid capabilities similar to NIMROD. Parallelism is also through MPI (primarily through the PETSc package).

PSITET - 3D equilibrium solver using tetrahedral elements. Parallelism is hybrid OpenMP/MPI.

3b. Please list known limitations, obstacles, and/or bottlenecks that currently limit your ability to perform simulations you would like to run. Is there anything specific to NERSC?

Simulations are limited by the scaling of sparse linear solvers.

3c. Please fill out the following table to the best of your ability. This table provides baseline data to help extrapolate to requirements for future years. If you are uncertain about any item, please use your best estimate to use as a starting point for discussions.

Facilities Used or Using	NERSC OLCF ALCF NSF Centers Other:
Architectures Used	Cray XT IBM Power BlueGene Linux Cluster Other: SGI Altix
Total Computational Hours Used per Year	600000 Core-Hours
NERSC Hours Used in 2009	300000 Core-Hours
Number of Cores Used in Typical Production Run	100-300
Wallclock Hours of Single Typical Production Run	10-30
Total Memory Used per Run	100s? GB
Minimum Memory Required per Core	1-4 GB
Total Data Read & Written per Run	10s GB
Size of Checkpoint File(s)	? GB
Amount of Data Moved In/Out of NERSC	? GB per
On-Line File Storage Required (For I/O from a Running Job)	? TB and Files
Off-Line Archival Storage Required	? TB and Files
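
As a rough consistency check on the baseline table, the quoted ranges can be multiplied out. This is only a sketch: it assumes every production run falls inside the quoted ranges and that all 600000 core-hours go to production runs, neither of which the worksheet states. The variable and function names are illustrative.

```python
# Rough consistency check on the baseline table.
# Inputs are the ranges quoted in the table; treating them as hard bounds
# and ignoring non-production usage are assumptions made for illustration.
cores_per_run = (100, 300)         # cores in a typical production run
wallclock_hours = (10, 30)         # wallclock hours of a typical run
memory_per_core_gb = (1, 4)        # minimum memory required per core
total_core_hours_per_year = 600_000


def core_hours(cores, hours):
    """Core-hours consumed by one run."""
    return cores * hours


# Cheapest and costliest single run within the quoted ranges.
cheapest_run = core_hours(cores_per_run[0], wallclock_hours[0])
costliest_run = core_hours(cores_per_run[1], wallclock_hours[1])

# Number of runs the yearly allocation would cover at either extreme.
max_runs_per_year = total_core_hours_per_year // cheapest_run
min_runs_per_year = total_core_hours_per_year // costliest_run

# Aggregate memory implied by the per-core figure.
memory_low_gb = cores_per_run[0] * memory_per_core_gb[0]
memory_high_gb = cores_per_run[1] * memory_per_core_gb[1]

print(f"core-hours per run: {cheapest_run} to {costliest_run}")
print(f"runs per year:      {min_runs_per_year} to {max_runs_per_year}")
print(f"memory per run:     {memory_low_gb} to {memory_high_gb} GB")
```

Under these assumptions a single run costs between 1000 and 9000 core-hours, and the implied aggregate memory of 100 to 1200 GB is in line with the "100s? GB" estimate in the table.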

Please list any required or important software, services, or infrastructure (beyond supercomputing and standard storage infrastructure) provided by HPC centers or system vendors.

SuperLU and supporting software, VisIt

4. HPC Requirements in 5 Years

4a. We are formulating the requirements for NERSC that will enable you to meet the goals you outlined in Section 2 above. Please fill out the following table to the best of your ability. If you are uncertain about any item, please use your best estimate to use as a starting point for discussions at the workshop.

Computational Hours Required per Year	500000
Anticipated Number of Cores to be Used in a Typical Production Run	100s
Anticipated Wallclock to be Used in a Typical Production Run Using the Number of Cores Given Above	20
Anticipated Total Memory Used per Run	100s? GB
Anticipated Minimum Memory Required per Core	2 GB
Anticipated total data read & written per run	10s? GB
Anticipated size of checkpoint file(s)	? GB
Anticipated Amount of Data Moved In/Out of NERSC	? GB per
Anticipated On-Line File Storage Required (For I/O from a Running Job)	? TB and Files
Anticipated Off-Line Archival Storage Required	? TB and Files

4b. What changes to codes, mathematical methods and/or algorithms do you anticipate will be needed to achieve this project's scientific objectives over the next 5 years.

improved solvers

improved PIC algorithms

4c. Please list any known or anticipated architectural requirements (e.g., 2 GB memory/core, interconnect latency < 1 μs).

low latency always helps

4d. Please list any new software, services, or infrastructure support you will need over the next 5 years.

improved workflow services: an easy way to set up and queue batches of runs, with on-the-fly postprocessing and posting of data to an accessible website.

a web-based, checkbox-driven way of specifying files and directories for backup services

continued access to a dedicated graphics machine with a VisIt server

module support for NIMROD executables (and other flagship codes)

direct support for the massive undertaking required by the anticipated paradigm shift in parallel computing foreshadowed in the next question
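
To make the first request above more concrete (batches of runs with on-the-fly postprocessing), here is a minimal sketch of the pattern. It is not an existing NERSC service or part of any of the codes named in this worksheet. The functions `run_batch`, `fake_simulate`, and `summarize` and the parameter names are hypothetical, the "simulation" is a placeholder for submitting a real batch job, and the step of posting results to a website is left out.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_batch(param_sets, simulate, postprocess, max_workers=4):
    """Run every parameter set and postprocess each result as soon as
    its run finishes, rather than waiting for the whole batch."""
    summaries = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each pending run back to the parameters that launched it.
        futures = {pool.submit(simulate, p): p for p in param_sets}
        for finished in as_completed(futures):
            params = futures[finished]
            summaries.append(postprocess(params, finished.result()))
    return summaries


def fake_simulate(params):
    """Hypothetical placeholder: a real version would submit a batch job
    and block until its output is available."""
    return {"steps_completed": params["steps"]}


def summarize(params, result):
    """Hypothetical postprocessing step: reduce one run to a summary line."""
    return f"{params['name']}: {result['steps_completed']} steps"


batch = [
    {"name": "scan_a", "steps": 100},
    {"name": "scan_b", "steps": 200},
]
# Runs may finish in any order, so sort before printing.
report = sorted(run_batch(batch, fake_simulate, summarize))
for line in report:
    print(line)
```

The point of the pattern is that postprocessing is triggered per run as results arrive, so partial results from a long scan are available before the last job completes.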

4e. It is believed that the dominant HPC architecture in the next 3-5 years will incorporate processing elements composed of 10s-1,000s of individual cores, perhaps GPUs or other accelerators. It is unlikely that a programming model based solely on MPI will be effective, or even supported, on these machines. Do you have a strategy for computing in such an environment? If so, please briefly describe it.

Aside from hiring a graduate student and hoping for the best, I have not yet devised a strategy for an unspecified machine of unspecified architecture using an unspecified API.

New Science With New Resources

To help us get a better understanding of the quantitative requirements we've asked for above, please tell us: What significant scientific progress could you achieve over the next 5 years with access to 50X the HPC resources you currently have access to at NERSC? What would be the benefits to your research field if you were given access to these kinds of resources?

Please explain what aspects of "expanded HPC resources" are important for your project (e.g., more CPU hours, more memory, more storage, more throughput for small jobs, ability to handle very large jobs).

A raw increase in hardware power would most likely bring any significant physics research to a grinding halt while the codes are retooled to "take advantage" of "the increased performance" of the new machine.

The smaller-scale experiments run correspondingly smaller simulations than large tokamak simulations.
More throughput, with longer runtimes, for modest-sized jobs would be the most beneficial.