Minutes of the ERSUG/EXERSUG Meeting

Pacific Northwest National Laboratory

Richland, WA

Jan 12-13, 1995

Prepared by: Brian Hingerty

Vice-Chair/Secretary

beh@ornl.gov

Agenda

Added topics from EXERSUG (Jack Byers)

 

--Exersug interaction with ESSC, DCCC

--How to get Exersug more involved.

--Exersug membership-- how to encourage new blood?

--Exersug change of chair; need to elect new vice-chair

--Fallout, reaction to Exersug Letter.

--Fallout, reaction to Exersug contribution to OER Summaries

(a kind of shortened version of the green book; our deadline is, I think, Dec 30)

--Various prioritizing, dialog issues raised during Exersug letter writing.

-- what positive/negative effects to expect by pushing for SMPs?

-- push too hard-- endanger MPP

-- push too hard-- oversell beyond what they can really do.

-- don't push hard enough-- get inappropriate mix, weaker models

--from Braams: Please make sure that there is a suitable occasion there
for a frank discussion about priorities. I would like us to be informed
about, and to contribute to, changes in NERSC's thinking about the MPP
acquisition, in view of market developments over the past nine months.

-- Jardin -- NERSC cost-effectiveness

-- Herrmannsfeldt -- history lesson re: effectiveness of the present model,

a strong central facility, vs. spreading money out to local sites.

-- All -- what is the proper mix of high-end SMPs at NERSC vs. (probably)

lower-end platforms at local sites?

 

Morning Session (Thursday, Jan 12, 1995)
GENERAL NOTES:
- Jack Byers has graduated to emeritus status at the end of this meeting
  and Brian Hingerty has become the new chairperson.
- Rick Kendall was chosen as the new Vice-Chair/Secretary as of the end
  of this meeting.
- Reaction to the Fusion letter at DOE was not favorable as per J-N Leboeuf.

Welcoming Remarks - Jack Byers
------------------------------

Jack Byers' comments on need for a new green book for DOE
---------------------------------------------------------

EXERSUG members:
Note Kitchens' appeal for a stronger ERSUG.
We all need to work on this.  E-mail between ourselves and Kitchens
clearly isn't enough.  We need plans, suggestions, and mechanisms we can
use that we now don't have or don't use.

Also, following is more of a push for us to get going on the green book.
I will need help.  I am starting to work with McCoy and Mirin (from NERSC)
on the division of work between NERSC and EXERSUG.
I am presently struggling with my version of a statement, from the users'
point of view, of needs and requirements, and trying to see that it
fits in with a statement of the NERSC vision by Mike McCoy.
When he and I reach some partial agreement I will send it to you
for editing, modification, etc.  My present idea is that the users'
statement ought to be independent of NERSC or OSC or anybody.
If that makes sense, the NERSC vision would naturally stand as a response
to the users' statement of needs.

It might make sense to plan to have the ERSUG users' statement targeted
elsewhere also, i.e., not to use it only for the green book. This
might serve as an initial action in making ERSUG stronger. Ideas for targets?

I will need help from all of you, at least on the science accomplishments
sections in your areas. If you can't do this yourselves, please at least
take responsibility for farming it out to people in your discipline.
Potter has agreed to do the climate section.

I have a lot of good material (3 ERPUS papers) on QCD.  I will take a first cut
at pulling out a statement of needs and accomplishments from those papers.
But I will need a high-energy physics person to at least edit that, and
perhaps even rewrite what I produce.

There is some more material from ERPUS that you might use as starting
points, though the most complete ERPUS seemed to be the QCD papers and
the ocean modeling paper by Malone. Contact me for a list of what
I have. I haven't got anything from the ERPUS papers of Leboeuf, Hammett,
Colella, Kendall, and others, I think.

You also should look at the previous green book to see what is involved.
If you don't have copies, e-mail Kitchens for them.

There is a possibility that NERSC will hold a workshop to bring the
green book together, early next year.  This is NOT to suggest that we
are off the hook, but rather to point out that all of the rough drafting must
be complete by then, and probably we should try to have each individual
science section completed in final form, so that the meeting could then fill in
the holes, stitch together the pieces, and make coherent summary statements.

Washington View - Tom Kitchens
------------------------------
-A new science committee is being formed for oversight and guidance
-10% cuts are coming pretty much across the board
-The Distributed Computing Coordinating Committee (DCCC) needs more
 interaction from the users (ERSUG and EXERSUG): how will what they are
 developing affect what the users need, etc.?

Welcome from PNL - Rick Kendall
-------------------------------
-evening meeting in conference room 2 of the Tower Inn

NERSC Production Environment: Plans for 1995-96
-----------------------------------------------

General Overview - Bill McCurdy
-------------------------------
-microprocessor revolution - symmetric multiprocessors (SMPs) now available
-high end computers not being sold
-defense programs in DOE interested in computers
-DCCC evolving (AFS, X-windows etc)
-need for integrated environment
-unified production environment
-SMP - symmetric multiprocessor (shared memory, 32 processors)

Mike McCoy - Unified Production Environment
-------------------------------------------
-SMP from SGI (12 nodes) available in 6 months
-RFP Jan 95 (winner June/July)
-PEP delivery Aug 95
-FCM Aug 96
-draft write-ups available by request (mgm@nersc.gov)
-DCA - development computer assimilation

Unification of the Production Environment - Moe Jette
-----------------------------------------

Systems Administration Thrusts
------------------------------
-Provide our clients with the ability to exploit the diverse computational
 and storage resources of NERSC with the ease of a single computer system.

Local Services
--------------
-authentication
-storage
-computation
-batch
-networking

NERSC Services
--------------
-authentication
-storage
-computation
-batch
-networking

Hardware Components
-------------------
-Diverse Computation Resources
 Vector Supercomputers
 Massively Parallel Supercomputers
 Workstations (SAS and Desktop)

-Diverse Storage Resources
 Andrew File System (AFS)
 Distributed File Service (DFS)
 Common File System (CFS)
 National Storage Lab (NSL) technology based
 High Performance Storage System (HPSS)

-High Speed Interconnect
 Local Area Network (LAN)
 Wide Area Network (WAN,ESNET)

Software Components
-------------------
-Uniform operating system (present)
 UNIX (preferably POSIX compliant)

-Global Authentication (1996-first half)
 Single-use passwords and Kerberos

-NERSC Resource Allocation and Accounting (1995- first half)
 Centralized User Bank (CUB)

-Global File Systems (1995-1996)
 AFS Server (present)
 DFS Server (1995-first half)
 AFS and DFS Integrated with Archive (1996+)

-Global Batch and Interactive Computing (1995-2nd half)
 Network Queuing Environment (NQE)
 Portable Batch System (PBS)
 Load Sharing Facility (LSF)
 global job submission, monitoring and execution

Other System Components
-----------------------
-Security (1995-1st half)
 protection of client and system information

-Integrated management (1995-1st half)
 project leader
 uniform environment for system administrators
 uniform environment for clients
 integrated "trouble ticket" system

-Licensed software (1995-2nd half)
 convenient access to third-party software

Historical NERSC Development Paradigm
-------------------------------------
SAS: pre-processing and post-processing

Supercomputer
-------------
code generation
compilation
libraries
run
debug
performance analysis
post-processing

Characteristics:
----------------
explicit file transfers
code development done on supercomputer
some pre- and post-processing done on SAS
slow response
limited tool set

Integrated Development Paradigm
-------------------------------
NERSC Interface
---------------
authentication
code generation
compilation
libraries
run
debug
performance analysis
pre-processing
post-processing

NERSC Services
-------------
supercomputers
other compute servers
global file system
load sharing
remote execution
common integrated toolkit
compatibility tools
global resource allocation and accounting

Characteristics:
---------------
global file system
single NERSC login
integrated, common software development tool kit
remote execution of user and system processes

Development Environment Milestones
----------------------------------
-porting codes to massively parallel systems
-access to massively parallel systems
-support of special parallel processing (SPP)
-Cray C90 for capability computing
-provide common home directories (through AFS)
-enhance SAS development environment
-document "how to use the unified production environment"
-partner with our clients
-acquire a symmetric multiprocessor (SMP)
-encourage MPP vendors to support integrated tool-kits
-encourage load sharing
-POSIX compliance
 
Development Environment
-----------------------
-many tools on all platforms
 -compilers
 -linkers/loaders
 -math libraries
 -C++ class libraries
 -debuggers
 -performance analyzers

-some tools only on selected platforms
 -source code generators  (GUI builders)
 -documentation preparation tools
 -computer aided software engineering (CASE) tools
 -source code control systems
------------------------------------------------------

Break

------------------------------------------------------

Mass Storage - Steve Louis
---------------------------
Project Leader, High Performance Storage
----------------------------------------

The Storage Role at NERSC
-------------------------
-Key element of the Unified Production Environment
 -Long-Term and High-Performance Storage (HPSS)
 -Medium-Term and Mid-Scale Storage (AFS)
-Provide solutions for use of new storage hardware
 -New storage integration architectures (NSL)
 -Cooperative software development (HPSS)
-Provide services that cannot be duplicated locally
 -Capacities in the hundreds of terabytes
 -Transfer rates in the hundreds of megabytes per second
 -Continuous 24-hour/day 7-day/week operation

NERSC Strategic Storage Goals
-----------------------------
-High service quality: reliability, availability, security (COTS)
-Scalable I/O facilities to narrow the "storage gap" (HPSS)
-Archival storage as local/shared file system (UPE,HPSS)
-Support for heterogeneous client environments (UPE,HPSS)
-Support for large data management systems (HPSS,HPCC)
-Policies that balance resources with user demand (UPE,CUB)
-Flexible administration of quotas and charging (UPE,CUB)
-Import/export mechanisms for "user-owned" media (???)

Milestones for Storage Hardware
-------------------------------
-Acquisition of NSL-Technology Base System
 -IBM 3494 robotic systems (Nov 94)
 -IBM RS/6000 and 100GB disk (Jan 95)
 -NSL-Unitree commercial software (Feb 95)
 -System integration and startup (Mar 95)
-Interim upgrades to Base System
 -Additional Ports on HIPPI switch (now)
 -New SCSI-2 or HIPPI disk array (Spring 95)
 -New NTP tape and storage units (Summer 95)
 -Possible early conversion to HPSS (Fall 95)
-Acquisition of a Fully Configured Storage System in FY 96
 -Implementation Plan to DOE (Mar 95)
 -Specifications written (Jun 95)
 -Vendor solicitation (Aug 95)
 -Vendor selection and award (Nov 95)
 -Delivery of hardware/software (Jan 96)
 -System integration and startup (Mar 96)

Description of IBM RISC System 6000 Model 7015-R24
--------------------------------------------------
Typical HPSS Configuration
--------------------------
-Available from Steve Louis (louis@nersc.gov)

NERSC Storage Infrastructure Costs
----------------------------------
Budget Year           Disk Cost in $/MB       Tape Cost in $/MB
-----------           -----------------       -----------------
FY86                    $16.70                   N/A
FY87                    $45.28 (1)               N/A
FY88                    $14.10                   $0.320
FY89                    $12.39                   $0.265
FY90                    $10.94                   $0.236
FY91                    $ 9.44                   $0.104
FY92                    $ 7.44                   $0.092
FY93                    $ 5.44                   $0.045
FY94                    $ 3.34                   $0.026
FY95                    $ 2.50 (est.)            $0.015 (est.)
FY96                    $ 1.75 (est.)            $0.005 (est.)
FY2001                  $ 0.35 (est.)            $0.0001 (est.)

(1) Includes new mainframes, software, servers, adapters, controllers

What Could I Have ONLINE (1) for $1,000,000 (2)
-----------------------------------------------
Year            Disk Devices (3)          Tape Devices (4,5)
----            ----------------          ------------------
1996            2 TB on 200 drives        1 TB on 50 drives
2001            15 TB on 400 drives       25 TB on 50 drives
2006            100 TB on 1000 drives     500 TB on 50 drives

1)Represents accessible data without mount operations
2)Drive costs only (excludes servers, controllers, robotics)
3)10GB/disk in '96; 37.5 GB/disk in '01; 100 GB/disk in '06
4)20 GB/tape in '96; 500 GB/tape in '01; 10,000 GB/tape in '06
5)CFS's current ONLINE to Total tape ratio is 1:1,000

-------------------------------------------------------------------

User Services and Information Systems - Jean Shuler
-------------------------------------

Building on a Foundation
------------------------

Traditional Role
----------------
 -Provide technical consulting services - act as a user advocate
 -Bring issues to attention
 -Coordinate and collaborate with NERSC scientists and researchers
 -Develop and provide technical training
 -Provide Software Quality Assurance - administer logging and
  tracking system
 -Provide current, accurate information to documentation system
 -Provide searching/browsing software for information development
  and retrieval system- move all documentation to Web server

Navigation Tools - provide help
----------------

User Services - Greater emphasis on collaborations with NERSC staff to best
-------------   utilize the new technologies

User Services and Information Systems
------------------------------------
New Focus Area Goals
--------------------
 -Provide new Single Interface Information Delivery System based on
  standards
 -Develop training utilizing new media - video on demand, video
  tele-conferencing, etc.
 -Provide technical expertise for collaboration and coordination with
  NERSC scientists and researchers - parallel programming techniques,
  visualization, optimization

Use of Netscape - for users to obtain information
---------------

Information Delivery System
---------------------------
 -Video on demand
 -Applications training
 -NERSC documentation
 -MAN pages
 -CrayDoc system
 -Vendor Help Packages (NAG)
 -NERSC online documentation
 -Web information database
 -Logging and tracking system

*The goal of the information delivery system is to deliver the needed
 information anytime, anywhere.

Examples of Databases to be linked with browsing and search tools
-----------------------------------------------------------------
-NERSC developed documentation (Intrograph, Accounting)
-Web databases (NERSC Home Page, ERPUS talks, ESnet)
-BUFFER newsletter
-REMEDY logging and tracking system
-MAN pages
-Bulletin Boards
-CrayDocs and other vendor help packages

User Services - Building on a Foundation
----------------------------------------
In this Age of Information and Technological Revolution we will meet
the needs of the research community through:
-Delivering information in a faster, more effective manner through a
 single interface
-Providing technical expertise for collaboration and coordination with
 computational scientists and researchers
-Providing new training methods and media for facilitating information
 exchange
-Providing continued traditional consulting and support services

------------------------------------------------------------------------

SMPs... where do they fit in, what do they do? - Brent Gorda
----------------------------------------------
-Symmetric Multiprocessors (SMPs)
-gaining power
-similarities to and differences from existing systems were discussed

------------------------------------------------------------------------
SPP Workshop - Bruce Curtis
------------
-Held at NERSC Dec 1994
-Attendees: Brown,Greef (LBL)         Hingerty (ORNL)
            Dimits,Byers (LLNL)       Minkoff (ANL)
            Mankofsky (SAI)           Reutter (LBL)
            Gai,Kendall,McCarthy,Schenter (PNL)
            Pavlo,Vahala,Vahala (IPP,W&M,ODU)

-Topics included:
 -SPP Environment
 -Vector Performance
 -MPP Overview
 -Message Passing
 -Parallel Performance
 -I/O Optimization
 -System Time
 -Case Studies

-Copies of slides, video tapes available now. Document will be available
 soon. Send U.S. mail address to curtis@nersc.gov

-Next Workshop (Proposed)
 -May 1995
 -Targeted for applicants for SPP96, instead of 'winners' (but current
  SPP users welcome)
 -1 1/2 days presentation, 1 1/2 day (optional) hands-on

-SPP96 to coincide with Fiscal Year 1996, and will be aligned with ERDP.

-SPP 1995
 -Aydemir --Nonlinear Gyrofluid Calculation of Tokamak Transport
            1000 CRUs 4GB
 -Bell --Numerical Simulation of the Three-Dimensional Reacting Flow
         in a Pulse Combustor using an Adaptive, Cartesian, Multi-fluid
         Algorithm 2000 CRUs 4GB
 -Cahill --Studies in Lattice Gauge Theory 250 CRUs 1GB
 -Cohen --Toroidal Gyrokinetic PIC Simulation Using Quasi-ballooning
          Coordinates 3600 CRUs 22GB
 -Chen --Simulation of Alpha/Energetic-particle Driven Instabilities
         in Tokamak Plasmas 300 CRUs 1GB
 -Dunning --Study of Solvent Cation Interactions;Chemistry on Oxide
            Surfaces using Ab Initio Molecular Dynamics; Determination
            of Physical and Electronic Properties of Fluorinated
            Polymers 10000 CRUs
 -Fu --Gyrokinetic MHD Hybrid Simulation of MHD Modes Destabilized by
       Energetic Particles 1500 CRUs
 -Hammett --Gyrofluid Simulations of Tokamak Plasma Turbulence
            1000 CRUs .5GB
 -Hingerty --Atomic Resolution Views of Carcinogen Modified Closed
             Circular DNA that can Super-coil 1000 CRUs 1GB
 -Huesman --Experimental Medicine: Clinical Diagnostic and Isotopic
            Imaging Studies 250 CRUs 1GB
 -Kogut --Simulations of Quenched QCD at Finite Density and Temperature
          16000 CRUs 40GB
 -LeBoeuf --High Resolution Gyro-Landau Fluid Plasma Turbulence
            Calculations at the Core of Tokamaks 4000 CRUs
 -K.-H. Lee --High Resolution Imaging of Electrical Conductivity Using
              Low Frequency Electromagnetic Fields 300 CRUs
 -W.W. Lee --Gyrokinetic Simulation of Tokamak Plasmas - Investigation
             of Micro-turbulence and Core Transport Using Three-
             Dimensional Toroidal Particle Codes 4000 CRUs 18GB
 -Lester --Quantum Monte Carlo for Molecules 500 CRUs 2GB
 -Mankofsky --3D EM and EM-PIC Simulation with ARGUS 250 CRUs 1GB
 -Soni --Hadronic Matrix Elements of Heavy-Light Mesons
         14800 CRUs 392.4 GB
 -Stevens --Benchmarking Comparison of Computational Chemistry Codes
            with MPPs 1500 CRUs 3GB
 -Vahala --Lattice Boltzmann Approach to Turbulence in Divertor Plasmas
           800 CRUs 8GB

                               SPP 1995
                               --------

Started December 1, 1994. During the first month of SPP95, the performance
of jobs improved substantially: averaging about 13 CPUs vs. about 8 CPUs
for the first month of the previous two years of SPP. The workload has
been moderate, however, probably due in part to the holidays.

                             Recent Runs
                             -----------

       P.I.                Avg. CPUs           Total GF/wall-sec
       ----                ---------           -----------------

    Cohen                    14.4                     5.3
    Leboeuf                  13.4                     4.5
    Soni                     14.5                     5.0
    Vahala                   15.3                     9.0


------------------------------------------------------------------------
Adjourn for lunch
Closed ExERSUG Lunch Meeting - Hills Street Deli
Afternoon Session - Thurs Jan 12,1995
LAN - Tony Hain - Video-conference - White Room
---
-Local Area Network (LAN)
-video-conference from NERSC
-on M-Bone

------------------------------------------------------------------------

ESNET - Jim Leighton - Video-conference - White Room
-----
Reports and Issues of Current Interest
--------------------------------------
-ESNET report
-bytes carried double every 6-8 months!
-T3 expansion (leading edge) can have problems
-WAN (Wide Area Network)

------------------------------------------------------------------------

Preparation for the MPP - Tammy Welcome
-----------------------
NERSC provides several means whereby researchers can prepare for
MP computing.
-Collaboration with NERSC staff
-MPP Access Program
-MPP Workshop

NERSC is collaborating with researchers to parallelize C90 capability codes
---------------------------------------------------------------------------
dtem- LeBoeuf (ORNL), 2.1% of C90
 -fluid simulation of plasma turbulence
 -developed new convolution algorithm (used also in KITE) which minimizes
  memory usage, programmed inner loops in assembly language achieving 10X
  speedup for that phase of the code
 -currently parallelizing for T3D using message passing

xg3e- Cohen (LLNL), 1.5% of C90
 -gyrokinetic PIC plasma simulation
 -ported to T3D using PVM
 -currently retro-fitting new production code into existing framework

lu.x- Soni (Brookhaven), 7.9% of C90
 -lattice quantum chromodynamics
 -Soni will be collaborating with MILC directly
 -NERSC may help tune application performance on the PEP

..to parallelize applications for the MPP access program

kite- Lynch/Leboeuf (ORNL)
 -fluid simulation of plasma turbulence
 -ported to T3D using PVM
 -tuned convolution algorithm (see dtem)
 -in future will tune matrix transpose communications

..and to parallelize and enable applications for the h4p

ParFlow/SLIM - Ashby/Tompson(LLNL)
 -chemical migration (SLIM) in ground water simulation (ParFlow)
 -ported SLIM to C90 with plans to parallelize for T3D

ardra - Dorr (LLNL)
 -simulation of nuclear well logging devices
 -ported to T3D using PVM
 -tuned to minimize communication overhead and maximize single
  processor performance

mdcask - Diaz De La Rubia (LLNL)
 - 3-D molecular dynamics modeling ion beam implantation
 - ported to T3D using PVM
 - developed distributed application running over WAN that allows
   interactive program control/input and permits real-time
   visualization of data
 - work described in invited paper at the April HPC Symposium

..more parallelization and enablement

camille - Mirin (LLNL)
 - global climate model
 - ported to T3D using portable message passing macro library
 - currently tuning with shmem calls

icf3d - Kershaw (LLNL)
 -study interaction of radiation (diffusive) with matter - to be used
  mainly for Inertial Confinement Fusion
 -currently being developed in C++ on the T3D using shmem
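
For flavor, a minimal one-sided-put sketch in the shmem style follows. It
is written against the modern OpenSHMEM descendant of the T3D library
(shmem_long_put, shmem_barrier_all, etc.), not the actual calls used by
camille or icf3d, and the neighbor exchange is invented for the example.

    /* Hedged OpenSHMEM sketch of one-sided communication, in the
       style of the T3D shmem library discussed above. */
    #include <stdio.h>
    #include <shmem.h>

    long value = 0;     /* symmetric variable: exists on every PE */

    int main(void)
    {
        int  me, npes;
        long mine;

        shmem_init();
        me   = shmem_my_pe();
        npes = shmem_n_pes();
        mine = (long)me;

        /* one-sided put: write our id straight into the right-hand
           neighbor's copy of 'value'; no receive call is needed */
        shmem_long_put(&value, &mine, 1, (me + 1) % npes);
        shmem_barrier_all();      /* wait until all puts complete */

        printf("PE %d of %d received %ld\n", me, npes, value);
        shmem_finalize();
        return 0;
    }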

 -development of Parallel Data Distribution Preprocessor
  - cgscf - Mailhiot (LLNL) - simulation of advanced materials design

 -development of dynamic time-sharing scheduled environment on the Cray T3D
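
Several of the ports above (xg3e, kite, ardra, mdcask) used PVM for
message passing. As a rough illustration of that programming model, here
is a minimal PVM 3 master/worker exchange in C; the executable name,
worker count, and message tag are placeholders, not details of any of
the codes above.

    /* Minimal PVM 3 sketch: each spawned worker reports its task id
       back to the master.  Illustrative only. */
    #include <stdio.h>
    #include <pvm3.h>

    #define NWORK  4         /* number of workers (placeholder) */
    #define TAG_ID 1         /* arbitrary message tag           */

    int main(void)
    {
        int mytid, parent, tids[NWORK], i, who;

        mytid  = pvm_mytid();        /* enroll this task in PVM      */
        parent = pvm_parent();       /* PvmNoParent if we are master */

        if (parent == PvmNoParent) {
            /* master: spawn workers running this same executable
               ("a.out" is a placeholder name) */
            pvm_spawn("a.out", (char **)0, PvmTaskDefault, "",
                      NWORK, tids);
            for (i = 0; i < NWORK; i++) {
                pvm_recv(-1, TAG_ID);    /* any sender, our tag */
                pvm_upkint(&who, 1, 1);
                printf("heard from task t%x\n", who);
            }
        } else {
            /* worker: pack our task id and send it to the master */
            pvm_initsend(PvmDataDefault);
            pvm_pkint(&mytid, 1, 1);
            pvm_send(parent, TAG_ID);
        }
        pvm_exit();                  /* leave PVM cleanly */
        return 0;
    }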

MPP Access Program provides computer resources for the development of
---------------------------------------------------------------------
parallel applications
---------------------

Initially, the user develops a parallel application using a small test case.
The user then debugs the application.
Resources permitting, the user scales up the application to a larger number
of processors and larger problem sets.

The goal is to have these MP applications ready for production when NERSC's
first MP computer system arrives in the latter part of 1995.

Nine proposals have been awarded allocations on 4 parallel platforms:

       MPP Access Program, Round 1 (Sept 94 - Sept 95)

      PI               Project

CM-5 at LANL
------------
Banerjee           Direct Simulation of Turbulence-Surface Interaction
Vinals             Numerical Studies of Non-equilibrium Processes in
                   Condensed Matter Physics and Materials Science

Paragon at ORNL
---------------
Brown              Combustion Research
Cotton             The Parallelization of an Atmospheric Simulation Model
Herrmannsfeldt     Accelerator Design and Analysis in 3D
Watson             Parallel Mathematical Software

T3D at LLNL
-----------
Depristo           Large-Scale Molecular Dynamics with Explicit
                   Density Functionals
Dory               High-Resolution Plasma Fluid Turbulence Calculations
                   on the T3D
Stevens            Benchmarking Comparison of Computational Chemistry Codes
                   with MPPs

KSR1 at ORNL
------------
Watson             Parallel Mathematical Software


Proposals for Round 2 are due Jan 20, 1995

Only for allocations on the 256-processor LLNL T3D
Allocation begins March 1995 and ends Sept 30, 1995
Evaluation criteria and instructions for proposals are in the following:
-December issue of the Buffer
-/afs/nersc.gov/u/ccc/mpp/Public/mppaccess.ps (postscript)
-/afs/nersc.gov/u/ccc/mpp/Public/mppaccess.text (ascii text)
-NERSC World Wide Web page http://www.nersc.gov
-nersc.Parallel.Processing and nersc.PI.info news groups

Allocation decisions made by OSC
PIs responsible for short project status report


The MPP Workshop will prepare NERSC researchers for the arrival of the PEP
--------------------------------------------------------------------------
system in late 1995
-------------------
3-week summer workshop in June
20-25 participants
Access to LLNL T3D or pre-PEP system
classroom instruction + exercises
+ guest lectures + personal project --> MPP application

The workshop will consist of classes on basic concepts
in parallel processing... (week of June 11-17)

-MPP architecture overview
-Programming models overview
 - MPI, PVM, and HPF (a minimal MPI sketch appears at the end of this outline)
-Operating Environment
-Tool Use
-Exercises illustrating new concepts

..lectures on advanced topics... (week of June 18-24)
-Approaches to parallel programming
 -Parallel languages and libraries
 -Frameworks, templates
-Quantum chromodynamics
-Computational fluid dynamics
-Molecular dynamics
-Plasma physics
-Climate modeling
-High-end graphics for high-end computing

..and work on personal projects (weeks of June 18-24 and June 25-30)
-NERSC staff available to assist in projects, both during and after workshop
-Attendees will make enough headway on project to continue development
 after workshop
-Attendees will maintain access to LLNL T3D or pre-PEP system until
 arrival of PEP system
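
As a taste of the week-one programming-models overview above (MPI had
just been standardized in 1994), here is a minimal MPI sketch in C. It
is illustrative only, not part of the workshop syllabus.

    /* Minimal MPI-1 sketch: a global reduction across all tasks. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, sum;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this task's id   */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total task count */

        /* global reduction: the sum of all ranks lands on rank 0 */
        MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0,
                   MPI_COMM_WORLD);

        if (rank == 0)
            printf("%d tasks, rank sum = %d\n", size, sum);

        MPI_Finalize();
        return 0;
    }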

Do the benefits of this workshop outweigh the costs?
-time, travel
-boot-strapped onto parallel machine
-parallel development experience
-immersion - free from distractions
-building bridges with staff
-Livermore night life (!)

---------------------------------------------------------------------------
Break
---------------------------------------------------------------------------

Follow-up on Throughput on C90 - Bruce Griffing
------------------------------
NQS Throughput
--------------
At the last ERSUG meeting some sites voiced concern that their NQS jobs were
not progressing through NQS in a timely manner.
NERSC agreed to analyze PNL's throughput and report back.

Response Team Members
---------------------
Bruce Curtis
Bruce Griffing
Moe Jette
Bruce Kelly
Alan Riddle
Clark Streeter

Some Observations
----------------
We had to identify performance metrics and gather the data that would let us
do the analysis:
  Velocity (CPU time / wall time once the job begins execution)
  Wait time (time between submission and initial execution)
  Held time (time between when a job is check-pointed because its
   allocation was depleted and when a new allocation is infused;
   this affects velocity)
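
To make the definitions concrete, the small C sketch below computes the
metrics for one hypothetical job record. The field names and numbers are
invented for illustration; as noted below, real UNICOS/NQS logs are much
harder to work with.

    #include <stdio.h>

    /* One hypothetical job record; field names are invented. */
    struct job {
        double submit;   /* time of submission (s)             */
        double start;    /* time of initial execution (s)      */
        double end;      /* time of completion (s)             */
        double cpu;      /* CPU seconds, summed over all CPUs  */
    };

    int main(void)
    {
        /* invented numbers: a day-long run averaging 8 CPUs */
        struct job j = { 0.0, 3600.0, 90000.0, 691200.0 };

        double wait     = j.start - j.submit;  /* wait time */
        double wall     = j.end   - j.start;   /* wall time */
        double velocity = j.cpu / wall;        /* can exceed 1.0 on a
                                                  multi-CPU machine;
                                                  held time inflates
                                                  wall time and
                                                  lowers velocity    */

        printf("wait = %.0f s, velocity = %.1f CPUs\n", wait, velocity);
        return 0;
    }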

It is extremely difficult to reconstruct the C90 environment at any moment
in time using the information logged by UNICOS and NQS.

We had to make many iterations refining NQSTAT because of the many end-cases
in the data we encountered. The public version that you can run is improved
as a result.

Additional Observations
-----------------------
Just before the holidays we received a summary of some PNL jobs spanning
a three month period. The summary included items such as time of submission,
wait and run times, and a brief completion status.

We have just begun to analyze what can be known about the failing cases.

In failing cases where lack of disk space was involved, we can't tell from
the logs which file system(s) were involved. People should be using
/usr/tmp for cases where there is a requirement for large amounts of disk
space and/or long-running jobs (a sample script follows below).
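
As a sketch of that recommendation (not an official NERSC template), an
NQS job script might stage its work through /usr/tmp as shown below. The
# QSUB directive spellings follow Cray NQS conventions, but the limits,
queue defaults, and paths are all placeholders; check local documentation
before copying.

    # QSUB -r bigjob                # request name
    # QSUB -lT 3600                 # CPU time limit in seconds (placeholder)
    # QSUB -eo                      # merge stderr into stdout
    # Stage large scratch I/O through /usr/tmp, per the
    # recommendation above; all paths here are invented examples.
    WORK=/usr/tmp/$LOGNAME/bigjob.$$
    mkdir -p $WORK && cd $WORK || exit 1
    cp $HOME/bigjob/input.dat .
    $HOME/bigjob/a.out < input.dat > output.dat
    cp output.dat $HOME/bigjob/     # save results before scratch cleanup
    rm -rf $WORK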

Some Conclusions
----------------
The NQS data is very noisy and post-analysis is very labor intensive.

In a class of large jobs, PNL's velocities were lower than those of other
comparable jobs.

NQS is not discriminating against PNL jobs.

We haven't completely finished the analysis, but it appears at this stage
that Gaussian and Crystal jobs suffer lower velocities.

We are analyzing a Gaussian test case. We will do the same with other test
cases as they are made available to us. The goal is to improve defaults
and make recommendations to users and/or developers.

A Recommendation
----------------

It is apparent that it is very difficult to reconstruct the facts months
later, so getting information to us as quickly as possible is essential.
Please contact us quickly if you suspect a problem. For some problems,
being able to see it in real time makes resolution much, much easier.

Then we can tell if it's some systematic problem, or a problem with a user
script or technique that could be fixed or improved.

ERDP - Energy Research Decision Packages for FY96
-------------------------------------------------

The ERDP X-Windows and text mode applications for requesting allocations
will open for business on April 3, 1995.

The process will close on August 18, 1995.

-----------------------------------------------------------------------

Update on Distributed Computing and DCCC - Roy Whitney
----------------------------------------

DCCC Vision
     Mission
     Task Forces and Working Groups
     Bartering
     History
     Charter and Membership
     Connections to Other Groups
     Near Term Goals

DCCC Mission
   Develop a distributed computing foundation with the goal of establishing
   an infrastructure and environment capable of  supporting the anticipated
   Distributed Information and Computing Environment (DICE) needs of DOE
   program collaborators on a national basis with a production quality
   level of support.


Key Distribution Task Force
   Bill Johnston (LBL) Chair
   The KDTF will examine issues related to the deployment of secure keys to be
   used by e-mail technologies such as PEM and PGP and authentication services
   such as digital signature, and if appropriate, recommend strategies for such
   deployment.  The KDTF will also be charged to be sure that their efforts are
   compatible with those of the IETF and Federal inter-agency key distribution
   task forces.
   Joint task force with the ESCC (ESnet Site Coordinating Committee)
   kdtf@es.net


Distributed Computing Environment Working Group
   Barry Howard (NERSC) Chair
   The DCEWG will examine and identify the recommended appropriate elements
   of a distributed computing environment, including such components as
   OSF/DCE, the Common Open Software Environment (COSE), the Common Object
   Request Broker Architecture (CORBA) and Load Sharing. The DCEWG will also
   be responsible for recommending strategies and pilots for implementing
   these components.
   dcewg@es.net

AFS/DFS Task Force
   Task Force reports to the DCEWG
   Troy Thompson (PNL) Chair
   The ADFSTF will develop plans for the implementation of DFS in a WAN
   environment and for the migration of existing ESnet AFS to DFS. The
   group  may choose to implement a DFS pilot project.
   adfstf@es.net

Distributed System Management Working Group
   John Volmer (ANL) Chair   
   The DSMWG will develop strategies, tools and pilot projects for
   effectively providing systems management to distributed heterogeneous
   systems. This group will also interact with the DCEWG for the effective
   systems management of DCEWG layer tools.
   dsmwg@es.net

Application Working Group
   Dick Kouzes (PNL) Chair
   The AWG will develop strategies, tools and pilot projects for
   collaboratory use in areas such as the following:
      National Information Infrastructure (NII) focused projects
      Information services including data storage and retrieval,
              project documentation, and multi-media lab notebook
      Distributed collaboratory tools including multi-media communications
              and software development
      Collaboration social organization issues including effective standard
              operating procedures
   awg@es.net

DCCC Architecture Task Force
   Arthurine Breckenridge (SNL) Chair
   The ATF is to recommend a high level architecture for a distributed
   collaboration environment which will eventually provide production-level
   support of research efforts in DOE.
   The architecture should be developed in such a manner that it complements,
   and possibly helps to define, the DOE NII activities.
   It should also address the non-technical (social, political, and
   budgetary) issues to facilitate the establishment of such an environment.
   atf@es.net

Real-Time Working Group
   Suggested by Tommy Thomas (DOE)
   Considerable Interest
   Connect to IEEE Computer Applications in Nuclear and Plasma Sciences
   (CANPS) Committee

Connection to super-lab Project
   super-lab --> LANL-LLNL-SNL
   Defense Programs Activity
   Hank Shay (LLNL) is facilitating coordination with DCCC
   Major supercomputing effort
   Goal is to minimize duplication of efforts and maximize utility of all
   distributed computing efforts

Bartering
   Examples which could be pieces for EPICS*-like collaborations:
      Authentication Services (ANL, LANL, NERSC, PNL, Sandia)
      Electronic Places and Caves (ANL)
      Mass Storage (LBL, NERSC/LLNL)
      MBONE Video & High Speed Information Retrieval (LBL)
      Network Monitoring (NERSC, SLAC)
      On-line Data Acquisition System (CEBAF)
      Systems Management (ANL, FNAL)
   The DCCC will initiate a survey of the laboratories for potential pieces
   to put into this collaboration.

   * Experimental Physics and Industrial Control System


History
   Started with need for DCCC and ended up with CCIRDA
   CCIRDA --> Coordinating Committee for Informatics Research, Development,
   and Application
   ESSC and the Chairs of  EXERSUG & SCIE agree to support a first DCCC meeting
   First meeting: September 22-23, 1994 at CEBAF
   Roy Whitney (CEBAF) elected DCCC Chair
   December 1994: ESSC agrees to charter the DCCC

Charter and Membership
   Group to propose Charter:
      Steve Davis (PPPL)
      Jim Leighton (NERSC)
      Sandy Merola (LBL)
      Roy Whitney (CEBAF)
   Membership by participation
   Need to involve computer scientists from academic and commercial areas

DICE Consortium
   Distributed Information and Computing Environment (DICE) Consortium
   Method for integrating
      DOE Facilities
      Universities
      Commercial Interests
   Tool for distributing the results of DCCC, ESCC, and possibly other DOE
   groups' work

Connections to Other Groups
   ESnet Steering Committee - Parent
   ERSUG & EXERSUG - Close and continuous exchange
   Participation in Task Forces and Working Groups
   SCIE - Close and continuous exchange
   ESnet Site Coordinating Committee - High bandwidth exchange
   super-lab - Joint efforts
   DICE Consortium - universities and commercial interests

Near Term Goals
   Next Meeting February 16-17, 1995 - General Atomics
   Productive Task Forces and Working Groups
   Active Bartering initiated
   DICE Consortium initiated
   Quality Exchanges with other Groups

DCCC Summary
   DOE laboratories and collaborators must work together to prosper.

   ERSUG/EXERSUG, ESSC &  SCIE; DCCC, ESCC & DICE Consortium; and CCIRDA
      are excellent examples of how this can be accomplished.

   We have the tools and the talent.
-------------------------------------------------------------------------

Open Discussion - Exersug involvement with DCCC
---------------
It was determined that the ExERSUG chair should attend at least one
meeting per year of the DCCC. It was also determined that the ExERSUG
chair should attend the ESSC meetings as an observer.

-------------------------------------------------------------------------

End of Day - Adjourn
Evening Session Thursday Jan 12, 1995
Dinner at a Mexican restaurant followed by a closed ExERSUG-NERSC-OSC
meeting at the Tower Inn.

Rick Kendall elected new Vice-Chair/Secretary representing DOE Office
of Basic Energy Sciences

Adjourn for the evening at about 9 PM
Morning Session Friday Jan 13, 1995
Science Talk by Host Institution - Thom Dunning, Jr.
--------------------------------
Hanford Site is contaminated - remedial action needed - large program
EMSL (Environmental Molecular Sciences Laboratory) program - new building
crown ethers - extract ions
cytochrome P450 - detoxifier
1 GHz NMR facility coming - remote users (distributed computing)

-----------------------------------------------------------------------

Visit to PNL Computing Facilities (host: Rick Kendall)

-----------------------------------------------------------------------

Open Discussion - Bill McCurdy (NERSC)
---------------
Invitation to Discussion - Bill McCurdy
A) The case for SMPs
B) The role of the current capability platforms
C) Dial-up services for NERSC customers (SLIP, XRemote, PPP)
D) Other topics - suggestions welcome
  -How to improve attendance from ERSUG community
  -Next ERSUG meeting
  -ExERSUG membership
  -Search for ExERSUG Vice-Chair

SMPs - Symmetric Multiprocessors
- F-machine replacement (several SMPs)
- migration environment to MPPs
- easier migration
- MPP software environment hard to deal with
  - not a permanent problem
- benchmarking needed for SMPs
- SMP loaner from SGI

Next Meeting: ERSUG 6/13-14
              MPP Workshop 6/19-23 and 26-30
              SPP Workshop 6/7-9
              visualization workshop 6/12

-----------------------------------------------------------------------

Proposal: NERSC will discontinue providing Dialup Service to customers
- Barry Howard (NERSC)

Current service provides TELNET access via 800 number to NERSC and Internet
Future proposed service would require use of commercial access service

For the User:
Only TELNET access provided
Growing number of X applications require dialup support for SLIP, PPP,
or XRemote (NCD) protocols

For NERSC:
Cost - 800 phone service cost is increasing ($10K/month currently)
       Anticipate a huge increase if X access were supported

Security:
     - Some versions of XRemote don't have display access restrictions
       (Xhost, Xauthority)

No longer a unique service

Commercial Internet Access Providers
-Full range of services offered, including TELNET, PPP, SLIP
-Wide variety of providers; some national and many local
-State of the art equipment
-Additional providers coming on line each month (InterNIC: 130 existing)
-Reasonable costs

Provider (SLIP/PPP)        Area Covered   Setup Cost   10 hrs/month   40 hrs/month
-------------------        ------------   ----------   ------------   ------------
California Online          California       $20           $25            $25
Portal Info Network        nationwide       $20           $30            $58
Performance Systems Int'l  nationwide      $200           $29            $51

What NERSC proposes to provide
-No upgrade or expansion of current service
-Provide information on how to find list of commercial Internet providers
 and what criteria to apply when selecting one

References:
PC World, Jan 95
PC Magazine, 11 Oct 94
MacUser, Sept 94
MacUser, Dec 94

-Provide assistance to sites interested in providing local dialup service
-For more information: Barry Howard, howard@nersc.gov
                       Neal Mackanic, mackanic@nersc.gov
 
-----------------------------------------------------------------------
End of meeting - adjourned at approximately noon.