Table of Contents
- Response Summary
- Respondent Demographics
- Overall Satisfaction and Importance
- All Satisfaction, Importance and Usefulness Ratings
- Hardware Resources
- Software
- Security and One Time Passwords
- Visualization and Data Analysis
- HPC Consulting
- Services and Communications
- Web Interfaces
- Training
- Comments about NERSC
Response Summary
Many thanks to the 209 users who responded to this year's User Survey. The respondents represent all six DOE Science Offices and a variety of home institutions: see Respondent Demographics.
The survey responses provide feedback about every aspect of NERSC's operation, help us judge the quality of our services, give DOE information on how well NERSC is doing, and point us to areas we can improve. The survey results are listed below.
You can see the FY 2004 User Survey text, in which users rated us on a 7-point satisfaction scale. Some areas were also rated on a 3-point importance scale or a 3-point usefulness scale.
The average satisfaction scores from this year's survey ranged from a high of 6.74 (very satisfied) to a low of 3.84 (neutral). See All Satisfaction Ratings.
For questions that spanned the 2003 and 2004 surveys, the change in rating was tested for significance (using the t test at the 90% confidence level). Significant increases in satisfaction are shown in blue; significant decreases in satisfaction are shown in red.
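Such a test can be run from the published summary statistics alone. The sketch below is a minimal Python illustration assuming a two-sample (Welch's) t test, since the report does not specify the exact variant used; the 2003 standard deviation and sample size in the example are hypothetical placeholders.

```python
# Significance check for a year-over-year change in mean rating,
# computed from summary statistics only. Assumes a two-sample
# Welch's t test; the survey report does not state the variant used.
from scipy.stats import ttest_ind_from_stats

def significant_change(mean_a, sd_a, n_a, mean_b, sd_b, n_b, alpha=0.10):
    """Return (t, p, significant) at the 90% confidence level."""
    t, p = ttest_ind_from_stats(mean_a, sd_a, n_a,
                                mean_b, sd_b, n_b,
                                equal_var=False)  # Welch's form
    return t, p, p < alpha

# 2004 "CONSULT: overall" row: mean 6.69, SD 0.60, n = 177.
# The 2003 mean 6.34 follows from the published +0.35 change;
# the 2003 SD (0.85) and n (150) are hypothetical.
print(significant_change(6.34, 0.85, 150, 6.69, 0.60, 177))
```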
Areas with the highest user satisfaction include the HPSS mass storage system, HPC consulting, and account support services:
7=Very satisfied, 6=Mostly satisfied, 5=Somewhat satisfied, 4=Neutral, 3=Somewhat dissatisfied, 2=Mostly dissatisfied, 1=Very dissatisfied
(Columns 1-7 give the number of respondents who chose each rating; a blank cell means no responses at that rating.)

| Item | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Total Responses | Average Score | Std. Dev. | Change from 2003 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| HPSS: Reliability (data integrity) | | | | 5 | | 16 | 97 | 118 | 6.74 | 0.67 | 0.13 |
| CONSULT: Timely initial response to consulting questions | | | | 1 | 5 | 38 | 125 | 169 | 6.70 | 0.55 | 0.15 |
| CONSULT: overall | | | | 3 | 4 | 38 | 132 | 177 | 6.69 | 0.60 | 0.35 |
| Account support services | | 1 | 2 | 1 | 2 | 38 | 136 | 180 | 6.68 | 0.72 | 0.29 |
| OVERALL: Consulting and Support Services | | | | 3 | 8 | 40 | 146 | 197 | 6.67 | 0.63 | 0.30 |
| HPSS: Uptime (Availability) | | | 1 | 3 | 1 | 25 | 89 | 119 | 6.66 | 0.70 | 0.12 |
| CONSULT: Followup to initial consulting questions | | | | 4 | 5 | 34 | 122 | 165 | 6.66 | 0.66 | 0.17 |
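The Average Score and Std. Dev. columns in these tables follow directly from the per-rating counts. A minimal Python sketch that reproduces the HPSS reliability row, assuming the survey used the sample (n-1) form of the standard deviation:

```python
import math

def summarize(counts):
    """counts[i] is the number of respondents giving rating i + 1 (1-7)."""
    n = sum(counts)
    mean = sum((i + 1) * c for i, c in enumerate(counts)) / n
    # Sample standard deviation (n - 1 denominator); this choice is an
    # assumption, but it reproduces the published values.
    var = sum(c * ((i + 1) - mean) ** 2 for i, c in enumerate(counts)) / (n - 1)
    return n, round(mean, 2), round(math.sqrt(var), 2)

# "HPSS: Reliability (data integrity)": 5 fours, 16 sixes, 97 sevens.
print(summarize([0, 0, 0, 5, 0, 16, 97]))  # -> (118, 6.74, 0.67)
```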
Areas with the lowest user satisfaction include the IBM SP Seaborg's batch turnaround time and queue structure as well as services used by only small numbers of users (the math and visualization servers, grid services and training classes presented over the Access Grid):
7=Very satisfied, 6=Mostly satisfied, 5=Somewhat satisfied, 4=Neutral, 3=Somewhat dissatisfied, 2=Mostly dissatisfied, 1=Very dissatisfied
| Item | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Total Responses | Average Score | Std. Dev. | Change from 2003 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Live classes on the Access Grid | 1 | | | 14 | | 11 | 6 | 32 | 5.16 | 1.44 | 0.49 |
| Grid services | | | | 18 | 3 | 5 | 9 | 35 | 5.14 | 1.31 | |
| Escher SW: visualization software | | 1 | 1 | 9 | | 4 | 6 | 21 | 5.10 | 1.58 | 0.35 |
| Math server (Newton) | | | 1 | 8 | 1 | 4 | 3 | 17 | 5.00 | 1.32 | -0.20 |
| Newton SW: application software | 1 | 1 | 2 | 8 | | 5 | 5 | 22 | 4.82 | 1.76 | |
| SP: Batch queue structure | 17 | 9 | 18 | 17 | 30 | 53 | 20 | 164 | 4.66 | 1.85 | -1.03 |
| SP: Batch wait time | 26 | 16 | 36 | 14 | 27 | 32 | 10 | 161 | 3.84 | 1.90 | -1.40 |
The largest increases in satisfaction over last year's survey came from training classes attended in person, visualization services, the HPSS and Seaborg web pages, and software bug resolution:
7=Very satisfied, 6=Mostly satisfied, 5=Somewhat satisfied, 4=Neutral, 3=Somewhat dissatisfied, 2=Mostly dissatisfied, 1=Very dissatisfied
| Item | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Total Responses | Average Score | Std. Dev. | Change from 2003 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| TRAINING: NERSC classes: in-person | | | | 13 | | 5 | 11 | 29 | 5.48 | 1.40 | 0.60 |
| SERVICES: Visualization services | | | 2 | 22 | 4 | 12 | 19 | 59 | 5.41 | 1.37 | 0.60 |
| WEB: HPSS section | | | | 4 | 11 | 30 | 49 | 94 | 6.32 | 0.85 | 0.58 |
| WEB: Seaborg section | | | | 3 | 10 | 47 | 85 | 145 | 6.48 | 0.72 | 0.48 |
| CONSULT: Software bug resolution | | | 2 | 11 | 7 | 33 | 47 | 100 | 6.12 | 1.08 | 0.48 |
The areas rated significantly lower this year concern the IBM SP Seaborg and the available computing hardware:
7=Very satisfied, 6=Mostly satisfied, 5=Somewhat satisfied, 4=Neutral, 3=Somewhat dissatisfied, 2=Mostly dissatisfied, 1=Very dissatisfied
| Item | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Total Responses | Average Score | Std. Dev. | Change from 2003 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SP: Batch wait time | 26 | 16 | 36 | 14 | 27 | 32 | 10 | 161 | 3.84 | 1.90 | -1.40 |
| SP: Batch queue structure | 17 | 9 | 18 | 17 | 30 | 53 | 20 | 164 | 4.66 | 1.85 | -1.03 |
| SP: Seaborg overall | 4 | 7 | 7 | 2 | 26 | 62 | 60 | 168 | 5.77 | 1.47 | -0.66 |
| OVERALL: Available Computing Hardware | 3 | 2 | 14 | 8 | 34 | 87 | 47 | 195 | 5.65 | 1.29 | -0.48 |
Survey Results Lead to Changes at NERSC
Every year we institute changes based on the results of the previous year's survey. In 2004 NERSC took a number of actions in response to suggestions from the 2003 user survey. Selected comments from that survey and NERSC's responses follow.
- [Web site] Needs lots of improvement. Most pages cram lots of info in a single page, hard to find what you want, etc. Beyond the home page, the website has an 80's look.

NERSC response: NERSC reorganized its web site last year, merging the previous www.nersc.gov, hpcf.nersc.gov and pdsf.nersc.gov web sites into a newly designed www.nersc.gov web site. The design goal was one site that meets the needs of our users, our DOE sponsors, and the general public.

Four of the web interface ratings on this year's survey show increased satisfaction over last year: the HPSS and Seaborg section ratings increased by 0.6 and 0.5 points, and the overall web site and software ratings increased by 0.3 points.
- IBM's bugfixing is slow! The compilers and debuggers need to be improved.

NERSC response: In the past year NERSC has established an excellent working relationship with IBM's compiler support group; representatives of the group now attend the NERSC/IBM quarterly meetings. This has resulted in quicker resolution of bug reports, and we think IBM is now doing an excellent job of resolving compiler problems. Compiler upgrades over the past year have also produced better runtime performance, and there are currently very few outstanding compiler bugs. This year's rating for software bug resolution increased by one half point.

- totalview is basically unusable.
NERSC response: NERSC has established a good working relationship with Etnus, the vendor that supports TotalView. TotalView upgrades this past year have added usability features, and TotalView can now debug a wider range of parallel codes and is more stable. Last year we received eight complaints about TotalView and debugging tools; this year we received two. User satisfaction with performance and debugging tools increased by 0.3 points on Seaborg and 0.5 points on the PDSF (these increases were not statistically significant, however).

- You need to better develop the floating license approach and make it easier to use for remote users.
NERSC response: During 2004, NERSC finalized consolidation of license management for all visualization software hosted at the Center. The new system, which consists of a set of license servers, also supports remote use of licensed visualization software. See Remote License Services at NERSC.
- Please offer more on-line video/on-site courses.
- It would be nice if NERSC can provide more tutorials.

NERSC response: In 2004 NERSC organized 20 user training lectures in 7 separate events. All were presented via the Access Grid and captured as streaming video (using Real Media) so that users can replay them at any time. These lectures have been added to the tutorials page for "one stop" access to training materials. See NERSC Tutorials, How-To's, and Lectures.
- more interactive nodes on pdsf
- Would prefer faster CPUs at PC farm.
- make it faster and bigger diskspace

NERSC response: The PDSF support team has made it possible to run interactively on the batch nodes (a FAQ documents the procedure). They also recently purchased replacement login nodes, which are being tested now and should go into production in December 2004; these are top-of-the-line Opterons with twice as much memory as the old nodes.

Sixty-four 3.6 GHz Xeons were added to the PDSF cluster in November 2004. This is 25% more CPUs, and they are almost twice as fast as the older CPUs. We also added about 20 TB of additional disk space.
- The queue configuration should be returned to a state where it no longer favours jobs using large numbers of processors.
- NERSC should move more aggressively to upgrade its high end computing facilities. It might do well to offer a wider variety of architectures. For example, the large Pentium 4 clusters about to become operational at NCSA provide a highly cost-effective resource for some problems, but not for others. If NERSC had a greater variety of machines, it might be able to better serve all its users. However, the most important improvement would be to simply increase the total computing power available to users.

NERSC response: NERSC coordinates its scheduling priorities with the Office of Science to accommodate the Office's goals and priorities. This year the Office continued to emphasize capability computing, including large jobs and INCITE jobs. Since NERSC is fully subscribed, this means that some other work receives lower priority. In 2004 the Seaborg queue structure still favored large jobs, and the machine was more oversubscribed than in the previous year. However, several measures have been implemented that should help improve turnaround for all jobs:
- Per-user run limits were decreased from six to three, and per-user idle limits (the number of jobs that are eligible for scheduling) from ten to two. This provides fairer access to Seaborg's processors.
- The OMB (Office of Management and Budget) goal for FY 2005 is that 40 percent of Seaborg's cycles should be delivered to jobs using at least 1/8 of its computational processors (in FY 2004 this goal was 50 percent); a sketch of this accounting metric appears after this list.
- In early calendar year 2005 NERSC will deploy a new Linux cluster with 640 dual 2.2 GHz Opteron CPUs available for computations. The target workload for the cluster is jobs that do not naturally scale to 1/8th or more of the computational processors on Seaborg.
- Thanks to additional funding from the Office of Science, NERSC is in the process of procuring additional computational capability for the 2006 and 2007 allocation years.
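As a concrete illustration of the OMB capability metric mentioned in the list above, the sketch below computes the share of delivered cycles going to large jobs from a set of accounting records. The job records are invented, and the 6,080-processor figure for Seaborg's computational pool is an assumption used only to set the one-eighth threshold.

```python
# Share of cycles delivered to "capability" jobs (>= 1/8 of the
# machine). All job records below are hypothetical; the processor
# count for Seaborg's computational pool is an assumption.
SEABORG_COMPUTE_PROCS = 6080
THRESHOLD = SEABORG_COMPUTE_PROCS // 8  # 760 processors

jobs = [  # (processors used, CPU-hours charged) - invented examples
    (1024, 50_000),
    (64, 120_000),
    (2048, 80_000),
]

large = sum(hours for procs, hours in jobs if procs >= THRESHOLD)
total = sum(hours for _, hours in jobs)
print(f"{100 * large / total:.1f}% of cycles went to jobs using "
      f">= 1/8 of the machine (FY 2005 goal: 40%)")
```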
The majority of this year's respondents expressed dissatisfaction with Seaborg turnaround time, and about one quarter were dissatisfied with Seaborg's queue policies. Ratings in these two areas dropped by 1.4 and 1.0 points, and the rating for available hardware dropped by 0.5 points. In general, users who ran lower-concurrency jobs were more dissatisfied than users who ran larger codes.
Users were invited to provide overall comments about NERSC:
118 users answered the question "What does NERSC do well?" 68 respondents stated that NERSC gives them access to powerful computing resources without which they could not do their science; 53 mentioned excellent support services and NERSC's responsive staff; 47 pointed to very reliable and well managed hardware; 30 said that NERSC is easy to use and has a good user environment; and 26 said "everything". Some representative comments are:
NERSC supplies a lot of FLOPs reliably, and provides very competent consultants. It is a good place to use parallel codes that scale well on the available machines.
NERSC does a truly outstanding job of supporting both a small community of "power" users as well as a large community of mid-range users. Both are important, and, as a result of NERSC's success in supporting both communities, the Center facilitates an extraordinary amount of scientific productivity.
NERSC has excellent people working there. I'm VERY happy with everyone I've come across. People have been knowledgeable and professional. I compute at NERSC because it's really big. Seriously, the number of processors allows us to do research on problems that we simply cannot do anywhere else. In that regard I consider NERSC a national treasure. One really silly request, how about a NERSC T-Shirt! I'd buy one.
Overall, the services and hardware reliability are excellent. I think that NERSC sets the standard in this regard.
94 users responded to the question "What should NERSC do differently?" The area of greatest concern is Seaborg job turnaround time and queue management policies: forty-five users expressed dissatisfaction with turnaround time, and 37 requested a change in Seaborg job scheduling policies, of whom 25 expressed concern about favoring large jobs at the expense of smaller ones. Twenty-five users requested newer processors or more computing resources overall, and fifteen expressed dissatisfaction with the allocations process. Some of the comments from this section are:
Change the batch queues so that ordinary jobs execute in days, not weeks.
The queue wait times have been extremely long (about 2 weeks recently), and this has almost completely stalled my research.
The current focus only on jobs which can exhibit high degrees of parallelism is, in my opinion obviously, misguided. Some problems of great scientific interest do not naturally scale to thousands of processors.
NERSC should return to its original mission of providing the production environment which allowed the scientists to maximize their research. That is NERSC should give satisfying the user priority over satisfying the DOE and the OMB.
Given the amount of computer time that I am allocated, I cannot make use of the large number of processors on Seaborg. Unless everyone is allocated enough time to make use of hundreds of processors, NERSC should give more consideration to providing resources for smaller codes.
Also, the job priority system discriminates against smaller jobs (less than 32 nodes) - i.e. MAJORITY of users!
For the last 24 years NERSC has been the place where "I could get things done". With the initiation of the INCITE program that changed. The machine was effectively taken over by the 3 INCITE groups and work at NERSC stopped. After the upgrade my large calculations no longer run at NERSC and I had to move those computations to a p690 in Hannover, Germany.
The computer code I use becomes more complex from day to day to use the best physics you can. However this increases the computing time. The great management and support at NERSC combined with new hardware would be an irresistible package.
NERSC has done an outstanding job of serving the community. In order for this to continue, NERSC needs continued support from the DOE for its staff and the services they provide, and NERSC needs support for a new high end system to replace seaborg.
Similarly, I've described my problems with the ERCAP proposal process. I feel it gives short-shrift to science, and focuses on code optimization to the exclusion of scientific returns.
I think the INCITE program was ill conceived. Betting that the performance of a tiny subset of the scientific community will payoff enormously better than the community as a whole seems to me like trying to time the stock market. It may work once, but the opportunity costs are enormous.
77 users answered the question "How does NERSC compare to other centers you have used?" Thirty-nine users stated that NERSC is an excellent center (making no explicit comparison) or is better than other centers they have used; the reasons given for preferring NERSC include excellent hardware and software management and good user support and services. Seven respondents said that NERSC was not as good as another center they had used; the most common source of dissatisfaction was job turnaround time.