Supported Services Model
NERSC supports various services at differing levels. This document outlines the level of support users can expect for a given service and the procedures NERSC follows to monitor and respond to incidents.
For all production services at NERSC
- We support production services at one of three service levels: 24x7, 16x5, and 8x5.
- Monitoring is performed by NERSC Operations using automated tools.
- Outages are announced in the center status, and announcements follow our published system status policies.
- User documentation is available online.
- NERSC Operations defines on-call procedures. When there are service issues, a NERSC staff member is contacted.
Service level definitions
24x7
- Critical services that affect the entire center
- The availability of entire computational systems
- NERSC staff are directly notified of failures 24 hours a day, seven days a week.
16x5
- Medium to high priority services that may affect specific projects or services, but not all of NERSC
- NERSC staff are directly notified of failures between 7 a.m. and 11 p.m., Monday through Friday.
- Outside those service hours, NERSC staff are notified of failures by email; a ticket is opened and assigned to the service point of contact, who follows up the next business day.
8x5
- Low to medium priority services that may affect specific projects or services, but not all of NERSC
- NERSC staff are directly notified of failures between 8 a.m. and 5 p.m., Monday through Friday.
- Outside those service hours (nights and weekends), NERSC staff are notified of failures by email; a ticket is opened and assigned to the service point of contact, who follows up the next business day.
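The notification rules above can be sketched as a small lookup. This is an illustrative sketch only, not NERSC's actual tooling: the function name `notification_mode` and the `WINDOWS` table are hypothetical. It assumes that the three groups of rules correspond, in order, to the 24x7, 16x5, and 8x5 levels, that times are local Pacific Time, and that the end of each window is exclusive.

```python
from datetime import datetime, time

# Direct-notification windows per service level (hypothetical table).
# Hours are taken from the definitions above; None means "always".
WINDOWS = {
    "24x7": None,
    "16x5": (time(7, 0), time(23, 0)),   # 7 a.m. to 11 p.m., Monday-Friday
    "8x5": (time(8, 0), time(17, 0)),    # 8 a.m. to 5 p.m., Monday-Friday
}


def notification_mode(level: str, when: datetime) -> str:
    """Return "direct" or "email+ticket" for a failure at local time `when`.

    Assumption: `when` is already expressed in Pacific Time, and the
    window end is treated as exclusive.
    """
    window = WINDOWS[level]
    if window is None:
        return "direct"
    start, end = window
    is_weekday = when.weekday() < 5  # Monday=0 ... Friday=4
    if is_weekday and start <= when.time() < end:
        return "direct"
    # Outside service hours: email notification, ticket opened and assigned
    # to the service point of contact, follow-up the next business day.
    return "email+ticket"
```

For example, under these assumptions a failure of a 16x5 service at 11:30 p.m. on a Wednesday falls outside the direct-notification window, so it would be handled by email and a ticket.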
Service levels by system
Name | Service level |
---|---|
Perlmutter | 24x7 |
HPSS | 24x7 |
Community File System | 24x7 |
Data Transfer Nodes | 24x7 |
Superfacility (SF) API | 24x7 |
Website (www.nersc.gov) | 24x7 |
NERSC authentication (LDAP, FedID) | 24x7 |
Science databases | 16x5 |
Spin | 16x5 |
Iris | 8x5 |
Globus and grid tools | 8x5 |
ThinLinc | 8x5 |
Consulting and user support | 8x5 |
Account support | 8x5 |
Jupyter | 8x5 |
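For scripts that need to check the support level of a service, the table above could be mirrored as a plain dictionary. This is a hedged sketch: `SERVICE_LEVELS` and `services_at` are hypothetical names, and the values are copied from the table rather than read from any NERSC API.

```python
# Service levels copied from the table above (illustrative data structure).
SERVICE_LEVELS = {
    "Perlmutter": "24x7",
    "HPSS": "24x7",
    "Community File System": "24x7",
    "Data Transfer Nodes": "24x7",
    "Superfacility (SF) API": "24x7",
    "Website (www.nersc.gov)": "24x7",
    "NERSC authentication (LDAP, FedID)": "24x7",
    "Science databases": "16x5",
    "Spin": "16x5",
    "Iris": "8x5",
    "Globus and grid tools": "8x5",
    "ThinLinc": "8x5",
    "Consulting and user support": "8x5",
    "Account support": "8x5",
    "Jupyter": "8x5",
}


def services_at(level: str) -> list[str]:
    """Return the names of all services supported at `level`, sorted."""
    return sorted(name for name, lvl in SERVICE_LEVELS.items() if lvl == level)
```

Calling `services_at("16x5")` would return the two services listed at that level, Science databases and Spin.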
How to report issues
M-F, 8 a.m. - 5 p.m., Pacific Time
- Open a ticket in the NERSC help portal.
- Email the consulting team or account support.
M-F, after 5 p.m. and before 8 a.m., Pacific Time, and weekends
- Inside the US call: +1 800 666-3772 (800 66-NERSC), option 1
- Outside the US call: +1 510 486-6821
Experimental services
NERSC also supports a set of services that are considered experimental in nature. These services may be under active development or testing and are not considered stable enough to be in production.
Experimental services are defined by the following characteristics:
- Low-priority services: These are typically services in development.
- No guaranteed level of support: Services may be taken down for maintenance on short notice.
- No operations monitoring: Users must contact the person responsible for the service directly if there are issues.