1.7 Downtime Policy
Historically, the physical requirements of the Komas Datacenter’s HVAC system dictated that CHPC have quarterly downtimes. CHPC used this mandatory work schedule as its ‘quarterly downtime’ and extended it to equipment in our hardened SSB datacenter and our smaller INSCC datacenter. However, in 2014 all of the CHPC hardware from the Komas and SSB datacenters was moved to the new downtown datacenter (DDC). This type of maintenance is no longer needed, and we have therefore revised our maintenance and downtime strategy.
- CHPC will schedule maintenance in one of three ways:
- Rolling reboots
- Whenever possible, CHPC will apply updates and do other maintenance in a rolling fashion, so that CHPC services see little or no impact.
- Partial outages
- We will do these as needed, but in a manner that impacts only some CHPC services at a time.
- We will work with the affected users to choose the day and time.
- Full outages
- These are outages that will affect all CHPC services, such as outages of core datacenter networking services or datacenter power maintenance.
- We will offer two windows a year, typically during the spring and fall break weeks. We do not anticipate that both windows will always be needed.
- CHPC's goal for these downtimes is to have them completed within one working day. However, unforeseen or unusual circumstances may result in a multiple-day downtime.
- Notification
- Emails and flyers describe which services will be down, the date, the expected duration, and the impact on users.
- Email notifications are sent to firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com, and “komas data center list”. The first notification is sent two weeks prior to the downtime. A second reminder is sent a few days prior to the actual downtime. These lists should contain each user's official U of U email address. Users are responsible for checking their official Umail account. Please notify firstname.lastname@example.org if you would like to use a different email address. You can see where these notifications are being sent by using the Profile link on the CHPC Web site. Note: you will need to authenticate with your uNID and campus password to access the Profile link.
- News of the downtimes is posted on http://www.chpc.utah.edu when email notifications are sent.
The "message of the day" (motd) on the clusters is updated with the downtime dates as soon as they are set. You can see it when you log in, or at any time by issuing the command:
> cat /etc/motd
- Flyers are posted in INSCC, Sutton, WBB, Arapeen, and HSEB.
- Scheduling Considerations
- Downtime dates are set after considering services impacted, known needs, and academic schedules.
- Please notify CHPC's Associate Director immediately if you have concerns or critical circumstances that conflict with the scheduled time frame.
- See 1.8 System Availability vs Staff Availability for more on availability policies.
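As a convenience, the motd check described in the notification items above can be narrowed to just the lines that mention a downtime. The following is a minimal sketch, not a CHPC-provided tool: the function name motd_downtime and the search word "downtime" are assumptions, since the exact wording used in the motd is not specified here.

```shell
# Hypothetical helper (not provided by CHPC): print the lines of the
# message of the day that mention a downtime.
# The search word "downtime" is an assumption about the motd wording.
motd_downtime() {
    motd_file="${1:-/etc/motd}"   # default to the system motd
    if [ -r "$motd_file" ]; then
        # Case-insensitive match; fall back to a notice if nothing matches.
        grep -i 'downtime' "$motd_file" || echo "No downtime mentioned in $motd_file"
    else
        echo "Cannot read $motd_file"
    fi
}

motd_downtime
```

If the function prints nothing useful, fall back to reading the full motd with cat /etc/motd, since an announcement may be worded differently.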
Unscheduled “EMERGENCY” Downtimes
- Unscheduled downtimes can occur for any number of reasons, including:
- Loss of cooling and/or power in the data center.
- Loss of supporting infrastructure (i.e. hardware).
- Critical need to change hardware or software that is negatively impacting performance or access.
- Application of critical patches that can’t wait until the next scheduled maintenance.
- Safety or security issues that require immediate action.
- Configuration “policy” changes that are deemed immediately necessary.
- The scope and impact of the downtime are detailed in emails.
- Email is sent to email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, and “komas data center list” as information becomes available.
- News is posted on www.chpc.utah.edu when email notifications are sent.