CHPC News

This section provides users with an overview and index of what can be found in the News directories.

  • News Items: History of all news announcements made by CHPC this year including notifications of presentations, downtimes and policy changes.
  • 2012 News Items: History of all news announcements made by CHPC in 2012 including notifications of presentations, downtimes and policy changes.
  • 2011 News Items: History of all news announcements made by CHPC in 2011 including notifications of presentations, downtimes and policy changes.
  • 2010 News Items: History of all news announcements made by CHPC in 2010 including notifications of presentations, downtimes and policy changes.
  • 2009 News Items: History of all news announcements made by CHPC in 2009 including notifications of presentations, downtimes and policy changes.
  • 2008 News Items: History of all news announcements made by CHPC in 2008 including notifications of presentations, downtimes and policy changes.
  • 2007 News Items: History of all news announcements made by CHPC in 2007 including notifications of presentations, downtimes and policy changes.
  • 2006 News Items: History of all news announcements made by CHPC in 2006 including notifications of presentations, downtimes and policy changes.
  • 2005 News Items: History of all news announcements made by CHPC in 2005 including notifications of presentations, downtimes and policy changes.
  • 2004 News Items: History of all news announcements made by CHPC in 2004 including notifications of presentations, downtimes and policy changes.
  • Downtimes: Downtimes in the current year.
  • 2012 Downtimes: Downtimes in 2012.
  • 2011 Downtimes: Downtimes in 2011.
  • 2010 Downtimes: Downtimes in 2010.
  • 2009 Downtimes: Downtimes in 2009.
  • 2008 Downtimes: Downtimes in 2008.
  • 2007 Downtimes: Downtimes in 2007.
  • 2006 Downtimes: Downtimes in 2006.
  • 2005 Downtimes: Downtimes in 2005.
  • 2004 Downtimes: Downtimes in 2004.
  • Current System Status: Summary of Current System Status of HPC Systems
  • Newsletters: CHPC Newsletters, past and present (PDF and HTML)
  • Presentations

Most Recent News Items:

CHPC wiki (wiki.chpc.utah.edu) will be down starting at Noon June 12th for updates

Posted: June 11, 2013

All Clear: Update was completed as of 2:07 p.m. June 12th, 2013.

We will be taking wiki.chpc.utah.edu down for a brief outage to update the Confluence software to a newer version. The process will begin at noon tomorrow, June 12th, 2013.


All Clear: CHPC File Server Outage - CHPC_FS

Posted: June 7, 2013

The CHPC_FS file server had problems this morning that affected some home directories and web servers. We now have the all clear that it is up and running again. If your desktop mounts a file system from CHPC_FS, you should reboot your desktop before reporting issues to us.

The hardware which failed has been replaced and the file server is now up and functioning properly.

As usual, please report any issues to issues@chpc.utah.edu


CHPC DOWNTIME: All clear for tasks scheduled for June 4th, 2013

Posted: June 4, 2013

All clusters are online and scheduling jobs except apexarch (HIPAA) and turretarch (UCS).

A few other systems will also remain offline until tomorrow as scheduled.


UPDATE on DOWNTIME

Posted: June 4, 2013

The updates on the redbutte and drycreek file systems have been completed and they have been brought back online. All affected home and group directories should now be available. If you are not seeing the expected file systems from your desktop system, please reboot and see if that resolves the issue.

Work is continuing on the cooling maintenance at the Komas Datacenter. We will send another message when this has been completed and the clusters are back online.


DOWNTIME: June 4th and June 5th, 2013 - Updates and status

Posted: June 4, 2013

June 4th, 2013

7:00 a.m. Downtime Started:

  • HPC Cluster downed (UP, EM, SDA, APEX, IBRIX, UCS, netapp)
  • meteo/atmos/wx nodes downed
  • allocation manager downed

7:45 a.m.

  • time2, time3, and first part of Phase I VM move powered down.
  • kachina, swasey, homerfs, pxe, bamboo powered down.

8:30 a.m.

  • Movers loading truck with equipment moving from SSB data center to DDC.

9:20 a.m.

  • CMSS arrived to begin maintenance work on HVAC system in Komas data center.

9:30 a.m.

  • Movers loading the last of the gear from SSB. Then heading up to Komas (for Apex Arch, UCS, and NetApp) before delivering to the DDC.

9:45 a.m.

  • Uplink to INSCC has been moved to new Arista switch
  • All ToR switches at DDC are now connected to new Arista core
  • Routing interfaces for the following have been moved from Komas core to Router Core (UCS, Apex)
  • Routing interfaces for hidden arch have been moved from SSB to Router Core

11:00 a.m.

  • Movers leaving Komas for DDC

12:20 p.m.

  • Updates on Fileservers completed; directories are back online

1:00 p.m.

  • Physical move complete to DDC
  • Work on wiring continues

2:00 p.m.

  • Some service machines in DDC up and running (alloc)

3:00 p.m.

  • More service machines in DDC up and running (time1, time2, pxe)
  • trmm, wx, atmos up at DDC
  • CMSS work complete in Komas, starting to bring up clusters

4:00 p.m.

  • Telluride up and scheduling jobs
  • Kachina and Swasey up

6:30 p.m.

  • Updraft up and scheduling jobs

7:00 p.m.

  • Ember up and scheduling jobs

June 5th, 2013

12:45 p.m.

  • Apexarch and UCS servers up and scheduling jobs

Upcoming Downtime - Tuesday June 4, 2013

Posted: May 28, 2013

Duration: All day, starting at 6:30am

CHPC will have a downtime on June 4, 2013 starting at 6:30am to do maintenance on the cooling system in the Komas Datacenter. During this downtime we will also start the move of equipment from Komas and the SSB Datacenter to the new Downtown Datacenter (DDC), as was previously announced. Firmware updates on SOME file systems will also be completed (see below).

CLUSTERS --- During this downtime Ember, Updraft, Sanddunearch and Telluride will be down for most of the day. Reservations are in place to drain the batch queues by June 4th at 6:30am, so any job that will not finish before the start of the downtime will not be allowed to start until the downtime is finished. A notice will be sent out when the clusters have been brought back online and the batch queues restarted.
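
The drain rule described above can be sketched as a simple time comparison. This is only an illustration, not the scheduler's actual logic: the function name `can_start_before_downtime` is hypothetical, and it assumes the decision is based on a job's requested wall time and that a job ending exactly at the start of the downtime is still allowed.

```python
from datetime import datetime, timedelta


def can_start_before_downtime(now, walltime, downtime_start):
    """Return True if a job started at `now` with the requested `walltime`
    would finish no later than `downtime_start`.

    Assumption: the comparison uses the requested wall time, not the
    job's actual run time, and the boundary case (finishing exactly at
    the downtime start) is allowed.
    """
    return now + walltime <= downtime_start


# Downtime start taken from the announcement: June 4, 2013 at 6:30am.
downtime_start = datetime(2013, 6, 4, 6, 30)

# Example submission time (hypothetical): June 3, 2013 at 6:00pm.
now = datetime(2013, 6, 3, 18, 0)

# A 12-hour job would end at 6:00am on June 4, before the downtime.
print(can_start_before_downtime(now, timedelta(hours=12), downtime_start))  # True

# A 24-hour job would still be running at 6:30am, so it is held.
print(can_start_before_downtime(now, timedelta(hours=24), downtime_start))  # False
```

Under these assumptions, requesting a shorter wall time is the only way for a job submitted close to the downtime to start before it.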

FILE SYSTEMS AFFECTED --- The redbutte and drycreek file systems will be taken offline at about 8:30am for firmware updates; a notice will be sent out once this work has been completed. This space includes:

  • HOME directory space for the following groups: All ATMOS faculty, Smithp, Cheatham, Baron, Steele, Yandell, Zhdanov and Gregg
  • The following group spaces: All ATMOS faculty group spaces (named piname-groupx, where x=1-3), baron-group1, steele-group1, yandell-group1, gregg-group1, cheatham-group1-3, chpc-vis1, avey-group1, sandick-group1, stoll-group1, voelk-group1, molinero-group1, arup-storage2, chpc-group1, bowen-group1, and smiskovic-group1

MOVE TO DDC --- The following equipment will be moved to the DDC during this downtime:

  • Atmospheric Sciences cluster (atmos, meteo and wx, and nodes, except gl nodes) - Expect an extended downtime for these servers of approximately 2 days beginning June 4th at 7am
  • kachina.chpc.utah.edu and swasey.chpc.utah.edu - Expect extended downtime of 2 days
  • phase I of VM Farm - No downtime expected
  • Apexarch cluster and homerfs - Expect extended downtime of 2 days
  • UCS Nodes and attached storage - Expect extended downtime of 2 days


MPI library updates

Posted: May 27, 2013

We have updated the MPICH2 and MVAPICH2 libraries on CHPC clusters and administered desktops to versions 3.0.4 and 1.9 respectively. The main advantage of the new versions is support of most of the MPI 3.0 standard, including non-blocking collective communication.

There should not be any need for changes on the user's end (compilation or running), but if you encounter any problems, please open a ticket at issues@chpc.utah.edu.


Upcoming Downtime - Tuesday June 4, 2013

Posted: May 24, 2013

Duration: All day

CHPC will have a downtime on June 4, 2013 starting at 6:30am to do maintenance on the cooling system in the Komas Datacenter. During this downtime we will also start the move of equipment from Komas and the SSB Datacenter to the new Downtown Datacenter (DDC), as was previously announced.

During this downtime Ember, Updraft, Sanddunearch and Telluride will be down for most of the day. Reservations are in place to drain the batch queues by June 4th at 6:30am. Work will also be done on a number of the file systems. Details on these file system outages will be given next week.

The equipment that will be moved to the DDC during this downtime:

  • Atmospheric Sciences cluster (atmos, meteo and wx, and nodes, except gl nodes) - Expect an extended downtime for these servers of approximately 2 days beginning June 4th at 7am
  • kachina.chpc.utah.edu and swasey.chpc.utah.edu - Expect extended downtime of 2 days
  • phase I of VM Farm - No downtime expected
  • Apexarch cluster and homerfs - Expect extended downtime of 2 days
  • UCS Nodes and attached storage - Expect extended downtime of 2 days

MPI workshop at CHPC

Posted: May 21, 2013

Duration: June 17 9am-3pm, June 18 9am-3pm

CHPC will be a satellite site for the Pittsburgh Supercomputing Center's two-day workshop focusing on MPI programming. This is an excellent opportunity to expand MPI programming skills beyond the short presentations that we teach.

For more details and the schedule of the workshop, visit http://www.psc.edu/index.php/training/mpi-programming. The MPI workshop at the University of Utah will be held in the INSCC Auditorium, INSCC 110. Local staff will be on site to address local questions, and we will also be able to ask the speakers questions via the webcast.

We encourage everyone, not just University of Utah affiliates, to attend this workshop. If you have any local questions, please send them to issues@chpc.utah.edu.

Registration is available through the XSEDE site, portal.xsede.org/course-calendar/-/training-user/class/124. If you don't have an XSEDE portal account, you can create one for free. The MPI workshop is also free and open to the public.