DOWNTIME -- Oct 14 starting at 6:30am

Posted: October 3, 2013

Duration: All day, starting at 6:30am

Overview: During this downtime the quarterly cooling maintenance on the Komas Datacenter will be performed, the Redbutte File Server will be relocated from the SSB Datacenter to the Downtown Datacenter, the /uufs/ application file system will be moved to new hardware, and system updates will be applied to the HOMERFS file system.

Detailed Impact to Users:

  • Updraft and Sanddunearch clusters will have their batch queues drained by 6:30am in preparation for the clusters to be shut down for the cooling maintenance in the Komas Datacenter. They will be brought back online once the maintenance is complete.
  • Telluride, Apexarch and Turretarch clusters will have their batch queues drained by 6:30am in order to move the application file system to new hardware. These clusters will be brought back once this move has been completed.
  • The protected environment file system, HOMERFS, will have firmware updates applied. Apexarch and SWASEY will be unavailable until the updates are complete and HOMERFS is returned to service.
  • The Redbutte file server will be moved from SSB to the Downtown Datacenter. Our goal is to have it back up before the end of the day, but this will most likely be late in the day. While we do not expect any problems with this move, CHPC recommends that any critical data in the GROUP spaces, which are either not backed up or only backed up with quarterly archives, be moved or copied elsewhere as a precaution. Many groups have space on either the salt flat or dry creek file servers; another alternative for temporary storage would be one of the scratch file systems. The following groups are impacted:
    • HOME directory: Baron, Cheatham, Cliu, Garrett, Gregg, Horel, Jenkins, Jiang, Krueger, Lin, Mace, Paegle, Perry, Reichler, Smithp, Steele, Steenburgh, Strong, Whiteman, Yandell, Zhdanov, Zipser, and Zpu.
    • GROUP spaces: cliu-group1, garrett-group1, horel-group, krueger-group1, mace-group1, steenburgh-group1, lin-group1, reichler-group1, strong-group1, whiteman-group1, zpu-group1, cheatham-group1, cheatham-group2, avey-group1, baron-group1, gregg-group1, sandick-group1, steele-group1, stoll-group1, voelk-group1, and yandell-group1.
  • Note that kingspeak (and ember if it is back in service by this time) will not have their queues drained or be shut down. HOWEVER, if you are in one of the groups with a HOME directory listed in the previous impact item, any running jobs and any interactive sessions will hang when the file server is turned off for the move. We recommend that users in these groups log out of any interactive sessions, and plan on having their running jobs exit before 6:30am on Oct 14. Any idle jobs in the batch queue should have batchholds placed on them.