UPDATE on EMERGENCY OUTAGE of /scratch/ibrix
Posted: May 2, 2013
Here is an update:
/scratch/ibrix/chpc_gen has been brought back online, but it will be taken down again next week, most likely on Monday. /scratch/ibrix/icse will remain down. The batch queues have been restarted, and they will remain up even after /scratch/ibrix is taken back down next week. Please note that if you had a job running when the file system was taken down this morning, it may have hung and might die when the file system is mounted. Also, if you find a machine where the mount of this file system is missing, please send us an issue report.
On Monday an HP service engineer is expected to arrive to continue work on resolving the ongoing issues with the /scratch/ibrix file system. Unfortunately, we cannot estimate how long this will take, but we expect it may take multiple days. At this point, HP engineering and CHPC staff have no indication that the data on /scratch/ibrix/icse is at risk.
When the engineer arrives, /scratch/ibrix/chpc_gen will be taken down again. It will most likely be down for multiple days. We suggest that users SELECTIVELY move data that they need for the next few days from /scratch/ibrix/chpc_gen to other locations such as group file systems, /scratch/serial, /scratch/general, or home directories (listed in order of preference). Please keep in mind that these alternate locations are much smaller than the /scratch/ibrix/chpc_gen space, so users cannot move all of their data off /scratch/ibrix/chpc_gen.
The batch queues will not be paused when work resumes on the /scratch/ibrix file system. Users can run jobs that use /scratch/ibrix/chpc_gen over the weekend, but any use of this file system needs to be complete by 8 a.m. Monday, as we will not be able to give advance notice of when the file system will be taken down. Any jobs that will run after Monday morning should not use this space, as they will die or hang when the file system is taken offline.