Storage Services at CHPC

CHPC currently offers four different types of storage: home directories, group space, scratch file systems and a new archive storage system.  All storage types except for the archive storage system are accessible from all CHPC resources.  Data on the new archive storage space must be moved to one of the other spaces in order to be accessible. Home directories and group spaces can also be mounted on local desktops.  See the Data Transfer Services page for information on mounting CHPC file systems on local machines along with details on moving data to and from CHPC file systems. 

In addition, we have limited tape backup systems for both home directories and group spaces.

Note that the information below is specific to the general environment.  All four types of storage also exist in the protected environment (PE); however, the nature of the storage, the pricing and the policies differ in the PE.  See the Protected Environment page for more details.

***Remember that you should always have an additional copy or possibly multiple copies, on independent storage systems, for any crucial/critical data. While storage systems built with data resiliency mechanisms (such as RAID and erasure coding mentioned in the offerings listed below or other similar technologies)  allow for multiple component failures, they do not offer any protection for large scale hardware failures, software failures leading to corruption, or for accidental deletion or overwriting of data.  Please  take the necessary steps to protect your data to the level you deem necessary.***

Home Directories

CHPC provides several options for home directory file systems.

General HPC  home directories

The General HPC home directory storage system is the default home directory file system that is available to groups free of charge.   If your group has not purchased storage or you do not fit into one of the other categories listed below, then this is the space where your CHPC home directory will be provisioned. This file system has a 50 GB per user quota which is enforced. CHPC can provide temporary increases to this space. THIS SPACE IS NOT BACKED UP. It is assumed important data will be copied to a departmental file server or another location.
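As a rough illustration of checking usage against the 50 GB quota described above, the sketch below sums file sizes under a directory and reports the result as a fraction of the quota. It is not a CHPC tool. The function names are hypothetical, the quota is assumed to be decimal gigabytes, and summed file sizes only approximate whatever the file system's own quota accounting reports.

```python
import os

# 50 GB per-user quota from the text; treating GB as 10**9 bytes is an assumption.
QUOTA_BYTES = 50 * 10**9


def directory_usage_bytes(root):
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return total


def quota_fraction(used_bytes, quota_bytes=QUOTA_BYTES):
    """Return usage as a fraction of the quota (1.0 means the quota is full)."""
    return used_bytes / quota_bytes
```

Calling `directory_usage_bytes(os.path.expanduser("~"))` and passing the result to `quota_fraction` gives an estimate of how close a home directory is to the limit.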

Owner home directories - New solution as of August 2017

CHPC currently allows CHPC PIs with sponsored research projects to buy in to storage at a price determined by cost recovery. The current limit for this space is 1TB/group, and all members of the research group will have home directories on this space.  The hardware for the current home directory solution was purchased in the summer of 2017.  Initially, it was sold at a price of $1250/TB for the 5 year warranty period of the hardware.  The current price is prorated to $1000/TB for the remaining warranty lifetime.

This solution is based on Dell's Compellent offering.  In this solution there are two RAID disk-based copies, one of which is the primary storage with which users will normally interact.  The second copy is used as a failover, and is effectively a replicated copy of the primary side.  In case of a hardware issue with the primary copy, the failover will become the active working copy until repairs can be made. For increased performance, solid state drives are used in a transparent manner as a first tier for I/O to this space, in front of the larger capacity traditional spinning drives. Note that these features, including the failover copy, will be present for all home directories, including the default 50GB ones provided to users whose groups do not purchase the larger home directory space.

Note that a redundant copy is not a backup solution: any change on the primary side, such as deleting or overwriting a file, is synced to the secondary side.  The price of this solution also includes a backup, with nightly incremental and weekly full backups and a two week retention window.

We will continue to prorate the cost of this storage based on the remaining warranty time.  When it is time to refresh the hardware, CHPC will contact all groups who have purchased space about the new pricing policy.  This file system is backed up to tape, with nightly incremental and weekly full backups and a two week retention window. Please contact us by email to request a meeting to discuss your needs and timing.
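The proration described above can be sketched as simple arithmetic. The exact formula CHPC uses is not stated, so the sketch below assumes linear proration over the 5 year warranty; the function name is hypothetical. Under that assumption, the initial $1250/TB price with four years remaining comes to $1000/TB.

```python
def prorated_price(initial_price, warranty_years, years_remaining):
    """Linearly prorate a price by the fraction of the warranty remaining.

    Linear proration is an assumption; the source does not give the formula.
    """
    if not 0 <= years_remaining <= warranty_years:
        raise ValueError("years_remaining must be between 0 and warranty_years")
    return initial_price * years_remaining / warranty_years


# Initial price and warranty length are taken from the text.
price_now = prorated_price(1250, warranty_years=5, years_remaining=4)
```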

Group Level Storage File Systems

CHPC currently allows CHPC PIs with sponsored research projects to buy in to file storage at a price determined by cost recovery.  A more detailed description of this storage offering is available.  The current pricing is $150/TB for the lifetime of the hardware, which is purchased with a 5 year warranty.  CHPC purchases the hardware for this storage in bulk (currently 320TB at a time) and then sells it to individual groups in TB quantities, so depending on the amount of group storage space you are interested in purchasing, CHPC may have the storage to meet your needs on hand.  Please contact us by email to request a meeting to discuss your needs and timing.

Archive backups of group level storage are available by request for the cost of the backup tapes. These backups are performed quarterly. We recommend that groups purchase enough tapes to allow for two copies, so that backups can be alternated between the two sets. Contact us for current pricing and to request an archive of your group space.

Scratch File Systems

Various scratch file systems are available on the HPC clusters. THESE FILE SYSTEMS ARE NOT BACKED UP. This space is provided for users to store intermediate files required during a job on one of the HPC clusters. On these scratch file systems, files that have not been accessed for 60 days are automatically scrubbed. There is no charge for this service.
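To see which files might be at risk under the 60 day policy above, a user could list files whose last access time is older than the cutoff. The sketch below does that. It assumes the scrubber judges age by the file's access time (atime) as reported by the file system; how CHPC's scrub actually measures access is not stated here, and the function name is hypothetical.

```python
import os
import time

# From the text: files not accessed for 60 days are scrubbed.
SCRUB_AGE_DAYS = 60


def stale_files(root, max_age_days=SCRUB_AGE_DAYS, now=None):
    """Return sorted paths under root whose atime is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                atime = os.lstat(path).st_atime
            except OSError:
                # The file disappeared or is unreadable; skip it.
                continue
            if atime < cutoff:
                stale.append(path)
    return sorted(stale)
```

Running it with a smaller `max_age_days` (for example 50) gives advance warning of files approaching the cutoff.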

The current scratch file systems are:

  • /scratch/general/lustre - a 700 TB Lustre parallel file system accessible from all CHPC resources
  • /scratch/kingspeak/serial - a 175 TB NFS system accessible from all CHPC resources except lonepeak
  • /scratch/lonepeak/serial - a 33 TB NFS system accessible from all interactive nodes and from the compute nodes of Lonepeak

Temporary File Systems

Linux defines temporary file systems at /tmp and /var/tmp, where temporary user and system files are stored. CHPC cluster nodes set up these temporary file systems as a RAM disk with limited capacity. All interactive and compute nodes also have local spinning-disk storage at /scratch/local. If a user program is known to need temporary storage, it is advantageous to set the environment variable TMPDIR, which defines the location of temporary storage, to /scratch/local. Local disk drives range from 40 to 500 GB depending on the node, which is much more than the default /tmp size. /scratch/local can also be used for storing intermediate files during a calculation; however, be aware that getting to these files after the job finishes will be difficult, since they are local to the compute node and not directly accessible from the cluster interactive nodes.
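The TMPDIR advice above can be applied from inside a program as well as from a job script. The following is a minimal Python sketch, not CHPC-provided code: it uses /scratch/local when that directory exists and is writable, and otherwise keeps the system default. The helper names `choose_tmpdir` and `use_tmpdir` are hypothetical.

```python
import os
import tempfile

# Node-local disk path named in the text.
LOCAL_SCRATCH = "/scratch/local"


def choose_tmpdir(preferred=LOCAL_SCRATCH):
    """Return preferred if it is a writable directory, else the system default."""
    if os.path.isdir(preferred) and os.access(preferred, os.W_OK | os.X_OK):
        return preferred
    return tempfile.gettempdir()


def use_tmpdir(path):
    """Point TMPDIR at path for this process and any child processes."""
    os.environ["TMPDIR"] = path
    # Clear tempfile's cached directory so it re-reads TMPDIR on next use.
    tempfile.tempdir = None
    return tempfile.gettempdir()
```

After `use_tmpdir(choose_tmpdir())`, files created through the `tempfile` module, and by child processes that honor TMPDIR, land on the local disk rather than in the RAM-backed /tmp.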

Archive Storage 

CHPC now has a new archive storage solution based around object storage, specifically Ceph, a distributed object store suite developed at UC Santa Cruz.  We have an initial raw capacity of 1.15PB, with a cost of $80/TB of raw space.  In order to calculate the cost per TB of usable space you must consider the replication configuration.  Initially, we will be offering a 6+3 erasure coding configuration, which results in a price of $120/TB of usable capacity for the 5 year lifetime of the hardware.  As we currently do with our group space, we will operate this space in a condominium model by reselling it in TB chunks.  This space is a standalone entity, and will not be mounted on other CHPC resources.
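The usable-space price follows directly from the erasure coding layout. In a 6+3 configuration each object is stored as 6 data chunks plus 3 coding chunks, so 9 TB of raw space holds 6 TB of data. The short sketch below reproduces the arithmetic; the function names are illustrative only.

```python
def raw_per_usable(data_chunks, coding_chunks):
    """Raw TB consumed per usable TB for a k+m erasure coding layout."""
    return (data_chunks + coding_chunks) / data_chunks


def cost_per_usable_tb(raw_cost_per_tb, data_chunks, coding_chunks):
    """Scale the raw price by the raw-to-usable ratio."""
    return raw_cost_per_tb * (data_chunks + coding_chunks) / data_chunks


# $80/TB raw with 6+3 erasure coding, the figures given in the text.
usable_price = cost_per_usable_tb(80, data_chunks=6, coding_chunks=3)
```

With the figures from the text, `raw_per_usable(6, 3)` is 1.5 and `usable_price` is 120.0, matching the quoted $120/TB.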

One of the key features of the archive system is that users manage the archive directly, unlike the tape archive option. Users can move data in and out of the archive storage as needed: they can archive milestone moments in their research, store an additional copy of crucial instrument data, or retrieve data when required. This archive storage solution will be accessible via applications that use Amazon’s S3 API.  GUI tools such as Transmit (for Mac) as well as command-line tools such as s3cmd and rclone can be used to move the data. In addition, Globus can be used to access this space; however, note that the Globus Ceph plugin is a new tool that is still being developed and should be treated as such.
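As an illustration of the command-line route, the sketch below builds `rclone copy` and `s3cmd put` command lines for uploading data to an S3-compatible archive. The remote name, bucket name and paths in the usage are placeholders, not real CHPC values, and the helper functions are hypothetical. It assumes an rclone remote has already been set up with `rclone config` and that s3cmd has been configured with the archive's endpoint and credentials.

```python
import shlex


def rclone_copy_command(source, remote, bucket, prefix=""):
    """Build an `rclone copy SOURCE REMOTE:BUCKET/PREFIX` command as a list."""
    dest = f"{remote}:{bucket}"
    if prefix:
        dest += "/" + prefix.strip("/")
    return ["rclone", "copy", source, dest]


def s3cmd_put_command(source, bucket, key=""):
    """Build an `s3cmd put FILE s3://BUCKET/KEY` command as a list."""
    return ["s3cmd", "put", source, f"s3://{bucket}/{key.lstrip('/')}"]


def as_shell(command):
    """Render a command list as a safely quoted shell string (Python 3.8+)."""
    return shlex.join(command)
```

For example, `as_shell(rclone_copy_command("/scratch/local/run1", "my-archive", "mylab", "milestones"))` produces a line that can be pasted into a terminal, or the list can be passed to `subprocess.run` unchanged.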

It should also be noted that this archive storage space is for use in the general environment, and is not for use with regulated data;  CHPC is actively working on vetting this solution for human genomic data that is covered by NIH’s dbGaP policies.

Backup Policy

The backup policies of each type of storage have been described above. 

Mounting CHPC Storage

For making direct mounts of home and group space on your local machine see the instructions provided on our Data Transfer Services page. 

Additional Information

For more information on CHPC Data policies, visit: File Storage Policies

Last Updated: 2/16/19