CHPC provides several options for home directory file systems.
General HPC home directories
The General HPC home directory storage system is the default home directory file system that is available to groups free of charge. If your group has not purchased storage or you do not fit into one of the other categories listed below, then this is the space where your CHPC home directory will be provisioned. This file system has a 50 GB per user quota which is enforced. CHPC can provide temporary increases to this space. THIS SPACE IS NOT BACKED UP. It is assumed important data will be copied to a departmental file server or another location.
Owner home directories - New solution as of August 2017
CHPC currently allows CHPC PIs with sponsored research projects to buy in to storage at a price determined by cost recovery. The current limit for this space is 1TB/group, and all members of the research group will have home directories on this space. The hardware for the current home directory solution was purchased in the summer of 2017. Initially, it was sold at a price of $1250/TB for the 5 year warranty period of the hardware. The current price is prorated to $1000/TB for the remaining warranty lifetime.
This solution is based on an offering from Dell known as their Compellent solution. In this solution there are two RAID disk based copies, one of which is the primary storage with which users will normally interact. The second copy is used as a failover, and is effectively a replicated copy of the primary side. In case of a hardware issue with the primary copy, the failover copy will become the active working copy until repairs can be made. For increased performance, solid state drives are used in a transparent manner as a first tier for I/O to this space, in front of the larger capacity traditional spinning drives. Note that these features, including the failover copy, will be present for all home directories, including the default 50GB ones provided to users whose groups do not purchase the larger home directory space.
Note that having a redundant copy is not a backup solution, as any change on the primary side, for example deleting or overwriting a file, will be synced to the secondary side. The price of this solution also includes a backup, with nightly incremental and weekly full backups and a two week retention window.
We will continue to prorate the cost of this storage based on the remaining warranty time. When it is time to refresh the hardware, CHPC will contact all groups who have purchased space about the new pricing policy. This file system is backed up to tape, with nightly incremental and weekly full backups and a two week retention window. Please contact us by emailing firstname.lastname@example.org and request to meet with us to discuss your needs and timing.
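The proration mentioned above can be illustrated with a short calculation. This is only a sketch: the source does not state the formula CHPC uses, so linear proration over the warranty period is an assumption, as is the figure of 4 years remaining (chosen because it reproduces the quoted $1000/TB). The function name `prorated_price` is hypothetical.

```python
def prorated_price(full_price_per_tb, warranty_years, years_remaining):
    """Linear proration (an assumption): charge only for the fraction
    of the warranty period that is still left."""
    if not 0 <= years_remaining <= warranty_years:
        raise ValueError("years_remaining must be between 0 and warranty_years")
    return full_price_per_tb * years_remaining / warranty_years


# $1250/TB over a 5 year warranty, with an assumed 4 years remaining
print(prorated_price(1250, 5, 4))  # 1000.0
```

Contact CHPC for the actual current price; the calculation only shows how a prorated figure of this kind can arise.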
Group Level Storage File Systems
CHPC currently allows CHPC PIs with sponsored research projects to buy in to file storage at a price determined by cost recovery. A more detailed description of this storage offering is available. The current pricing is $150/TB for the lifetime of the hardware, which is purchased with a 5 year warranty. CHPC purchases the hardware for this storage in bulk (currently 320TB at a time) and then sells it to individual groups in TB quantities, so depending on the amount of group storage space you are interested in purchasing, CHPC may have the storage to meet your needs on hand. Please contact us by emailing email@example.com and request to meet with us to discuss your needs and timing.
Archive backups of group level storage are available by request for the cost of the backup tapes. These backups are performed quarterly. We recommend that groups purchase enough tapes to allow for two copies, so that backups can be alternated between the two sets. Contact us at firstname.lastname@example.org for current pricing and to request an archive of your group space.
Scratch File Systems
Several scratch file systems are available on the HPC clusters. THESE FILE SYSTEMS ARE NOT BACKED UP. This space is provided for users to store intermediate files required for the duration of a job on one of the HPC clusters. On these scratch file systems, files that have not been accessed for 60 days are automatically scrubbed. There is no charge for this service.
The current scratch file systems are:
- /scratch/general/lustre - a 700 TB Lustre parallel file system accessible from all CHPC resources
- /scratch/kingspeak/serial - a 175 TB NFS system accessible from all CHPC resources except lonepeak
- /scratch/lonepeak/serial - a 33 TB NFS system accessible from all interactive nodes and from the compute nodes of Lonepeak
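Because files that have not been accessed for 60 days are scrubbed, it can be useful to check which of your scratch files are past that threshold. The following is a minimal sketch, not CHPC's scrubbing tool: it assumes the scrubber's criterion matches the access time (atime) reported by the file system, which may not be exact on every mount, and the function name `stale_files` is hypothetical.

```python
import os
import time


def stale_files(root, max_age_days=60, now=None):
    """Yield (path, days_since_access) for files under `root` whose
    last access time is more than `max_age_days` in the past."""
    now = time.time() if now is None else now
    cutoff_seconds = max_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks and does not update atime
                atime = os.lstat(path).st_atime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            age_seconds = now - atime
            if age_seconds > cutoff_seconds:
                yield path, age_seconds / 86400
```

For example, `for path, days in stale_files("/scratch/general/lustre/" + os.environ["USER"]): print(path, round(days))` would list candidates, assuming your files live in a per-user directory of that name (an assumption; adjust the path to your own layout).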
Linux defines a temporary file system at /var/tmp, where temporary user and system files are stored. CHPC cluster nodes set up temporary file systems as a RAM disk with limited capacity. All interactive and compute nodes also have spinning-disk local storage at /scratch/local. If a user program is known to need temporary storage, it is advantageous to set the environment variable TMPDIR, which defines the location of temporary storage, and point it to /scratch/local. Local disk drives range from 40 to 500 GB depending on the node, which is much more than the default temporary file system provides.
/scratch/local can also be used for storing intermediate files during a calculation; however, be aware that getting to these files after the job finishes will be difficult, since they are local to the compute node and not directly accessible from the cluster interactive nodes.
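As an illustration of pointing TMPDIR at the node-local disk, here is a minimal Python sketch. It assumes that a per-user subdirectory under /scratch/local is an acceptable layout (the source does not specify one) and that the job's Python process is what needs the temporary space; the function name `use_local_scratch` is hypothetical.

```python
import os
import tempfile


def use_local_scratch(base="/scratch/local"):
    """Point TMPDIR at a per-user directory under `base`.

    Returns the directory chosen, or None if `base` does not exist
    or is not writable (TMPDIR is then left unchanged).
    """
    if not (os.path.isdir(base) and os.access(base, os.W_OK)):
        return None
    user = os.environ.get("USER", "unknown")
    target = os.path.join(base, user)
    os.makedirs(target, exist_ok=True)
    os.environ["TMPDIR"] = target
    # tempfile caches its choice; clear it so TMPDIR is consulted again
    tempfile.tempdir = None
    return target
```

Calling `use_local_scratch()` early in a job script means later `tempfile` calls, and child processes that honor TMPDIR, write to the local disk rather than the RAM disk. Files left there remain on the compute node, so copy anything you need back to home, group, or scratch space before the job ends.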
CHPC now has a new archive storage solution based on object storage, specifically Ceph, a distributed object store suite developed at UC Santa Cruz. We have an initial raw capacity of 1.15PB, with a cost of $80/TB of raw space. In order to calculate the cost per TB of usable space you must consider the replication configuration. Initially, we will be offering a 6+3 erasure coding configuration, which results in a price of $120/TB of usable capacity for the 5 year lifetime of the hardware. As we currently do with our group space, we will operate this space in a condominium model by reselling this space in TB chunks. This space is a standalone entity, and will not be mounted on other CHPC resources.
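The step from $80/TB raw to $120/TB usable follows from the erasure coding overhead: with 6 data chunks and 3 coding chunks, every 6 TB of user data occupies 9 TB of raw space. A small sketch of that arithmetic (the function name `usable_price_per_tb` is hypothetical):

```python
def usable_price_per_tb(raw_price_per_tb, data_chunks, coding_chunks):
    """Price per usable TB under a k+m erasure coding layout.

    Each usable TB consumes (k + m) / k TB of raw capacity.
    """
    total_chunks = data_chunks + coding_chunks
    return raw_price_per_tb * total_chunks / data_chunks


print(usable_price_per_tb(80, 6, 3))  # 120.0
```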
One of the key features of the archive system is that users manage the archive directly, unlike the tape archive option. Users can move data in and out of the archive storage as needed: they can archive milestone moments in their research, store an additional copy of crucial instrument data, or retrieve data as needed. This archive storage solution will be accessible via applications that use Amazon's S3 API. GUI tools such as Transmit (for Mac) as well as command-line tools such as s3cmd and rclone can be used to move the data. In addition, Globus can be used to access this space; note, however, that the Globus Ceph plugin is a new tool that is still being developed and should be treated as such.
It should also be noted that this archive storage space is for use in the general environment, and is not for use with regulated data; CHPC is actively working on vetting this solution for human genomic data that is covered by NIH’s dbGaP policies.
The backup policies of each type of storage have been described above.
Mounting CHPC Storage
For making direct mounts of home and group space on your local machine see the instructions provided on our Data Transfer Services page.
For more information on CHPC Data policies, visit: File Storage Policies