Git is a decentralized version control system and content management tool. It allows developers and teams to manage projects by maintaining all versions of files, past and present, allowing for reversion and comparison; facilitating exploration and experimentation with branching; and enabling simultaneous work by multiple authors without the need for a central file server. It can be used offline for version control and revision history or in conjunction with a remote repository to make working in teams easier and safer.
It is important to note that Git itself is not a tool for backing up files. The loss of a local Git repository in connection with a file system failure is permanent unless a remote copy of the repository exists.
There is a short training video that parallels some of the topics discussed on this page.
While it's possible to use Git on most systems without configuration, we strongly recommend using the most recent version of Git, which can be accessed through the module system:
$ module load git
This should prevent any problems that may arise as a result of version incompatibilities.
Remote Git repositories
The Center for High Performance Computing maintains a GitLab Community Edition server for users who are interested in collaborating and sharing internally. You can log in with your University of Utah credentials (your ID and password).
Alternatively, third-party hosting services can be used; some of the most popular are GitHub, GitLab, and Bitbucket. Each has its strengths and weaknesses, so seek out reviews, policies, and recommendations before you start.
This is intended for users who have some experience with Git. If you haven't seen these commands before, consider reading through the brief tutorial first.
|git help operation||Read more about operation|
|git init||Create a Git repository in the current directory (if it doesn't already exist)|
|git clone URL destination||Copy the project at URL into the (new) directory destination|
|git remote add remote_name URL||Add a remote named remote_name with location URL; the primary remote is typically named origin|
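|git config user.name "Firstname Lastname"||Set your name to Firstname Lastname (use --global to apply it to all of your projects)|
|git config user.email "firstname.lastname@example.org"||Set your email to firstname.lastname@example.org (use --global to apply it to all of your projects)|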
|git status||Display the status of the current branch (shows which files are present in the staging area)|
|git diff --cached||Display what will be committed; alternatively, use git diff to display changes that have not been staged|
|git log --stat --summary||Display an overview of the project history, including the summary (commit message) and changes|
|git add filename other_file||Add filename and other_file to the staging area|
|git rm --cached filename||Remove filename from the staging area|
|git commit -m "message"||Create a new commit with description message|
|git pull remote_name branch_name||Fetch commits on branch branch_name of the remote remote_name and merge them into the current branch; when an upstream branch is set up, you can use git pull alone|
|git push remote_name branch_name||Push commits on branch branch_name to the remote remote_name; when an upstream branch is set up, you can use git push alone|
|git checkout -b branch_name||Create (and switch to) new branch branch_name|
|git checkout branch_name||Switch to (existing) branch branch_name|
|git branch||Display the branches available; marks the current branch|
|git branch -d branch_name||Delete the branch branch_name|
|git merge branch_name||Merge commits in branch branch_name into the current branch (if there are no conflicts)|
This is a small sample of how you might set up a Git repository on GitLab to share your work with others. For an in-depth explanation of the steps, refer to the tutorial section.
- Create or locate a remote repository on GitLab (or another service). The URL of this project will be of the form https://gitlab.chpc.utah.edu/gitlab-user/project-name.
- Create a local repository in a directory on your computer.
  - Without an existing (remote) repository:
    $ module load git
    $ cd your_directory
    $ git init
    $ git remote add origin https://gitlab.chpc.utah.edu/gitlab-user/project-name
    Crucially, gitlab-user is not necessarily your university ID. To determine what should be used here, sign in to GitLab and locate your user ID. This can be changed in your settings, and it may be a good idea to use your university ID. You can also refer to the "Create a project on GitLab" section to determine the URL you should use.
  - From an existing repository:
    $ module load git
    $ git clone https://gitlab.chpc.utah.edu/gitlab-user/project-name your_directory
    $ cd your_directory
    Again, gitlab-user may be something other than your university ID. Refer to the URL of the project on GitLab to determine what to use.
- Stage and commit your files. Refer to the "Edit and stage your files" section for more information about adding files to the index and the "Commit your changes" section for information about commits. You can exclude certain files with the .gitignore file.
  $ git add .
  $ git commit -m "This is a description of the commit!"
- Push your changes to the remote.
  If you are collaborating with others or working from multiple computers, it may be a good idea to use the git pull command first. See the "Conflicts" section for an explanation.
  $ git push origin master
This is not meant to be a comprehensive guide to Git; in fact, it makes many generalizations and has no mention of many important features. It is meant only to introduce some of the concepts of version control and cover the commands necessary to get started. If you are looking for a more comprehensive tutorial or specific information, please try the official tutorial.
This tutorial assumes you're using, or plan to use, a remote repository on the Center for High Performance Computing instance of GitLab. The process should be very similar for other hosting providers.
Create a project on GitLab
If you plan to share your work with others, you'll likely need a remote repository to ensure availability. If you're using GitLab, this can be done by creating a new "project." The project contains the remote repository and adds features such as a description, a wiki, and editing tools that can be used in an Internet browser. Each project has a "visibility level" for security. "Private" (default) requires that you explicitly grant access to each user who will be working on (or simply viewing or cloning) the project, "Internal" allows all authenticated users to view or clone the project (but editing privileges must still be granted explicitly), and "Public" allows anyone to view or clone the project. It's also possible to create projects for groups of users, which is recommended if you have many projects with similar permissions.
You can use HTTPS or SSH when transferring files to and from your computer. When using HTTPS, you must sign in with your university ID and password (as you would on the GitLab website), while with SSH, you generate a pair of keys and create a single password. This decision is largely based on personal preference. The remainder of this tutorial will use HTTPS for consistency. In most cases, you won't want to use the URL given by the project page when using HTTPS. Instead, use the URL of the project page itself (you can copy it directly from your browser), such as https://gitlab.chpc.utah.edu/gitlab-user/project-name. Git will then prompt you for both your username and password when pushing changes to the remote instead of assuming your username (in this case) is "gitlab-user." That name is often different from your university ID, which is what you must use for authentication.
Create a local repository
Without an existing repository (new project)
To start using Git, you'll need to initialize it in the directory of your project.
$ module load git
$ cd your_project_directory
$ git init
$ git remote add origin https://gitlab.chpc.utah.edu/gitlab-user/project-name
From an existing project
You can copy an existing repository to your own computer with the git clone command.
$ module load git
$ git clone https://gitlab.chpc.utah.edu/gitlab-user/project-name your_project_directory
$ cd your_project_directory
Verify that the local repository exists
To verify everything's worked up to this point, run git status in your project directory.
$ git status
On branch master
Initial commit
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        your_files/
nothing added to commit but untracked files present (use "git add" to track)
If this didn't work, you'll receive an error. If this happens, check your version of Git and the directory you're in and try again. The remainder of this tutorial assumes everything is working as intended, so it's best to resolve any issues now.
$ git status
fatal: Not a git repository (or any parent up to mount point /your/home/directory)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
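If you would rather check from a script than read the git status output, the following sketch uses git rev-parse --is-inside-work-tree. The scratch directory created with mktemp is only a stand-in for your_project_directory, and the sketch assumes that directory is not already inside another repository.

```shell
set -eu
# Hypothetical scratch directory standing in for your_project_directory.
project_dir="$(mktemp -d)"
cd "$project_dir"

# Before "git init" the check normally fails with a non-zero exit status
# (assuming the scratch directory is not inside some other repository).
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    before="yes"
else
    before="no"
fi

git init --quiet

# After "git init" the same command prints "true".
after="$(git rev-parse --is-inside-work-tree)"
echo "inside a work tree before init: $before, after init: $after"
```

The exit status makes this convenient in job scripts, where a readable error message is not enough to stop the script.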
Configure your name and email
This step is important, but often neglected. Your commits will be associated with the email address you provide here (including on most third-party hosting services) and your name will help colleagues identify you.
$ git config user.name "Firstname Lastname"
$ git config user.email "firstname.lastname@example.org"
If you plan to use the same name and email address for all of your projects, you can configure them globally.
$ git config --global user.name "Firstname Lastname"
$ git config --global user.email "firstname.lastname@example.org"
This saves your information and uses it for all projects (unless explicitly changed on a given project).
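To confirm which values a particular repository will use, you can read the settings back. The sketch below works in a throwaway repository created with mktemp (an assumption made so that nothing in your real configuration is touched), and the name and address are placeholders.

```shell
set -eu
# Hypothetical scratch repository; the name and address are placeholders.
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init --quiet

# Set the identity for this repository only (stored in .git/config).
git config user.name "Firstname Lastname"
git config user.email "firstname.lastname@example.org"

# Read the values back; --local restricts the lookup to this repository.
local_name="$(git config --local user.name)"
local_email="$(git config --local user.email)"
echo "name:  $local_name"
echo "email: $local_email"
```

Running git config user.email without --local shows the value Git will actually use, which is the repository setting if one exists and the global setting otherwise.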
Edit and stage your files
You can use any editor to modify your files and run git status periodically to view their status. You'll notice that any "untracked" files are listed under a heading that says "use 'git add <file>...' to include in what will be committed"; a command like git add your_file_name.ext will add the file to the staging area, officially called the index, which contains all of the changes that will be made with your commit. If you're satisfied with all of your changes, it's possible to add them to the staging area all at once by running git add . from the top of the project directory. You can exclude files selectively with the .gitignore file. Once you've added files to the staging area, they will be visible when running git status. You can remove files from the staging area with git rm --cached your_file_name.ext.
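The staging cycle described above can be sketched as follows. The example runs in a scratch repository created with mktemp, notes.txt is a hypothetical file name, and git status --porcelain is used instead of plain git status only because its short output is easier to compare.

```shell
set -eu
# Hypothetical scratch repository and file name, for illustration only.
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init --quiet

echo "first draft" > notes.txt

# Untracked files are marked "??" in the short status format.
untracked="$(git status --porcelain)"

# Stage the file; a newly added file is marked "A".
git add notes.txt
staged="$(git status --porcelain)"

# Remove it from the index again; the file itself stays on disk.
git rm --cached --quiet notes.txt
unstaged="$(git status --porcelain)"

printf '%s\n%s\n%s\n' "$untracked" "$staged" "$unstaged"
```

Note that git rm --cached on a file that has already been committed stages its removal from the repository, so it is best suited to files that were added by mistake.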
Commit your changes
Once you're satisfied with the state of your project (specifically, the state of the staging area), you should "commit" your changes. This is analogous to saving a snapshot of your work (it's important to remember that Git alone should not be used to back up files: you could still lose them). You can compare commits and, if necessary, revert a file to its state in an earlier commit. Each commit contains a brief message about the changes that were made, and the easiest way to provide one is with the git commit -m "This is your message" command. You can create as many commits as you'd like before pushing your work to a remote repository.
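As a rough illustration of comparing commits and bringing back an earlier version of a file, the sketch below makes two commits in a scratch repository and then restores the first version. The directory, the file name paper.txt, and the identity are placeholders chosen for this example.

```shell
set -eu
# Hypothetical scratch repository; names and messages are placeholders.
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init --quiet
git config user.name "Firstname Lastname"
git config user.email "firstname.lastname@example.org"

echo "version one" > paper.txt
git add paper.txt
git commit --quiet -m "Add first version of paper"
first_commit="$(git rev-parse HEAD)"

echo "version two" > paper.txt
git add paper.txt
git commit --quiet -m "Revise paper"

# Compare the file between the first commit and the current one.
git diff "$first_commit" HEAD -- paper.txt

# Bring back the file as it was in the first commit.
# The restored content is staged but not yet committed.
git checkout "$first_commit" -- paper.txt
restored="$(cat paper.txt)"
commit_count="$(git rev-list --count HEAD)"
echo "restored content: $restored ($commit_count commits so far)"
```

Restoring a file this way does not rewrite history; the earlier content simply becomes a new change that you can commit.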
Push your changes to the remote
Git has no system to prevent collaborators (or even individuals working with multiple branches) from having two entirely different versions of the same file. Comparing and merging documents has always been tedious, but many methods of facilitating collaboration have developed in recent years to make it easy or even unnecessary. Some software, like the content management system used to edit the website you're reading now, requires that users "check out" a document before editing (much like a library book: once it's been checked out, nobody else can use it). Others, like Google Docs, allow people to work simultaneously and display changes in real time but require a consistent Internet connection and only allow for one version of a file (there's little room for independent testing).
Git's solution is somewhere in the middle: it can be used offline and independently, but it allows users to discuss conflicts and makes finding them much easier. In fact, Git will not let you push your changes until you have incorporated (and, if necessary, reconciled) the commits that are already on the remote. In other words, you must git pull the most recent version from the remote before you can git push your own. If there are conflicts, they're identified (use git diff to see them) at this point. You should try your best to fix any conflicts manually: once you've pulled the more recent version and committed the result, Git will let you push regardless of whether you've actually corrected the problems. This system allows all developers to work simultaneously without worrying about what others are doing, but it only works if everyone knows how to use it. It's still possible to overwrite someone else's work, but this allows for a much more dynamic development process than other methods and stays out of the way when not needed. For instance, two people can edit the same paper simultaneously. Each time there is a difference in the text, the better option can be chosen, or a new one written, to create an entirely new document with work from both contributors. No time or effort is wasted in comparing text that is the same in both versions.
When you're ready to push your changes, it's generally a good idea to git pull first. Often, this won't cause any problems and you can proceed with your git push. However, if there are conflicts, you will receive a warning:
$ git pull origin master
Username for 'https://gitlab.chpc.utah.edu': your_id
Password for 'https://your_id@gitlab.chpc.utah.edu':
From https://gitlab.chpc.utah.edu/gitlab-user/project-name
 * branch            master     -> FETCH_HEAD
Auto-merging your_file_name.ext
CONFLICT (content): Merge conflict in your_file_name.ext
Automatic merge failed; fix conflicts and then commit the result.
The file with the conflict will be modified to contain both versions:
<<<<<<< HEAD
This is an example of what it might look like. This is from the first version.
=======
This is from the second version!
>>>>>>> 57a4c537d0cc429794dfed77d02e5a1bfca9d91b
The differences can be identified with the git diff command and should be resolved manually. When you're satisfied with the files, add them to the staging area and create a new commit. Now, you can proceed with your push:
$ git push origin master
If everything worked, your changes should now be available on the remote. Check the project on GitLab to confirm.
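The pull-before-push rule can be tried out safely without touching GitLab. In the sketch below, a local bare repository stands in for the remote and two working copies (hypothetically named alice and bob) stand in for two collaborators; all names, files, and messages are assumptions made for the example. The two collaborators edit different files, so the pull merges cleanly.

```shell
set -eu
# Identity for the throwaway commits below (placeholders).
export GIT_AUTHOR_NAME="Firstname Lastname"
export GIT_AUTHOR_EMAIL="firstname.lastname@example.org"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"
export GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work="$(mktemp -d)"
cd "$work"

# A local bare repository stands in for the GitLab remote.
git init --quiet --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master

# The first collaborator creates the project and pushes it.
git init --quiet alice
git -C alice symbolic-ref HEAD refs/heads/master
git -C alice remote add origin "$work/remote.git"
echo "Introduction" > alice/paper.txt
git -C alice add paper.txt
git -C alice commit --quiet -m "Start paper"
git -C alice push --quiet origin master

# The second collaborator clones the project.
git clone --quiet remote.git bob

# Both make commits; the first collaborator pushes first.
echo "Ideas for later" > alice/notes.txt
git -C alice add notes.txt
git -C alice commit --quiet -m "Add notes"
git -C alice push --quiet origin master

echo "Methods" >> bob/paper.txt
git -C bob commit --quiet -am "Add methods section"

# This push is rejected because the remote has a commit bob lacks.
if git -C bob push --quiet origin master 2>/dev/null; then
    first_push="accepted"
else
    first_push="rejected"
fi

# Pulling merges the remote work; afterwards the push succeeds.
git -C bob pull --quiet --no-rebase --no-edit origin master
git -C bob push --quiet origin master
echo "first push was $first_push; second push succeeded"
```

Had both collaborators changed the same lines of paper.txt, the pull would have stopped with a conflict like the one shown above, to be resolved and committed before pushing.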
Create and use branches
Branches allow developers to work on multiple versions of a project simultaneously. They can be used, for example, to test features that may or may not be included in a project. If it's decided they are to be included in the main version of the project, the branches can be merged simply and issues should be identified (as with potential issues between local and remote files). If the new version of the project isn't needed, the branch can be abandoned or deleted entirely without repercussions.
A new branch can be created with git checkout -b new_branch_name. The branch will contain the same files and commits as the branch it was created from.
You can view available branches and identify the branch you're currently on with the git branch command.
$ git branch
  master
* new_branch_name
Now, if you modify files, they'll be modified on the new branch. If you want to switch to a different branch, you can use the git checkout command again, like git checkout master. Be sure to commit your changes on one branch before switching to another.
To merge one branch into another, use the git merge command. Start on the branch you'd like to merge changes into and run git merge other_branch. Everything said about conflicts between local and remote versions of a file holds for branching, too. If commits on both branches change the same part of a file, the conflicts will need to be resolved manually.
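A complete branch cycle, including a conflicting merge and its resolution, might look like the sketch below. It runs in a scratch repository created with mktemp; the file name, the sentences, and the commit messages are made up for illustration, and the conflict is "resolved" by simply overwriting the file with the text to keep.

```shell
set -eu
# Hypothetical scratch repository; names and messages are placeholders.
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init --quiet
git symbolic-ref HEAD refs/heads/master
git config user.name "Firstname Lastname"
git config user.email "firstname.lastname@example.org"

echo "original sentence" > paper.txt
git add paper.txt
git commit --quiet -m "Start paper"

# Change the same line on a new branch and on master.
git checkout --quiet -b new_branch_name
echo "sentence from the branch" > paper.txt
git commit --quiet -am "Rewrite sentence on branch"

git checkout --quiet master
echo "sentence from master" > paper.txt
git commit --quiet -am "Rewrite sentence on master"

# The merge stops with a conflict, so the command exits with an error.
if git merge new_branch_name >/dev/null 2>&1; then
    merge_result="clean"
else
    merge_result="conflict"
fi

# Resolve by writing the text you want to keep, then stage and commit.
echo "sentence combining both versions" > paper.txt
git add paper.txt
git commit --quiet -m "Merge new_branch_name, resolving conflict in paper.txt"

# The branch is fully merged now, so -d will delete it.
git branch --quiet -d new_branch_name
remaining="$(git branch --list new_branch_name)"
echo "merge result: $merge_result"
```

In practice you would edit the conflict markers in the file by hand rather than overwrite it, but the stage-and-commit step that concludes the merge is the same.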
The .gitignore file (a child of the project directory) is used to exclude certain files from most Git operations. The files listed in this file will not be tracked by Git (without explicit instruction). It might be used by a developer who wants to share source code but not binaries, or by a scientist working with sensitive information who wants to publish his or her tools while ensuring the data itself is not available to the public.
Your .gitignore file uses patterns to exclude files. As a result, if the files you are excluding have similar names, you can simplify the process. For instance,
experiment.out
testing.out
case1.out
case2.out
might become (assuming all files ending in ".out" are to be excluded)
*.out
You can read more about patterns in the Git documentation.
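To see the effect of a pattern before committing anything, you can test it in a scratch repository. The sketch below assumes the pattern *.out and a few made-up file names (analysis.py is hypothetical), then uses git status and git check-ignore to show which files Git will skip.

```shell
set -eu
# Hypothetical scratch repository and file names, for illustration only.
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init --quiet

touch experiment.out testing.out case1.out case2.out analysis.py
echo "*.out" > .gitignore

# Only files that are not ignored are listed as untracked.
visible="$(git status --porcelain)"
echo "$visible"

# check-ignore exits with status 0 when a path matches an ignore pattern.
if git check-ignore --quiet case1.out; then
    ignored="yes"
else
    ignored="no"
fi
echo "case1.out ignored: $ignored"
```

Running git check-ignore -v with a file name additionally reports which line of which .gitignore file matched, which helps when a pattern excludes more than intended.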
The README file (a child of the project directory) describes a project and provides important information to potential users and contributors. It's typically displayed on the main page of a project on services like GitHub and GitLab. Most are written with Markdown syntax and named README.md. This is where people tend to look when searching for information about your project.
While Git can manage binary files, it works best with plain text. For instance, if you were writing a paper, it would be a good idea to use plain text (such as LaTeX) in place of a document created with an editor like Microsoft Word. Documents saved in plain text can be compared far more easily (often side-by-side) and can usually be viewed in a browser without downloading the file.
Run git pull to get the most recent version of a project before you start editing it. This way, you won't have to resolve as many conflicts when it comes time to push your changes to the remote.