When we work together on a computer file, we need a way to save it that helps to keep track of each change made to it,
so our work is stored safely and so we do not create many different copies of the same file and get confused trying to figure out
which is the most current or the best one to keep. When we are working on coding projects, alone or in groups, we need to share our files in
the cloud, or a shared internet directory, and we also need to keep track of each stage of the coding process, the changes we made on our own
local computers before sharing with the group. Sometimes we have to go back and restore something from an early stage that we later deleted. Each saved stage of a file’s existence is a version, and a number of software tools are available to store safely all the versions that we care about. Some of these tools are more thorough and reliable than others. Git is among the most reliable and is very widely used in the coding community, largely because of its distinctive way of storing every committed change on each collaborator’s own computer.
Git is a version control system: a system designed to manage changes to computer files by keeping a record of each version they have taken over time. Some version control systems (like Dropbox) do this in a centralized way by keeping one master copy of a file and storing all history of its changes in a single location in the cloud (remotely). In Dropbox, if two users share a file directory, and each person attempts to change the same file, they each make a change locally, but run into a problem if they try to save their file in the same shared file location in the Dropbox cloud. If they save changes at around the same time, the centralized system creates a copy of one of the files with an altered file name (usually by putting a (1) or a (2) at the end of the original filename in Dropbox), and the owners of the directory must then decide which version to keep or how best to reconcile the two versions.
Git, by contrast, records each person’s committed changes as new states in the project’s history and lets the two sets of changes be merged. It can do this because it is a distributed (or decentralized) version control system, a system in which each repository carries its own distinct history. People who collaborate on a project upload and download shared versions of files from a web-based storage space called a remote repository (conventionally named origin), but when they work with those remote files they create copies of them on their own computers in a local repository, and Git manages information about their local version history as well as their remote history.
Git tracks each change you have deliberately committed to a directory. Every time you commit (or save) the state of your project with Git, it takes a picture of what all your files look like at that moment and stores a reference to that snapshot. To be efficient, if files have not changed, Git doesn’t store the file again, just a link to the previous identical file it has already stored. Advanced Git users may quite deliberately keep their local working repository different from the shared remote repository because they are working on something unstable that may not be usable by their team yet. When such a user is finally ready to push a change to the remote repository, their team members may review the change and reject it if they find something wrong with it. Git connects multiple local repositories but preserves each one’s distinct version history, and when GitHub users push their local repository files to a remote repository, they make it possible for other GitHub users to pull in those files and continue to work with them, giving them new homes and local histories. The way in which Git users interact with each other and make files available through remote repositories is sometimes described as a web of trust, in which files can be developed in multiple locations and shared with others based on their merit, and can be denied or rejected when they cause problems.
So let’s say you and a team member are both updating the same XML document inside of your
shared repository. You make your changes, save them, and upload them back to the shared
repository. The problem, for most VCSs, comes when your team member is working on the
same document as you at the same time: one of you is about to have your work
overwritten. Git keeps that from happening. It will not let the second upload silently
replace the first; instead, the second person is asked to pull and merge the first
person's changes before pushing, so both sets of changes are kept in the project history.
If you were both working on different parts of the same document, Git will
automatically merge your changes together without losing any work along the
way. If you both happen to edit the exact same segment of code, Git will report a
merge conflict and point out where the two versions differ, allowing you and
your team to decide which version to keep. You can even revert to an earlier version at
any time, relying on the snapshot that Git preserved of every change ever
committed by all project collaborators.
One of the most important aspects of Git is its emphasis on local operation. Git doesn’t need to go out to the server to get the history of a project and display it for you – it simply reads it directly from your local computer. This means you see the project history almost instantly, without having to wait for server network lag. This also means that there is very little you can’t do if you’re offline. When you manage a project using Git, you keep copies of all the project files on both your local machine and the online GitHub server. You work on your local machine, periodically pushing your changes onto the server so they’ll be accessible to your project partners, and periodically pulling changes made by your project team from the server to your local machine. So as long as you have recently pulled in any changes you can work offline until you are ready to push your changes. For more information regarding what Git is and the benefits of using Git please visit the online version of Pro Git, a Git how-to guide.
Unlike the Graphical User Interfaces (GUIs), the command line (sometimes called the
shell or terminal) is the only place you can take advantage of
all the functions Git has to offer. The GUIs typically implement only a subset of Git's
functionality, enough to use Git at the most basic level. For many of us the command line is
strange and unfamiliar, but it used to be the main way that people interacted with
computers before GUIs came to dominate the computer experience. The command line is a
place where people can work more freely and have many powers to process files, powers
that are usually not available or far more difficult to access in GUIs.
Think of the command line as a way to navigate and work with your computer’s
files and systems by typing commands (strings of text) at prompts, instead of using the
mouse. GitHub offers a helpful, interactive tutorial on using the Git command-line interface. To
learn more about the power tools you can work with at the command line, see Lifehacker’s
A Command Line Primer for Beginners.
Below is a list of key terms you will need to understand while using Git and GitHub. Each of these terms will be discussed in greater detail throughout this tutorial, but think of this as your quick reference guide to the basics.
commit: also called a revision, this is an individual change to a file (or set of files). It’s like when you save a file, except that with Git, every time you commit it creates a unique ID (the
hash) that allows you to keep a record of what changes were made, when, and by whom. Commits usually contain a commit message, which is a brief description of what changes were made.
The above definitions are compiled from the following three sites, which can be visited for definitions of even more Git terms: kernel.org, betterexplained.com, and help.github.com
Because GitHub is a social networking site, where people post their code so that it will
be accessible to others, using it requires creating an account. You can create a free
account by navigating to github.com. You are not required to create an account using your pitt.edu address;
however, associating an educational email address with your account may give you
additional benefits (including five free private repos). To
create your account, enter a user name, email address, and password and then click the
green button that says
Sign up for GitHub. Be sure to select the free option, and
then click the green button that says
Finish sign up. After you create your
account, GitHub will prompt you to verify your email address and decide on the frequency
of GitHub email notifications. Make sure to follow the link in the verification email.
It is important that you check GitHub regularly while you are enrolled in this coding course,
so you may decide to allow all notifications (knowing you
can change this later). You will then be directed back to your account.
Note: We have highlighted some recommended features of your GitHub account that you should take some time exploring further.
Git is available for download on all three of these platforms: Linux, Mac, and Windows. To download Git on your local machine follow the instructions available on the git-scm.com site. Since platform downloads differ we are not going to go through a step-by-step installation process.
Our suggested settings for a Windows installation are as follows:
Adjusting your PATH environment: we recommend selecting
Use Git Bash only, as it provides an unmodified PATH.
Configuring the line ending conversions: we recommend choosing the option to
Checkout Windows-style, commit Unix-style line endings.
Git Bash; make sure the box to create this shortcut is checked.
Installation on a Mac is much simpler. For a detailed installation guide, we suggest following this tutorial for assistance while setting up Git on a Mac.
Now that you have Git on your computer you can access it through the command line. On a
Windows installation you will be using the
Git Bash command shell and on a Mac
installation you will be using Terminal.
The key thing to working with Git is always knowing where to find your files in the
Finder (on Mac) / File Explorer (on Windows) and in the Terminal/Git Bash Shell. So you
need to save directories in a place where you can easily see them. In your Finder/File
Explorer, make yourself a
GitHub directory that lives inside Documents (or
on your desktop if that is a more logical place for you). Inside the GitHub
directory you will clone each of your project repositories. In the Terminal/Git Bash
Shell, you can navigate to your
GitHub directory from your home directory (where a new shell window normally opens) by typing
cd Documents/GitHub/ if stored in Documents or
cd Desktop/GitHub/ if stored on your desktop.
cd means "change
directories" and in the above commands you are stepping down into Documents (or Desktop)
and then into the
GitHub folder. Use
ls to list out the contents of the
directory you have stepped down into.
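The navigation steps above can be tried as one short session. This is a minimal sketch that assumes your GitHub directory lives in Documents; the mkdir line is our addition so that the example works even if the directory does not exist yet, and pwd is an extra command that prints where you currently are.

```shell
# Create the GitHub directory inside Documents if it is not there yet
# (assumed location; use Desktop/GitHub if you prefer your desktop).
mkdir -p "$HOME/Documents/GitHub"

# Step down from your home directory into Documents and then into GitHub.
cd "$HOME/Documents/GitHub"

# Print the full path of the directory you are now in.
pwd

# List the contents of this directory (your cloned repos will appear here).
ls
```

If you ever lose track of where you are, typing cd with nothing after it returns you to your home directory.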
When you first use your command shell you will need to configure Git so that the commits you make on your local computer are associated with your remote GitHub account. Use the following commands to configure your local Git installation (these commands can be run in Terminal/Git Bash Shell from any directory):
git config --global user.name "YOUR NAME"
git config --global user.email "YOUR EMAIL ADDRESS"
Be sure to use the same email address you used when signing up for your GitHub account. These commands add your name and email to a file named .gitconfig in your home directory. To verify that you entered your information correctly use the command: git config --list.
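As a rough illustration of the configuration step, the session below sets a name and an email address and then reads each value back. The name and the address you@example.com are placeholders of ours, not values from this tutorial, so substitute your own before running it.

```shell
# Placeholder identity: replace both values with your own details.
git config --global user.name "YOUR NAME"
git config --global user.email "you@example.com"

# Read a single setting back to confirm it was stored.
git config --global --get user.name
git config --global --get user.email

# Or list every setting Git can currently see.
git config --list
```

Reading one value at a time with --get can be easier than scanning the full list when you only want to confirm a single setting.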
For more information on the
config command visit the Pro Git guide on configuration. Note: in order to cache your
username and password, you may need to follow further directions found on help.github.com. To test whether your information has been cached, try
pushing to a remote repository on GitHub to which you have been added as a collaborator.
Projects on GitHub are stored in repos (short for repositories), which you can think of as equivalent to folders or directories on your computer. Repos can contain plain files and subdirectories, so you can use the subdirectories to organize your project files according to whatever hierarchy makes sense to you. Initially you may want to put all of your project files into the main repo directory, and as you add additional files and your project becomes more complex, you can think about creating subdirectories and distributing the files among them according to file type or other criteria. Your project mentor will advise you about how to structure the directory space for your project.
Your remote project repo will reside on the GitHub server, which is the remote server
that is accessible to others. It is possible to edit files directly on the server using
the GitHub interface, but the typical interaction with GitHub is to work on copies of
the project files (your clone of the repo) on your own machine and
then, when you are satisfied with the state of your work, to push (upload) your new or
modified files to the remote server so that your project partners will have access to
them. Similarly, when your project partners push their changes to the server, those
won’t be reflected in the copies on your local machine until you pull (download) them.
The process of uploading files you have modified from your local machine to the GitHub
server, and of downloading from the GitHub server files that your project partners may
have developed or modified and uploaded, is called syncing (short for synchronization).
Syncing with command line requires two basic commands:
git pull and
git push. The command
git pull can be broken down into
git fetch and
git merge, but we will discuss the
distinction later in this tutorial. When you begin a work session, you should start by
syncing your local project space with the server using
git pull, so that
you’ll catch up on any changes other project contributors have made. When you reach a
stopping point in your work (when you have new or updated files
that your project partners need to see), you should commit your changes (
git commit) and sync again, but this time using
git push. You don’t
have to push after every keystroke, but you should commit regularly and push whenever
you reach a reasonably stable intermediate point to which your project partners should have access.
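The work session described above might look like the following. This is only a sketch: so that it can run anywhere, it uses a bare repository in a temporary directory as a stand-in for the GitHub server, and the file name, identity, and commit messages are invented for illustration.

```shell
# Stand-in for the GitHub server: a bare repo in a scratch directory (hypothetical).
work=$(mktemp -d)
git init --quiet --bare "$work/remote.git"

# Clone it, as you would clone your project repo from GitHub.
git clone --quiet "$work/remote.git" "$work/project"
cd "$work/project"
git config user.name "Example User"        # placeholder identity for this scratch repo
git config user.email "user@example.com"

# First commit and push, so the remote has a branch to sync with.
echo "first draft" > notes.txt
git add notes.txt
git commit --quiet -m "Add first draft of notes"
git push --quiet -u origin HEAD

# --- A typical work session ---
git pull --quiet                           # 1. catch up on your partners' changes
echo "second paragraph" >> notes.txt       # 2. do some work
git add notes.txt                          # 3. stage the change
git commit --quiet -m "Add second paragraph to notes"   # 4. commit locally
git push --quiet                           # 5. share the commit with your team
git status --short --branch                # confirm nothing is left to commit or push
```

With a real project you would skip the setup lines and start at git pull inside your existing clone.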
The best way to initiate a new repo, for our group projects, will be to create it through the online web interface. Easy to follow instructions on how to do this can be found on help.github.com. If a repo is created through the web interface it will need to be cloned in order for you to access it and begin working on your local computer. Only one member of your project team will need to create the project repo, and all other team members will clone from that repo.
Alongside version control, one of GitHub’s most useful features is that it supports collaboration. Forget the days of emailing a file back and forth between group members as it is updated, or waiting to edit a document for fear of losing simultaneous edits. GitHub allows multiple collaborators to manage a repository. The only way to handle collaborators is through the web interface:
Settings at the top of the screen.
In the future, if collaborators need to be removed simply navigate this same path and there will be an option beside each collaborator to remove them.
As we explain above, the way you work on your project (create files, edit files, delete files) is to work on them on your local machine and then sync any modifications to the GitHub server, so that they will be accessible to your project partners. The Git term for copying a project from the GitHub server to your computer initially so that you can begin to work on it locally is called cloning. You only have to clone a project once, when you first begin to work on it. Cloning is the copying of a project already in existence on the GitHub server to your local computer so that you can begin to work on it; whereas, syncing is exchanging updates between your local computer and the GitHub server after you’ve already cloned the repo and established a local copy in which you can work. This working model separates saving your work to your local computer (which you should do frequently) and syncing your local clone of the repo with the master copy on the GitHub server, a design that protects you from uploading mistakes to the server. That is, you do your development on your local machine, where only you have access to the files, and when you’re ready to share your work with your project partners, you sync (upload) it explicitly to the server.
Before you clone a repo, you need to understand how to navigate between directories using Git Shell. To begin you will open your Git Shell (command line / terminal interface with Git). We explain in detail how to open this above. When opening Git Shell from a shortcut link you will be placed in the main configuration folder you set up when installing Git. The following image explains how you navigate between directories:
Again, to begin, open your Git Shell (command line / terminal interface). We explain in detail how to open this above.
To clone a repo from the remote online server using the command line you use one easy command:
git clone. After
git clone you paste the URL of the
repo you wish to clone, then a space, and then a name for the folder where the cloned repo will be stored.
This creates a copy of the entire repository on your local machine and adds the original location as a remote so you can pull from it again and push to it if you have permissions. Note: you will only be able to directly push changes to the GitHub server if you have been added as a project collaborator. Later in this tutorial we will discuss how you can contribute to other GitHub projects that you are not a collaborator on via forking and a pull request.
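To make the shape of the command concrete, here is a small sketch. Because we cannot assume a particular GitHub URL, it clones from a bare repository created in a temporary directory (a stand-in of ours); with a real project you would put the repo's GitHub URL in its place. The folder name my-project is also just an example.

```shell
# Hypothetical stand-in for a repo hosted on GitHub.
work=$(mktemp -d)
git init --quiet --bare "$work/remote.git"

# General form: git clone <URL> <folder-name>
git clone --quiet "$work/remote.git" "$work/my-project"

# Step into the new clone and list the remote that clone recorded for you.
cd "$work/my-project"
git remote -v
```

The output of git remote -v shows a remote named origin pointing back at the location you cloned from, which is what later lets git pull and git push work without retyping the URL.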
When making changes to a repo it is important to remember the three states your changes can be in on your local machine: unstaged, staged, committed. If you make edits to a repo, those changes sit unstaged until you have consciously added them to the staging area. This means you can modify a file multiple times, without documenting every single keystroke, before asking Git to track those changes. Once you have added changes to the staging area, Git will notice any new edits that differ from what was added and mark those edits as unstaged. Staging changes allows you to review them before committing.
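The three states can be watched with git status. The sketch below uses a scratch repository and an invented file, page.html; the --short option prints a two-column code in which the first column describes the staging area and the second describes unstaged edits.

```shell
work=$(mktemp -d)
git init --quiet "$work/demo"
cd "$work/demo"
git config user.name "Example User"     # placeholder identity for this scratch repo
git config user.email "user@example.com"

echo "line one" > page.html
git add page.html
git commit --quiet -m "Add page"        # page.html is now committed

echo "line two" >> page.html            # edit the file: the change is unstaged
git status --short                      # prints " M page.html"

git add page.html                       # move the change to the staging area
git status --short                      # prints "M  page.html"

echo "line three" >> page.html          # edit again after staging
git status --short                      # prints "MM page.html": staged plus a newer unstaged edit
```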
To bring your partners' changes from the remote repo into your local repo, you can use either the two commands git fetch and
git merge or the single command
git pull. By using git fetch followed by
git merge you can first see the changes that are being brought into your local repo (or fetched) from the remote repo. This allows you, upon review, to decide whether to merge those remote changes with your local repo. The command
git pull combines the two previously mentioned commands into one step, which can be convenient yet risky if the incoming changes might conflict with files you have edited without committing.
Some believe it is best practice to review all of the
remote files you are bringing into your local repo before merging, and detailed arguments have been made for always using
git fetch and git merge rather than
git pull.
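A fetch-then-review-then-merge session could look like this. It is a sketch under several assumptions of ours: a bare repository in a temporary directory plays the part of GitHub, a second clone plays the part of your project partner, and '@{u}' is Git shorthand for the remote branch that your current branch tracks.

```shell
work=$(mktemp -d)
git init --quiet --bare "$work/remote.git"     # stand-in for GitHub

# Your clone, with one commit pushed.
git clone --quiet "$work/remote.git" "$work/me"
cd "$work/me"
git config user.name "Example User"
git config user.email "user@example.com"
echo "intro" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"
git push --quiet -u origin HEAD

# Your partner's clone, where a new commit is made and pushed.
git clone --quiet "$work/remote.git" "$work/partner"
cd "$work/partner"
git config user.name "Project Partner"
git config user.email "partner@example.com"
echo "partner's addition" >> notes.txt
git commit --quiet -am "Extend notes"
git push --quiet

# Back in your clone: fetch first, review, then merge.
cd "$work/me"
git fetch --quiet                     # download the new commits without touching your files
git log --oneline 'HEAD..@{u}'        # commits the remote has that you do not
git diff --stat HEAD '@{u}'           # which files those commits change
git merge --quiet '@{u}'              # bring them in once you are satisfied
```

Running git pull in place of the last four commands would reach the same end state, just without the chance to look first.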
The staging area is also known as the index. This is an intermediate area where changes can be collected and reviewed before completing the commit. To stage a file, use the command
git add. If you want to stage all of the changed or newly created and deleted files stored in the working directory you can type
git add -A. To specify that you only want to stage a certain type of file you use an asterisk. For example: to stage all of the .html files you type
git add *.html. If there is only a single file you want to stage you can type the filename after the command. For example:
git add index.html.
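The three ways of staging mentioned above can be compared side by side. This sketch uses a scratch repository with four empty, invented files; git status --short marks untracked files with ?? and newly staged files with A.

```shell
work=$(mktemp -d)
git init --quiet "$work/site"
cd "$work/site"
touch index.html about.html style.css notes.txt

git add index.html       # stage one named file
git status --short       # index.html is staged; the other three are untracked

git add *.html           # stage every .html file in this directory
git status --short       # about.html is now staged as well

git add -A               # stage all remaining new, changed, and deleted files
git status --short       # all four files are staged
```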
As you are working on files it is important to remember that, although you are working locally, you will need to check frequently with your project team to be sure that new changes they have made will not cause a merge conflict in the files you are editing. For example: you are editing an XML transcription and one of your team members is going through all of the XML documents and replacing hyphens with Unicode characters. Your partner edits many portions of the same document you are working in and pushes those changes to the GitHub repo. Since you’re not sure whether those changes will cause a merge conflict, you need to set your changes aside without adding them to the staging area so you can pull your partner’s changes in first. Git allows you to do this very simply with the command
git stash. This command stashes your changes outside of your working directory while you pull in your partner’s remote changes. You then can return your stashed files to your working directory and continue by adding your changes to the staging area. For more information on
git stash visit the Pro Git guide on stashing.
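The stash-then-pull routine can be sketched as follows. The setup is our own invention: a bare repository in a temporary directory stands in for GitHub, a second clone stands in for your partner, and the two of you edit different lines of the same file, text.xml. In this situation Git would normally refuse to pull over your uncommitted edit, which is why the edit is stashed first.

```shell
work=$(mktemp -d)
git init --quiet --bare "$work/remote.git"     # stand-in for GitHub

# Your clone, with a five-line file committed and pushed.
git clone --quiet "$work/remote.git" "$work/me"
cd "$work/me"
git config user.name "Example User"
git config user.email "user@example.com"
printf 'one\ntwo\nthree\nfour\nfive\n' > text.xml
git add text.xml
git commit --quiet -m "Add transcription"
git push --quiet -u origin HEAD

# Your partner changes the first line and pushes.
git clone --quiet "$work/remote.git" "$work/partner"
cd "$work/partner"
git config user.name "Project Partner"
git config user.email "partner@example.com"
printf 'ONE\ntwo\nthree\nfour\nfive\n' > text.xml
git commit --quiet -am "Revise first line"
git push --quiet

# Meanwhile you have an uncommitted edit to the last line.
cd "$work/me"
printf 'one\ntwo\nthree\nfour\nFIVE\n' > text.xml

git stash              # set your edit aside; the working directory is clean again
git pull --quiet       # bring in your partner's change
git stash pop          # put your edit back on top of the updated file
cat text.xml           # first line from your partner, last line from you
```

If the two edits had touched the same lines, git stash pop would report a conflict for you to resolve instead of combining them silently.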
Once your changes are staged, record them in your local repo with git commit. To write a commit message right inside of Git Shell type
git commit -m followed by the commit message in quotes. Git requires a commit message when committing. Think of commit messages as the breadcrumbs of your project for yourself and your team members; every commit message documents changes to the project with an author and time stamp.
Writing good commit messages matters when you work on a team, or even when you're working on your own because each commit you make to your repository helps to generate a log of your activities. Think about this as you're writing your commits, and check out this helpful guide on how to write good commit messages!
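To see how commit messages accumulate into a log, here is a small sketch in a scratch repository. The file names, identity, and messages are invented for illustration.

```shell
work=$(mktemp -d)
git init --quiet "$work/demo"
cd "$work/demo"
git config user.name "Example User"     # placeholder identity for this scratch repo
git config user.email "user@example.com"

echo "<p>Hello</p>" > index.html
git add index.html
git commit --quiet -m "Add home page with greeting"

echo "<p>About us</p>" > about.html
git add about.html
git commit --quiet -m "Add about page"

# Full log: each entry shows the hash, author, date, and message.
git log

# Compact log: a short hash plus the first line of each message.
git log --oneline
```

Messages that name what changed, as these do, make the compact log readable at a glance.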
As discussed above, GitHub serves as a social media network that stores and talks to Git repositories. The GitHub interface is useful for many aspects of project management. The key parts of GitHub’s interface besides the remote storage of project files are the issues boards, wikis, and data visualizations. Wikis and issues are often used in projects to manage tasks and communicate between project team members and the GitHub community with project problems and updates. The data visualizations are helpful in mining a project’s information on collaborators and development.
To create or comment on an issue navigate to the desired repo and click on the
Issues tab near the top of the page; notice there is also a tab for Wikis.
Issues and Wikis on GitHub are written in Markdown. In fact,
Markdown is used for styling all forms of writing on the GitHub platform.
GitHub offers a comprehensive guide to writing markdown that we recommend you
review; however, when writing on the GitHub interface there are options similar to
Microsoft Word that can be toggled to have a specific styling applied to your text.
Besides project team communication, another important aspect of project management is
data preservation. Long-term preservation can be defined as the ability to provide
continued access to digital materials, or at least to the information contained in them,
indefinitely. A sustainable preservation program should address organizational issues,
technological concerns and the digital curation/data management process. Using Git and
GitHub already provides one means of data preservation; however, it is important to
realize that all of the data collected through the use of Git / GitHub is also important
to preserve alongside the literal files of your project. One option we have researched
to be viable for creating a backup of your Git and GitHub data is backhub.co. Another nifty tool that
we recommend you update frequently is the online digital archive of websites. By adding updates to this
site you can go back and review previous saved versions of how your website(s) looked.
Besides data preservation it is important to also be able to mine your GitHub data. The
GitHub interface makes this really simple. In any repo, at the top of the page, there
are two tabs that are particularly useful for mining your GitHub data: Pulse and
Graphs. A repo’s Pulse visualizes the activity of the GitHub repo,
while Graphs visualize everything from specific user contributions to the
types of files stored in the repo. These visualizations can be useful when trying to
determine details about how the repo (and therefore project) grows and changes over time.
We recommend our DHClass-Hub Quick Reference Guide.
Here is a list of the commands used most frequently, and some options to modify the results of each command. Some of the commands listed here are associated with advanced project management while others are more basic. For commands to correct merge conflicts and bad commits please refer to our troubleshooting section. We have not listed every possible command; however, we can recommend these sites to review more commands: the Pro Git Guide, siteground.com, atlassian.com, and orga.cat. Another really great thing about Git is that it has been in existence since 2005 and is used widely among computer programmers; therefore, the resources on Git and Git commands are vast!
git help: possibly the most useful Git command, this command lists the most common Git commands in the command/terminal shell. If you follow this command with another command or concept (e.g.,
git help push) Git opens the manual page detailing the command or concept as well as possible options for its use.
git init: initializes a git repository by creating the initial .git directory in a new or in an existing project.
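git clone: copies an existing GitHub repo to your local machine.
git status: shows the state of the working directory and staging area (which files are modified, staged, or untracked) and whether your local branch is ahead of or behind the remote branch as of your last fetch.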
git add: a.k.a. staging changes, adds changes to the staging area. This command is the first step in committing changes to your local version of the repo before pushing them to the remote GitHub version of the repo. By adding the option -A (
git add -A) Git adds, modifies, and removes ALL entries to match the working directory. To specify that you only want to stage a certain type of file you use an asterisk. For example: to stage all of the .html files you type
git add *.html. If there is only a single file you want to stage you can type the filename after the command. For example:
git add index.html
git commit: tells Git to record the changes made to your version of the repo. Every commit needs to have a message that explains what files have been edited/added. After the command add
-m and then the commit message in quotes (
git commit -m "This is where your message goes"). You can skip the command
git add by using
git commit -a -m followed by your commit message; note that -a only picks up changes to files Git is already tracking, so brand new files still need git add. However, best practice is to add and commit in separate steps so you are aware of the files being added before making the commit. If you ever find yourself annoyed with your own commit message, before you push you can actually amend your commit using:
git commit --amend. Don't use this after you've pushed (at that point it is like revising history, and it would tangle up everyone's repo histories).
git stash: used when you want to record the current changes to the working directory, but want to go back to a clean working directory without forever losing those changes and without adding them to the staging area. This command saves your local modifications temporarily outside of your working directory and reverts the working directory to match the last commit (typically used so you can pull in remote changes to a clean directory while avoiding possible merge conflicts and keeping your changes available). The modifications stashed away by this command can be listed with
git stash list, inspected with
git stash show, and restored (potentially on top of a different commit) with
git stash apply.
git pull: this command is made up of two other commands (git fetch and
git merge) and is used to fetch the data from a remote repository and merge it into your local computer’s version of the repository. If working in a fork it is important to remember that changes are pulled from the original remote repo the fork was made from:
git pull upstream master. When pulling from the original repo of a fork it is necessary to first add that original repo as a remote (called
upstream in the previous example).
master refers to the branch of the original repo from which the changes are being pulled, and can be changed to the name of any branch of that original repo in which the desired changes reside.
git remote: used to retrieve a list of the remote repositories associated with the local repository. To create a new remote association use
git remote add followed by a name for the remote (e.g., upstream) and then the URL of the remote repo on GitHub.
git remote -v displays a verbose list of the associated remote repos with the remote URL after the repo name.
git push: updates remote repos to match commits made on your local machine. If working in specific branches you can designate the branch name (that you wish to push) after the command, or use
git push --all to push commits from all local branches.
git branch: used to work with branches. This command followed by a branch name creates a new branch. For a list of both remote-tracking branches and local branches use
git branch -a (git branch --list shows only local branches). Frequently branches are used for experimental development that may or may not be incorporated into the master branch of the repo. Keeping these changes in a branch protects the master branch from bad commits and uncertain files.
git checkout: followed by a branch name allows you to switch into the working directory of a specific branch.
git diff: followed by the names of two branches allows you to compare the differences between the two branches. For example:
git diff master development will show the differences between the master branch and the development branch. If you want to just get a list of the files changed between the two branches you can change the command to
git diff --name-only again followed by the names of the two branches you wish to compare.
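Several of the branch-related commands in the list above can be tried together in a scratch repository. This is a sketch with invented file and branch contents; because the name of the first branch depends on your Git setup (master or main), the example reads it into a variable of our own, main_branch, rather than assuming one.

```shell
work=$(mktemp -d)
git init --quiet "$work/demo"
cd "$work/demo"
git config user.name "Example User"     # placeholder identity for this scratch repo
git config user.email "user@example.com"

echo "stable text" > index.html
git add index.html
git commit --quiet -m "Add index page"
main_branch=$(git rev-parse --abbrev-ref HEAD)   # "master" or "main", depending on your setup

git branch development                  # create a branch named development
git checkout --quiet development        # switch to it
echo "experimental text" >> index.html
git commit --quiet -am "Try an experimental change"

git branch --list                                   # local branches; the current one is starred
git diff --name-only "$main_branch" development     # just the names of the files that differ
git diff "$main_branch" development                 # the full line-by-line differences
git checkout --quiet "$main_branch"                 # return to the original branch
```

After the final checkout, index.html contains only the stable text again; the experimental line lives on in the development branch until you decide to merge it.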
If you are reading this section you probably are at a fearful state of emergency with your project repo due to a merge conflict. The first thing that you should keep in mind however is that you can always undo the changes causing a conflict and go back to the state before the conflict occurred. Another benefit of using Git over many other VCSs is that a merge conflict will not bring your complete team to a halt or cripple your remote project repository. This is because, in Git, conflicts can only occur on a single user’s local machine. In Git, merging is the act of integrating another branch into your current working branch: most often taking changes from a remote branch and combining them with your current working directory on your local machine. Typically Git can seamlessly integrate multiple changes from multiple collaborators; yet, occasionally Git runs into a scenario where the changes overlap thus causing a merge conflict.
When faced with a merge conflict, the first step is to understand what happened. Did one
of your team partners edit the same file on the same lines as you? Did they delete a
file that you modified? Did you both add a file with the same name? Using the command
git status will tell you where you have
unmerged paths or
conflicts. The most common conflict occurs when one partner makes a change on a
particular line in a file, and a fellow team member makes a change on the exact same
line. Git does not automatically assume the last person to push changes is correct and
should over-write the other’s changes. When this sort of conflict occurs, Git writes a
special block into the file that contains the contents of both versions where the
conflict occurred. To complete this type of merge, use <oXygen> (or other text editor) to resolve the conflict, then add the
file and commit it to complete the merge. In the file, Git automatically adds conflict
markers to the affected areas. A conflict-marked area begins with
<<<<<<< and ends with
>>>>>>>. The two conflicting blocks themselves are
divided by a
=======. You have several options here. You can either keep
your changes, take your partner’s changes, or make a brand new change. Whatever you do,
you need to make sure to resolve the conflict such that the file makes sense, and
everyone is happy. The goal is to get the file to look exactly how you want it. You’ll
delete the conflict markers, and just type out a new change or remove the unwanted/wrong
lines. Once you have made those changes you can use
git add and follow
through with making a new commit to push to the remote. Opening the file in your editor
and cleaning it up there is perfectly valid, but the raw markers can be jarring and the
changes hard to tell apart. Using a dedicated merge tool (like Meld or Kaleidoscope)
can make this easier (if you have one installed). You can
configure your tool of choice using the
git config command. Consult your
tool’s documentation for detailed instructions. Then, in case of a conflict, you can
later invoke it by simply typing
git mergetool.
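The conflict described above can be reproduced safely in a throwaway repo. The following is a minimal sketch, not a prescribed workflow: the file name notes.txt, the branch name partner, and the demo identity are all hypothetical, and the partner branch simply stands in for a teammate's work.

```shell
# Sketch: provoke a merge conflict in a temporary repo, inspect it, then resolve it.
set -e
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Demo User"          # hypothetical identity, local to this repo
git config user.email "demo@example.com"

echo "original line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

git checkout -q -b partner                # stands in for a teammate's branch
echo "partner's line" > notes.txt
git commit -q -am "Partner edit"

git checkout -q master
echo "my line" > notes.txt
git commit -q -am "My edit"

git merge partner || true                 # stops: both sides changed the same line
git status --short                        # "UU notes.txt" marks the unmerged path
cat notes.txt                             # shows the <<<<<<<, =======, >>>>>>> block

echo "our agreed line" > notes.txt        # resolve by writing the final content
git add notes.txt                         # mark the conflict as resolved
git commit -q -m "Merge partner, resolve conflict in notes.txt"
```

Here the resolution replaces both versions with a new line, but keeping either side unchanged works the same way: edit the file until it reads as you want, then add and commit.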
Outside of the above solution there are a few other ways of fixing possible merge conflicts and getting rid of bad commits. The need-to-know Git commands for merge conflicts and bad commits based on the conflict and state of the local repo are:
git stash: takes all the changes in your working directory and saves them on a new clipboard. This leaves the local working directory clean; you have no more local changes. Later, at any time, you can restore the changes from that clipboard in your working directory and continue working where you left off. To get a list of all your
stashes use the command
git stash list. When you’re ready to restore a saved stash, you have two options: use
git stash pop to apply the latest
stash and clear it from your stash clipboard, or use
git stash apply followed by the stash name to apply that specified
stash. The second option keeps the stash on the clipboard, and it can be deleted later via
git stash drop followed by the stash name.
git revert: undoes one or more previous commits by introducing a new commit that reverses the earlier one. This doesn’t rewrite the existing project history, which makes it a
safe operation for commits that have already been published to a shared repository (i.e. the GitHub remote repo).
git reset: moves your current branch back to an earlier commit, and can also undo changes in your working directory that haven’t been committed yet. If you do not want to lose your local changes, use
git reset --keep followed by
HEAD or the commit hash ID. If you have decided you want to be rid of all current local changes, use
git reset --hard. This will cause you to lose all current local changes to tracked files. If you are unsure whether you still want those changes, your best bet is to be safe and use
git stash before this command.
git clean: removes untracked files from your working directory. This command is often executed in conjunction with
git reset --hard. Remember that resetting only affects tracked files, so a separate command is required for cleaning up untracked ones. Combined, these two commands let you return the working directory to the exact state of a particular commit.
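As a rough illustration of the stash, reset, and clean commands above, here is a sketch run in a temporary repo. The file names report.txt and scratch.tmp and the demo identity are hypothetical.

```shell
# Sketch: stash and restore a change, then discard everything with reset + clean.
set -e
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "v1" > report.txt
git add report.txt
git commit -q -m "Add report"

echo "half-finished edit" >> report.txt   # uncommitted change to a tracked file
git stash                                 # working directory is clean again
git stash list                            # the saved change appears as stash@{0}
git stash pop                             # restore it and remove it from the list

echo "scratch" > scratch.tmp              # an untracked file
git reset --hard HEAD                     # discards the edit to report.txt only
git clean -n                              # dry run: lists what would be removed
git clean -f                              # actually removes scratch.tmp
```

Running git clean -n first is a cheap safeguard, since files removed by git clean -f cannot be recovered through Git.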
Note: When resetting or reverting a repo you can choose the commit to reset/revert back to
by grabbing the commit hash ID from that commit. A simple way to find commit hash IDs is
to use the GitHub interface by viewing all of the commits in a repo. If you simply want
to go back to the last commit use
HEAD after the command instead of the
commit hash ID.
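Besides the GitHub interface, commit hash IDs can also be read from the command line. The sketch below assumes a temporary repo with a hypothetical file app.txt; it looks up the hash of a bad commit and reverts it.

```shell
# Sketch: find a commit hash ID locally and revert that commit.
set -e
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "good" > app.txt
git add app.txt
git commit -q -m "Good commit"
echo "bad" >> app.txt
git commit -q -am "Bad commit"

git log --oneline                         # abbreviated hash IDs, newest first
bad=$(git rev-parse HEAD)                 # full hash ID of the most recent commit
git revert --no-edit "$bad"               # adds a new commit that undoes the bad one
git log --oneline                         # the bad commit is still in the history
```

Because the bad commit was the latest one, git revert --no-edit HEAD would have had the same effect as passing the hash ID.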
The atlassian.com tutorial on undoing changes, the Pro Git blog titled Reset Demystified, GitHub help guide to merge conflicts, and the git-tower.com ebook are our go-to sources for fixing and preventing merge conflicts and bad commits.
Branching means you diverge from the main line of development and continue to do work without messing with that main line. The master branch is the main branch of a project. When working in projects with multiple collaborators it may be important that the master branch not have errors, and always be in good working order. If everyone is writing new code into the master branch, it’s possible for the master branch to enter an unstable state temporarily. To avoid that, projects may require that all developers work on different parts of the project separately, in their own branches, and merge their individual development branches into the master branch only once they are confident that it won’t have a destabilizing effect. This can be useful even in single-developer projects if you need to maintain multiple versions, such as a stable branch that people should be able to download and use and a development branch that will eventually become the new stable branch, but that isn’t yet stable. A branch can be merged and deleted, but it is always part of the repo in which it was created; thus, working in a branch requires you to already have a cloned repo. Just to be clear, when you clone a repo that has existing branches the entire repo gets cloned including all of the repo’s existing branches.
To really understand the way Git does branching, we need to take a step back and examine how Git stores its data. As stated above, Git doesn’t store data as a series of changesets or differences, but instead as a series of snapshots. When you make a commit, Git stores a commit object that contains a pointer to the snapshot of the content you staged. This object also contains the author’s name and email, the message that you typed, and pointers to the commit or commits that directly came before this commit (its parent or parents). A branch, in Git, is simply a lightweight movable pointer to one of these commits. The default branch name in Git is master. As you start making commits, you’re given a master branch that points to the last commit you made. Every time you commit, it moves forward automatically.
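To make the idea of a branch as a movable pointer concrete, the sketch below records which commit master points to before and after a second commit. It runs in a temporary repo; the file name and identity are hypothetical.

```shell
# Sketch: a branch name resolves to a commit hash, and moves when you commit.
set -e
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "one" > file.txt
git add file.txt
git commit -q -m "First"
first=$(git rev-parse master)             # commit that master points to now

echo "two" >> file.txt
git commit -q -am "Second"
second=$(git rev-parse master)            # master has moved forward automatically

echo "master was $first, is now $second"
git cat-file -p HEAD                      # commit object: tree, parent, author, message
```

The output of git cat-file shows the pieces named above: a pointer to the snapshot (tree), a pointer to the parent commit, the author details, and the commit message.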
Before creating a branch it is important to note what branches are already a part of your
local repo. To do this we use the command
git branch. This command lists
only the branches in the local repo. The master branch came automatically from the
source repo upon cloning as the main branch of the repo. To see all of the local
branches, including the master branch, and the remote branches that can be pushed to we
use the command
git branch --all. Now to create a new local branch use
git branch followed by what you wish to call the branch. For example,
git branch development will create a branch called
development. To verify that a new branch was added, re-run the
git branch command. You should now see the new branch. The star to the left of a
branch name indicates where you are working, and this can be verified with
git status. When you run
git status Git Shell should return a line
listing what branch you are on. For example: if you are working on the master branch Git
will return the line
On branch master.
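The listing and creation steps above might look like this in practice. This is a sketch in a temporary repo with a hypothetical file and identity; in a real clone, git branch --all would also list the remote branches.

```shell
# Sketch: list branches, create one, and confirm which branch you are on.
set -e
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "start" > file.txt
git add file.txt
git commit -q -m "Initial commit"

git branch                                # lists local branches: "* master"
git branch development                    # creates the branch; you stay on master
git branch                                # now lists development and "* master"
git branch --all                          # would also show remote branches, if any
git status                                # first line reads "On branch master"
```

Note that creating a branch does not switch to it: the star stays next to master until you check out the new branch.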
If you want work to appear under a specific branch you will need to switch to the desired
branch before committing changes. To switch branches use the command
git checkout followed by the name of the branch you wish to switch to. For example,
git checkout development will switch you over to the
development branch.
You can verify that you have successfully switched branches by using either of the
previously explained commands:
git branch or
git status.
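A short sketch of switching branches and committing there follows, again in a temporary repo with hypothetical file names and identity. It shows that a commit made after the switch lands on development and leaves master untouched.

```shell
# Sketch: switch to a branch, commit on it, and verify where the work landed.
set -e
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "start" > file.txt
git add file.txt
git commit -q -m "Initial commit"
git branch development

git checkout development                  # switch to the development branch
git branch                                # the star is now next to development
echo "new work" > feature.txt
git add feature.txt
git commit -q -m "Add feature on development"
git status                                # first line reads "On branch development"
git log --oneline master                  # master still has only the initial commit
```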
Once there have been changes made via a branch and you wish to combine those changes with the main (master) branch, you will need to do the following:
Use git diff followed by the names of the two branches you want to compare. For example: git diff master development will show the differences between the master branch and the development branch. If you want to just get a list of the files changed between the two branches you can change the command to git diff --name-only, again followed by the names of the two branches you wish to compare.
Use git branch to verify what branch you are currently working in. If you are not on the master branch you should switch to it using git checkout master.
Use the git merge command followed by the name of the branch being merged into the master branch. For example: git merge development.
Run git diff again with the two branch names, and if Git does not return anything that means there are no differences between the two branches.
If you run git status it will show that local master is ahead of the remote origin. To fix this and bring the remote origin repo up to date with your local master, run the git push command.
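The merge steps above can be sketched end to end as follows. The repo is temporary and has no remote, so the final push is left out; the file names and identity are hypothetical.

```shell
# Sketch: compare two branches, merge development into master, and compare again.
set -e
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "start" > file.txt
git add file.txt
git commit -q -m "Initial commit"
git checkout -q -b development
echo "new work" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

git diff --name-only master development   # lists feature.txt
git checkout -q master                    # merges go INTO the branch you are on
git merge development                     # brings the development commit into master
git diff master development               # prints nothing: the branches now match
```

In a repo cloned from GitHub, git status would now report that master is ahead of origin/master, and a push would publish the merge.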
If you have done work in a branch, and you do not want to merge the changes into the
master branch until others on your team have reviewed it you can push the branch and its
changes before merging. As long as you have the branch selected in Git Shell, you can
use the command
git push origin followed by the name of the branch to
update the remote repo with the new branch. For example:
git push origin
development will push the development branch and its changes to the remote
origin. Now if another collaborator wishes to clone the repo, and bring in your branch
to their local repo they will be able to do so.
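To try pushing an unmerged branch without touching a real GitHub repo, the sketch below uses a local bare repository as a stand-in for the remote origin. That stand-in, the paths, and the identity are assumptions made for the demonstration.

```shell
# Sketch: publish a branch to the remote for review without merging it into master.
set -e
work=$(mktemp -d)
git -c init.defaultBranch=master init -q --bare "$work/remote.git"  # stand-in for GitHub
git -c init.defaultBranch=master init -q "$work/local"
cd "$work/local"
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"
echo "base" > file.txt
git add file.txt
git commit -q -m "Base"
git push -q origin master

git checkout -q -b development
echo "review me" >> file.txt
git commit -q -am "Work for review"
git push origin development               # the branch now exists on the remote
git branch --all                          # lists remotes/origin/development too
```

The remote master is unchanged by this push, so teammates can review the development branch before anyone merges it.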
Some branches are created only temporarily; therefore, you may desire to delete a branch
before or after merging. The command
git branch -d followed by the branch
name will delete the specified branch; if the branch is not fully merged in its upstream
branch or contained in the current branch, this command will fail with a warning.
Otherwise if the branch has been successfully merged with another branch then the
command will go through and the branch will be removed. Use
git branch to
verify deletion. If you wish to remove the branch whether it has been merged or not the
command is altered slightly:
git branch -D followed by the branch name.
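The difference between the two delete options can be seen in a small sketch. The branch names merged-work and experiment are hypothetical, as are the file names and identity.

```shell
# Sketch: -d deletes only fully merged branches; -D deletes regardless.
set -e
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "start" > file.txt
git add file.txt
git commit -q -m "Initial commit"

git branch merged-work                    # same commit as master, so fully merged
git checkout -q -b experiment
echo "idea" > idea.txt
git add idea.txt
git commit -q -m "Unmerged experiment"
git checkout -q master

git branch -d merged-work                 # succeeds
git branch -d experiment || echo "refused: experiment is not fully merged"
git branch -D experiment                  # forces the deletion
git branch                                # only master remains
```

Since -D can discard commits that exist nowhere else, it is worth confirming with git log that nothing on the branch is still needed.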
Part of the open-source etiquette of GitHub is that developers are encouraged to copy and then improve code originally created by others in a separate forked repo. GitHub supports the creation of derived projects through forking. When you fork a repo, you create a copy of it under your own account, where it acquires an independent identity. You can do anything with a repo you created by forking someone else’s project that you can with a repo you created from scratch. The moment that you create the fork, your new repo is no longer synchronized with changes in the repo from which you copied it originally. If you fork a repo and make changes that you would then like to contribute to the original source, you can issue a pull request, inviting the developers of the original repo to merge your forked repo’s changes into their original project.
To begin forking you need to navigate to the repo you wish to fork, and select the
fork button in the top right of the page.
Next, you will need to navigate to the new forked repo created on your account. From
here, follow the steps above on cloning, and clone the repo using
git clone and the HTTPS URL of the forked repo. Make sure you are cloning
the repo that is located on your account and not the source repo you forked from.
In order for the forked repo to reference the source repo, you will need to set up the
remote upstream. Do this with the following command:
git remote add
upstream followed by the HTTPS URL of the source repo. Use the command
git remote -v to show all of the remote versions of
your repository. You should see
origin (the direct remote version of the forked
repo) and the upstream you just created. When you want to bring any new changes made on
the source repo’s master branch into your forked repo, you can use the following
commands to do so:
git fetch upstream master will download the new changes without merging them, so that
you can inspect them first, and
git pull upstream master will fetch the changes and merge them into your current branch. These
commands function the same way in a fork as they do in any cloned repo.
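The upstream setup can be rehearsed locally. In the sketch below, two bare repositories on disk stand in for the source repo and your fork on GitHub, and a bare clone stands in for pressing the fork button; all paths, file names, and the identity are assumptions for the demonstration.

```shell
# Sketch: add an upstream remote to a clone of a fork and pull in new source changes.
set -e
work=$(mktemp -d)
git -c init.defaultBranch=master init -q --bare "$work/source.git"  # the source repo
git -c init.defaultBranch=master init -q "$work/seed"               # source owner's copy
cd "$work/seed"
git config user.name "Source Owner"       # hypothetical identity
git config user.email "owner@example.com"
echo "v1" > README
git add README
git commit -q -m "v1"
git push -q "$work/source.git" master

git clone -q --bare "$work/source.git" "$work/fork.git"  # stands in for forking
git clone -q "$work/fork.git" "$work/local"              # clone YOUR fork
cd "$work/local"
git remote add upstream "$work/source.git"
git remote -v                             # shows both origin and upstream

cd "$work/seed"                           # meanwhile, the source repo moves on
echo "v2" >> README
git commit -q -am "v2"
git push -q "$work/source.git" master

cd "$work/local"
git fetch upstream master                 # downloads the commit; files are untouched
git pull upstream master                  # merges it into your local master
```

After the pull, your local master matches the source repo; pushing to origin would then bring the fork on GitHub up to date as well.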
In the branching section of this tutorial we discussed why using a
branch is good practice. When working in a forked repo, branching becomes even more
important and practical. It is better to leave the master branch free of
developmental/experimental changes so that it can be used to sync with the upstream
(source repo). Therefore, it is best to add and change files in branches of your fork.
After you have created a branch, modified files, and committed said changes, use
git log --pretty=short to see a listing of commits. Finally, to
push all the local branches and changes to the forked remote, use a command such as
git push --all origin.
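As a sketch of these last two steps, the example below commits on a branch, lists the commits, and pushes every local branch. It assumes git push --all origin as the push command, which is one way to push all local branches at once; the bare repo standing in for the fork, the branch name fix-typo, and the identity are hypothetical.

```shell
# Sketch: review commits with git log, then push all local branches to the fork.
set -e
work=$(mktemp -d)
git -c init.defaultBranch=master init -q --bare "$work/fork.git"  # stand-in for your fork
git -c init.defaultBranch=master init -q "$work/local"
cd "$work/local"
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
git remote add origin "$work/fork.git"
echo "base" > file.txt
git add file.txt
git commit -q -m "Base"

git checkout -q -b fix-typo               # hypothetical branch name
echo "fix" >> file.txt
git commit -q -am "Fix typo"

git log --pretty=short                    # hash, author, and title of each commit
git push --all origin                     # pushes both master and fix-typo
```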
Now if you go back to your fork on GitHub, you can see that GitHub noticed that we pushed
a new branch with changes to the source repo’s code. GitHub presents us with a big green
button to compare our changes and open a Pull Request to the original
project. You can alternatively go to the
Branches page at
https://github.com/[username]/[repo name]/branches to locate your branch and open a new
Pull Request from there.
If you click that green
Pull Request button, you’ll see a screen that asks you to
give your Pull Request a title and description. It is always worthwhile to put some
effort into this, since a good description helps the owner of the original project
determine what you were trying to do, whether your proposed changes are correct, and
whether accepting the changes would improve the original project or not. You can also
see a list of the commits in your forked repo’s branch that are
ahead of the
master branch, along with a unified
diff of all the changes that will be made should this branch get merged by the project owner.
When you hit the create pull request button, the owner of the project you forked from
will get a notification that someone is suggesting a change with a link to a page that
has all of the information you provided. At this point, the project owner can look at
the suggested changes and merge them, reject them, or comment on them. It is possible that the
owner will ask you to
rebase from master and resubmit a pull request. This might
happen if a series of changes were made on the source repo’s master branch that you had
failed to fetch or if such changes were made after you submitted your request. This is
not a problem! Use this very thorough YouTube video
to guide you through how to do this.
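For orientation, here is a minimal sketch of what rebasing a branch on master involves. It is not a substitute for a full walkthrough and makes several assumptions: everything happens in one temporary repo, the commit made directly on master stands in for changes you would normally pull from upstream, and the branch name my-change is hypothetical.

```shell
# Sketch: replay a branch's commits on top of an updated master with git rebase.
set -e
cd "$(mktemp -d)"
git -c init.defaultBranch=master init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

git checkout -q -b my-change              # the branch behind your pull request
echo "mine" > mine.txt
git add mine.txt
git commit -q -m "My change"

git checkout -q master
echo "theirs" > theirs.txt                # stands in for commits pulled from upstream
git add theirs.txt
git commit -q -m "Upstream change"

git checkout -q my-change
git rebase master                         # replays "My change" on top of master
git log --oneline                         # My change, Upstream change, Base
```

Rebasing rewrites the commits on your branch, so if the branch was already pushed to your fork, an ordinary push will be rejected. A forced push such as git push --force-with-lease origin my-change is the usual way to update it.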
Git and GitHub can be a bit confusing for new users, but they make project management much more robust than the available alternatives, and learning to use them is worth the effort. This tutorial is designed to get you started, and your instructors and mentors are available to advise and help if you get stuck or confused. There are a number of resources online to further your knowledge, some of which can be found at the following links:
Getting Git Right @ atlassian.com
Git - the simple guide