A git Primer

June 28, 2014

#business #cybersecurity #productivity #technology #tutorial

Definitions
git‘s Index
Branches
Remote Repositories
Merges
Managing a Website Using git
Tags
Essential Commands

git is a wicked-powerful distributed revision control system. It is confusing to many, so there are myriad tutorials and explanations online to help people understand it. This one will focus on the fundamental concepts and tasks rather than trying to compete with the documentation.

Definitions

Understanding how these components work together is the key to understanding git.

git‘s Index

First and foremost, it’s important to understand that git has something called an index that sits between your working directory and your commits. It’s basically a staging area, so when you git add you copy a snapshot of your working directory to the index, and when you git commit you copy that same thing from the index to create a new commit.

It is crucial to understand the intermediate (staging area) nature of the git index in order to grasp the relationship and differences between adding and committing content.

git status is very helpful in understanding git because it shows you the differences between the working directory and the index and previous commits. The "status" refers to the status of the working directory, so if you make a change in your content — say to index.php — and you run git status, it’ll show you what’s changed that isn’t yet staged for a commit (in your index):

$ git status

On branch master Changed but not updated: (use "git add ..." to update what will be committed) (use "git checkout -- ..." to discard changes in working directory)

modified: index.php

no changes added to commit (use "git add" and/or "git commit -a")

git diff is similar to git status, but it shows the differences between various commits, and between the working directory and the index. git diff --cached, on the other hand, shows you the difference between what’s in the index vs. in your last commit.

Branches

Branches are just pointers to various commits. Every project has a default pointer called Master. As you continue to create commits within the default branch, the Master pointer follows you along so that it always points to the latest commit in the branch.

Branches are created using the git branch command, which creates a new branch label which points to the current commit. So, until a new commit is made, both the previous and new branch labels will point to the current commit. One way to think of them is "File->Save As Copy" for your codebase.

Using this model, someone can then checkout the Master branch, make changes, and commit them. The result would look like this.

Keep in mind that the "Next Commit" commit would only happen for someone who had the Master branch checked out. If the same person who performed the git branch command immediately made changes, added them, and then committed them, they’d have made another commit along the Test branch instead, like so:

If both branches had been checked out, worked on, and had a commit made, the diagram would look like so:

To change to and start working on a given branch, you have to check it out. This is done with the git checkout command:

$ git checkout

Keep in mind that when you do this you will pull the copy of your project that exists at that branch’s latest commit point, and it will overwrite your current working directory with that version of your project.

Remote Repositories

Creating code branches on a single system is enjoyable enough, but the real purpose of git is allowing people in disparate locations to contribute to a project. You add remote repositories in git by using the git remote command, like so:

$ git remote add remotesite ssh://yourdomain.tld/path/to/remote/gitdirectory.git

[ This is using SSH as the protocol, but git supports many others as well. ]

Then you push your content from your local repository to the remote one, using the name you’ve set up:

$ git push remotesite +master:refs/heads/master

Then, to perform updates, you simply run git push with your target defined:

$ git push remotesite

Merges

If you have more than one person working on the project — each doing his own thing — eventually you’re going to have to synchronize. git handles this quite elegantly by simply moving the necessary branch pointers upon completion.

In the diagram below we merge Test into Master, and we simply end up with a later version of Master, as you might expect — only this version includes the changes that existed in Test as well.

NOTE: You want to make sure you’re fully committed before you attempt a merge. Not doing so invites wrath.

Managing a Website Using git

So now that we’ve seen the various components of git in play, let’s see how it all works together by creating a setup that allows us to manage a website remotely.

On Your Local System

Either have, or create, a web directory on your local system. This is the main place you’ll make changes from. Start by changing directory into it.

git init

Create some content.

vi index.html

Add the content to git‘s index.

git add index.html

Get a weekly breakdown of what's happening in security and tech—and why it matters.

Commit the changes.

git commit

On Your Server

Initiate a new git repository on the server.

mkdir /path/website.git && cd /path/website.git && git init –bare

Create your hook that will checkout the code to your actual web directory.

vi /path/website.git/hooks/post-update

GIT_WORK_TREE=/path/htdocs git checkout -f

chmod +x /path/website.git/hooks/post-update

Back on the Local System

Add the remote directory to the local config.

git remote add website ssh://path/server/website.git

Push the contents of the local repository to the remote one.

git push website +master:refs/heads/master

Then, change things locally and to upload changes, simply do a:

git push website

The changes will upload to the remote git repository and trigger the post-update hook, which copies the contents of the repository to the working directory — your live website directory (htdocs).

[ To update from the server side you can execute git commands as usual, but you must provide environment context with each command, like so: ]

GIT_DIR=/path/website.git GIT_WORK_TREE=/path/htdocs git $some_git_command

[ On the server you won’t have to push after you add and commit, as using the environment variables above will mean the committed changes will already be present in the repository. ]

Essential Commands

There are a good number of git commands, and the documentation is excellent, so I won’t cover many here. Here are some I think are worth mentioning.

git bisect – allows you to isolate the exact code push that caused a problem by executing git bisect bad to mark the current (broken location), then restore to a known-good configuration, and then git will step you through each commit you’ve made in between the two. For each one you then either git bisect good or bad until you’ve found your error-causing commit.
git stash – sets changes you’ve made off to the side in a way that lets you bring them back later. Usage: you are about to checkout another branch which would crush your current changes, so you git stash them before doing your checkout.
git rm / mv – deletes / moves items in the working directory with visibility to git so that you can commit afterwards.
git reset – resets to a specified state (commit).
git status – show the status of the working tree — including differences between the index and the current commit, differences between the working tree and the index, and items within the working tree that are not tracked by git.
git log – show a log of commits
git clone – clone a repository into a new directory.
git clean – remove untracked files from the working directory.

Notes

1 Be sure to restrict access to your .git directory on your server if it resides within the presented content.