What does Git stand for

1.1 Getting started - what is versioning?

This chapter is about getting started with Git. We begin with some background on version control systems, then move on to getting Git running on your system, and finally cover how to configure it so that you can start working with it. By the end of this chapter, you should understand what Git is for and why you should use it, and you should be set up and ready to go.

What is versioning?

What is "versioning" and why should you care? Versioning, also called version control, is a system that records the changes to a file or a set of files over time, so that you can return to a specific version later. The files that are versioned in the examples in this book contain software source code, but in practice almost any type of file can be tracked using version control.

For example, if you are a graphic or web designer and want to keep every version of an image or layout, a version control system (VCS) is a sensible choice. Such a system makes it possible to restore individual files or even an entire project to an earlier state, to see who last changed something that might be causing a problem, to find out who originally made a change, and much more. A version control system also generally lets you return to a previous working state at any time, even if you make a mistake or lose files for whatever reason. All of these advantages come at very little additional effort.

Local version management

Many people do version control by simply copying all of their files into a separate directory (the smarter ones use a directory with a timestamp in the name). This approach is very common and popular because it is so simple. But it is also incredibly error-prone. It is very easy to work in the wrong directory, edit the wrong files or overwrite files that you didn't want to overwrite.
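The copy-into-a-directory approach can be sketched in a few lines of Python. This is only an illustration of the manual habit described above, not a recommended tool; the function name snapshot and the directory layout are assumptions made for the example.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def snapshot(project_dir: Path, backup_root: Path) -> Path:
    """Copy the whole project into a new, timestamped directory."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{project_dir.name}-{stamp}"
    # Two snapshots within the same second would collide, so add a suffix.
    counter = 1
    while target.exists():
        target = backup_root / f"{project_dir.name}-{stamp}-{counter}"
        counter += 1
    shutil.copytree(project_dir, target)
    return target


if __name__ == "__main__":
    # Demonstrate on a throwaway directory so no real files are touched.
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"
        project.mkdir()
        (project / "notes.txt").write_text("first draft")
        saved = snapshot(project, Path(tmp) / "backups")
        (project / "notes.txt").write_text("second draft")
        # The snapshot still holds the earlier content.
        print((saved / "notes.txt").read_text())
```

Nothing in this sketch stops you from editing a snapshot by mistake or overwriting the wrong file, which is exactly the weakness of the manual approach.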

For this reason, programmers long ago developed local version control systems that manage all changes to all relevant files in a database.

Figure 1. Local versioning

One of the more popular of these systems was RCS, which is still shipped with many computers today. RCS works on the principle that for every change, a patch (that is, the set of changes to one or more files) is saved in a special format on the hard drive. To restore a specific version of a file, it applies the patches one after another until it reaches the desired version.
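The idea of reconstructing a version by replaying patches can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not RCS's actual storage format: the names make_patch, apply_patch and PatchHistory are invented for this example, and a patch is simply a list of changed line ranges computed with the standard difflib module.

```python
from difflib import SequenceMatcher

# A patch is a list of (start, end, replacement_lines) edits against the old version.
Patch = list[tuple[int, int, list[str]]]


def make_patch(old: list[str], new: list[str]) -> Patch:
    """Record only the regions of `old` that change, with their replacement lines."""
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_patch(old: list[str], patch: Patch) -> list[str]:
    """Rebuild the newer version from the older one plus a patch."""
    result: list[str] = []
    cursor = 0
    for start, end, replacement in patch:
        result.extend(old[cursor:start])  # unchanged lines before this edit
        result.extend(replacement)        # lines that replace old[start:end]
        cursor = end
    result.extend(old[cursor:])           # unchanged tail
    return result


class PatchHistory:
    """Stores the first version in full and every later version as a patch."""

    def __init__(self, initial: list[str]) -> None:
        self.base = list(initial)
        self.patches: list[Patch] = []

    def commit(self, new_lines: list[str]) -> int:
        """Save a new version as a patch and return its version number."""
        current = self.checkout(len(self.patches))
        self.patches.append(make_patch(current, new_lines))
        return len(self.patches)

    def checkout(self, version: int) -> list[str]:
        """Replay patches 1..version on top of the base (version 0)."""
        lines = list(self.base)
        for patch in self.patches[:version]:
            lines = apply_patch(lines, patch)
        return lines


if __name__ == "__main__":
    history = PatchHistory(["hello", "world"])
    history.commit(["hello", "there", "world"])
    history.commit(["hello", "there"])
    for version in range(3):
        print(version, history.checkout(version))
```

Storing only the differences keeps the history compact, at the cost of having to replay a chain of patches whenever an older or newer version is requested.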

Central version management

Another big problem that many people then faced was working with other developers on other systems. To solve this problem, centralized version control systems (CVCS) were developed. These systems, including CVS, Subversion and Perforce, are based on a central server that manages all versioned files. Clients fetch the files from this central location onto their own machines, a process called checking out. This type of system was the standard for version control for many years.

Figure 2. Central version management

This approach has many advantages, especially over local version control systems. For example, everyone knows more or less what the other people involved in a project are doing. Administrators can control in detail who is allowed to do what. And it is much easier to administer a CVCS than to manage local databases on each individual user's computer.

However, this structure also has some significant disadvantages. The most obvious one is that the central server is a single point of failure. If that server is unavailable for even an hour, nobody can collaborate with others during that hour or save versioned changes to the files they are working on. If the hard drive on the central server is damaged and no backups have been made, all of this data is irretrievably lost: the complete history of the project, apart from whatever snapshots people happen to have on their own computers. Local version control systems have the same problem: whenever the entire history of a project is kept in a single place, you risk losing it completely if something goes seriously wrong there.

Distributed versioning

This is where distributed version control systems (DVCS) come into play. In a DVCS (such as Git, Mercurial, Bazaar or Darcs), users don't just get the latest state of the project from a server; instead, they get a full copy of the repository, including its complete history. This way, if a server is damaged, any of the users' repositories can be copied back to restore it. Each copy, a so-called clone, is a complete backup of the entire project data.
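The difference between checking out the latest state and cloning a repository can be made concrete with a toy model in Python. The Repository class and its methods are hypothetical and exist only for this illustration; they do not reflect how Git stores or transfers data.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Toy repository: the history is just an ordered list of snapshots."""

    history: list[str] = field(default_factory=list)

    def commit(self, snapshot: str) -> None:
        self.history.append(snapshot)

    def checkout_latest(self) -> str:
        """Centralized style: the client receives only the newest snapshot."""
        return self.history[-1]

    def clone(self) -> "Repository":
        """Distributed style: the client receives an independent full copy."""
        return Repository(history=list(self.history))


if __name__ == "__main__":
    server = Repository()
    server.commit("v1")
    server.commit("v2")

    working_copy = server.checkout_latest()  # a single snapshot, no history
    alice = server.clone()                   # the complete history

    server = None                            # simulate losing the server
    restored = alice.clone()                 # rebuild it from any clone
    print(working_copy, restored.history)
```

In this model, a client that only holds a checked-out snapshot cannot bring back earlier versions, whereas any clone can.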

Figure 3. Distributed versioning

In addition, many of these systems deal well with several external repositories, so-called remote repositories, so that different groups of people can work together on the same project in different ways at the same time. This makes it possible to set up workflows that are not possible with centralized systems, such as hierarchical models.