Using a software tool to handle the versions of your project files lets you focus on the more interesting/innovative aspects of your project.
Version control may be the most important topic that we teach. We’re teaching Mercurial because it offers a consistent user interface, and the basic workflow requires only a few easy-to-remember commands.
There are lots of advanced features in Mercurial and other distributed version control systems such as branching, hashes, rebasing, etc. that we don’t teach. The goal is to introduce the basics so that the learners leave the workshop able to create and use repositories for themselves with confidence, and that they are aware that version control can also be used to collaborate with other people.
In other words, try to convince the learners that version control is useful for researchers working alone or in teams because it is
Make sure the network is working before starting the [Collaborating](../02-collab/) section of this lesson. [A Better Kind of Backup](../01-backup/) focuses on individual use of Mercurial and can be done without network access. [Conflicts](../03-conflict/) can also be taught without network access and Bitbucket by creating two clones of the repository and pulling changes between them.
Drawings are particularly useful in this lesson: if you have a whiteboard, use it!
Version control is usually not the first subject in a workshop, so ask learners to create a Bitbucket account at the end of the session before it.
TortoiseHg is the recommended means of installing Mercurial for Windows learners. The workshop installation instructions should include that.
We don’t use a Mercurial GUI in these notes because we want learners to understand what commands are being run. That said, instructors may want to demo the TortoiseHg GUI (which is available for Windows, OS X, and Linux) on their desktop at some point during this lesson.
The [Conflicts](../03-conflict/) section of the lesson uses the KDiff3 graphical diff/merge tool.
The workshop installation instructions should include directions for OS X and Linux users to install KDiff3.
For Windows users it is bundled with TortoiseHg.
The Mercurial wiki has some notes on using Mercurial with kdiff3.
On OS X, download kdiff3 and drag the `.app` directory somewhere suitable (such as into `/Applications`), then link it into the path with `cd ~/local/bin; ln -s /Applications/kdiff3.app/Contents/MacOS/kdiff3 kdiff3`.
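Once KDiff3 is installed, Mercurial still has to be told to use it. The following is a minimal sketch of the relevant settings, written to a scratch file so it can be reviewed before anything is copied into a real `~/.hgrc`. The scratch path is a placeholder, and the settings follow the commonly documented `merge-tools` and `extdiff` options; treat them as a starting point and check them against the wiki notes mentioned above.

```shell
# Write a sample merge/diff configuration to a scratch file for review.
# (Learners would add these sections to their own ~/.hgrc instead.)
kdiff3_hgrc="$(mktemp -d)/hgrc-kdiff3"

cat > "$kdiff3_hgrc" <<'EOF'
[ui]
# Use KDiff3 whenever a merge needs manual resolution.
merge = kdiff3

[merge-tools]
# Base, local, and other versions, then the file Mercurial reads the result from.
kdiff3.args = $base $local $other -o $output
# Uncomment if kdiff3 is not on the PATH (OS X location used in these notes):
# kdiff3.executable = /Applications/kdiff3.app/Contents/MacOS/kdiff3

[extensions]
# extdiff lets external tools display diffs.
extdiff =

[extdiff]
# Defines an "hg kdiff3" command that opens diffs in KDiff3.
cmd.kdiff3 =
EOF

cat "$kdiff3_hgrc"
```

With the `extdiff` section in place, `hg kdiff3` can be used to show a graphical diff during the lesson.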
By default, the OS X Mercurial client has `merge` use the built-in (i.e., system-default) FileMerge tool, and this might be preferable to requiring a separate (and clearly very non-platform-native) tool.
When appropriate, explain that we teach Mercurial rather than CVS, Subversion, or Git primarily because of its consistent command-line interface, easy-to-learn basic workflow, and Bitbucket’s policy of providing unlimited collaboration in private repositories for academic users. CVS and Subversion are now seen as legacy systems. Git is also very popular, largely because of the GitHub service. The differences between Mercurial and Git are largely syntactic, so learning one provides the conceptual framework to use either.
Ask, “Who uses ‘undo’ in their editor?” All say “Me”. ‘Undo’ is the simplest form of version control.
Give learners a five-minute overview of what version control does for them before diving into the watch-and-do practicals. Most of them will have tried to co-author papers by emailing files back and forth, or will have biked into the office only to realize that the USB key with last night’s work is still on the kitchen table. Instructors can also make jokes about directories with names like “final version”, “final version revised”, “final version with reviewer three’s corrections”, “really final version”, and, “come on this really has to be the last version” to motivate version control as a better way to collaborate and as a better way to back work up.
Setting Up Mercurial
We suggest instructors and students use `nano` as the text editor for this lesson because
Please point out to students during setup that they can and should use another text editor if they’re already familiar with it.
When setting up Mercurial, be very clear what learners have to enter: it is common for them to edit the instructor’s details (e.g. email).
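As a concrete reference for what learners should end up with, here is a minimal sketch of a user-level configuration. It writes to a scratch file rather than the real `~/.hgrc`; the name, email address, and scratch path are placeholders, and each learner should substitute their own details.

```shell
# Write a sample user-level Mercurial config to a scratch file.
# (Placeholder path; on OS X and Linux the real file is ~/.hgrc.)
sample_hgrc="$(mktemp -d)/hgrc"

cat > "$sample_hgrc" <<'EOF'
[ui]
# Replace with the learner's OWN name and email, not the instructor's.
username = Your Name <you@example.org>
# nano is the editor suggested for this lesson.
editor = nano
EOF

cat "$sample_hgrc"
```

Showing the placeholder values on screen, rather than your own, makes it less likely that learners copy the instructor's identity.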
Creating a Repository and Changing a File
When running `hg status`, Mac users may see a `.DS_Store` file showing as untracked. This is a file that the OS X Finder creates in each directory that it displays. It’s worthwhile routinely adding this file to `.hgignore`.
Tracking Changes to Files
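To keep `.DS_Store` out of the `hg status` output before learners start committing, the repository can carry an ignore file. This is a minimal sketch; the temporary directory only stands in for the root of a learner's repository.

```shell
# Stand-in for the root directory of the learner's repository.
repo_root="$(mktemp -d)"

# Glob syntax: the pattern matches .DS_Store in any directory of the repo.
cat > "$repo_root/.hgignore" <<'EOF'
syntax: glob
.DS_Store
EOF

cat "$repo_root/.hgignore"
```

In a real repository the `.hgignore` file is itself added and committed, so that the rule travels with every clone.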
It’s important that learners do a full commit cycle by themselves (make changes, `hg diff`, and `hg commit`). The “`bio` repository” challenge does that.
This is a good moment to show a diff with the KDiff3 graphical diff tool. If you skip it because you’re short on time, show the graphical diff view of a commit on Bitbucket in the [Collaborating](../02-collab/) section.
Exploring History and Recovering Old Versions
`hg revert` command, and that shell wildcard characters can also be used.
Make it clear that Mercurial and Bitbucket are not the same thing: Mercurial is an open source version control tool, Bitbucket is a company that hosts Mercurial repositories on the web and provides a web interface to interact with repos that they host.
If your learners are advanced enough to be comfortable with SSH, tell them they can use keys to authenticate on Bitbucket instead of passwords, but don’t try to set this up during class: it takes too long, and is a distraction from the core ideas of the lesson.
It is very useful to draw a diagram showing the different repositories involved.
Role playing between two instructors can be effective when teaching the collaboration and conflict sections of the lesson. One instructor can play the role of the repository owner, while the second instructor can play the role of the collaborator. If it is possible, try to use two projectors so that the computer screens of both instructors can be seen. This makes for a very clear illustration to the students as to who does what.
It is also effective to pair up students during this lesson and assign one member of the pair to take the role of the owner and the other the role of the collaborator. In this setup, challenges can include asking the collaborator to make a change, commit it, and push the change to the remote repository so that the owner can then retrieve it, and vice-versa. The role playing between the instructors can get a bit “dramatic” in the conflicts part of the lesson if the instructors want to inject some humor into the room.
If you don’t have two projectors, have two instructors at the front of the room. Each instructor does their piece of the collaboration demonstration on their own computer and then passes the projector cord back and forth with the other instructor when it’s time for them to do the other part of the collaborative workflow. It takes less than 10 seconds for each switchover, so it doesn’t interrupt the flow of the lesson. And of course it helps to give each of the instructors a different-colored hat, or put different-colored sticky notes on their foreheads.
If you’re the only instructor, the best approach is to clone the two repos onto your Desktop under different names, e.g., pretend one is your computer at work:
$ hg clone https://bitbucket.org/vlad/planets planets-at-work
It’s very common that learners mistype the remote alias or the remote URL when adding a remote, so they cannot `push`. You can diagnose this by running `hg paths` and checking carefully for typos. To fix typos you can open the repo `.hg/hgrc` file for editing with `hg config --local`.
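When checking for typos it helps to know what a correct entry looks like. This minimal sketch writes an example `[paths]` section to a scratch file; the scratch path is a placeholder, and the URL is the example repository used elsewhere in these notes.

```shell
# Write an example repository-level config to a scratch file.
# (In a real repository this content lives in .hg/hgrc.)
sample_repo_hgrc="$(mktemp -d)/hgrc"

cat > "$sample_repo_hgrc" <<'EOF'
[paths]
# "default" is the alias used by a plain "hg push" or "hg pull".
default = https://bitbucket.org/vlad/planets
EOF

# Print the section without its comment lines for a quick visual check.
grep -v '^#' "$sample_repo_hgrc"
```

The output of `hg paths` in a learner's repository should show the same alias and URL, character for character.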
Before cloning the repo, be sure that nobody is inside another repo. The best way to achieve this is moving to the `Desktop` before cloning: `cd && cd Desktop`. If both repos are in the `Desktop`, have them clone their collaborator’s repo into a specific directory using a second argument:
$ hg clone https://bitbucket.org/vlad/planets vlad-planet
Conflicts, sometimes weird, will start to arise. Stay tight: conflicts are next.
Explain that conflicts are not a very common occurrence. Mercurial is good at merging changes made by different people, or changes that you make in different clones of a repo; e.g. on your office workstation, on your laptop at home, on a lab computer, or on a remote HPC or cloud platform. A conflict arises when changes have been made to the same lines in a file or to binary files; Mercurial can’t figure out which of the conflicting versions to choose, so it makes you decide.
Expect the learners to make mistakes. Expect yourself to make mistakes. This happens because it is late in the lesson and everyone is tired.
If you’re the only instructor, the best way to create a conflict is:
`hg clone https://bitbucket.org/vlad/planets planets-at-work`.
`hg pull` now and show the conflict.
The commands `hg heads`, `hg log --graph` or `hg log -G`, and `hg summary` all provide views of the repository state that can be helpful in explaining conflicts and how they are resolved.
Use `hg merge --tool=kdiff3` to ensure that the KDiff3 GUI diff/merge tool is launched to handle resolution of the conflict.
Licensing
We teach about licensing because questions about who owns what, or can use what, arise naturally once we start talking about using public services like Bitbucket to store files. Also, the discussion gives learners a chance to catch their breath after what is often a frustrating couple of hours.
Hosting
A common concern for learners is having their work publicly available on Bitbucket. While we encourage open science, sometimes private repos are the only choice. It’s always interesting to mention the options to have web-hosted private repositories.