Automating Hg Workflow
Introduction
Mercurial (Hg) has a very powerful command-line interface. It can be used to automate the parts of your development workflow that involve interacting with an Hg repository. In this post, I will introduce this interface and demonstrate some of its uses by showing how I automate my own interactions with Hg. Our objective for this automation is to stay DRY as much as possible, and to:
- Not repeat mechanical stuff that could easily be automated
- Reduce errors involved in visually looking up changeset information and typing out changeset ids while merging to/from branches (usually default).
- Systematically generate consistent messages that can be prefixed/suffixed to our commit messages. The instances where we want messages wrapped in a template include:
  - Committing to a particular branch with some meta information about the feature, ticket id, and other tags that can be picked up by other team collaboration applications such as code review tools.
  - While merging to default, carrying over the branch name, ticket id, and any meta information available from the branch automatically.
  - While merging from default (once in a while you want to catch up with default).
Mercurial Commands
Revsets
Almost every Hg command takes one changeset or a set of changesets as an argument. When no revset argument is provided, implicit default changesets are used. These changeset arguments are provided in a form that Hg calls a revset.
Here are some examples of using revsets with the hg log command:
- Using the default revset, which displays the last commit in the local repo across any user.
$ hg log --limit 1
changeset: 29227:448ace9d726b
tag: tip
user: Guru Devanla <g@mail.com>
date: Sun May 13 10:13:33 2018 -0700
files: core-lib/cloop.py
description:
experimental ast implementation
- To view the last 2 changesets that were committed by user devanla:
$ hg log -r 'last(user(devanla), 2)'
changeset: 29227:448ace9d726b
tag: tip
user: Guru Devanla <g@mail.com>
date: Sun May 13 10:13:33 2018 -0700
files: external/cloop.py
description:
experimental cloop in python with some help from ipython
changeset: 29223:3cc421ca1c23
user: Guru Devanla <g@mail.com>
date: Fri May 11 12:39:12 2018 -0700
files: package/public/table_compositor/README.rst
description:
update table compositor README
- To view the latest changesets that were committed by user devanla, but only on non-default branches. Notice that the changes are from 2 different branches and not from default:
$ hg log -r 'last(user(devanla) and not(branch(default)) and heads(all()) and not closed(), 2)'
changeset: 27763:7c0af63767a4
branch: devanla/table-compositor/1010/improve-documentation
user: Guru Devanla <g@mail.com>
date: Thu Mar 29 11:35:38 2018 -0700
files: package/public/table_compositor/README.rst
description: improve documentation of table-compositor
changeset: 24297:db5591f91f97
branch: devanla/external/1911/update-commit-hooks
user: Guru Devanla <g@mail.com>
date: Fri Jan 05 15:43:19 2018 -0800
files: external/commit-hook.py
description: update commit hooks to handle mis-named branches
You can see that the -r argument to the commands used above can get very complex, but it is still easy to understand what is going on. The -r argument is what we call the revset. Reading the documentation for revsets will be helpful. While reading it, keeping the following concepts in mind will help you understand it better.
- Each of the functions used in the argument to -r returns a set of changesets, e.g. heads(all()) returns all heads, and last(set) by default returns a set with one changeset, which is the last changeset in set. These functions are also referred to as predicates.
- These functions can be composed with other functions to filter/narrow down the changesets, e.g. last(heads(all()), 2) returns the set of the last 2 changesets across all heads.
- You can combine the results of these composed functions using set operators, since each composed function returns a set of changesets, e.g. last(heads(all()), 2) and not branch(default) would filter out the head on default. These are referred to as operators in the documentation.
With this breakdown and the Mercurial documentation you should be able to understand how to use revsets. Take a few minutes and read through the documentation now.
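The composition rules above can be sketched as a small Python helper that assembles a revset string from its parts. The function name user_revset and its parameters are hypothetical and only serve this illustration; the sketch also assumes plain user and branch names that need no quoting inside a revset.

```python
def user_revset(user, count=1, exclude_branch=None, open_heads_only=False):
    """Compose a revset selecting the last `count` changesets by `user`.

    Each predicate returns a set of changesets, so the pieces can be
    combined with the `and` / `not` set operators before `last()` trims
    the result.
    """
    parts = ["user({})".format(user)]
    if exclude_branch is not None:
        parts.append("not branch({})".format(exclude_branch))
    if open_heads_only:
        # restrict to heads that have not been closed
        parts.append("heads(all())")
        parts.append("not closed()")
    return "last({}, {})".format(" and ".join(parts), count)


if __name__ == "__main__":
    # The resulting string is what would be passed to: hg log -r '<revset>'
    print(user_revset("devanla", count=2, exclude_branch="default"))
```

Building the expression in one place keeps the quoting and the operator order consistent across scripts, which is where hand-typed revsets tend to go wrong.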
Templates
One other concept you need to learn before we talk about automation is how to customize the printing of log messages. Note that the Mercurial documentation recommends always using the hg commands to interact with local and remote repositories, rather than relying on the internal Python-level API. In addition, to automate our interactions we usually need access to one particular changeset id. To achieve this we will use the --template argument (or -T).
- For example, to view a list of changesets as one-liners, with the short form of node ids:
$ hg log -r "last(user(devanla), 2)" --limit 2 -T "{rev}: {node|short}: {desc}\n"
29227: 448ace9d726b: experimental cloop
29226: 1bdfad2d0c86: function refactor
Again, a lot more information about templates is available through hg help templates or in the online Mercurial documentation.
Here is the outline you will need to quickly understand how to use the templating system:
- Every changeset in Mercurial has a set of attributes, like changeset id, branch, user, phase etc. Each of these attributes can be accessed and printed through a set of pre-defined keywords, e.g. node for changeset ids, user for the user, phase for phases etc.
- The values provided by these keywords are strings, and they can be transformed by applying filters to them. I would call them transformations, as the term filter is a bit misleading.
- Filters may not be sufficient in all scenarios, for example when dealing with a list of values (such as the files updated in a changeset) where the names have to be joined into one string, when applying conditional formatting based on the values, or when applying transformations to a list of values. In such scenarios we have 'functions' like if, files etc.
- A list operator % is also available to process the return values of expressions that return lists.
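Because a template fixes the output format, a script can parse it reliably. The following is a minimal sketch, assuming the one-liner template shown earlier with the firstline filter added so that every changeset stays on a single line; the helper names parse_log_line and last_changesets are hypothetical, and last_changesets needs an hg executable on the PATH.

```python
import subprocess

# One changeset per line: revision number, short node id, first line of the message
TEMPLATE = "{rev}: {node|short}: {desc|firstline}\n"


def parse_log_line(line):
    """Split one templated line into (rev, node, description)."""
    # maxsplit=2 keeps any later ": " inside the description intact
    rev, node, desc = line.split(": ", 2)
    return int(rev), node, desc


def last_changesets(revset, cwd="."):
    """Run hg log with the template above and return the parsed changesets."""
    out = subprocess.check_output(
        ["hg", "log", "-r", revset, "-T", TEMPLATE], cwd=cwd, text=True
    )
    return [parse_log_line(line) for line in out.splitlines() if line]
```

For instance, parse_log_line applied to the first output line shown earlier yields the revision number as an integer, the short node id, and the description as separate values.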
Automating Our Tasks: Example
Armed with our knowledge of revsets and templates, we can either customize our output to view logs or just let the template return the changeset id, which in turn can be piped to other commands in our automation scripts.
Putting together some of the things we learned about revsets and templates, we can write powerful queries to interact with the Hg repository:
- Get the last update of a user (refer to the example above).
- Get the last update of a user across different branches (refer to the example above).
- Get the last update of a user with a keyword in the commit message or file name (refer to keyword in the revsets documentation for more info):
hg log -r "last(user(devanla) and keyword('norm'))"
- Combine the former queries with a template to return just the node id, using -T {node}.
- Use the node id to perform some operation, like hg up to the last commit on a non-default branch that is not merged yet:
hg log -r 'last(user(devanla) and heads(all()) and not parents(merge()) and not closed() and not branch(default))' -T "{node|short}" | xargs hg up
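One caveat with the pipe above: when the revset matches nothing, hg log prints nothing, and GNU xargs (unless given -r) still runs hg up once without a revision. The sketch below shows one way to guard against that from Python; the names update_command and update_to are hypothetical, and update_to assumes an hg executable on the PATH.

```python
import subprocess


def update_command(node_output):
    """Return the argv for `hg update`, or None when hg log printed no node."""
    node = node_output.strip()
    if not node:
        return None
    return ["hg", "update", "-r", node]


def update_to(revset, cwd="."):
    """Update the working directory to the changeset selected by `revset`."""
    out = subprocess.check_output(
        ["hg", "log", "-r", revset, "-T", "{node|short}"], cwd=cwd, text=True
    )
    cmd = update_command(out)
    if cmd is None:
        print("No matching changeset; working directory left untouched")
        return False
    subprocess.check_call(cmd, cwd=cwd)
    return True
```

Keeping the decision in update_command makes the empty-result case explicit and easy to check without touching a repository.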
Automatically wrapping commit messages in a template to achieve consistency
Now, let's make use of this knowledge to automate our workflow. This automation will especially help us apply message templates to our commit messages consistently.
These will be our objectives:
Objective #1
I usually work on a feature branch. For consistency, I always name a branch with a particular format, say, {user}/{package}/{ticket-id}/{feature-desc-in-short}. For example, a branch name in this format would be devanla/core-lib/19191/refactor-normalization-in-phase-2.
Each time we commit a changeset to this branch, we want our commit message to include information about the ticket id, so that it looks like this: core-lib: update norm function to take an optimization param, refs #19191.
Note that core-lib and 19191 are strings that were extracted from the branch name and used to 'decorate' the original commit message.
This method of referencing ticket ids in commit messages tends to help other team collaboration tools link changesets to code review requests. Note that this template is just an example. The point is that we want to be able to tag each commit with some information that is available in the branch name. This provides information at the commit level and could also provide the needed meta-information for team collaboration tools.
Objective #2
Once in a while we want to catch up with default. The typical steps would be:
hg up branch-you-are-working-on
hg pull
hg merge default
hg commit -m"catching-up-with-default"
We want to automate these steps.
Objective #3
The inverse of objective #2 is that once we are done working on changes on a feature branch, we want to merge those changes to default. While merging those changes, we want the commit message of the merge to follow a template. We want this template to capture the meta-information that is available in the branch name. For example, if we merge the branch devanla/core-lib/19191/refactor-normalization-in-phase-2, then we want the commit message of the merge to be 'core-lib: merged, refactor-normalization-in-phase-2, refs #19191'.
Note that we built this commit message entirely from the branch name. We will automate this step as well.
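To make the mapping from branch name to merge message concrete, here is a minimal Python sketch. It assumes the {user}/{package}/{ticket-id}/{feature-desc-in-short} naming convention described under objective #1; the names BRANCH_RE and merge_message are hypothetical and not part of Mercurial.

```python
import re

# user/package/ticket-id/short-description
BRANCH_RE = re.compile(
    r"^(?P<user>[^/]+)/(?P<package>[^/]+)/(?P<ticket>[0-9]+)/(?P<desc>.+)$"
)


def merge_message(branch):
    """Build the merge commit message purely from the branch name."""
    m = BRANCH_RE.match(branch)
    if m is None:
        raise ValueError("branch name does not follow the convention: " + branch)
    # str.format ignores the unused 'user' group
    return "{package}: merged, {desc}, refs #{ticket}".format(**m.groupdict())


if __name__ == "__main__":
    print(merge_message("devanla/core-lib/19191/refactor-normalization-in-phase-2"))
```

Raising an error for a non-conforming name is one possible design choice; it stops a merge from being committed with a half-built message.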
Let’s Automate
Automating objective #1
Automating objective #1 is a little tricky. We will have to perform a number of steps that we have not discussed so far. I picked up this method from a post on Stack Overflow.
import re
import curses

# Example: devanla/core-lib/919191/refactor-norm-function
# group 1 is the package name (may contain hyphens), group 2 is the ticket id
pat = re.compile(r'[a-z]*/([a-zA-Z-]*)/([0-9]*)/*')

# Note: this uses the str-based hook API of Mercurial running on Python 2;
# Mercurial on Python 3 expects byte strings instead.
def precommit_hook(repo, **kwargs):
    # keep a copy of repo.commitctx
    commitctx = repo.commitctx
    branch = repo[None].branch()
    ui = kwargs['ui']

    curses.setupterm()
    suffix_message = ''
    prefix_message = ''
    if branch != 'default':
        m = re.match(pat, branch)
        if m:
            tlt, redmine_ticket = m.groups()
            response = ui.prompt(
                '\nWrap message as "%s: [YOUR MESSAGE], refs #%s"? (y/N)?' %
                (tlt, redmine_ticket))
            if response.strip().lower() != 'n':
                suffix_message = ', refs #{}'.format(redmine_ticket)
                # trailing space so the result reads "core-lib: message"
                prefix_message = tlt + ': '

    def updatectx(ctx, error):
        # wrap the original commit message with the prefix and suffix
        ctx._text = '{}{}{}'.format(
            prefix_message,
            ctx._text,
            suffix_message)
        return commitctx(ctx, error)

    # monkeypatch the commit method
    repo.commitctx = updatectx
Here is what the script does. We start with a branch name that looks as follows: devanla/core-lib/191921/feature-to-improve-normalization.
Now say we make some changes and commit the changeset with the following command: hg commit -m "refactoring core norm functions args".
At that point the hook gets called and transforms the message with the information available in the branch name. The script shown above also asks for confirmation before transforming the message. After the hook is applied, the final message will look like core-lib: refactoring core norm functions args, refs #191921.
Note that we attached the package name and ticket id to the commit message. This will have to change depending on each person's workflow.
Automating objective #2
To achieve objective #2 we use shell scripting. Since I use zsh, I have outlined the sample script using zsh syntax. To automate this objective we add a function to our zsh configuration that does the following:
- Gets the name of the current branch.
- Makes sure we are not on default already (since we want to merge changes from default).
- Asks for confirmation.
- Merges changes from default with a default commit message. Note that since we have our pre-commit hook enabled, the final message would look as follows: core-lib: catch-up-with-default, refs #191921. It is nice that our pre-commit hook is still effective for this script as well.
# function added to .zshrc (or an equivalent file) so that it is available in zsh
cuwd () {
    local current_branch="$(hg identify -b)"
    echo $current_branch
    if [[ $current_branch = "default" ]]; then
        echo "Yikes! Sorry. \n You should NOT be on default to perform this operation"
        return 1
    fi
    echo "Merging default to current branch"
    # vared shows the text and waits for Enter; Ctrl-C aborts
    local temp='Press Enter to continue, Ctrl-C to quit'
    vared temp
    # stop here if the merge fails, so that a broken merge is never committed
    hg merge default || return 1
    hg commit -m "catch-up-with-default"
}
Using this kind of consistent message template also helps automate future reporting, for example filtering out changesets that are routine merges.
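As an illustration of such reporting, the sketch below separates routine catch-up merges from other commit messages by looking for the fixed marker text. The names CATCH_UP_MARKER and split_routine_merges are hypothetical; inside Mercurial itself, a revset using the desc() predicate with the same marker should achieve a similar filter.

```python
CATCH_UP_MARKER = "catch-up-with-default"


def split_routine_merges(descriptions):
    """Partition commit messages into (routine merges, everything else)."""
    routine, other = [], []
    for desc in descriptions:
        # the marker survives the prefix/suffix added by the commit hook
        (routine if CATCH_UP_MARKER in desc else other).append(desc)
    return routine, other


if __name__ == "__main__":
    messages = [
        "core-lib: catch-up-with-default, refs #191921",
        "core-lib: refactoring core norm functions args, refs #191921",
    ]
    print(split_routine_merges(messages))
```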
Automating objective #3
Now, suppose you are done working on a feature and you are ready to merge to default. Each time we merge to default, we want to provide information on what this merge is about. Again, since we have all this information in the branch name, we can have a script that helps us build a message and commit the merge to default.
For example, if we have a branch named devanla/core-lib/10101/feature-2001, then while we merge to default we can generate a message that looks like this: core-lib: merged, feature-2001, refs #10101. Note that we nicely have a description, the project name, and the ticket id in the message of the merge commit.
This again can be achieved with a shell script, powered by our knowledge of revsets and templating.
merge_latest_branch_to_default () {
    # this script only allows merging to default
    # check and quit if you are not on default
    local current_branch="$(hg identify -b)"
    echo $current_branch
    if [[ $current_branch != "default" ]]; then
        echo "Yikes! Sorry. \n You should be on default to perform this operation"
        return 1
    fi
    # lc_last_update and lc_all are custom hg aliases that are not shown in this post;
    # presumably they wrap hg log with revsets like the ones discussed earlier
    hg lc_last_update
    local changeset_display="$(hg lc_last_update --template "{rev}: {node|short}: {branch}")"
    local changeset="$(hg lc_all --limit 1 --template "{node|short}")"
    local branchname="$(hg lc_all --limit 1 --template "{branch}")"
    # devanla/core-lib/10101/feature-2001 -> core-lib: merged, feature-2001, refs #10101
    local message="$(echo $branchname | sed -e 's|^[^/]*/\([^/]*\)/\([0-9]*\)/\(.*\)$|\1: merged, \3, refs #\2|')"
    echo "Merging changeset = $changeset_display"
    local temp='Press Enter to continue, Ctrl-C to quit'
    vared temp
    # stop here if the merge fails, so that a broken merge is never committed
    hg merge -r "$changeset" || return 1
    echo "\n\nMerge completed, do you want me to commit, using the following message\n"
    echo "\n\n------------------------------------------------"
    echo $message
    echo "\n---------------------------------------------------"
    local temp='Press Enter to continue, Ctrl-C to quit'
    vared temp
    hg commit -m "$message"
    echo "Complete"
}
Again, this script is just an example of how an automatic merge can be crafted. You will have to write your own little script to accommodate your particular workflow.
Conclusion
We developers are lazy (and proudly so), and we seek to automate many of the tasks we perform. Automation also comes with the benefit of reducing errors and providing consistent results. In this post we saw how we can leverage some of the powerful command-line features of Hg to automate our interactions with Hg as we add more features and changesets to our code base.