leah blogs: August 2025

20aug2025 · Remembering the work of Kevin S. Braunsdorf and the pundits tool-chain

You may not recognize the name Kevin S. Braunsdorf, or “ksb” (kay ess bee) as he was called, but you have certainly used one tool he wrote together with Matthew Bradburn: the implementation of test(1) in GNU coreutils.

Kevin S. Braunsdorf died last year, on July 24, 2024, after a long illness.

In this post, I try to remember his work and legacy.

He studied at Purdue University and worked there as a sysadmin from 1986 to 1994. Later, he joined FedEx and greatly influenced how IT is run there, from software deployments to the physical design of datacenters.

Kevin was a pioneer of what we today call “configuration management”, and he wrote a Unix toolkit called msrc_base to help with these tasks. (Quote: “This lets a team of less than 10 people run more than 3,200 instances without breaking themselves or production.”) Together with other, more generally useful tools, this grew into the “pundits tool-chain”. These tools deserve further investigation.

Now, back in those days, Unix systems were vastly heterogeneous and riddled with vendor-specific quirks and bugs. His tooling centers on a lowest common denominator: for example, m4 and make are used heavily, as they were widely available (and later, Perl). C programs have to be compiled on their specific target hosts. Remote execution initially used rsh, and file distribution was done with rdist. Everything had to be bootstrappable from simple shell scripts and standard Unix tools, and porting to new platforms was common.

The idea behind msrc

The basic concept of how msrc works was already implemented in the first releases from 2000 that can be found online. At its core, there is a two-stage Makefile: one part runs on the distribution machine; the results are then transferred to the target machine (say, with rdist); and a second Makefile (Makefile.host) is run there.

This is a practical and very flexible approach: configuration can be kept centralized, but if you need to run tasks on the target machine (say, compile software across your heterogeneous architectures), that is possible as well.
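The two-stage flow can be sketched with plain make, simulated here in two local directories. This is only an illustration of the idea, not msrc's actual file layout; real msrc pushes to a remote host with rdist or rsh, where this sketch just uses cp:

```shell
# Two-stage sketch: stage 1 builds on the "master", the results are
# pushed, and Makefile.host runs on the "target". All names here are
# illustrative.
set -e
mkdir -p master target

# stage-1 Makefile: runs on the distribution ("master") machine
printf 'all: motd\nmotd:\n\techo "built on master for $(HOST)" > motd\n' \
    > master/Makefile
# stage-2 Makefile.host: runs later on the target machine
printf 'install: motd\n\tcp motd motd.installed\n' \
    > master/Makefile.host

make -C master HOST=web1 all                   # stage 1: build centrally
cp master/motd master/Makefile.host target/    # "push" (rdist originally)
make -C target -f Makefile.host install        # stage 2: run on the target
```

The key property is that stage 1 can bake per-host configuration into the pushed files, while stage 2 still gets to run real commands on the target itself.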

Over time, tools were added to parallelize this (xapply), make the deployment logs readable (xclate), or work around resource contention (ptbw). Likewise, tools for inventory management and host definitions were added (hxmd, efmd). Stateful operations on sets (oue) can be used for retrying on errors by keeping track of failed tasks.

All tools are fairly well documented, but documentation is spread among many files, so it takes some time to understand the core ideas.

Start here if you are curious.

Dicing and mixing

Unix systems contain a number of ad-hoc text formats, such as that of /etc/passwd. ksb invented a tiny language to work with such formats, implemented by the dicer: a sequence of field separators and field selectors can be used to drill down into formatted data:

% grep Leah /etc/passwd
leah:x:1000:1000:Leah Neukirchen:/home/leah:/bin/zsh
% grep Leah /etc/passwd | xapply -f 'echo %[1:5 $] %[1:$/$]' -
Neukirchen zsh

In %[1:5 $], the first field (the whole line) is split on :, then we select the 5th field, split that on spaces, and select the last field ($). For the basename of the shell, %[1:$/$] takes the last :-separated field and splits it on /.
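Since xapply is rarely installed these days, here is the same extraction redone with standard awk, purely for comparison — this shows what the two dicer expressions compute, not how the dicer is implemented:

```shell
# Emulate %[1:5 $] and %[1:$/$] on a passwd line with plain awk.
line='leah:x:1000:1000:Leah Neukirchen:/home/leah:/bin/zsh'
out=$(echo "$line" | awk -F: '{
    n = split($5, name, " ")           # field 5, split on spaces
    shell = $NF; sub(".*/", "", shell) # last field, then its basename
    print name[n], shell
}')
echo "$out"   # → Neukirchen zsh
```

The dicer packs all of this into a few characters on the command line, which is exactly its appeal.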

Using another feature, the mixer, we can build bigger strings from diced results, for example to format a phone number:

% echo 5555551234 | xapply -f 'echo %Q(1,"(",1-3,") ",4-6,"-",7-$)' -
(555) 555-1234

The %Q does shell-quoting here!
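For comparison, the same phone-number formatting can be done with plain sed capture groups — again just an illustration of what the mixer expression produces, not of its implementation:

```shell
# Equivalent of the mixer expression: regroup 10 digits as (xxx) xxx-xxxx.
formatted=$(echo 5555551234 | sed -E 's/^(...)(...)(....)$/(\1) \2-\3/')
echo "$formatted"   # → (555) 555-1234
```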

Since the dicer and the mixer are implemented as library routines, they appear in multiple tools.

“Business logic” in m4

One of the more controversial choices in the pundits tool-chain is that “business logic” (e.g. “this server runs this OS and has this purpose, therefore it should have this package installed”) is generally implemented using the notorious macro processor m4. But there were few other choices back then: awk would have been a possibility, but is a bit tricky to use due to its line-based semantics, and perl wasn’t around when the tool-chain was started, though it was used later for some things. m4, however, shines if you want to convert a text file into a text file with some pieces of logic.

One central tool is hxmd, which takes tabular files containing configuration data (such as which hosts exist and what roles they have) and can use m4 snippets to filter them and to compute custom command lines for deployment, e.g.:

% hxmd -C site.cf -E "COMPUTONS(CPU,NPROC)>1000" ...

Later, another tool named efmd was added that does not spawn a new m4 instance for each configuration line.

m4 is also used as a templating language. There I learned the nice trick of quoting the entire document except for the parts where you want to expand macros:

`# $Id...
# Output a minimal /etc/hosts to install to get the network going.

'HXMD_CACHE_TARGET`:
	echo "# hxmd generated proto hosts file for 'HOST`"
	echo "127.0.0.1	localhost 'HOST ifdef(`SHORTHOST',` SHORTHOST')`"
	dig +short A 'HOST` |sed -n -e "s/^[0-9.:]*$$/&	'HOST ifdef(`SHORTHOST',` SHORTHOST')`/p"
'dnl

This example also shows that nested escaping was not something ksb frowned upon.

Wrapper stacks

Since many tools of the pundits tool-chain are meant to be used together, they were written as so-called “wrappers”, i.e. programs that call each other. For example, the above-mentioned hxmd can spawn several commands in parallel using xapply, which in turn can call xclate to yield separate output streams, or use ptbw for resource management.
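The wrapper pattern itself is simple to demonstrate: each layer sets up some context in the environment and then execs its arguments, so wrappers stack freely. (wrap.sh and WRAP_ID are made-up names for this sketch; the real xclate and ptbw do considerably more.)

```shell
# A made-up wrapper: add context via the environment, then exec the
# wrapped command; stacking wrappers is just nesting such calls.
cat > wrap.sh <<'EOF'
#!/bin/sh
WRAP_ID=$$; export WRAP_ID   # e.g. a session id for demultiplexing output
exec "$@"                    # replace ourselves with the wrapped command
EOF
chmod +x wrap.sh
./wrap.sh sh -c 'echo "task ran under wrapper $WRAP_ID"'
```

Because each layer execs rather than forks a child and waits, a stack of wrappers adds no extra processes once the real command is running.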

The great thing about the design of all these tools is how nicely they fit together. You can easily see which need drove the creation of each tool, and how they can still be used in a very general way, even for unanticipated use cases.

Influences on my own work

Discovering these tools was important for my own Unix toolkit, and some of my tools are directly inspired by them, e.g. xe and arr.

I still ponder host configuration systems.

NP: Adrianne Lenker—Not a Lot, Just Forever

Copyright © 2004–2022