Dominic Cleal's Blog

Reproducible builds on Copr with tito and git-annex

Recently, Fedora's Copr service was launched which lets individuals create personal repos and build packages on Fedora's servers for a number of Fedora and EL versions (similar to OBS and PPAs).

I've set up a couple of repos under it, and one which contains builds of various gems as dependencies for librarian-puppet. Setting up and tracking RPM builds is made quite easy with git and tito, which lets you set up a simple directory structure in your RPM repo, track specs and source binaries (.gem files), tag and release changes to Copr:

$ tree 
|-- rel-eng
|   |-- packages
|   |   |-- rubygem-highline
|   |   |-- rubygem-librarian
|   |   |-- rubygem-librarian-puppet
|   |   `-- rubygem-thor
|   |-- releasers.conf
|   `-- tito.props
|-- rubygem-highline
|   |-- highline-1.6.20.gem
|   `-- rubygem-highline.spec
|-- rubygem-librarian
|   |-- librarian-0.1.2.gem
|   `-- rubygem-librarian.spec
|-- rubygem-librarian-puppet
|   |-- librarian-puppet-0.9.17.gem
|   `-- rubygem-librarian-puppet.spec
|-- rubygem-thor
|   |-- rubygem-thor.spec
|   `-- thor-0.15.4.gem

6 directories, 16 files

(my librarian-puppet-copr repo)

However storing binary files in git has lots of problems, including the resulting size of the repo. To solve this, I use git-annex so the repo only stores metadata and a local cache of the binaries. The binaries can then be fetched from the web on a clean checkout using the script and git-annex on the fly.

Setting this up is easy:

  1. mkdir myrepo && cd myrepo
  2. git init
  3. git annex init
  4. tito init
Next, configure Copr and tito to use Copr using Mirek Suchý's blog post and my rel-eng directory. The key part of the rel-eng configuration for using the git-annex support in tito 0.5.0 is setting the builder in tito.props:

builder = tito.builder.GitAnnexBuilder

Adding new packages is now a matter of doing:

  1. mkdir rubygem-foo && cd rubygem-foo
  2. vim rubygem-foo.spec
  3. git annex addurl --file=foo-1.2.3.gem or copy the file into place and run git annex add foo-1.2.3.gem
  4. git commit -am "Add foo 1.2.3"
  5. tito tag --keep-version
  6. tito release copr-domcleal

git-annex will store the file in its local storage under .git/annex, replacing the file in git with a symlink based on the checksum of the contents. When you push the repo to a remote without git-annex support (like GitHub), then the binaries won't be transferred, keeping the size small. Other users can fetch binaries by adding a shared remote (e.g. a WebDAV or rsync share or web remotes using

When tito is building SRPMs, it will "unlock" the files, fetching them from available remotes, build the RPM and then re-lock the checkout.

So the combination of git (for tracking sources and specs), git-annex (for keeping sources outside git), tito (for tagging builds) and Copr (for building and publishing) makes it easy to build and release your own RPMs, while allowing you to make the source code and build process accessible and transparent.

Automatic DNS updates from libvirt guests

Recently I've been relying a lot on Vagrant and vagrant-libvirt in particular for spinning up a variety of OSes for testing. One irritating habit I've developed is to check the IP of the VM each time it comes up if I need to view a website/app hosted on the VM itself and paste this into my browser, rather than using a DNS name.

Since libvirt runs dnsmasq by default for DHCP and DNS services inside NAT virtual networks, getting DNS working from the hypervisor (my desktop) is very easy. First ensure that the libvirt network has the domain name correctly configured - this was my hostname initially, but I changed it to

$ sudo virsh net-edit default
  <forward mode='nat'>
      <port start='1024' end='65535'/>
  <bridge name='virbr0' stp='on' delay='0' />
  <mac address='52:54:00:03:62:29'/>
  <domain name=''/>
  <ip address='' netmask=''>
      <range start='' end='' />

Change the <domain> tag's name attribute, then virsh net-destroy default and virsh net-start default to restart and apply this change.

Next I updated the images and VMs I use in libvirt to send hostnames when making DHCP requests, ensuring that dnsmasq would associate the hostname with the DHCP lease. On Red Hat variants, set in /etc/sysconfig/network-scripts/ifcfg-eth0 and run service network restart.

Provided the domain in the DHCP_HOSTNAME is the same as the domain set in libvirt's network (which is used to configure dnsmasq), the hostname will get updated and is shown in /var/lib/libvirt/dnsmasq/default.leases with the domain removed:

1387893215 52:54:00:15:d1:73 foreman *

If hostnames aren't shown, check syslog for messages such as this to indicate a domain name mismatch:

Dec 24 12:32:46 cobalt dnsmasq-dhcp[8571]: Ignoring domain for DHCP host name foreman

Lastly, I configure NetworkManager on my desktop to redirect queries for to the dnsmasq instance serving the libvirt network. First I switch NetworkManager to run a dnsmasq instance of its own instead of pointing resolv.conf to external resolvers:

$ sudoedit /etc/NetworkManager/NetworkManager.conf

Then add a dnsmasq config snippet to send queries for through to the dnsmasq instance bound to the gateway IP address of the libvirt network:

$ sudoedit /etc/NetworkManager/dnsmasq.d/libvirt_dnsmasq.conf

And now, from my desktop I'm able to bring up a VM and instantly query or use the hostname to access it.

Extending Foreman quickly with hook scripts

Foreman integrates with a bunch of systems such as DNS (BIND, AD), DHCP (ISC) and Puppet when systems get created and destroyed. It does most of this via its smart proxy architecture, where an agent (the proxy) is deployed on your other servers which gets called from your central Foreman host during "orchestration".

Most production setups are a bit more complex though with other systems we don't integrate with in Foreman, so the plugin support in version 1.1 is there to extend Foreman's capabilities. Since this means learning some Ruby and Rails, I've released foreman_hooks so admins can instead write simple shell scripts to hook straight into Foreman's orchestration feature.

I'm going to walk through a example hook to run a PuppetDB command to clean up stored data (facts, resources) when hosts are deleted in Foreman.

Foreman 1.2+ installed from RPMs: please use the ruby193-rubygem-foreman_hooks package from the plugins repo.

Other users: first we'll install the plugin by adding this config file:

# echo "gem 'foreman_hooks'" > ~foreman/bundler.d/foreman_hooks.rb
And then run this to install and restart Foreman:
# cd ~foreman && sudo -u foreman bundle update foreman_hooks
Fetching source index for
Installing foreman_hooks (0.3.1)
# touch ~foreman/tmp/restart.txt

Next, we'll create a hook script that runs when a managed host is destroyed in Foreman:

# mkdir -p ~foreman/config/hooks/host/destroy
# cat << EOF > ~foreman/config/hooks/host/destroy/70_deactivate_puppetdb

# We can only deactivate data in PuppetDB, not add it
[ "x$1" = xdestroy ] || exit 0

# Remove data in PuppetDB, supplying the FQDN
# assumes sudo rules set up as this runs as 'foreman':
#   foreman ALL = NOPASSWD : /usr/bin/puppet node deactivate *
#   Defaults:foreman !requiretty
sudo puppet node deactivate $2

# chmod +x ~foreman/config/hooks/host/destroy/70_deactivate_puppetdb

(on Foreman 1.2, change "host" in the path to "host/managed")

There are a few things here to note. The path we're creating under config/hooks/ refers to the Foreman object and event (see the README). For hosts we can use "create", "update" and "destroy" to extend the orchestration process and there similar events for every other object in Foreman. The 70 prefix influences the ordering with other tasks, see grep -r priority ~foreman/app/models/orchestration for more.

The script gets two arguments, the first is the event name ("destroy" etc) and very importantly for orchestration events, this can change. If an orchestration task fails, the process will get rolled back so the script will then be called with the opposite event to the one it was first called with. For example, a hook in the create/ directory will first be called with "create", then a later task may fail and it will be called again with "destroy" to revert the change. Orchestration of DNS records etc in Foreman works in the same way. Since this example is only able to remove data and not add it, the first line checks the argument and exits if it isn't asked to destroy. Other scripts should take note of value of this argument.

The second argument is the string representation of the object, i.e. the FQDN for host objects. On stdin, we receive a full JSON representation which gives access to other attributes. There are helpers in to access this data, see examples/ to get a copy.

Lastly in this example, we run the PuppetDB command to remove the data. The exit code of orchestration hooks is important here. If the exit code is non-zero, this will be treated as a failure so Foreman will cancel the operation and roll back other tasks that were already completed.

Now when the host gets deleted from either the Foreman UI or the API, the host gets deactivated in PuppetDB:

# puppet node status
Deactivated at 2013-04-07T13:39:59.574Z
Last catalog: 2013-04-07T11:56:40.551Z
Last facts: 2013-04-07T13:39:30.114Z
There's a decent amount of logging, so you can grep for the word "hook" in the log file to find it. You can increase the verbosity with config.log_level in ~foreman/config/environments/production.rb.
# grep -i hook /var/log/foreman/production.log
Queuing hook 70_deactivate_puppetdb for Host#destroy at priority 70
Running hook: /usr/share/foreman/config/hooks/host/destroy/70_deactivate_puppetdb destroy

One area I intend on improving is being able to interact with Foreman from within the hook script. At the moment, I'd suggest using the foremanBuddy command line tool from within your script - though you shouldn't edit the object from within the hook, unless you're using the after_create/after_save events.

I hope this sets off your imagination to consider which systems you could now integrate Foreman into, without needing to start with a full plugin. Maybe we could begin collecting the most useful hooks in the community-templates repo or on the wiki.

(PuppetDB users should actually use Daniel's dedicated puppetdb_foreman plugin. I hope development of dedicated plugins down the line will happen as a natural extension.)

Introducing tests for Augeas resources in Puppet

Augeas resources in Puppet have always been a bit of a black box, as they use somewhat esoteric commands (based on augtool, but with a different parser), are often not idempotent without some work, and are difficult to test.

Last year I wrote about a small test framework I'd written for augeasproviders, a set of new types and providers. These tools allowed me to check the providers were making the change correctly and were idempotent.

Before his talk on testing Puppet modules at Puppet Camp Ghent, vStone asked me about testing Augeas resources in the same way, so I decided to refactor this code into a new extension for Tim Sharpe's rspec-puppet tool, so it could be used against resources in manifests.

rspec-puppet-augeas is a small gem you can install to run Augeas resources inside your rspec-puppet tests. It runs the resource against as many test fixture(s) as you like, then runs it again to make sure it's idempotent. It can provide debug output from the provider when it goes wrong, and adds many file inspection tools to check it went right.

The README has info on setting up (it's quick) and I've published an example module showing how it should look. There are lots of examples showing off each feature inside the project tests themselves.

Feedback gratefully received. File issues, PRs or contact me on IRC.

Testing techniques for Puppet providers using Augeas

Over the past few months I've continued working on augeasproviders, a Puppet module that implements a few providers which use the Augeas configuration API to modify files. One of the interesting challenges has been testing providers, which I'm going to explain in this entry.

The challenge appears quite simple: the provider will alter the contents of a config file by adding, removing or editing a line and the spec (that is, a set of RSpec examples using the puppetlabs_spec_helper) has to test this was successful. Unlike core Puppet providers which may alter package databases and entire system state, augeasproviders operates only on individual files and so it can be run safely within a test harness.

The main flow of testing is shown above and follows ths sequence:

  1. A temporary copy is made of a fixture file (sample input file) which here is a typical /etc/hosts file. The resource is created on the fly from the type, referencing the temporary copy made of the fixture.
  2. The resource is added to a Puppet catalog instance and the catalog, containing a single resource is applied using Puppet. This tests the provider in a near "real life" setting, as all normal provider functionality will be used.
  3. Puppet will call methods to create, destroy or change individual parts of an existing resource via getter/setter methods. In augeasproviders, these use the Augeas API to open the target file, make the change or retrieve the value and close the file.
  4. The catalog run will complete and the temporary fixture will have been updated with any changes made. If there have been any resource failures or log messages at warning or above, the spec fails.
  5. A second catalog run is now performed in order to catch idempotency problems, which shouldn't produce any further changes. If a getter and setter aren't consistent, then Puppet will attempt to set a property on every run since the two implementations differ. This useful test discovered a few issues in augeasproviders.

Now the updated fixture is complete, the contents need to be tested. Originally tests were using the Augeas API again to load the fixture and then check certain parts of the tree. Since the file has been persisted to disk using a lens that is tested and typechecked upstream, this is pretty safe and allows us to skip over implementation details such as whitespace formatting within the output.

Unfortunately it's a very verbose way of testing. Improving on this was to use the augparse utility shipped with Augeas, normally used for testing lenses transform between a sample file and a given tree. The spec was changed to generate a module for augparse that took the fixture and an "expected tree" in the spec and compared the two via the lens.

An additional feature of this was to filter the full fixture into just the line that was expected to change, meaning only part of the entire file would need to be tested against a tree. (Clearly a bad bug could cause other parts to change, but it's a trade-off to maintain the tree text for the entire file.)

The above techniques used to test the augeasproviders code could easily be adapted into other provider tests, whether or not the provider itself uses the Augeas API.

Update: the new idempotency code and all the fixes it's helped with have now been released as version 0.3.0.

Alternative Augeas-based providers for Puppet

Puppet has a few core types such as host for populating /etc/hosts and mailalias for /etc/aliases that use an internal parsedfile helper. You'll be familiar with this if you've ever seen it announce the following in any file it touches:

# HEADER: This file was autogenerated at Sun Mar 04 17:45:13 +0000 2012
# HEADER: by puppet.  While it can still be managed manually, it
# HEADER: is definitely not recommended.

It also rewrites the entire file, ignoring the whitespace you'd carefully placed in existing entries. While you can simply template the entire file, or use Augeas resources to do it, I wanted to improve the built in types.

So I've released a module called augeasproviders that contains new provider implementations for host and mailalias, using the Augeas library under the covers. Augeas is good at preserving the existing format of the file while reading, editing and adding additional entries (though it can't copy formatting yet).

No Augeas knowledge is necessary, just download, enable pluginsync and set the default provider.

Older entries are hidden, but you can browse them with the date categories at the top-right of this page.

Created by Chronicle v3.5