Intro

This post is about automating the provisioning of your local machines via code. That could be your {work, personal} {laptop, desktop, VM}.

It requires an upfront investment but ultimately pays off by reducing the time spent configuring each new environment.

In this post I will show you how I achieve this via Ansible and the benefits of this approach.

Demo

For motivation, this is how I set up some core applications and preferences automatically on a new environment:

Installing and personalising git, i3, direnv and pipenv via Ansible.

With a few lines I configured a window manager, git, direnv and pipenv. My actual setup does almost my entire local provisioning from scratch. But before going further with this example, let’s revisit the problem and the solution in more detail.

Problem

As a developer I end up using a number of machines (at home and at work), with various OSes/distributions. Examples:

  • personal laptop (Fedora)
  • personal desktop/server (Fedora and Windows)
  • work laptop (Ubuntu)
  • work servers (any)
  • temporary setups for each of the above whenever one breaks
  • temporary VMs (e.g. running Linux in VirtualBox on a Windows laptop)

Additionally:

  • Each of the environments above gets (re)provisioned periodically.
  • Each of these requires a number of things set up so that I’m in my most productive environment.
    • git, ssh keys, gpg, direnv, pass, i3 (with my keychain and HiDPI support), Gnome, Chrome, Atom, PyCharm (the last two needing extensions, personal config, keymap, etc.), my Python setup (pyenv with various Python versions, pip, pipenv, virtualenv), and a virtualenv configured for each of my Python projects.
    • Most of the above benefit from:
      • integration with the bash shell for tab completion and an updated bash prompt
      • personal fields/preferences (e.g. my email address in git, favourite i3 setup).

OS package managers today are highly reliable for the installation part, but the work isn’t finished once a package is installed. And the periodic manual provisioning of each of the machines/VMs above consumes a substantial amount of time. If you are in a similar situation, the initial investment in automating the provisioning (and maintaining that automation) costs less time than manual provisioning does over a few years.

Furthermore, sometimes you want to start from a clean image, or test big changes to your setup without worrying about breaking the workstation you rely so much upon.

If keeping the best setup across your environments isn’t automated, you will compromise. On a tight work schedule, I failed for some time to set up things my personal laptop never lacked, things that accelerate or improve the way I work. You fix something quickly in one environment but it doesn’t reach the others; you might take notes but forget to apply them; and you defer problems until they get just enough in your way. The setup of your environments can get messy.

Backups

Backups solve part of this need for a single environment, but they aren’t ideal for reuse elsewhere (not to mention the difficulty of arranging a decent full-system backup on a Linux laptop).

Some people (myself included, until a few months ago) rely on tools like Dropbox to share a subset of the user’s home, such as config folders and files, via symlinks into the shared Dropbox folders/files, while manually installing system packages.
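In practice that looks something like this (a sketch; the dotfiles layout and paths are illustrative):

ln -s ~/Dropbox/dotfiles/gitconfig ~/.gitconfig        # personal git config
ln -s ~/Dropbox/dotfiles/i3config ~/.config/i3/config  # i3 window manager config

Some problems with this approach: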

  • Critical config is not version controlled.
    • Made a mistake in a config? Hopefully it’s still within Dropbox’s retention period for deleted files, and you won’t have to deal with customer support over files/directories their web app doesn’t handle properly.
  • It generates file conflicts between environments.
  • It requires setups and application versions to be very similar across machines/VMs (generally not the case between work and home environments, or even among machines within each).
  • The backup setup itself isn’t automated by this solution.
  • You still can’t test changes.

Either you move almost everything to Dropbox (generating constant syncs for files you wouldn’t normally care about, and did I mention file conflicts?) or it’s a partial move that still requires manual configuration.

It would be great to abstract what’s common from what’s OS-specific and share the common bits, enabling reuse across OSes (whether for switching or for parallel use, like work and personal laptops), and then share reusable patterns within the community.

Enter Infrastructure as Code

Years ago a movement started that brought development and infrastructure together to make Infrastructure as Code (IaC). Tools were developed to manage fleets of machines via a master machine, with inventories and tasks defined in code to do things like install and configure databases, reverse proxies, LDAP, web servers, or really any sysadmin task, finally making infrastructure reproducible and automated.

In the current world of managed services and a competitive cloud market, small and medium-sized companies and individuals now rarely need to manage much infrastructure directly themselves; tools like Terraform suffice to manage cloud services instead.

The improvement of online IDEs, platforms aiming to manage the entire SDLC, and wider adoption of machines like Chromebooks might change this, but at least one machine remains in our hands to coordinate everything else: the laptop or desktop in front of you.

Ansible

One of these IaC tools is Ansible. Simple yet powerful, it is written in Python, its inventories and tasks are written in readable YAML, and it ships with an extensive range of native modules to change the state of your applications, services, files, networks, etc.

One such module is apt: you can request that packages be present on Ubuntu/Debian, or absent, and Ansible’s job is to make sure your desired list of packages is in the state your code instructs.

A task uses a module; for the example above, a task would be:

# if the git package is already installed this task does nothing,
# otherwise it installs git
- name: Ensure git is installed
  apt:
    name: git       # name of the package
    state: present  # instructs said package to be installed
  become: true      # Ansible needs to run this with sudo

Ansible modules are generally idempotent: running the same task again leaves the system unchanged. Writing this in Ansible is faster and easier than handling the equivalent setup in, say, a bash script (although this isn’t apparent from the example above alone).
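For contrast, a rough bash equivalent of the apt task above needs an explicit check to stay idempotent (a sketch for Debian/Ubuntu):

# only install git if it isn't already present, so re-runs are harmless
if ! dpkg -s git >/dev/null 2>&1; then
  sudo apt-get install -y git
fi

Multiply that boilerplate by every package, config file and service you manage, and the declarative YAML quickly wins.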

Roles and Ansible Galaxy

One big promise of Ansible (albeit one that hasn’t fully matched expectations) was the advent of reusable sets of tasks, aka roles, that could be developed and shared within the community on Ansible Galaxy, not too differently from PyPI in the Python world. Needing only a few user-defined variables, a role takes over all aspects related to a certain outcome: you can download and install a role for setting up, say, nginx, grafana or Apache in a few commands.
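For example, applying a community role looks something like this (geerlingguy.nginx is a popular role on Galaxy; check its documentation for the variables it supports):

ansible-galaxy install geerlingguy.nginx

And a minimal playbook applying it:

- hosts: localhost
  become: true
  roles:
    - geerlingguy.nginx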

I have made available a few roles that I use myself, for git, i3 and others.

Ansible walkthrough

Let’s set up a few of my roles as an example, from which you can build your entire Ansible setup. Note that this will make changes to your system; review the roles’ actions before starting.

I’ll assume you have pip3. I install Ansible in the user folder, since at this point of the machine provisioning, pipenv etc. are likely not yet installed:

pip3 install --user ansible
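Note that pip’s --user install places executables in ~/.local/bin, which may not be on your PATH:

export PATH="$HOME/.local/bin:$PATH"  # only needed if ansible isn't found
ansible --version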

To use my roles, we add them to an ansible-galaxy requirements file (similar to pip’s requirements.txt):

cat > requirements.yml <<EOF
---
- src: n_batalha.ansible_git_role
- src: n_batalha.ansible_pipenv_role
- src: n_batalha.ansible_direnv_role
- src: n_batalha.ansible_i3_role
EOF

We install the roles (they go into ~/.ansible/roles):

ansible-galaxy install --force -r requirements.yml
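You can verify what landed there:

ansible-galaxy list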

Now we define a playbook, playbook.yml, that uses these roles, setting a few user variables for git:

- hosts: localhost
  roles:
  - role: n_batalha.ansible_git_role
    vars:
      email: "your@email.com"
      user_name_long: John Doe
  - role: n_batalha.ansible_direnv_role
  - role: n_batalha.ansible_i3_role
  - role: n_batalha.ansible_pipenv_role

And now we execute the playbook (--ask-become-pass prompts for your sudo password):

ansible-playbook playbook.yml --ask-become-pass

Now git (with bash completion, git-prompt, git aliases, and your user name and email defined), direnv, i3 (with HiDPI, shortcuts, status bar, and Gnome keyring) and pipenv are installed in your environment.
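A few quick sanity checks after the run (exact settings depend on the roles’ defaults):

git config --get user.name   # John Doe
git config --get user.email  # your@email.com
direnv --version
pipenv --version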

Testing

Another advantage of this setup, easy to overlook, is the ability to test system configurations before actually running them on your machine, via Docker or Vagrant. I won’t cover this here, but you can find out how it works in my shared ansible role test project.
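As a minimal sketch of the Docker idea (not my actual test setup; the image and container name are illustrative, and you may need to set ansible_python_interpreter if interpreter discovery fails), run the playbook against a throwaway container with hosts: localhost changed to hosts: all:

docker run -d --name ansible-test python:3-slim sleep infinity
ansible-playbook playbook.yml -i ansible-test, -c docker  # trailing comma makes an inline inventory
docker rm -f ansible-test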

Conclusion

If you have more than a single machine and you’re an advanced Linux user, it might be worth automating their provisioning. Use the above as a starting point, and explore Ansible Galaxy for more reusable roles.

Perhaps I will later share my entire setup and the playbooks in a ready-to-use repo for newcomers. Meanwhile, I hope this helps.


References

Roles

Mine, as used above. I recommend reading the code before running; PRs welcome :).
