For the past four years, I've been working on Carthage, a free-software
Infrastructure as Code (IAC) framework. We've finally reached a point
where it makes sense to talk about Carthage and what it can do. This is
the first in a series of blog posts to introduce Carthage, discuss what
it can do, and show how it works.
Why Another IAC Framework
It seems everywhere you look, there are products designed to support
the IAC pattern. On the simple side, you could check a Containerfile
into Git. Products like Terraform and Vagrant allow you to template
cloud infrastructure and VMs. There are more commercial offerings than I
can keep up with.
We were disappointed by what was out there when we started Carthage.
Other products have improved, but for many of our applications we're
happy with what Carthage can build. The biggest challenge we ran into is
that products wanted us to specify things at the wrong level. For some
of our cyber training work we wanted to say things like "We want 3 blue
teams, each with a couple of defended networks, a red team, and some
neutral infrastructure for red to exploit." Yet the tools we were trying
to use wanted to lay things out at the individual machine/container
level. We found ourselves contemplating writing a program to generate
input for some other IAC tool.
Things were worse for our internal testing. Sometimes we'd be
shipping hardware to a customer. But sometimes we'd be virtualizing that
build out in a lab. Sometimes we'd be doing a mixture. So we wanted to
completely separate the descriptions of machines, networks, and software
from any of the information about whether that was realized on hardware,
VMs, containers, or a mixture.
Dimensional Breakdown
When I discussed Carthage with Enrico Zini, he pointed me at Cognitive
Dimensions of Notations as a way to think about how Carthage
approaches the IAC problem. I'm more interested in the idea of breaking
a design down along dimensions that allow examining the design space
than I am in strict adherence to Green's original dimensions.
Low Viscosity, High Abstraction Reuse
One of the guiding principles is that we want to be able to reuse
different components at different scales and in different environments.
These include being able to do things like:
- Define an operation like "Update a Debian system" and apply it
in several environments: as part of building a base VM or container
image, on an independently managed machine, or in a micro service
container that does not run services like ssh or systemd.
- Define a role like "DNS server" that can be applied to a
dedicated machine only having that role, to a traditional server with
multiple roles, or in a micro service environment.
- Allow people to write groups of functionality that can be
useful in descriptions of a small number of machines, but can also be
reused in large environments like modeling of cyber infrastructure to
defend. In the small environments, things are simplified, but in larger
environments integration like directories, authentication infrastructure
and the like is needed.
- Allow grouping of functionality at multiple levels. So far I have
talked about grouping of software to be installed on a single machine or
container. We also want to allow groups of containers (pods or
otherwise), groups of machines, groups of networks, or even enclaves
(think a model of an entire company or section of a company). Each kind
of grouping needs to be parametric and reusable.
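To make the idea of parametric, nested groupings concrete, here is a minimal sketch in plain Python. It is not Carthage's API: the names `Enclave`, `Machine`, `Network`, `blue_team`, and `exercise` are hypothetical and exist only for illustration. It shows a reusable group (a blue team) being stamped out several times inside a larger group.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Network:
    name: str


@dataclass
class Machine:
    name: str
    networks: list[str]
    roles: list[str] = field(default_factory=list)


@dataclass
class Enclave:
    """Hypothetical group of networks and machines that can contain other groups."""
    name: str
    networks: list[Network] = field(default_factory=list)
    machines: list[Machine] = field(default_factory=list)
    children: list[Enclave] = field(default_factory=list)

    def all_machines(self):
        """Yield machines from this group and from every nested group."""
        yield from self.machines
        for child in self.children:
            yield from child.all_machines()


def blue_team(index: int, defended_networks: int = 2) -> Enclave:
    """Parametric group: one blue team with a DNS server on each defended network."""
    name = f"blue{index}"
    nets = [Network(f"{name}-net{n}") for n in range(1, defended_networks + 1)]
    machines = [Machine(f"{net.name}-dns", [net.name], roles=["dns-server"]) for net in nets]
    return Enclave(name, nets, machines)


def exercise(teams: int = 3) -> Enclave:
    """Compose blue teams, a red team, and neutral infrastructure into one layout."""
    red = Enclave("red", [Network("red-net")], [Machine("red-attack", ["red-net"])])
    neutral = Enclave(
        "neutral",
        [Network("neutral-net")],
        [Machine("neutral-web", ["neutral-net"], roles=["web-server"])],
    )
    blues = [blue_team(i) for i in range(1, teams + 1)]
    return Enclave("exercise", children=[*blues, red, neutral])


layout = exercise(teams=3)
machine_names = [m.name for m in layout.all_machines()]
```

The description stays at the level of "three blue teams, a red team, and neutral infrastructure"; the per-machine detail falls out of the composition rather than being written by hand.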
Hidden Dependencies
To accomplish these abstraction goals, dependencies need to be
non-local. For example, a software role might need to integrate with a
directory if a directory is present in the environment. When writing the
role, no one is going to know which directory to use, nor whether a
directory is present. Taking that as an explicit input into the role is
error-prone when the role is combined into large abstract units (bigger
roles or collections of machines). Instead it is better to have a
non-local dependency, and to find the directory if it is available. We
accomplish this using dependency injection.
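As a rough illustration of a non-local dependency, the sketch below uses a tiny hand-written injector in plain Python. It is an assumption-laden toy, not Carthage's actual dependency injection interface: `Injector`, `provide`, `lookup`, and `configure_file_server` are all hypothetical names. The role asks its surrounding scope for a directory and adapts if none is present.

```python
class Injector:
    """Hypothetical scope that holds dependencies and defers to its parent scope."""

    def __init__(self, parent=None):
        self.parent = parent
        self._providers = {}

    def provide(self, key, value):
        """Make a dependency available to this scope and everything nested in it."""
        self._providers[key] = value

    def lookup(self, key, default=None):
        """Search this scope, then enclosing scopes; return default if absent."""
        scope = self
        while scope is not None:
            if key in scope._providers:
                return scope._providers[key]
            scope = scope.parent
        return default


def configure_file_server(injector):
    """Hypothetical role: integrate with a directory only if one is in scope."""
    directory = injector.lookup("directory")
    if directory is None:
        return {"auth": "local-accounts"}
    return {"auth": "directory", "directory_host": directory}


# An enclave that provides a directory, with a machine nested inside it.
enclave = Injector()
enclave.provide("directory", "ldap.corp.example")
machine_in_enclave = Injector(parent=enclave)

# A standalone machine with no directory anywhere in its environment.
standalone_machine = Injector()

with_directory = configure_file_server(machine_in_enclave)
without_directory = configure_file_server(standalone_machine)
```

The role never takes the directory as an explicit parameter, so it can be embedded in larger roles or groups of machines without each layer having to pass the directory through.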
In addition to being non-local, dependencies are sometimes hidden. It
is very easy to overwhelm our cognitive capacity with even a fairly
simple IAC description. An effective notation allows us to focus on the
parts that matter when working with a particular part of the
description. I've found hiding dependencies, especially indirect
dependencies, to be essential in building complex descriptions.
Obviously, tools are required for examining these dependencies as
part of debugging.
First Class Modeling
Clearly one of the goals of IAC descriptions is to actually build and
manage infrastructure. It turns out that there are all sorts of things
you want to do with the description well before you instantiate the
infrastructure. You might want to query the description to build network
diagrams, understand interdependencies, or even build inventory/bill of
materials. We often find ourselves building Ansible inventory, switch
configurations, DNS zones, and all sorts of configuration artifacts.
These artifacts may be installed into infrastructure that is
instantiated by the description, but they may be consumed in other ways.
Allowing the artifacts to be consumed externally means that you can
avoid pre-commitment and focus on whatever part of the description you
originally want to work on. You may use an existing network at first.
Later the IAC description may replace that, or perhaps it never
will.
As a result, Carthage separates modeling from instantiation. The
model can generally be built and queried without needing to interact
with clouds, VMs, or containers. We've actually found it useful to build
Carthage layouts that cannot ever be fully instantiated, for example
because they never specify details like whether a model should be
instantiated on a container or VM, or what kind of technology will
realize a modeled network. This allows developing roles before the
machines that use them or focusing on how machines will interact and how
the network will be laid out before the details of installing on
specific hardware.
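The sketch below illustrates the separation in plain Python. It does not use Carthage's API: `MachineModel`, `ansible_inventory`, and `instantiate` are hypothetical names, and the addresses come from a documentation range. An artifact (here an INI-style Ansible inventory) is generated from the model alone, while instantiation is a separate step that fails for a machine whose realization was never specified.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MachineModel:
    """Hypothetical model of a machine; the backend may be left unspecified."""
    name: str
    ip: str
    roles: list = field(default_factory=list)
    backend: Optional[str] = None  # e.g. "vm" or "container"; None means model-only


def ansible_inventory(machines):
    """Render an INI-style Ansible inventory grouped by role, from the model alone."""
    groups = {}
    for m in machines:
        for role in m.roles:
            groups.setdefault(role, []).append(m)
    lines = []
    for role in sorted(groups):
        lines.append(f"[{role}]")
        lines.extend(f"{m.name} ansible_host={m.ip}" for m in groups[role])
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


def instantiate(machine):
    """Instantiation is a separate step that needs realization details."""
    if machine.backend is None:
        raise ValueError(f"{machine.name}: no backend chosen; model-only machine")
    return f"would create {machine.name} as a {machine.backend}"


model = [
    MachineModel("ns1", "192.0.2.10", roles=["dns_server"]),
    MachineModel("web1", "192.0.2.20", roles=["web_server"], backend="container"),
]

# Querying the model works even though ns1 can never be instantiated as written.
inventory = ansible_inventory(model)
```

Here `ns1` contributes to the inventory even though nothing says whether it is hardware, a VM, or a container, which mirrors the idea of consuming artifacts externally before committing to how the infrastructure is realized.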
The modeling separation is by far the difference I value most between
Carthage and other systems.
A Tool for Experts
In his essay "In the Beginning Was the Command Line", Neal Stephenson
points out that the kinds of tools experts need are not the same tools
that beginners need. The illustration of why a beginner might not be
satisfied with a Hole Hawg drill caught my attention. Carthage is a tool
for experts. Despite
what cloud providers will tell you, IAC is not easy. Doubly so when you
start making reusable components. Trying to hide that or focus on making
things easy to get started can make it harder for experts to efficiently
solve the problems they are facing. When we have faced trade-offs
between making Carthage easy to pick up and making it powerful for
expert users, we have chosen to support the experts.
That said, Carthage today is harder to pick up than it needs to be.
It's a relatively new project with few external users at this time.
Our documentation and examples need improvement, just like every project
at this level of maturity. Similarly, as the set of things people try to
do expands, we will doubtless run into bugs that our current test cases
don't cover. So Carthage absolutely will get easier to learn and use
than it is today.
Also, we've already had success building beginner-focused
applications on top of Carthage. For our cyber training, we built web
applications on top of Carthage that made rebuilding and exploring
infrastructure easy. We've had success using relatively well-understood
tools like Ansible as integration and customization points for Carthage
layouts. But in all these cases, when the core layout had significant
reusable components and significant complexity in the networking, only
an IAC expert was going to be able to maintain and develop that
layout.
What Carthage Can Do
Carthage has a number of capabilities today. One of Carthage's
strengths is its extensible design. Abstract interfaces make it easy to
add new virtualization platforms, cloud services, and support for
various ways of managing real hardware. This approach has been validated
by incrementally adding support for virtualization architectures and
cloud services. As development has progressed, adding new integrations
continues to get faster because we are able to reuse existing
infrastructure.
Today, Carthage can model:
- Machines
- Networks
- Dynamically compose groupings of the above
- Generate model-level artifacts
  - Ansible inventory
  - Various DNS integrations
  - Various switch configurations
Carthage has excellent facilities for dealing with images on which
VMs and Containers can be based, although it does have a bit of a
Debian/Ubuntu bias in how it thinks about images:
- Building base images from a tool like debootstrap
- Customizing these images
- Converting into VM images for kvm, VMware, and AWS
- Building OCI images from scratch for Podman, Docker, and k8s
- Adding layers to existing OCI images
When instantiating infrastructure, Carthage can work with:
- systemd-nspawn containers
- Podman (Docker would be easy)
- Libvirt
- VMware
- With the AWS plugin, EC2 VMs and networking
We have also looked at Oracle Cloud and, I believe, OpenStack, although
that code is not merged.
Future posts will talk about core Carthage concepts and how to use
Carthage to build infrastructure.