One of the more important trends in development over the past decade has been the movement towards automated provisioning of servers. The idea of infrastructure as code is closely linked to the DevOps philosophy, but it seems largely to have grown out of changes in how developers use servers as a result of cloud hosting and on-demand server resources. The goals of long-lived servers and uptime don’t really apply in the world of instant server creation over APIs. Many development teams I’ve worked with have taken to deploying an application by deploying the entire stack (servers and all), which means they can’t waste time installing lists of packages and altering config files by hand. This is something that needs to be automated.
There are a number of tools that have been developed to automate server (and cluster) provisioning. Among them are Ansible, Chef, Puppet, and SaltStack. At Zivtech we use Puppet, but I’ve increasingly become a fan of Ansible. As a result of my growing interest in Ansible, I’ve been asked by several coworkers about the differences between the two systems and their approaches.
Puppet is venerable software. It is one of the oldest projects in the space of modern server provisioning systems. Ansible is new in this respect and is a younger project than even SaltStack, which is considered new in this space. Despite the lack of a long track record, Ansible has been gaining popularity for a number of reasons, most notably the ease of getting going.
In general, I find Ansible a far easier tool to work with. The syntax is YAML with variables and templates using Jinja syntax. So the trick is really just learning all the keywords to make a valid configuration apply correctly to the server. Puppet by contrast uses its own Ruby-like syntax. Puppet also uses different syntaxes for variables in templates and in its own code.
These are minor differences, though. Two major differences set the tools apart and are more likely to influence any choice to start using one or the other.
Declarative vs Imperative
Puppet uses a declarative syntax. The idea is that you tell Puppet what things should look like on a server and Puppet will enforce these rules. For your code, this means you must provide explicit dependencies. If you want to change your Apache configuration file, you must ensure that the code that handles this explicitly requires that Apache first be installed via another rule. Puppet parses through all the files, creates a graph of dependencies, and then runs all the rules in an order that it determines algorithmically. That order may change in later versions of Puppet. In some versions, I’ve heard that it might change each time it is run. In principle this is a wonderful idea and is more in line with the ideal of repeatable deployments. In practice, I find it tiresome and find myself fighting to make simple things work properly. Let me provide a bad example. It is a bad example because you probably wouldn’t need to do something like this and because there are better ways to accomplish the same task. But let’s say that you want to:
- create a file
- then append the output of several commands into that file
- and finally remove that file after copying the data from it to another file
In Puppet, explicitly requiring that a file exists and then later requiring that it does not exist causes a collision that cannot be resolved. Puppet will simply fail to execute if this happens. In your own code this is fairly easy to avoid but when you begin using pre-packaged Puppet modules supplied by the community, you increase your chances of having two pieces of code attempt to declare overlapping parts of the system. The ways around these problems are often a bit ugly.
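To make the idea concrete, here is a minimal Python sketch of a declarative compile step. It is a toy model and not Puppet’s actual implementation: the `compile_order` function, the `DuplicateDeclaration` exception, and the resource names are all hypothetical, chosen only to illustrate the two behaviors described above. Resources are ordered by their declared dependencies rather than by where they appear, and declaring the same resource twice with different states is rejected before anything runs.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class DuplicateDeclaration(Exception):
    """Toy stand-in for the error raised when a resource is declared twice."""


def compile_order(declarations):
    """Return resource names in an order that satisfies every dependency.

    Each declaration is a (name, desired_state, requires) tuple.
    """
    catalog = {}
    for name, state, requires in declarations:
        if name in catalog:
            # A declarative model has no notion of "first this, then that",
            # so two states for one resource cannot be reconciled.
            previous = catalog[name][0]
            raise DuplicateDeclaration(
                f"{name} is declared as both {previous!r} and {state!r}"
            )
        catalog[name] = (state, set(requires))

    # Map each resource to the resources that must be applied before it.
    graph = {name: requires for name, (_state, requires) in catalog.items()}
    return list(TopologicalSorter(graph).static_order())


# Written "backwards" on purpose: the order comes from the dependencies.
declarations = [
    ("service:apache", "running", ["file:httpd.conf"]),
    ("file:httpd.conf", "present", ["package:apache"]),
    ("package:apache", "installed", []),
]
order = compile_order(declarations)

# The create-then-remove example: the same file declared present and absent.
conflicting = [
    ("file:/tmp/scratch", "present", []),
    ("file:/tmp/scratch", "absent", []),
]
try:
    compile_order(conflicting)
    collision = None
except DuplicateDeclaration as err:
    collision = str(err)
```

In this sketch `order` puts the package first, then the config file, then the service, regardless of how the declarations were written, while the conflicting pair never gets as far as being ordered.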
Ansible doesn’t have this problem because it follows an imperative mode of operation. In Ansible, you are free to insist that a file exist and then immediately insist that it does not. Ansible will run your tasks in order from top to bottom. First it will add the file and then it will remove it. If contributed code applies to the same files, it will run in the order in which you call it. The upside to this is that as long as you order things correctly and ensure the final result is as you expected, you do not need to spend much time patching or debugging collisions. The drawback is that modifications to the system that you have written could get reverted by other code that runs later. While this is a definite possibility and something that needs to be acknowledged as a possible drawback, I find the possibility of issues resulting from this less frustrating than reconciling two conflicting pieces of pre-existing code.
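The ordered model can be sketched in a few lines of Python. Again, this is only an illustration under stated assumptions, not how Ansible is implemented: `run_tasks` is a hypothetical helper, and the "server" is just an in-memory set of file paths. Tasks are applied strictly from top to bottom, so asking for a file to be present and then absent is perfectly legal, and whichever task runs last wins.

```python
def run_tasks(tasks, existing_files=()):
    """Apply (path, state) tasks in order against an in-memory set of files.

    Returns the final set of files and a log of (path, state, changed).
    """
    files = set(existing_files)
    log = []
    for path, state in tasks:
        if state == "present":
            changed = path not in files
            files.add(path)
        elif state == "absent":
            changed = path in files
            files.discard(path)
        else:
            raise ValueError(f"unknown state: {state!r}")
        log.append((path, state, changed))
    return files, log


# Create the file, then immediately remove it: no collision, just two steps.
files, log = run_tasks([
    ("/tmp/scratch", "present"),
    ("/tmp/scratch", "absent"),
])

# The drawback: a later task silently reverts an earlier one.
reverted, _ = run_tasks([
    ("/etc/app.conf", "present"),   # your task
    ("/etc/app.conf", "absent"),    # someone else's task, run afterwards
])
```

Both runs finish without error. The second one shows the trade-off: nothing warns you that the file you asked for is gone by the end of the run.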
Push vs Pull
One of the features of Ansible that makes it simple is the way it connects to servers and the speed with which you are able to provision a brand new machine. Ansible requires nothing aside from Python on the client that needs to be provisioned, and most Linux/Unix distros have Python by default. After installing Ansible on your host provisioning computer (or server), the only requirement is that you have ssh access to the machine. When you run Ansible against a server or a cluster of servers, it will ssh to each machine and run the tasks. This means that you don’t have to worry about dependency problems on your servers and you don’t need to install anything on the server other than what you need to run your applications. Server updates don’t break the provisioning process (though they can still cause provisioning to fail, it won’t be due to issues with Ansible itself).
Puppet approaches this problem in reverse. Puppet runs in a leader/follower setup where one server is the central coordinator and all the other servers request updates from it. The central server is called the Puppet Master. This model requires that each server have Puppet installed on it and requires some extra setup to ensure certificates are authorized between the servers and the Puppet Master. Again, there are benefits and drawbacks. The most notable drawback is that the Puppet Master is a point of failure for deployments. If something bad happens to that machine, you will have to repair or replace it before new changes can be deployed to the servers. Ansible doesn’t have this risk. Anyone with proper ssh credentials can provision the servers from a laptop if the need should arise. The server model is nicer if you are running a large infrastructure, but if you have only a few servers, Ansible’s simplified setup might save some headaches.
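The difference in failure modes can be shown with a small Python simulation. Everything here is hypothetical (the `Master` and `Node` classes, `pull_run`, `push_run`), and no real networking, ssh, or certificate handling is involved. It only models who initiates the update and what happens when the central server is unavailable.

```python
class Master:
    """Toy central server that hands out configuration on request."""

    def __init__(self, config):
        self.config = config
        self.online = True

    def fetch(self, node_name):
        if not self.online:
            raise ConnectionError(f"{node_name}: master unreachable")
        return self.config


class Node:
    """Toy managed server that remembers the last configuration applied."""

    def __init__(self, name, config=None):
        self.name = name
        self.config = config

    def apply(self, config):
        self.config = config


def pull_run(node, master):
    """Pull model: the node asks the central server for its configuration."""
    try:
        node.apply(master.fetch(node.name))
        return True
    except ConnectionError:
        return False  # the node keeps whatever it had before


def push_run(nodes, config):
    """Push model: any control machine with access drives every node."""
    for node in nodes:
        node.apply(config)


nodes = [Node("web1", "v1"), Node("web2", "v1")]

# Pull: with the master down, no node can pick up the new release.
master = Master("v2")
master.online = False
pull_results = [pull_run(node, master) for node in nodes]
configs_after_pull = [node.config for node in nodes]

# Push: a laptop with credentials can still roll out the change.
push_run(nodes, "v2")
configs_after_push = [node.config for node in nodes]
```

In this model the pull runs all fail and the nodes stay on `"v1"` while the master is offline, whereas the push run updates both nodes without involving a central server at all.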
A Question of Purpose
I think both Ansible and Puppet are totally valid deployment tools. If I had to start over again with our setup at Zivtech, I’d probably turn to Ansible (though SaltStack would be a contender as well). Much of the decision for me would be based on how many servers need to be managed, how those servers are meant to be created, and how long they will exist. In a server-on-demand type of environment, I think Ansible’s model is a better fit, since speed to provision is more important and a server that fails to build will more likely be rebuilt than debugged. In environments with many long-lived servers, Puppet may be the better option, as it also has a nice dashboard and reporting tools that let you see how the various nodes are behaving.
Regardless of your choice, there are good tutorials and enough documentation to help you accomplish any server provisioning task. Puppet and Ansible have open source repositories of popular provisioning tasks to help you get started and prevent you from rewriting the same code that has been written dozens of times before. Happy automating!