Setting up IP failover with Heartbeat and Pacemaker on Ubuntu Lucid

Jun 16, 2010

One of Zivtech's clients recently asked us to renovate their server setup, with a focus on improving its availability.

This post will cover the highly available load balancer setup that currently serves 7 sites, with another 4 in the pipeline. Basically, we want to run the sites through two Linux-based software load balancer boxes, and be able to keep the sites up even if one of them fails.

The ingredients:

  • Two Ubuntu Lucid boxes, each a standard, stripped-down server install. We'll call them Bart and Lisa, because that's how server naming rolls at Zivtech
  • Heartbeat
  • Pacemaker
  • A service IP per site (an IP address that is not the primary IP of the box it lives on, so it can be moved)

Heartbeat and Pacemaker are the real stars here. Pacemaker is the "brains" of the two, providing the high level control of our little cluster, while Heartbeat does the dirty work - it provides the low level inter-node communication and command framework.

Getting started with Heartbeat and Pacemaker

All of the packages we need are available in Ubuntu repositories, so we can just apt-get them on both boxes.


root@bart:~# apt-get install heartbeat pacemaker

The first place to go for configuration information is The Linux-HA User's Guide, followed by the excellent Pacemaker configuration website.

The first step is to create an initial Heartbeat configuration and auth key on one of the boxes, copy both to the other box, and restart the service on each.

root@bart:~# cat /etc/ha.d/ha.cf
autojoin none
bcast eth0
warntime 3
deadtime 6
initdead 60
keepalive 1
node bart.example.com
node lisa.example.com
crm respawn
root@bart:~# scp /etc/ha.d/ha.cf lisa:/etc/ha.d/
root@bart:~# ( echo -ne "auth 1\n1 sha1 "; \
dd if=/dev/urandom bs=512 count=1 | openssl md5 ) \
> /etc/ha.d/authkeys
root@bart:~# chmod 0600 /etc/ha.d/authkeys
root@bart:~# scp /etc/ha.d/authkeys lisa:/etc/ha.d/
root@bart:~# ssh lisa chmod 0600 /etc/ha.d/authkeys
root@bart:~# /etc/init.d/heartbeat restart
....
root@bart:~# ssh lisa /etc/init.d/heartbeat restart
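
A quick note on the timing directives in ha.cf: keepalive is the interval between heartbeats, warntime is how long Heartbeat waits before warning about a late heartbeat, deadtime is how long before a silent node is declared dead, and initdead is the more generous deadtime used while the cluster first starts up. The following is a minimal sketch of a sanity check for those values, not part of Heartbeat itself. The function name check_ha_timings is made up for this post, it assumes all four values are whole seconds (no "ms" suffixes), and the "initdead at least twice deadtime" rule is a rule of thumb.

```shell
# Hypothetical helper: sanity-check the timing directives in a ha.cf file.
# Assumes every value is given in whole seconds.
check_ha_timings() {
    awk '
        $1 == "keepalive" { k = $2 }
        $1 == "warntime"  { w = $2 }
        $1 == "deadtime"  { d = $2 }
        $1 == "initdead"  { i = $2 }
        END {
            if (k == "" || w == "" || d == "" || i == "") {
                print "missing timing directive"; exit 1
            }
            if (!(k + 0 < w + 0 && w + 0 < d + 0)) {
                print "expected keepalive < warntime < deadtime"; exit 1
            }
            if (i + 0 < 2 * d) {
                print "initdead should be at least twice deadtime"; exit 1
            }
            print "ok"
        }
    ' "$1"
}
```

Run it as check_ha_timings /etc/ha.d/ha.cf on either box before restarting Heartbeat.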

If all went well, both nodes of the cluster should be up and know about each other. To check, use one of the insanely helpful command line tools that come with the packages, crm_mon.


root@bart:~# crm_mon -1 | grep Online
Online: [ lisa.example.com bart.example.com ]
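
If you'd rather script that check than eyeball it, something like the sketch below might do. The function name all_online is hypothetical, and it assumes the "Online:" line looks exactly like the output shown above, with node names separated by spaces inside the brackets.

```shell
# Hypothetical helper: read crm_mon output on standard input and succeed
# only if every node named in the arguments is listed on the "Online:" line.
all_online() {
    online=$(grep '^Online:' || true)
    for node in "$@"; do
        case " $online " in
            *" $node "*) ;;
            *) echo "not online: $node"; return 1 ;;
        esac
    done
    echo "all nodes online"
}
```

Usage would be along the lines of crm_mon -1 | all_online bart.example.com lisa.example.com, which makes it easy to drop into a cron job or a deploy script.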

OK, now we move on to setting up the IP addresses we would like Heartbeat and Pacemaker to keep available. There are a few ways to change the configuration of your cluster, but the easiest is to use the crm tool (its documentation is excellent). Note that this is the cluster configuration, not the configuration for a single node: you only need to update it on one node, and the changes will propagate automatically.

To Pacemaker, an IP address is a resource (and it can handle many different types of resources, but that's beyond the scope of this post). Assuming we have two sites, with IP addresses of 192.168.1.111 and 192.168.1.222 respectively, we'd tell Pacemaker about them like this:


root@bart:~# crm configure
crm(live)configure# primitive site_one_ip IPaddr params ip=192.168.1.111 cidr_netmask="255.255.255.0" nic="eth0"
crm(live)configure# primitive site_two_ip IPaddr params ip=192.168.1.222 cidr_netmask="255.255.255.0" nic="eth0"
crm(live)configure# commit
crm(live)configure# exit

To check that your configuration is what you think it is as you go, you can use crm configure show. Right now, your configuration should look similar to this:


root@lisa:~# crm configure show
node $id="9f5e6cd6-2b75-4445-8d75-43c1a34fe431" bart.example.com
node $id="fb502469-691b-4021-b504-7da7a188bc63" lisa.example.com
primitive site_one_ip ocf:heartbeat:IPaddr \
params ip="192.168.1.111" cidr_netmask="255.255.255.0" nic="eth0"
primitive site_two_ip ocf:heartbeat:IPaddr \
params ip="192.168.1.222" cidr_netmask="255.255.255.0" nic="eth0"
...
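
With more than a couple of sites, typing the primitive lines by hand gets tedious. Here is a rough sketch of generating them from a list of "name ip" pairs, one per line. The function name gen_primitives and the input format are inventions for this post; the default NIC and netmask are simply the ones used above.

```shell
# Hypothetical helper: turn "name ip" pairs on standard input into
# crm "primitive" lines in the same shape as the ones typed above.
# Optional arguments: NIC (default eth0) and netmask (default 255.255.255.0).
gen_primitives() {
    nic=${1:-eth0}
    netmask=${2:-255.255.255.0}
    while read -r name ip; do
        # Skip blank lines in the input.
        [ -n "$name" ] || continue
        printf 'primitive %s IPaddr params ip=%s cidr_netmask="%s" nic="%s"\n' \
            "$name" "$ip" "$netmask" "$nic"
    done
}
```

The function only prints the lines, so you can review them before pasting them into a crm configure session and committing.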

The next thing we want to do is tell Pacemaker on which node we'd prefer the IPs to live when both nodes are up. We do this by setting a "location" for each IP resource. In the following configuration, we tell Pacemaker to keep one IP on each node when they are both up, which is a fairly typical setup for a pair of load balancers:


root@bart:~# crm configure
crm(live)configure# location site_one_ip_pref site_one_ip 100: bart.example.com
crm(live)configure# location site_two_ip_pref site_two_ip 100: lisa.example.com
crm(live)configure# commit
crm(live)configure# exit

Checking the configuration, we should see two lines like this added to our output of crm configure show:


location site_one_ip_pref site_one_ip 100: bart.example.com
location site_two_ip_pref site_two_ip 100: lisa.example.com
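
As the number of sites grows, it is easy to add an IP resource and forget its location preference. The sketch below lists any primitive that has no location constraint yet. The function name missing_prefs is hypothetical, and it assumes the output format of crm configure show matches what is shown in this post (resource name as the second word of a primitive line and the third word of a location line).

```shell
# Hypothetical helper: read "crm configure show" output on standard input
# and print every primitive that has no location constraint, one per line.
missing_prefs() {
    awk '
        $1 == "primitive" { prim[$2] = 1 }
        $1 == "location"  { placed[$3] = 1 }
        END { for (r in prim) if (!(r in placed)) print r }
    ' | sort
}
```

Something like crm configure show | missing_prefs should print nothing once every IP has a preferred node.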

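Nearly there! Now that Pacemaker knows about our IP address resources and where they should live, we can set up monitoring, which is the real meat of this exercise. In the commands below, 40s:20s means a monitoring interval of 40 seconds with a timeout of 20 seconds: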


root@bart:~# crm configure
crm(live)configure# monitor site_one_ip 40s:20s
crm(live)configure# monitor site_two_ip 40s:20s
crm(live)configure# commit
crm(live)configure# exit

Done! Now, to test your setup, unplug the network cable from one of your nodes and watch the other one pick up the IPs. Reconnect the node, and Pacemaker will honour your location preferences and send the failed-over IP address back to its original node.
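
To watch the failover from a shell rather than by eye, you can check which box currently holds a service IP. This is a small sketch under a couple of assumptions: the function name has_ip is made up, and it expects the one-line-per-address output of ip -o -4 addr show on standard input.

```shell
# Hypothetical helper: read "ip -o -4 addr show" output on standard input
# and report whether the given IPv4 address is configured on this host.
has_ip() {
    # Fixed-string match including the trailing "/" so that
    # 192.168.1.11 does not match 192.168.1.111.
    if grep -qF " inet $1/"; then
        echo "$1 is here"
    else
        echo "$1 is elsewhere"
        return 1
    fi
}
```

Running ip -o -4 addr show | has_ip 192.168.1.111 on Bart and Lisa before and after pulling the cable should show the address changing hands.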

Now that we have a highly available load-balancing layer, the next post in this series will show you how to set up Apache to balance requests between backend servers running PHP and Drupal.
