Setting Up a Basic IP Failover Using Ubuntu (on Linode)

Web development
Zivtech
Zivtech Staff

Goal:

This is a basic IP failover setup. There are other tools out there, like Heartbeat and the Linux-HA suite, but I wanted something with less overhead. I also wanted something that was 'semi-automatic'.

If a site goes down, I want a script that lets another server grab the active IP address. I don't want to automate the switch itself, so that I can do a basic check before flipping it. Switching servers will not necessarily solve anything if the problem is with the site itself or with the host's connectivity.*

I've got monitoring going, so I'll be alerted right away if there is an issue.

I am specifically doing this on a pair of Linode servers.

Setup:

Server A with IP 1

Server B with IP 2

Swap IP 3

LAMP stack with Drupal site.

Pre Step:

  1. Make sure the Apache and PHP settings are identical on both servers
  2. Set up master / slave replication. Server A is the master.
  3. Set up identical files on Server A and Server B using rsync and svn.
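For the first pre-step, one way to check that a settings file matches on both boxes is to compare the two copies with comments and blank lines stripped. This is only a minimal sketch: the function names `strip_config` and `same_settings` are made up for illustration, and it assumes you have already copied Server B's file somewhere onto Server A (for example with scp).

```shell
#!/bin/sh
# strip_config FILE: print FILE without blank lines and without comment
# lines (leading "#" as in Apache configs, leading ";" as in php.ini).
strip_config() {
    sed -e '/^[[:space:]]*$/d' -e '/^[[:space:]]*[#;]/d' "$1"
}

# same_settings FILE_A FILE_B: succeed only if both files contain the same
# remaining lines in the same order.
same_settings() {
    a=$(strip_config "$1")
    b=$(strip_config "$2")
    [ "$a" = "$b" ]
}
```

With a copy of Server B's php.ini saved under a hypothetical name such as /tmp/serverb-php.ini, running `same_settings /path/to/php.ini /tmp/serverb-php.ini && echo identical` reports whether the two agree. The comparison is order-sensitive, so it is a quick sanity check rather than a full config audit.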

Steps to Setup Failover on Linode

  1. Purchase additional IP address (IP 3) for Server A
  2. Add IP 3 as a Failover for Server B: https://www.linode.com/members/linode/ipfailover.cfm
  3. Add IP 3 as eth0:1 on Server A and Server B
  4. for Server A:
    vim /etc/network/interfaces
    auto eth0
    iface eth0 inet static
    address 'IP 1'
    gateway 'IP 1 gateway'
    netmask 255.255.255.0
    auto eth0:1
    iface eth0:1 inet static
    address 'IP 3'
    gateway 'IP 3 gateway'
    netmask 255.255.255.0


    for Server B:
    vim /etc/network/interfaces
    auto eth0
    iface eth0 inet static
    address 'IP 2'
    gateway 'IP 2 gateway'
    netmask 255.255.255.0
    auto eth0:1
    iface eth0:1 inet static
    address 'IP 3'
    gateway 'IP 3 gateway'
    netmask 255.255.255.0


    You can find the gateway information on the networking tab in your Linode account. Restart networking on Server A, then make sure eth0:1 is set up correctly using 'ifconfig'.
  5. Direct traffic for examplesite.com to 'IP 3'. You can do this through the Linode DNS manager.
  6. Test switching traffic from Server A to Server B:
    Turn off eth0:1 on Server A: ifdown eth0:1
    Turn on eth0:1 on Server B: ifup eth0:1
    Send out a gratuitous ARP request on Server B: arping -I eth0:1 'IP 3' -Uc 5
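After the test in step 6 it helps to confirm which box actually holds the swap IP. Here is a minimal sketch that looks for an address in the one-line-per-address listing printed by `ip -o -4 addr show`. It assumes the iproute2 `ip` tool is installed (it reports the same addresses as ifconfig); the function name `holds_ip` and the sample address 192.0.2.30 are made up for illustration.

```shell
#!/bin/sh
# holds_ip ADDRESS: read `ip -o -4 addr show` style output on stdin and
# succeed if ADDRESS is one of the IPv4 addresses listed.
holds_ip() {
    # A typical line looks like:
    #   2: eth0    inet 192.0.2.30/24 brd 192.0.2.255 scope global secondary eth0:1
    awk -v want="$1" '
        {
            for (i = 1; i < NF; i++)
                if ($i == "inet") {
                    split($(i + 1), parts, "/")   # drop the /prefix length
                    if (parts[1] == want) found = 1
                }
        }
        END { exit (found ? 0 : 1) }'
}
```

On each server, `ip -o -4 addr show | holds_ip 192.0.2.30 && echo "this box is active"` should print the message on exactly one of the two boxes once the switch is done.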

A note on arping

There are two arping packages, one by "Alexey Kuznetsov" and one by "Thomas Habets." 

You want the one by Alexey. "apt-get install arping" installs the one by Thomas.

For Alexey's you want "apt-get install iputils-arping". This cost me a bunch of tenergy (time and energy) because they don't take the same arguments. With Alexey's you should have no problem with the instructions above.
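If you are not sure which arping a box already has, asking dpkg which package owns the binary is a quick way to find out. A minimal sketch, assuming a Debian/Ubuntu system where `dpkg -S` prints lines of the form "package: /path"; the function name `arping_flavor` is made up for illustration.

```shell
#!/bin/sh
# arping_flavor: read `dpkg -S` output (e.g. "iputils-arping: /usr/bin/arping")
# on stdin and print which arping implementation owns the binary.
arping_flavor() {
    # The owning package name is everything before the first colon.
    pkg=$(cut -d: -f1 | head -n 1)
    case "$pkg" in
        iputils-arping) echo "iputils (Alexey Kuznetsov)" ;;
        arping)         echo "arping (Thomas Habets)" ;;
        *)              echo "unknown" ;;
    esac
}
```

Running `dpkg -S "$(command -v arping)" | arping_flavor` on the server tells you which of the two you are about to call before you depend on its arguments.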

Making a script

The above test should work. If it does you are almost finished. Almost but not quite.

You don't want to leave the 'IP 3' info in the /etc/network/interfaces file on each server, and you don't want to rely on bringing eth0:1 up and down by hand. If a machine reboots, you don't want it to bring up all of the interfaces in that file. You could remove 'auto eth0:1', but I want to be able to reboot the non-active box if necessary without it bringing up IP 3.

The solution for this very basic HA setup is to keep separate 'interfaces' files: one for the box in the primary state (receiving traffic on IP 3) and one for the secondary state (not receiving traffic on IP 3). You also want to update the /etc/hosts file with IP 3 for the primary state.

Finally I have a short script that makes a box primary or secondary by:

  1. switching the primary/secondary versions of the /etc/network/interfaces file
  2. switching the primary/secondary versions of the /etc/hosts file
  3. sending a gratuitous ARP from the box taking on the new primary state.
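The script itself is not shown here, so what follows is only a sketch of how those three steps could fit together, not the actual script. Everything named in it is an assumption: the file names (interfaces.primary, interfaces.secondary, hosts.primary, hosts.secondary), the function name `set_state`, and the ROOT and RUN variables, which exist only so the switch can be rehearsed in a scratch directory without touching the real network. It also assumes the iputils version of arping.

```shell
#!/bin/sh
# set_state STATE SWAP_IP: make this box "primary" or "secondary".
# Hypothetical layout: the per-state files live next to the live ones, e.g.
#   /etc/network/interfaces.primary   /etc/network/interfaces.secondary
#   /etc/hosts.primary                /etc/hosts.secondary
set_state() {
    state=$1        # "primary" or "secondary"
    swap_ip=$2      # the shared address ('IP 3')
    root=${ROOT:-}  # empty means the real /etc; set ROOT to rehearse elsewhere
    run=${RUN:-}    # set RUN=echo to print network commands instead of running them

    case "$state" in
        primary|secondary) ;;
        *) echo "usage: set_state primary|secondary SWAP_IP" >&2; return 2 ;;
    esac

    # Take the alias down while the old configuration still describes it.
    # A failure here just means it was not up, so it is ignored.
    $run ifdown eth0:1 || true

    # Steps 1 and 2: swap in the interfaces and hosts files for the new state.
    cp "$root/etc/network/interfaces.$state" "$root/etc/network/interfaces" || return 1
    cp "$root/etc/hosts.$state" "$root/etc/hosts" || return 1

    # Step 3: only the box becoming primary brings the alias up and
    # announces the address with a gratuitous ARP.
    if [ "$state" = primary ]; then
        $run ifup eth0:1 || return 1
        $run arping -U -c 5 -I eth0:1 "$swap_ip"
    fi
}
```

A rehearsal might look like `ROOT=/tmp/failover-test RUN=echo set_state primary 203.0.113.30` (with the four state files created under that directory first), which copies the files and prints the ifdown, ifup and arping commands it would have run. On the real boxes you would run `set_state secondary` on the old primary first, then `set_state primary` on the box taking over.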

Conclusion

Now I can switch boxes in a matter of seconds if there is an issue, without the overhead of the Linux-HA suite and with the added ability to control when I make the switch. Please let me know if you have any thoughts / suggestions!


* Because the two servers are in the same data center, I don't want them to switch automatically even if one becomes completely unavailable. I'm assuming that if one is completely unreachable, the other will be unavailable as well. It is possible that one could go down completely while the other stays up, but that is an edge case not worth preparing for given the extra overhead in this situation.
