HALF
High Availability Linux Firewall

Bogdan Lucaciu (bogdan [at] wiz [dot] ro)

1. What is the actual problem?

With the stateless firewalls in Linux 2.0/2.2, a High Availability mechanism is rather easy to implement: as long as identical packet filtering rules and routing tables are present on all the cluster nodes, the failover process can use any IP address takeover method (such as Heartbeat or VRRP).

With the netfilter/iptables infrastructure in Linux 2.4/2.6, the firewall does more than simple packet filtering: it provides a connection tracking system that can be used for stateful firewalling. This tracking system collects information about all current connections, which can be associated with filtering decisions and NAT data. A working failover system therefore also has to replicate this state information from the active node to all waiting cluster nodes.

2. What can we do about it?

This situation requires a system providing:

- A connection tracking state replication protocol
- An event interface generating event messages as soon as state information changes on the active node
- An interface for explicit generation of connection tracking table entries on the standby slaves
- Some code (preferably a kernel thread) running on the active node, receiving state updates from the event interface and generating conntrack state replication protocol messages
- Some code (preferably a kernel thread) running on the slave node(s), receiving conntrack state replication protocol messages and updating the local conntrack table accordingly

These requirements were first presented in a draft by Harald Welte, head of the netfilter core team, and were implemented in a proof-of-concept kernel module by Krisztian Kovacs. This module is ct_sync, a connection tracking state replication mechanism. It is currently under heavy development, but stable enough to help with everyday firewall scenarios.
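The division of labour in the requirement list above can be illustrated with a small user-space toy model. This is only a sketch and not ct_sync code: the class names, the NEW/UPDATE/DESTROY message kinds, the flow key layout and the addresses are all assumptions made for illustration, and the real module works inside the kernel and sends its messages over the network rather than through a method call.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical flow key: (protocol, src ip, src port, dst ip, dst port).
FlowKey = Tuple[str, str, int, str, int]


@dataclass(frozen=True)
class SyncMessage:
    """One replication message; the kind names are invented for this sketch."""
    kind: str          # "NEW", "UPDATE" or "DESTROY"
    key: FlowKey
    state: str = ""


@dataclass
class StandbyNode:
    """Standby side: applies received messages to its local table."""
    table: Dict[FlowKey, str] = field(default_factory=dict)

    def receive(self, msg: SyncMessage) -> None:
        if msg.kind == "DESTROY":
            self.table.pop(msg.key, None)
        else:
            # NEW and UPDATE both create or overwrite the entry explicitly.
            self.table[msg.key] = msg.state


@dataclass
class ActiveNode:
    """Active side: every table change raises an event that is replicated."""
    standbys: List[StandbyNode]
    table: Dict[FlowKey, str] = field(default_factory=dict)

    def _emit(self, msg: SyncMessage) -> None:
        # Stand-in for the replication protocol: deliver to every standby.
        for node in self.standbys:
            node.receive(msg)

    def track(self, key: FlowKey, state: str) -> None:
        kind = "UPDATE" if key in self.table else "NEW"
        self.table[key] = state
        self._emit(SyncMessage(kind, key, state))

    def forget(self, key: FlowKey) -> None:
        if self.table.pop(key, None) is not None:
            self._emit(SyncMessage("DESTROY", key))


def demo() -> Tuple[Dict[FlowKey, str], Dict[FlowKey, str]]:
    """Run a few state changes and return (active table, standby table)."""
    standby = StandbyNode()
    active = ActiveNode(standbys=[standby])
    ssh = ("tcp", "192.168.1.128", 40000, "192.168.2.128", 22)
    web = ("tcp", "192.168.1.129", 40001, "192.168.2.128", 80)
    active.track(ssh, "SYN_SENT")
    active.track(ssh, "ESTABLISHED")
    active.track(web, "ESTABLISHED")
    active.forget(web)
    return active.table, standby.table
```

After calling demo(), both tables hold the same single entry, so a standby that takes over the virtual address already knows about the surviving connection.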
The developers recommend keepalived as the IP address takeover program, but any other can be used. The ct_sync module is currently only available for kernel 2.4, but once it reaches a stable state it can easily be ported to 2.6. It doesn't support the NAT helper modules (for FTP, IRC, etc.), since a mechanism for replicating expectations is rather difficult to implement, but this feature is high on the developers' todo list. The current version of ct_sync is available from the netfilter svn repository at http://svn.netfilter.org/netfilter/tags/netfilter-ha/ .

3. What is HALF?

High Availability Linux Firewall is a project that tries to bring this extraordinary failover system to the masses, mostly for testing purposes, but it can also be used in production environments for a certain subset of problems, where it behaves in a fairly stable way.

So what do you get with HALF?

- an information center at http://linux.wiz.ro/half , containing documentation, test results, case studies and more
- half-cd, a modified Debian woody installer that can be used to quickly set up a HALF node. Its most important features are a 2.4.26 kernel image with ct_sync support and a backported Debian package of keepalived 1.1.7 (keepalived is not available in woody), but there are other useful things: ext2, ext3, XFS and ReiserFS support, parted, vi and bash at install time, LVM and RAID utilities, and postfix as the default MTA. The half-cd is quite small, providing only the necessary tools for a firewall system.
- the HALF apt repository, containing the packages necessary for converting an existing woody firewall into a HALF node using apt, the dpkg frontend in Debian. You should use these apt sources in /etc/apt/sources.list:

  deb http://apt.wiz.ro/debian/ stable half
  deb-src http://apt.wiz.ro/debian/ stable half

HALF's primary purpose is to provide a solid and stable test base for ct_sync developers and testers, since building a failover setup based on ct_sync is quite a long and troublesome process.
I gained quite a lot of experience learning about it, testing it and talking with the developers, so I thought I should put this knowledge to good use, and started working on HALF.

4. Building HALF

The first thing I had to do was build a Debian package with the kernel image and modules, using a patched 2.4.26 source tree. The kernel .config and the script I made for patching the kernel source are available on the HALF website. Making the backport of keepalived meant changing some of the official maintainer's contributions, but I eventually got it running, and I also added some docs and examples for use with ct_sync to the package.

The longer part was modifying the woody installer. In short, it consisted of:

- unpacking the iso, unpacking the installer rootfs, and adding programs to the rootfs
- using my kernel for the installer, by changing the linux.bin, modules.tgz, config.gz, sys_map.gz and modcont archives/files
- changing the repository, by removing useless packages and adding HALF-specific packages and their dependencies
- rebuilding the iso

5. Replicating fire - a HALF cluster scenario

Having the HALF CD, 5 computers, 9 ethernet adapters, two switches and lots of patch cords handy, I thought I should give it a serious test. Here is the actual topology I used:

   192.168.1.128 [tom]        [jerry] 192.168.1.129
                      \      /
                      [switch]
                      /      \        virtual ip 192.168.1.254
                     /        \
           192.168.1.1        192.168.1.2
     [halfone]10.0.0.1   -    10.0.0.2[halftwo]
           192.168.2.1        192.168.2.2
                     \        /       virtual ip 192.168.2.254
                      \      /
                      [switch]
                          |
                      [gateway] 192.168.2.128

So I used a crossover cable for ct_sync replication, in the 10.0.0.0/24 network. 192.168.1.0/24 is the LAN and 192.168.2.128 is the gateway address. Keepalived provided the virtual addresses 192.168.1.254 and 192.168.2.254 for the active node. I used a stateful firewall configuration with NAT, and all the "-m state --state ESTABLISHED" connections were restored when the active node was brutally turned off.
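The address plan of the topology above can be written down and sanity-checked with a short script. This is only a sketch: the segment labels (lan, uplink, sync) are names chosen for illustration, and the /24 mask on the 192.168.2.x segment is an assumption, since the text only gives individual addresses there.

```python
import ipaddress

# Segments of the test cluster. Labels are hypothetical; the /24 mask on
# the uplink segment is an assumption, the other two are stated in the text.
NETWORKS = {
    "lan": ipaddress.ip_network("192.168.1.0/24"),
    "uplink": ipaddress.ip_network("192.168.2.0/24"),
    "sync": ipaddress.ip_network("10.0.0.0/24"),
}

# One entry per ethernet adapter, taken from the diagram.
HOSTS = {
    "tom": {"lan": "192.168.1.128"},
    "jerry": {"lan": "192.168.1.129"},
    "halfone": {"lan": "192.168.1.1", "sync": "10.0.0.1", "uplink": "192.168.2.1"},
    "halftwo": {"lan": "192.168.1.2", "sync": "10.0.0.2", "uplink": "192.168.2.2"},
    "gateway": {"uplink": "192.168.2.128"},
}

# Virtual addresses that keepalived moves to whichever node is active.
VIRTUAL_IPS = {"lan": "192.168.1.254", "uplink": "192.168.2.254"}


def check_plan():
    """Return a list of problems found in the address plan (empty if none)."""
    problems = []
    seen = {}
    for host, ifaces in HOSTS.items():
        for net, addr in ifaces.items():
            ip = ipaddress.ip_address(addr)
            if ip not in NETWORKS[net]:
                problems.append(f"{host}: {addr} is outside {NETWORKS[net]}")
            if ip in seen:
                problems.append(f"{addr} is used by both {seen[ip]} and {host}")
            seen[ip] = host
    for net, addr in VIRTUAL_IPS.items():
        ip = ipaddress.ip_address(addr)
        if ip not in NETWORKS[net]:
            problems.append(f"virtual ip {addr} is outside {NETWORKS[net]}")
        if ip in seen:
            problems.append(f"virtual ip {addr} collides with {seen[ip]}")
    return problems


def adapter_count():
    """Number of ethernet adapters implied by the plan."""
    return sum(len(ifaces) for ifaces in HOSTS.values())
```

Calling check_plan() reports no problems for this plan, and adapter_count() agrees with the 5 computers and 9 ethernet adapters mentioned in the text.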
The setup was pretty stable and I didn't notice any unusual problems, as long as I used only the supported features. The computers were all identical (AMD Duron 1400MHz CPU, 256MB DDR RAM, VIA chipset, EEPRO100 ethernet adapters), and I used two Planet switches.

6. The future of HALF

Although ct_sync is still in development at the moment, it will become very stable in the near future, and then HALF will be more than just a test base: it will be a convenient way to set up a professional redundant firewall system. Meanwhile, I'm working with the netfilter developers on bug testing and code audits.

7. Conclusions

High availability firewalls are really important, and unfortunately Linux didn't have stateful firewall failover support until now. HALF takes the ct_sync module one step further, making it easy for anyone who needs such a high performance firewall setup.