How to load balance TCP connections with HAProxy
This week I was at a client where we were doing some performance testing of the JBoss Enterprise Data Services product (EDS for short). EDS is based on the JBoss.org community project Teiid, a data virtualization system that allows applications to use data from multiple, heterogeneous data stores. It’s a really cool product if you have a lot of backend data sources and you want to expose a simplified virtual (SQL) database to your front-end applications. It runs within the JBoss application server, in our case the enterprise version, Enterprise Application Platform (EAP for short).
As part of the performance testing we wanted to run EDS within a cluster of JBoss EAP nodes. Clustering EAP nodes is fairly straightforward, and you can then set up a front-end load balancer with Apache httpd; in my case I used Enterprise Web Server (EWS), the Red Hat product based on the Apache web server, together with mod_cluster to cluster and load balance my application servers. This sort of clustering is fine if you want to replicate your applications and use distributed cache replication within the cluster. The question, however, was how to load balance the ODBC and JDBC connections that EDS provides for the application tier. As these types of connections can’t be load balanced by the Apache web server, we had to come up with another way to do this.
As we were at a very large enterprise client, their initial suggestion was that we use a hardware TCP load balancer. However, I was pretty sure this was a straightforward problem that must have been solved already and that a software-based solution must exist… and lo and behold there is one, and it is called HAProxy.
HAProxy is a really cool load balancer that is powerful, flexible and really easy to use, yet it seems that not too many people are aware that it can load balance ANY TCP connection. This blog post led me in the right direction, and in the spirit of helping all of you out there on the web, here is my very short how-to and sample configuration for load balancing ODBC and JDBC connections, specifically for Teiid embedded in the JBoss Enterprise Data Services product.
First you need to get HAProxy. In my case I was running everything on RHEL, so the easiest thing is to add the Fedora EPEL repository to your machine and then just run
yum install haproxy
You should now have haproxy installed on your machine. Now you just need to configure and launch it!
First, the configuration. I’ve included a sample configuration file for 4 nodes, as I had two physical machines with 2 nodes running on each of them. The first node on each machine was running on the default JBoss ports and the second was using ports 100 greater than the default. Here is my file:
You’ll notice that I also create a listener for stats on port 9090. This gives you a nifty web stats page that you can view at
http://<haproxyIP>:9090/haproxy?stats
(username/password = admin/admin)
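For reference, a configuration along the lines described above might look roughly like the sketch below. Treat it as illustrative rather than a drop-in file: the host names host1 and host2 are placeholders, the ports assume Teiid's default JDBC port (31000) and ODBC port (35432) with the second node on each machine offset by 100, and it assumes HAProxy runs on its own machine so that the listener ports do not clash with the nodes themselves.

```
# Sketch only: host names, ports and timeouts are assumptions, adjust to your setup.
global
    daemon
    maxconn 256

defaults
    mode tcp                      # plain TCP load balancing, no HTTP inspection
    timeout connect 5s
    timeout client  1h            # database connections tend to be long lived
    timeout server  1h

# JDBC clients connect to this port instead of to an individual EDS node
listen teiid-jdbc
    bind *:31000
    balance roundrobin
    server node1a host1:31000 check
    server node1b host1:31100 check
    server node2a host2:31000 check
    server node2b host2:31100 check

# ODBC clients connect to this port
listen teiid-odbc
    bind *:35432
    balance roundrobin
    server node1a host1:35432 check
    server node1b host1:35532 check
    server node2a host2:35432 check
    server node2b host2:35532 check

# Web stats page, protected with basic authentication (admin/admin)
listen stats
    bind *:9090
    mode http
    stats enable
    stats uri /haproxy?stats
    stats auth admin:admin
```

The `check` keyword on each server line asks HAProxy to probe the node with a TCP connection and take it out of rotation while it is unreachable, which is what makes the stats page show nodes as up or down.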
Now that we have HAProxy configured we just need to run it. In this case that would be
haproxy -f <config file> -V
I used the verbose flag so that I could see what was going on, but you may not want this in production.
Now just open the stats page to see your nodes connected, and you are up and running. Don’t forget to edit the ODBC and JDBC listener sections so that they listen on the ports appropriate to your own setup.
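If you prefer a quick check from the command line, something like the following sketch can confirm that HAProxy is accepting connections on each listener. The helper names, the default host and the three ports (31000, 35432 and 9090) are assumptions for illustration, so substitute whatever your own configuration listens on.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether HAProxy's listeners accept TCP connections.
# Host and ports below are assumptions; change them to match your configuration.

HAPROXY_HOST="${HAPROXY_HOST:-127.0.0.1}"

# Succeeds (exit status 0) if a TCP connection to host $1, port $2 can be opened.
port_open() {
    # /dev/tcp/<host>/<port> is a bash redirection feature, so run it in bash;
    # timeout gives up after 3 seconds if nothing answers.
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Prints "<host>:<port> open" or "<host>:<port> closed".
report_port() {
    if port_open "$1" "$2"; then
        echo "$1:$2 open"
    else
        echo "$1:$2 closed"
    fi
}

# Assumed listener ports: JDBC, ODBC and the stats page.
for port in 31000 35432 9090; do
    report_port "$HAPROXY_HOST" "$port"
done
```

Note that an open port only tells you that HAProxy itself is listening. Whether the EDS nodes behind it are healthy is what the stats page shows.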
Oh, and by the way, the performance testing we did showed that Teiid is amazingly performant!