Scaling with HAProxy and EC2 Autoscaling groups
AWS EC2 Autoscaling groups are a very powerful solution to get a pool of machines automatically scaled up or down depending on the resources needed. Whenever your application needs to handle an increasing number of requests, the autoscaling group fires up new instances to spread the load across more nodes. As traffic decreases, machines are terminated to avoid paying for unused resources, thereby optimizing infrastructure costs. A load balancer is then needed to dispatch the requests across all the available nodes. The EC2 Elastic Load Balancer (ELB) is used almost every time because:
- You don't have another infrastructure layer to manage, as the ELB works out of the box.
- It integrates with autoscaling groups without any configuration. You just target the group, and the ELB automatically spreads the load across its instances.
- It is cheap ($0.008 per GB of bandwidth used).
The ELB is great for most cases. But, in some situations, you may want to use another load-balancing solution. It might be because:
- You want more control over the way you manage traffic and your load-balancing configuration (the ELB is essentially a black box that you cannot really tweak).
- Depending on the kind of business you operate, your application may receive large, sudden spikes of traffic at certain times. Much like autoscaling groups, the ELB scales internally to accommodate the load. However, if the traffic ramp-up is too large or too fast, you will basically be DDoS-ing your own ELB, which likely has not yet scaled to handle that amount of traffic.
Enter HAProxy. According to its homepage:
HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications.
It has been a very popular solution for years thanks to its simplicity (it does one thing and does it very well), its performance, and its robustness. Virtually every big tech company out there uses it to balance the traffic load on their servers (Twitter, Airbnb and Reddit, to name a few).
In this post, we'll build an HAProxy load balancer on a classic EC2 Ubuntu instance, then use the AWS SDK with some scripting to automatically configure HAProxy with the instances present in our autoscaling group.
Setting up HAProxy on an EC2 instance
Since we’ll need to call the EC2 API from the instance to retrieve the current nodes to balance traffic across, we would normally have to store access keys and secrets on the load balancer to authenticate API calls. And if you use a configuration management tool like Ansible or Chef, you would also have to deal with encryption to keep those keys out of your configuration code.
Luckily for us, the AWS API offers a handy alternative to access keys: by attaching an IAM role to an instance, the instance is automatically authenticated under this role when calling the API using instance profile credentials, without us having to do anything. Sounds better, no?
So, let’s start by creating a role in the IAM dashboard and configuring it with EC2 read access.
- Enter a name for your role, for example HAProxyRole
- On the next page, select the Amazon EC2 role type in the AWS Service Role list.
- Choose to attach the AmazonEC2ReadOnlyAccess policy to the role, and finally confirm the role creation on the next page.
Now, any instance launched with this role attached will be able to authenticate API calls with EC2 read permissions.
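If you prefer the command line, a rough AWS CLI equivalent would look like this (note that for EC2, the role must also be wrapped in an instance profile, which the console creates for you behind the scenes):
$ aws iam create-role --role-name HAProxyRole \
    --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
$ aws iam attach-role-policy --role-name HAProxyRole \
    --policy-arn arn:aws:iam::aws:policy/AmazonEC2ReadOnlyAccess
$ aws iam create-instance-profile --instance-profile-name HAProxyRole
$ aws iam add-role-to-instance-profile --instance-profile-name HAProxyRole --role-name HAProxyRole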
The next step is to create and launch the instance. We’ll launch from the Ubuntu Server AMI; you can select any instance type for this example. In the Configure Instance Details section, attach the HAProxyRole role we just created to the new instance. The rest of the options are up to you, then you can launch the instance.
HAProxy has a very low memory footprint but high CPU usage at scale, so you’ll probably want to go for a C4 instance type for production use.
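Once the instance is up, you can check that the role is attached by querying the instance metadata service from the instance itself; this endpoint is exactly where the SDK's instance profile credentials come from:
$ curl http://169.254.169.254/latest/meta-data/iam/security-credentials/
HAProxyRole
$ curl http://169.254.169.254/latest/meta-data/iam/security-credentials/HAProxyRole
The second call returns the temporary credentials (access key, secret and token) as a JSON document.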
Tweaking system limits
This is not a required step, but you’ll probably want your HAProxy load balancer to handle as many connections as the instance resources allow.
By default, Linux limits the number of file descriptors that can be open at the same time. Since TCP sockets are treated the same way as regular files, this limit can throttle the number of simultaneous connections handled by HAProxy. Let’s bump it up a bit.
SSH into the freshly created instance and edit the file /etc/sysctl.conf:
fs.file-max = 10000000
fs.nr_open = 10000000
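These kernel settings only take effect at boot, so apply them to the running system right away:
$ sudo sysctl -p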
We also have to update the file /etc/security/limits.conf:
* soft nofile 10000000
* hard nofile 10000000
root soft nofile 10000000
root hard nofile 10000000
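The limits apply to new login sessions, so log out and back in, then check the soft limit of your session (it should print the value we just set):
$ ulimit -n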
If you’re interested in further tweaking of the system to maximize the load that your LB can handle, this post gives more details about how to achieve this.
HAProxy setup
Add the apt PPA repository to install HAProxy 1.6:
$ sudo add-apt-repository ppa:vbernat/haproxy-1.6
$ sudo apt-get update
$ sudo apt-get install -y haproxy
Edit the file /etc/default/haproxy to enable the HAProxy service daemon:
ENABLED=1
Then start HAProxy:
$ sudo service haproxy start
While not required, you might want to install the hatop utility to monitor your HAProxy load balancer.
$ sudo apt-get install hatop
We will have a default “template” configuration file for HAProxy, /etc/haproxy/haproxy.cfg.template; the only missing part is the backend nodes. Since we’ll dynamically retrieve the current autoscaled instances from the EC2 API, we’ll generate the final configuration file /etc/haproxy/haproxy.cfg from the template, appending all the node IP addresses.
# /etc/haproxy/haproxy.cfg.template
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /tmp/haproxy
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
    node lb
    nbproc 1
    maxconn 2000000

defaults
    log global
    mode http
    option forwardfor
    option http-server-close
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http
    retries 3
    option dontlog-normal

frontend mydomain.com
    bind *:80
    default_backend autoscaling_group
    maxconn 2000000

backend autoscaling_group
    balance roundrobin
    # autoscaling group instances will be
    # dynamically added below
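For reference, once the script described in the next section has run, the generated /etc/haproxy/haproxy.cfg will end with one server line per instance. With two instances it would look like this (the DNS names and IPs are made up):
backend autoscaling_group
    balance roundrobin
    # autoscaling group instances will be
    # dynamically added below
    server ec2-52-16-0-1.eu-west-1.compute.amazonaws.com 10.0.0.1 check port 80
    server ec2-52-16-0-2.eu-west-1.compute.amazonaws.com 10.0.0.2 check port 80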
Retrieving Autoscaling group instances
We’re now going to do some scripting to retrieve the autoscaling group instances from the AWS API. The script will run, let’s say, every 3 minutes to refresh the list of backend servers. You can use any language for this, but for the rest of this tutorial I’ll go with Ruby.
We’ll need to install Ruby, some other libraries and the aws-sdk gem to interact with the AWS API:
$ sudo add-apt-repository ppa:brightbox/ruby-ng
$ sudo apt-get update
$ sudo apt-get install -y software-properties-common ruby2.3 ruby2.3-dev zlib1g-dev libxml2-dev build-essential libpcre3 libpcre3-dev
$ sudo gem install aws-sdk
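One note: the script below uses the version 2 interface of the aws-sdk gem (Aws::InstanceProfileCredentials, Aws::AutoScaling::Client). If gem install pulls a different major version for you, pin it explicitly:
$ sudo gem install aws-sdk -v '~> 2'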
# /usr/bin/haproxy-autoscaling-update.rb
require 'aws-sdk'
require 'fileutils'

# We use instance profile credentials to authenticate
# using the role attached to the instance
region = "eu-west-1"
auto_scaling_group = "app"
credentials = Aws::InstanceProfileCredentials.new
Aws.config.update(credentials: credentials)
Aws.config.update(region: region)

autoscaling = Aws::AutoScaling::Client.new(region: region)
ec2 = Aws::EC2::Client.new(region: region)

# Retrieve current autoscaling group instances
response = autoscaling.describe_auto_scaling_groups(auto_scaling_group_names: [auto_scaling_group])
instances = response.auto_scaling_groups.first.instances

hosts = []
instances.each do |instance|
  next unless instance.lifecycle_state == "InService"
  # We cannot access the private IP address of the
  # instance using the Autoscaling API, so we have to
  # retrieve the instance object from the EC2 API.
  ec2_instance = ec2.describe_instances(instance_ids: [instance.instance_id]).reservations.first.instances.first
  if ec2_instance.state.name == "running"
    hosts << {ip: ec2_instance.private_ip_address, public_name: ec2_instance.public_dns_name}
  end
end

# Copy the template config to the final config file
# and append one server line per host to the backend
FileUtils.cp("/etc/haproxy/haproxy.cfg.template", "/etc/haproxy/haproxy.cfg")
open("/etc/haproxy/haproxy.cfg", "a") do |f|
  hosts.each do |host|
    f << "\tserver #{host[:public_name]} #{host[:ip]} check port 80\n"
  end
end

# Reload HAProxy with a system command
stdout = `service haproxy reload`
puts " -> reloaded HAProxy: #{stdout}"
Add the script to your crontab so it runs every 3 minutes (you might use a different schedule, depending on your application or infrastructure):
*/3 * * * * sudo ruby /usr/bin/haproxy-autoscaling-update.rb
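Since the command runs as root anyway, the simplest option is to put the line in root's crontab (the sudo prefix then becomes redundant, but harmless):
$ sudo crontab -e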
Once the script has run, use sudo hatop -s /tmp/haproxy to monitor HAProxy stats. If you have instances running in your autoscaling group, you should see them listed as backend servers.
Conclusion
We’re done! This is a working solution, but it could be improved in many ways:
- We could keep a record of the last loaded list of instances, so we update the HAProxy configuration and reload it only when the list has changed since the last reload (see the sketch after this list).
- If your LB continuously receives a large amount of traffic, you’re probably dropping some connections during the short window in which HAProxy reloads its configuration. There are ways around this; both Yelp’s engineers and the GitHub engineering blog have written very detailed posts on the matter.
- The basic HAProxy configuration in this post only handles HTTP traffic on port 80, but you probably want your application traffic over HTTPS. You can implement SSL termination on the HAProxy LB instead of the web servers, then dispatch requests to the backends on port 80 (a sketch follows).
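To illustrate the first point, here is a minimal sketch of the idea in Ruby, reusing the hosts array built in the script above (the state file path is arbitrary):
# Sketch: only rewrite the config and reload HAProxy
# when the host list actually changed.
require 'digest'

state_file = "/var/run/haproxy-autoscaling.md5"
digest = Digest::MD5.hexdigest(hosts.map { |h| h[:ip] }.sort.join(","))

if File.exist?(state_file) && File.read(state_file) == digest
  puts " -> host list unchanged, skipping reload"
else
  File.write(state_file, digest)
  # ... regenerate /etc/haproxy/haproxy.cfg and reload as before
end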
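And for the last point, a minimal sketch of what SSL termination could look like in the frontend, assuming a PEM bundle (certificate and private key concatenated) stored at a hypothetical path:
frontend mydomain.com
    # terminate TLS on the load balancer
    bind *:443 ssl crt /etc/haproxy/certs/mydomain.pem
    # keep port 80 open and redirect plain HTTP to HTTPS
    bind *:80
    redirect scheme https code 301 if !{ ssl_fc }
    default_backend autoscaling_group
    maxconn 2000000
The backend servers keep receiving plain HTTP on port 80, so the script and the backend section stay unchanged.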