AWS EC2 Autoscaling groups are a powerful way to get a pool of machines that scales up or down automatically depending on the resources needed. Whenever your application needs to handle an increasing number of requests, the autoscaling group fires up new instances to spread the load across more nodes. As traffic decreases, machines are terminated to avoid paying for unused resources, thereby optimizing infrastructure costs. A load balancer is then needed to dispatch requests across all the available nodes. The EC2 Elastic Load Balancer (ELB) is the usual choice because:

  • You don't have another infrastructure layer to manage, as the ELB works out of the box.
  • It integrates with autoscaling groups without any configuration. You just have to target the group, and the ELB will automatically spread the load across its instances.
  • It is cheap ($0.008 per GB of bandwidth used).

The ELB is great for most cases. In some situations, though, you may want to use another load-balancing solution, because:

  • You want more control over the way you manage traffic and over your load-balancing configuration (the ELB is essentially a black box that you cannot really tweak).
  • Depending on the kind of business you operate, your application may receive large and sudden traffic spikes at certain times. Much like an autoscaling group, the ELB scales internally to accommodate the load. However, if the traffic ramp-up is too steep or too fast, you will basically be DDoS-ing your own ELB (which has likely not yet scaled enough to absorb this amount of traffic).

Enter HAProxy. According to its website:

HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications.

It has been a very popular solution for years thanks to its simplicity (it does one thing and does it very well), its performance, and its robustness. Virtually every big tech company out there uses it to balance traffic across their servers (Twitter, Airbnb, and Reddit, to name a few).

In this post, we'll build an HAProxy load balancer on a classic EC2 Ubuntu instance, then use the AWS SDK with some scripting to automatically configure HAProxy with the instances present in our autoscaling group.

Setting up HAProxy on an EC2 instance

Since we’ll need to call the EC2 API from the instance to retrieve the current nodes to balance traffic across, we would normally have to store access keys and secrets on the load balancer to authenticate API calls. And if you use a configuration management tool like Ansible or Chef, you would also have to deal with encryption to keep the keys from being readable in your configuration code.

Luckily for us, the AWS API offers a handy alternative to access keys: by attaching an IAM role to an instance, the instance can authenticate API calls under this role using instance profile credentials, without us having to do anything. Sounds better, no?
So, let’s start by creating a role in the IAM dashboard and configuring it with EC2 read access.

  • Enter a name for your role, for example HAProxyRole.
  • On the next page, select the Amazon EC2 role type in the AWS Service Roles list.
  • Attach the AmazonEC2ReadOnlyAccess policy to the role, and finally confirm the role creation on the next page.

Now, any instance launched with this role attached will be able to authenticate API calls with EC2 read permissions.

The next step is to create and launch the instance. We’ll launch from the Ubuntu Server AMI. You can select any instance type for this example. In the Configure Instance Details section, attach the HAProxyRole role we just created to the new instance. The rest of the options are up to you; then you can launch the instance.

HAProxy has a very low memory footprint but high CPU usage at scale, so you’ll probably want to go with a C4 (compute-optimized) instance type for production use.
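Once you can SSH into the instance, a quick way to confirm that the role was attached is to query the instance metadata service, which serves the temporary credentials the SDK will later pick up (the first call should print the role name we chose above):

$ curl http://169.254.169.254/latest/meta-data/iam/security-credentials/
HAProxyRole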

Tweaking system limits

This is not a required step, but you’ll probably want your HAProxy load balancer to handle as many connections as the instance resources allow.

By default, Linux limits the number of file descriptors that can be open at the same time. Since TCP sockets are treated the same way as regular files, this limit can throttle the number of simultaneous connections HAProxy can handle. Let’s bump it up a bit.

SSH into the freshly created instance and add the following to /etc/sysctl.conf:

fs.file-max = 10000000  
fs.nr_open = 10000000  
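
Apply the new kernel settings without rebooting:

$ sudo sysctl -p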

We also have to update the file /etc/security/limits.conf:

* soft nofile 10000000
* hard nofile 10000000
root soft nofile 10000000  
root hard nofile 10000000  
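
These limits only apply to new login sessions, so log out and back in, then check that the new limit is in effect:

$ ulimit -n
10000000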

If you’re interested in further tweaking of the system to maximize the load that your LB can handle, this post gives more details about how to achieve this.

HAProxy setup

Add the apt PPA repository to install HAProxy 1.6:

$ sudo add-apt-repository ppa:vbernat/haproxy-1.6
$ sudo apt-get update
$ sudo apt-get install -y haproxy

Edit the file /etc/default/haproxy to enable the HAProxy service daemon:

ENABLED=1  

Then start HAProxy with:

sudo service haproxy start  
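You can check that the service is up and running:

$ sudo service haproxy status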

While not required, you might want to install the hatop utility to monitor your HAProxy load balancer.

$ sudo apt-get install hatop

We will keep a default “template” configuration file for HAProxy, /etc/haproxy/haproxy.cfg.template, in which the only missing part is the backend nodes. Since we’ll dynamically retrieve the current autoscaled instances from the EC2 API, we’ll generate the final configuration file /etc/haproxy/haproxy.cfg from the template, appending all the node IP addresses.

# /etc/haproxy/haproxy.cfg.template

global  
      log /dev/log    local0
      log /dev/log    local1 notice
      chroot /var/lib/haproxy
      stats socket /tmp/haproxy
      stats timeout 30s
      user haproxy
      group haproxy
      daemon

      node lb
      nbproc 1
      maxconn 2000000

defaults  
      log     global
      mode    http
      option forwardfor
      option http-server-close

      timeout connect 5000
      timeout client  50000
      timeout server  50000
      errorfile 400 /etc/haproxy/errors/400.http
      errorfile 403 /etc/haproxy/errors/403.http
      errorfile 408 /etc/haproxy/errors/408.http
      errorfile 500 /etc/haproxy/errors/500.http
      errorfile 502 /etc/haproxy/errors/502.http
      errorfile 503 /etc/haproxy/errors/503.http
      errorfile 504 /etc/haproxy/errors/504.http

      retries 3
      option  dontlog-normal

frontend mydomain.com  
      bind *:80
      default_backend autoscaling_group
      maxconn 2000000

backend autoscaling_group  
      balance roundrobin

      # autoscaling group instances will be
      # dynamically added below
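
Once the update script (shown in the next section) has run, the generated /etc/haproxy/haproxy.cfg will end with one server line per in-service instance, appended after that comment, looking something like this (the hostnames and IP addresses here are hypothetical):

      server ec2-52-16-1-10.eu-west-1.compute.amazonaws.com 10.0.1.10:80 check
      server ec2-52-16-1-11.eu-west-1.compute.amazonaws.com 10.0.1.11:80 check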

Retrieving Autoscaling group instances

We’re now going to do some scripting to fetch the autoscaling group instances from the AWS API. The script will run periodically, say every 3 minutes, to refresh the list of backend servers. You can use any language for this; for the rest of this tutorial I’ll go with Ruby.

We’ll need to install Ruby, some other libraries and the aws-sdk gem to interact with the AWS API:

$ sudo add-apt-repository ppa:brightbox/ruby-ng
$ sudo apt-get update
$ sudo apt-get install -y software-properties-common ruby2.3 ruby2.3-dev zlib1g-dev libxml2-dev build-essential libpcre3 libpcre3-dev

$ sudo gem install aws-sdk

Now create the update script, /usr/bin/haproxy-autoscaling-update.rb:

# /usr/bin/haproxy-autoscaling-update.rb

require 'aws-sdk'
require 'fileutils'

# We use instance profile credentials to authenticate
# using the role attached to the instance
region = "eu-west-1"  
auto_scaling_group = "app"  
credentials = Aws::InstanceProfileCredentials.new

Aws.config.update(credentials: credentials)  
Aws.config.update(region: region)

autoscaling = Aws::AutoScaling::Client.new(region: region)  
ec2 = Aws::EC2::Client.new(region: region)

# Retrieve current autoscaling group instances
response = autoscaling.describe_auto_scaling_groups(auto_scaling_group_names: [auto_scaling_group])  
instances = response.auto_scaling_groups.first.instances

hosts = []
instances.each do |instance|
  next unless instance.lifecycle_state == "InService"

  # We cannot access the private IP address of the
  # instance using the Autoscaling API, so we have to
  # retrieve the instance object from the EC2 API.
  ec2_instance = ec2.describe_instances(instance_ids: [instance.instance_id]).reservations.first.instances.first
  if ec2_instance.state.name == "running"
    hosts << {ip: ec2_instance.private_ip_address, public_name: ec2_instance.public_dns_name}
  end
end

# Copy template config to the config file
# and append hosts to backend configuration
FileUtils.cp("/etc/haproxy/haproxy.cfg.template", "/etc/haproxy/haproxy.cfg")  
open("/etc/haproxy/haproxy.cfg", "a") do |f|  
  hosts.each do |host|
    f << "\tserver #{host[:public_name]} #{host[:ip]} check port 80\n"
  end
end

# Reload HAProxy with system command
stdout = `service haproxy reload`  
puts " -> reloaded HAProxy: #{stdout}"  

Add the script to your crontab so it runs every 3 minutes (you might use a different schedule, depending on your application and infrastructure):

*/3 * * * * sudo ruby /usr/bin/haproxy-autoscaling-update.rb

After the script has run, run sudo hatop -s /tmp/haproxy to monitor HAProxy stats. If you have instances running in your autoscaling group, you should see them listed as backend servers.

Conclusion

We’re done! This is a working solution, but it could be improved in many ways:

  • We could keep a record of the last loaded list of instances, so we update the HAProxy configuration and reload it only when the list has changed since the last reload (a minimal sketch of this idea follows this list).
  • If your LB continuously receives a significant amount of traffic, you’re probably dropping some connections during the short time window HAProxy needs to reload its configuration. There are ways around this; Yelp engineers wrote a very detailed post on the matter, as did the GitHub engineering blog.
  • The basic HAProxy configuration in this post only handles HTTP traffic on port 80, but you probably want your application traffic served over HTTPS. You can implement SSL termination on the HAProxy LB instead of on the web servers, then dispatch requests to the backends over port 80 (see the bind example after this list).
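
Here is a minimal sketch of the first improvement, assuming we persist a fingerprint of the host list to a state file (/var/run/haproxy-autoscaling.state is a hypothetical path) so the update script can skip the rewrite and reload when nothing has changed:

require 'digest'

STATE_FILE = "/var/run/haproxy-autoscaling.state"

# Build a stable fingerprint of the current host list;
# sorting makes it insensitive to instance ordering.
def hosts_fingerprint(hosts)
  Digest::SHA256.hexdigest(hosts.map { |h| "#{h[:public_name]}=#{h[:ip]}" }.sort.join("\n"))
end

# Returns true (and records the new state) only when the host
# list differs from the one seen on the previous run.
def hosts_changed?(hosts)
  fingerprint = hosts_fingerprint(hosts)
  previous = File.exist?(STATE_FILE) ? File.read(STATE_FILE) : nil
  return false if fingerprint == previous
  File.write(STATE_FILE, fingerprint)
  true
end

# In the update script, the config generation and reload would
# then be wrapped in:
#   if hosts_changed?(hosts)
#     # ...copy template, append server lines, reload HAProxy...
#   end

And as a sketch of the last point, HAProxy can terminate SSL directly by pointing the bind line at a PEM file containing the certificate and its private key (the path below is hypothetical), while still dispatching plain HTTP to the backends:

frontend mydomain.com
      bind *:443 ssl crt /etc/haproxy/certs/mydomain.pem
      default_backend autoscaling_group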