This role installs haproxy from the official repository http://haproxy.debian.net.

Important

This role assumes that haproxy will always serve https.

This role currently doesn't handle the management of https certificates and private keys. HAproxy looks for files in /usr/local/etc/tls/haproxy: each file there must contain the private key, the certificate and the full chain (yes, everything in one file!).
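Building such a combined file is a simple concatenation. A minimal sketch, using hypothetical filenames and placeholder contents (your real key, certificate and chain come from your CA), would be:

```shell
# Sketch only: filenames and contents are hypothetical placeholders.
# The role expects one combined PEM per site in /usr/local/etc/tls/haproxy.
mkdir -p /tmp/tls-demo
printf -- '-----KEY-----\n'   > /tmp/tls-demo/example.com.key
printf -- '-----CERT-----\n'  > /tmp/tls-demo/example.com.crt
printf -- '-----CHAIN-----\n' > /tmp/tls-demo/example.com.chain
# Concatenate private key, certificate and full chain into a single file:
cat /tmp/tls-demo/example.com.key \
    /tmp/tls-demo/example.com.crt \
    /tmp/tls-demo/example.com.chain \
    > /tmp/tls-demo/example.com.pem
# The combined file now contains key, cert and chain, in that order.
cat /tmp/tls-demo/example.com.pem
```

The order (key first, then certificate, then chain) matters less to haproxy than having everything in one file, but keeping it consistent makes the files easier to audit.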

HAproxy will automatically answer https requests with SNI with the correct certificate.

Mandatory variables

This role uses objects to define configuration parameters.

The haproxy version is mandatory, but it should already be defined in group_vars/all/software_versions, so except in very specific cases (like testing a new version), you don't need to override it:

haproxy_version: "2.8"

For the backends, you can define several of them this way:

haproxy_backend:
  - name: "identity-test"
    balance: "roundrobin" # this is the default and can be omitted
    check: "check inter 2s fastinter 2s downinter 2s" # default is "check"
    server:
      - name: "id-test-1" # if undefined, takes the value of "fqdn"
        fqdn: "identity-test-node-1.cosium.com" # if undefined, takes the value of "name"
        port: "8080"
      - name: "id-test-2"
        fqdn: "identity-test-node-2.cosium.com"
        port: "8080"
        proto: "h2"
        options: "string containing the options for this server, this is optional"

Unfortunately, this role currently cannot find out which certificates are active, and thus which ones should be monitored by zabbix, so you must list the https websites in this variable:

haproxy_https_monitoring:
  - identity.cosium.com

TLS profiles

The TLS configuration is generated with https://ssl-config.mozilla.org/#server=haproxy&version=2.8.

The default profile is "intermediate" (which supports TLS 1.2+) but you can switch it to "modern" (which supports TLS 1.3+) via this variable:

haproxy_tls_profile: "modern"

Optional variables

You can change the default backend of the frontend:

haproxy_frontend:
  default_backend: "error404"

This role sets the default maximum number of connections to 20000 (the default in vanilla haproxy is 500). You can adjust this with this variable:

haproxy_maxconn: 20000

You can also adjust the timeout values of haproxy.

The defaults are:

haproxy_timeout_connect: "5s"
haproxy_timeout_client: "50s"
haproxy_timeout_server: "50s"

From haproxy documentation:

In TCP mode (and to a lesser extent, in HTTP mode), it is highly recommended that the client timeout remains equal to the server timeout in order to avoid complex situations to debug.

You can serve a robots.txt that disallows everything on all frontends via this variable:

haproxy_robotstxt: true

When set to true, the URL /robots.txt will return:

User-agent: *
Disallow: /

This is useful when backends should not be indexed.

You can also use the robots.txt backend only in specific cases; to do so, just reference the robots_txt acl. Example:

acl something hdr(host) something.example.org
use_backend robotstxt if is_robots_txt something

The default acl robotstxt is in the standard frontend.

You can define several user lists to protect pages with authentication (basic_auth):

haproxy_userlist:
  mailcatcher:
    - bolle_mailcatcher
    - user2

In this example, passwords are automatically generated by the role and added to Hashicorp Vault. If you wish, you can define them in advance, respecting this naming scheme:

haproxy_basicauth_%USERNAME%_password # replace %USERNAME% with the username you've defined
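For instance, for the bolle_mailcatcher user from the userlist example above, the pre-defined variable could look like this (the value is purely illustrative; in practice it lives in Vault):

```yaml
# Hypothetical value: replace with your own secret (or let the role generate it)
haproxy_basicauth_bolle_mailcatcher_password: "change-me"
```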

Information: the password is added to the haproxy configuration in clear text to avoid the problem described at http://docs.haproxy.org/2.9/configuration.html#3.4-user:

Attention: Be aware that using encrypted passwords might cause significantly increased CPU usage, depending on the number of requests, and the algorithm used.
For any of the hashed variants, the password for each request must be processed through the chosen algorithm, before it can be compared to the value specified in the config file.
Most current algorithms are deliberately designed to be expensive to compute to achieve resistance against brute force attacks. They do not simply salt/hash the clear text password once, but thousands of times.
This can quickly become a major factor in HAProxy's overall CPU consumption!

Example haproxy configuration using this userlist:

haproxy_frontend_raw_config: |
  acl mailcatcher.bollebrands.com	hdr(host)		-i mailcatcher.bollebrands.com
  http-request auth				if mailcatcher.bollebrands.com !{ http_auth(mailcatcher) } !acme-challenge
  use_backend mailcatcher.bollebrands.com			if mailcatcher.bollebrands.com { http_auth_group(mailcatcher) }

Frontends

By default, this role creates a frontend named "https" which has the following default configuration:

frontend https
	filter compression
	compression algo gzip
	compression type text/html text/plain text/xml text/css text/csv text/rtf text/richtext application/x-javascript application/javascript application/ecmascript application/rss+xml application/xml application/json application/wasm
	mode	http
	bind	:443,:::443 v6only ssl crt /usr/local/etc/tls/haproxy alpn h2,http/1.1
	bind	:80,:::80 v6only
	http-request set-header X-Forwarded-Proto https if { ssl_fc }
	redirect scheme https code 301 if !{ ssl_fc }
	option forwardfor
	# block access to any git paths
	acl git path,url_dec -m sub /.git
	use_backend error404 if git
	# block access to paths beginning with "/manager" except from 10.0.0.0/8
	acl internal_network src 10.0.0.0/8
	acl manager path,url_dec -m beg /manager
	use_backend error404 if manager !internal_network
	# collapse multiple consecutive slashes into a single slash
	acl has_multiple_slash path_reg /{2,}
	http-request set-path %[path,regsub(/+,/,g)] if has_multiple_slash

You can override the "bind" lines with this list:

haproxy_frontend:
  bind_list:
    - "127.0.0.1:443 ssl crt /usr/local/etc/tls/haproxy alpn h2,http/1.1"
    - "127.0.0.1:80"

You can add a raw configuration to the default frontend with this variable:

haproxy_frontend_raw_config: |
  acl admin path,url_dec -m beg /auth/admin
  use_backend error404 if admin !internal_network

You can deactivate the default frontend with this variable:

haproxy_default_frontend: false

You can also define any number of custom frontends with this object:

haproxy_frontend_list:
  - name: "something"
    mode: "http/tcp"
    bind_list:
      - "*:389"
      - "1.1.1.1:80"
    config: |
      free field to define the config of the frontend

This allows full control over custom frontends for haproxy.

letsencrypt automatic certificate generation

/!\ Let's Encrypt automatic certificate generation can only be used on a single-node cluster (no keepalived).

For this to work correctly, you need to have all domains in the haproxy_https_monitoring variable. Each domain has its own certificate; alternative names are not supported.

To activate it, set this variable:

haproxy_letsencrypt: true

During certificate generation and renewal, an http server is created on port 8888 to handle the challenge. The server is created via a simple python command line and is only active during Let's Encrypt operations.
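The role doesn't specify the exact command it runs, but a minimal stand-in for such a temporary challenge server (assuming a hypothetical webroot at /tmp/acme-demo) could look like:

```shell
# Hypothetical sketch: serve ACME http-01 challenge files on port 8888.
mkdir -p /tmp/acme-demo/.well-known/acme-challenge
echo "token-contents" > /tmp/acme-demo/.well-known/acme-challenge/demo-token
# Python 3.7+ one-liner static file server, started only for the challenge:
python3 -m http.server 8888 --directory /tmp/acme-demo &
SRV_PID=$!
sleep 1
# The CA would fetch the token over plain http during validation:
curl -s http://127.0.0.1:8888/.well-known/acme-challenge/demo-token
kill "$SRV_PID"
```

The point of such a throwaway server is that nothing needs to stay running (or listening on port 80/443) outside of certificate operations.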

Coraza WAF installation

Enable coraza WAF like this:

haproxy_coraza: true

If the haproxy_waf_sample_percent variable is defined, Coraza will be enabled in the default frontend. However, if waf_sample_percent is defined within the haproxy_frontend_list, Coraza will be enabled in each frontend where waf_sample_percent is explicitly set.
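For example, assuming waf_sample_percent holds the percentage of requests to inspect (the exact semantics are not documented here), enabling Coraza on a sample of default-frontend traffic might look like:

```yaml
haproxy_coraza: true
haproxy_waf_sample_percent: 10 # hypothetical value: inspect 10% of requests
```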

IIS specific headers for https

The Front-End-Https: On header is the IIS equivalent of X-Forwarded-Proto: https. To activate it, set this variable to true:

haproxy_iis: true

Issues with compression

Some MIME types are problematic when compressed, so compression is disabled for them:

application/hal+json
application/prs.hal-forms+json

See the following tickets for more information:

haproxy and journald

In the systemd file for haproxy, the following line was added:

BindReadOnlyPaths=/dev/log:/var/lib/haproxy/dev/log

This line gives haproxy the ability to send its logs to journald. While this looks like a good idea, it is not.

With this line, the logs are duplicated between /var/log/haproxy.log and journald. In production, this means a 40x (!!!) increase in the amount of disk writes.

With this line: 400KB/s; without: 10KB/s.

This is crazy... and remember that these are duplicate logs that we don't even use, since filebeat reads /var/log/haproxy.log and ignores journald. This also shows the poor optimisation of journald versus simple log files, but that is another story.

Anyway, this role removes this line from the service file for all those reasons.

haproxy documentation

Official documentation can be found at https://www.haproxy.org/download/2.8/doc/configuration.txt (change the version number for the latest if needed).

An important part that we often look for is the one detailing the "Session state at disconnection", which is essential for debugging connectivity issues. Search for "8.5. Session state at disconnection" in the doc to find it immediately.