16/01/2025

Load balancing the Kubernetes Control Plane

Highly available control plane components for the simple folk

kubernetes networking homelab

When creating a Kubernetes cluster, one of the first things the documentation recommends is load balancing the control plane.
Since the official guide doesn't cover how to do so, let's look at a simple yet very effective method that anyone can implement in their own network without special hardware.

Optional ramble: true load balancing and other psyops

This is an optional (yet informative) section about the nuances of load balancing. If you're here just for the practical part, or you already know what I'm talking about, feel free to skip to the next section.
There are two types of load balancing:

active/passive load balancing
active/active load balancing

The first one, which is the topology we are going to implement, is unfortunately not what you could call "true" load balancing: in this scenario only one of the available load balancers (the active one) is actually routing traffic at any given time, whereas the others (the passive ones) are on standby, waiting for the active load balancer to fail. Once such a failure happens, one of the passive load balancers is elected as the new active one.
We will go into detail about how all of this works in a bit.
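As a rough sketch of the election idea (not how keepalived is actually implemented): every load balancer has a priority, and among the ones that are still alive, the one with the highest priority becomes the active one. The `elect` function and the `lb1`/`lb2` names below are made up for illustration.

```shell
#!/bin/sh
# elect: reads "name priority" lines (one per load balancer that is still
# alive) on stdin and prints the name with the highest priority.
elect() {
    sort -k2,2nr | head -n 1 | awk '{print $1}'
}

# Both load balancers are alive: the one with the higher priority is active.
printf 'lb1 110\nlb2 100\n' | elect

# lb1 is gone: the remaining load balancer takes over.
printf 'lb2 100\n' | elect
```

The first call prints `lb1`, the second one prints `lb2`.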

In an active/active scenario, multiple load balancers work in parallel to achieve true load balancing, meaning that traffic can be routed by many load balancers at the same time. This is generally achieved through what is called an anycast setup, usually built on BGP. In this scenario there is no single point of failure: if one load balancer fails, the others keep routing traffic and there is no actual downtime.
BGP is very clever because, in very simple terms, it "enables multiple hosts to share the same IP" (it's an oversimplification, but you get the point) and advertise it from different locations at the same time. It's essentially the protocol that makes the modern internet work. Have you ever wondered how CDNs can serve traffic from multiple locations all over the world? Well, that's essentially BGP!
Take a look at it yourself and try to dig google.com:

$ dig google.com
; <<>> DiG 9.18.30 <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46619
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;google.com.   IN A

;; ANSWER SECTION:
google.com.  93 IN A 142.250.180.174 <-- THIS LINE HERE!

;; Query time: 21 msec
;; SERVER: 127.0.0.53#53(127.0.0.53) (UDP)
;; WHEN: Thu Jan 16 15:43:39 CET 2025
;; MSG SIZE  rcvd: 55

How can Google answer with just one IP for all the traffic it handles? Shouldn't they have more than one? No! That's because it is an anycast IP: it gets advertised from many places all over the world, and BGP-enabled routers route traffic to the nearest "advertising point", reducing latency and optimizing resources.
The same concept can be applied in your network or, much more realistically, inside a datacenter, where there are multiple routers and multiple load balancers doing STUFF no one knows about, using various software and/or special routers, but we are not here for that (I'll probably make a post in the future about BGP tho).
If you're interested, there's a very good Cloudflare blog post about the matter.

`keepalived` and NGINX to the rescue

Let's go back to our homelab, where we don't have BGP-enabled routers because we don't want to, because we are simple people and because we just want our damn Kubernetes cluster, on which we definitely spent too much time, to be reachable if one control plane node goes down.

Before proceeding, what would happen if we didn't load balance our control plane?
Let's imagine we had three control plane nodes and our ~/.kube/config file pointed at just one of them:

# example kubeconfig file 
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: ...
    server: https://11.11.11.1:6443
  name: homelab
# ... more stuff

That would correspond to this scenario:

[Figure: kubectl with one control plane]

If that node went down, the cluster wouldn't be reachable anymore, and we would have to manually change the endpoint in our ~/.kube/config file!

What we want to achieve is the following setup:

[Figure: load balancer setup]

We essentially want clients to be able to reach our control plane components (i.e. the kube-apiserver, to which all kubectl commands are sent) using a single IP (in this case 11.11.11.253) that automatically represents all control plane nodes.
The connections will be load balanced by two NGINX instances that share this IP through keepalived. Having only one load balancer would just move the problem, since the cluster would become unreachable whenever it went offline. With two of them there is no single point of failure on the load balancers, only a little downtime while the backup takes over (that's the price of our active/passive setup).

Even if it doesn't seem so, building this infrastructure is actually quite simple and painless: we will set up NGINX to forward connections to our control plane and keepalived to handle the active/passive failover.

The configuration

In this scenario we have two load balancers so, naturally, this setup has to be done twice.
You can install NGINX and keepalived following the official guides: NGINX, Keepalived.
Pick the load balancer you want to be the main one and create a keepalived.conf file inside /etc/keepalived with the following content:

! main configuration (this is a comment)
global_defs {
   ! name of the load balancer, doesn't really matter what you choose
   router_id nginx_lb
}

vrrp_script check_nginx {
    ! the script that will be used to check if the nginx process
    ! on this machine is still alive
    script "/bin/check_nginx.sh"
    interval 2
    weight 50
}

vrrp_instance VI_1 {
    ! MASTER (ugh) = main node
    state MASTER
    ! which interface should keepalived listen on
    interface eth0
    ! whatever number you want, it has to be the same for each "pool"
    ! of load balancers that you want to share the same IP on
    virtual_router_id 51
    ! the main node should have higher priority than secondary ones
    priority 110
    ! advertisement interval in seconds
    advert_int 1
    ! the ip you want to listen for
    virtual_ipaddress {
     11.11.11.253/24
    }
    track_script {
      ! the script we defined earlier
      check_nginx
    }
}

More details on configuration parameters here.
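The /bin/check_nginx.sh script (which must be executable) should look like this:

#!/bin/sh
# stop keepalived, releasing the floating IP, when no nginx process is running
if [ -z "$(/bin/pidof nginx)" ]; then
  systemctl stop keepalived.service
  exit 1
fi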

This means: "if /bin/pidof nginx (which prints nginx's process IDs) returns nothing, then shut down keepalived".
On the secondary load balancer you'll write the same configuration we just saw, but with the state variable changed to BACKUP and a priority lower than the main node's (110 in our example):

global_defs {
  ...
}
vrrp_script ...
vrrp_instance VI_1 {
  state BACKUP
  ...
}

Keepalived will automatically keep track of all the load balancers that share the same virtual_router_id using VRRP.
With this configuration, the active load balancer will reply to ARP queries issued to 11.11.11.253 just as if it was its primary IP address. Backup nodes will do the same when the main load balancer goes offline. Looks difficult, but it's a really simple concept (and protocol)!
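To see which load balancer currently holds the floating IP, you can look at the addresses assigned to the interface. The following is a minimal sketch, assuming the `eth0` interface and the 11.11.11.253 address from the configuration above; `holds_vip` is just a hypothetical helper name.

```shell
#!/bin/sh
# The floating IP defined in keepalived's virtual_ipaddress block.
VIP="11.11.11.253"

# holds_vip: reads the output of `ip -o -4 addr show` on stdin and
# succeeds if the floating IP is assigned to this machine.
holds_vip() {
    grep -qF "inet ${VIP}/"
}

# On a load balancer you would run something like:
#   ip -o -4 addr show dev eth0 | holds_vip && echo "active" || echo "standby"
```

Running it on both machines before and after stopping NGINX on the main one should show the address moving from one load balancer to the other.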

To actually load balance connections issued to the control plane we will use the following NGINX configuration (/etc/nginx/conf.d/loadbalance.stream):

stream {
  # kube-apiserver instances
  upstream k8s {
    least_conn;
    server 11.11.11.1:6443 max_fails=3 fail_timeout=5;
    server 11.11.11.2:6443 max_fails=3 fail_timeout=5;
    # add as many servers as you need
    # server 11.11.11....:6443 max_fails=3 fail_timeout=5;
  }
  # etcd members (assuming etcd listens on port 6001 on every node)
  upstream etcd {
    least_conn;
    server 11.11.11.1:6001 max_fails=3 fail_timeout=5;
    server 11.11.11.2:6001 max_fails=3 fail_timeout=5;
    # add as many servers as you need
    # server 11.11.11....:6001 max_fails=3 fail_timeout=5;
  }
  server {
    listen 4001;
    proxy_pass etcd;
  }
  server {
    listen 6443;
    proxy_pass k8s;
  }
}

IMPORTANT: the `stream` module must be enabled on NGINX for this to work
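A quick way to check this is to look at the build flags NGINX prints with `nginx -V`. This is only a sketch: `has_stream` is a hypothetical helper, and if your package ships the module as a dynamic one (`--with-stream=dynamic`) it also has to be loaded with a `load_module` directive or the distro's module package.

```shell
#!/bin/sh
# has_stream: reads the output of `nginx -V` on stdin and succeeds if
# NGINX was built with the stream module (static or dynamic).
has_stream() {
    grep -Eq -e '--with-stream(=dynamic)?([[:space:]]|$)'
}

# `nginx -V` writes to stderr, so redirect it before piping:
#   nginx -V 2>&1 | has_stream && echo "stream module available"
# After editing the configuration, `nginx -t` validates it before a reload.
```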

With this configuration NGINX will load balance connections between all the Kubernetes control plane components.
Note that this setup is not something limited to Kubernetes: you can use this configuration to load balance traffic to your websites or whatever service you have replicated on multiple servers!

Final touch: `~/.kube/config`

You'll need to edit your ~/.kube/config file with the floating IP we defined inside Keepalived's configuration:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: ...
    server: https://11.11.11.253:6443 # yay!
  name: homelab
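If you'd rather not edit the file by hand, the change can be scripted. The sketch below assumes a kubeconfig with a single cluster entry, since it rewrites every `server:` line it finds; `set_server` is a made-up helper, not part of any tool.

```shell
#!/bin/sh
# set_server FILE URL: point every "server:" entry of a kubeconfig at URL,
# keeping the original indentation.
set_server() {
    file="$1"
    url="$2"
    tmp="$(mktemp)" || return 1
    sed "s#^\([[:space:]]*server:\)[[:space:]].*#\1 ${url}#" "$file" > "$tmp" \
        && mv "$tmp" "$file"
}

# Example (back up your kubeconfig first):
#   set_server "$HOME/.kube/config" "https://11.11.11.253:6443"
```

With kubectl at hand, `kubectl config set-cluster homelab --server=https://11.11.11.253:6443` does the same for the `homelab` cluster entry without touching the others.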

You can now try to disable one load balancer and issue a kubectl command: if you followed the guide correctly, everything should still be working!
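To actually watch the failover happen, you can poll the API server while you stop one of the load balancers. Here is a small sketch; `probe` is a hypothetical helper that runs whatever command you give it and prints a timestamped "up" or "down".

```shell
#!/bin/sh
# probe CMD [ARGS...]: runs the command once, discards its output and
# prints the current time followed by "up" (exit 0) or "down" (non-zero).
probe() {
    if "$@" >/dev/null 2>&1; then
        status="up"
    else
        status="down"
    fi
    printf '%s %s\n' "$(date +%H:%M:%S)" "$status"
}

# Example: poll the cluster once per second through the floating IP.
#   while true; do probe kubectl get nodes --request-timeout=2s; sleep 1; done
```

With the active/passive setup you should expect a few "down" lines while the backup takes over, followed by "up" again.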

That was all for today, thank you for sticking with me until the end.
Until the next post!
