Deploying a Production Kubernetes Cluster in 2023 — A Complete Guide

Pavel Glukhikh
21 min read · Aug 25, 2023


Recently, I saw some articles that mentioned Kubernetes as a platform may be losing popularity. Some even suggested that the platform is dead or will soon be dead. I strongly disagree with this assessment. Kubernetes is still very much relevant and widely used. Not only that, but there have been some exciting advancements and updates since the last time I wrote a deployment guide.

So, without any further delay, let’s get into it. This guide is a step-by-step walkthrough of deploying a production-ready Kubernetes cluster that can run on-prem, in a large datacenter, or in the cloud. (Or all 3 at once!)

This cluster focuses on several key things:

  • Ease of deployment
  • Ease of management
  • Scalability in every layer from the load balancer to the ETCD cluster with minimal steps required to join new nodes
  • Security — every connection between every node is encrypted
  • Batteries included — At the end of this guide, no other config is required. Everything from the highly available control plane, to the cluster filesystem will be working.

First, let’s go through the deployment environment:

  • This cluster is not intended for single board computers (Raspberry Pi, etc). It requires a decent amount of space, memory, and CPU. (But nothing a single ESX or Hyper-V node wouldn’t be able to handle.)
  • ETCD likes SSDs, but they are not required unless you will be running a large, I/O-heavy ETCD cluster.
  • It is possible in this deployment scenario to not only mix / match different worker / control plane node types, but also to mix and match the deployment environments. This means that you could deploy a worker on one server, a test worker on a desktop, an HA control plane in Azure, and a mirror in AWS. Yes, you have THAT much flexibility.
  • This setup was tested in an ESX enterprise environment with non-SSD storage.
  • According to the resource pool stats in vCenter, running just management workloads, the entire cluster uses:
  • 10 virtual machines + load balancer (can be another VM)
  • ~300–500 MB of RAM avg per node
  • ~500–700 MHz of CPU avg per node
  • I used 50 GB HDDs for ETCD, 80 GB for masters, and 120 GB for workers

Let’s look at what is actually in the cluster.

The entire cluster, scaled to production-ready specs, includes a total of 10 virtual machines plus a load balancer. Obviously, this is a large cluster, and designed to handle large, enterprise-grade workloads that can scale out even further.

If you need to scale down, you can use 2 ETCD nodes (keep in mind that ETCD needs a majority of members up, so 3 is the usual minimum for real fault tolerance), at least 1 master and at least 1 worker, plus a load balancer VM that can also handle storage if needed (or the storage can come from a NAS / SAN).

ETCD Cluster:

The 3-node ETCD cluster hosts ETCD, the distributed key-value store that Kubernetes uses as its “backend datastore”. The cluster is highly available and all connections are routed through the load balancer (HAProxy in my deployment). All connections are also encrypted and secured with certificates. In this deployment, we will be turning off the bundled ETCD services and using the external cluster only.

Load Balancer:

The load balancer is an external component to the cluster that handles connections between several things:

  • ETCD node-to-node connections
  • Connections to the ETCD cluster (+ load balancing)
  • Master-to-master connections
  • Connections to the control plane (master) cluster (+ load balancing)
  • Communication between the workers and the master cluster (+ load balancing)
  • Optional — sessions from users to the Kubernetes workloads and TLS termination (if ingress is not used) (+ load balancing)

Storage Provider:

This could be anything that has a lot of available disk space and is capable of handling iSCSI connections. (Or whatever technology you want to use to make the storage on the device available to the cluster nodes). In my deployment, I used a virtual machine running an iSCSI server that uses ESX managed datastore space. An enterprise NAS or SAN will also work here.

Master Cluster:

The 3 master nodes run the Kubernetes Control Plane and all other master services. The Kubernetes worker nodes connect to this master cluster. Note that the master cluster is capable of also hosting Kubernetes workloads (not something usually done in larger datacenter scale clusters).

Workers:

The workers connect to the master cluster via the load balancer. They host the majority of the Kubernetes workloads. In this deployment, the master cluster and the workers will both be capable of running workloads. Note that the workers are not a cluster, but are standalone servers that connect to the master cluster.

OCFS2 Cluster Filesystem Storage Cluster:

The master and the worker nodes host a distributed Oracle Clustered Filesystem storage cluster that gives us shared cluster-ready storage. You can use any cluster-aware filesystem here, but I decided to use OCFS2 as it is enterprise ready and straightforward to set up.

Each node except the ETCD nodes and load balancer will have a shared disk containing an OCFS2 cluster filesystem that will be managed by Longhorn and have its own storage class.

The cluster will look something like this (note that for this deployment we will be using a load balancer for ETCD instead):

[Cluster architecture diagram. Credit: Janakiram MSV]

Preparing the Servers

As mentioned before, the entire scaled deployment consists of 10 servers or virtual machines, or some combination of both. See above for the minimal required number of machines.

For my deployment, I used Ubuntu 22.04 minimized version. Note that the minimized part simply means the installer does not install anything but the absolute minimum required to run Ubuntu. It can be selected in the Ubuntu guided installer. Also note that none of the machines have a GUI.

We will not be using kubectl proxy because it's a terrible tool and needs to be retired, and we will also not be using any “localhost” addresses. So don’t use the GUI! I recommend you install the non-GUI versions of Linux on each machine. Embrace the terminal!

I posted above the average compute benchmarks for each of the VM types. Plan your deployment according to what you will use it for. In my deployment, I use it for medium intensity production workloads, so I have between 8 and 12 GB of RAM on each server. Each VM also has 12 vCPUs. But again, tune this to what you think is appropriate for your use case.

The shared cluster volume also has about 500 GB of space in my deployment.

Once your 10 servers or however many machines you choose to deploy are up, the fun begins!

On each machine, run the usual prep commands.

apt update && apt upgrade -y

Once the updates are done, make sure of the following. This is REALLY important!

  • Each server / VM must have a unique hostname. Don’t use things like “server 1” or “node a”. Assign meaningful hostnames to each server before you proceed.
  • Every hostname must be able to reach every other hostname. If you have an internal DNS system, or trust whatever DNS solution you are using, feel free to create A records for each server’s hostname. .local will also work in most cases.
  • I recommend, however, using the /etc/hosts file and copying / pasting a list of hostnames and IPs for the entire cluster (all load balancers, all masters, all workers, the storage provider, and all ETCD nodes) onto every single server that will participate in this deployment (see the example below).
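For example, the shared /etc/hosts block might look something like this (all hostnames and IPs here are placeholders; use your own):

# /etc/hosts - identical block on every machine in the cluster
10.0.1.5   haproxy
10.0.1.15  etcd-1
10.0.1.16  etcd-2
10.0.1.17  etcd-3
10.0.1.20  k3sma01
10.0.1.21  k3sma02
10.0.1.22  k3sma03
10.0.1.30  k3swo01
10.0.1.31  k3swo02
10.0.1.32  k3swo03
10.0.1.40  kubesp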

Oh, and we will be using the nano text editor for editing things in this guide. Vim is overrated. (Yes, I said it).

Some helpful commands to prep your servers:

1a. Change hostname:

sudo hostnamectl set-hostname new-hostname

1b. Edit the hosts file:

sudo nano /etc/hosts

1c. Make sure to update your hostname in the hosts file as well if you changed it. Once all of that is done, reboot the server.

Side note — if you don’t want to connect to 10 servers every time, use a program like Terminus or a command system like Ansible / AWX.
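If you do go the Ansible route, a couple of ad-hoc commands are enough for this kind of prep. A minimal sketch, assuming a hypothetical inventory file with a [k8s] group that lists all of the cluster machines:

# Update and reboot every machine in the hypothetical "k8s" group
ansible k8s -i inventory.ini -b -m apt -a "update_cache=yes upgrade=dist"
ansible k8s -i inventory.ini -b -m reboot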

2. Swap still breaks container orchestration systems. We need to turn off swap on every node except the storage provider, etcd, and load balancer.

On any nodes hosting Kubernetes (masters and workers only):

sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

and

sudo swapoff -a
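To confirm swap is actually off, swapon should print nothing and free should report 0B of swap:

swapon --show          # no output means swap is disabled
free -h | grep -i swap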

3. We also need to change some things on the networking side of Linux. Run these commands on any nodes hosting Kubernetes (masters and workers only):

sudo modprobe overlay
sudo modprobe br_netfilter

sudo tee /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

sudo sysctl --system
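Keep in mind that modprobe only loads the modules for the current boot. To make sure overlay and br_netfilter come back after a reboot, you can optionally persist them as well (a small addition to the steps above):

sudo tee /etc/modules-load.d/kubernetes.conf <<EOF
overlay
br_netfilter
EOF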

That's it for the preparation. (No, really — this is it. K3S will do the actual installs for us!)

Let’s set the Kubernetes nodes aside for now and work on the load balancer.

SSH into your load balancer machine and run the following commands:

  1. Update sources and install HAProxy.
apt update && apt install haproxy -y

2. Create a configuration file that has the following:

  • A frontend for ETCD cluster communication
  • A frontend for master cluster communication
  • A frontend for haproxy stats
  • A backend for ETCD cluster communication
  • A backend for master cluster communication

The configuration will look something like this; modify the bindings and IPs to match your own network.

nano /etc/haproxy/haproxy.cfg
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    daemon
    stats socket /var/lib/haproxy/stats # Make sure this path / file exists

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend stats
    bind *:8399
    stats enable
    stats uri /stats_secure
    stats refresh 10s
    stats admin if LOCALHOST
    stats auth some_user:changeme # Change to your desired logins

frontend etcha200-main # For ETCD
    bind 10.0.1.5:2379 # Change to your network
    retries 3
    mode tcp
    option tcplog
    default_backend etcha200-main-backend

frontend kubeha200lb # For master cluster
    bind 10.0.1.5:6443 # Change to your network
    retries 3
    mode tcp
    option tcplog
    default_backend kubha200-masters

backend etcha200-main-backend
    mode tcp
    balance roundrobin
    option tcp-check

    server etc020 server_ip_address:2379 check fall 3 rise 2 # Change to your ETCD servers
    server etc021 server_ip_address:2379 check fall 3 rise 2
    server etc022 server_ip_address:2379 check fall 3 rise 2

backend kubha200-masters
    mode tcp
    balance roundrobin
    option httpchk GET /healthz # /healthz is served over TLS, so the checks below use check-ssl
    http-check expect status 200 # Expect a 200 OK response for a healthy server
    timeout connect 5s # Increase the timeout for establishing connections
    timeout server 60s # Increase the timeout for waiting for a response from the server
    timeout check 10s # Increase the timeout for health checks

    server kubema050 server_ip_address:6443 check check-ssl verify none fall 3 rise 2 # Change to your master servers
    server kubema051 server_ip_address:6443 check check-ssl verify none fall 3 rise 2
    server kubema052 server_ip_address:6443 check check-ssl verify none fall 3 rise 2

3. Once you have finished editing the file, save it and restart HAProxy:

sudo systemctl restart haproxy
sudo systemctl enable haproxy

4. You should now have a fully configured HAProxy load balancer. You can also access the HAProxy dashboard at:

http://<your_haproxy_server_ip>:8399

The logins are the ones you set in the beginning of the config file.
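Before moving on, it is also worth confirming that all three frontends are actually listening on the ports you configured, for example:

sudo ss -tlnp | grep -E ':2379|:6443|:8399'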

But, but, but, my HAProxy didn’t start!

If you are having problems starting HAProxy after writing the config, check for these things:

  • Are the bind IPs you used actually available on that server? (They must be an IP or IPs assigned to a network adapter on that server).
  • Does each front end have a matching back end?
  • Are the run socket / admin.sock files created? If not, create them.
  • Make sure you don’t have a typo anywhere in the file.
  • To check the HAProxy configuration, use
haproxy -f /etc/haproxy/haproxy.cfg -c

For more details on HAProxy or additional troubleshooting guides, see: https://www.haproxy.com/blog/testing-your-haproxy-configuration

Also note — when the cluster is built, your HAProxy server will be handling some pretty important connections. You may want to look into getting a highly available HAProxy setup, always validating changes with the above command, and reloading HAProxy instead of restarting it.
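A safe pattern is to chain the config check with the reload, so a typo never takes the proxy down:

sudo haproxy -f /etc/haproxy/haproxy.cfg -c && sudo systemctl reload haproxy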

That’s it for the load balancer!

Now, let's build a production-ready ETCD cluster.

  1. SSH into each of your ETCD nodes and run the following commands:
nano download-etcd.sh

2. Copy and paste this script:

#!/bin/bash

ETCD_VER="v3.5.9"
DOWNLOAD_URL="https://storage.googleapis.com/etcd"

# Clean up previous downloads
rm -f "/tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz"
rm -rf "/tmp/etcd-download-test"
mkdir -p "/tmp/etcd-download-test"

# Download etcd release
curl -L "${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz" -o "/tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz"

# Extract and prepare downloaded files
tar xzvf "/tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz" -C "/tmp/etcd-download-test" --strip-components=1

# Clean up downloaded tarball
rm -f "/tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz"

# Make binaries executable
chmod +x "/tmp/etcd-download-test/etcd"
chmod +x "/tmp/etcd-download-test/etcdctl"

# Verify the downloaded binaries
"/tmp/etcd-download-test/etcd" --version
"/tmp/etcd-download-test/etcdctl" version

# Move binaries to the bin folder
sudo mv "/tmp/etcd-download-test/etcd" "/usr/local/bin"
sudo mv "/tmp/etcd-download-test/etcdctl" "/usr/local/bin"

echo "etcd ${ETCD_VER} is now installed and ready for use."

3. Save it and make it executable:

chmod +x download-etcd.sh

4. Run the download script:

./download-etcd.sh

The script downloads and installs the version of ETCD pinned in the ETCD_VER variable. At the time of writing, that version is 3.5.9. Make sure you update the version in the script if a newer release is out by the time you are reading this.

5. Let’s make sure ETCD is installed:

etcd --version

etcd Version: 3.5.9
Git SHA: c92fb80f3
Go Version: go1.19.10
Go OS/Arch: linux/amd64

6. Now we need to make some certificates. In this guide, I used Cloudflare’s cfssl tool. Feel free to use whatever tool you want; you just need a CA file and a .crt / .key pair for each node in the cluster.

Let’s install cfssl:

VERSION=$(curl --silent "https://api.github.com/repos/cloudflare/cfssl/releases/latest" | grep '"tag_name"' | sed -E 's/.*"([^"]+)".*/\1/')
VNUMBER=${VERSION#"v"}

wget https://github.com/cloudflare/cfssl/releases/download/${VERSION}/cfssl_${VNUMBER}_linux_amd64 -O cfssl
# cfssljson is needed as well; the gencert output is piped into it in the next step
wget https://github.com/cloudflare/cfssl/releases/download/${VERSION}/cfssljson_${VNUMBER}_linux_amd64 -O cfssljson

chmod +x cfssl cfssljson

sudo mv cfssl cfssljson /usr/local/bin

Note — in this guide, we will be installing cfssl to the first ETCD node, however it does not make a difference where you install it and generate the certs from. You can also do this on any Linux or Mac machine.

If you need other cfssl install instructions: https://computingforgeeks.com/how-to-install-cloudflare-cfssl-on-linux-macos/?expand_article=1
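A quick sanity check that both binaries are installed and on your PATH:

cfssl version
command -v cfssljson   # should print /usr/local/bin/cfssljson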

7. Next, we need to generate the certificates.

mkdir certs && cd certs
echo '{"CN":"CA","key":{"algo":"rsa","size":2048}}' | cfssl gencert -initca - | cfssljson -bare ca -
echo '{"signing":{"default":{"expiry":"43800h","usages":["signing","key encipherment","server auth","client auth"]}}}' > ca-config.json

This creates the certs directory, the cert generator config (ca-config.json), and the CA files: ca-key.pem, ca.pem, and ca.csr.

8. Next, we need to create the actual certs and keys:

Important! Make sure you replace the hostnames and IPs with your own. This must match the hosts file or DNS records created earlier.

export NAME=etcd-1
export ADDRESS=10.0.1.15,$NAME
echo '{"CN":"'$NAME'","hosts":[""],"key":{"algo":"rsa","size":2048}}' | cfssl gencert -config=ca-config.json -ca=ca.pem -ca-key=ca-key.pem -hostname="$ADDRESS" - | cfssljson -bare $NAME

Let’s do an ls to see what we have so far:

ls

We should now have certs and keys for the first node.

9. Repeat the steps for each of the ETCD nodes, and create a cert for your load balancer as well using the steps above. All that needs to change is the IP and the name (or use the loop sketched below).

Once that is complete, you should have a full set of certs and keys for each ETCD node and your HAProxy node. Save these someplace safe!
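If you would rather not repeat the export / gencert steps by hand, a small loop over name:IP pairs does the same thing. A minimal sketch (the names and IPs are placeholders and must match your hosts file):

# Run from inside the certs/ directory, next to ca.pem, ca-key.pem, and ca-config.json
for entry in etcd-1:10.0.1.15 etcd-2:10.0.1.16 etcd-3:10.0.1.17 haproxy:10.0.1.5; do
  NAME=${entry%%:*}
  ADDRESS=${entry##*:},$NAME
  echo '{"CN":"'$NAME'","hosts":[""],"key":{"algo":"rsa","size":2048}}' \
    | cfssl gencert -config=ca-config.json -ca=ca.pem -ca-key=ca-key.pem -hostname="$ADDRESS" - \
    | cfssljson -bare $NAME
done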

10. Now, we need to distribute the certs and keys to each node.


HOST=10.0.0.16
USER=somebody

scp ca.pem $USER@$HOST:etcd-ca.crt
scp etcd-1.pem $USER@$HOST:server.crt
scp etcd-1-key.pem $USER@$HOST:server.key

Replace the host and user with your own, as well as the cert names (each node gets its own .pem and -key.pem). Note that this may take a while as there are a lot of files to copy.

You can also use a program like Cyberduck if you want something with a GUI to transfer the files.

Be careful with the file names and extensions! Make sure you end up with two .crt files and one .key file on each ETCD server and the load balancer.

11. Now, ssh into each of the nodes you copied the certs to and move them to an appropriate directory. I used /etc/pki/k3s.

sudo mkdir -p /etc/pki/k3s
sudo mv * /etc/pki/k3s
sudo chmod 600 /etc/pki/k3s/server.key

12. Now that we have the certs generated and in their proper locations, this is where the fun begins. We can now start installing ETCD, however we will not be following the upstream guide. Instead, we will route everything through the load balancer and secure the communications between all nodes, rather than using ad-hoc connections.

sudo mkdir -p /etc/etcd
sudo nano /etc/etcd/etcd.conf

Paste the following and modify according to the comments:

ETCD_NAME=hostname # Change to the hostname of current ETCD node
ETCD_LISTEN_PEER_URLS="https://server_ip:2380" # Change to IP of current ETCD node
ETCD_LISTEN_CLIENT_URLS="https://server_ip:2379" # Change to IP of current ETCD node
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster" # Whatever name you want, no spaces
ETCD_INITIAL_CLUSTER="etcd1=https://server_ip_1:2380,etcd2=https://server_ip_2:2380,etcd3=https://server_ip_3:2380" # All of your ETCD cluster member names and IPs (the names must match each node's ETCD_NAME)
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://server_ip:2380" # Current node's IP
ETCD_ADVERTISE_CLIENT_URLS="https://haproxy:2379" # Hostname (NOT IP) of your load balancer
ETCD_TRUSTED_CA_FILE="/etc/pki/k3s/etcd-ca.crt" # Make sure all of the paths match where you copied the certs
ETCD_CERT_FILE="/etc/pki/k3s/server.crt"
ETCD_KEY_FILE="/etc/pki/k3s/server.key"
ETCD_PEER_CLIENT_CERT_AUTH=true
ETCD_PEER_TRUSTED_CA_FILE="/etc/pki/k3s/etcd-ca.crt"
ETCD_PEER_KEY_FILE="/etc/pki/k3s/server.key"
ETCD_PEER_CERT_FILE="/etc/pki/k3s/server.crt"
ETCD_DATA_DIR="/var/lib/etcd"

Make sure to double check hostnames and IPs. If something does not work later, check this file for errors first.

13. Do the same steps as in step 12 for each of the ETCD nodes. Make sure to change the IPs and hostnames of the node where needed.

14. Next, we’ll create the Unit file.

sudo nano /lib/systemd/system/etcd.service

Paste in the following:

[Unit]
Description=etcd key-value store
Documentation=https://github.com/etcd-io/etcd
After=network.target

[Service]
Type=notify
EnvironmentFile=/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd
Restart=always
RestartSec=10s
LimitNOFILE=40000

[Install]
WantedBy=multi-user.target

Do the same on all ETCD nodes.

15. We’re ready to start the ETCD service!

sudo systemctl daemon-reload
sudo systemctl enable etcd
sudo systemctl start etcd

At this point, the ETCD service should be running on all ETCD nodes. Let’s check its status:

sudo systemctl status etcd

It should say that the service is loaded and running and there should not be any errors.

If there are errors that a node did not join or the service won’t start:

  • Look at the ETCD logs. If the service fails to start, there will be commands for viewing the journal and the service logs.
  • Check the unit file and the etcd.conf file carefully.
  • Make sure the path to your certificates is correct and they are writable by ETCD.
  • Check your HAProxy config, in the ETCD backend and frontend.

16. This step is optional. If you want to test the functionality of your ETCD cluster, you can use the following commands:

curl --cacert /etc/pki/k3s/etcd-ca.crt --cert /etc/pki/k3s/server.crt --key /etc/pki/k3s/server.key https://haproxy:2379/health

Replace the cert paths and the haproxy hostname with your own. This is to check cluster health.

etcdctl --endpoints https://haproxy:2379 --cert /etc/pki/k3s/server.crt --cacert /etc/pki/k3s/etcd-ca.crt --key /etc/pki/k3s/server.key member list
etcdctl --endpoints https://haproxy:2379 --cert /etc/pki/k3s/server.crt --cacert /etc/pki/k3s/etcd-ca.crt --key /etc/pki/k3s/server.key put foo bar
etcdctl --endpoints https://haproxy:2379 --cert /etc/pki/k3s/server.crt --cacert /etc/pki/k3s/etcd-ca.crt --key /etc/pki/k3s/server.key get foo
etcdctl --endpoints https://haproxy:2379 --cert /etc/pki/k3s/server.crt --cacert /etc/pki/k3s/etcd-ca.crt --key /etc/pki/k3s/server.key del foo

Again, replace the cert paths and the haproxy hostname with your own. This is to test storing, fetching, and deleting an object in the database.

You can also view the member list with the first command to make sure all nodes are present.
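You can also ask the cluster for per-endpoint health and status directly (again, adjust the cert paths and the haproxy hostname to your own):

etcdctl --endpoints https://haproxy:2379 --cert /etc/pki/k3s/server.crt --cacert /etc/pki/k3s/etcd-ca.crt --key /etc/pki/k3s/server.key endpoint health
etcdctl --endpoints https://haproxy:2379 --cert /etc/pki/k3s/server.crt --cacert /etc/pki/k3s/etcd-ca.crt --key /etc/pki/k3s/server.key endpoint status -w table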

If everything in the tests worked fine, congratulations! You now have a working secure, highly available ETCD cluster that is ready to accept connections from Kubernetes.

Now that we have all of our infrastructure services for Kubernetes set up, we can set up the actual K3S cluster.

We’ll start with the masters.

  1. Make sure swap is disabled on all master nodes (as described earlier), and the networking part is done.
  2. Make sure the hosts file and / or DNS is up to date with all hostnames in the cluster.
  3. Make sure you have copied the certificates, keys, and ca file onto each of the master servers as described in the VM prep steps.

4. Let’s go!

This is the command we’ll use to install the K3S service and set up the node as a master:

curl -sfL https://get.k3s.io | K3S_KUBECONFIG_MODE="644" sh -s - server \
  --token=your_token \
  --datastore-endpoint="https://haproxy:2379" \
  --datastore-cafile="/etc/pki/k3s/etcd-ca.crt" \
  --datastore-certfile="/etc/pki/k3s/server.crt" \
  --datastore-keyfile="/etc/pki/k3s/server.key" \
  --tls-san=haproxy

Make sure you replace the parameters with your own.

Here is what it does:

  • The command is a curl command
  • Get the actual installer
  • Set the config mode. See Config options link at the bottom for details.
  • We want to make this node a “server”, aka a master.
  • Token is a string that you generate. I recommend letters and numbers only, 32 chars or less. Keep it safe. (See the example after this list.)
  • The datastore endpoint parameter tells the K3S installer to use the external ETCD cluster and to not deploy ETCD as a workload.
  • Next are the certificate file paths. You’ll need to use the one you generated for the load balancer. Make sure to use the hostname.
  • tls-san is the hostname of your load balancer machine.
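If you do not already have a token in mind, something like openssl can generate one that fits the letters-and-numbers recommendation (a hypothetical choice; any random string works):

openssl rand -hex 16   # 32 hex characters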

Once you have everything set up, run the command.

The command will take a few minutes to run. If everything is correct, the run process will end with the K3S service starting.

If it fails to start, use the journal commands to find out what happened, check your command, your haproxy configs, and the steps you have done so far.

If you need to start over at any point, scroll up in the installation output.

You will see a k3s-killall.sh script and a k3s-uninstall.sh script (installed to /usr/local/bin). Run them both and then reboot. You will have a clean node so you can correct any issues and try again.

5. You should see the master and worker join commands if you run:

service k3s status

If you don’t, here they are:

Masters:

curl -sfL https://get.k3s.io | K3S_KUBECONFIG_MODE="644" sh -s - server \
  --token=your_token \
  --datastore-endpoint="https://haproxy:2379" \
  --datastore-cafile="/etc/pki/k3s/etcd-ca.crt" \
  --datastore-certfile="/etc/pki/k3s/server.crt" \
  --datastore-keyfile="/etc/pki/k3s/server.key" \
  --tls-san=haproxy

Again, replace with your own parameters. Use the same token.

Workers:

curl -sfL https://get.k3s.io | K3S_KUBECONFIG_MODE="644" sh -s - agent \
  --token=your_token \
  --server https://haproxy:6443 \
  --node-name=hostname_of_worker_node

Make sure to change the node hostname accordingly. (Agents only need the token and the server URL; the datastore flags are only used on servers.)

If all of the join commands were successful, congratulations! You now have a production-ready Kubernetes cluster!
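A quick way to confirm that every node actually joined is to list them from any master:

k3s kubectl get nodes -o wide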

If something failed, check over your commands and configuration. Use the uninstall scripts if needed.

Also, this may be a good time to check your HAProxy dashboard. If you have a healthy cluster, you will see all of your ETCD and master nodes showing up green and traffic going to all of them.

So, we now have a Kubernetes cluster, great! But, we’re not done yet!

Now that we have a working Kubernetes cluster, we need some distributed storage to store our applications on. As mentioned earlier, we’ll be using an iSCSI storage provider and OCFS2.

You can use any device with enough space that is capable of serving iSCSI, or your preferred method of presenting shared volumes to your servers / VMs.

First, we need to install and configure an iSCSI server.

As mentioned above, I used a single VM with a large disk for this.

  1. Prepare the node:
apt update && apt upgrade

2. Install the iSCSI Target Services:

apt install tgt -y

3. Check iSCSI target service status:

systemctl status tgt


tgt.service - (i)SCSI target daemon
     Loaded: loaded (/lib/systemd/system/tgt.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2023-08-22 07:13:04 UTC; 23s ago
       Docs: man:tgtd(8)
   Main PID: 7770 (tgtd)
     Status: "Starting event loop..."
      Tasks: 1
     Memory: 1.1M
     CGroup: /system.slice/tgt.service
             └─7770 /usr/sbin/tgtd -f

Jul 11 07:13:04 kubesp systemd[1]: Starting (i)SCSI target daemon...
Jul 11 07:13:04 kubesp tgtd[7770]: tgtd: iser_ib_init(3431) Failed to initialize RDMA; load kernel modules?
Jul 11 07:13:04 kubesp tgtd[7770]: tgtd: work_timer_start(146) use timer_fd based scheduler
Jul 11 07:13:04 kubesp tgtd[7770]: tgtd: bs_init(387) use signalfd notification
Jul 11 07:13:04 kubesp systemd[1]: Started (i)SCSI target daemon.

4. Now we can configure a LUN (Logical Unit Number):

nano /etc/tgt/conf.d/iscsi.conf

Copy and paste the following:

<target iqn.2020-07.example.com:lun1>
    backing-store /dev/sdb
    initiator-address 10.1.0.25
    incominguser iscsi-user password
    outgoinguser iscsi-target secretpass
</target>

Change the example.com to whatever name you want to use. (Note — this is not a domain name and does not point to DNS. It’s simply an identifier.)

Change /dev/sdb to wherever the disk you want to use is.

Change iscsi-user to some username.

Change password and secretpass to some other passwords.

  • The first line defines the name of the LUN.
  • The second line defines the location and name of the storage device on the iSCSI Target server.
  • The third line defines the IP address of the iSCSI initiator. Add one initiator-address line for each Kubernetes node that will connect to this target.
  • The fourth line defines the incoming username/password.
  • The fifth line defines the username/password that the target will provide to the initiator to allow for mutual CHAP authentication to take place.

5. Restart the tgt service.

systemctl restart tgt

6. Verify that everything is working correctly:

tgtadm --mode target --op show

Awesome! We now have a working target server. Now we need to connect it to the Kubernetes nodes and set up OCFS.

  1. On each of the Kubernetes nodes (workers and masters only), install the iSCSI initiator:
apt install open-iscsi -y

2. We can now run a discovery on the iSCSI target.

iscsiadm -m discovery -t st -p ip_of_target_server

You should see something like this:

tgtserver01:3260,1 iqn.2020-07.example.com:lun1

3. Next we need to define the LUN device:

nano /etc/iscsi/initiatorname.iscsi

3a. Add your target name:

InitiatorName=iqn.2020-07.example.com:lun1

Change this to what you defined in the tgt setup.

4. Next, we will define the CHAP credentials that we configured on the iSCSI target server, so the initiator can access the target.

The node configuration file will exist in the directory ‘/etc/iscsi/nodes/‘ and will have a directory per LUN available.

nano /etc/iscsi/nodes/iqn.2020-07.example.com\:lun1/192.168.1.10\,3260\,1/default

Again, replace with your own names. (What you configured on the tgt server).

Add / modify the following:

node.session.auth.authmethod = CHAP  
node.session.auth.username = iscsi-user
node.session.auth.password = password
node.session.auth.username_in = iscsi-target
node.session.auth.password_in = secretpass
node.startup = automatic

Replace with your tgt configuration parameters.

5. Restart the iscsi services.

systemctl restart open-iscsi iscsid

6. If you want to verify the connection:

iscsiadm -m session -o show

7. To verify that we have a storage device from the target:

lsblk

We should now have a blank, unformatted volume presented from the tgt server.

8. Create a new ocfs2 filesystem on the new device:

sudo apt update
sudo apt install ocfs2-tools
sudo mkfs.ocfs2 -T mail -L "MyOCFS2Volume" /dev/sdb

Replace MyOCFS2Volume with some name without spaces. You only need to run this command on ONE node.

8a. Make sure to install ocfs2-tools on each node.

8b. To mount the filesystem automatically at boot, run on each node:

sudo nano /etc/fstab

Add the following line:

/dev/sdb      /mnt/kube      ocfs2   _netdev,defaults        0  0

Save and close the file. Do not reboot and do not mount the filesystem yet.

We should now have a formatted ocfs2 volume on the first node. Make sure to install ocfs2-tools on all other Kubernetes nodes that you want to use the storage on, and mount the volume there as well. Remember that you only need to create the filesystem one time.

Now we can set up the OCFS2 cluster.

  1. Run the following commands on each node to prepare for installing the cluster:
modprobe ocfs2

systemctl enable o2cb
systemctl enable ocfs2
systemctl enable iscsid open-iscsi

2. Run the initial configuration on each node. This is important! You will need to define a cluster name. Do not use spaces. On each node, when the config wizard asks for a cluster name, enter the same cluster name each time.

dpkg-reconfigure ocfs2-tools

Restart the o2cb service:

systemctl restart o2cb

3. Next, we need to define the cluster config. Take a look at the example below and run:

nano /etc/ocfs2/cluster.conf

Copy and paste the following. Make sure to replace the names and details with your own.

cluster:
        name = cluster_name # Replace with your own, on each line that says "cluster"
        heartbeat_mode = local
        node_count = 6 # The total number of nodes in your OCFS2 cluster (node numbers start at 0)

node:
        cluster = cluster_name
        number = 0
        ip_port = 7777
        ip_address = 10.1.0.5
        name = k3sma01

node:
        cluster = cluster_name
        number = 1
        ip_port = 7777
        ip_address = 10.1.0.6
        name = k3sma02

node:
        cluster = cluster_name
        number = 2
        ip_port = 7777
        ip_address = 10.1.0.7
        name = k3sma03

node:
        cluster = cluster_name
        number = 3
        ip_port = 7777
        ip_address = 10.1.0.8
        name = k3swo01

node:
        cluster = cluster_name
        number = 4
        ip_port = 7777
        ip_address = 10.1.0.9
        name = k3swo02

node:
        cluster = cluster_name
        number = 5
        ip_port = 7777
        ip_address = 10.1.0.10
        name = k3swo03

Edit the file according to how many nodes you have.

This is important:

  • Make sure your name = entries match EXACTLY the hostname of that node (just the hostname). The hostnames must match and they must all be in DNS / the hosts file.
  • Make sure the cluster name is the same on all entries.
  • Change the IPs to your own.
  • The member IDs start with 0.

Edit the file and distribute to each node in the Kubernetes cluster.

3b. Restart the o2cb and ocfs2 services:

systemctl restart o2cb ocfs2

3c. To be sure, check your cluster name in ocfs2-tools. Make sure it’s the same one you used in each of the cluster config files and node entries.

sudo dpkg-reconfigure ocfs2-tools

4. Here we go! We’re ready to mount the new volume and join the cluster!

On each node, run:

mkdir /mnt/kube
mount -t ocfs2 /dev/sdb /mnt/kube

If you get any errors (logic errors, errors joining the cluster, etc), check your cluster config files on each node and make sure the hostnames and cluster names match in every entry.

We should now have clustered storage mounted on every Kubernetes node.
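To double-check on each node, df should show the OCFS2 mount, and mounted.ocfs2 (from ocfs2-tools) should list the members that currently have the volume mounted:

df -hT /mnt/kube
sudo mounted.ocfs2 -f /dev/sdb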

We are now ready to install and configure kubectl and Longhorn:

  1. Make sure you have kubectl set up. By default, you can go to the first master and run:
k3s kubectl get nodes

You should see a list of your Kubernetes nodes. This means Kubectl is working.

We need to alias the k3s command:

nano ~/.bashrc

Add the following line:

alias kubectl='k3s kubectl'

Save the file and run:

source ~/.bashrc

You can now use kubectl commands instead of k3s kubectl.

Test it:

kubectl get nodes

2. We can now deploy Longhorn, which will manage our storage:

kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.5.1/deploy/longhorn.yaml

If you want to watch the deployment:

kubectl get pods \
--namespace longhorn-system \
--watch

This can take several minutes. Wait until all of the pods are Running before proceeding.

You should now have Longhorn installed.

You can go to the Longhorn web interface (see the Longhorn install instructions for ports and how to reach it).

3. You can go to the “Nodes” section in the Longhorn dashboard and click the + icon to see all storage on that node. If you want to add any storage, such as the OCFS2 volume, go to the right side menu and select “Edit Node and Disks” for each node. You can then add the volume to each node, along with any other storage you may want to use.

To use the Longhorn storage, simply deploy your application with the Longhorn storage class.
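For example, a minimal PersistentVolumeClaim against the default longhorn storage class might look like this (the claim name and size are placeholders):

kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 5Gi
EOF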

Next up is the Kubernetes Dashboard (if you want to use it).

The guides on GitHub have been cleaned up, so I will just point you there:

  1. Install: https://github.com/kubernetes/dashboard (deployment commands can be found on the releases page).

2. Creating a user to get into the dashboard: https://github.com/kubernetes/dashboard/blob/master/docs/user/access-control/creating-sample-user.md

And that’s it! You now have a production ready, highly available Kubernetes cluster with load balancing, an external ETCD cluster, and managed storage on top of an OCFS2 cluster filesystem.

Enjoy working with your new cluster, and thank you for reading!

~Pavel
