K8s on Oracle Cloud [Part 8]: Setting up Longhorn


Longhorn is another basic service installed on the k8s cluster to be used by other services and applications. It is not mandatory, but it is extremely useful, as it offers volumes replicated between cluster nodes and automated backups to S3 object storage.

To speed the process up, add automation, and make sure the entire installation can be easily replayed, we use a set of scripts available in the GitHub repository: k8s-scripts. While there is some documentation for the scripts and you can look at the scripts' source code for more details, this guide expands on the details, explaining various options and suggesting optimal settings.

Personal notes: My personal notes on how to set things up, to make it easier to repeat next time.

Step 1: Prerequisites

Step 2: Configuration

There are no mandatory configuration options to set for Longhorn, so you can skip straight to the Installation section to set it up. Some further configuration options can only be set from the Longhorn UI.

However, there are a few options worth looking at, as adjusting them for your cluster might be necessary.

Version adjustment

k8s-scripts defines versions for services that were up to date and tested when the project was last updated by its developers. These versions may become outdated over time, or perhaps you need or want to use a very specific version of the package.

To adjust the Longhorn package version, look at the ~/.tigase-flux/envs/versions.env file and change the value of the LH_VER property:

# Longhorn
LH_VER="1.2.4"

To check the latest available version of the package, run:

~$ helm search hub --max-col-width 60 longhorn | grep "URL\|longhorn/longhorn"
URL                                                         	CHART VERSION	APP VERSION	DESCRIPTION
https://artifacthub.io/packages/helm/longhorn/longhorn      	1.2.4        	v1.2.4     	Longhorn is a distributed...

Adjust the versions.env file with the latest CHART VERSION.
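If you prefer a one-liner to opening an editor, a sed command like this does the same (assuming the property is defined exactly as in the snippet above):

sed -i 's/^LH_VER=.*/LH_VER="1.2.4"/' ~/.tigase-flux/envs/versions.env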

Custom values

Custom values for the “longhorn” service can be found in the ~/.tigase-flux/envs/longhorn-values.yaml file:

    defaultSettings:
      defaultReplicaCount: 3
      defaultDataLocality: "best-effort"
      backupTarget: ""
      backupTargetCredentialSecret: "aws-s3-backup"

defaultReplicaCount

This property specifies how many replicas will be maintained for your Longhorn volumes. The greater the number, the more redundancy, but also the more traffic within the cluster to synchronize replicas and the more disk space used.

If there are 3 cluster nodes and the replica count is set to 3, volumes will be replicated and maintained on each cluster node. This provides the additional benefit of data being available on every node, hence better read performance. However, disk space on each cluster node is used as well: for example, a 10 GiB volume with 3 replicas can take up to 10 GiB on each of the 3 nodes, 30 GiB in total.

defaultDataLocality

This property instructs Longhorn to “try” to keep volume data on the same cluster node as the pod using it. Locally available data obviously results in better read performance. best-effort is the recommended setting and this is what we use.

backupTarget

This property points Longhorn to an S3 backup storage. It is an s3 URI to your S3 storage bucket. It should look like: s3://s3-bucket-name@aws-region-id/optional-path/. The optional path is useful if you plan to store backups from different services in the same bucket.
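For example, with a hypothetical bucket named my-backup-bucket in the us-east-1 region, the setting in longhorn-values.yaml would look like this:

    defaultSettings:
      backupTarget: "s3://my-backup-bucket@us-east-1/longhorn-backups/"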

backupTargetCredentialSecret

S3 access credentials are stored on the cluster in encrypted form. The secret object with the credentials is named: aws-s3-backup. Unless you changed the script and the secret name, this property should be left with this default value.
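The script creates this secret for you and encrypts it with Sealed Secrets. For reference, Longhorn expects the credentials under the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY keys, so a manually created, unencrypted equivalent would look roughly like this (placeholder values):

kubectl create secret generic aws-s3-backup \
    --namespace longhorn-system \
    --from-literal=AWS_ACCESS_KEY_ID="your-access-key" \
    --from-literal=AWS_SECRET_ACCESS_KEY="your-secret-key"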

Backups to S3 storage

Longhorn offers built-in automated backups to S3 storage.

To enable it, we need to create an S3 object storage bucket, prepare it for use as a backup target, and obtain an S3 access key and S3 secret key with access permissions to this bucket. Then, open the envs/cluster.env file and set values for the following properties:

### Longhorn backup settings
export LH_S3_BACKUP_ACCESS_KEY=""
export LH_S3_BACKUP_SECRET_KEY=""

Step 3: Installation

Note. The installation script sets the Longhorn storage class as the default and makes the oci storage class (or any other default) non-default.
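Under the hood, changing the default is done by flipping the storageclass.kubernetes.io/is-default-class annotation. A sketch of the equivalent manual commands (the script may differ in details):

kubectl patch storageclass oci -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'
kubectl patch storageclass longhorn -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'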

Once again, the installation step is pretty simple. We just have to run the correct script from the k8s-scripts package: scripts/cluster-longhorn.sh. I suggest executing flux get all -A and/or flux get hr -A before and after running the script to see the difference.

The installation script makes following changes:

  1. Installs the chart source repository
  2. Creates Longhorn UI access credentials and encrypts them using Sealed Secrets
  3. Creates helm release manifest for Longhorn in FluxCD’s git repository
  4. Changes default volume storage to longhorn and makes oci volume non-default
  5. Creates ‘/lh/’ ingress for Longhorn UI

Now, as you are ready, let's run the installation script and answer the questions as prompted. The command to run is:

./scripts/cluster-longhorn.sh

Example:

~/temp/k8s-scripts$ ./scripts/cluster-longhorn.sh 
      Adding longhorn source at https://charts.longhorn.io
/home/t/.tigase-flux/projects/cluster-name
[master 318bac5] longhorn deployment
 2 files changed, 11 insertions(+)
 create mode 100644 infra/common/sources/longhorn.yaml
Enumerating objects: 12, done.
Counting objects: 100% (12/12), done.
Delta compression using up to 16 threads
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 773 bytes | 773.00 KiB/s, done.
Total 7 (delta 3), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (3/3), completed with 3 local objects.
To https://github.com/a/cluster-name
   3e96610..318bac5  master -> master
► annotating GitRepository flux-system in flux-system namespace
✔ GitRepository annotated
◎ waiting for GitRepository reconciliation
✔ fetched revision master/318bac522f6d4a537c3cdae972014c3601f5434c
Waiting for the system to be ready
Provide Longhorn user name: xxxxx
Provide Longhorn user password: xxxxxxxxxxxxxx
   Deploying longhorn
Creating folder for longhorn-system namespace...
+ flux create helmrelease longhorn \
    --interval=3h \
    --release-name=longhorn \
    --source=HelmRepository/longhorn \
    --chart-version=1.2.4 \
    --chart=longhorn \
    --namespace=flux-system \
    --target-namespace=longhorn-system \
    --values=/home/t/.tigase-flux/envs/longhorn-values.yaml \
    --create-target-namespace \
    --depends-on=flux-system/sealed-secrets \
    --export
+ set +x
Update service kustomization
/home/t/.tigase-flux/projects/cluster-name
Update namespace kustomization
/home/t/.tigase-flux/projects/cluster-name
Update common kustomization
/home/t/.tigase-flux/projects/cluster-name
[master cc86a2a] longhorn deployment
 5 files changed, 42 insertions(+)
 create mode 100644 infra/common/longhorn-system/kustomization.yaml
 create mode 100644 infra/common/longhorn-system/longhorn/kustomization.yaml
 create mode 100644 infra/common/longhorn-system/longhorn/longhorn.yaml
 create mode 100644 infra/common/longhorn-system/namespace.yaml
Enumerating objects: 15, done.
Counting objects: 100% (15/15), done.
Delta compression using up to 16 threads
Compressing objects: 100% (11/11), done.
Writing objects: 100% (11/11), 1.25 KiB | 1.25 MiB/s, done.
Total 11 (delta 3), reused 1 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (3/3), completed with 2 local objects.
To https://github.com/a/cluster-name
   318bac5..cc86a2a  master -> master
► annotating GitRepository flux-system in flux-system namespace
✔ GitRepository annotated
◎ waiting for GitRepository reconciliation
✔ fetched revision master/cc86a2a27348cbe2a836cb18869b6ce846248e0c
Waiting for the system to be ready
2022-05-12 16:45:12 System not ready yet, waiting 20
      Making oci storage class non-default
storageclass.storage.k8s.io/oci patched
storageclass.storage.k8s.io/oci patched
t:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
secret/basic-auth created
      Temporarily disabling LH ingress
You can access LH UI using 'kubectl proxy --port 8001' and then open link in your browser:
http://localhost:8001/api/v1/namespaces/longhorn-system/services/http:longhorn-frontend:80/proxy/
Update service kustomization for infra/common/longhorn-system/longhorn 
    in /home/t/.tigase-flux/projects/cluster-name
/home/t/.tigase-flux/projects/cluster-name
[master 2e3170c] longhorn deployment
 3 files changed, 46 insertions(+)
 create mode 100644 infra/common/longhorn-system/longhorn/aws-s3-backup-credentials-sealed.yaml
 create mode 100644 infra/common/longhorn-system/longhorn/longhorn-ingress.yaml
Enumerating objects: 15, done.
Counting objects: 100% (15/15), done.
Delta compression using up to 16 threads
Compressing objects: 100% (9/9), done.
Writing objects: 100% (9/9), 2.45 KiB | 2.45 MiB/s, done.
Total 9 (delta 2), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To https://github.com/a/cluster-name
   cc86a2a..2e3170c  master -> master
► annotating GitRepository flux-system in flux-system namespace
✔ GitRepository annotated
◎ waiting for GitRepository reconciliation
✔ fetched revision master/2e3170cbb466a78220a4a7c0a1ce56ebfb1eef4a

And a quick check that it is installed and ready:

~$ flux get hr -A
NAMESPACE      	NAME          	READY	MESSAGE                         	REVISION	SUSPENDED 
cert-manager   	cert-manager  	True 	Release reconciliation succeeded	v1.8.0  	False    	
flux-system    	sealed-secrets	True 	Release reconciliation succeeded	2.1.8   	False    	
ingress-nginx  	ingress-nginx 	True 	Release reconciliation succeeded	4.1.1   	False    	
longhorn-system	longhorn      	True 	Release reconciliation succeeded	1.2.4   	False    	

At this point you have Longhorn installed and ready to use. There is no need for further tweaking and configuration unless you want to enable backups. If you plan to enable backups, make sure that LH_S3_BACKUP_ACCESS_KEY and LH_S3_BACKUP_SECRET_KEY are set correctly in ~/.tigase-flux/envs/cluster.env.
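You can also verify that longhorn is now the default storage class; it should be marked with (default) in the NAME column of the output:

kubectl get storageclass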

Step 4: Tweaking and additional configuration

There is quite a lot to configure, tweak and adjust in Longhorn from the UI; however, in this post we only focus on enabling backups.

We now have Longhorn working and ready to use on our cluster, but backups are not yet configured, and this can only be done from the Longhorn web UI.

During the installation process, the script asked you for a user name and password to access the Longhorn UI. The UI is configured to be accessible through the public IP of your cluster's load balancer under the ‘/lh/’ path. To obtain the public IP address, run the following command:

$ kubectl get ingress longhorn-ingress -o yaml -n longhorn-system
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/auth-realm: Authentication Required
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/rewrite-target: /$2
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
  creationTimestamp: "2022-05-13T19:52:57Z"
  generation: 1
  labels:
    kustomize.toolkit.fluxcd.io/name: common
    kustomize.toolkit.fluxcd.io/namespace: flux-system
  name: longhorn-ingress
  namespace: longhorn-system
  resourceVersion: "5345292"
  uid: b7cdb225-8ce6-42c3-a3c3-14cb5def63e6
spec:
  rules:
  - http:
      paths:
      - backend:
          service:
            name: longhorn-frontend
            port:
              number: 80
        path: /lh(/|$)(.*)
        pathType: Prefix
status:
  loadBalancer:
    ingress:
    - ip: 123.124.128.124

At the very end of the command output you can find the IP address you are looking for. You can now enter the following address in your web browser to get to the LH UI:

http://123.124.128.124/lh/
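Alternatively, a jsonpath query (standard kubectl syntax) extracts just the IP address without scanning the full YAML:

kubectl get ingress longhorn-ingress -n longhorn-system \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'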

When you try to open the page, your web browser should ask you for the user name and password you provided at installation time.
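You can also test the basic-auth gate from a terminal with curl; the -u flag supplies the credentials (the IP below is the placeholder from the example above):

curl -u your-user:your-password http://123.124.128.124/lh/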

There is also another way to access the LH UI, without a password but even more secure. Run the following command in a terminal:

$ kubectl proxy --port 8001

which opens a proxy channel to your kubernetes cluster.

and then open the following URL in your web browser:

http://localhost:8001/api/v1/namespaces/longhorn-system/services/http:longhorn-frontend:80/proxy/

This opens a website from your kubernetes cluster through the proxy you created above.

In either case, you get the same web UI:

[screenshot: Longhorn dashboard]

This is the standard dashboard for a new LH installation with no volumes; the storage size depends on your cluster configuration and the number of nodes in your cluster.

The first thing to do now is to finish setting up backups. Go to the Settings/General menu (Longhorn ver. 1.2.4) and scroll down to the Backup section:

[screenshot: Longhorn General settings, Backup section]

  • Backup Target is an s3 URI to your S3 storage bucket. It should look like: ‘s3://s3-bucket-name@aws-region-id/optional-path/’. The optional path is useful if you plan to store backups from different services in the same bucket.
  • Backup Target Credential Secret is the name of the k8s secret object holding the S3 access credentials, the access key and secret key you entered in the envs/cluster.env file for the LH_S3_BACKUP_ACCESS_KEY and LH_S3_BACKUP_SECRET_KEY properties. You just need to enter ‘aws-s3-backup’ there.

If you look at the Backup tab right now you will find it empty. However, the Recurring Job tab shows one entry if you had LH_S3_BACKUP_ACCESS_KEY set to some value while executing the installation script. This is because the script automatically installs a job for daily backups.

[screenshot: Recurring Job tab]

If you open/edit the recurring job you should see something like this:

[screenshot: recurring job edit dialog]

  • Name - the name of the job, here ‘backup-daily-4-7’
  • Task - the type of the recurring job. It is set to ‘backup’; another possible option is ‘snapshot’.
  • Retain - the number of copies preserved or, put another way, how long backups are stored. 30 for daily backups means we keep a month's worth of historical data, the last 30 days.
  • Concurrency - how many jobs can run concurrently. Let's say we have more than one volume to back up; with higher concurrency we can run backups for several volumes at the same time.
  • Cron - this is just a crontab definition. It specifies when the job is run; here it says: run it daily at 4:07 AM. More details on crontab.
  • Groups - we can group our volumes into separate… groups, and a job can be run for a specific group or groups only. The ‘default’ group applies to any volume in the default group or to a volume not assigned to any group; therefore this job executes by default for every new volume not assigned to any group. An equivalent manifest is sketched below.
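For reference, the same job can be expressed as a Longhorn RecurringJob manifest. This is a sketch assuming Longhorn v1.2's longhorn.io/v1beta1 API; the concurrency value is a placeholder:

apiVersion: longhorn.io/v1beta1
kind: RecurringJob
metadata:
  name: backup-daily-4-7
  namespace: longhorn-system
spec:
  task: backup            # full backup to S3, not just a snapshot
  cron: "7 4 * * *"       # daily at 4:07 AM
  retain: 30              # keep the last 30 backups
  concurrency: 2          # placeholder: volumes backed up in parallel
  groups:
  - default               # volumes not assigned to any group
  labels: {}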

You can create a test volume, attach it to a cluster node, and wait for a backup to be executed to see if it all works (see the sketch below). If you do not want to wait until 4:07 AM, you can either create your own recurring job that runs more frequently or at a different time, or run a manual backup on the volume.
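A minimal way to create such a test volume is a PVC using the longhorn storage class (a sketch; the claim name is arbitrary):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-test-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 1Gi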

Otherwise, just deploy some services or apps that create volumes and watch it in real-world use.

Uninstallation

To uninstall Longhorn, please follow the instructions for Sealed Secrets uninstallation described in K8s on Oracle Cloud “Part 5”.

Note. You will not be able to remove/uninstall Longhorn if any services running on the cluster use Longhorn volumes. You will end up with a conflict, and cluster reconciliation will be suspended until the services which use Longhorn volumes are uninstalled.
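To check whether anything on the cluster still uses Longhorn volumes before uninstalling, list PVCs across all namespaces and look at the STORAGECLASS column:

kubectl get pvc -A | grep longhorn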

Another note. Removing Longhorn this way, that is, by removing the Longhorn manifest and config data from the git repository, seems to work: Longhorn pods disappear from the cluster. However, a subsequent attempt to deploy Longhorn fails; the cluster gets stuck in “kustomization/common” reconciliation forever. So far I have not been able to resolve this issue. Any ideas or suggestions are very welcome.