K8s on Oracle Cloud [Part 7]: Setting up cert-manager
cert-manager should be installed as one of the first services, as many other services depend on it and may fail to deploy correctly if it is missing.
To speed the process up, add automation, and make sure the entire installation can be easily replayed, we use a set of scripts available in the GitHub repository: k8s-scripts. While there is some documentation for the scripts and you can look into the scripts' source code for more details, this guide expands on them, explaining various options and suggesting optimal settings.
Personal notes: my notes on how to set this up, to make it easier to repeat next time.
Step 1: Prerequisites
- K8s on Oracle Cloud “Part 4” is completed.
- K8s on Oracle Cloud “Part 6” and everything before it. Although not all of those services are used by cert-manager, they will all be needed later on.
- cmctl installed. This is optional, but the command line tool is very useful for checking and diagnosing the cert-manager deployment. You can skip it, but I highly recommend getting it and using it.
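One way to get cmctl is from the cert-manager GitHub releases page. The sketch below only constructs and prints the download URL; the version and architecture are assumptions, so match them to your cert-manager version and platform before running the commented download step:

```shell
# Hedged sketch: build the cmctl download URL from the cert-manager releases page.
# CMCTL_VER and ARCH are assumptions; adjust them for your setup.
CMCTL_VER="v1.8.0"
OS="$(uname -s | tr '[:upper:]' '[:lower:]')"   # linux, darwin, ...
ARCH="amd64"
URL="https://github.com/cert-manager/cert-manager/releases/download/${CMCTL_VER}/cmctl-${OS}-${ARCH}.tar.gz"
echo "$URL"
# To download and install:
#   curl -fsSL "$URL" | tar xz cmctl && sudo mv cmctl /usr/local/bin/
```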
Step 2: Configuration
Please read the Letsencrypt section carefully as it is critical for correctly setting up cert-manager.
Version adjustment
k8s-scripts define versions for services which were up to date and tested at the time the project was last updated by its developers. These versions may become outdated over time, or perhaps you need/want to use a very specific version of the package.
To adjust the cert-manager package version, look at the ~/.tigase-flux/envs/versions.env file and change the value of the CM_VER property:
# Cert-Manager
CM_VER="1.8.0"
To check the latest available version of the package, run the command:
helm search hub --max-col-width 65 cert-manager | grep "URL\|cert-manager/cert-manager "
Example:
~$ helm search hub --max-col-width 65 cert-manager | grep "URL\|cert-manager/cert-manager "
URL CHART VERSION APP VERSION DESCRIPTION
https://artifacthub.io/packages/helm/cert-manager/cert-manager 1.8.0 v1.8.0 A Helm chart for cert-manager
Adjust the versions.env file with the latest CHART VERSION.
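Bumping CM_VER can be scripted, for example with sed. This is a sketch: the file path comes from this guide, and the snippet falls back to a local sample file so it can be tried standalone:

```shell
# Update CM_VER in versions.env.
# If the real file (~/.tigase-flux/envs/versions.env, per this guide) is absent,
# create a local sample so the snippet is self-contained.
VERSIONS_FILE="${VERSIONS_FILE:-$HOME/.tigase-flux/envs/versions.env}"
[ -f "$VERSIONS_FILE" ] || { VERSIONS_FILE="./versions.env"; printf '# Cert-Manager\nCM_VER="1.7.2"\n' > "$VERSIONS_FILE"; }
NEW_VER="1.8.0"
sed -i.bak "s/^CM_VER=.*/CM_VER=\"${NEW_VER}\"/" "$VERSIONS_FILE"
grep '^CM_VER=' "$VERSIONS_FILE"
```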
Letsencrypt settings
There are 3 settings affecting the cert-manager deployment. One is critical to set correctly, and the other 2 are important to leave as they are:
SSL_EMAIL="EMAIL_FOR_LETSENCRYPT"
SSL_STAG_ISSUER="letsencrypt-staging"
SSL_PROD_ISSUER="letsencrypt"
ROUTE53_ACCESS_KEY=""
ROUTE53_SECRET_KEY=""
- SSL_EMAIL - must be set to a correct and working email address. Technically, the email is not verified, so anything email-like would work, but Letsencrypt is going to send notifications about certificate expiration, renewal and possibly others. I do not know how Letsencrypt would handle emails that bounce back with an unknown recipient. Maybe they would stop issuing certificates? It is better to set it to a correct, working email address.
- SSL_STAG_ISSUER and SSL_PROD_ISSUER - these are just certificate issuer identifiers for your k8s cluster. In theory they can be set to anything. k8s-scripts for other services may have the certificate issuer hardcoded to the default values, so it is better to leave these as they are unless you really need to change them. If changed, make sure other services use the correct issuer as well.
- ROUTE53_ACCESS_KEY, ROUTE53_SECRET_KEY - are used to configure the DNS01 challenge issuer. This is used when the DNS domain is hosted on AWS Route53. The DNS01 challenge issuer is more flexible than HTTP01, as domains and hostnames do not have to point to the cluster's IP address to obtain SSL certificates. To create an AWS user with the correct access credentials:
  - Go to IAM -> Add users
  - Set a User name: route53-man
  - Select AWS credential type: Programmatic access
  - Set permissions: Attach existing policies directly
  - Create policy named route53-man with the following JSON:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": "route53:GetChange",
          "Resource": "arn:aws:route53:::change/*"
        },
        {
          "Effect": "Allow",
          "Action": "route53:ChangeResourceRecordSets",
          "Resource": "arn:aws:route53:::hostedzone/*"
        },
        {
          "Effect": "Allow",
          "Action": "route53:ListHostedZonesByName",
          "Resource": "*"
        }
      ]
    }

  - Attach the new policy to the user
  - Copy the user's Access key ID and Secret access key to envs/cluster.env
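With the keys in place, a DNS01 issuer for Route53 looks roughly like the sketch below. The issuer name, region, and secret name here are assumptions for illustration; the manifest actually generated by k8s-scripts may differ:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns            # assumed name, not necessarily what k8s-scripts generates
spec:
  acme:
    email: EMAIL_FOR_LETSENCRYPT
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-dns
    solvers:
      - dns01:
          route53:
            region: us-east-1      # assumed region
            accessKeyID: ROUTE53_ACCESS_KEY_VALUE
            secretAccessKeySecretRef:
              name: route53-secret # assumed secret holding ROUTE53_SECRET_KEY
              key: secret-access-key
```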
There is not much custom configuration added by default.
installCRDs: true
prometheus:
# set to true if you want to enable prometheus monitoring for cert-manager
enabled: false
- installCRDs - must be set to true or cert-manager will not work correctly. It needs several custom resources to be installed. If this property is set to true, they are all installed automatically; otherwise you would need to install them manually.
- prometheus - is set to false by default because cert-manager is installed as one of the first services, before monitoring and prometheus, so prometheus is not yet available at this point. If prometheus is already installed, or once it is installed later, this setting can be changed to true.
On top of this the cert-manager installation script adds additional custom files with certificate issuer configuration, which is set to obtain certificates from Letsencrypt service.
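For reference, judging by the ClusterIssuer details shown by kubectl describe later in this guide, the generated staging issuer manifest should look roughly like this (a sketch, not the literal file produced by k8s-scripts):

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    email: EMAIL_FOR_LETSENCRYPT
    # staging endpoint; the production issuer uses
    # https://acme-v02.api.letsencrypt.org/directory instead
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
      - http01:
          ingress:
            class: nginx
```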
Step 3: Installation
Once all the settings are adjusted and ready, the installation step is pretty simple. We just have to run the correct script from the k8s-scripts package: scripts/cluster-cert-manager.sh. I suggest executing flux get all -A and/or flux get hr -A before and after running the script to see the difference.
The installation script makes the following changes:
- Installs the chart source repository
- Creates a Helm release manifest for cert-manager in the FluxCD git repository
- Installs 2 certificate issuers:
  - letsencrypt-staging - for test certificates
  - letsencrypt - for production certificates
Optionally, check the cert-manager installation using the cmctl tool:
~/temp/k8s-scripts$ cmctl check api
Not ready: the cert-manager CRDs are not yet installed on the Kubernetes API server
And this is correct because we have not installed cert-manager yet.
Now is the time to run the installation script:
~/temp/k8s-scripts$ ./scripts/cluster-cert-manager.sh
Adding cert-manager source at https://charts.jetstack.io
/home/t/.tigase-flux/projects/cluster-name
[master 8fbc46c] cert-manager deployment
2 files changed, 11 insertions(+)
create mode 100644 infra/common/sources/cert-manager.yaml
Enumerating objects: 12, done.
Counting objects: 100% (12/12), done.
Delta compression using up to 16 threads
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 891 bytes | 891.00 KiB/s, done.
Total 7 (delta 2), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To https://github.com/a/cluster-name
491aa23..8fbc46c master -> master
► annotating GitRepository flux-system in flux-system namespace
✔ GitRepository annotated
◎ waiting for GitRepository reconciliation
✔ fetched revision master/8fbc46cd7da63164b98d9fc3fbb743df914569ee
Waiting for the system to be ready
Deploying cert-manager
Creating folder for cert-manager namespace...
Update service kustomization
/home/t/.tigase-flux/projects/cluster-name
Update namespace kustomization
/home/t/.tigase-flux/projects/cluster-name
Update common kustomization
/home/t/.tigase-flux/projects/cluster-name
[master 0781bce] cert-manager deployment
5 files changed, 42 insertions(+)
create mode 100644 infra/common/cert-manager/cert-manager/cert-manager.yaml
create mode 100644 infra/common/cert-manager/cert-manager/kustomization.yaml
create mode 100644 infra/common/cert-manager/kustomization.yaml
create mode 100644 infra/common/cert-manager/namespace.yaml
Enumerating objects: 14, done.
Counting objects: 100% (14/14), done.
Delta compression using up to 16 threads
Compressing objects: 100% (10/10), done.
Writing objects: 100% (11/11), 1.39 KiB | 1.39 MiB/s, done.
Total 11 (delta 1), reused 2 (delta 1), pack-reused 0
remote: Resolving deltas: 100% (1/1), done.
To https://github.com/a/cluster-name
8fbc46c..0781bce master -> master
► annotating GitRepository flux-system in flux-system namespace
✔ GitRepository annotated
◎ waiting for GitRepository reconciliation
✔ fetched revision master/0781bce54c1913f4348157a58ff5c7066f5cb42e
Waiting for the system to be ready
Update service kustomization for infra/common/cert-manager/cert-manager
in /home/t/.tigase-flux/projects/cluster-name
/home/t/.tigase-flux/projects/cluster-name
[master 3e96610] cert-manager deployment
3 files changed, 30 insertions(+)
create mode 100644 infra/common/cert-manager/cert-manager/issuer-production.yaml
create mode 100644 infra/common/cert-manager/cert-manager/issuer-staging.yaml
Enumerating objects: 14, done.
Counting objects: 100% (14/14), done.
Delta compression using up to 16 threads
Compressing objects: 100% (9/9), done.
Writing objects: 100% (9/9), 1.12 KiB | 1.12 MiB/s, done.
Total 9 (delta 2), reused 1 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (2/2), completed with 1 local object.
To https://github.com/a/cluster-name
0781bce..3e96610 master -> master
► annotating GitRepository flux-system in flux-system namespace
✔ GitRepository annotated
◎ waiting for GitRepository reconciliation
✔ fetched revision master/3e96610f70cae2163eb4c410043a7a246473b0c1
Seems to be successful; let's check it out:
/.tigase-flux$ flux get hr -A
NAMESPACE NAME READY MESSAGE REVISION SUSPENDED
cert-manager cert-manager True Release reconciliation succeeded v1.8.0 False
flux-system sealed-secrets True Release reconciliation succeeded 2.1.8 False
ingress-nginx ingress-nginx True Release reconciliation succeeded 4.1.1 False
and kubectl:
~/temp/k8s-scripts$ kubectl get pods -n cert-manager
NAME READY STATUS RESTARTS AGE
cert-manager-789bb474bd-n2fsm 1/1 Running 0 4m5s
cert-manager-cainjector-6bc9d758b-gb48s 1/1 Running 0 4m5s
cert-manager-webhook-586d45d5ff-z7t8d 1/1 Running 0 4m5s
and cert-manager API:
~/temp/k8s-scripts$ cmctl check api
The cert-manager API is ready
Let’s get more details:
~/temp/k8s-scripts$ kubectl describe deployment cert-manager -n cert-manager
Name: cert-manager
Namespace: cert-manager
CreationTimestamp: Wed, 11 May 2022 16:53:30 -0700
Labels: app=cert-manager
app.kubernetes.io/component=controller
app.kubernetes.io/instance=cert-manager
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=cert-manager
app.kubernetes.io/version=v1.8.0
helm.sh/chart=cert-manager-v1.8.0
helm.toolkit.fluxcd.io/name=cert-manager
helm.toolkit.fluxcd.io/namespace=cert-manager
Annotations: deployment.kubernetes.io/revision: 1
meta.helm.sh/release-name: cert-manager
meta.helm.sh/release-namespace: cert-manager
Selector: app.kubernetes.io/component=controller,
app.kubernetes.io/instance=cert-manager,
app.kubernetes.io/name=cert-manager
Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=cert-manager
app.kubernetes.io/component=controller
app.kubernetes.io/instance=cert-manager
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=cert-manager
app.kubernetes.io/version=v1.8.0
helm.sh/chart=cert-manager-v1.8.0
Service Account: cert-manager
Containers:
cert-manager:
Image: quay.io/jetstack/cert-manager-controller:v1.8.0
Port: 9402/TCP
Host Port: 0/TCP
Args:
--v=2
--cluster-resource-namespace=$(POD_NAMESPACE)
--leader-election-namespace=kube-system
Environment:
POD_NAMESPACE: (v1:metadata.namespace)
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: <none>
NewReplicaSet: cert-manager-789bb474bd (1/1 replicas created)
Events: <none>
More important and interesting, perhaps, is the certificate issuer information:
~/temp/k8s-scripts$ kubectl get clusterissuer -A
NAME READY AGE
letsencrypt True 20h
letsencrypt-staging True 20h
This gives us a list of all certificate issuers available on the cluster.
Now let's see detailed information about a specific issuer:
~/temp/k8s-scripts$ kubectl describe clusterissuer letsencrypt
Name: letsencrypt
Namespace:
Labels: kustomize.toolkit.fluxcd.io/name=common
kustomize.toolkit.fluxcd.io/namespace=flux-system
Annotations: <none>
API Version: cert-manager.io/v1
Kind: ClusterIssuer
Metadata:
Creation Timestamp: 2022-05-11T23:53:57Z
Generation: 1
Managed Fields:
API Version: cert-manager.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:labels:
f:kustomize.toolkit.fluxcd.io/name:
f:kustomize.toolkit.fluxcd.io/namespace:
f:spec:
f:acme:
f:email:
f:privateKeySecretRef:
f:name:
f:server:
f:solvers:
Manager: kustomize-controller
Operation: Apply
Time: 2022-05-11T23:53:57Z
API Version: cert-manager.io/v1
Fields Type: FieldsV1
fieldsV1:
f:status:
.:
f:acme:
.:
f:lastRegisteredEmail:
f:uri:
f:conditions:
.:
k:{"type":"Ready"}:
.:
f:lastTransitionTime:
f:message:
f:observedGeneration:
f:reason:
f:status:
f:type:
Manager: cert-manager-clusterissuers
Operation: Update
Subresource: status
Time: 2022-05-11T23:53:57Z
Resource Version: 4451893
UID: f9f88019-fadb-48b3-8368-46b9cc4b09a8
Spec:
Acme:
Email: cluster-name@domain.com
Preferred Chain:
Private Key Secret Ref:
Name: letsencrypt
Server: https://acme-v02.api.letsencrypt.org/directory
Solvers:
http01:
Ingress:
Class: nginx
Status:
Acme:
Last Registered Email: cluster-name@domain.com
Uri: https://acme-v02.api.letsencrypt.org/acme/acct/539454786
Conditions:
Last Transition Time: 2022-05-11T23:53:57Z
Message: The ACME account was registered with the ACME server
Observed Generation: 1
Reason: ACMEAccountRegistered
Status: True
Type: Ready
Events: <none>
It seems everything is ready and waiting.
Testing and verification that it really works
Ok, now everything seems to be working. Let’s check it out.
First, we need a domain with DNS configured to point to our cluster load balancer, the one used by ingress. Yes, this is why we need the ingress service installed.
Let’s say we want to use the very rare and unique domain called example.com. I know, very unusual. Anyway. To make things more interesting, we want to obtain this certificate for a few subdomains as well: www.example.com and mail.example.com. Make sure all the domains and hostnames are configured and resolve correctly to the ingress LB IP address:
~$ host example.com
example.com has address 123.456.789.123
Now, we have to create a manifest file ~/cert-man-test.yaml with the certificate request for cert-manager on our cluster:
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: example-com
namespace: default
spec:
secretName: example-com-tls
issuerRef:
name: letsencrypt-staging
kind: ClusterIssuer
commonName: example.com
dnsNames:
- example.com
- www.example.com
- mail.example.com
Let’s load the file into the cluster and see what happens:
~$ kubectl apply -f ~/cert-man-test.yaml
certificate.cert-manager.io/example-com created
Now we can check the status of our request:
~$ kubectl get certificaterequest -A
NAMESPACE NAME APPROVED DENIED READY ISSUER REQUESTOR AGE
default example-com-2lspx True True letsencrypt-staging system:serviceaccount:cert-manager:cert-manager 21m
default example-com-kjcrg True False letsencrypt-staging system:serviceaccount:cert-manager:cert-manager 12m
It usually takes a while, from a few seconds up to a few minutes, to receive a certificate. As we can see above, there are 2 requests: one is successful, the second is still not ready. To get more details on what is going on with the one where READY=False, we can check our orders:
~$ kubectl get order -A
NAMESPACE NAME STATE AGE
default example-com-2lspx-473502887 valid 22m
default example-com-kjcrg-1627368428 pending 13m
We see one order is still pending. This most likely happens when one of the listed domains or hostnames does not have a proper DNS configuration, or the DNS change we made has not yet propagated on the Internet.
You can check the order details for more information:
~$ kubectl describe order example-com-kjcrg-1627368428
...
Once the certificate request is successful, you can edit the ~/cert-man-test.yaml file and replace ‘letsencrypt-staging’ with ‘letsencrypt’ to check whether production certificates are generated correctly.
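You can also inspect the certificate cert-manager stored in the secret by decoding it with openssl. The kubectl command in the comment is a sketch assuming the names from the test manifest above; for illustration, the runnable part generates a throwaway self-signed certificate and decodes it the same way:

```shell
# On the cluster (assumed names from ~/cert-man-test.yaml):
#   kubectl get secret example-com-tls -n default -o jsonpath='{.data.tls\.crt}' \
#     | base64 -d > tls.crt
# Throwaway self-signed cert, just to demonstrate the decode step:
openssl req -x509 -newkey rsa:2048 -nodes -keyout /dev/null \
  -subj "/CN=example.com" -days 1 -out tls.crt 2>/dev/null
# Show the certificate's subject, issuer and validity window:
openssl x509 -noout -subject -issuer -dates -in tls.crt
```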
Uninstallation
To uninstall cert-manager, run the installation script with the --remove parameter:
~/temp/k8s-scripts$ ./scripts/cluster-cert-manager.sh --remove
Switched to context "cluster-name".
Preparing to remove: cert-manager
Removing: infra/common/cert-manager
Update service kustomization for infra/common/ in /home/k/.tigase-flux/projects/cluster-name
/home/k/.tigase-flux/projects/cluster-name
Removing: infra/common/sources/cert-manager.yaml
Update service kustomization for infra/common/sources in /home/k/.tigase-flux/projects/cluster-name
/home/k/.tigase-flux/projects/cluster-name
[master c61d6e8] Removing cert-manager deployment
11 files changed, 121 deletions(-)
delete mode 100644 infra/common/cert-manager/cert-manager/cert-manager.yaml
delete mode 100644 infra/common/cert-manager/cert-manager/issuer-production-dns.yaml
delete mode 100644 infra/common/cert-manager/cert-manager/issuer-production.yaml
delete mode 100644 infra/common/cert-manager/cert-manager/issuer-staging.yaml
delete mode 100644 infra/common/cert-manager/cert-manager/kustomization.yaml
delete mode 100644 infra/common/cert-manager/cert-manager/route53-secret.yaml
delete mode 100644 infra/common/cert-manager/kustomization.yaml
delete mode 100644 infra/common/cert-manager/namespace.yaml
delete mode 100644 infra/common/sources/cert-manager.yaml
Enumerating objects: 12, done.
Counting objects: 100% (12/12), done.
Delta compression using up to 16 threads
Compressing objects: 100% (6/6), done.
Writing objects: 100% (7/7), 726 bytes | 726.00 KiB/s, done.
Total 7 (delta 3), reused 1 (delta 1), pack-reused 0
remote: Resolving deltas: 100% (3/3), completed with 3 local objects.
To https://github.com/a/cluster-name
ce8f2e5..c61d6e8 master -> master
► annotating GitRepository flux-system in flux-system namespace
✔ GitRepository annotated
◎ waiting for GitRepository reconciliation
✔ fetched revision master/c61d6e8f44b8bc3965929ed014b2d78077512d0e
Note: if you have services which depend on cert-manager, the uninstallation may fail, or it may succeed but leave the cluster unstable.