Discovering Kubernetes
This website has historically been hosted on a VPS costing me about $10 a month. It's a static blog behind an Nginx server with Let's Encrypt certificates. A VPS, which is basically a full-fledged Linux virtual machine, is a bit much for static pages, since you can get cheaper hosting for those (e.g. for free on GitHub), but I like the flexibility it provides. Kubernetes makes even less sense, since I can already manage and update my website easily, and it doesn't need to scale.
But I wanted to try anyway, at least as a learning experience. In this post, I will explain how I migrated this classic VPS deployment to Kubernetes.
The container
Kubernetes runs containers, so we need to containerize the website first.
The following snippet is the `Dockerfile`. It uses two stages: one to generate the static pages, the other to serve those static pages with Nginx on port 80 (no HTTPS here; we'll let the Kubernetes cluster handle that).
```dockerfile
# Stage 1: build the static pages with Lektor (needs Python and Node.js).
FROM debian:10 AS build
RUN apt-get update \
 && apt-get install -y \
        python3-pip \
        curl
RUN curl -sL https://deb.nodesource.com/setup_12.x | bash -
RUN apt-get update \
 && apt-get install -y nodejs
RUN pip3 install lektor
WORKDIR /app
COPY src src
WORKDIR src
RUN cd webpack && npm install --package-lock-only
RUN lektor build \
        --extra-flag webpack \
        --buildstate-path /tmp/lektor \
        --output-path /app/html
RUN find /app/html

# Stage 2: serve the generated pages with Nginx.
FROM nginx
EXPOSE 80
COPY --from=build /app/html/ /usr/share/nginx/html
```
I use Lektor for this blog but you can apply this approach to any other static site generator. You could even run a dynamic website in such a container.
You can test the container with the following commands:
```sh
docker build --tag web .
docker run --publish 8000:80 web
```
You can then visit http://localhost:8000 in your browser.
That was the easy part. The next sections will tackle more complex issues.
The cluster
We expect the following from Kubernetes:
- Run the container on a virtual machine.
- Expose it on a public IP address on ports 80 and 443.
- Get certificates from Let's Encrypt.
Running the container
Before Kubernetes can run the container, it needs to download the container image.
I created a private container registry at DigitalOcean and uploaded my image to it:
```sh
docker tag web registry.digitalocean.com/<registry_name>/web
docker push registry.digitalocean.com/<registry_name>/web
```
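If the push fails because Docker isn't authenticated against the registry, DigitalOcean's `doctl` CLI can configure the credentials for you (assuming you have `doctl` installed and authorized):

```sh
doctl registry login
```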
Ideally, you would use proper tags to control what version of your image will be deployed, but we'll just rely on the implicit `latest` tag in this post.
Now that you have a registry, you will need a Kubernetes cluster. I used DigitalOcean but there are other options. For the purpose of this tutorial, it is enough to configure it with just one node (e.g. $10 per month).
Once it is running, you must configure `kubectl` to authenticate against it and be able to run commands such as the following:
```console
$ kubectl get nodes
NAME        STATUS   ROLES    AGE    VERSION
foo-3jr7e   Ready    <none>   5d6h   v1.16.8
```
Then, configure the `default` service account to use the correct credentials for your registry:
```sh
kubectl create secret generic regcred \
    --from-file=.dockerconfigjson=<path_to_docker_config_json> \
    --type=kubernetes.io/dockerconfigjson

kubectl patch serviceaccount default \
    --patch '{"imagePullSecrets": [{"name": "regcred"}]}'
```
You can download the `docker-config.json` from DigitalOcean, preferably the read-only version, since your default service account won't need to push images.
There are other ways to configure authentication to the container registry but I find this one to be practical.
From now on, I'll start showing you YAML files to configure the cluster. You can apply them individually with `kubectl apply --filename <yaml_file>`, but what I usually do is put them all in a directory and run `kubectl apply --recursive --filename <directory>` every time I add a new one or change an existing one. It might not be the best way to deploy to Kubernetes, but it sure is simple.
This first YAML file will make Kubernetes download the image and run it in a container:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  labels:
    app: web
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: main
          image: registry.digitalocean.com/<registry_name>/web
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 80
```
Don't worry, it is simpler than it looks:
- `kind: Deployment` says that we're defining a `Deployment` resource. This is a practical resource for deploying pods. You could define `Pod` objects directly but they wouldn't survive events such as node failures. A `Deployment` combines `Pod` objects with a `ReplicaSet` to ensure your pods remain in the desired state.
- `metadata: ...` defines a name and a label for the deployment. It will be used in other YAML files to refer to this one.
- `replicas: 1` sets to one the number of pods that should be running. If anything causes the pod to die, it will be recreated so that there is always one pod running.
- `selector: ...` designates the pods we want to run.
- `template: ...` specifies how the pods should be created. This includes the image the container should be instantiated from and what network port to expose (`80` here).
You will find a lot more details in the Kubernetes documentation for deployments.
This is not the only way to deploy containers but it is simple enough for our test.
Once you've run `kubectl apply`, you should be able to see your pod in the list:
```console
$ kubectl get pods
NAME                   READY   STATUS    RESTARTS   AGE
web-58746c97cd-v96kb   1/1     Running   0          5d7h
```
You can try to change the number of replicas in the configuration, run `kubectl apply` again, and see what happens to the list of pods.
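For a quick experiment, you can also change the replica count without editing the YAML file; note that the next `kubectl apply` will reset it to the value in the file:

```sh
kubectl scale deployment web --replicas=3
```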
The pod isn't accessible from outside the cluster but you can access it from your computer with port forwarding:
```console
$ kubectl port-forward web-58746c97cd-v96kb 8000:80
Forwarding from 127.0.0.1:8000 -> 80
Forwarding from [::1]:8000 -> 80
```
Then, visit http://localhost:8000 with your browser and you should see your website.
So, that wasn't so difficult, was it? You'd better get used to all this YAML because a lot more is to come.
Note: While `kubectl` can do everything, I find it useful to also use the Kubernetes Dashboard or k9s for ease of use and better visibility on the pods. For instance, they make it quite easy to inspect the logs of containers when you don't know their names.
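For the record, `kubectl` can also do this without an exact pod name by referencing the deployment instead (it then picks one of the matching pods):

```sh
kubectl logs deployment/web
```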
Exposing the container to the outside: Basics
The Service
There are many ways to expose a pod to outside the cluster, but a common first step is to define a service:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP
  selector:
    app: web
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
```
You can see that we're reusing the `app: web` selector defined before. This configuration will make the website available to other pods inside the cluster under the name "web". That's nice because your pods could change names or IP addresses, and services abstract away those details. The service also acts as a load balancer internally.
You can see your services with `kubectl get services`.
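If you want to check that the internal name resolves, one way is a throwaway pod (the `curlimages/curl` image is just one convenient choice):

```sh
kubectl run test-curl --rm -it --restart=Never \
    --image=curlimages/curl -- curl -s http://web
```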
The service is internal because of the `type: ClusterIP` attribute. The official documentation shows two other types that will enable you to expose it to the outside:
- `NodePort` exposes the service directly on the node, on a port higher than 30000 (chosen or random).
- `LoadBalancer` tells the cloud provider (DigitalOcean in our case) to use a load balancer to redirect to the service. You can choose any port.
`NodePort` isn't suited to our use case on its own:
- It requires the node to have a public IP address.
- You can't easily listen on ports 80 and 443 (port range constraints and potential port conflicts).
- The service will change IP addresses if the pod is rescheduled on a different node.
In summary, you could probably make it work for your blog but it wouldn't be easy.
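For completeness, switching to `NodePort` would only require changing the service's `spec` like this (a sketch, not the route we'll take; the `nodePort` value is arbitrary within the allowed range):

```yaml
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
      nodePort: 30080  # must be in the default 30000-32767 range
```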
The Load Balancer
The `LoadBalancer` option seems more promising. It costs an additional $10 per month but doesn't have the problems of the `NodePort`.
Actually, we'll need to use a fancy Kubernetes mechanism to make this work, but before that, I'll explain why DigitalOcean's load balancer alone is not enough:
- It requires a few DigitalOcean-specific annotations to enable TLS.
- Certificate management cannot easily be automated unless you let DigitalOcean manage your DNS.
Before we leave this section, note that you can determine the IP address associated with your load balancer with `kubectl get services`, in the `EXTERNAL-IP` column.
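The output would look something like this (illustrative values, not from a real cluster):

```console
$ kubectl get services
NAME   TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)        AGE
web    LoadBalancer   10.245.12.34   203.0.113.10   80:31363/TCP   5d
```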
Exposing the container to the outside: Advanced
The Ingress
An Ingress is a Kubernetes resource that abstracts away several reverse proxy features like load balancing and HTTP routing. This feature is still in beta but it's already quite popular. Unfortunately, this is also where things become complicated, so you are likely to encounter some bumps before you get it to work.
There are two components to configure: the Ingress resource and the Ingress Controller. In this subsection, we'll focus on the Ingress resource.
Before we add an ingress, ensure that the type of your service is `ClusterIP` (in case you had changed it to `LoadBalancer` or anything else).
Now, let's add the following resource:
```yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: ingress-web
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/issuer: "letsencrypt-staging"
spec:
  rules:
    - host: <domain_name>
      http:
        paths:
          - path: /
            backend:
              serviceName: web
              servicePort: 80
  tls:
    - hosts:
        - <domain_name>
      secretName: <secret_name>
```
There are quite a few new things here:
- The annotation `kubernetes.io/ingress.class: "nginx"` tells the ingress controller to check this ingress resource. But it does nothing for now since we haven't installed a controller yet.
- The annotation `cert-manager.io/issuer: "letsencrypt-staging"` does the same thing for the certificate manager, which I will cover in a later section.
- The rule starting with `host: <domain_name>` describes a route from a domain name to a service, here the service `web` listening on port `80` that we created earlier. You can have several rules like that (see the sketch after this list), which is probably the main feature of ingress resources.
- The `tls` section enables HTTPS. It defines the secret to use for storing the certificate (and its private key) and to which domains that secret applies.
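Here is what routing two domains to two services from a single ingress could look like (a hypothetical sketch: the domain names and the `api` service are made up for illustration):

```yaml
spec:
  rules:
    - host: blog.example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: web
              servicePort: 80
    - host: api.example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: api
              servicePort: 8080
```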
The Ingress Controller
The controller is a set of Kubernetes resources that will pick up our ingress declaration from the previous section and expose it accordingly.
I used the NGINX Ingress Controller but other ingress controllers exist.
To install it, I recommend following the installation guide. For me, it consisted in applying a DigitalOcean-specific YAML file with `kubectl`.
At this point, your website should be accessible via HTTPS and present a self-signed certificate.
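You can verify this from the command line; the `--insecure` flag tells curl to accept the self-signed certificate:

```sh
curl --insecure --head https://<domain_name>
```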
The Certificate Manager
To make the website accessible to everyone, we need a valid certificate. It has to be obtained from a certificate authority, provided to NGINX, and then renewed regularly. Cert-manager can do all of that automatically.
As with the ingress controller, you'll need to install the component by downloading and applying a YAML file. See the installation instructions for more details.
Next, you'll need to configure it. Below is an `Issuer` resource configured to use the staging instance of Let's Encrypt:
```yaml
apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: <user_email_address>
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
      - http01:
          ingress:
            class: nginx
```
I suggest using staging here because it lets you verify that the configuration is correct before switching to production, which can get you banned if you make too many requests.
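For reference, a production `Issuer` would be almost identical; only the name and the ACME server URL change (the `letsencrypt-prod` name is my choice and must match the `cert-manager.io/issuer` annotation on the ingress):

```yaml
apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: <user_email_address>
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
```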
There's a tweak you'll need on DigitalOcean. Update the annotations of the NGINX Ingress service in the YAML file you downloaded in the previous section so that they look like the following:
```yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/do-loadbalancer-enable-proxy-protocol: 'true'
    service.beta.kubernetes.io/do-loadbalancer-hostname: '<domain>'
```
The `do-loadbalancer-hostname` annotation makes it possible for your containers to access services via the load balancer. This is needed because cert-manager performs a local test before contacting Let's Encrypt, and that test will fail if you don't configure the load balancer that way.
It turns out to be a simple fix, but it took me a long time to figure out the first time. See DigitalOcean's documentation for more information.
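While waiting, you can watch cert-manager's progress; it creates a `Certificate` resource, typically named after the `secretName` from the ingress (an assumption on my part; check the actual name with the first command):

```sh
kubectl get certificates
kubectl describe certificate <secret_name>
```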
The website should soon get a valid certificate and be ready for visitors. Congratulations if you've gone this far; it was probably not easy.
Conclusion
So, what's the point of all this if we could have achieved the same with a simple VPS and a few scripts?
Well, first, it's nice to learn new technologies on a simple example.
Second, the same technology and techniques would apply to a dynamic website. You could even follow the same steps as in this article if your dynamic website is simple enough. Then, you would be able to adapt the Kubernetes configuration as your website grows in size and complexity.
This power comes at a price, however. The learning curve is steep and there is so much more to cover before your website becomes really reliable. For instance:
- How to upgrade the Kubernetes cluster (preferably without downtime)?
- How to upgrade NGINX Ingress and cert-manager? Maybe manage them with Helm?
- How to debug problems when they arise? It's easy to make a mistake in the configuration and break the website.
- How to ensure that the cluster and your pods are secure?
Many of these questions would not come up with a more traditional approach to infrastructure, simply because we are more used to it. With Kubernetes, there are still grey areas, at least for me.
I hope this article made Kubernetes less obscure to you and that it gave you a better idea of whether you need it or can afford it.