Orchestration for Scalability and Reliability
10 Nov 2021
Goal
- To deploy and scale our helloworld microservice Docker container across multiple instances using Kubernetes orchestration.
Discussion
In the previous post, we improved the deployability of our service by containerizing it. That was nice. Still, it was just on one instance. Our journey is going to be long and arduous before we reach the utopian helloworld.
In this post, we will see how we can achieve some scale and improve reliability. Traditionally, this is where we would introduce a load balancer, deploy our container across a few instances and put them all behind the load balancer. That is still a very logical thing to do here, but I thought I would take a slightly different direction and attempt to introduce Orchestration via Kubernetes. To be clear, we are certainly going to introduce a load balancer to do some fun things real soon. Looking at Kubernetes now just gives us a few niceties upfront:
- we can get used to the container orchestration model early, and put it in the background of everything we do.
- we can understand how orchestration effectively manages deployment / provisioning of our container across ‘nodes’.
- we can see how nodes can be part of a cluster, demonstrating scalability (we will not discuss the internals of how Kubernetes actually achieves scalability. That comes later!).
- we can see how elastic the cluster can be, quickly bringing up and down nodes.
- we can try bringing down a node and see how we can still get reliable service.
Based on the above points, we already understand to a reasonable extent what Kubernetes does. Basically, it provides a platform / framework for managing containerized services (loosely using the term here). We can scale the available copies of the containerized service up and down, get self-healing capabilities for the containers, automate rollouts and rollbacks, and more. This post isn't meant to deep-dive into Kubernetes architecture or features, so it is best to refer to the official documentation for more info.
We will focus on using Kubernetes to deploy our service container across two instances, managed by a Kubernetes Control Plane, which is deployed on a separate instance.
Infra
New toys! We will spin up 3 VM instances running on top of Hyper-V, provisioned with Ubuntu 20.04 Live Server. One will function as the Kubernetes Control Plane while the other two will function as the Kubernetes worker nodes.
Stack
The microservice hasn’t changed. So, it is the same stack as the helloworld post.
Containerization
- Docker
Orchestration
- Kubernetes
Architecture
Setup
There are two aspects to the setup here. First, we need to spin up those 3 VMs and put them in an internal network, preferably assigning static IPs for ease of access. Detailing all of those steps here would be a bit of a distraction, so instead I will refer to this excellent blog post, which covers everything needed to set up VMs on Hyper-V and network them as needed for our Kubernetes cluster.
A little gotcha:
The step where static IPs are reserved will differ from router to router. Also, in my experience, the IP reservation was not honored across VM restarts. So, you may want to take the additional step of configuring a static IP on the VMs themselves, using a process like this.
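For reference, here is a minimal netplan sketch for pinning a static IP on Ubuntu 20.04. The file name, interface name, addresses and gateway are illustrative placeholders and will differ in your setup:
# /etc/netplan/01-static.yaml -- interface name and addresses are illustrative
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: false
      addresses: [192.168.1.10/24]
      gateway4: 192.168.1.1
      nameservers:
        addresses: [192.168.1.1, 8.8.8.8]
# apply with: sudo netplan apply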
The blog post then goes on to describe how to install the Docker and Kubernetes software (kubeadm, kubelet and kubectl). Following that, it details the steps to be taken on the control plane and on the worker nodes to set up the cluster and join it. The control plane used to be called 'master', and the project moved away from that terminology for obvious reasons. The post uses the 'master' terminology since it pre-dates the change.
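Roughly, the cluster bootstrap boils down to something like the following. The exact flags, pod network CIDR, token and hash come from the linked post and from the output of kubeadm init, so treat this as an illustrative sketch rather than a copy-paste recipe:
# on the control plane (the pod network CIDR depends on the CNI plugin you choose)
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
# make kubectl usable for the current user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# on each worker node, using the token and hash printed by kubeadm init
sudo kubeadm join 192.168.1.10:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>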
Once we complete all those steps, we have the kubernetes cluster ready and raring to go! Kubectl is our friend from now on. We can quickly find out more about the cluster by issuing kubectl commands on the control plane. First, let’s try to get some cluster info:
kubectl cluster-info
Kubernetes control plane is running at https://192.168.1.10:6443
CoreDNS is running at https://192.168.1.10:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
This indicates that there is a control plane and a CoreDNS service (why we need that, we will visit later). Next, let's try to learn about the nodes in the cluster:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
kube-control-plane Ready control-plane,master 8d v1.22.4
kube-node-1 Ready <none> 8d v1.22.4
kube-node-2 Ready <none> 8d v1.22.4
Ah, coolness. There is our control plane and there are the two worker nodes, all ready to go.
We will now do an important step which sounds a bit silly at first. We are going to remove the default 'taint' on the control plane so that it can run normal workloads as well. Now, why would we want to do that? One obvious use case is if we have limited resources and want the control plane to pitch in as well. But my use case is a bit different. In upcoming posts, we will be exploring a few features where 'controller' workloads of third-party software will need to be deployed on the control plane. If we don't allow them to run on the control plane as normal workloads, they will be scheduled onto a worker node, where they simply won't start or work correctly. Here is how we remove the taint from the control plane (note the trailing '-' in the command, which removes the taint rather than adding it):
kubectl taint nodes --all node-role.kubernetes.io/master-
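If you want to confirm the taint is gone (a quick sanity check, not something the setup strictly requires), describe the node and look at its Taints field:
kubectl describe node kube-control-plane | grep -i taints
# should show 'Taints: <none>' once the NoSchedule taint has been removed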
The next logical step is to use the Cluster to do something, like, deploying our helloworld container. To do that, we need to create a Deployment. A deployment is how we get these containers on to the Nodes we created. Essentially, by creating a deployment, we will create a Pod on a node. A pod can hold one or more containers. In our case, the pod will hold our helloworld container.
Then, if we scale this deployment, it will create the same pod on more nodes. All these pods are replicas of each other, and the group is aptly called a ReplicaSet. Finally, to expose our little pod ensemble for consumption by the outside world, we will create a Service. There, that wasn't too bad, was it? :)
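In this post we will create these objects with imperative kubectl commands, but just to get a feel for what they look like declaratively, here is a minimal sketch of roughly the same Deployment and Service. The names, labels, replica count and the image placeholder mirror what we use below; treat it as illustrative rather than the exact manifests Kubernetes generates:
# helloworld.yaml -- a declarative sketch of the objects we create imperatively below
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld
spec:
  replicas: 2
  selector:
    matchLabels:
      app: helloworld
  template:
    metadata:
      labels:
        app: helloworld
    spec:
      containers:
      - name: helloworld
        image: <my_username>/helloworld
        ports:
        - containerPort: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: helloworld
spec:
  type: NodePort
  selector:
    app: helloworld
  ports:
  - port: 80
    targetPort: 8000
# apply with: kubectl apply -f helloworld.yaml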
An important point to understand here is that we will perform all of these activities on the control plane alone. We wouldn’t do anything directly on the worker nodes! Kubernetes takes care of the actual deployment into the nodes.
Since we are going to deploy the container image from our Docker Hub repository, we first need to log in to Docker from the control plane, and then create the deployment:
sudo docker login
kubectl create deployment helloworld --image=<my_username>/helloworld
We can check the deployment by a get and we will see details about it. We can also check what pods are running, and we will see that a single pod is running now.
kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
helloworld 1/1 1 1 9s
kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
helloworld-7866444b8b-hbbqp 1/1 Running 0 2m34s 10.36.0.1 kube-node-2
Excellent, now we are cooking. What if we want to scale this up so that we run it on both the nodes?
kubectl scale deployment/helloworld --replicas=2
kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
helloworld 2/2 2 2 9m41s
kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
helloworld-7866444b8b-hbbqp 1/1 Running 0 9m1s 10.36.0.1 kube-node-2
helloworld-7866444b8b-mv774 1/1 Running 0 23s 10.44.0.1 kube-node-1
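As an aside, those replicas are managed by a ReplicaSet that the Deployment created behind the scenes. We can peek at it with the command below; given the state above, it should report 2 desired and 2 ready replicas:
kubectl get replicaset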
Alright, now we have scaled up to two pods on two nodes. All fine and dandy. Can we now access the microservice? Er, not yet. These little pods are not exposing anything to the external world. They are internal to the cluster. To expose them, we need to do the following:
kubectl expose deployment helloworld --type=NodePort --port=80 --target-port=8000
The port and target port nuance in that command essentially says: port 80 of the Service maps to port 8000 on the container, where our microservice actually runs. (The NodePort that exposes the Service on the nodes themselves is assigned separately, as we will see in a moment.) Now, let's execute another command to see what this exposed:
kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
helloworld NodePort 10.106.198.208 <none> 80:32120/TCP 116s
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 8d
Notice the port info on the helloworld service. It says 80:32120. This means that we can access our helloworld microservice by using the control plane URL like this:
http://<control-plane-url>:32120 # the NodePort is assigned from a range (30000-32767 by default) and will differ each time the service is exposed
Note that we can also access our service by hitting the worker node IP addresses directly. This is because we used the type NodePort, which opens the same port on every node. We may or may not want this; we can figure out some alternative configurations later.
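A quick way to verify this from another machine on the same network (the control plane IP comes from the cluster-info output above; the worker IPs depend on your setup, and the response body is whatever our helloworld service returns):
curl http://192.168.1.10:32120/
curl http://<worker-node-ip>:32120/   # the same NodePort answers on every node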
We can scale down the nodes to 1, or even yank the metaphorical power cable of one of the nodes (just turn it off from Hyper-V manager). After a few moments, a get nodes on the control plane will show this:
NAME STATUS ROLES AGE VERSION
kube-control-plane Ready control-plane,master 8d v1.22.4
kube-node-1 Ready <none> 8d v1.22.4
kube-node-2 NotReady <none> 8d v1.22.4
The service will still continue to be served by the remaining node. When we bring the downed node back up, it will just auto-join the cluster. The control plane will look at the number of needed pods (we need 2) and spin up a new pod right away on the newly joined node.
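If we want to watch this happen live, a handy (optional) trick is to keep a watch on the pods while the node goes down and comes back:
kubectl get pods -o wide --watch
# the ReplicaSet works to keep 2 helloworld pods Running as nodes go down and rejoin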
The terrific trifecta in action:
- Scalability (we could scale up the resources)
- Reliability (nodes could go down without affecting the service)
- Elasticity (we could scale down when resources were not needed)
Code
No code. All config!
Summary
| Tenet | State | Observation |
| --- | --- | --- |
| Deployable | Better | By using Docker-based containerization and by managing and orchestrating container deployments via Kubernetes, we have improved the deployability of our microservice. |
| Scalable | Better | Since Kubernetes can quickly scale up and down as necessary (and with minimal to no intervention, if we configure it so), we have improved the scalability of our microservice and the elasticity of our production environment. |
| Reliable | Better | Due to the orchestrated and scaled environment, individual node failures no longer bring down the entire service, and hence we have improved the reliability and availability of our microservice. |
The Docker-Kubernetes 1-2 punch will work well for us going forward, but these are only two among the many tools and techniques we will explore, in our quest for scalability and distributedness. There are more problems to solve, and I would like to remind ourselves that we are still dealing with helloworld :)
We have consciously kept away from any application complexity to establish a baseline infrastructure that will serve us well. This also means that we have not touched upon anything related to Data, which is perhaps the bedrock of any serious modern application / service. We will continue to spend some more time on the ‘web’ layer before delving into Data.