API Versioning and Canary Releases

Goal

  • To introduce a new version of our helloworld microservice, deploy it via Kubernetes using a rolling update strategy, and then set up a canary release using Ambassador, so that only a portion of the requests will be directed to the new version.

Discussion

We have been YAMLing for a while now, so maybe it is good to get back to writing a couple of lines of actual code?

One of the desired features of an API gateway is support for canary releases. A canary release is a deployment strategy where a portion of incoming traffic is diverted to an early release of a new version of a service. This way, we can verify the functionality with a small subset of requests or users and validate the new version before a large-scale release to everyone. It also enables a quick rollback to the current working version, since that is just a matter of switching back to the older code that is still serving the other users. Of course, a canary release can be done for a totally new feature too, but I figured we could marry it with a quick API versioning exercise as well.

Infra

Same old 3 VM instances running on top of Hyper-V.

Stack

Same stack as the earlier helloworld posts, but we will introduce a package called fastapi-versioning to help with API versioning in FastAPI.

Containerization and Orchestration

  • Docker and Kubernetes

API Gateway

  • Ambassador Edge Stack, MetalLB

Architecture

[Architecture diagram: helloworld with API versioning and a canary release]

Setup and Code

Let’s quickly install fastapi-versioning and get to work. It is a typical pip install:

pip install fastapi-versioning

For our Docker image, though, this just means adding fastapi-versioning to requirements.txt.
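A minimal requirements.txt might then look like this (a sketch; it assumes uvicorn as the ASGI server, as in the earlier posts, and leaves versions unpinned):

fastapi
uvicorn
fastapi-versioning

With that in place, the Python code becomes: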

from fastapi import FastAPI
from fastapi_versioning import VersionedFastAPI, version

app = FastAPI()

# Version 1 of the endpoint
@app.get("/")
@version(1)
def read_main_v1():
    return {"message": "Hello World"}

# Version 2 of the same endpoint: same route, different body
@app.get("/")
@version(2)
def read_main_v2():
    return {"message": "Greetings, Planet!"}

# Wrap the app so that each version is served under its own URL prefix
app = VersionedFastAPI(app,
    version_format='{major}',
    prefix_format='/v{major}',
    enable_latest=True)

The route and signature remain the same while the body changes (the handlers get distinct names so one does not shadow the other). Note the introduction of a new decorator called @version, which comes from the fastapi-versioning package. It supports both major.minor versioning and major-only versioning; here I am using major-only for simplicity. The VersionedFastAPI wrapper at the end controls how the API URLs are formatted using only the major version. Also note the enable_latest flag, which serves the highest version number under an additional 'latest' prefix.
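For the curious, a minor-versioned variant might look like this (a sketch; the handler name and message are illustrative):

from fastapi import FastAPI
from fastapi_versioning import VersionedFastAPI, version

app = FastAPI()

@app.get("/")
@version(2, 1)  # major version 2, minor version 1
def read_main_v2_1():
    return {"message": "Greetings, Planet! Now with minor updates."}

# Include the minor version in both the version label and the URL
# prefix, e.g. /v2.1/
app = VersionedFastAPI(app,
    version_format='{major}.{minor}',
    prefix_format='/v{major}.{minor}')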

Now, let’s use the docker commands we saw earlier to build the new image and push it to Docker Hub. As a sketch (the repository name is an assumption; substitute your own Docker Hub account):
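docker build -t <your-dockerhub-user>/helloworld:latest .
docker push <your-dockerhub-user>/helloworld:latest

Once pushed, the image is ready to be rolled out to our Kubernetes cluster. We can use a rollout command to seamlessly roll it out in place of the older version: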

kubectl rollout restart deploy helloworld
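We can watch the rollout progress and confirm it completed with:

kubectl rollout status deploy helloworld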

In the goal definition, we referred to a rolling update strategy. This means Kubernetes will not replace all pods at once, which would make the whole system unavailable; instead, it applies the change gradually, in a rolling fashion.

So, you may be wondering why there isn’t any special configuration for this. Well, there is, but it was auto-applied when we let Kubernetes create the Deployment for our helloworld, way back here. If we had hand-crafted the deployment YAML, we would have written something like this:

  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate

This defines a rolling update strategy that ensures the following:

  • maxUnavailable: we will tolerate at most 25% of the pods being unavailable at any point in time during the rollout.
  • maxSurge: if Kubernetes needs to create additional (surge) pods during the update to maintain availability, it may create at most 25% extra pods above the desired replica count.

Strategies like this apply to anything we run as a Deployment on our pods, so this helps not just our own code deployments, but also patches to any containerized software running on our cluster. This is a huge out-of-the-box maintainability benefit that we get from using Kubernetes.
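If you are curious, you can confirm the strategy that was auto-applied by inspecting the live Deployment:

kubectl get deploy helloworld -o jsonpath='{.spec.strategy}'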

Once our rollout is done, using our existing Ambassador Listener, Host, and Mapping as-is, we can access the new API URLs, which will look like:

http://<ambassador-ip-addr>/helloworld/v1/
http://<ambassador-ip-addr>/helloworld/v2/
http://<ambassador-ip-addr>/helloworld/latest/
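A quick curl against each URL should confirm the versioned responses (depending on your Host setup, you may need to send the domain name via curl’s -H 'Host: ...' option):

curl http://<ambassador-ip-addr>/helloworld/v1/
# {"message":"Hello World"}

curl http://<ambassador-ip-addr>/helloworld/v2/
# {"message":"Greetings, Planet!"}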

Awesome. There is just one pickle. We just broke all the clients by changing the URL. A quick discussion on some possibilities here:

  1. This could actually be the intended change. Meaning, we want to support two versions for a business reason and let our clients decide which version to use. This is done quite often, but ideally the approach is pre-determined so that we begin with a v1 on day one instead of introducing a v1 somewhere down the line.
  2. We intend for the multiple versions to be a short-term situation until all clients can eventually move to the latest and greatest version. In this case, the above URL strategy isn’t quite what we should be doing.
  3. The versioning is transparent to the clients and we just want a canary release approach to expose it to a random/deterministic set of clients and slowly expand the footprint.

We will assume that our situation is #3 here. #2 deserves a discussion of its own, which we can dig into sometime later.

Kubernetes itself can support a coarse version of a canary release. This is achieved by creating two Kubernetes Deployments, one for each version, and splitting the traffic by giving the deployments different replica counts. It is functional, but far from ideal, and it doesn’t play well with autoscaling either.
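For illustration, here is a sketch of that coarse approach (all names, images, and replica counts are hypothetical, and it assumes a separate image per version, unlike our single versioned image). With both Deployments sharing the app: helloworld label that a common Service selects on, the 9:1 replica ratio yields a rough 90/10 traffic split:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld-v1
spec:
  replicas: 9
  selector:
    matchLabels:
      app: helloworld
      version: v1
  template:
    metadata:
      labels:
        app: helloworld   # shared label the Service selects on
        version: v1
    spec:
      containers:
      - name: helloworld
        image: <your-dockerhub-user>/helloworld:v1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld-v2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: helloworld
      version: v2
  template:
    metadata:
      labels:
        app: helloworld   # same shared label
        version: v2
    spec:
      containers:
      - name: helloworld
        image: <your-dockerhub-user>/helloworld:v2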

Instead, we will use the canary release feature of Ambassador, which provides fine-grained control using Mappings and just a single Deployment. It also works fine with autoscaling. The key player here is the weight attribute, which specifies how much traffic is routed to which Mapping. If there is only one Mapping and no weight is specified, the weight is assumed to be 100%. If there are multiple Mappings for a resource, we can specify weights on the others and leave the main one without a weight so that Ambassador can do the math for it; for example, with v2 at weight 10, v1 implicitly receives the remaining 90%. Ambassador then load balances based on the weights and sends the right portion of requests to the new version.

In the Mapping YAML below, we create two Mappings, one for v1 and one for v2. We use the rewrite property so that both Mappings share the same prefix but rewrite the resulting URL to the two different versions. We also specify a weight of 10, and only for v2:

apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
  name: helloworld-backend-v1
spec:
  hostname: "<your-domain-name>"
  prefix: /helloworld/
  rewrite: /v1/
  service: helloworld:8000
  resolver: endpoint
  load_balancer:
    policy: round_robin
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
  name: helloworld-backend-v2
spec:
  hostname: "<your-domain-name>"
  prefix: /helloworld/
  rewrite: /v2/
  weight: 10
  service: helloworld:8000
  resolver: endpoint
  load_balancer:
    policy: round_robin

We can now delete the previous Mapping and create these new ones via kubectl apply. If we then hit the same old helloworld link, roughly 10% of the time we will get the new version.
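Concretely, that might look like this (the old Mapping name and the YAML filename are assumptions; use whatever you named them in your setup):

kubectl delete mapping helloworld-backend
kubectl apply -f helloworld-canary-mappings.yaml

# Hit the endpoint repeatedly; roughly 2 out of 20 responses
# should come back from v2.
for i in $(seq 1 20); do
  curl -s http://<ambassador-ip-addr>/helloworld/
  echo
done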

We can gather valuable info about the canary release by using various monitoring techniques. That is a nice segue into monitoring and observability using Ambassador, which is the topic of our next post!

Summary

  • Deployable: Better. By using the canary release feature of Ambassador, along with Kubernetes and Docker, we have gained fine-grained control over our deployments.
  • Maintainable: Better. Patches to code or any of its dependencies can be applied without bringing down the pods, reducing the need for maintenance windows and increasing the maintainability of the overall architecture.
  • Evolvable: Beginnings. While our code is too simple to be discussing evolvability, we have taken a first step by versioning our APIs, thereby solving a piece of the puzzle for an evolvable architecture.

In the next post, we will look at how Ambassador can work with Prometheus and Grafana to help monitor our infrastructure and gain insights into what’s happening with our service.

