Kubernetes Rolling Updates and Rollbacks
The last topic with respect to deployments is how updates work. Kubernetes uses rollouts to update deployments.
A Kubernetes rollout is the process of updating or replacing replicas with replicas matching a new deployment template.
Changes may be configuration changes, such as updated environment variables or labels, or code changes, which update the image key of the deployment template.
In a nutshell, any change to the deployment's template will trigger a rollout.
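For example, once the app-tier deployment used throughout this series exists, updating its container image is one such template change. A sketch of the command (the server-v2 tag is hypothetical, purely to illustrate):

# Updating the image edits the pod template, so it triggers a rollout.
# The server-v2 tag is hypothetical; the container in this series is named server.
kubectl set image -n deployments deployment/app-tier server=lrakai/microservices:server-v2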
Rollout strategies
RollingUpdate (default)
Deployments support different rollout strategies. Kubernetes uses rolling updates by default.
With a rolling update, replicas are updated in groups instead of all at once until the rollout completes.
This allows service to continue uninterrupted while the update rolls out. However, you need to consider that during the rollout there will be pods using both the old and the new configuration, and the application should handle that gracefully.
Recreate strategy
As an alternative, deployments can also be configured to use the recreate strategy, which kills all the pods from the old template before creating the pods from the new one. That of course incurs downtime for the application. In this post I will focus on rolling updates.
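For reference, opting into the recreate strategy is a small change to the deployment spec (a minimal sketch with the surrounding fields elided):

spec:
  strategy:
    type: Recreate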
Scaling is an orthogonal concept to rollouts
We actually rolled out an update in the previous post when we added the cpu request to the app-tier deployment's pod template. Scaling events, however, do not create rollouts. Recall that the number of replicas is not part of the deployment's template, so changing it does not trigger a rollout.
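You can check this for yourself: scaling changes the replica count but leaves the rollout history untouched (a quick sketch using this series' app-tier deployment; the replica count is arbitrary):

kubectl rollout history -n deployments deployment app-tier    # note the latest revision number
kubectl scale -n deployments deployment app-tier --replicas=7
kubectl rollout history -n deployments deployment app-tier    # same revisions; no rollout was triggered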
Demo of RollingUpdate
kubectl includes commands to conveniently:
- check,
- pause,
- resume, and
- roll back rollouts.
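These map onto the kubectl rollout subcommands (shown here against a placeholder deployment name):

kubectl rollout status  deployment <name>    # check the progress of a rollout
kubectl rollout pause   deployment <name>    # pause an in-progress rollout
kubectl rollout resume  deployment <name>    # resume a paused rollout
kubectl rollout undo    deployment <name>    # roll back to the previous revision
kubectl rollout history deployment <name>    # list the recorded revisions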
Let's see how all of this works. Create the namespace and deployments to get started.
ubuntu@ip-10-0-128-5:~/src# kubectl create -f 5.1-namespace.yaml
namespace/deployments created
ubuntu@ip-10-0-128-5:~/src# kubectl create -n deployments -f 5.2-data_tier.yaml -f 6.1-app_tier_cpu_request.yaml -f 5.4-support_tier.yaml
service/data-tier created
deployment.apps/data-tier created
service/app-tier created
deployment.apps/app-tier created
deployment.apps/support-tier created
ubuntu@ip-10-0-128-5:~/src# kubectl get deployments. -n deployments
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
app-tier       0/5     5            0           9s
data-tier      1/1     1            1           9s
support-tier   0/1     1            0           9s
ubuntu@ip-10-0-128-5:~/src# kubectl get deployments. -n deployments
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
app-tier       5/5     5            5           16s
data-tier      1/1     1            1           16s
support-tier   1/1     1            1           16s
ubuntu@ip-10-0-128-5:~/src# kubectl get -n deployments pods
NAME                            READY   STATUS    RESTARTS   AGE
app-tier-74c7df4f88-2584t       1/1     Running   0          72s
app-tier-74c7df4f88-4ksht       1/1     Running   0          72s
app-tier-74c7df4f88-mkdtw       1/1     Running   0          72s
app-tier-74c7df4f88-trqmb       1/1     Running   0          72s
app-tier-74c7df4f88-wn7jl       1/1     Running   0          72s
data-tier-599bc4fcf8-5bf6p      1/1     Running   0          72s
support-tier-58d5d545b6-t8jkc   2/2     Running   0          72s
ubuntu@ip-10-0-128-5:~/src#
Autoscaling and rollouts are compatible, but for us to easily observe rollouts as they progress we'll need many replicas in action. Next, let's edit the app tier deployment with kubectl edit -n deployments deployment app-tier. This will open the manifest in a vi editor (kubectl uses the editor named by the KUBE_EDITOR environment variable, if it is set):
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2020-05-03T21:08:26Z"
  generation: 1
  labels:
    app: microservices
    tier: app
  name: app-tier
  namespace: deployments
  resourceVersion: "4657"
  selfLink: /apis/extensions/v1beta1/namespaces/deployments/deployments/app-tier
  uid: a628e3b6-70dc-42d9-abb0-d92eca54e6c1
spec:
  progressDeadlineSeconds: 600
  replicas: 10
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      tier: app
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: microservices
        tier: app
    spec:
      containers:
      - env:
        - name: REDIS_URL
          value: redis://$(DATA_TIER_SERVICE_HOST):$(DATA_TIER_SERVICE_PORT_REDIS)
        image: lrakai/microservices:server-v1
        imagePullPolicy: IfNotPresent
        name: server
        ports:
        - containerPort: 8080
          protocol: TCP
        # resources:
        #   requests:
        #     cpu: 20m
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 5
  conditions:
  - lastTransitionTime: "2020-05-03T21:08:38Z"
    lastUpdateTime: "2020-05-03T21:08:38Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2020-05-03T21:08:26Z"
    lastUpdateTime: "2020-05-03T21:08:39Z"
    message: ReplicaSet "app-tier-74c7df4f88" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 1
  readyReplicas: 5
  replicas: 5
  updatedReplicas: 5
It'll be easier to see the rollout in action with a large number of replicas. Edit the replicas to be 10, then search for resources and delete the resources section (marked with comments in the listing above). This avoids potential scheduling problems if all 10 of the cpu requests can't be satisfied. Then watch the deployment until all the replicas are ready: watch -n 1 kubectl get -n deployments deployments app-tier
Every 1.0s: kubectl get -n deployments deployments app-tier

NAME       READY   UP-TO-DATE   AVAILABLE   AGE
app-tier   10/10   10           10          92m
Just to confirm, get the pods:
ubuntu@ip-10-0-128-5:~/src# kubectl get -n deployments pods
NAME                            READY   STATUS    RESTARTS   AGE
app-tier-748cdbdcc5-59mms       1/1     Running   0          100s
app-tier-748cdbdcc5-6rm6t       1/1     Running   0          93s
app-tier-748cdbdcc5-8h57v       1/1     Running   0          100s
app-tier-748cdbdcc5-dmsrs       1/1     Running   0          100s
app-tier-748cdbdcc5-f6gwg       1/1     Running   0          100s
app-tier-748cdbdcc5-hvxdq       1/1     Running   0          96s
app-tier-748cdbdcc5-qgpkk       1/1     Running   0          93s
app-tier-748cdbdcc5-tjlk4       1/1     Running   0          95s
app-tier-748cdbdcc5-wjpm8       1/1     Running   0          100s
app-tier-748cdbdcc5-xr4ms       1/1     Running   0          93s
data-tier-599bc4fcf8-5bf6p      1/1     Running   0          93m
support-tier-58d5d545b6-t8jkc   2/2     Running   0          93m
ubuntu@ip-10-0-128-5:~/src#
Edit the deployment
Now it's time to trigger a rollout. Open the app-tier deployment with kubectl edit -n deployments deployment app-tier. This shows the same output as above. From it we can see that the server added some default values for the deployment strategy: the type is RollingUpdate, and the corresponding maxSurge and maxUnavailable fields control the rate at which updates are rolled out.
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
Configure maxSurge and maxUnavailable
maxSurge specifies how many replicas over the desired total are allowed during a rollout. A higher surge allows new pods to be created without waiting for old ones to be deleted.
maxUnavailable controls how many old pods can be deleted without waiting for new pods to be ready. We'll keep the defaults of 25%.
You may want to configure these fields to trade off the impact on availability or resource utilization against the speed of the rollout. For example, you can have all the new pods start immediately, but in the worst case all the new pods and all the old pods would consume resources at the same time, effectively doubling resource utilization for a short period.
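Both fields accept absolute pod counts as well as percentages. As a sketch of the availability-first end of this trade-off (the values here are illustrative, not this demo's configuration):

  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 3          # allow up to 3 pods above the desired replica count
      maxUnavailable: 0    # never dip below the desired count of ready pods

With maxUnavailable: 0, Kubernetes must wait for a new pod to become ready before it terminates an old one, preserving full capacity at the cost of a slower rollout.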
Trigger a rollout
With those fields out of the way, we can trigger a rollout. Remember that any change to the deployment's template triggers a rollout. Run kubectl edit -n deployments deployments. app-tier. In the pod template spec, change the container name from server to api as shown in the snippet below, then save and quit.
    spec:
      containers:
      - env:
        - name: REDIS_URL
          value: redis://$(DATA_TIER_SERVICE_HOST):$(DATA_TIER_SERVICE_PORT_REDIS)
        image: lrakai/microservices:server-v1
        imagePullPolicy: IfNotPresent
        # name: server
        name: api
        ports:
        - containerPort: 8080
          protocol: TCP
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
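If you prefer a non-interactive alternative to kubectl edit, the same template change can be made with kubectl patch (a sketch; it performs the identical rename and therefore triggers the same rollout):

kubectl patch -n deployments deployment app-tier --type=json \
  -p '[{"op": "replace", "path": "/spec/template/spec/containers/0/name", "value": "api"}]'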
Check rollout status
Check the status with kubectl rollout -n deployments status deployment app-tier. You need to run it immediately after saving, or it will all happen in a flash, so try doing it in two windows with a tmux session. Run tmux, then press ctrl+b % to split the screen vertically. Press ctrl+b followed by the left or right arrow key to move between panes.
ubuntu@ip-10-0-128-5:~/src# kubectl edit -n deployments deployments. app-tier
deployment.extensions/app-tier edited
ubuntu@ip-10-0-128-5:~/src# kubectl rollout -n deployments status deployment app-tier
Waiting for deployment "app-tier" rollout to finish: 5 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 5 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 5 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 5 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 5 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 6 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 6 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 6 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 6 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 6 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 7 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 7 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 8 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 8 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 8 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 9 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 9 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 9 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 9 out of 10 new replicas have been updated...
Waiting for deployment "app-tier" rollout to finish: 3 old replicas are pending termination...
Waiting for deployment "app-tier" rollout to finish: 3 old replicas are pending termination...
Waiting for deployment "app-tier" rollout to finish: 3 old replicas are pending termination...
Waiting for deployment "app-tier" rollout to finish: 2 old replicas are pending termination...
Waiting for deployment "app-tier" rollout to finish: 2 old replicas are pending termination...
Waiting for deployment "app-tier" rollout to finish: 2 old replicas are pending termination...
Waiting for deployment "app-tier" rollout to finish: 1 old replicas are pending termination...
Waiting for deployment "app-tier" rollout to finish: 1 old replicas are pending termination...
Waiting for deployment "app-tier" rollout to finish: 1 old replicas are pending termination...
Waiting for deployment "app-tier" rollout to finish: 8 of 10 updated replicas are available...
Waiting for deployment "app-tier" rollout to finish: 9 of 10 updated replicas are available...
deployment "app-tier" successfully rolled out
ubuntu@ip-10-0-128-5:~/src#
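The first status lines line up with the arithmetic of the default strategy: with 10 desired replicas, a maxSurge of 25% rounds up to 3 extra pods (13 total allowed), and a maxUnavailable of 25% rounds down to 2 (at least 8 pods must remain available). None of the new pods are ready at the start, so the rollout can create 13 - 8 = 5 new pods immediately, which is exactly the 5 out of 10 new replicas reported above.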
An illustration of pause and resume
For the purpose of illustration, we will edit the deployment again using kubectl edit -n deployments deployments app-tier. This time, change the container name to a different name, then save and quit; this will trigger the rollout. Quickly run the pause command: kubectl rollout -n deployments pause deployment app-tier
Now the rollout is paused. Pausing won't stop replicas that were created before the pause; they will continue to progress to ready. However, no new replicas will be created while the rollout is paused. We can try a few things at this point. One thing you can do is inspect the new pods before deciding to continue or roll back. We'll simply get the deployment. In the right-side window we run kubectl rollout -n deployments status deployment app-tier, which reports that 9 out of 10 new replicas have been updated. The same fact is reflected on the left side, where we ran kubectl get deployments. -n deployments app-tier.
Let's say everything is fine and we decide to resume. We can then run kubectl rollout -n deployments resume deployment app-tier. The rollout picks up right where it left off and goes about its business.
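Condensed, the whole pause-and-resume cycle we just walked through looks like this:

kubectl edit    -n deployments deployment app-tier           # change the container name to trigger a rollout
kubectl rollout -n deployments pause  deployment app-tier    # freeze the rollout mid-flight
kubectl get     -n deployments deployment app-tier           # inspect; already-created pods still progress to ready
kubectl rollout -n deployments resume deployment app-tier    # pick up right where it left off
kubectl rollout -n deployments status deployment app-tier    # watch it run to completion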
Rollbacks
Now suppose you found a bug in this new revision and need to roll back. kubectl rollout undo to the rescue. This rolls back to the previous revision. You may also roll back to a specific revision: use kubectl rollout history to get a list of all revisions, then pass the specific revision to kubectl rollout undo (an example follows the transcript below).
ubuntu@ip-10-0-128-5:~/src# kubectl rollout history -n deployments deployment app-tier
deployment.extensions/app-tier
REVISION   CHANGE-CAUSE
1          <none>
2          <none>
3          <none>
ubuntu@ip-10-0-128-5:~/src#
ubuntu@ip-10-0-128-5:~/src# kubectl rollout -n deployments undo deployment app-tier
deployment.extensions/app-tier rolled back
ubuntu@ip-10-0-128-5:~/src#
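The undo above returned to the previous revision. To target a specific revision instead, inspect it first and then pass it explicitly (revision 1 is used here purely for illustration):

kubectl rollout history -n deployments deployment app-tier --revision=1    # show the template recorded for revision 1
kubectl rollout undo    -n deployments deployment app-tier --to-revision=1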
Doing another rollback, with the status command running in parallel, and then a describe (kubectl describe -n deployments deployment app-tier) to check the name of the container: when you roll back again, you are taken back to the newer version, not the previous-previous version. The container name confirms this is version 3 (version 1 was server, version 2 was api, version 3 was api-pause). Undo once more and you are taken back to version 2; consecutive undos simply toggle between the last two revisions.
That's all for this demonstration of rolling updates and rollbacks, but before we move on let's scale the app tier back to one replica to give back some CPU resources.
ubuntu@ip-10-0-128-5:~/src# kubectl scale -n deployments deployment app-tier --replicas=1
deployment.extensions/app-tier scaled
ubuntu@ip-10-0-128-5:~/src# kubectl get -n deployments deployments.
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
app-tier       1/1     1            1           86m
data-tier      1/1     1            1           86m
support-tier   1/1     1            1           86m
ubuntu@ip-10-0-128-5:~/src#
Conclusion
Deployments and rollouts are very powerful constructs. Their features cover a large swath of use cases. Let's summarize what we learned in this post:
- We learned that rollouts are triggered by updates to a deployment's template.
- Kubernetes uses a rolling update strategy by default.
- We also learned how to pause, resume, and undo rollouts of deployments.
There's still so much more we can do with deployments. Rollouts depend on container status: Kubernetes assumes that created containers are immediately ready and that the rollout should continue. This does not hold in all cases. We may need to wait for a web server to accept connections. Here's another scenario: consider an application using a relational database. The containers may start, but requests will fail until the database and its tables are created. These scenarios must be considered to build reliable applications. This is where probes and init containers come into the picture. We'll integrate probes and init containers in the next posts.