Clean up old containers and images in your Kubernetes cluster

An active Kubernetes cluster can accumulate old containers and images. Ensuring discarded resources are removed when redundant helps to free up resources on your cluster’s nodes. Here’s how to approach garbage collection in Kubernetes.

Container Images

Kubernetes has a built-in garbage collection system that can clean up unused images. It’s managed by Kubelet, the Kubernetes worker process that runs on each node.

Kubelet automatically monitors unused images and will remove them periodically. Garbage collection is triggered by the node’s disk usage, and unused images are then deleted in order of when they were last used, oldest first. An image that has been unused for a week will be cleaned up before one that was used yesterday.

You can customise when garbage collection runs by specifying high and low thresholds for disk usage. Disk usage above the “high” threshold will trigger garbage collection. The procedure will try to reduce disk usage down to the “low” threshold.

The thresholds are defined using two Kubelet flags:

image-gc-high-threshold – Sets the high threshold; defaults to 85%.
image-gc-low-threshold – Sets the low threshold; defaults to 80%.

These settings should already be active in your cluster. Kubelet will try to bring disk usage down to 80% after it becomes 85% full.
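
To make the threshold arithmetic concrete, here is a minimal shell sketch of the decision described above. It is a simplification rather than Kubelet’s actual implementation; the function name bytes_to_free and the sample figures are hypothetical.

```shell
# Simplified sketch of the image garbage collection thresholds.
# Usage: bytes_to_free CAPACITY USED HIGH_PERCENT LOW_PERCENT
# Prints how much space would need to be freed (same unit as CAPACITY).
bytes_to_free() {
  capacity=$1
  used=$2
  high=$3
  low=$4

  usage_percent=$(( used * 100 / capacity ))

  if [ "$usage_percent" -ge "$high" ]; then
    # At or over the high threshold: free enough to get back to the low one.
    echo $(( used - capacity * low / 100 ))
  else
    # Below the high threshold: garbage collection is not triggered.
    echo 0
  fi
}

# A 100 GB disk with 90 GB used and the default 85/80 thresholds.
bytes_to_free 100 90 85 80
```

With these figures the function prints 10: usage is over 85%, so roughly 10 GB of unused images would need to go before usage is back at 80%.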

On clusters provisioned with kubeadm, you can set Kubelet flags in /var/lib/kubelet/kubeadm-flags.env:

KUBELET_KUBEADM_ARGS="--image-gc-high-threshold=60 --image-gc-low-threshold=50"

After editing the file, restart Kubelet:

systemctl daemon-reload
systemctl restart kubelet
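
It can be worth sanity-checking the values you have chosen, since the low threshold is expected to sit below the high one. The sketch below extracts both flags from an argument string and compares them. The function name check_gc_thresholds is hypothetical, and it only inspects the string you pass in, not the running Kubelet process.

```shell
# Usage: check_gc_thresholds "ARGUMENT STRING"
# Falls back to the documented defaults (85 and 80) when a flag is absent.
check_gc_thresholds() {
  args=$1

  # Pull the numeric value out of each flag, if present.
  high=$(printf '%s\n' "$args" | sed -n 's/.*--image-gc-high-threshold=\([0-9][0-9]*\).*/\1/p')
  low=$(printf '%s\n' "$args" | sed -n 's/.*--image-gc-low-threshold=\([0-9][0-9]*\).*/\1/p')
  high=${high:-85}
  low=${low:-80}

  if [ "$low" -lt "$high" ]; then
    echo "ok: high=$high low=$low"
  else
    echo "invalid: low=$low must be below high=$high"
    return 1
  fi
}

check_gc_thresholds "--image-gc-high-threshold=60 --image-gc-low-threshold=50"
```

For the flags used in this article, the function reports ok: high=60 low=50.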

Clearing Old Containers

Kubelet also handles cleanup of redundant containers. Any containers that are stopped or unidentified will be candidates for removal.

You can grant old containers a grace period before deletion by defining a minimum container age. Additional flags let you control the total number of dead containers allowed to exist in a single pod and on the node:

maximum-dead-containers – Maximum number of old containers to retain. When set to -1 (the default), no limit applies.
maximum-dead-containers-per-container – Sets the number of older instances to retain on a per-container basis. If a container is replaced with a newer instance, this many older instances will be allowed to remain; defaults to 1.
minimum-container-ttl-duration – Garbage collection grace period for dead containers. The value is a duration such as 30s or 10m; once a dead container is at least this old, it becomes eligible for garbage collection. The default value of 0 means no grace period applies.

You can configure these settings with Kubelet flags using the same procedure as described above.
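
As a rough illustration, the line below sets all three flags in the same kubeadm-flags.env file. The values are arbitrary examples rather than recommendations, and on a real node you would add the flags to whatever arguments the file already contains.

```shell
# /var/lib/kubelet/kubeadm-flags.env (illustrative values only)
# Keep at most 100 dead containers on the node, at most 2 per container,
# and leave each dead container alone for its first 10 minutes.
KUBELET_KUBEADM_ARGS="--maximum-dead-containers=100 --maximum-dead-containers-per-container=2 --minimum-container-ttl-duration=10m"
```

Restart Kubelet afterwards, as before, for the new values to take effect.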

Should I Manually Intervene?

You should not make manual efforts to remove dead containers or images. If disk space is filling up, or garbage collection doesn’t seem to be working, try adjusting your Kubelet flags towards more aggressive settings.

Kubernetes warns against performing external garbage collection. Don’t manually delete resources, either using cluster management APIs or third-party tools. This risks creating an inconsistent state which could impact Kubelet’s operation.

Kubelet is responsible for managing the containers allocated to each node. When a new container gets scheduled, Kubelet will download its image. Successful cluster operation is dependent on Kubelet’s expectations being met. A missing image or container can lead to Kubelet issues.

The Future: Evictions

The settings described above are supported in current Kubernetes versions. However, they are being deprecated in favour of a more robust “evictions” system. Evictions are a unified way to clean up Kubernetes resources; they’ll eventually replace garbage collection.

An eviction can occur for several reasons. Kubelet will monitor multiple factors, including available hardware resources and user configuration for retention periods.

This new system facilitates the removal of garbage collection as a dedicated mechanism. The same process which terminates pods due to a low-memory scenario will delete redundant images as disk space becomes constrained.

Two types of eviction are defined: hard and soft. A hard eviction will take immediate action to remove the target resource. There is no grace period. A soft eviction has a user-configured grace period; the resource will be targeted once the grace period expires. If the cause of the eviction gets resolved during the grace period, such as more disk space becoming available, the removal is cancelled.
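
The difference between the two types can be sketched as a small decision function. This is an illustration of the rules described above, not Kubelet’s real logic; the name should_evict and its arguments are hypothetical, and a hard eviction is modelled as a grace period of zero.

```shell
# Usage: should_evict AVAILABLE THRESHOLD SECONDS_BELOW GRACE_SECONDS
#   AVAILABLE / THRESHOLD  free space and eviction threshold (same unit)
#   SECONDS_BELOW          how long AVAILABLE has been under THRESHOLD
#   GRACE_SECONDS          0 for a hard eviction, >0 for a soft eviction
should_evict() {
  available=$1
  threshold=$2
  seconds_below=$3
  grace=$4

  if [ "$available" -ge "$threshold" ]; then
    # The signal has recovered (or never tripped): nothing is evicted.
    echo "no"
  elif [ "$seconds_below" -ge "$grace" ]; then
    # Threshold breached and any grace period has run out.
    echo "yes"
  else
    # Soft eviction still inside its grace period.
    echo "pending"
  fi
}

should_evict 512 1024 0 0       # hard eviction, acts immediately
should_evict 512 1024 120 300   # soft eviction, two minutes into a five minute grace period
should_evict 2048 1024 600 300  # space recovered, so the eviction is cancelled
```

The three calls print yes, pending and no respectively.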

The evictions system isn’t yet fully supported for container cleanup, although the dead-containers flags are already deprecated in anticipation. You can already use evictions with container images – set --eviction-hard or --eviction-soft instead of the threshold flags.

--eviction-hard=imagefs.available<1Gi

This example instructs Kubelet to remove unused container images if the available disk space for image storage drops below 1GiB.

--eviction-soft=imagefs.available<1Gi --eviction-soft-grace-period=imagefs.available=5m

This second example shows how a “soft” eviction can be used instead. In this case, images won’t be deleted unless the available disk space has been below 1GiB for at least five minutes.
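
To judge how close a node is to a threshold like this, you can compare the free space on the image filesystem with the value in the flag. The sketch below uses df as an approximation; Kubelet gathers its own statistics, so the figures may not match exactly. The helper name available_kib and the IMAGE_FS_PATH variable are hypothetical, and the default of / is only a placeholder for wherever your container runtime stores its images.

```shell
# Print the available space, in KiB, on the filesystem holding PATH.
available_kib() {
  # -P forces one line per filesystem; column 4 is the available space.
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Assumption: set IMAGE_FS_PATH to your runtime's image storage directory.
IMAGE_FS_PATH=${IMAGE_FS_PATH:-/}
THRESHOLD_KIB=$(( 1024 * 1024 ))   # the 1Gi threshold expressed in KiB

free_kib=$(available_kib "$IMAGE_FS_PATH")
if [ "$free_kib" -lt "$THRESHOLD_KIB" ]; then
  echo "below threshold: ${free_kib} KiB available"
else
  echo "above threshold: ${free_kib} KiB available"
fi
```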

Summary

Kubernetes has garbage collection enabled by default. Dead containers and redundant images will be cleaned up periodically. The default schedule targets disk usage of 80% or lower; containers are cleaned up quite aggressively once they’ve been stopped. You can use Kubelet flags to adjust the thresholds that drive the process.

Garbage collection as a concept will eventually be removed in favour of evictions. Evictions have a simplified configuration which better aligns with other forms of resource removal. You can set up evictions in your cluster today. Remember that the garbage collection flags do not map directly to their eviction counterparts.

Source:
https://www.howtogeek.com/devops/how-to-clean-up-old-containers-and-images-in-your-kubernetes-cluster/

Last updated on 6 Jul 2022
Published on 6 Jul 2022