
Guide to Kubernetes Liveness Probes: Defining with Examples


In this article, we will take a look at liveness probes in Kubernetes (K8S), with some useful examples. Defining probes correctly can improve pod resilience and availability.

What is a Kubernetes Liveness Probe?

The liveness probe ensures that an application within a container is live and operational based on a specified test.

The kubelet uses liveness probes to know when to restart a container. Applications that hit an error or get stuck in a broken state are detected by the probe and, in many cases, can be recovered simply by restarting the container.

If the configured liveness probe is successful, no action is taken and nothing is logged. If it fails, the event is logged and the kubelet kills the container, which is then restarted according to the configured restartPolicy.

A liveness probe should be used when a pod may appear to be running, but the application may not function correctly. For example, in a deadlock situation, the pod may be running but will be unable to serve traffic and is effectively not working.

Liveness probes are not necessary when the application is written to crash its container on failure: the kubelet checks the Pod's restartPolicy and automatically restarts the container if it is set to Always or OnFailure. NGINX is a good example: it starts quickly and exits if it runs into an error that stops it from serving pages, so in this situation we do not need a liveness probe, as in the sketch below.
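As a minimal sketch (the nginx image tag and port are illustrative, not taken from the article's examples), a Pod like this relies on the default restartPolicy: Always instead of a liveness probe, because the NGINX process exits on fatal errors and the kubelet restarts it automatically:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-no-probe
spec:
  restartPolicy: Always   # the default; the kubelet restarts the container whenever the process exits
  containers:
  - name: nginx
    image: nginx:1.25     # illustrative tag
    ports:
    - containerPort: 80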

What Other Probes Can I Use in Kubernetes?

This article will focus on the use of liveness probes, but you should be aware of the other types of probes available for use in Kubernetes:

Readiness probes

Readiness probes monitor when the application becomes available. While a readiness probe is failing, no traffic is sent to the Pod: the endpoints controller removes the Pod's IP address from the endpoints of all Services that match it. Readiness probes are useful when an app needs configuration or warm-up time before it can serve requests. They also help when an app becomes overloaded: the failing probe stops further traffic from being routed to the Pod, giving it a chance to recover.

If the readiness probe fails but the liveness probe succeeds, the kubelet determines that the container is not ready to receive network traffic but is still working to become ready.
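A readiness probe is defined with the same handlers and fields as a liveness probe, just under the readinessProbe key. The sketch below is illustrative: the /ready path assumes the application exposes such an endpoint on port 8080.

apiVersion: v1
kind: Pod
metadata:
  name: readiness-http
spec:
  containers:
  - name: app
    image: registry.k8s.io/liveness:0.1
    ports:
    - containerPort: 8080
    readinessProbe:
      httpGet:
        path: /ready        # assumed endpoint that reports whether the app can serve traffic
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5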

Startup probes

Startup probes tell the kubelet when a container application has finished starting. When a startup probe is configured, liveness and readiness checks are disabled until it succeeds, ensuring those probes don't interfere with application startup.

These are particularly useful for slow-starting containers, preventing the kubelet from killing them because of failing liveness probes before they are up and running. If a liveness probe is used on the same endpoint as a startup probe, set the startup probe's failureThreshold high enough that failureThreshold × periodSeconds covers the worst-case startup time (see the sketch below).

If a startup probe fails, the event is logged and the kubelet kills the container, which is then restarted according to the configured restartPolicy.
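As a sketch of that pattern, the container-level snippet below pairs both probes on the same assumed /health endpoint; the startup probe tolerates up to 30 × 10 = 300 seconds of startup time before the tighter liveness probe takes over:

    livenessProbe:
      httpGet:
        path: /health       # assumed health endpoint
        port: 8080
      periodSeconds: 10
      failureThreshold: 3   # restart quickly once the app is running
    startupProbe:
      httpGet:
        path: /health
        port: 8080
      periodSeconds: 10
      failureThreshold: 30  # 30 x 10s = up to 300 seconds allowed for startup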

How do Kubernetes Probes Work?

Probes are managed by the kubelet. The kubelet is the primary “node agent” that runs on each node.

To effectively use a Kubernetes probe, the application must support one of the following handlers:

  • ExecAction handler — runs a command inside the container, and the diagnostic succeeds if the command completes with status code 0.
  • TCPSocketAction handler — attempts a TCP connection to the IP address of the pod on a specific port. The diagnostic succeeds if the port is found to be open.
  • HTTPGetAction handler — performs an HTTP GET request using the IP address of the pod, a specific port, and a specified path. The diagnostic succeeds if the response code returned is between 200–399.
  • gRPC handler — if your application implements the gRPC Health Checking Protocol, the kubelet can be configured to use it for liveness checks. gRPC probes graduated to beta in Kubernetes v1.24, where the GRPCContainerProbe feature gate is enabled by default; on earlier versions, the feature gate must be enabled explicitly.

When the kubelet performs a probe on a container, the result is one of Success (the diagnostic passed), Failure (the diagnostic failed), or Unknown (the diagnostic did not complete for some reason).

Define a Liveness Probe

In each example shown below, the periodSeconds field specifies that the kubelet should perform a liveness probe every 5 seconds. The initialDelaySeconds field tells the kubelet that it should wait 5 seconds before performing the first probe.

In addition to these options, you can also configure:

  • timeoutSeconds – how long to wait for a reply before the probe is considered failed – default = 1 second.
  • successThreshold – number of consecutive successful probes required to mark the container healthy again after a failure – default = 1 (and must be 1 for liveness and startup probes).
  • failureThreshold – number of consecutive failed probes required to mark the container unhealthy – default = 3.

These five parameters can be used in all types of liveness probes.

Before defining a probe, the system behavior and average startup times of the Pod and its containers should be observed so you can determine the correct thresholds. Also, the probe options should be updated as the infrastructure or application evolves. For example, the Pod may be configured to use more system resources which might affect the values that need to be configured for the probes.
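For illustration, the probe below combines all five fields; the values are assumptions to be tuned against your application's observed behavior. With periodSeconds: 5 and failureThreshold: 3, the container is restarted roughly 15 seconds after the assumed /health endpoint first starts failing:

    livenessProbe:
      httpGet:
        path: /health       # assumed endpoint
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
      timeoutSeconds: 2     # wait up to 2 seconds for a reply
      successThreshold: 1   # must be 1 for liveness probes
      failureThreshold: 3   # restart after 3 consecutive failures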

ExecAction handler example

The example below uses the exec handler to run the cat command against /usr/share/liveness/html/index.html. If the file does not exist, cat exits with a non-zero status code, the liveness probe fails, and the container is restarted.

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: registry.k8s.io/liveness:0.1
    ports:
    - containerPort: 8080
    livenessProbe:
      exec:
        command:
        - cat
        - /usr/share/liveness/html/index.html
      initialDelaySeconds: 5
      periodSeconds: 5

TCPSocketAction handler example

In this example, the liveness probe uses the TCP handler to check port 8080 is open and responding. With this configuration, the kubelet will attempt to open a socket to your container on the specified port. If the liveness probe fails, the container will be restarted.

apiVersion: v1
kind: Pod
metadata:
  name: liveness
  labels:
    app: liveness-tcp
spec:
  containers:
  - name: liveness
    image: registry.k8s.io/liveness:0.1
    ports:
    - containerPort: 8080
    livenessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5

HTTPGetAction handler example

This example shows the HTTP handler, which sends an HTTP GET request on port 8080 to the /health path. If a code in the 200–399 range is returned, the probe is considered successful. If a code outside of this range is returned, the probe fails and the container is restarted. The httpHeaders option is used to define any custom headers you want to send.

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - name: liveness
    image: registry.k8s.io/liveness:0.1
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
        httpHeaders:
        - name: Custom-Header
          value: ItsAlive
      initialDelaySeconds: 5
      periodSeconds: 5

gRPC handler example

This example uses the gRPC health checking protocol to check that the application listening on port 2379 is responding. To use a gRPC probe, the port field must be configured; if the health endpoint is exposed on a non-default service, you must also specify the service. Built-in gRPC probes have no error codes, so every error is treated as a probe failure.

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-grpc
spec:
  containers:
  - name: liveness
    image: registry.k8s.io/liveness:0.1
    ports:
    - containerPort: 2379
    livenessProbe:
      grpc:
        port: 2379
      initialDelaySeconds: 5
      periodSeconds: 5

Key Points

Used correctly, liveness probes combined with readiness and startup probes improve Pod resilience and availability by automatically restarting a container as soon as a specified test fails. To define them well, you need to understand the application's behavior so the right options and thresholds can be chosen.

Also, take a look at how Spacelift helps you manage the complexities and compliance challenges of using Kubernetes. Anything that can be run via kubectl can be run within a Spacelift stack. Find out more about how Spacelift works with Kubernetes, and get started on your journey by creating a free trial account.
