A liveness probe checks that the application inside a container is alive and operational, based on a test you define. Defining probes correctly can improve Pod resilience and availability.
In this article, we will look at liveness probes in Kubernetes (K8s) and walk through practical examples.
A Kubernetes liveness probe is a mechanism used to determine if a container is still running properly. If a liveness probe fails, Kubernetes will restart the container to restore functionality.
Liveness probes are commonly used to detect application deadlocks, crashes, or unresponsive states that cannot be self-recovered. They are configured using commands, HTTP requests, or TCP socket checks, ensuring Kubernetes can intervene when a container stops responding.
The role of liveness probes in Kubernetes
The kubelet uses liveness probes to determine when to restart a container. Applications that experience errors or transition to broken states are identified, and restarting them often resolves the issue.
If the configured liveness probe succeeds, no action is taken, and no logs are recorded. If it fails, the event is logged, and the kubelet kills the container according to the configured restartPolicy.
A liveness probe should be used when a pod appears to be running, but the application does not function correctly. For example, in a deadlock situation, the pod may be running but will be unable to serve traffic and is effectively not working.
They are not necessary if the application is configured to crash the container on failure: the kubelet checks the restartPolicy and automatically restarts the container if it is set to Always or OnFailure. NGINX is a good example: it starts up quickly and exits if it hits an error that stops it from serving pages, so a liveness probe is unnecessary in this situation.
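For illustration, a minimal sketch of a Pod that relies on the restart policy alone (the Pod name and image tag are placeholders, not from the original article):
apiVersion: v1
kind: Pod
metadata:
  name: nginx-no-probe   # hypothetical name
spec:
  restartPolicy: Always  # default; kubelet restarts the container whenever it exits
  containers:
  - name: nginx
    image: nginx:1.25    # assumed tag; any recent NGINX image behaves this way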
Types of liveness probes
Liveness probes support three types of checks, each of which is suitable for different applications, ensuring Kubernetes maintains healthy workloads:
- HTTP GET probe – Sends an HTTP GET request to a specified endpoint. It fails if the HTTP response code is not in the 2xx or 3xx range.
- TCP socket probe – Checks whether a TCP connection to a specified port succeeds, which is useful for applications that do not expose an HTTP endpoint.
- Command probe (exec probe) – Runs a command inside the container; if the command exits with a non-zero status, the probe fails.
Kubernetes has three types of probes used to monitor the health and readiness of containers: liveness probe, readiness probe, and startup probe. This article will focus on the use of liveness probes, but you should also be aware of the other types:
Readiness probes
Readiness probes monitor when the application becomes available. If the probe fails, no traffic is sent to the Pod. They are used when an app needs time or configuration before it is ready to serve. An application may also become overloaded with traffic, causing the probe to fail; this stops more traffic from being sent to it and gives it time to recover. While the probe is failing, the endpoints controller removes the Pod from the endpoints of all matching Services.
If the readiness probe fails but the liveness probe succeeds, the kubelet determines that the container is not ready to receive network traffic but is still working to become ready.
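A minimal sketch of a readiness probe, assuming the application exposes a /ready endpoint on port 8080 (both values are placeholders):
readinessProbe:
  httpGet:
    path: /ready       # assumed readiness endpoint
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10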
Startup probes
The kubelet uses startup probes to detect when a container application has started. When these are configured, liveness and readiness checks are disabled until they are successful, ensuring startup probes don’t interfere with the application startup.
These are particularly useful with slow-starting containers, preventing the kubelet from killing them via a failing liveness probe before they are up and running. If a liveness probe is used on the same endpoint as a startup probe, set the failureThreshold of the startup probe high enough to cover the longest expected startup time.
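For example, a sketch of a startup probe that tolerates up to 5 minutes of startup (failureThreshold × periodSeconds = 30 × 10s; the endpoint and port are assumptions):
startupProbe:
  httpGet:
    path: /health      # assumed endpoint, shared with the liveness probe
    port: 8080
  failureThreshold: 30
  periodSeconds: 10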
If a startup probe fails, the event is logged, and the kubelet kills the container according to the configured restartPolicy.
How do Kubernetes probes work?
The kubelet manages the probes and is the primary “node agent” that runs on each node.
To effectively use a Kubernetes probe, the application must support one of the following handlers:
- ExecAction handler – Runs a command inside the container; the diagnostic succeeds if the command completes with status code 0.
- TCPSocketAction handler – Attempts a TCP connection to the Pod's IP address on a specific port. The diagnostic succeeds if the port is found to be open.
- HTTPGetAction handler – Performs an HTTP GET request against the Pod's IP address, a specific port, and a specified path. The diagnostic succeeds if the response code is between 200 and 399.
- gRPC handler – If your application implements the gRPC Health Checking Protocol, the kubelet can use it for liveness checks. gRPC probes were introduced behind the GRPCContainerProbe feature gate, which is enabled by default from Kubernetes v1.24; the feature graduated to stable in v1.27, where the gate is no longer needed.
When the kubelet performs a probe on a container, the result is Success if the diagnostic passed, Failure if it failed, or Unknown if the diagnostic could not complete for some reason.
What is the difference between liveness and readiness probes?
Liveness probes are useful for detecting when an application is stuck due to deadlocks or unexpected failures. If the probe fails, Kubernetes restarts the container. Readiness probes, on the other hand, help manage rolling updates and startup sequences by delaying traffic routing until the application is fully ready.
For example, a database-dependent application might fail its readiness probe until the database connection is established, while a liveness probe ensures it keeps running without getting stuck. Both probes improve application resilience by ensuring smooth operation and failure recovery.
You define a liveness probe in the Pod's YAML configuration under the livenessProbe field.
In the example below, we define a liveness probe for a container running in a Pod. The probe checks the container's health by making an HTTP request to the /health endpoint on port 8080. If this request fails, Kubernetes considers the container unhealthy and restarts it.
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: example-container
    image: myapp:latest
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 3
      periodSeconds: 5
- initialDelaySeconds: 3 specifies that the probe should wait 3 seconds after the container starts before performing the first check.
- periodSeconds: 5 ensures that the probe runs every 5 seconds to verify the container's health continuously.
Kubernetes also supports alternative probe mechanisms, such as exec, which runs a command inside the container, and tcpSocket, which checks whether a specific TCP port is open.
For example, in an exec probe, the container executes the command cat /tmp/healthy. If the command fails (i.e., the file is missing or unreadable), the container is deemed unhealthy and restarted.
livenessProbe:
  exec:
    command: ["cat", "/tmp/healthy"]
  initialDelaySeconds: 5
  periodSeconds: 10
In each of the examples shown below, the periodSeconds field specifies that the kubelet should perform a liveness probe every five seconds, and the initialDelaySeconds field tells the kubelet to wait five seconds before performing the first probe.
In addition to these options, you can also configure:
- timeoutSeconds – Time to wait for a reply before the probe attempt times out (default: 1).
- successThreshold – Number of consecutive successful probe executions required to mark the container healthy (default: 1; for liveness and startup probes, this must be 1).
- failureThreshold – Number of consecutive failed probe executions required to mark the container unhealthy (default: 3).
These five parameters can be used in all types of liveness probes.
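Putting it together, a hedged sketch of a liveness probe with all five timing parameters spelled out (the endpoint and port are placeholders):
livenessProbe:
  httpGet:
    path: /health        # assumed endpoint
    port: 8080
  initialDelaySeconds: 5 # wait 5s after container start
  periodSeconds: 5       # probe every 5s
  timeoutSeconds: 2      # fail the attempt if no reply within 2s
  successThreshold: 1    # must be 1 for liveness probes
  failureThreshold: 3    # restart after 3 consecutive failures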
Before defining a probe, observe the system behavior and average startup times of the Pod and its containers to determine the correct thresholds. Also, update the probe options as the infrastructure or application evolves. For example, the Pod may be configured to use more system resources, which might affect the values that need to be configured for the probes.
Example 1: ExecAction handler
The example below uses the exec handler to check whether a file exists at the path /usr/share/liveness/html/index.html using the cat command. If the file does not exist, the liveness probe fails, and the container is restarted.
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: registry.k8s.io/liveness:0.1
    ports:
    - containerPort: 8080
    livenessProbe:
      exec:
        command:
        - cat
        - /usr/share/liveness/html/index.html
      initialDelaySeconds: 5
      periodSeconds: 5
Example 2: TCPSocketAction handler
In this example, the liveness probe uses the TCP handler to check that port 8080 is open and responding. With this configuration, the kubelet attempts to open a socket to your container on the specified port. If the liveness probe fails, the container is restarted.
apiVersion: v1
kind: Pod
metadata:
  name: liveness
  labels:
    app: liveness-tcp
spec:
  containers:
  - name: liveness
    image: registry.k8s.io/liveness:0.1
    ports:
    - containerPort: 8080
    livenessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
Example 3: HTTPGetAction handler
This example shows the HTTP handler, which sends an HTTP GET request on port 8080 to the /health path. If a code between 200 and 399 is returned, the probe is considered successful; any other code marks the probe as unsuccessful, and the container is restarted. The httpHeaders option defines any custom headers you want to send.
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - name: liveness
    image: registry.k8s.io/liveness:0.1
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
        httpHeaders:
        - name: Custom-Header
          value: ItsAlive
      initialDelaySeconds: 5
      periodSeconds: 5
Example 4: gRPC handler
This example shows how to use the gRPC Health Checking Protocol to check whether port 2379 is responding. To use a gRPC probe, the port must be configured. If the health endpoint is served by a non-default service, you must also specify the service field. All errors are considered probe failures, as built-in gRPC probes have no error codes.
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-grpc
spec:
  containers:
  - name: liveness
    image: registry.k8s.io/liveness:0.1
    ports:
    - containerPort: 2379
    livenessProbe:
      grpc:
        port: 2379
      initialDelaySeconds: 5
      periodSeconds: 5
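If the health endpoint registers a named gRPC service rather than the default empty name, a sketch of the extra field looks like this (the service name here is a hypothetical example):
livenessProbe:
  grpc:
    port: 2379
    service: my-health-service   # hypothetical name registered with the gRPC health server
  initialDelaySeconds: 5
  periodSeconds: 5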
To check if a liveness probe is working in Kubernetes, follow these steps:
- Check pod events – Run kubectl describe pod <pod-name> -n <namespace> and look under the Events section for messages related to the liveness probe, such as failures or restarts.
- Inspect pod logs – Check whether the container is restarting due to probe failures with kubectl logs <pod-name> -n <namespace>. If the container restarts repeatedly, the liveness probe might be failing.
- Check pod status – Run kubectl get pods -n <namespace>. If the pod is in a CrashLoopBackOff or Restarting state, the liveness probe may be failing.
- Describe the deployment – For deployments, inspect the configuration with kubectl get deployment <deployment-name> -o yaml -n <namespace>. Look for the livenessProbe section under containers.
- Manually test the liveness probe endpoint – If the probe uses an HTTP endpoint, test it with curl: curl http://<pod-ip>:<liveness-port>/<probe-path>. For a command-based probe, try running the command inside the container: kubectl exec <pod-name> -- <probe-command>
- Check Kubernetes events – Events provide more details on probe failures: kubectl get events --sort-by=.metadata.creationTimestamp
If the liveness probe is failing, review the configuration in your Pod definition and ensure the probe’s command, HTTP endpoint, or TCP socket is correctly defined and accessible. You can also try the troubleshooting tips below.
Let’s consider three common scenarios where Kubernetes liveness probes fail, along with troubleshooting steps.
1. Container startup takes too long
A common reason for a failing liveness probe is that the containerized application takes longer to start than the configured probe threshold. By default, Kubernetes starts checking the liveness probe immediately after the container starts (unless a delay is configured).
If the application is not ready, Kubernetes may restart the container repeatedly, preventing it from fully initializing.
Troubleshooting steps:
- Check pod logs (kubectl logs <pod-name>) to see if the application is still initializing when the liveness probe fails.
- Increase the initialDelaySeconds in the probe configuration to give the application more time to start.
- If the application's initialization process is slow (e.g., database migrations, loading large models), consider implementing a startup probe so liveness checks don't begin until the app is up, and a readiness probe to ensure traffic isn't sent prematurely. See the sketch after this list.
2. Incorrect probe configuration (Path, Port, or Protocol)
If the probe is configured incorrectly, such as pointing to a non-existent endpoint, using the wrong port, or expecting an incorrect response format, Kubernetes will continuously fail the liveness check and restart the container.
Troubleshooting steps:
- Verify that the application's health check endpoint (path) is correctly exposed.
- Check that the container listens on the correct port (mismatches often happen when ports are changed in deployment configurations).
- Use kubectl exec to manually test the liveness probe command inside the container, as shown below.
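For example, a couple of hedged manual checks (the file path, endpoint, and port are assumptions, and many minimal images ship wget rather than curl):
# Run the exec probe's command by hand and inspect its exit code
kubectl exec <pod-name> -- cat /tmp/healthy; echo "exit code: $?"

# Hit the HTTP endpoint from inside the container (assumes wget is present in the image)
kubectl exec <pod-name> -- wget -qO- http://localhost:8080/health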
3. Resource constraints causing timeouts
When a container is under high CPU or memory pressure, the application might become unresponsive, leading to timeout failures in the liveness probe. Kubernetes will interpret this as a failure and restart the pod, worsening the issue (especially in resource-constrained environments).
Troubleshooting steps:
- Check pod resource usage with kubectl top pods or kubectl describe pod <pod-name>.
- Increase the timeoutSeconds value if the application takes longer to respond under load.
- Adjust resource requests and limits to prevent resource starvation, as in the sketch below.
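As a sketch, a container spec that pairs a more tolerant probe timeout with explicit resource requests and limits (the container name, image, endpoint, and all values are illustrative):
containers:
- name: app                  # hypothetical container
  image: myapp:latest
  resources:
    requests:
      cpu: "250m"
      memory: "256Mi"
    limits:
      cpu: "500m"
      memory: "512Mi"
  livenessProbe:
    httpGet:
      path: /health          # assumed endpoint
      port: 8080
    timeoutSeconds: 5        # allow slower replies under load
    periodSeconds: 10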
Below are some best practices to follow when using Kubernetes liveness probes:
- Keep liveness probes simple and lightweight. Misconfigured probes can impact application performance if they run too frequently, or can leave containers sitting in an unhealthy state for extended periods. Containers that execute a simple operation and then terminate quickly often don't need probes at all, so avoid unnecessary probe configurations.
- Use a combination of readiness and liveness probes to ensure that your application is running properly and can handle incoming traffic.
- If a liveness probe takes too long to complete, Kubernetes may assume the application is not running and restart it, even if it is actually still running. Set a realistic timeout: check how long the probe's command, API request, or gRPC call actually takes to complete, then add a small buffer.
- If a liveness probe fails a certain number of times, Kubernetes will restart the application. Set a failure threshold for your liveness probes to avoid unnecessary restarts of your application.
- Check that your container restart policies work with your probes. Your containers need restartPolicy: Always (the default) or restartPolicy: OnFailure so Kubernetes can restart them after a failed probe; the Never policy will leave the container in a failed state.
- Use the kubectl command-line tool to test your probes and make sure that they are correctly configured.
- Choose the appropriate probe type for your application. For example, use an HTTP probe for a web application, while a TCP probe might be more appropriate for a database. The target of your probe's command or HTTP request should generally be independent of your main application so that it can run to completion even during failure conditions.
- Monitor your liveness probes to ensure that they are working as expected. As changes are made to your application, be sure to update the probes to reflect any changes. Set up alerts to notify you if a probe fails, and monitor the logs for any errors related to your probes.
If you need assistance managing your Kubernetes projects, look at Spacelift. It brings with it a GitOps flow, so your Kubernetes Deployments are synced with your Kubernetes Stacks, and pull requests show you a preview of what they’re planning to change.
You can also use Spacelift to mix and match Terraform, Pulumi, AWS CloudFormation, and Kubernetes Stacks and have them talk to one another.
To take this one step further, you could add custom policies to reinforce the security and reliability of your configurations and deployments. Spacelift provides different types of policies and workflows that are easily customizable to fit every use case. For instance, you could add plan policies to restrict or warn about security or compliance violations or approval policies to add an approval step during deployments.
You can try Spacelift for free by creating a trial account or booking a demo with one of our engineers.
Correctly combining liveness probes with readiness and startup probes can improve Pod resilience and availability by triggering an automatic restart of a container once a specified test detects a failure. To define them correctly, you must understand the application's behavior so you can specify the appropriate options.
Manage Kubernetes easier and faster
Spacelift allows you to automate, audit, secure, and continuously deliver your infrastructure. It helps overcome common state management issues and adds several must-have features for infrastructure management.