CronJobs are a fundamental part of Linux/Unix automation, providing a straightforward way to schedule tasks to run automatically at specific times or intervals. In this article, we will dive into how to define a CronJob in Kubernetes, look at how to implement them in K8S manifest files with an example, and the available options.
Kubernetes CronJobs run tasks on a time-based schedule, a concept that has existed for decades in Linux and UNIX systems, where cron is a vital tool for system maintenance and automation. They can be used to run recurring tasks such as backup jobs, triggering emails, generating reports, or automating container restarts.
Traditionally on Unix-based systems, CronJobs work as follows:
- The user creates a cron job by using the crontab command to edit their “crontab” file. This file contains a list of commands or scripts to be executed and the times when they should be executed.
- The cron daemon, a background process that runs continuously, reads the crontab files of all users and checks if any jobs are scheduled to run at the current time.
- If a job is scheduled to run, the cron daemon executes the command or script associated with the job.
In Kubernetes, CronJobs are managed automatically by the cluster control plane, which creates regular Jobs from the pod spec in your CronJob object. A CronJob is a higher-level abstraction on top of a standard K8S Job: it creates a new Job on each scheduled run, repeating the cycle periodically.
The main benefit of using a CronJob in Kubernetes is that it allows you to automate recurring tasks, such as backups, data synchronization, batch processing, and maintenance jobs, saving manual effort wherever a scheduled job is the most appropriate option.
Some common use cases for CronJobs in K8S include:
1. Data backup
Schedule periodic backups of data within your applications or databases, ensuring that important data is regularly saved to a persistent storage or external location.
2. Database maintenance
Automate tasks such as database cleanup, reindexing, or data migration on a regular basis to maintain database performance and data integrity.
3. Log rotation and cleanup
Rotate and manage log files generated by applications to prevent log files from growing too large, which can impact system performance and storage space.
4. Certificate renewal
Automate the renewal of SSL/TLS certificates, ensuring that your applications always use up-to-date and valid certificates for secure communication.
5. Data synchronization
Periodically synchronize data between different systems or databases to keep information up-to-date across services.
6. Scheduled reports
Generate and send scheduled reports, such as daily, weekly, or monthly summaries, to users or stakeholders.
7. Batch processing
Run batch jobs for processing large volumes of data at specific intervals, such as nightly data aggregations, ETL (Extract, Transform, Load) processes, or data import/export tasks.
8. Scheduled cleanup
Automate the removal of temporary files, outdated data, or other resources to maintain system cleanliness and prevent resource exhaustion. You could also use a Cronjob to regularly clean up unused or expired resources to optimize resource utilization and reduce costs.
9. Maintenance tasks
Schedule routine maintenance tasks for your applications, such as database schema updates, software updates, or health checks.
10. Resource scaling
Scale application resources up or down based on demand, such as increasing the number of worker nodes during peak hours and reducing them during off-peak times.
11. Security scanning
Run regular security scans and vulnerability assessments on your applications and infrastructure to identify and mitigate security risks.
12. Monitoring and alerts
Automate the collection of metrics, logs, and system health checks at specific intervals, and trigger alerts or actions based on the collected data.
13. Content publishing
Schedule content publishing or updates for websites, blogs, or content management systems.
14. Compliance audits
Automate compliance checks and audits to ensure that your applications and infrastructure meet regulatory requirements.
15. Cache invalidation
Invalidate caches or perform cache purges on a predefined schedule to ensure that applications serve the latest data to users.
Kubernetes CronJob Schedule Syntax
A CronJob's schedule is defined using standard cron syntax:
# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12)
# │ │ │ │ ┌───────────── day of the week (0 - 6) (Sun to Sat;
# │ │ │ │ │ 7 is also Sunday on some systems)
# │ │ │ │ │ OR sun, mon, tue, wed, thu, fri, sat
# │ │ │ │ │
# * * * * *
The CronJob schedule is specified using five fields separated by spaces.
For example, running a job every day at 11 p.m. would be defined as:
0 23 * * *
A job running every minute would look like this:
* * * * *
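A few more common schedules, using the same standard cron syntax:

```
*/5 * * * *   # every 5 minutes
0 * * * *     # at the top of every hour
0 0 * * 0     # at midnight every Sunday
30 2 1 * *    # at 02:30 on the first day of each month
```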
Check out Crontab.guru to experiment with defining CronJobs.
For CronJobs with no time zone specified, the kube-controller-manager interprets schedules relative to its local time zone. As of Kubernetes v1.25 [beta], the CronJobTimeZone feature gate can be enabled, which allows a specific time zone to be set if required. For example:
spec.timeZone: "Etc/UTC"
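Putting this in context, a CronJob that fires at 11 p.m. UTC regardless of the controller's local time zone might look like the following sketch (the name and schedule are illustrative):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report          # illustrative name
spec:
  timeZone: "Etc/UTC"           # requires the CronJobTimeZone feature gate (beta in v1.25)
  schedule: "0 23 * * *"        # 11 p.m. in the time zone above
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: report
              image: busybox:1.28
              command: ["/bin/sh", "-c", "date"]
          restartPolicy: OnFailure
```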
How to Create CronJob in Kubernetes
To see a CronJob in action, we first need to specify the CronJob in a manifest file.
Create the file below and name it cronjob.yaml. Note that the kind is set to CronJob, and .spec.schedule is set to run the job every minute. The .spec.schedule is a required field of the .spec and takes a cron-format string, as detailed previously.
The name you provide must be a valid DNS subdomain name no longer than 52 characters. This is because the CronJob controller automatically appends 11 characters to the Job name, and Job names are constrained to a maximum length of 63 characters.
The example itself will output ‘Hello from the Kubernetes cluster’ to the logs.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: hello
              image: busybox:1.28
              imagePullPolicy: IfNotPresent
              command:
                - /bin/sh
                - -c
                - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
Create the deployment:
kubectl create -f cronjob.yaml
Verify the CronJob has been created:
kubectl get cronjob hello
The LAST SCHEDULE field shows how long ago the job last ran.
The ACTIVE field shows how many jobs are currently in progress; 0 means any previous runs have already completed or failed.
Note that the Job names differ from the CronJob name specified in the manifest file, as the controller appends a unique suffix for each run.
You can view the running jobs in real-time using the --watch argument.
kubectl get jobs --watch
To view the pods that have been created to run the jobs:
kubectl get pods
If you have lots of running pods, you'll want to filter the selection to the job name in your system using the --selector argument. Note that only the pods from the last three successful jobs are retained by default, unless a different value has been set via the optional spec.successfulJobsHistoryLimit field.
kubectl get pods --selector=job-name=hello-27827258
To view the logs from the pod to verify the command ran successfully:
kubectl logs hello-27827258--1-rdf4s
To clean up, delete the CronJob. Deleting the CronJob removes all the jobs and pods it created and stops it from creating additional jobs:
kubectl delete cronjob hello
See how to delete Pods from a Kubernetes Node.
There are a number of optional fields that can be used to further control CronJobs.
1. startingDeadlineSeconds
The starting deadline is an optional field that specifies the deadline in seconds for starting the job if it misses its scheduled time for any reason. After the deadline, the CronJob does not start the job, and runs missed in this way count as failed jobs.
If this field is not specified, the jobs have no deadline. For example, setting it to 60 allows a job to be created for up to 60 seconds after its scheduled time.
Note that if startingDeadlineSeconds is set to a value of less than 10 seconds, the CronJob may not be scheduled at all, because the CronJob controller checks for missed schedules every 10 seconds.
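The field sits directly under the CronJob spec; as a sketch (the schedule and value are illustrative):

```yaml
spec:
  schedule: "0 23 * * *"
  startingDeadlineSeconds: 60   # give up if the job cannot start within 60s of its slot
```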
2. concurrencyPolicy
The concurrency policy is also an optional field that specifies how to handle concurrently running jobs. This may be useful for jobs that need to run independently to avoid unwanted results. By default, if this is not specified, concurrent runs are always allowed.
- Allow (default): The CronJob allows concurrently running jobs.
- Forbid: The CronJob does not allow concurrent runs; if it is time for a new job run and the previous job run hasn’t finished yet, the CronJob skips the new job run.
- Replace: If it is time for a new job run and the previous job run hasn’t finished yet, the CronJob replaces the currently running job run with a new job run.
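For example, to skip a run whenever the previous one is still in progress (the schedule shown is illustrative):

```yaml
spec:
  schedule: "*/5 * * * *"
  concurrencyPolicy: Forbid   # skip a run if the previous one has not finished
```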
3. suspend
The suspend field is an optional boolean that can be set to either true or false. If it is set to true, all subsequent executions are suspended (executions that have already started are unaffected). The current suspend status can be viewed using the command below:
kubectl get cronjob hello
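In a manifest, the field is simply a boolean under the CronJob spec:

```yaml
spec:
  suspend: true   # pause all future runs; set back to false to resume
```

On a live CronJob, the same effect can be achieved without editing the manifest, for example with kubectl patch cronjob hello -p '{"spec":{"suspend":true}}'.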
4. successfulJobsHistoryLimit
An optional field specifying the number of successful jobs to keep in the history; the default is 3.
5. failedJobsHistoryLimit
Similar to the above option, the number of failed jobs to retain can also be specified. This defaults to 1.
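Both limits sit under the CronJob spec; setting either to 0 keeps no history of the corresponding kind:

```yaml
spec:
  successfulJobsHistoryLimit: 3   # keep the last 3 successful Jobs (the default)
  failedJobsHistoryLimit: 1       # keep the last failed Job (the default)
```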
Common errors and problems with CronJobs include:
Kubernetes not scheduling CronJob or CronJob stops scheduling jobs
If you encounter errors when setting up a CronJob, check the following:
- Syntax: CronJobs use the same syntax as traditional UNIX cron jobs, which can be complex and difficult to get right. Common errors include specifying the wrong number of fields, using incorrect wildcards, or mistyping the cron schedule. Check the syntax section of the article and copy your expression into crontab.guru to make sure the syntax is correct.
- Timezone: CronJobs run in the timezone of the Kubernetes cluster by default, which may not match the timezone of the user or application. This can lead to scheduling conflicts or unexpected behavior.
- Image: If a CronJob specifies an image that is unavailable, the job will fail.
- Resources: Jobs can fail to schedule if their resource requests are set too high for the cluster, or be killed if limits are set too low for the workload.
- Job concurrency: If a CronJob is set to run too frequently or with too many replicas, it might lead to excessive load on the cluster and cause other jobs to fail.
- Permissions: Check the CronJob has sufficient permissions to access resources or perform actions that are defined.
To troubleshoot CronJob errors, the Kubernetes logs will be the first port of call. You can use the kubectl logs command to inspect Pod and container logs. The CronJob controller itself runs as part of the kube-controller-manager, so on clusters where the control plane runs as pods, its logs can be retrieved with something like:
kubectl logs -n kube-system <kube-controller-manager-pod-name>
If you are using a centralized logging solution for your cluster (as is recommended), such as Elasticsearch, Fluentd, and Kibana (EFK), to collect and analyze logs from multiple nodes and containers, you should check the logs using those tools for deeper insight.
Monitoring solutions such as Prometheus, Datadog, Grafana, or New Relic can be used to track job execution, such as the number of successful and failed jobs, job duration, and resource usage.
Error status on Kubernetes CronJob with connection refused
If you’re encountering a Connection Refused error status in a Kubernetes CronJob, it typically indicates that the CronJob or the associated pods are unable to establish a network connection to the specified target or endpoint.
You should run through general network troubleshooting steps to resolve this, including:
- Verify that the target service, server, or endpoint the CronJob is trying to connect to is up and running.
- Ensure that the target service’s host and port information is correctly configured in your CronJob specification.
- If you are using network policies in your Kubernetes cluster, make sure that the CronJob pods have the necessary network policies to allow outgoing connections to the target.
- Ensure that DNS resolution is working correctly within your Kubernetes cluster. Pods should be able to resolve the DNS of the target service.
- Check if there are any external firewalls, network security groups, or cloud provider security settings that might be blocking the outgoing connections from your Kubernetes cluster to the target. Similarly, check if there are any internal network policies, firewalls, or proxy configurations within the Kubernetes cluster that could be affecting network connections. Don’t forget any egress controls or firewall rules that may be in place at the cluster level.
- If the target is a service within your Kubernetes cluster, make sure that the service and endpoints are correctly defined.
- Examine logs on the target side to see if there are any errors or issues that may help identify the cause of the “Connection Refused” error.
- Review any pod security policies or admission controllers that might be preventing the CronJob pods from making outbound connections.
- If you are relying on Kubernetes service discovery, confirm that the service’s DNS name is correct and that it resolves to the expected IP address.
- Verify that the container image used in the CronJob’s pods includes the necessary dependencies and configurations for making outbound network connections. It should not have any restrictions or misconfigurations that prevent networking.
- Consider implementing timeouts and retry mechanisms in your application to handle transient network issues. Sometimes, connection refused errors can be temporary.
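Where network policies are in play, a missing egress rule is a common culprit. As a sketch, a policy like the following (the name, labels, and CIDR are illustrative) would allow a CronJob's pods to perform DNS lookups and reach an external service over HTTPS:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-cronjob-egress    # illustrative name
spec:
  podSelector:
    matchLabels:
      app: my-cronjob           # must match the labels on the CronJob's pod template
  policyTypes:
    - Egress
  egress:
    - to:                       # allow DNS lookups cluster-wide
        - namespaceSelector: {}
      ports:
        - protocol: UDP
          port: 53
    - to:                       # allow HTTPS to the external target
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - protocol: TCP
          port: 443
```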
CronJobs are used to schedule recurring tasks in Kubernetes.
For more information on CronJobs check out the official documentation on kubernetes.io.
And take a look at how Spacelift helps you manage the complexities and compliance challenges of using Kubernetes. Anything that can be run via kubectl can be run within a Spacelift stack. Find out more about how Spacelift works with Kubernetes.
The Most Flexible CI/CD Automation Tool
Spacelift is an alternative to using homegrown solutions on top of a generic CI. It helps overcome common state management issues and adds several must-have capabilities for infrastructure management.