Kubernetes Deployments are objects that manage a set of identical Pods. Kubernetes creates Pods based on the Deployment’s spec and then ensures the correct number of Pods stay running. If a Pod fails, it’s automatically replaced. This lets you run your services with high availability.
You can scale Kubernetes Deployments by adding or removing Pods at any time. Changing a Deployment’s replica count allows you to increase service capacity or optimize resource consumption.
In this article, we’ll examine one of the easiest methods for scaling Deployments: the kubectl scale CLI command.
The Kubernetes Deployment object is one of the main ways to scale a set of Pods. It provides declarative Pod configuration, rolling updates, and a built-in scaling mechanism.
Creating Pods using a Deployment allows you to scale the replica count up or down at any time, without having to add or remove Pod objects manually.
When you run kubectl scale, Kubernetes automatically starts and destroys Pods as required. It’ll use the Pod spec contained within the Deployment object to configure newly created Pods. Internally, Deployments use lower-level ReplicaSet objects to ensure the expected number of Pod replicas remains continually available.
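You can observe this relationship directly: every Deployment creates and owns a ReplicaSet behind the scenes. As a quick, illustrative check (assuming you already have at least one Deployment in the current namespace):
# List the ReplicaSets that Deployments create and manage behind the scenes
$ kubectl get replicasets
# The same command, using the "rs" shorthand
$ kubectl get rs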
Deployments are the most popular way to run stateless workloads in Kubernetes, but alternative Pod scaling mechanisms suit other use cases. StatefulSets allow you to scale stateful services such as file servers and databases, for example, while DaemonSets automatically distribute identical Pods across your cluster’s Nodes. For this article, we’re just focusing on how to scale Deployments using kubectl scale.
Here are some reasons why you may need to scale a Kubernetes Deployment’s replica count with kubectl scale:
- Increase service capacity: Increasing a Deployment’s replica count means more Pods will be available to serve traffic, increasing your service’s maximum capacity.
- Ensure stable performance: Adding replicas using kubectl scale can help distribute service traffic more evenly, preventing slowdowns due to resource contention.
- Optimize resource usage: You can use kubectl scale to remove unneeded Pods during quieter times, freeing up resources for other workloads.
- Reduce operating costs: Scaling down may reduce your cluster’s total resource usage to a level that lets you also remove Nodes. This enables you to minimize spending when there’s low load.
- Respond to load changes: Scaling your Deployments lets you respond to changes in your Deployment’s usage, enabling more efficient Kubernetes cluster operations.
Each of these points shares a common theme: Scaling a Deployment lets you change the number of Pods available for a specific service in your cluster.
The kubectl scale command lets you quickly scale up to increase resiliency or scale down to reduce resource utilization and save costs.
The kubectl scale deployment command adjusts the number of replicas in a Kubernetes Deployment. It modifies the Deployment’s .spec.replicas field to increase or decrease the number of running Pods. This is typically used to manually scale workloads based on demand or during testing.
For example, the command:
kubectl scale deployment my-app --replicas=5
updates the my-app Deployment to maintain five active Pods. Kubernetes ensures that the desired number of replicas is running by creating or terminating Pods as needed.
Let’s walk through some simple examples of how to use kubectl scale to change a Kubernetes Deployment’s replica count.
1. Creating a Deployment
First, use kubectl create to start a Deployment for this tutorial:
$ kubectl create deployment demo --image=nginx:latest --replicas 3
deployment.apps/demo created
This command creates a new Deployment called demo. It’s configured to run three Pod replicas with the nginx:latest image:
$ kubectl get deployments
NAME   READY   UP-TO-DATE   AVAILABLE   AGE
demo   3/3     3            3           68m
$ kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
demo-57c86c54c5-kdf57   1/1     Running   0          70m
demo-57c86c54c5-tsgww   1/1     Running   0          70m
demo-57c86c54c5-tsk6s   1/1     Running   0          70m
The kubectl get commands shown above demonstrate that the three expected Pod replicas are running. Now you can use kubectl scale to scale your Deployment by adding and removing replicas.
2. Scaling a Deployment
The kubectl scale command requires two main arguments:
- The name of the Deployment to scale.
- A --replicas flag that specifies the number of replicas to scale the Deployment to.
Here’s an example that scales our demo Deployment up to five replicas:
$ kubectl scale deployment/demo --replicas 5
deployment.apps/demo scaled
Repeating the kubectl get deployments and kubectl get pods commands from earlier should now show that five replicas are running:
$ kubectl get deployments
NAME   READY   UP-TO-DATE   AVAILABLE   AGE
demo   5/5     5            5           72m
$ kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
demo-57c86c54c5-kdf57   1/1     Running   0          73m
demo-57c86c54c5-nwr8f   1/1     Running   0          51s
demo-57c86c54c5-tsgww   1/1     Running   0          73m
demo-57c86c54c5-tsk6s   1/1     Running   0          73m
demo-57c86c54c5-zd4g4   1/1     Running   0          51s
You’ve successfully used kubectl scale to scale your Deployment, without having to manually create any more Pods. Kubernetes has created new Pods automatically, based on your Deployment’s configuration.
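If you’re curious how Kubernetes handled this behind the scenes, the Deployment’s event history records each scaling action (the exact event wording varies between Kubernetes versions):
# Show the Deployment's details, including its recent scaling events
$ kubectl describe deployment demo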
You can now try scaling back to three replicas:
$ kubectl scale deployment/demo --replicas 3
deployment.apps/demo scaled
Kubernetes will automatically remove surplus Pods to achieve the new desired replica count:
$ kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
demo-57c86c54c5-kdf57   1/1     Running   0          80m
demo-57c86c54c5-tsgww   1/1     Running   0          80m
demo-57c86c54c5-tsk6s   1/1     Running   0          80m
Importantly, scaling changes happen automatically in either direction: You don’t need to tell kubectl scale to “scale up” or “scale down.” Kubernetes always applies the correct actions to achieve the new desired state. You just declare the number of replicas that you want to have running.
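If you want to double-check the desired state you’ve declared, you can read the replica count straight back from the Deployment object. For example, using the demo Deployment from this tutorial:
# Print the Deployment's desired replica count (its .spec.replicas field)
$ kubectl get deployment demo -o jsonpath='{.spec.replicas}'
# Compare it with the number of Pods that are actually ready
$ kubectl get deployment demo -o jsonpath='{.status.readyReplicas}'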
3. Preventing unexpected scaling changes
kubectl scale can have safety implications because the command immediately applies the requested scaling change. This could lead to earlier changes being overwritten during incidents and other fast-paced scenarios.
For instance, imagine if one engineer tries to scale up to five replicas, believing there are currently three replicas running, but another developer has already scaled up to six replicas.
In this case, running kubectl scale --replicas 5 will actually remove one of the new replicas, unbeknownst to the first engineer.
kubectl scale allows you to prevent this problem by including the --current-replicas flag with your commands. When this flag is present, the scaling change will only proceed if the number of existing replicas matches --current-replicas:
# Scale up to 5 replicas, IF there is currently 1 replica running
$ kubectl scale deployment/demo --replicas 5 --current-replicas 1
error: Expected replicas to be 1, was 3
The example above shows Kubernetes rejecting the scaling change because three replicas are currently running instead of the expected one.
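When the expected count does match the live state, the command succeeds as usual. Continuing from this tutorial, where three replicas are currently running:
# Scale up to 5 replicas, IF there are currently 3 replicas running
$ kubectl scale deployment/demo --replicas 5 --current-replicas 3
deployment.apps/demo scaled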
4. Scaling down to zero (pause or stop a Deployment)
You can use kubectl scale to pause a Deployment by scaling it right down to zero. This will remove all the Pods, but retain the Deployment object itself. You can then scale back up again in the future without having to recreate the entire Deployment.
$ kubectl scale deployment/demo --replicas 0
deployment.apps/demo scaled
After applying the scaling change, your Pods will be removed so your service will no longer serve traffic:
$ kubectl get deployments
NAME   READY   UP-TO-DATE   AVAILABLE   AGE
demo   0/0     0            0           96m
$ kubectl get pods
No resources found in default namespace.
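When you’re ready to resume the service, scale back up and Kubernetes recreates the Pods from the spec still stored in the Deployment object:
# Restore the Deployment to three replicas
$ kubectl scale deployment/demo --replicas 3
deployment.apps/demo scaled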
5. Scaling multiple Deployments simultaneously
The examples above show how to use kubectl scale with a single Deployment, but you can also scale multiple Deployments at once.
To scale multiple named Deployments, specify each Deployment as a separate argument to the command:
$ kubectl scale deployment/demo deployment/demo-1 deployment/demo-2 --replicas 5
You can optionally scale all the Deployments in a namespace by setting the --all flag:
# Scale every Deployment in the "default" namespace to five replicas
$ kubectl scale deployment -n default --all --replicas 5
deployment.apps/demo scaled
Finally, kubectl scale also supports standard kubectl flags to scale matching Deployments by label or from a Kubernetes manifest file:
# Scale Deployments labelled app=demo-app
$ kubectl scale deployment --replicas 5 -l app=demo-app
# Scale Deployments defined by YAML manifests in the manifests/ directory
$ kubectl scale deployment --replicas 5 -f manifests/
6. More uses for the kubectl scale command
The kubectl scale command isn’t just for Deployments. It also works with other scalable Kubernetes objects, including ReplicaSets and StatefulSets.
Simply replace deployment in the examples above with replicaset or statefulset. Alternatively, you can use the shorthand terms rs and sts, respectively.
# Scale the "database" StatefulSet to 3 replicas
$ kubectl scale sts database --replicas 3
Other options for scaling Kubernetes Deployments
The kubectl scale command isn’t the only way to scale Kubernetes Deployments. The command is simple and convenient, but its limitations mean it’s generally most useful for ad hoc testing and development processes.
Although kubectl scale lets you declare the number of Pods that should be running, the command itself is an imperative operation. In production environments, it’s best practice to use fully declarative configuration instead. This requires writing Kubernetes manifest files that you can then apply to your cluster with the kubectl apply command.
Here’s an example of the manifest file for a simple Kubernetes Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo-deployment
  template:
    metadata:
      labels:
        app: demo-deployment
    spec:
      containers:
        - name: nginx
          image: nginx:latest
The manifest defines a Deployment called demo-deployment that runs three Pod replicas. You can use kubectl apply to add this Deployment to your cluster:
$ kubectl apply -f deployment.yml
deployment.apps/demo-deployment created
You can then scale your Deployment by simply editing the replicas field within the manifest file, such as by changing replicas: 3 to replicas: 5. Kubernetes will then automatically apply the changes to your cluster when you run kubectl apply:
$ kubectl apply -f deployment.yml
deployment.apps/demo-deployment configured
Compared with kubectl scale, this approach is less susceptible to errors. Using Kubernetes manifest files with kubectl apply lets you manage your Deployments using IaC and GitOps. You can commit your manifests to a Git repository, open pull requests to apply scaling changes, and then run kubectl apply to update your cluster in a CI/CD pipeline. This reduces the risk of conflicts when several DevOps engineers are scaling Deployments up and down.
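As a minimal sketch of what that workflow can look like (the file name deployment.yml, the branch, and the commit message are all illustrative, and the final kubectl apply would typically run inside your pipeline rather than on a laptop):
# Edit the manifest to declare the new replica count
$ sed -i 's/replicas: 3/replicas: 5/' deployment.yml
# Commit and push the change for review
$ git add deployment.yml
$ git commit -m "Scale demo-deployment to 5 replicas"
$ git push origin main
# After the pull request merges, the CI/CD job applies the manifest
$ kubectl apply -f deployment.yml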
Finally, you can also automate Deployment scaling operations using the Kubernetes HorizontalPodAutoscaler component. This allows you to dynamically change Pod replica counts based on observed conditions such as CPU utilization.
Autoscaling ensures the number of running replicas is always matched to load, removing the need to manually run kubectl scale or edit your Deployment’s replicas manifest field.
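For example, the kubectl autoscale command creates a HorizontalPodAutoscaler for an existing Deployment. The thresholds below are illustrative, and CPU-based autoscaling assumes your cluster has a metrics source such as metrics-server and that your Pods declare CPU requests:
# Keep the demo Deployment between 2 and 10 replicas, targeting 70% average CPU utilization
$ kubectl autoscale deployment demo --min=2 --max=10 --cpu-percent=70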
To scale a Kubernetes Deployment effectively, combine manual scaling with autoscaling and infrastructure-aware strategies.
- Use Horizontal Pod Autoscaler (HPA): Enable dynamic scaling based on CPU, memory, or custom metrics using kubectl autoscale. This adapts the Deployment to real-time demand.
- Use kubectl scale for controlled changes: For fixed scaling needs or temporary overrides, use kubectl scale deployment <name> --replicas=<count>. Avoid this when HPA is active to prevent conflicts.
- Set resource requests and limits: Define resources.requests and resources.limits for each container. This ensures accurate scheduling and enables effective HPA decisions (see the example manifest snippet after this list).
- Plan cluster capacity: Confirm the cluster has enough resources to support new Pods. Use Cluster Autoscaler for dynamic Node scaling in cloud environments.
- Use readiness probes: Configure readiness probes so only healthy Pods receive traffic during scaling events. This ensures availability and smooth rollouts.
- Leverage rolling updates: Ensure Deployments are configured for rolling updates to prevent service disruption during scaling or version changes.
- Monitor scaling behavior: Use monitoring tools like Prometheus or CloudWatch to observe Pod metrics and validate that scaling matches demand patterns.
- Use declarative manifests for consistency: Define replica counts and HPA configs in manifests for version control and traceability, especially in GitOps workflows.
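As an illustration of the resource requests/limits and readiness probe recommendations above, here’s how the container section of the earlier demo-deployment manifest could be extended. The specific CPU, memory, and probe values are placeholders to adapt to your own workload:
    # Fragment of the Deployment's Pod template (.spec.template.spec)
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 250m
              memory: 256Mi
          readinessProbe:
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 10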
If you need help managing your Kubernetes projects, consider Spacelift. It provides a GitOps flow, so your Kubernetes Deployments are synced with your Kubernetes Stacks, and pull requests show you a preview of what they’re planning to change.
With Spacelift, you get:
- Policies to control what kind of resources engineers can create, what parameters they can have, how many approvals you need for a run, what kind of task you execute, what happens when a pull request is open, and where to send your notifications
- Stack dependencies to build multi-infrastructure automation workflows, giving you the ability to combine Terraform with Kubernetes, Ansible, and other infrastructure-as-code (IaC) tools such as OpenTofu, Pulumi, and CloudFormation
- Self-service infrastructure via Blueprints, enabling your developers to do what matters – developing application code while not sacrificing control
- Creature comforts such as contexts (reusable containers for your environment variables, files, and hooks), and the ability to run arbitrary code
- Drift detection and optional remediation
If you want to learn more about Spacelift, create a free account today or book a demo with one of our engineers.
We’ve learned how to use the kubectl scale command to scale Kubernetes Deployments. This simple command allows you to quickly add and remove Deployment replicas to increase service capacity or free up resources for neighboring workloads. We’ve also seen how kubectl scale can be used with other scalable Kubernetes objects like StatefulSets.
The kubectl scale command is one of the easiest ways to instantly scale Kubernetes Deployments. It’s a good choice when you’re experimenting with Kubernetes, managing short-lived testing workloads, or making urgent changes in production.
For other scenarios, we recommend using Kubernetes manifest files and kubectl apply to declaratively configure your Deployment’s replica count. This allows you to version your changes, improve deployment safety, and avoid manual kubectl scale commands.
Ready to read more about Kubernetes scaling? Check out our other blog articles to learn about horizontal, vertical, and cluster scaling or the different Deployment rollout strategies you can use.
Manage Kubernetes easier and faster
Spacelift allows you to automate, audit, secure, and continuously deliver your infrastructure. It helps overcome common state management issues and adds several must-have features for infrastructure management.
Frequently asked questions
How to scale Kubernetes deployment to 0?
To scale a Kubernetes deployment to 0, use the kubectl scale command or modify the deployment manifest to set replicas: 0.
What does the kubectl scale command do?
The kubectl scale command adjusts the number of replicas for a Kubernetes resource such as a Deployment, StatefulSet, or ReplicaSet. It is used to manually increase or decrease the number of pod instances.