
How to Use Kubectl Scale Deployment in Kubernetes


Kubernetes Deployments are objects that manage a set of identical Pods. Kubernetes creates Pods based on the Deployment’s spec and then ensures the correct number of Pods stay running. If a Pod fails, it’s automatically replaced. This lets you run your services with high availability.

You can scale Kubernetes Deployments by adding or removing Pods at any time. Changing a Deployment’s replica count allows you to increase service capacity or optimize resource consumption. 

In this article, we’ll examine one of the easiest methods for scaling Deployments: the kubectl scale CLI command.

What we’ll cover:

  1. How to scale Kubernetes Pods with Deployments?
  2. Use cases for Kubernetes Deployment scaling
  3. What is the kubectl scale deployment command?
  4. How to scale Kubernetes Pods using kubectl scale deployment
  5. How to scale a Kubernetes Deployment to 0?
  6. Best practices for scaling Deployments

How to scale Kubernetes Pods with Deployments?

The Kubernetes Deployment object is one of the main ways to scale a set of Pods. It provides declarative Pod configuration, rolling updates, and a built-in scaling mechanism.

Creating Pods using a Deployment allows you to scale the replica count up or down at any time, without having to add or remove Pod objects manually. 

When you run kubectl scale, Kubernetes automatically starts and destroys Pods as required. It’ll use the Pod spec contained within the Deployment object to configure newly created Pods. Internally, Deployments use lower-level ReplicaSet objects to ensure the expected number of Pod replicas remains continually available.

Deployments are the most popular way to run stateless workloads in Kubernetes, but alternative Pod scaling mechanisms suit other use cases. StatefulSets allow you to scale stateful services such as file servers and databases, for example, while DaemonSets automatically distribute identical Pods across your cluster’s Nodes. For this article, we’re just focusing on how to scale Deployments using kubectl scale.

Use cases for Kubernetes Deployment scaling

Here are some reasons why you may need to scale a Kubernetes Deployment’s replica count with kubectl scale:

  • Increase service capacity: Increasing a Deployment’s replica count means more Pods will be available to serve traffic, increasing your service’s maximum capacity.
  • Ensure stable performance: Adding replicas using kubectl scale can help distribute service traffic more evenly, preventing slowdowns due to resource contention.
  • Optimize resource usage: You can use kubectl scale to remove unneeded Pods during quieter times, freeing up resources for other workloads.
  • Reduce operating costs: Scaling down may reduce your cluster’s total resource usage to a level that lets you also remove Nodes. This enables you to minimize spending when there’s low load.
  • Respond to load changes: Scaling your Deployments lets you respond to changes in your Deployment’s usage, enabling more efficient Kubernetes cluster operations.

Each of these points shares a common theme: Scaling a Deployment lets you change the number of Pods available for a specific service in your cluster. 

The kubectl scale command lets you quickly scale up to increase resiliency or scale down to reduce resource utilization and save costs.

What is the kubectl scale deployment command?

The kubectl scale deployment command adjusts the number of replicas in a Kubernetes Deployment.

It modifies the Deployment’s .spec.replicas field to increase or decrease the number of running Pods. This is typically used to manually scale workloads based on demand or during testing. 

For example, the command:

kubectl scale deployment my-app --replicas=5

updates the my-app Deployment to maintain five active Pods. Kubernetes ensures that the desired number of replicas is running by creating or terminating Pods as needed. 

How to scale Kubernetes Pods using kubectl scale deployment

Let’s walk through some simple examples of how to use kubectl scale to change a Kubernetes Deployment’s replica count.

1. Creating a Deployment

First, use kubectl create to start a Deployment for this tutorial:

$ kubectl create deployment demo --image=nginx:latest --replicas 3
deployment.apps/demo created

This command creates a new Deployment called demo. It’s configured to run three Pod replicas with the nginx:latest image:

$ kubectl get deployments
NAME   READY   UP-TO-DATE   AVAILABLE   AGE
demo   3/3     3            3           70m

$ kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
demo-57c86c54c5-kdf57   1/1     Running   0          70m
demo-57c86c54c5-tsgww   1/1     Running   0          70m
demo-57c86c54c5-tsk6s   1/1     Running   0          70m

The kubectl get commands shown above demonstrate that the three expected Pod replicas are running. Now you can use kubectl scale to scale your Deployment by adding and removing replicas.

2. Scaling a Deployment

The kubectl scale command requires two main arguments:

  • The name of the Deployment to scale.
  • A --replicas flag that specifies the number of replicas to scale the Deployment to.

Here’s an example that scales our demo Deployment up to five replicas:

$ kubectl scale deployment/demo --replicas 5
deployment.apps/demo scaled

Repeating the kubectl get deployments and kubectl get pods commands from earlier should now show that five replicas are running:

$ kubectl get deployments
NAME   READY   UP-TO-DATE   AVAILABLE   AGE
demo   5/5     5            5           72m

$ kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
demo-57c86c54c5-kdf57   1/1     Running   0          73m
demo-57c86c54c5-nwr8f   1/1     Running   0          51s
demo-57c86c54c5-tsgww   1/1     Running   0          73m
demo-57c86c54c5-tsk6s   1/1     Running   0          73m
demo-57c86c54c5-zd4g4   1/1     Running   0          51s

You’ve successfully used kubectl scale to scale your Deployment, without having to manually create any more Pods. Kubernetes has created new Pods automatically, based on your Deployment’s configuration.

You can now try scaling back to three replicas:

$ kubectl scale deployment/demo --replicas 3
deployment.apps/demo scaled

Kubernetes will automatically remove surplus Pods to achieve the new desired replica count:

$ kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
demo-57c86c54c5-kdf57   1/1     Running   0          80m
demo-57c86c54c5-tsgww   1/1     Running   0          80m
demo-57c86c54c5-tsk6s   1/1     Running   0          80m

Importantly, scaling changes happen automatically in either direction: You don’t need to tell kubectl scale to “scale up” or “scale down.” Kubernetes always applies the correct actions to achieve the new desired state. You just declare the number of replicas that you want to have running.

3. Preventing unexpected scaling changes

kubectl scale can have safety implications because the command immediately applies the requested scaling change. This could lead to earlier changes being overwritten during incidents and other fast-paced scenarios.

For instance, imagine if one engineer tries to scale up to five replicas, believing there are currently three replicas running, but another developer has already scaled up to six replicas. 

In this case, running kubectl scale --replicas 5 will actually remove one of the new replicas, unbeknownst to the first engineer.

kubectl scale allows you to prevent this problem by including the --current-replicas flag with your commands. When this flag is present, the scaling change will only proceed if the number of existing replicas matches --current-replicas:

# Scale up to 5 replicas, IF there is currently 1 replica running
$ kubectl scale deployment/demo --replicas 5 --current-replicas 1
error: Expected replicas to be 1, was 3

The example above shows Kubernetes rejecting the scaling change because three replicas are currently running instead of the expected one.

4. Scaling down to zero (pause or stop a Deployment)

You can use kubectl scale to pause a Deployment by scaling it right down to zero. This will remove all the Pods, but retain the Deployment object itself. You can then scale back up again in the future without having to recreate the entire Deployment.

$ kubectl scale deployment/demo --replicas 0
deployment.apps/demo scaled

After applying the scaling change, your Pods will be removed so your service will no longer serve traffic:

$ kubectl get deployments
NAME   READY   UP-TO-DATE   AVAILABLE   AGE
demo   0/0     0            0           96m

$ kubectl get pods
No resources found in default namespace.

5. Scaling multiple Deployments simultaneously

The examples above show how to use kubectl scale with a single Deployment, but you can also scale multiple Deployments at once.

To scale multiple named Deployments, specify each Deployment as a separate argument to the command:

$ kubectl scale deployment/demo deployment/demo-1 deployment/demo-2 --replicas 5

You can optionally scale all the Deployments in a namespace by setting the --all flag:

# Scale every Deployment in the "default" namespace to five replicas
$ kubectl scale deployment -n default --all --replicas 5
deployment.apps/demo scaled

Finally, kubectl scale also supports standard Kubectl flags to scale matching Deployments by label or from a Kubernetes manifest file:

# Scale Deployments labelled app=demo-app
$ kubectl scale deployment --replicas 5 -l app=demo-app

# Scale Deployments defined by YAML manifests in the manifests/ directory
$ kubectl scale --replicas 5 -f manifests/

6. More uses for the kubectl scale command

The kubectl scale command isn’t just for Deployments — it also works with other scalable Kubernetes objects, including ReplicaSets and StatefulSets. 

Simply replace deployment in the examples above with replicaset or statefulset. Alternatively, you can use the shorthand terms rs and sts, respectively.

# Scale the "database" StatefulSet to 3 replicas
$ kubectl scale sts database --replicas 3

Other options for scaling Kubernetes Deployments

The kubectl scale command isn’t the only way to scale Kubernetes Deployments. The command is simple and convenient, but its limitations mean it’s generally most useful for ad hoc testing and development processes.

Although kubectl scale lets you declare the number of Pods that should be running, the command itself is an imperative operation. In production environments, it’s best practice to use fully declarative configuration instead. This requires writing Kubernetes manifest files that you can then apply to your cluster with the kubectl apply command.

Here’s an example of the manifest file for a simple Kubernetes Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo-deployment
  template:
    metadata:
      labels:
        app: demo-deployment
    spec:
      containers:
        - name: nginx
          image: nginx:latest

The manifest defines a Deployment called demo-deployment that runs three Pod replicas. You can use kubectl apply to add this Deployment to your cluster:

$ kubectl apply -f deployment.yml
deployment.apps/demo-deployment created

You can then scale your Deployment by simply editing the replicas field within the manifest file, such as by changing replicas: 3 to replicas: 5. Kubernetes will then automatically apply the changes to your cluster when you run kubectl apply:

$ kubectl apply -f deployment.yml
deployment.apps/demo-deployment configured

Compared with kubectl scale, this approach is less susceptible to errors. Using Kubernetes manifest files with kubectl apply lets you manage your Deployments using IaC and GitOps. You can commit your manifests to a Git repository, open pull requests to apply scaling changes, and then run kubectl apply to update your cluster in a CI/CD pipeline. This reduces the risk of conflicts when several DevOps engineers are scaling Deployments up and down.

Finally, you can also automate Deployment scaling operations using the Kubernetes HorizontalPodAutoscaler component. This allows you to dynamically change Pod replica counts based on observed conditions such as CPU utilization. 

Autoscaling ensures the number of running replicas is always matched to load, removing the need to manually run kubectl scale or edit your Deployment’s replicas manifest field.
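As an illustrative sketch, here's what a minimal HorizontalPodAutoscaler manifest targeting the demo Deployment from earlier might look like (the replica bounds and 50% CPU target are example values, and the HPA assumes the Metrics Server is installed in your cluster):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: demo-hpa
spec:
  # The Deployment this autoscaler manages
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo
  # Replica count will stay within these bounds
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          # Add or remove Pods to keep average CPU utilization near 50%
          type: Utilization
          averageUtilization: 50
```

Applying this manifest with kubectl apply hands replica management over to the autoscaler, so you should avoid running kubectl scale against the same Deployment while the HPA is active.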

Best practices for scaling a Deployment

To scale a Kubernetes Deployment effectively, combine manual scaling with autoscaling and infrastructure-aware strategies.

  1. Use the Horizontal Pod Autoscaler (HPA): Enable dynamic scaling based on CPU, memory, or custom metrics using kubectl autoscale. This adapts the Deployment to real-time demand.
  2. Use kubectl scale for controlled changes: For fixed scaling needs or temporary overrides, use kubectl scale deployment <name> --replicas=<count>. Avoid this when an HPA is active to prevent conflicts.
  3. Set resource requests and limits: Define resources.requests and resources.limits for each container. This ensures accurate scheduling and enables effective HPA decisions.
  4. Plan cluster capacity: Confirm the cluster has enough resources to support new Pods. Use the Cluster Autoscaler for dynamic Node scaling in cloud environments.
  5. Use readiness probes: Configure a readinessProbe so only healthy Pods receive traffic during scaling events. This ensures availability and smooth rollouts.
  6. Leverage rolling updates: Ensure Deployments are configured for rolling updates to prevent service disruption during scaling or version changes.
  7. Monitor scaling behavior: Use monitoring tools like Prometheus or CloudWatch to observe Pod metrics and validate that scaling matches demand patterns.
  8. Use declarative manifests for consistency: Define replica counts and HPA configs in manifests for version control and traceability, especially in GitOps workflows.

Managing Kubernetes resources with Spacelift

If you need help managing your Kubernetes projects, consider Spacelift. It brings a GitOps flow, so your Kubernetes Deployments are synced with your Kubernetes Stacks, and pull requests show you a preview of what they're planning to change. 

With Spacelift, you get:

  • Policies to control what kind of resources engineers can create, what parameters they can have, how many approvals you need for a run, what kind of task you execute, what happens when a pull request is open, and where to send your notifications
  • Stack dependencies to build multi-infrastructure automation workflows, combining Terraform with Kubernetes, Ansible, and other infrastructure-as-code (IaC) tools such as OpenTofu, Pulumi, and CloudFormation
  • Self-service infrastructure via Blueprints, enabling your developers to do what matters – developing application code while not sacrificing control
  • Creature comforts such as contexts (reusable containers for your environment variables, files, and hooks), and the ability to run arbitrary code
  • Drift detection and optional remediation

If you want to learn more about Spacelift, create a free account today or book a demo with one of our engineers.

Key points

We’ve learned how to use the kubectl scale command to scale Kubernetes Deployments. This simple command allows you to quickly add and remove Deployment replicas to increase service capacity or free up resources for neighboring workloads. We’ve also seen how kubectl scale can be used with other scalable Kubernetes objects like StatefulSets.

The kubectl scale command is one of the easiest ways to instantly scale Kubernetes Deployments. It’s a good choice when you’re experimenting with Kubernetes, managing short-lived testing workloads, or making urgent changes in production. 

For other scenarios, we recommend using Kubernetes manifest files and kubectl apply to declaratively configure your Deployment’s replica count. This allows you to version your changes, improve deployment safety, and avoid manual kubectl scale commands.

Ready to read more about Kubernetes scaling? Check out our other blog articles to learn about horizontal, vertical, and cluster scaling or the different Deployment rollout strategies you can use.


Frequently asked questions

  • How to scale a Kubernetes Deployment to 0?

    To scale a Kubernetes Deployment to 0, use the kubectl scale command or modify the Deployment manifest to set replicas: 0.

  • What does the kubectl scale command do?

    The kubectl scale command adjusts the number of replicas for a Kubernetes resource such as a Deployment, StatefulSet, or ReplicaSet. It is used to manually increase or decrease the number of pod instances.
