Kubernetes is an open-source container orchestrator that automates container deployment, scaling, and administration tasks. It provides helpful abstraction layers for running containerized workloads in production.
Kubernetes is a distributed system that horizontally scales containers across multiple physical hosts termed Nodes. This produces fault-tolerant deployments which adapt to conditions such as Node resource pressure, instability, and elevated external traffic levels. If one Node suffers an outage, Kubernetes can reschedule your containers onto neighboring healthy Nodes.
The system provides auto-scaling functionality that adjusts deployment replica counts automatically as load changes. Managed Kubernetes services such as Google GKE and Amazon EKS support dynamic Node creation too. This helps you ride out traffic spikes by increasing your cluster’s capacity without requiring manual intervention.
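As a rough illustration of how a replica count can follow load, the Horizontal Pod Autoscaler documentation describes a ratio rule: desired replicas = ceil(current replicas × current metric / target metric). The sketch below shows only that arithmetic. The function name, the tolerance value, and the min/max bounds are assumptions made for the example, not the autoscaler's actual code.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Illustrative version of the ratio rule used for horizontal autoscaling."""
    ratio = current_metric / target_metric
    # Skip scaling when the metric is already close to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))

# 4 replicas averaging 90% CPU against a 60% target -> scale out
print(desired_replicas(4, current_metric=90, target_metric=60))
# 4 replicas averaging 30% CPU against a 60% target -> scale in
print(desired_replicas(4, current_metric=30, target_metric=60))
```

The same rule scales in both directions, which is why the replica count drops again once a traffic spike has passed.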
All this functionality means the Kubernetes architecture is relatively complex. Several different components work together to create a functioning cluster.
In this article, we’ll look at each major item in turn to help you understand better how Kubernetes works.
Kubernetes follows a client-server architecture with two main components: the control plane and Worker Nodes.
Worker Nodes are the machines where the actual application workloads (packaged as pods) run. Each Worker Node contains essential components such as kubelet (to communicate with the control plane), a container runtime (e.g., Docker or containerd), and the kube-proxy (to manage networking and load balancing).
Worker Nodes provide the infrastructure of the data plane, which also includes the Node-level components that run and connect your workloads.
The control plane is the central management entity of the Kubernetes cluster, responsible for maintaining the desired state of the cluster. It consists of:
- API Server: Acts as the interface for managing the cluster and communicates with all components
- Scheduler: Assigns pods to Nodes based on resource availability and policies
- Controller Manager: Handles background tasks such as maintaining node health, scaling, and other cluster-wide operations
- etcd: A key-value store that stores all cluster data, including configuration and state
In addition to the core components, Kubernetes clusters often include add-ons to extend functionality. These could be, for example:
- Networking solutions (e.g., Calico, Cilium)
- Monitoring tools (e.g., Prometheus, Grafana)
- Ingress controllers (e.g., NGINX Ingress Controller)
- Storage solutions (e.g., Ceph, Longhorn)
Kubernetes control plane vs data plane
The control plane manages and orchestrates the cluster. It acts as the “brain” of the Kubernetes cluster, maintaining the desired system state. The data plane runs the actual application workloads. It consists of the Worker Nodes and the components on them that interact with the control plane to execute tasks.
This clear separation between the control plane and data plane ensures that Kubernetes remains scalable, modular, and capable of maintaining high availability.
Aspect | Control Plane | Data Plane |
Purpose | Manages and orchestrates the cluster | Executes workloads and provides networking |
Location | Runs on dedicated control plane (master) Nodes | Runs on Worker Nodes |
Components | API Server, Scheduler, Controllers, etcd | Kubelet, Kube-Proxy, Container Runtime, Pods |
Interaction | Interfaces with administrators and users | Interfaces with workloads and control plane |
Kubernetes aims to lower the management overhead of running fleets of containers in production. It achieves this by pooling multiple compute Nodes into one logical platform termed a cluster. Deploying workloads to your Kubernetes cluster automatically starts containers on one or more of the Nodes.
Kubernetes includes several abstraction layers that you use to define your application. Here are some of the most common workload objects:
1. Pod
Pods are the fundamental compute unit in Kubernetes. A Pod is a group of one or more containers that share the same specification. All the containers in the Pod are scheduled to the same host.
The diagram below shows the architecture of a Pod:
2. Deployment
A Deployment wraps the lower-level ReplicaSet object. It guarantees a certain number of replicas of a Pod will be running in your cluster. Deployments also provide declarative updates for Pods; you describe the desired state, and the Deployment will automatically add, replace, and remove Pods to achieve it.
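To make "describing the desired state" more concrete, here is a minimal sketch that builds a Deployment manifest as a Python dictionary and prints it as JSON, a format kubectl accepts alongside YAML. The name `web`, the `nginx:1.27` image, and the replica count are illustrative assumptions.

```python
import json

def deployment_manifest(name, image, replicas):
    """Build a minimal apps/v1 Deployment describing the desired state."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the Pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }

manifest = deployment_manifest("web", "nginx:1.27", replicas=3)
print(json.dumps(manifest, indent=2))
```

Saving the output to a file and passing it to `kubectl apply -f` would hand this desired state to the cluster. Changing `replicas` and applying the file again is how a declarative update is expressed.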
3. Service
Services expose Pods as a network service. You use services to permit access to Pods, either within your cluster via automatic service discovery, or externally through an Ingress. (Read more: What is a Kubernetes Service?)
4. Job
A Job starts one or more Pods and waits for them to successfully terminate. Kubernetes also provides CronJobs to automatically create Jobs on a recurring schedule.
5. DaemonSets and StatefulSets
Other kinds of workloads include DaemonSets and StatefulSets. DaemonSets replicate a Pod to every Node in your cluster, while StatefulSets provide persistent replica identities.
See StatefulSet vs. Deployment.
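The difference between the two can be sketched in a few lines of Python. StatefulSet Pods are named with a stable ordinal suffix (`<name>-0`, `<name>-1`, and so on), while a DaemonSet yields one Pod per Node. The function names and the `log-agent` label below are hypothetical and only model the naming and placement behavior.

```python
def daemonset_pods(nodes, name="log-agent"):
    """A DaemonSet runs one copy of the Pod on each Node."""
    return {node: f"{name}@{node}" for node in nodes}

def statefulset_pods(name, replicas):
    """StatefulSet Pods keep stable, ordinal identities across restarts."""
    return [f"{name}-{ordinal}" for ordinal in range(replicas)]

print(daemonset_pods(["node-a", "node-b", "node-c"]))
print(statefulset_pods("postgres", 3))
```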
Kubernetes control plane components
The Kubernetes control plane is your cluster’s management surface. It stores the cluster’s state, monitors for changes, and applies any required actions. Actions can be initiated by users via the API or triggered in response to Node events, such as increased memory pressure.
The control plane gets created automatically when you deploy a cluster. It often runs on a dedicated Node, ensuring it’s isolated from your workloads for maximum performance and security. This is not mandatory, though — the machine that runs the control plane can also be used as a regular Node.
The control plane is a collective term for many different components. Together, they provide everything needed to administer your cluster but not actually start and run containers. Node-level software provides those functions, which we’ll see in the next section.
1. API Server (kube-apiserver)
The API server is the control plane component that exposes the Kubernetes REST API. You’re using this API whenever you run a command with kubectl. You’ll lose management access to your cluster when the API server fails, but your workloads won’t necessarily be affected.
2. Controller manager (kube-controller-manager)
Much of Kubernetes is built upon the controller pattern. A controller is a loop that continually monitors your cluster and performs actions when certain events occur. The Deployment controller launches new Pods when you create a Deployment object, for example.
The controller manager runs the built-in controllers that ship with Kubernetes. It starts their control loops and ensures they stay operational while your cluster is running.
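The controller pattern can be sketched as a function that compares desired state with observed state and closes the gap. This is a simplified model under stated assumptions: the `reconcile` name and the `pod-N` naming are invented for the example, and real controllers work through the API server rather than a local list.

```python
def reconcile(desired_replicas, running_pods, next_id):
    """One pass of a control loop: compare desired and observed state."""
    pods = list(running_pods)
    while len(pods) < desired_replicas:
        pods.append(f"pod-{next_id}")   # start a missing Pod
        next_id += 1
    while len(pods) > desired_replicas:
        pods.pop()                      # remove a surplus Pod
    return pods, next_id

pods, next_id = [], 0
pods, next_id = reconcile(3, pods, next_id)   # a new Deployment asks for 3 Pods
pods.remove("pod-1")                          # simulate one Pod crashing
pods, next_id = reconcile(3, pods, next_id)   # the next pass replaces it
print(pods)
```

Because each pass looks at the current state rather than at individual events, running the loop again after any disruption converges back to the desired replica count.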
3. Scheduler (kube-scheduler)
The scheduler is responsible for placing newly created Pods onto the Nodes in your cluster. The scheduling process works by first filtering out Nodes that can’t host the Pod, and then scoring each eligible Node to identify the most suitable placement.
Nodes could be filtered out because of insufficient CPU or memory, inability to satisfy the Pod’s affinity rules, or other factors such as being cordoned for maintenance. The scoring process prioritizes Nodes that satisfy non-mandatory conditions like preferred affinities. If several Nodes appear to be equally suitable, Kubernetes will try to evenly distribute your Pods across them.
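The filter-then-score flow can be modeled in a short Python sketch. The `Node` and `Pod` classes, their fields, and the scoring rule (preferred labels first, then spare CPU) are assumptions chosen for illustration. The real scheduler uses a much richer set of plugins.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    free_cpu: float                      # cores
    free_memory: int                     # MiB
    labels: dict = field(default_factory=dict)
    cordoned: bool = False

@dataclass
class Pod:
    cpu: float
    memory: int
    preferred_labels: dict = field(default_factory=dict)

def feasible(node, pod):
    """Filtering: drop Nodes that cannot host the Pod at all."""
    return (not node.cordoned
            and node.free_cpu >= pod.cpu
            and node.free_memory >= pod.memory)

def score(node, pod):
    """Scoring: reward preferred labels first, then spare CPU."""
    matches = sum(1 for key, value in pod.preferred_labels.items()
                  if node.labels.get(key) == value)
    return (matches, node.free_cpu - pod.cpu)

def schedule(nodes, pod):
    candidates = [node for node in nodes if feasible(node, pod)]
    if not candidates:
        return None                      # the Pod would remain Pending
    return max(candidates, key=lambda node: score(node, pod)).name

nodes = [
    Node("node-a", free_cpu=0.5, free_memory=4096),
    Node("node-b", free_cpu=4.0, free_memory=8192, cordoned=True),
    Node("node-c", free_cpu=2.0, free_memory=2048, labels={"disk": "ssd"}),
    Node("node-d", free_cpu=3.0, free_memory=2048),
]
pod = Pod(cpu=1.0, memory=1024, preferred_labels={"disk": "ssd"})
print(schedule(nodes, pod))
```

Here `node-a` and `node-b` never reach the scoring step, and `node-c` wins over `node-d` because it satisfies the preferred (non-mandatory) label even though it has less spare CPU.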
4. Etcd
Etcd is a distributed key-value storage system. Kubernetes stores your cluster’s state within etcd. The main role of etcd is to hold every API object, including config values and sensitive data you store in ConfigMaps and Secrets.
Etcd is the most security-critical control plane component. Successfully compromising it would permit full access to your Kubernetes data. It’s important that etcd receives adequate hardware resources, too, as any starvation can affect the performance and stability of your entire cluster.
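To picture what "holding every API object" means, the sketch below models a key-value store with a global revision counter and prefix listing. `TinyStore` is a hypothetical in-memory stand-in, not the etcd client API. The key layout follows the API server's default `/registry` prefix, but treat the exact paths and stored values as assumptions.

```python
class TinyStore:
    """In-memory stand-in for a key-value store with a global revision."""

    def __init__(self):
        self.data = {}
        self.revision = 0

    def put(self, key, value):
        self.revision += 1               # every write bumps the revision
        self.data[key] = (value, self.revision)
        return self.revision

    def get(self, key):
        return self.data[key][0]

    def list_prefix(self, prefix):
        return sorted(key for key in self.data if key.startswith(prefix))

store = TinyStore()
store.put("/registry/configmaps/default/app-config", {"LOG_LEVEL": "info"})
store.put("/registry/secrets/default/db-password", {"password": "czNjcjN0"})
store.put("/registry/pods/default/web-0", {"phase": "Running"})
print(store.list_prefix("/registry/secrets/"))
```

Anyone who can read the store directly can list and fetch Secrets this way, bypassing the API server's access controls. This is why access to etcd itself needs to be tightly restricted.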
5. Cloud Controller Manager
The Cloud Controller Manager integrates Kubernetes with your cloud provider’s platform. It facilitates interactions between your cluster and its outside environment. This component is involved whenever Kubernetes objects change your cloud account, such as by provisioning a load balancer, adding a block storage volume, or creating a virtual machine to act as a Node.
Kubernetes node components
Nodes are the physical or virtual machines that host the Pods in your cluster. Although it’s possible to run a cluster with a single Node, production environments should include several so you can horizontally scale your resources and achieve high availability.
Nodes join the cluster using a token issued by the control plane. Once a Node is admitted, the control plane starts scheduling new Pods for it. Each Node runs several software components to start containers and maintain communication with the control plane.
The diagram below shows the architecture of a node:
1. Kubelet
Kubelet is the Node-level process that acts as the control plane’s agent. It periodically checks in with the control plane to report the state of the Node’s workloads. The control plane can contact Kubelet when it wants to schedule a new Pod on the Node.
Kubelet is also responsible for running Pod containers. It pulls the images required by newly scheduled Pods and starts containers to produce the desired state. Once the containers are up, Kubelet monitors them to ensure they remain healthy.
Read more about Kubernetes Image Pull Policy.
2. Kube-Proxy
The kube-proxy component facilitates network communication for the Services in your cluster. It runs on every Node and automatically applies and maintains networking rules so that traffic sent to a Service reaches one of the Pods behind it. If kube-proxy fails, Service traffic to and from that Node may not be routed correctly.
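Conceptually, the rules kube-proxy maintains map a Service address to the set of Pods behind it. The sketch below models that mapping in Python. The real component programs packet-forwarding rules in the operating system rather than routing in application code, and the `ServiceTable` class and all addresses are hypothetical.

```python
import random

class ServiceTable:
    """Conceptual mapping from a Service address to backend Pod addresses."""

    def __init__(self, seed=None):
        self.backends = {}
        self.rng = random.Random(seed)

    def sync(self, service, pod_addresses):
        # Called whenever the set of ready Pods behind a Service changes.
        self.backends[service] = list(pod_addresses)

    def route(self, service):
        pods = self.backends.get(service)
        if not pods:
            raise LookupError(f"no ready backends for {service}")
        return self.rng.choice(pods)     # pick one backend per connection

table = ServiceTable(seed=1)
table.sync("10.96.0.10:80", ["10.244.1.5:8080", "10.244.2.7:8080"])
print(table.route("10.96.0.10:80"))
```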
3. Container runtime
Each Node requires a CRI-compatible runtime so it can start your containers.
The containerd runtime is the most popular option, but alternatives such as CRI-O can be used instead, as can Docker Engine through the cri-dockerd adapter. The runtime uses operating system features such as cgroups and namespaces to achieve containerization.
Kubernetes is highly extensible, so you can customize it to suit your environment. The control plane and Node-level software stacks are the most important architecture aspects, but several other aspects are significant.
Kubernetes networking uses a plugin-based approach. A CNI-compatible networking plugin must be installed to allow Pods to reach each other. Most popular Kubernetes distributions include a plugin, but you’ll have to manually install a solution such as Calico or Flannel when you deploy a cluster from scratch.
Storage provisioning can work very differently depending on your cloud provider. Storage Classes provide a consistent interface for accessing different types of storage in your workloads. You can add storage classes to save data to different platforms, such as a local volume on a Node’s filesystem or your cloud platform’s block storage volumes.
Kubernetes also has an external dependency on a container registry. You’ll need somewhere central to store your container images. You can run a registry inside your cluster, but this is not included with the default Kubernetes distribution.
Custom functionality
You can add your own Kubernetes abstractions with custom resource definitions (CRDs). CRDs extend the API with support for your own data structures.
You can build the functionality around your CRDs by writing controllers and operators. These facilitate advanced automated workflows, such as automatically provisioning a database when you add a PostgresDatabaseConnection object to your cluster.
They use the same fundamental concepts as built-in functionality: you author a control loop that watches for new objects and performs tasks when they occur.
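A custom controller of this kind might be sketched as a handler for watch events. The event types `ADDED`, `MODIFIED`, and `DELETED` are the ones the Kubernetes watch API emits. Everything else here is hypothetical, including the `spec.size` field and the in-memory `databases` dictionary that stands in for a real provisioning backend.

```python
def handle_event(event, databases):
    """React to a watch event for a hypothetical PostgresDatabaseConnection."""
    obj = event["object"]
    name = obj["metadata"]["name"]
    if event["type"] in ("ADDED", "MODIFIED"):
        # A real operator would call a database provisioning API here.
        databases[name] = {"size": obj["spec"]["size"], "status": "ready"}
    elif event["type"] == "DELETED":
        databases.pop(name, None)
    return databases

def connection(name, size):
    return {"metadata": {"name": name}, "spec": {"size": size}}

databases = {}
events = [
    {"type": "ADDED", "object": connection("orders", "10Gi")},
    {"type": "ADDED", "object": connection("billing", "5Gi")},
    {"type": "DELETED", "object": connection("orders", "10Gi")},
]
for event in events:
    handle_event(event, databases)
print(databases)
```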
Complexity and many moving parts create the potential for security problems. Hardening Kubernetes is a substantial topic: the system purports to be production-ready, but in practice, you need to take several manual actions to protect yourself fully.
Enabling encryption at rest for etcd is one essential step. Your cluster’s data isn’t encrypted by default, so passwords and certificates in Secrets are stored as unencrypted, base64-encoded values. It’s also important to secure the API server, avoid running other software on your Nodes, and use features like network policies to isolate your workloads from each other. You can learn how to strengthen your cluster in our Kubernetes security guide.
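Secret values appear base64-encoded in API objects, which is an encoding rather than encryption. The short Python sketch below shows why that offers no protection on its own. The password value is made up for the example.

```python
import base64

# How a value looks once it has been placed into a Secret's data field.
stored_value = base64.b64encode(b"s3cr3t-password").decode()
print(stored_value)

# Anyone who can read the stored value can reverse it without a key.
recovered = base64.b64decode(stored_value).decode()
print(recovered)
```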
You can create Kubernetes clusters in several ways:
- Using cloud providers – Amazon EKS, Google GKE, Azure AKS, and other cloud providers offer you fully fledged K8s services that can be deployed easily.
- Self-hosted deployments – You can leverage kubeadm to bootstrap a K8s cluster into your own infrastructure.
- Lightweight options – You can use Minikube, kind (Kubernetes IN Docker), MicroK8s, or K3s for running lightweight distributions of K8s.
For this example, we will set up a Kind cluster. To do that, we need to have Docker installed on our machine.
As I’m using macOS, I will install kind by simply running:
brew install kind
Check out this installation guide if you are using a different operating system.
Now, we can easily set up a K8s cluster by running:
kind create cluster --name k8s
You can add any name you want, but if you omit the option, your cluster will be called kind.
This is the output of creating the cluster:
kind create cluster --name k8s
Creating cluster "k8s" ...
✓ Ensuring node image (kindest/node:v1.31.2) 🖼
✓ Preparing nodes 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
Set kubectl context to "kind-k8s"
You can now use your cluster with:
kubectl cluster-info --context kind-k8s
Have a nice day! 👋
Now, to use the cluster, run the kubectl cluster-info command specified above. Let’s also check the nodes our cluster has:
kubectl cluster-info --context kind-k8s
Kubernetes control plane is running at
CoreDNS is running at
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-control-plane Ready control-plane 2m14s v1.31.2
Kubernetes architecture requires the control plane components to remain available at all times. Although the system is naturally resilient to Node failures, a problem in the control plane will affect the whole system. Its central role makes it a potential weak point in the architecture.
You can mitigate this risk by running multiple replicas of each component. This produces a highly available control plane that’s distributed across several Nodes, similarly to your Pods. It takes more time to set up but provides additional safety for production clusters. If you’re using a managed Kubernetes service, check with your provider to see if a high-availability control plane option is available.
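When sizing a replicated control plane, it helps to know that etcd needs a majority of its members (a quorum) to stay available. The sketch below shows that arithmetic. The function names are ours, and the member counts are just examples.

```python
def quorum(members):
    """Smallest majority of a cluster with the given number of members."""
    return members // 2 + 1

def tolerated_failures(members):
    """How many members can fail while a majority remains."""
    return members - quorum(members)

for members in (1, 3, 4, 5):
    print(members, quorum(members), tolerated_failures(members))
```

Three members tolerate one failure, and so do four, which is why odd member counts such as three or five are the usual choice.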
If you need assistance managing your Kubernetes projects, look at Spacelift. It brings with it a GitOps flow, so your Kubernetes Deployments are synced with your Kubernetes Stacks, and pull requests show you a preview of what they’re planning to change.
You can also use Spacelift to mix and match Terraform, Pulumi, AWS CloudFormation, and Kubernetes Stacks and have them talk to one another.
To take this one step further, you could add custom policies to reinforce the security and reliability of your configurations and deployments. Spacelift provides different types of policies and workflows that are easily customizable to fit every use case. For instance, you could add plan policies to restrict or warn about security or compliance violations or approval policies to add an approval step during deployments.
You can try Spacelift for free by creating a trial account or booking a demo with one of our engineers.
Kubernetes is a complete platform for deploying, scaling, and managing distributed systems using containers. The price of this powerful functionality is a complex and potentially confusing architecture. Kubernetes clusters are formed from several pieces, each with its own vital role.
In this article, you’ve learned how the fundamental components, such as Kubelet, kube-scheduler, etcd, and the API server, combine to create an operational cluster. You can now continue your Kubernetes journey by deploying your own environment, either from scratch with kubeadm or using a managed cloud service such as Amazon EKS, Google GKE, or DigitalOcean DOKS. You can get all the information you need to use Kubernetes in our beginner’s guide, or try some of the advanced topics elsewhere on our blog.