A Kubernetes Pod allows multiple containers to share the same network space and, optionally, the same storage. Using a single container is often the right choice for a Pod. Still, there are several common patterns for when you should use multiple containers. Multiple containers in a Pod allow for better separation of concerns and can improve container image reusability, scalability, and fault isolation.
In this article, you will learn about one of the multi-container patterns — the sidecar container pattern — along with its use cases and best practices, and walk through a detailed demo of implementing it in a Pod.
A Kubernetes sidecar container is an additional container that runs alongside a primary application container within a Pod. The sidecar container pattern follows the principle of separating concerns and keeping individual components of an application isolated.
The primary application containers typically contain the main business logic or application code, while the sidecar container serves as a helper container providing complementary functionality or services to support the primary container. The sidecar container runs in the same network namespace as the primary containers, enabling them to communicate and share resources efficiently.
Both sidecar containers and init containers in Kubernetes are used within a Pod to extend or enhance the primary containers’ functionality. However, they serve different purposes and have different characteristics.
A sidecar container provides supplementary services, functionalities, or features to support the primary containers or the application as a whole. An init container, on the other hand, runs and completes its execution before the primary containers start running. It performs initialization or setup tasks required by the primary containers or the Pod, such as downloading or preparing data, configuring resources, or waiting for specific conditions.
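To make the difference concrete, here is a minimal sketch of a Pod that uses both patterns. The container names, images, and the db-service endpoint are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: patterns-demo
spec:
  # Init containers run one at a time, to completion,
  # before any of the primary containers start.
  initContainers:
  - name: wait-for-db
    image: busybox
    command: ["sh", "-c", "until nc -z db-service 5432; do sleep 2; done"]
  # The containers below start together and run for the Pod's
  # lifetime; log-agent is the sidecar.
  containers:
  - name: app
    image: nginx
  - name: log-agent
    image: busybox
    command: ["sh", "-c", "tail -F /var/log/app/access.log"]
```

If the init container fails, the primary containers never start, which is exactly the guarantee you want for setup tasks; the sidecar, by contrast, is restarted alongside the application for as long as the Pod lives.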
Sidecar containers in Kubernetes Pods offer a versatile approach to extend and enhance the functionality of the primary containers. Here are some common use cases for sidecar containers:
1. Logging and monitoring:
A sidecar container can collect logs generated by the primary containers and forward them to a centralized logging system. It can also capture metrics and monitoring data from the primary containers and send them to a monitoring system or dashboard.
2. Service discovery and load balancing:
A sidecar container can handle service discovery within the pod or communicate with external service discovery mechanisms. It can perform load balancing across multiple instances of the primary containers or distribute incoming traffic based on specific rules or algorithms.
3. Security and authentication:
A sidecar container can handle SSL termination for the primary containers, offloading the encryption and decryption tasks. It can also provide authentication and authorization mechanisms, ensuring secure access to the primary containers and handling user or token validation.
4. Caching and content delivery:
A sidecar container can implement caching mechanisms to improve the primary containers’ performance by storing frequently accessed data. It can integrate with content delivery networks (CDNs) or reverse proxy functionality to cache and serve static assets, reducing the load on the primary containers.
5. Data synchronization and replication:
A sidecar container can handle data synchronization tasks, keeping the data in the primary container(s) in sync with external databases or services. It can replicate data across multiple primary container instances to ensure consistency and availability.
6. File watching and hot reloading:
A sidecar container can monitor file changes within the pod and trigger hot reloading or dynamic configuration updates in the primary containers without requiring a restart.
The above are just a few examples of the various use cases for sidecar containers within Kubernetes pods. The flexibility and modularity provided by sidecar containers allow for integrating additional functionalities and services into a pod without directly modifying the primary containers.
In this demo, you will implement a sidecar container as a primary container’s logging agent. The sidecar will stream logs from the primary container to a storage location.
This demo’s sidecar container will use Fluentd — an open-source data collector often used as a logging layer — with an S3 plugin installed to stream log files in the primary container to Amazon S3 — an object storage infrastructure for storing and retrieving vast amounts of data over the internet.
Note: Though this article uses Fluentd and Amazon S3, it is important to know that you can use any other data collector like Logstash as the logging agent and any other storage like MinIO to store log data.
Step 1: Prerequisites
Before you implement a sidecar container for logging with Fluentd and Amazon S3, ensure you have the following prerequisites:
- A Kubernetes cluster. The demos in this article were done using minikube — a single-node Kubernetes cluster.
- The kubectl command-line tool configured to communicate with the cluster.
- An AWS account for S3 access. You can create a free account if you don’t have an AWS account.
- AWS CLI installed in your terminal and configured to your AWS account. If you don’t have an AWS CLI, learn how to install it here and how to connect it to your account here.
Also, create a namespace to store the demo resources for easy cleanup:
$ kubectl create namespace sidecar-logging-demo
Set the namespace as the default for the current context so all the resources you create are placed in it:
$ kubectl config set-context $(kubectl config current-context) --namespace=sidecar-logging-demo
Step 2: Create the S3 bucket
In Amazon S3, you store data as objects within resources called “buckets”. You can create a bucket via the AWS CLI, programmatically, or via the S3 Console. Here, we will create an S3 bucket using the AWS CLI.
In your terminal, run the following command to create a demo S3 bucket:
$ aws s3 mb s3://<your-bucket-name>
The above command will create an S3 bucket with your specified bucket name in your default AWS region and print a confirmation like make_bucket: <your-bucket-name>.
Your bucket name must be globally unique and must not contain spaces or uppercase letters. To create the bucket in a specific region, use the --region option, for example:
$ aws s3 mb s3://<your-bucket-name> --region eu-west-1
There are other options you can use when creating an S3 bucket. To learn more about them, see the AWS CLI s3 reference.
You can verify that the bucket was created with the aws s3 ls command, or by navigating to the S3 Console:
$ aws s3 ls
You should see your bucket in the output.
Step 3: Create the multi-container Pod
As you read earlier, a sidecar container is an additional container that runs alongside one or more primary application containers within a Pod.
For this demo, the sidecar will collect the logs from a primary container (BusyBox) and forward them to your S3 bucket.
To use Fluentd to collect data from the primary container, you need to create a fluentd.conf configuration file, which defines the source (a description of where the data comes from) and the match (what Fluentd should do with the data).
The match config for this demo tells Fluentd to send data to an S3 bucket. There are other directives you can add to a fluentd.conf file, such as filter and worker, but they are not needed here.
To create the fluentd.conf in Kubernetes, you use a ConfigMap API object. In any directory of your choice, create a new file fluentd-sidecar-config.yaml with the following command:
$ touch fluentd-sidecar-config.yaml
In the fluentd-sidecar-config.yaml file, add the following Kubernetes ConfigMap configuration using your vim or nano editor:
# First log source (tailing a file at /var/log/1.log)
# Second log source (tailing a file at /var/log/2.log)
# S3 output configuration (Store files every minute in the bucket's logs/ folder)
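The YAML itself was elided above, so what follows is a reconstruction based on the comments and the rest of the walkthrough. It is a sketch: the ConfigMap name (fluentd-config), the pos_file paths, and the buffer options are assumptions, and the placeholders must be replaced with your own bucket, region, and credentials:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config   # assumed name; the Pod's configMap volume must reference it
data:
  # The key becomes the filename under the mount path (/fluentd/etc/fluent.conf),
  # matching the path passed to Fluentd via FLUENTD_ARGS.
  fluent.conf: |
    # First log source (tailing a file at /var/log/1.log)
    <source>
      @type tail
      format none
      path /var/log/1.log
      pos_file /var/log/1.log.pos
      tag count.format1
    </source>

    # Second log source (tailing a file at /var/log/2.log)
    <source>
      @type tail
      format none
      path /var/log/2.log
      pos_file /var/log/2.log.pos
      tag count.format2
    </source>

    # S3 output configuration (store files every minute in the bucket's logs/ folder)
    <match **>
      @type s3
      aws_key_id <your-access-key-id>
      aws_sec_key <your-secret-access-key>
      s3_bucket <your-bucket-name>
      s3_region <your-region>
      path logs/
      buffer_path /var/log/fluent/s3
      time_slice_format %Y%m%d%H%M
      time_slice_wait 1m
      store_as text
    </match>
```

The exact buffer options depend on the Fluentd version baked into the sidecar image, so check the fluent-plugin-s3 documentation for the syntax your version expects.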
In the ConfigMap above:
- There are two source definitions, one tailing the first log file and the other tailing the second. Don’t worry about the sources of these logs yet; you will define them in the next step when you create the Pod. For now, just know that two log sources are configured in the /var/log directory, their log messages are tagged count.format1 and count.format2, and the primary container in the Pod will stream logs to those two files.
- The match configuration sends the files to the S3 bucket every minute. To gain access to the S3 bucket, aside from the bucket information, you need to add your AWS credentials to the match configuration. If your Kubernetes cluster is an AWS EKS cluster, you can instead use hostNetwork: true in the Pod config to allow the sidecar to access IAM instance profile credentials.
Note: The ConfigMap above poses a security risk because ConfigMaps are meant only to store non-confidential data. Don’t do this in production. If you want to write the credentials like the above, set least-privilege access to ConfigMap as the default setting with RBAC rules. A more secure approach is to use a secrets manager like Hashicorp Vault, Conjur, etc. or your cloud-specific secrets manager to inject the credentials into the Pod.
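As a sketch of that approach, you could keep the keys in a Kubernetes Secret instead of the ConfigMap (the Secret name aws-credentials and its keys are hypothetical):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: aws-credentials   # hypothetical name
type: Opaque
stringData:
  aws_key_id: <your-access-key-id>
  aws_sec_key: <your-secret-access-key>
```

In the sidecar container spec you would then add env entries with valueFrom.secretKeyRef pointing at this Secret, and reference the variables from the Fluentd config with embedded Ruby, e.g. aws_key_id "#{ENV['AWS_KEY_ID']}".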
With all that said, create the demo ConfigMap with the following command:
$ kubectl apply -f fluentd-sidecar-config.yaml
Next, you will create the multi-container Pod that uses the ConfigMap you just created.
In the same directory, create a new file multi-container-pod.yaml, with the following command:
$ touch multi-container-pod.yaml
In the multi-container-pod.yaml file, add the following Pod configuration using your Vim or nano editor:
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    command: ["/bin/sh", "-c"]
    # Write two log files along with the date and a counter
    # every second
    args:
    - >
      i=0;
      while true; do
        echo "$i: $(date)" >> /var/log/1.log;
        echo "$(date) INFO $i" >> /var/log/2.log;
        i=$((i+1));
        sleep 1;
      done
    # Mount the log directory /var/log using a volume
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  - name: logging-agent
    image: lrakai/fluentd-s3:latest
    env:
    - name: FLUENTD_ARGS
      value: -c /fluentd/etc/fluent.conf
    # Mount the log directory /var/log using a volume
    # and the config file
    volumeMounts:
    - name: varlog
      mountPath: /var/log
    - name: config-volume
      mountPath: /fluentd/etc
  # Declare volumes for the log directory and the ConfigMap
  volumes:
  - name: varlog
    emptyDir: {}
  - name: config-volume
    configMap:
      name: fluentd-config   # adjust to match your ConfigMap's metadata.name
In the Pod configuration above:
- The first container, count, uses the BusyBox image, and the second container, logging-agent, uses lrakai/fluentd-s3:latest — a Fluentd image with an S3 plugin installed.
- The count container writes the date and a counter variable ($i) in two different log formats to two different log files in the /var/log directory every second. The /var/log directory is mounted as a Volume in the primary count container and the logging-agent sidecar so both containers can access the logs. The sidecar also mounts the ConfigMap to access the fluentd.conf configuration file.
Using a ConfigMap, the same sidecar container image can be reused with any configuration, rather than baking the configuration into the image and managing a separate container image for each configuration.
Create the Pod by applying the above configuration with the following command:
$ kubectl apply -f multi-container-pod.yaml
Check that both containers are running:
$ kubectl get pods
The counter Pod should show 2/2 in the READY column, meaning both containers are up.
Step 4: Test your containers
While testing this demo, I noticed that both containers can show as running, as above, while there is still an error in the sidecar container.
To verify that the Fluentd is running successfully, view the logs of the logging-agent container with the following command:
$ kubectl logs -f counter -c logging-agent
If the output of the above command doesn’t show any error, then you are sure that the configuration works.
To verify that the logs are being forwarded to S3, navigate to the S3 Console and click the bucket name. You will see a logs/ folder, and after opening the folder, you can confirm that the primary container logs are forwarded to S3 every minute. You can also check from the terminal:
$ aws s3 ls s3://<your-bucket-name>/logs/
Step 5: Clean up
Clean up the entire setup by deleting the namespace, which removes the ConfigMap and Pod you created, with the following command:
$ kubectl delete ns sidecar-logging-demo
Remember to switch your kubectl context back to your default namespace:
$ kubectl config set-context --current --namespace=default
When implementing a sidecar container, there are several best practices to consider to ensure a successful and efficient implementation. Here are the top three:
1. Ensure the single responsibility principle
One of the key principles when working with sidecar containers is the single responsibility principle. Each sidecar container should have a clear and distinct purpose that complements the main container it is paired with. Identify the specific functionality or service the sidecar container will provide and ensure that it aligns with this principle. By following this practice, you maintain a modular and maintainable architecture that allows for independent scaling, updates, and debugging of individual components.
2. Set resource limits and monitor sidecar containers
Set resource limits for sidecar containers to prevent them from monopolizing resources or degrading the performance of the main container. Monitor the resource usage of sidecar containers to identify potential bottlenecks and optimize their resource allocation accordingly.
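For example, in the demo Pod you could cap the logging-agent sidecar as follows (the numbers are illustrative, not a recommendation; tune them to your observed usage):

```yaml
- name: logging-agent
  image: lrakai/fluentd-s3:latest
  resources:
    requests:
      cpu: 50m        # guaranteed share, used by the scheduler
      memory: 64Mi
    limits:
      cpu: 200m       # CPU is throttled beyond this
      memory: 256Mi   # container is OOM-killed beyond this
```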
3. Enforce security and permissions
When deploying sidecar containers, enforcing proper security measures and permissions is crucial. Each sidecar container should have the access rights and privileges necessary for its specific tasks, without unnecessary permissions that could compromise the security of the main container or the cluster.
Apply the principle of least privilege, granting only the required permissions to each sidecar container. Additionally, implement security best practices such as using secure communication protocols, encrypting sensitive data, and regularly patching and updating sidecar container images to mitigate potential security vulnerabilities.
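As a sketch, a least-privilege securityContext for a sidecar container might look like this (note that Fluentd still needs writable buffer directories, so pair readOnlyRootFilesystem with an emptyDir mount for them):

```yaml
securityContext:
  runAsNonRoot: true               # refuse to start as UID 0
  allowPrivilegeEscalation: false  # block setuid-style escalation
  readOnlyRootFilesystem: true     # writable paths must be explicit mounts
  capabilities:
    drop: ["ALL"]                  # drop every Linux capability
```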
In this article, you learned what a Kubernetes sidecar container is, how it differs from init containers, its use cases, and how to implement a sidecar container for logging with Fluentd and Amazon S3. You also learned the best practices to follow when using sidecar containers.
You can also take a look at how Kubernetes is integrated into Spacelift. Spacelift helps you manage the complexities and compliance challenges of using Kubernetes. It brings with it a GitOps flow, so your Kubernetes Deployments are synced with your Kubernetes Stacks, and pull requests show you a preview of what they’re planning to change. If you aren’t already using Spacelift, sign up for a free trial.
The Most Flexible CI/CD Automation Tool
Spacelift is an alternative to using homegrown solutions on top of a generic CI. It helps overcome common state management issues and adds several must-have capabilities for infrastructure management.