Scaling AWS Infrastructure – Tools and Features

Scaling AWS infrastructure is essential for accommodating growing demand. Not only does it help maintain high service reliability, but it also optimizes AWS resource utilization. And the process isn’t as complex as most people think.

AWS offers excellent service capabilities for making infrastructure components and cloud-hosted applications highly scalable. We will explore the tools and features Amazon provides that are well suited to scaling existing AWS infrastructure.

In this article, we will cover:

  1. Benefits of Scaling
  2. Analyzing the Infrastructure
  3. Scaling Web Servers
  4. Scaling AWS Databases
  5. Event-Driven Architecture
  6. Scaling AWS Infrastructure with Terraform Using Count
  7. Scaling Kubernetes Clusters

Benefits of Scaling AWS Infrastructure

Some of the benefits of scaling your AWS infrastructure are described briefly below.

Improved performance and availability: Scaling distributes workloads across multiple AWS instances or servers, ensuring that applications remain responsive and available.

Cost optimization: Scaling allows us to use AWS resources more efficiently, avoiding the over-provisioning that leaves cloud resources unused. Learn more about AWS Cost Optimization.

Increased flexibility: AWS provides the capabilities to scale infrastructure up or down to match demand.

Reduced downtime: Scaling keeps infrastructure components up and running even during unexpected traffic spikes, reducing the risk of service downtime and outages.

Automatic scaling: AWS offers auto-scaling capabilities so that the infrastructure scales automatically based on predefined policies. 

Geographic scalability: AWS allows infrastructure scaling across different regions globally. Hence, we can deploy resources in multiple regions to reduce latency.

Does your organization have extra compliance concerns? Spacelift has you covered with the option of self-hosting in AWS. You can also read about Spacelift’s integration with AWS, including the new Cloud Integrations section and support for account-level AWS integrations.

Analyzing the AWS Infrastructure

Before scaling the infrastructure horizontally or vertically, we first need to analyze it. Analysis helps us identify bottlenecks or performance issues that could limit the effectiveness of scaling efforts, spot underutilized resources to optimize, and evaluate the infrastructure’s scalability requirements.

Monitor the Infrastructure to Gain Insights & Gather Key Metrics

Monitoring the AWS infrastructure is essential for gathering key insights and metrics. It helps to optimize performance, identify potential issues, and ensure the infrastructure’s security and compliance. With the gathered information, we can make data-driven decisions to enhance resource utilization and reduce costs.

AWS CloudWatch is the go-to AWS service for monitoring and scaling AWS infrastructure. Besides CloudWatch, various third-party tools like Splunk, New Relic, and Datadog can monitor AWS infrastructure. Each of these tools has different capabilities and is chosen depending on the type of AWS infrastructure we run.

Identify Bottlenecks

Infrastructure bottlenecks cause various service disruptions, such as:

  1. Network congestion due to insufficient bandwidth
  2. Loss of user engagement and data
  3. Delays in infrastructure resource deployments
  4. Physical failure of the servers, routers, databases, apps, etc.
  5. Outdated hardware or software components

Bottlenecks can occur at any point in the infrastructure, resulting in slower processing times and reduced productivity. Addressing them is crucial for minimizing system downtime.

Various metrics can be monitored and analyzed to detect bottlenecks. Tools like AWS Trusted Advisor, New Relic, Datadog, or CloudWatch help identify the source of a bottleneck and point toward remediation measures such as hardware upgrades, software optimization, or provisioning additional capacity.

Leverage AWS CloudWatch

AWS CloudWatch is a powerful monitoring and logging service with all the features needed to collect and analyze data on infrastructure resource utilization, software performance, and network traffic. Leverage CloudWatch to scale AWS infrastructure using the methods below.

Monitoring Logs: Use CloudWatch’s real-time monitoring and alerting capabilities to collect logs and resource-utilization metrics. Review the logs to identify potential issues that might impact resource availability and performance.

Key Metrics: AWS CloudWatch provides a centralized dashboard for viewing all existing AWS resources and applications. Define key performance indicators (KPIs) and other metrics relevant to the service, monitor resource thresholds, and create scaling policies based on the defined metrics.

CloudWatch Alarms: Set up alarms in CloudWatch to be notified whenever metrics cross predefined thresholds. Alarms can be created and configured through the AWS Management Console or the CloudWatch API.

Infrastructure Scaling Policies: Define separate scaling policies that specify the actions to take when an alarm is triggered. We can use the AWS Auto Scaling service or the Amazon EC2 Auto Scaling service to configure these policies, as in the sketch below.
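
As a minimal Terraform sketch of this alarm-to-policy wiring, the snippet below scales out an Auto Scaling group when its average CPU stays above 70% for two consecutive periods. The group name web-asg is a hypothetical placeholder for an existing group.

resource "aws_autoscaling_policy" "scale_out" {
  name                   = "cpu-scale-out"
  autoscaling_group_name = "web-asg" # hypothetical existing Auto Scaling group
  adjustment_type        = "ChangeInCapacity"
  scaling_adjustment     = 1   # add one instance each time the alarm fires
  cooldown               = 300 # wait 5 minutes between scaling actions
}

resource "aws_cloudwatch_metric_alarm" "high_cpu" {
  alarm_name          = "web-asg-high-cpu"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 120
  statistic           = "Average"
  threshold           = 70

  dimensions = {
    AutoScalingGroupName = "web-asg"
  }

  # Invoke the scale-out policy when the alarm enters the ALARM state
  alarm_actions = [aws_autoscaling_policy.scale_out.arn]
}

A mirror-image alarm and policy with a negative scaling_adjustment would scale the group back in when CPU drops.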

Scaling AWS Web Servers

Scaling AWS web servers ensures proper handling of web traffic with improved application performance and availability. It distributes the workload to handle an increasing number of incoming requests without affecting the availability of websites and web apps. 

Horizontal vs. Vertical Scaling

This section compares the two broad ways of scaling web servers: horizontal vs. vertical.

Horizontal Scaling:

  • Adds more servers or instances to distribute the workload.
  • Better for fault tolerance, as it reduces downtime.
  • Can increase infrastructure complexity with multiple servers, and as the workload gets spread out, it can lower the performance of individual servers.

Vertical Scaling:

  • Adds more resources, like CPU and RAM, to a single server or instance.
  • Offers improved scalability with simplified management and reduced network latency.
  • Can be expensive, and many web servers come with limited hardware capacity, which restricts how far they can scale vertically.

Choose the type of scaling based on the AWS infrastructure setup and web server capabilities. Vertical scaling can be the quicker answer for resource-intensive applications with occasional traffic peaks, but horizontal scaling offers the more durable, long-term solution.

Autoscaling

The AWS Auto Scaling feature automatically increases or decreases web server capacity based on workload, web traffic, and other metrics. For example, if a web server’s memory usage rises above 90%, the Amazon EC2 Auto Scaling service can dynamically add a new instance and remove it again once memory utilization drops back below the threshold (note that memory metrics must be published to CloudWatch via the CloudWatch agent).

It is also possible to schedule auto scaling for predictable load patterns. Scheduled actions give us the flexibility to scale AWS infrastructure components in or out during defined time windows, with capacity returning to normal once the schedule ends.

Read more about deploying the AWS auto-scaling group with Terraform.
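
As a rough sketch, the Terraform below defines an Auto Scaling group plus a scheduled action that raises its capacity on weekday mornings. The AMI ID, subnet ID, and schedule are placeholders for illustration.

resource "aws_launch_template" "web" {
  name_prefix   = "web-"
  image_id      = "ami-0c94855ba95c71c99" # placeholder AMI
  instance_type = "t2.micro"
}

resource "aws_autoscaling_group" "web" {
  name                = "web-asg"
  min_size            = 2
  max_size            = 6
  desired_capacity    = 2
  vpc_zone_identifier = ["subnet-12345678"] # hypothetical subnet

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }
}

resource "aws_autoscaling_schedule" "business_hours" {
  scheduled_action_name  = "scale-out-weekday-mornings"
  autoscaling_group_name = aws_autoscaling_group.web.name
  min_size               = 4
  max_size               = 8
  desired_capacity       = 4
  recurrence             = "0 8 * * MON-FRI" # 08:00 UTC, Monday to Friday
}

A second aws_autoscaling_schedule with lower capacity values would return the group to its baseline in the evening.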

Using an Application Load Balancer (ALB)

Application Load Balancer (ALB) is an AWS service that distributes application traffic across multiple targets, such as AWS EC2 instances or Lambda functions. Here are the main AWS services that work with ALB:

  1. EC2 instances
  2. EKS (Elastic Kubernetes Service)
  3. ECS (Elastic Container Service)

Read more: How to Manage Application Load Balancer (ALB) with Terraform.

ALB is suited to handling HTTP and HTTPS traffic. It takes only a few minutes to set up an ALB in front of web servers and balance the traffic load between AWS EC2 instances, as in the sketch below.
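
As a minimal Terraform sketch, the snippet below creates an ALB, a target group, and an HTTP listener that forwards traffic to the group. The subnet and VPC IDs are hypothetical placeholders; instances would be registered via target group attachments or an Auto Scaling group.

resource "aws_lb" "web" {
  name               = "web-alb"
  load_balancer_type = "application"
  subnets            = ["subnet-aaaa1111", "subnet-bbbb2222"] # two subnets in different AZs
}

resource "aws_lb_target_group" "web" {
  name     = "web-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = "vpc-12345678" # hypothetical VPC
}

resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.web.arn
  port              = 80
  protocol          = "HTTP"

  # Forward all incoming HTTP requests to the target group
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}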

Scaling AWS Databases

Amazon Relational Database Service (RDS) is a collection of AWS-managed services that simplify database setup, scaling, and management in the cloud. It supports popular relational database engines, including MySQL, PostgreSQL, MariaDB, Oracle, and SQL Server, with excellent scalability features.

Using Amazon RDS Multi-AZ

Amazon RDS Multi-AZ provides enhanced availability of the Amazon RDS database instances, making them ideal for handling production workloads. Some important reasons for using RDS Multi-AZ for scaling AWS infrastructure are listed below.

Automatic failover: Ensures high availability of AWS databases by failing over to the standby automatically, typically within 60 seconds, with no manual intervention and no data loss.

Protected database performance: Backups are taken from the standby instance, so I/O activity on the primary is not suspended while a backup is running.

Enhanced durability: AWS RDS Multi-AZ uses synchronous replication to keep the standby database instance’s data in step with the primary instance.

Increased availability: It allows us to deploy a standby database instance in another AZ and achieve excellent fault tolerance during instance failure.

In short, the Multi-AZ feature of AWS RDS places a standby database instance in another availability zone to maintain availability during hardware failures. It is straightforward to enable through the RDS dashboard, or in Terraform, as sketched below.

Learn how to create an AWS RDS Instance using Terraform.
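
As a minimal Terraform sketch, setting multi_az = true on a DB instance is all it takes to provision the standby. The identifier and credentials below are placeholders, not production values.

resource "aws_db_instance" "primary" {
  identifier          = "app-db"
  engine              = "mysql"
  engine_version      = "8.0"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  username            = "admin"
  password            = "change-me-please" # placeholder; use a secrets manager in practice
  multi_az            = true               # provisions a synchronous standby in another AZ
  skip_final_snapshot = true
}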

Using Read Replicas

Amazon RDS Read Replicas are read-only copies of the primary database server with similar features and capabilities. As secondary database instances, Read Replicas improve read performance for Amazon RDS by elastically scaling reads out beyond the primary instance.

The primary database server automatically replicates data to its replicas (asynchronously, for most engines) to keep them in sync. Web app traffic that only needs to read from the database can be routed directly to the Read Replicas, reducing the primary database instance’s workload.

Read replicas are available in Amazon Aurora and AWS RDS for MariaDB, MySQL, Oracle, PostgreSQL, and SQL Server.
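
Building on the Multi-AZ instance sketched earlier, a Terraform read replica only needs to reference the source instance’s identifier; the rest is a hypothetical minimal configuration.

resource "aws_db_instance" "replica" {
  identifier          = "app-db-replica"
  replicate_source_db = aws_db_instance.primary.identifier # the primary sketched above
  instance_class      = "db.t3.micro"
  skip_final_snapshot = true
}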

Using Aurora

Amazon Aurora offers high availability and performance at a global scale with full PostgreSQL and MySQL compatibility. This relational database combines the capabilities of traditional enterprise databases with the simplicity and cost-effectiveness of open-source databases. Amazon Aurora is perfect for:

  1. Modernizing the operations of enterprise applications like ERP and CRM.
  2. Supporting reliable, multi-tenant SaaS applications with database flexibility.
  3. Developing and deploying distributed applications at scale across different regions.
  4. Scaling serverlessly and near-instantaneously to reduce operational expenses.

Compared to RDS, Amazon Aurora has built-in disaster recovery (DR) and high availability (HA) capabilities. We can easily migrate from commercial database engines like SQL Server or Oracle to Aurora. Aurora is well suited to scaling small to medium workloads in AWS infrastructure.
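
As a minimal Terraform sketch, an Aurora deployment consists of a cluster plus one or more instances; read capacity scales by adding instances to the cluster. Identifiers and credentials here are placeholders.

resource "aws_rds_cluster" "aurora" {
  cluster_identifier  = "app-aurora"
  engine              = "aurora-mysql"
  master_username     = "admin"
  master_password     = "change-me-please" # placeholder; use a secrets manager in practice
  skip_final_snapshot = true
}

resource "aws_rds_cluster_instance" "writer" {
  identifier         = "app-aurora-writer"
  cluster_identifier = aws_rds_cluster.aurora.id
  instance_class     = "db.t3.medium"
  engine             = aws_rds_cluster.aurora.engine
}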

Event-Driven Architecture

Event-driven architecture (EDA) is a popular way of designing software systems for building scalable and efficient AWS-based applications. It is also one of the strategies used for scaling AWS infrastructure, in the following ways:

  • EDA handles asynchronous communication between AWS services distributed across multiple servers and regions.
  • EDA’s main components are events, which are generated by various sources like users, system components, and external systems.
  • EDA promotes loose service coupling and reduces dependencies, so individual AWS resources can change without affecting the rest of the system.
  • EDA supports high fault tolerance because AWS resources communicate asynchronously.

AWS provides several services, such as Lambda and Simple Queue Service (SQS), that let events trigger the execution of specific code.

SQS to Implement Loose Coupling

SQS is a fully managed message queuing service that enables us to decouple and scale various microservices, serverless applications, and distributed systems within the AWS infrastructure. We can use SQS to decouple the system components so that they can work and scale independently. 
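
Creating a queue in Terraform is a one-resource sketch; the name and timeout below are illustrative assumptions.

resource "aws_sqs_queue" "orders" {
  name                       = "orders-queue"
  visibility_timeout_seconds = 60 # give consumers time to process each message
}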

Leverage Serverless Architecture using Lambda

AWS Lambda is Amazon’s serverless computing service, which enables us to run code without provisioning any servers. Lambda automatically scales our AWS-based applications with incoming traffic, so there is no capacity planning needed to execute a function.
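
Connecting the two, the sketch below subscribes a hypothetical Lambda function to the queue defined above via an event source mapping; the deployment package (function.zip) and role name are assumptions for illustration.

resource "aws_iam_role" "consumer" {
  name = "orders-consumer-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "lambda.amazonaws.com" }
    }]
  })
}

# Grants the function permission to read and delete SQS messages
resource "aws_iam_role_policy_attachment" "sqs_exec" {
  role       = aws_iam_role.consumer.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaSQSQueueExecutionRole"
}

resource "aws_lambda_function" "consumer" {
  function_name = "orders-consumer"
  role          = aws_iam_role.consumer.arn
  handler       = "index.handler"
  runtime       = "nodejs18.x"
  filename      = "function.zip" # hypothetical deployment package
}

# Lambda polls the queue and scales consumers with queue depth
resource "aws_lambda_event_source_mapping" "sqs_trigger" {
  event_source_arn = aws_sqs_queue.orders.arn
  function_name    = aws_lambda_function.consumer.arn
  batch_size       = 10
}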

Scaling AWS Infrastructure with Terraform Using Count

Terraform enables us to create, change, and scale AWS infrastructure by defining multiple similar resources with the count or for_each meta-arguments.

Here’s a Terraform example that creates multiple EC2 instances using count.

resource "aws_instance" "my_web_server" {
  count         = 3
  ami           = "ami-0c94855ba95c71c99"
  instance_type = "t2.micro"
  
  tags = {
    Name = "My web server ${count.index + 1}"
  }
}

In the above example, the number of EC2 instances to create is declared with the count attribute. Exposing this number as an input variable lets us adjust the instance count dynamically, as shown below.
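
A small variation on the example, with the count exposed as a variable (the variable name is our choice, not a convention):

variable "instance_count" {
  type    = number
  default = 3
}

resource "aws_instance" "my_web_server" {
  count         = var.instance_count # e.g., terraform apply -var="instance_count=5"
  ami           = "ami-0c94855ba95c71c99"
  instance_type = "t2.micro"

  tags = {
    Name = "My web server ${count.index + 1}"
  }
}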

Similarly, we can use the for_each construct to create multiple resources with similar configurations dynamically.

The example below defines an input variable, bucket_names, and uses the for_each attribute to create multiple S3 buckets.

variable "bucket_names" {
  type = set(string)
  default = [
    "example-bucket-1",
    "example-bucket-2",
    "example-bucket-3"
  ]
}

resource "aws_s3_bucket" "example_buckets" {
  for_each = var.bucket_names
  
  bucket = each.value
  
  tags   = {
    Name = "${each.value} Bucket"
    Environment = "Production"
  }
}

Scaling Kubernetes Clusters

There are various ways to scale Kubernetes clusters to manage varying demands. The kubectl CLI, the Kubernetes dashboard, and the Kubernetes API are some of the most commonly used tools for doing so.

Scaling a Kubernetes cluster with these tools requires constant monitoring to spot changes in demand and react accordingly. The Horizontal Pod Autoscaler (HPA) automates scaling at the pod level by continuously adjusting replica counts based on observed metrics, as in the sketch below. However, the HPA does not add or remove nodes, so the cluster itself still needs a separate autoscaling mechanism.
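
As a minimal sketch using Terraform’s Kubernetes provider (assuming the provider is already pointed at a cluster, and that a Deployment named web exists), an HPA keeping average CPU around 70% looks like this:

resource "kubernetes_horizontal_pod_autoscaler_v2" "web" {
  metadata {
    name = "web-hpa"
  }

  spec {
    min_replicas = 2
    max_replicas = 10

    # The workload being scaled; "web" is a hypothetical Deployment
    scale_target_ref {
      api_version = "apps/v1"
      kind        = "Deployment"
      name        = "web"
    }

    metric {
      type = "Resource"
      resource {
        name = "cpu"
        target {
          type                = "Utilization"
          average_utilization = 70
        }
      }
    }
  }
}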

There are a couple of autoscalers worth considering: Cluster Autoscaler and Karpenter.

Cluster Autoscaler

The Kubernetes Cluster Autoscaler runs as a Deployment inside the cluster. It watches for pods that cannot be scheduled due to insufficient capacity and provisions additional nodes for scaling purposes.

It monitors resource utilization and workload. Based on this information, it provisions additional nodes to run more pods and removes them when demand subsides. This way, it adjusts cloud resources automatically to gain optimum cost benefits. A sketch of installing it on EKS with Terraform’s Helm provider follows.
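
A minimal sketch, assuming an existing EKS cluster named my-eks-cluster and a Helm provider already configured against it (in practice the autoscaler also needs IAM permissions, e.g., via IRSA, which are omitted here):

resource "helm_release" "cluster_autoscaler" {
  name       = "cluster-autoscaler"
  repository = "https://kubernetes.github.io/autoscaler"
  chart      = "cluster-autoscaler"
  namespace  = "kube-system"

  # Let the autoscaler discover the cluster's Auto Scaling groups by tag
  set {
    name  = "autoDiscovery.clusterName"
    value = "my-eks-cluster" # hypothetical cluster name
  }

  set {
    name  = "awsRegion"
    value = "us-east-1"
  }
}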

Karpenter

Karpenter is an open-source node autoscaling solution for Kubernetes. It is designed to work natively with Kubernetes and integrates with major cloud providers, making it easy to adopt for organizations running their workloads on Kubernetes.

Its automatic node scale-up and scale-down capabilities enable organizations to run their workloads on a cost-effective, on-demand basis. Karpenter uses the Kubernetes API to manage nodes and workloads, making it easy to deploy and use.
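
As a heavily hedged sketch (the chart location and values vary across Karpenter versions, and the controller also needs IAM roles and node configuration that are omitted here), an installation via Terraform’s Helm provider might look like:

resource "helm_release" "karpenter" {
  name             = "karpenter"
  repository       = "oci://public.ecr.aws/karpenter"
  chart            = "karpenter"
  namespace        = "karpenter"
  create_namespace = true

  set {
    name  = "settings.clusterName"
    value = "my-eks-cluster" # hypothetical cluster name
  }
}

After installation, Karpenter is configured with NodePool resources that describe what kinds of nodes it may provision.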

Read more about running Kubernetes on AWS.

Key Points

Scaling AWS infrastructure is essential to ensure that web applications, servers, and databases can handle increased traffic and workload demands. By analyzing the infrastructure and leveraging AWS services like Aurora, CloudWatch, and Auto Scaling groups, we can effectively scale our web servers and databases.

Additionally, event-driven architecture and serverless technologies such as Lambda and SQS can help us implement loose coupling and improve scalability. We can leverage Terraform’s count and for_each loops to dynamically create or destroy multiple resources. In this post, we also discussed how Karpenter and the Kubernetes Cluster Autoscaler automate the scaling of AWS infrastructure.

You can also explore how Spacelift makes it easy to work with Terraform. If you need help managing your Terraform infrastructure, building more complex workflows based on Terraform, or managing AWS credentials per run instead of using a static pair on your local machine, Spacelift is a fantastic tool for the job.

The Most Flexible CI/CD Automation Tool

Spacelift is an alternative to using homegrown solutions on top of a generic CI. It helps overcome common state management issues and adds several must-have capabilities for infrastructure management.
