In cloud computing, EC2 Auto Scaling is an important tool that automatically scales resources up or down based on demand. This not only optimizes costs but also keeps application performance stable. This article provides a detailed guide to setting up Auto Scaling for EC2 on AWS.
Prerequisites
Before you begin, ensure you have:
- An AWS account.
- Basic knowledge of AWS EC2 and CloudWatch.
What is EC2 Auto Scaling?
AWS EC2 Auto Scaling is a service that automatically adds or removes EC2 instances within an Auto Scaling Group (ASG) based on predefined conditions (Scaling Policies). It dynamically adjusts the number of EC2 instances to align with the actual usage needs of your application.
Benefits:
- Consistent Performance: Ensures resources are always available to handle traffic.
- Cost Optimization: Resources are used only when necessary.
- Easy to Manage: Automatically adjusts resources without manual intervention.
Basic Concepts
- Auto Scaling Group: A group of EC2 instances managed by Auto Scaling.
- Launch Template: A configuration template for creating new instances.
- Scaling Policy: A set of rules that determine when and how to adjust the number of instances.
- Target Tracking: A scaling policy that adds or removes instances to keep a chosen metric (for example, average CPU utilization) close to a target value.
Prepare the Application Load Balancer and Target Group
To demonstrate Auto Scaling, you'll need an Application Load Balancer and a Target Group. For the setup steps, see the guide Instructions for creating Application Load balancer on AWS.
In the Target group section, enter the following information:
- Target group name: TG-group-Demo-ASG
- Register targets: Do not register instances yet. Auto Scaling will handle this automatically.
The created Target Group will show:
- 0 targets
- 0 Healthy
- 0 Unhealthy
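The same Target Group could also be created from code. The following is a minimal sketch that only builds the request arguments; it assumes HTTP on port 80 (matching the Nginx setup used later) and a health check on `/`, and the VPC ID is a placeholder. The helper name `target_group_params` is hypothetical.

```python
def target_group_params(vpc_id: str, name: str = "TG-group-Demo-ASG") -> dict:
    """Build keyword arguments for the ELBv2 CreateTargetGroup call."""
    return {
        "Name": name,
        "Protocol": "HTTP",        # assumption: plain HTTP to Nginx
        "Port": 80,
        "VpcId": vpc_id,
        "TargetType": "instance",  # Auto Scaling registers instances, not IPs
        "HealthCheckPath": "/",    # assumption: the index page is the health check
    }


tg_params = target_group_params("vpc-0123456789abcdef0")  # placeholder VPC ID
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("elbv2").create_target_group(**tg_params)
```

No targets are registered here, which matches the "0 targets" state shown above.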
Prepare EC2 Instance
Before setting up Auto Scaling, create a sample EC2 instance. It will serve as the source for the AMI (Amazon Machine Image) in the next step.
- Create a Sample EC2 Instance:
  - Name: EC2-template-instance
  - AMI Image: Amazon Linux 2023
  - Instance Type: t2.micro (eligible for Free Tier).
  - Security Group: Open ports 80 and 443 for application access, and port 22 for SSH (if needed).
  - User Data Configuration: In Advanced Details -> User Data, add the following script to install Nginx when the instance is initialized:
#!/bin/bash
dnf update -y
dnf install nginx -y
systemctl start nginx
systemctl enable nginx
echo "<h1>EC2 Auto Scaling Instance: $(hostname -f)</h1>" > /usr/share/nginx/html/index.html
For the full procedure, follow the detailed guide to creating an AWS EC2 instance.
Create AMI
- Create an AMI from the template instance: Once the EC2 instance is running, you can create an AMI from it to use for Auto Scaling.
- Go to EC2 Dashboard -> Instances, select the instance -> Actions -> Image and Templates -> Create Image.
Provide the following details:
- Image name: AMI-template-for-scaling
Leave the remaining fields at their default values and click Create Image.
Create Launch Template
A Launch Template stores the configuration needed to launch EC2 instances in an Auto Scaling Group.
- Go to AWS EC2 Dashboard -> Launch Templates -> Click Create Launch Template.
- Configure information:
  - Template name: Launch-template-ASG
  - Amazon machine image (AMI): Select AMI-template-for-scaling.
  - Instance type: t2.micro.
  - Key pair: Optional (if SSH is required).
  - Security group: Select the group configured in the previous step.
  - Advanced details: Leave as default. This section is the same as the Advanced details section shown when creating an EC2 instance.
- Click Create Launch Template.
- Note: You can update the Launch Template by creating a new version, and then specify which version the Auto Scaling Group uses.
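For readers who prefer scripting this step, here is a minimal sketch of the same Launch Template as request arguments. The AMI ID and security group ID are placeholders, and the helper name `launch_template_params` is hypothetical; only the template name and instance type come from this guide.

```python
def launch_template_params(image_id: str, security_group_id: str) -> dict:
    """Build keyword arguments for the EC2 CreateLaunchTemplate call."""
    return {
        "LaunchTemplateName": "Launch-template-ASG",
        "LaunchTemplateData": {
            "ImageId": image_id,                # the AMI-template-for-scaling image
            "InstanceType": "t2.micro",
            "SecurityGroupIds": [security_group_id],
            # "KeyName": "...",                 # optional, only if SSH is required
        },
    }


lt_params = launch_template_params(
    image_id="ami-0123456789abcdef0",           # placeholder AMI ID
    security_group_id="sg-0123456789abcdef0",   # placeholder security group ID
)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("ec2").create_launch_template(**lt_params)
```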
Create an Auto Scaling Group
An Auto Scaling Group (ASG) is a group of EC2 instances managed by Auto Scaling.
Access Auto Scaling Group
Go to AWS EC2 Dashboard -> Auto Scaling Groups -> Click Create Auto Scaling Group.
Configure Auto Scaling Group:
- Name: ASG-demo-group.
- Launch template: Select Launch-template-ASG.
- VPC: Select the VPC that contains the application subnets.
- Subnets: For most applications, you can use multiple Availability Zones and let EC2 Auto Scaling balance your instances across the zones. The default VPC and default subnets are suitable for getting started quickly.
Note the Instance type, Key pair name, and Security group IDs shown for the launch template, because these values are applied to instances at launch.
Instance Launch Configuration
- Instance type requirements: Leave as default. If the Launch Template does not specify an instance type, you must enter the instance requirements here.
- Instance purchase options & Allocation strategies: This section is also required only when the Launch Template does not specify an instance type.
Network
- VPC: Select the same VPC as the launch template (the VPC of its security group).
- Availability Zones and subnets: Select one or more subnets in the VPC. Spreading subnets across more Availability Zones gives higher availability.
- Availability Zone distribution: Select Balanced best effort
Configure the Integrate with other services section
In the Integrate with other services section, configure as follows:
- Load balancing: Select Attach to an existing load balancer
- Attach to an existing load balancer: Select the Choose from your load balancer target groups option and pick the target group you just created: TG-group-Demo-ASG
- VPC Lattice integration options: Select No VPC Lattice service
- Application Recovery Controller: Leave this section at its default.
Health checks:
- EC2 health checks: Keep the default. EC2 Auto Scaling always checks whether the instance is running and watches for hardware or software issues that could impair it.
- Turn on Elastic Load Balancing health checks: Optional. Turn this on to increase availability, so that instances the load balancer reports as unhealthy are replaced. Here I turn it on.
- Turn on Amazon EBS health checks: Checks the status of the EBS volumes attached to the instance.
- Health check grace period: The delay before the first health check on a new instance. Here I leave the default of 300 seconds.
Group Size configuration
Group Size - Desired capacity: Set the number of instances the group should start with. You can also change Desired capacity type to vCPUs or Memory (GiB) if you do not want to count in instance units.
- Desired Capacity: 1
Scaling configuration - Scaling limits
The Scaling limits section sets the bounds within which the number of instances can increase or decrease.
Enter the following:
- Min desired capacity: 1 - Minimum number of instances (must be less than or equal to Desired capacity).
- Max desired capacity: 3 - Maximum number of instances the group can scale out to (must be greater than or equal to Desired capacity).
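The group settings chosen so far can be summarized as one set of request arguments. This is a minimal sketch, not the console's exact behavior: the subnet IDs and target group ARN are placeholders, using the `$Latest` template version is an assumption, and the helper name `asg_params` is hypothetical. It also checks the rule that Desired capacity must sit between the minimum and maximum.

```python
def asg_params(subnet_ids, target_group_arn, min_size=1, desired=1, max_size=3):
    """Build keyword arguments for the Auto Scaling CreateAutoScalingGroup call."""
    # Desired capacity has to stay within the scaling limits.
    if not (min_size <= desired <= max_size):
        raise ValueError("expected min_size <= desired <= max_size")
    return {
        "AutoScalingGroupName": "ASG-demo-group",
        "LaunchTemplate": {
            "LaunchTemplateName": "Launch-template-ASG",
            "Version": "$Latest",  # assumption: always use the newest version
        },
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        "VPCZoneIdentifier": ",".join(subnet_ids),  # comma-separated subnet IDs
        "TargetGroupARNs": [target_group_arn],
        "HealthCheckType": "ELB",       # ELB health checks turned on
        "HealthCheckGracePeriod": 300,  # seconds
    }


group_params = asg_params(
    subnet_ids=["subnet-aaaa1111", "subnet-bbbb2222"],  # placeholder subnets
    target_group_arn="arn:aws:elasticloadbalancing:placeholder",  # placeholder ARN
)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("autoscaling").create_auto_scaling_group(**group_params)
```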
Scaling configuration - Automatic scaling
This part determines whether the number of instances increases or decreases automatically according to a condition, such as a CPU threshold or the number of requests to the Application Load Balancer.
If you choose No scaling policies, the number of instances stays fixed.
Because we want to demo automatic scale-out and scale-in, we choose Target tracking scaling policy and configure it as follows:
- Scaling policy name: Optional name. Here I enter Target Tracking Policy
- Metric type: Select Average CPU utilization
- Target value: 50 - Auto Scaling tries to keep the group's average CPU near 50%, adding instances when it rises above that and removing them when it stays well below.
- Instance warmup: Default 300 seconds - A new instance gets 300 seconds to warm up before its metrics count toward the scaling calculation.
A server can consume a lot of CPU while it initializes, sometimes 50% or more. Without a warmup period, that startup spike would feed into the metric and could keep triggering scale-out, launching too many instances.
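The policy configured above can be written down as request arguments too. The sketch below assumes the console's "Average CPU utilization" option corresponds to the predefined metric `ASGAverageCPUUtilization`; the helper name `cpu_target_tracking_params` is hypothetical.

```python
def cpu_target_tracking_params(target_cpu: float = 50.0, warmup_seconds: int = 300) -> dict:
    """Build keyword arguments for the Auto Scaling PutScalingPolicy call."""
    return {
        "AutoScalingGroupName": "ASG-demo-group",
        "PolicyName": "Target Tracking Policy",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                # Average CPU across all instances in the group.
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu,  # percent
        },
        # New instances are excluded from the metric for this many seconds.
        "EstimatedInstanceWarmup": warmup_seconds,
    }


policy_params = cpu_target_tracking_params()
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("autoscaling").put_scaling_policy(**policy_params)
```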
Instance Maintenance Policy
This section controls how instances are replaced, to balance availability and cost:
- Mixed behavior: For rebalancing events, new instances launch before existing instances are terminated. For all other events, instances terminate and launch at the same time.
- Prioritize availability: Launch new instances before existing instances are terminated.
- Control costs: Terminate and launch instances at the same time.
- Flexible: Customize the behavior based on your needs.
Here I choose Mixed behavior.
- Capacity Reservation preference: Choose Default (Auto Scaling will use the Capacity Reservation option from the Launch Template).
Additional settings
Here you can enable group metrics collection in CloudWatch for easier monitoring.
Add notifications
You can set up notifications using SNS for events such as Launch, Terminate, Fail to launch, and Fail to terminate. We will cover this in a later article.
Add tags
We recommend creating tags that are automatically attached to instances created by the Auto Scaling Group. We set the tag as follows:
- Key: Name, Value: ASG-test-machine
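When the group is created through the API rather than the console, each tag carries a flag that controls whether it is copied onto launched instances. A minimal sketch, with the hypothetical helper name `propagated_tag`:

```python
def propagated_tag(key: str, value: str) -> dict:
    """Build one entry for the Tags list of CreateAutoScalingGroup."""
    return {
        "Key": key,
        "Value": value,
        "PropagateAtLaunch": True,  # copy the tag onto every launched instance
    }


asg_tags = [propagated_tag("Name", "ASG-test-machine")]
# This list could be passed as the Tags argument of
#   boto3.client("autoscaling").create_auto_scaling_group(..., Tags=asg_tags)
```

With `PropagateAtLaunch` set, new instances appear in the console under the name ASG-test-machine.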
Review
- Review all the configurations you have made.
- Make sure everything is set up according to your requirements.
- Click Create Auto Scaling Group to complete the process.
- At this point the status will be Updating capacity. Wait a moment and then check again.
Check
Check the number of EC2 instances
Check that the number of running EC2 instances matches the Desired capacity value you specified.
Go to EC2 Dashboard -> Instances. A new instance has been created with the status running. The instance name matches the Name tag we set in the ASG.
Check Target Groups
You can verify the number of instances in the Target Group to make sure it matches the Desired capacity.
Open the Target group TG-group-Demo-ASG we created above.
It shows 1 target, with 1 Unused. This is because the Load Balancer does not yet point to this Target group.
Now attach the Target group TG-group-Demo-ASG to the Load Balancer we created above:
- Select Load Balancers -> Select the listener
- Select the listener rule -> Edit Actions
- Set Target group to TG-group-Demo-ASG
- Confirm by clicking Save Changes
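The listener change above amounts to a single forward action. The sketch below assumes you are editing the listener's default rule; a non-default rule would go through the ModifyRule call instead. Both ARNs are placeholders and the helper name `forward_to_target_group` is hypothetical.

```python
def forward_to_target_group(listener_arn: str, target_group_arn: str) -> dict:
    """Build keyword arguments for the ELBv2 ModifyListener call."""
    return {
        "ListenerArn": listener_arn,
        "DefaultActions": [
            # Send all traffic on this listener to the target group.
            {"Type": "forward", "TargetGroupArn": target_group_arn},
        ],
    }


listener_params = forward_to_target_group(
    listener_arn="arn:aws:elasticloadbalancing:placeholder-listener",   # placeholder
    target_group_arn="arn:aws:elasticloadbalancing:placeholder-tg",     # placeholder
)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("elbv2").modify_listener(**listener_params)
```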
Go back to the Target group TG-group-Demo-ASG and you will see 1 Initial.
Wait a moment for the health check to run and you will see 1 Healthy.
Check Auto Scaling
1. Manually increase load:
- Use the stress tool to increase CPU load on the EC2 instance. For reference, see: https://docs.aws.amazon.com/fis/latest/userguide/fis-tutorial-run-cpu-stress.html
- Monitor the Auto Scaling Group in the AWS Management Console to see the number of instances increase.
Install the stress tool with the following command:
sudo dnf install stress -y
Start stressing the EC2 instance:
stress --cpu 4 --timeout 300
# stress: info: [2780] dispatching hogs: 4 cpu, 0 io, 0 vm, 0 hdd
Explanation:
- --cpu 4: Starts 4 CPU-bound worker processes (more than enough to saturate the single vCPU of a t2.micro)
- --timeout 300: Stops after 300 seconds (5 minutes)
Open the Monitoring tab of the instance and you will see CPU utilization reach 100%.
The number of instances increases to 2.
The Target group now has 2 healthy targets.
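To see why the group settles at 2 instances, a rough proportional model helps. This is a simplified illustration only, not AWS's actual algorithm (which is driven by CloudWatch alarms and is not fully published): it assumes the load spreads evenly and sizes the group so that average CPU returns to the target. The function name `estimate_desired` is hypothetical.

```python
import math


def estimate_desired(current: int, avg_cpu: float, target_cpu: float = 50.0,
                     min_size: int = 1, max_size: int = 3) -> int:
    """Estimate the capacity that would bring average CPU back to the target."""
    # Total load is current * avg_cpu; divide by the per-instance target.
    raw = math.ceil(current * avg_cpu / target_cpu)
    # The group never leaves its scaling limits.
    return max(min_size, min(max_size, raw))


print(estimate_desired(current=1, avg_cpu=100.0))  # 2: one instance at 100% CPU
print(estimate_desired(current=2, avg_cpu=100.0))  # 3: capped by the maximum
print(estimate_desired(current=2, avg_cpu=10.0))   # 1: load dropped, scale in
```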
Open the Activity tab of the Auto Scaling Group and you will see a new entry for launching the additional instance.
The CloudWatch alarm created by the target tracking policy has been triggered.
2. Load reduction:
- Reduce the CPU load and monitor the number of instances gradually scaling in to the initial state.
- Update the Desired Capacity value as needed to control the number of instances manually.
Note: Scaling works by changing Desired capacity. For example, if you initially set Desired capacity = 1 and a scale-out brings the group to 2 instances, Desired capacity is also set to 2.
Conclusion
EC2 Auto Scaling is a powerful tool that optimizes costs and performance by dynamically adjusting the number of instances based on the load. With this guide, you can configure and deploy Auto Scaling for your application.
Expand
- Scheduled Scaling: Set up schedules to increase or decrease the number of instances at specific times.
- Predictive Scaling: Use machine learning to forecast traffic demand and scale out in advance.
- Monitoring: Integrate with AWS CloudWatch, including alarms, to monitor and optimize Auto Scaling Group operations.
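As a starting point for Scheduled Scaling, the sketch below builds the arguments for one recurring action. The action name, the cron expression (08:00 UTC on weekdays), and the capacities are all illustrative assumptions, and the helper name `weekday_scale_out_params` is hypothetical.

```python
def weekday_scale_out_params(desired: int = 2) -> dict:
    """Build keyword arguments for the PutScheduledUpdateGroupAction call."""
    return {
        "AutoScalingGroupName": "ASG-demo-group",
        "ScheduledActionName": "weekday-morning-scale-out",  # hypothetical name
        "Recurrence": "0 8 * * 1-5",  # cron format: 08:00 UTC, Monday to Friday
        "MinSize": 1,
        "MaxSize": 3,
        "DesiredCapacity": desired,
    }


schedule_params = weekday_scale_out_params()
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("autoscaling").put_scheduled_update_group_action(**schedule_params)
```

A second action with a lower Desired capacity in the evening would bring the group back down.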