### Slide 1:

![Slide 1](slide_1.png)

::: Notes


#### Instructor notes

#### Student notes

:::

### Slide 2:

![Slide 2](slide_2.png)

::: Notes


#### Instructor notes

#### Student notes

:::

### Slide 3:

![Slide 3](slide_3.png)

::: Notes

This slide extends the Example Corp. scenario into scaling territory: the application is growing and the company needs to respond to traffic changes automatically rather than reactively. The addition of license management to the scaling challenge is realistic — when you scale automatically, you may inadvertently exceed software license counts. Ask students to think about what constraints on scaling they've encountered that go beyond infrastructure capacity.

#### Instructor notes

#### Student notes

Example Corp.'s new application is experiencing increasing traffic and user demand. You must design a plan to mitigate the impact of changes in application traffic by automating the expansion and contraction of your resources. Additionally, you must establish a procedure to manage third-party licenses for any newly created resources.

:::

### Slide 4:

![Slide 4](slide_4.png)

::: Notes

The tax preparation analogy illustrates the core problem with fixed-capacity infrastructure: you provision for peak, which means you overprovision for the rest of the year. In a data center, this waste is unavoidable; in the cloud, it's a design choice. Ask students to think about what costs they incur by overprovisioning in the cloud versus the operational complexity they take on by automating scaling — both approaches have costs, and the right trade-off depends on the workload.

#### Instructor notes

#### Student notes

In a traditional data center environment, the scalability of your system is bound by your hardware. Consider the example of a tax preparation business in the United States. US taxpayers must file their taxes by April 15. Online tax preparation companies know that they will experience a steady flow of traffic. This traffic starts near the middle of January, with traffic peaking close to the April 15 deadline. In a data center, anticipating this 4-month period of heavy utilization requires provisioning enough physical servers to handle the anticipated load. But what happens to those servers the rest of the year? They sit idle in the data center.

:::

### Slide 5:

![Slide 5](slide_5.png)

::: Notes

The cloud enables elastic scaling, but elasticity doesn't happen automatically — it requires deliberate design. You must decide what metric to scale on, what thresholds to set, how quickly to scale, and how to handle state during scale-in. Each of these decisions involves trade-offs between cost, availability, and operational complexity. Ask students: what would you monitor to determine when to scale a web application, and what would you monitor for a batch processing workload?

#### Instructor notes

#### Student notes

In the cloud, because computing power is a programmatic resource, you can take a more flexible approach to scaling. You can program your system to create new Amazon Elastic Compute Cloud (Amazon EC2) instances in advance of known peak periods in a business cycle (such as tax filing deadlines). You can use monitoring services to programmatically scale out when you notice that critical resources are becoming constrained (for example, when average CPU utilization across the fleet rises). You can also automatically scale in the number of resources when demand is lower. As a result, you pay only for the resources that you need, when you need them. What is required to implement such a system? Let's review how you can combine several Amazon Web Services (AWS) services to create a scalable, on-demand architecture.

:::

### Slide 6:

![Slide 6](slide_6.png)

::: Notes


#### Instructor notes

#### Student notes

:::

### Slide 7:

![Slide 7](slide_7.png)

::: Notes


#### Instructor notes

#### Student notes

:::

### Slide 8:

![Slide 8](slide_8.png)

::: Notes

EC2 Auto Scaling provides two distinct capabilities that are often conflated: fleet management (replacing unhealthy instances) and demand scaling (adding or removing capacity based on load). Fleet management is active even at a fixed desired capacity — it's always monitoring and replacing unhealthy instances. Demand scaling is what most people think of as 'auto scaling.' Understanding both capabilities helps students design Auto Scaling groups correctly for different use cases.

#### Instructor notes

#### Student notes

Amazon EC2 Auto Scaling is a fully managed service designed to launch or terminate Amazon EC2 instances automatically. This capability helps ensure that you have the correct number of EC2 instances available to handle the load for your application. EC2 Auto Scaling helps you maintain application availability through the following:

* Fleet management for EC2 instances, which detects and replaces unhealthy instances
* Scaling your Amazon EC2 capacity up or down automatically according to conditions that you define

* **Health check** : You can configure your EC2 Auto Scaling group to maintain a specified number of running instances at all times. If an instance becomes unhealthy, the group terminates the unhealthy instance and launches another instance to replace it.
* **Amazon CloudWatch alarms** : A scaling policy instructs EC2 Auto Scaling to track a specific CloudWatch metric. It defines what action to take when the associated CloudWatch alarm is in the ALARM state.
* **Amazon Simple Queue Service (Amazon SQS)** : You can scale your Auto Scaling group in response to changes in system load in an Amazon SQS queue.
* **Schedule** : Scaling by schedule means that scaling actions are performed automatically as a function of time and date. Scaling by schedule is useful when you know exactly when to increase or decrease the number of instances in your group.
* **Manual** : Manual scaling is the most basic way to scale your resources. Specify only the change in the maximum, minimum, or desired capacity of your Auto Scaling group. EC2 Auto Scaling manages the process of creating or terminating instances to maintain the updated capacity.

For more information, see "What Is Amazon EC2 Auto Scaling?" in the Amazon EC2 Auto Scaling User Guide at https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html.

:::

### Slide 9:

![Slide 9](slide_9.png)

::: Notes

Auto Scaling requires three components: a launch template defining what to launch, an Auto Scaling group defining how many to run and where, and scaling policies defining when to change capacity. Each component can be configured and updated independently. This separation of concerns is powerful — you can update the launch template with a new AMI and gradually roll it out without changing scaling behavior, or adjust scaling thresholds without changing what instances look like.

#### Instructor notes

#### Student notes

To enable EC2 Auto Scaling, configure the following main components:

* **Launch templates** : Create a launch template that defines your EC2 instances.
* **Auto Scaling groups** : Create an Auto Scaling group to maintain a fixed number of instances even if an instance becomes unhealthy.
* **EC2 Auto Scaling policies** : You can configure the appropriate scaling strategy to meet your needs. Configure automatic scaling based on health checks and CloudWatch alarms. Configure programmatic scaling based on Amazon SQS. Schedule your scaling to meet your forecasted needs. Manually scale your Auto Scaling group.

For more information, see "Step 5: Next steps" in the Amazon EC2 Auto Scaling User Guide at https://docs.aws.amazon.com/autoscaling/ec2/userguide/GettingStartedTutorial.html#gs-tutorial-next-steps.
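The relationship between the three components can be sketched as boto3-style request parameters. This is a minimal illustration, not a complete deployment: all names, IDs, and values below are hypothetical, and in a real account these dicts would be passed to the Auto Scaling client's `create_auto_scaling_group` and `put_scaling_policy` calls.

```python
# Hypothetical names and IDs, for illustration only.
launch_template = {"LaunchTemplateName": "web-app-template", "Version": "1"}

auto_scaling_group = {
    "AutoScalingGroupName": "web-app-asg",
    "LaunchTemplate": launch_template,       # what to launch
    "MinSize": 2,                            # how many to run (bounds)
    "MaxSize": 10,
    "DesiredCapacity": 2,
    # Subnets are set on the group, not in the launch template.
    "VPCZoneIdentifier": "subnet-aaaa1111,subnet-bbbb2222",
}

scaling_policy = {
    "AutoScalingGroupName": auto_scaling_group["AutoScalingGroupName"],
    "PolicyName": "cpu-target-50",           # when to change capacity
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
}
```

Because each dict is independent, you can change the scaling policy without touching the launch template, or point the group at a new template version without changing scaling behavior.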

:::

### Slide 10:

![Slide 10](slide_10.png)

::: Notes

Launch templates define the configuration of instances that Auto Scaling will create, and they capture all the same parameters as a manually launched instance — except the VPC and subnet, which are defined at the Auto Scaling group level. This separation means you can use the same launch template across multiple Auto Scaling groups in different subnets or availability zones. The security group and IAM role in the template are particularly important to review: every instance launched by Auto Scaling will inherit these settings.

#### Instructor notes

#### Student notes

Creating a launch template works much like creating an individual instance. You must specify the same characteristics, such as the following:

* AWS Identity and Access Management (IAM) roles
* Security groups
* Storage
* Instance type
* User data
* Key pairs

However, you do not specify the virtual private cloud (VPC) or subnet in which your instances will launch. The Auto Scaling group that uses your launch template specifies those settings. You can use the launch template to specify whether to assign a public IP address automatically to each new instance that the launch template creates. If your instances will be launched in a private subnet behind a public Elastic Load Balancing (ELB) load balancer, you do not have to set this option.
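As a sketch, the template data for a hypothetical web server might look like the following boto3-style `LaunchTemplateData` dict (the AMI, key pair, security group, and instance profile values are placeholders). Note what is absent: no VPC or subnet appears, because the Auto Scaling group supplies those.

```python
# Hypothetical values for illustration; in a real call this dict would be
# passed as LaunchTemplateData to the EC2 client's create_launch_template.
launch_template_data = {
    "ImageId": "ami-0123456789abcdef0",          # placeholder AMI ID
    "InstanceType": "t3.micro",
    "KeyName": "web-app-key",                    # placeholder key pair
    "SecurityGroupIds": ["sg-0123456789abcdef0"],
    "IamInstanceProfile": {"Name": "web-app-role"},
    "UserData": "<base64-encoded bootstrap script>",
    # No VPC or subnet here: every Auto Scaling group that uses this
    # template chooses its own subnets, so one template can serve
    # multiple groups in different Availability Zones.
}
```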

:::

### Slide 11:

![Slide 11](slide_11.png)

::: Notes

Auto Scaling groups maintain a specified number of instances and respond to scaling events, but the group's behavior depends critically on the minimum, maximum, and desired capacity settings. Setting minimum too low can leave you with insufficient capacity during failures; setting maximum too low can prevent legitimate scaling; setting desired too high wastes money. These three values define the bounds of your scaling behavior and should be set based on capacity analysis, not guesswork.

#### Instructor notes

#### Student notes

Use Auto Scaling groups to manage and scale the set of Amazon EC2 instances needed to meet a capacity target. You can specify the following configurations:

* Set limits on the size of the group.
* Define some of the behavior of the instances.
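The minimum, maximum, and desired capacity values bound every scaling decision the group makes. A minimal sketch of that bounding rule (illustrative only, not the service's implementation):

```python
def apply_capacity_limits(requested: int, minimum: int, maximum: int) -> int:
    """Illustrative: the group's desired capacity is always clamped to the
    configured [minimum, maximum] range, no matter what a scaling policy
    or manual change requests."""
    return max(minimum, min(requested, maximum))

# A policy asks for 12 instances, but the group's maximum is 10:
print(apply_capacity_limits(12, minimum=2, maximum=10))  # 10
# A scale-in request below the minimum is also clamped:
print(apply_capacity_limits(0, minimum=2, maximum=10))   # 2
```

This is why the minimum bounds your worst-case availability and the maximum bounds your worst-case cost.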

:::

### Slide 12:

![Slide 12](slide_12.png)

::: Notes

Auto Scaling supports multiple scaling initiation methods — CloudWatch alarms, scheduled actions, and manual changes — and they can be used together. Cooldown and warm-up periods are the control mechanisms that prevent scaling thrashing: scaling too aggressively can create oscillation where the group alternately scales out and in without settling. Understanding the interaction between alarm evaluation periods, cooldown periods, and warm-up periods is essential for tuning Auto Scaling behavior.

#### Instructor notes

#### Student notes

**Methods to initiate changes** : You can initiate changes to an Auto Scaling group by defining a scaling policy that scales out or scales in, and a CloudWatch alarm that invokes that policy. An example of an alarm is average CPU utilization > 50 percent. The policy specifies one of two adjustments: add or remove a fixed number of instances, or adjust the number of running instances by a percentage of the desired capacity of the Auto Scaling group.

**Scheduling an action** : You can define a scheduled action. Scheduled actions set a new desired capacity value at a specific time. You can specify a scheduled action to launch on a specific date and time. Alternatively, you can specify a recurring action that is carried out at specific times throughout a week, month, or year. Scheduled actions are an efficient way to pre-warm capacity in response to anticipated traffic spikes.

**Cooldown periods** : Help prevent the initiation of additional scaling activities before the effects of previous activities are visible.

**Warm-up periods** : If you are creating a step policy, you can specify the number of seconds that it takes for a newly launched instance to warm up. Until its specified warm-up time has expired, an instance is not counted toward the aggregated metrics of the Auto Scaling group.
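The cooldown mechanism described above can be sketched in a few lines. This is an illustrative simplification (the service tracks cooldowns per group and per policy, and warm-up interacts with metric aggregation), but it captures the gating logic:

```python
def should_scale(now: float, last_scaling_time: float,
                 cooldown_seconds: float) -> bool:
    """Illustrative cooldown check: a new scaling activity is allowed only
    after the cooldown since the previous activity has expired, so the
    effects of that activity have time to show up in the metrics."""
    return (now - last_scaling_time) >= cooldown_seconds

# An alarm fires 120 s after the last scale-out; with a 300 s cooldown
# the new activity is suppressed:
print(should_scale(now=120, last_scaling_time=0, cooldown_seconds=300))  # False
# Once the cooldown has expired, scaling proceeds:
print(should_scale(now=301, last_scaling_time=0, cooldown_seconds=300))  # True
```

Without this gate, a sustained alarm could trigger repeated scale-outs before the first new instances had begun absorbing load, causing the oscillation ("thrashing") mentioned above.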

:::

### Slide 13:

![Slide 13](slide_13.png)

::: Notes


#### Instructor notes

#### Student notes

:::

### Slide 14:

![Slide 14](slide_14.png)

::: Notes

Launch templates replaced launch configurations for Auto Scaling, primarily because templates support versioning. With versioning, you can update instance configuration by creating a new template version rather than replacing the entire template, and you can roll back to a previous version if the new configuration causes problems. This is a significant operational improvement — ask students how they would manage a fleet update using launch template versions versus creating an entirely new Auto Scaling group.

#### Instructor notes

#### Student notes

**Launch template** : Launch templates have replaced the legacy use of launch configurations for repeated EC2 instance creation. A launch template specifies instance configuration information that you can use to launch EC2 instances in an Auto Scaling group. Configuration settings include the Amazon Machine Image (AMI) ID, the instance type, a key pair, security groups, and other parameters. Unlike the earlier launch configurations, launch templates allow for versioning. With versioning, you can create a subset of the full set of parameters and then reuse it to create other templates or template versions. For example, you can create a default template that defines common configuration parameters within your accounts. You can then allow other parameters to be specified as part of a specialized version for a specific application.

To create a launch template to use with an Auto Scaling group, use one of the following methods:

* Create a new template.
* Create a new version of an existing template.
* Copy the parameters from a running instance or another template.

For more information, see "Launch Templates" in the Amazon EC2 Auto Scaling User Guide at https://docs.aws.amazon.com/autoscaling/ec2/userguide/LaunchTemplates.html.

:::

### Slide 15:

![Slide 15](slide_15.png)

::: Notes

Launch template advanced settings cover IAM profiles, monitoring, license configurations, and user data — all of which affect operational behavior. The license configuration integration with AWS License Manager is worth highlighting in the context of this module: it allows you to enforce license compliance at launch time, which prevents licensing violations from accumulating as instances scale out. Termination protection in the launch template doesn't protect against Auto Scaling-initiated termination — a distinction that surprises many operators.

#### Instructor notes

#### Student notes

The launch template has the following advanced details:

* **IAM instance profile** : You associate an IAM instance profile with the instance.
* **Termination protection** : This setting specifies whether to prevent accidental termination. Termination protection protects against user-initiated termination, but it does not protect against termination initiated by EC2 Auto Scaling.
* **Detailed CloudWatch monitoring** : Use CloudWatch to monitor, collect, and analyze metrics about your instances.
* **License configurations** : License configurations are AWS License Manager rule sets that are associated with instances at launch to enforce license compliance. Rule sets are discussed in more detail later in this module.
* **User data** : Specify user data necessary to configure an instance during launch or to run a configuration script.

For more information, see "Launch an Instance from a Launch Template" in the Amazon Elastic Compute Cloud User Guide for Linux Instances at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-launch-templates.html.

:::

### Slide 16:

![Slide 16](slide_16.png)

::: Notes


#### Instructor notes

#### Student notes

:::

### Slide 17:

![Slide 17](slide_17.png)

::: Notes

Auto Scaling group configuration involves choices that affect cost, availability, and operational complexity simultaneously. Mixed instance types with Spot capacity can dramatically reduce costs but introduces interruption risk; health check configuration determines how quickly unhealthy instances are replaced; group size limits bound both your minimum availability and maximum cost. The decisions in an Auto Scaling group configuration require understanding your application's capacity requirements and tolerance for interruption.

#### Instructor notes

#### Student notes

An Auto Scaling group contains a collection of EC2 instances that are treated as a logical grouping for automatic scaling and management. With an Auto Scaling group, you can also use EC2 Auto Scaling features such as health check replacements and scaling policies. Maintaining the number of instances in an Auto Scaling group and scaling that number automatically are the core functions of the EC2 Auto Scaling service. The size of an Auto Scaling group depends on the number of instances that you set as the desired capacity. You can adjust its size to meet demand, either manually or by using automatic scaling.

* **Launch template and launch configuration** : To configure EC2 instances that your Auto Scaling group launches, you can specify a launch template, a launch configuration, or an EC2 instance.
* **Purchase options and instance types** : Choose one or more instance types for the group. Assign each instance type an individual weight. Specify how much On-Demand and Spot capacity to launch, and specify an optional On-Demand base portion. Rank instance types by whether they can benefit from Reserved Instance or Savings Plans discount pricing. Define how EC2 Auto Scaling should distribute your Spot capacity across instance types.
* **Health checks** : EC2 Auto Scaling automatically replaces instances that fail health checks. If you enabled load balancing, you can enable ELB health checks in addition to the Amazon EC2 health checks.
* **Group size** : Specify the size of the Auto Scaling group by changing the desired capacity. You can also specify minimum and maximum capacity limits.

For more information, see "Auto Scaling Groups" in the Amazon EC2 Auto Scaling User Guide at https://docs.aws.amazon.com/autoscaling/ec2/userguide/AutoScalingGroup.html.
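The purchase-option settings above map to the `MixedInstancesPolicy` parameter of an Auto Scaling group. A minimal boto3-style sketch (template name, instance types, and percentages are hypothetical choices, not recommendations):

```python
# Hypothetical mixed-instances configuration for illustration; in a real
# call this dict would be the MixedInstancesPolicy parameter of
# create_auto_scaling_group.
mixed_instances_policy = {
    "LaunchTemplate": {
        "LaunchTemplateSpecification": {
            "LaunchTemplateName": "web-app-template",
            "Version": "$Latest",
        },
        # Each instance type carries a weight: an m5.xlarge counts as
        # 2 units of capacity toward the group's desired capacity.
        "Overrides": [
            {"InstanceType": "m5.large", "WeightedCapacity": "1"},
            {"InstanceType": "m5.xlarge", "WeightedCapacity": "2"},
        ],
    },
    "InstancesDistribution": {
        "OnDemandBaseCapacity": 2,                  # first 2 units always On-Demand
        "OnDemandPercentageAboveBaseCapacity": 50,  # 50/50 On-Demand vs. Spot beyond that
        "SpotAllocationStrategy": "capacity-optimized",
    },
}
```

The base capacity protects a floor of interruption-free instances, while the Spot portion above it reduces cost at the price of interruption risk.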

:::

### Slide 18:

![Slide 18](slide_18.png)

::: Notes

Integrating ELB with Auto Scaling connects the load distribution layer to the capacity management layer. Auto Scaling automatically registers new instances with the load balancer as they launch and deregisters them as they're terminated. The load balancer's health checks can also feed into Auto Scaling's health monitoring — enabling ELB health checks in the Auto Scaling group means that an instance that passes EC2 status checks but fails the application-level health check will still be replaced.

#### Instructor notes

#### Student notes

Using Auto Scaling groups, you can launch EC2 instances behind a load balancer by enabling the Load balancing option. After you attach the load balancer, it automatically registers instances when they launch. When the load balancer receives traffic, it distributes the requests among the instances in your Auto Scaling group. To attach your load balancer to your Auto Scaling group, do one of the following:

* If you choose Application Load Balancer or Network Load Balancer, choose the name of a target group.
* If you choose Classic Load Balancer, choose the name of the load balancer.

For more information about how to distribute traffic across the instances in your Auto Scaling group, see "Use Elastic Load Balancing to Distribute Traffic Across the Instances in Your Auto Scaling Group" in the Amazon EC2 Auto Scaling User Guide at https://docs.aws.amazon.com/autoscaling/ec2/userguide/autoscaling-load-balancer.html.
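Attaching a target group and enabling ELB health checks come together in the group's parameters. A boto3-style sketch (the group name and target group ARN are placeholders):

```python
# Hypothetical parameters for illustration; the ARN is a placeholder.
asg_with_load_balancer = {
    "AutoScalingGroupName": "web-app-asg",
    # For an Application Load Balancer or Network Load Balancer,
    # attach the group by target group ARN:
    "TargetGroupARNs": [
        "arn:aws:elasticloadbalancing:us-east-1:111122223333:"
        "targetgroup/web/0123456789abcdef"
    ],
    # "ELB" layers the load balancer's application-level health checks
    # on top of the default EC2 status checks.
    "HealthCheckType": "ELB",
    # Grace period gives new instances time to boot before health
    # check failures count against them.
    "HealthCheckGracePeriod": 300,
}
```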

:::

### Slide 19:

![Slide 19](slide_19.png)

::: Notes

EC2 Auto Scaling's default health checks operate at the infrastructure level: is the instance running and passing EC2 status checks? These checks don't assess whether your application is healthy or responding correctly. System status check failures indicate AWS infrastructure issues; instance status check failures indicate OS-level issues. Neither tells you about application-level failures — that requires enabling ELB health checks or custom health checks.

#### Instructor notes

#### Student notes

EC2 Auto Scaling maintains health state for instances and terminates instances marked unhealthy. By default, it uses Amazon EC2 status checks. EC2 instance status checks include two types:

* **System status checks** : These involve checks on the underlying hardware and virtualization software underpinning an instance. A failed system status check indicates underlying problems with your instance that require AWS involvement to repair.
* **Instance status checks** : These involve sending an address resolution protocol (ARP) request to the network interface card (NIC). These checks detect problems that require your involvement to repair.

For more information about instance status checks, see "Status Checks for Your Instances" in the Amazon Elastic Compute Cloud User Guide for Linux Instances at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-system-instance-status-check.html.

:::

### Slide 20:

![Slide 20](slide_20.png)

::: Notes

EC2 status checks only verify infrastructure health — they confirm the instance is running, not that your application is working. This gap between infrastructure health and application health means Auto Scaling may keep instances in rotation that are technically running but not serving requests correctly. Enabling ELB health checks closes this gap by using the same health checks that determine whether the load balancer routes traffic to an instance.

#### Instructor notes

#### Student notes

The default EC2 Auto Scaling status checks primarily determine whether the instance is in a running state. The status checks do not determine whether applications are performing as expected. If an Auto Scaling group is behind a load balancer, you can use the ELB health checks for fine-tuned monitoring.

:::

### Slide 21:

![Slide 21](slide_21.png)

::: Notes

The instance maintenance policy controls the trade-off between availability and cost during instance replacement. Prioritizing availability launches a replacement before terminating the old instance — briefly exceeding desired capacity and incurring extra cost. Controlling costs terminates first and launches simultaneously — potentially creating brief capacity reduction. The flexible option lets you define custom thresholds. Understanding this trade-off helps students choose the right policy for their workload's sensitivity to interruptions versus cost spikes.

#### Instructor notes

When presenting, consider drawing the four replacement behavior labels: mixed behavior, prioritize availability, control costs, and flexible. This helps learners visually anchor the policy names to the behaviors.

#### Student notes

Instances marked unhealthy in an Auto Scaling group are automatically terminated, and a new instance is started to replace the unhealthy one. Using the AWS Command Line Interface (AWS CLI), you can mark an instance in one of your Auto Scaling groups for termination by setting its health status to Unhealthy. When the instance is marked as unhealthy, the Auto Scaling group removes the instance and replaces it, if necessary. If you suspend automatic termination for your Auto Scaling group, the group can grow up to 10 percent beyond its maximum size as it attempts to maintain healthy instances.

You can control how replacement happens by configuring an instance maintenance policy on your Auto Scaling group:

* **No policy (mixed behavior)** : For rebalancing events, EC2 Auto Scaling launches a new instance before terminating the old one; for all other replacement events, it terminates and launches at the same time.
* **Prioritize availability** : EC2 Auto Scaling launches a new instance and waits for it to pass health checks before terminating the old one, which might temporarily increase costs.
* **Control costs** : EC2 Auto Scaling terminates the old instance and launches the replacement at the same time, which might temporarily reduce availability.
* **Flexible** : Set custom minimum and maximum capacity thresholds for the most granular control over the trade-off between availability and cost.
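The flexible option is easiest to reason about as arithmetic on the desired capacity. The sketch below is a rough illustration of what minimum and maximum healthy-percentage thresholds imply; the percentages and rounding are illustrative, not the service's exact algorithm.

```python
import math

def replacement_capacity_bounds(desired: int, min_healthy_pct: int,
                                max_healthy_pct: int):
    """Illustrative: during replacement, keep at least min_healthy_pct%
    of desired capacity healthy, and run no more than max_healthy_pct%
    of desired capacity in total (old plus replacement instances)."""
    lower = math.ceil(desired * min_healthy_pct / 100)
    upper = math.floor(desired * max_healthy_pct / 100)
    return lower, upper

# A 10-instance group with a 90% / 120% policy must keep at least
# 9 instances healthy and may briefly run up to 12:
print(replacement_capacity_bounds(10, 90, 120))  # (9, 12)
```

A tight upper bound behaves like "control costs"; a high upper bound with a high lower bound behaves like "prioritize availability".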

:::

### Slide 22:

![Slide 22](slide_22.png)

::: Notes


#### Instructor notes

#### Student notes

:::

### Slide 23:

![Slide 23](slide_23.png)

::: Notes

Scheduled scaling is the simplest and most predictable form of Auto Scaling: you know when demand will change, so you configure capacity to change at those times. It works well for recurring patterns like business-hours traffic or weekly spikes, but it doesn't respond to unexpected demand changes. Scheduled scaling is typically used alongside dynamic scaling — scheduled actions pre-warm capacity before known peaks, and dynamic scaling handles unexpected variation around the baseline.

#### Instructor notes

#### Student notes

When you scale based on a schedule, you can scale your application in response to predictable load changes. For example, every week the traffic to your web application starts to increase on Wednesday, remains high on Thursday, and starts to decrease on Friday. You can plan your scaling activities based on the predictable traffic patterns of your web application.

To configure your Auto Scaling group to scale based on a schedule, you create a scheduled action. This schedule instructs EC2 Auto Scaling to perform a scaling action at specified times. To create a scheduled scaling action, specify the following core settings:

* The start time when the scaling action should take effect
* The new minimum, maximum, and desired sizes for the scaling action

At the specified time, EC2 Auto Scaling updates the group with the minimum, maximum, and desired size values for the specified scaling action.
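The weekly pattern described above could be expressed as two recurring scheduled actions. A boto3-style sketch for `put_scheduled_update_group_action` (group name, action names, and sizes are hypothetical; `Recurrence` uses cron syntax, evaluated in UTC by default):

```python
# Hypothetical scheduled actions for illustration.
scale_out_wednesday = {
    "AutoScalingGroupName": "web-app-asg",
    "ScheduledActionName": "midweek-scale-out",
    "Recurrence": "0 8 * * 3",    # every Wednesday at 08:00 UTC
    "MinSize": 4,
    "MaxSize": 20,
    "DesiredCapacity": 8,
}

scale_in_friday = {
    "AutoScalingGroupName": "web-app-asg",
    "ScheduledActionName": "weekend-scale-in",
    "Recurrence": "0 20 * * 5",   # every Friday at 20:00 UTC
    "MinSize": 2,
    "MaxSize": 10,
    "DesiredCapacity": 2,
}
```

At each recurrence, EC2 Auto Scaling replaces the group's minimum, maximum, and desired capacity with the action's values; dynamic scaling can still adjust capacity between the scheduled times.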

:::

### Slide 24:

![Slide 24](slide_24.png)

::: Notes

Simple scaling responds to a CloudWatch alarm by making a single fixed adjustment, then waiting for the cooldown period before responding to additional alarms. It's the simplest dynamic scaling policy, but the cooldown period means it can lag behind rapidly changing load. For gradual, sustained load changes, simple scaling is sufficient; for workloads with sudden, large demand spikes, step or target tracking scaling provides faster and more proportionate responses.

#### Instructor notes

#### Student notes

With simple scaling, you choose scaling metrics and threshold values for the CloudWatch alarms that initiate the scaling process. You also define how your Auto Scaling group should scale when a threshold is in breach for a specified number of evaluation periods. Simple scaling policy is one of the dynamic scaling options available. Simple scaling policies require you to perform the following tasks:

* Create CloudWatch alarms for the scaling policies.
* Specify the thresholds for the alarms.
* Define whether to add or remove instances, and how many, or set the group to an exact size.

After a scaling activity starts, the policy must wait for the scaling activity or health check replacement to finish and the cooldown period to expire before responding to additional alarms. Cooldown periods help to prevent the initiation of additional scaling activities before the effects of previous activities are visible. For more information, see "Step and Simple Scaling Policies for Amazon EC2 Auto Scaling" in the Amazon EC2 Auto Scaling User Guide at https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-scaling-simple-step.html.
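The core behavior of a simple scaling policy, one fixed adjustment per breach, gated by a cooldown, can be sketched in a few lines. The threshold and adjustment values here are illustrative, not defaults.

```python
def simple_scaling_step(capacity: int, cpu: float, threshold: float = 50.0,
                        adjustment: int = 2, in_cooldown: bool = False) -> int:
    """Illustrative simple scaling: when the alarm metric breaches the
    threshold, add a fixed number of instances, unless a cooldown from a
    previous scaling activity is still in effect."""
    if in_cooldown or cpu <= threshold:
        return capacity
    return capacity + adjustment

# CPU at 80% breaches the 50% threshold, so 2 instances are added:
print(simple_scaling_step(4, cpu=80.0))                    # 6
# The same breach during a cooldown is ignored:
print(simple_scaling_step(4, cpu=80.0, in_cooldown=True))  # 4
```

Note that the adjustment size is fixed regardless of how severe the breach is, which is the limitation that step scaling addresses.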

:::

### Slide 25:

![Slide 25](slide_25.png)

::: Notes

Step scaling improves on simple scaling by responding proportionally to the size of the alarm breach — a small breach triggers a small adjustment; a large breach triggers a larger one. Unlike simple scaling, step scaling can continue responding to new alarms while a scaling activity is in progress, which makes it more responsive during rapidly escalating demand. The step boundaries require careful calibration: steps that are too aggressive can overshoot capacity; steps that are too conservative can leave the application undersized.

#### Instructor notes

#### Student notes

With step scaling, you choose scaling metrics and threshold values for the CloudWatch alarms that initiate the scaling process. You also define how your Auto Scaling group should scale when a threshold is in breach for a specified number of evaluation periods. Step scaling policy is one of the dynamic scaling options available. Step scaling policies require you to perform the following tasks:

* Create CloudWatch alarms for the scaling policies.
* Specify the high and low thresholds for the alarms.
* Define whether to add or remove instances, and how many, or set the group to an exact size.

The main difference between the simple scaling and step scaling policy types is the step adjustments that step scaling policies provide. Step adjustments increase or decrease the current capacity of your Auto Scaling group by amounts that vary based on the size of the alarm breach. With step scaling, the policy can continue to respond to additional alarms, even while a scaling activity or health check replacement is in progress. Therefore, EC2 Auto Scaling evaluates all breached alarms as it receives the alarm messages. For more information, see "Step and Simple Scaling Policies for Amazon EC2 Auto Scaling" in the Amazon EC2 Auto Scaling User Guide at https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-scaling-simple-step.html.
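Step adjustments map the size of the breach to the size of the response. A minimal sketch of that lookup (the step boundaries and adjustments below are hypothetical values for a CPU metric with a 50 percent alarm threshold):

```python
def step_adjustment(metric: float, steps) -> int:
    """Illustrative step scaling: return the capacity adjustment whose
    half-open interval [lower, upper) contains the metric value."""
    for lower, upper, change in steps:
        if lower <= metric < upper:
            return change
    return 0  # metric is below every step: no adjustment

# Hypothetical steps: the further CPU is above the 50% threshold,
# the larger the scale-out.
cpu_steps = [
    (50, 60, 1),             # small breach: add 1 instance
    (60, 70, 2),             # larger breach: add 2 instances
    (70, float("inf"), 3),   # severe breach: add 3 instances
]

print(step_adjustment(65, cpu_steps))  # 2
print(step_adjustment(95, cpu_steps))  # 3
```

Calibrating these boundaries is the tuning work the notes mention: wide, aggressive steps overshoot; narrow, timid steps lag behind demand.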

:::

### Slide 26:

![Slide 26](slide_26.png)

::: Notes

Target tracking scaling simplifies policy configuration by letting you specify a target metric value rather than alarm thresholds. EC2 Auto Scaling handles creating and managing the CloudWatch alarms automatically. The trade-off is less granular control: you specify what metric to target but not how aggressively to scale. Target tracking works well for most workloads, but applications with unusual traffic patterns or slow-starting instances may need step scaling for more precise control.

#### Instructor notes

#### Student notes

With target tracking scaling policies, you choose a scaling metric and set a target value. EC2 Auto Scaling creates and manages the CloudWatch alarms that initiate the scaling policy. The tracking policy calculates the scaling adjustment based on the metric and the target value. The scaling policy adds or removes capacity as required to keep the metric at, or close to, the specified target value. In addition to keeping the metric close to the target value, a target tracking scaling policy also adjusts to changes in the metric because of a changing load pattern. In this example, the target tracking scaling policy is configured to keep the average CPU utilization of the Auto Scaling group at 50 percent.

The following predefined metrics are available:

* `ASGAverageCPUUtilization` : Average CPU utilization of the Auto Scaling group.
* `ASGAverageNetworkIn` : Average number of bytes received on all network interfaces by the Auto Scaling group.
* `ASGAverageNetworkOut` : Average number of bytes sent out on all network interfaces by the Auto Scaling group.
* `ALBRequestCountPerTarget` : Number of requests completed per target in an Application Load Balancer target group.

For more information, see "Target Tracking Scaling Policies for Amazon EC2 Auto Scaling" in the Amazon EC2 Auto Scaling User Guide at https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-scaling-target-tracking.html.
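The intuition behind target tracking is proportional: if the metric is above target, capacity grows roughly in proportion to the overshoot. The sketch below shows that arithmetic only; the actual algorithm also accounts for instance warm-up, avoids over-aggressive scale-in, and supports disabling scale-in entirely.

```python
import math

def target_tracking_capacity(current_capacity: int, metric_value: float,
                             target_value: float) -> int:
    """Illustrative target tracking arithmetic: choose the capacity that
    would bring a per-instance metric (such as average CPU) back to the
    target, rounding up so the metric lands at or below the target."""
    return math.ceil(current_capacity * metric_value / target_value)

# 4 instances averaging 75% CPU with a 50% target: 4 * 75 / 50 = 6,
# so scaling to 6 instances brings average CPU back toward 50%.
print(target_tracking_capacity(4, metric_value=75, target_value=50))  # 6
# At the target, capacity is left unchanged:
print(target_tracking_capacity(4, metric_value=50, target_value=50))  # 4
```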

:::

### Slide 27:

![Slide 27](slide_27.png)

::: Notes

Predictive scaling uses machine learning to forecast traffic based on historical patterns and provision capacity proactively. This is particularly valuable for workloads that take time to warm up — reactive scaling may provision instances after demand has already peaked. The important limitation is that predictive scaling is trained on observed patterns: it works well for recurring, foreseeable spikes but provides no advantage for sudden, unpredictable events. Use it in combination with dynamic scaling for comprehensive coverage.

#### Instructor notes

#### Student notes

AWS also provides predictive scaling. You can use predictive scaling to scale your Amazon EC2 capacity before traffic changes. EC2 Auto Scaling enhanced with predictive scaling delivers faster and more accurate capacity provisioning, which results in lower cost and more responsive applications. Predictive scaling predicts future traffic based on daily and weekly trends, including regularly occurring spikes, and provisions the right number of EC2 instances before anticipated changes. Provisioning the capacity in time for an impending load change makes automatic scaling faster. Predictive scaling's machine learning algorithms detect changes in daily and weekly patterns and automatically adjust their forecasts, which removes the need to manually adjust EC2 Auto Scaling parameters over time.

**Configuring predictive scaling** : You can configure predictive scaling through the AWS Auto Scaling console, the AWS Auto Scaling APIs, the AWS CLI, or AWS CloudFormation. To get started, navigate to the AWS Auto Scaling page and create a scaling plan for Amazon EC2 resources that includes predictive scaling. When predictive scaling is enabled, you can immediately visualize the forecasted traffic and the generated scaling actions. You can use predictive scaling, dynamic scaling, or both. Predictive scaling works by forecasting load and scheduling minimum capacity; dynamic scaling uses target tracking to adjust a designated CloudWatch metric to a specific target. The two models work together because dynamic scaling builds on the scheduled minimum capacity that predictive scaling has already set. Predictive scaling is a great match for websites and applications that undergo periodic traffic spikes. It is not designed to help in situations in which spikes in load are unpredictable.
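EC2 Auto Scaling also exposes predictive scaling directly through its `PutScalingPolicy` API. As a sketch, the request payload looks like the following; the group and policy names are placeholders, and the call itself is commented out because it needs boto3 and AWS credentials.

```python
# Illustrative predictive scaling policy request; "my-asg" and
# "cpu-predictive" are placeholder names.
policy_request = {
    "AutoScalingGroupName": "my-asg",
    "PolicyName": "cpu-predictive",
    "PolicyType": "PredictiveScaling",
    "PredictiveScalingConfiguration": {
        "MetricSpecifications": [{
            "TargetValue": 50.0,
            "PredefinedMetricPairSpecification": {
                "PredefinedMetricType": "ASGCPUUtilization"
            },
        }],
        # "ForecastOnly" lets you inspect forecasts before acting on them.
        "Mode": "ForecastAndScale",
    },
}

# import boto3
# boto3.client("autoscaling").put_scaling_policy(**policy_request)
print(policy_request["PolicyType"])  # -> PredictiveScaling
```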

:::

### Slide 28:

![Slide 28](slide_28.png)

::: Notes

Termination policies control which instances Auto Scaling removes first when scaling in. The default policy prioritizes even distribution across Availability Zones for resilience; other policies optimize for cost (`ClosestToNextInstanceHour`), configuration freshness (`OldestLaunchTemplate`), or gradual fleet upgrades (`OldestInstance`). Choosing the wrong termination policy can inadvertently remove your newest, most current instances or create AZ imbalance. Review your termination policy whenever you change your launch template or deployment strategy.

#### Instructor notes

#### Student notes

With each Auto Scaling group, you control when EC2 Auto Scaling removes instances (referred to as scaling in) from your group. When you have EC2 Auto Scaling automatically scale in, you must decide which instances EC2 Auto Scaling should terminate first. You can configure this using a termination policy. EC2 Auto Scaling supports the following termination policies:

* **Default** : The default termination policy is designed to help ensure that your instances span Availability Zones evenly for high availability.
* **Allocation strategy** : Terminate instances in the Auto Scaling group to align the remaining instances to the defined allocation strategy to fulfill your On-Demand capacity or Spot capacity. This policy is useful when your preferred instance types have changed.
* **OldestLaunchTemplate** : Terminate instances that have the earliest launch template. This policy is useful when you're updating a group and phasing out the instances from a previous configuration.
* **OldestLaunchConfiguration** : Terminate instances that have the earliest launch configuration. This policy is useful when you're updating a group and phasing out the instances from a previous configuration.
* **ClosestToNextInstanceHour** : Terminate instances that are closest to the next billing hour. This policy helps you maximize the use of your instances that have an hourly charge.
* **NewestInstance** : Terminate the newest instance in the group. This policy is useful when you're testing a new launch configuration but don't want to keep it in production.
* **OldestInstance** : Terminate the earliest instance in the group. This option is useful when you're upgrading the instances in the Auto Scaling group to a new EC2 instance type. You can gradually replace instances of the earlier type with instances of the new type.

For more information, see "Control Which Auto Scaling Instances Terminate During Scale In" in the Amazon EC2 Auto Scaling User Guide at http://docs.aws.amazon.com/autoscaling/latest/userguide/as-instance-termination.html.
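The Default policy's selection order can be illustrated with a simplified sketch (not the actual AWS implementation; the `Instance` fields here are stand-ins for real instance attributes):

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Instance:
    instance_id: str
    az: str
    template_version: int      # lower = older launch template
    seconds_to_next_hour: int  # lower = closer to the next billing hour

def pick_instance_to_terminate(instances):
    """Simplified sketch of the Default termination policy: prefer the AZ
    with the most instances (to rebalance), then the oldest launch
    template, then the instance closest to the next billing hour."""
    az_counts = Counter(i.az for i in instances)
    busiest_az = max(az_counts, key=az_counts.get)
    candidates = [i for i in instances if i.az == busiest_az]
    oldest = min(i.template_version for i in candidates)
    candidates = [i for i in candidates if i.template_version == oldest]
    return min(candidates, key=lambda i: i.seconds_to_next_hour)

fleet = [
    Instance("i-a", "us-east-1a", 2, 900),
    Instance("i-b", "us-east-1a", 1, 1800),
    Instance("i-c", "us-east-1b", 1, 60),
]
# us-east-1a has more instances, and i-b runs the older template there.
print(pick_instance_to_terminate(fleet).instance_id)  # -> i-b
```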

:::

### Slide 29:

![Slide 29](slide_29.png)

::: Notes

Thrashing — where Auto Scaling oscillates between scaling out and scaling in — is a symptom of misconfigured scaling settings. The alarm sustained period, cooldown period, and instance warm-up period work together to create stability, but they also mean that the Auto Scaling group responds to demand changes with inherent latency. Tuning these parameters requires understanding your application's response time under load and the startup time for new instances — values that should be measured, not guessed.

#### Instructor notes

#### Student notes

Thrashing occurs when your scaling settings remove capacity and then quickly add it back. The following settings work together to prevent it:

* **Alarm sustained period** : Alarms invoke actions only for sustained state changes. CloudWatch alarms do not invoke actions simply because they are in a particular state; the state must have changed and been maintained for a specified number of periods. For example, an alarm configured for two consecutive 5-minute periods takes 10 minutes to initiate an action.
* **Cooldown period** : The Auto Scaling cooldown period is a configurable setting for your Auto Scaling group. This period makes sure that EC2 Auto Scaling doesn't launch or terminate additional instances before the previous scaling activity takes effect. After the Auto Scaling group dynamically scales using a simple scaling policy, EC2 Auto Scaling waits for the cooldown period to finish before resuming scaling activities. However, if an instance becomes unhealthy, EC2 Auto Scaling does not wait for the cooldown period to complete. For more information, see "Scaling Cooldowns for Amazon EC2 Auto Scaling" in the Amazon EC2 Auto Scaling User Guide at https://docs.aws.amazon.com/autoscaling/ec2/userguide/ec2-auto-scaling-scaling-cooldowns.html.
* **Instance warm-up period** : With step scaling policies, you can specify the number of seconds that it takes for a newly launched instance to warm up. Until its specified warm-up time has expired, an instance is not counted toward the aggregated metrics of the Auto Scaling group. While scaling out, AWS also does not consider instances that are warming up as part of the current capacity of the group. Therefore, multiple alarm breaches that fall in the range of the same step adjustment result in a single scaling activity. This ensures that the policy does not add more instances than you need. For more information, see "Instance Warm-Up" in the Amazon EC2 Auto Scaling User Guide at https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-scaling-simple-step.html#as-step-scaling-warmup.
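The cooldown behavior described above can be sketched as a simple gate, using hypothetical timestamp inputs:

```python
def can_scale(last_scaling_time: float, cooldown_seconds: int,
              now: float, instance_unhealthy: bool = False) -> bool:
    """Sketch of the cooldown check for a simple scaling policy: block a
    new scaling activity until the cooldown has elapsed, except when an
    unhealthy instance must be replaced."""
    if instance_unhealthy:
        return True  # health replacement does not wait for the cooldown
    return (now - last_scaling_time) >= cooldown_seconds

t0 = 1_000.0
print(can_scale(t0, 300, now=t0 + 120))  # -> False: still cooling down
print(can_scale(t0, 300, now=t0 + 301))  # -> True: cooldown elapsed
print(can_scale(t0, 300, now=t0 + 10, instance_unhealthy=True))  # -> True
```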

:::

### Slide 30:

![Slide 30](slide_30.png)

::: Notes


#### Instructor notes

#### Student notes

For more information, see the following references: The Grinder at http://grinder.sourceforge.net/ and Apache JMeter at http://jmeter.apache.org/.

:::

### Slide 31:

![Slide 31](slide_31.png)

::: Notes


#### Instructor notes

#### Student notes

:::

### Slide 32:

![Slide 32](slide_32.png)

::: Notes

Spot Instances offer significant cost savings by using spare EC2 capacity, but they come with the fundamental constraint that AWS can reclaim them with a two-minute warning. This makes Spot Instances unsuitable for stateful workloads or any task that can't tolerate interruption. They work well for batch processing, fault-tolerant distributed systems, and stateless web tiers. The key design principle is: if your application can't handle an instance disappearing suddenly, don't run it on Spot.

#### Instructor notes

#### Student notes

A Spot Instance is an instance that uses spare EC2 capacity that is available for less than the On-Demand price. Because Spot Instances let you request unused EC2 instances at steep discounts, you can lower your Amazon EC2 costs significantly. By using Spot Instances and Spot Fleets, you can start or stop instances based primarily on their current price rather than on performance. Amazon EC2 can terminate a Spot Instance as the availability of, or price for, Spot Instances changes. For more information, see "Spot Instances" in the Amazon Elastic Compute Cloud User Guide for Linux Instances at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html.
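Applications that run on Spot should watch for the two-minute interruption notice. When AWS schedules an interruption, the instance metadata path `/latest/meta-data/spot/instance-action` starts returning a small JSON document (it returns 404 until then). A minimal sketch of parsing that body, using a sample payload rather than a live metadata call:

```python
import json
from datetime import datetime, timezone

def parse_interruption_notice(body: str):
    """Parse the JSON that the Spot instance-action metadata path returns
    once an interruption is scheduled, e.g.
    {"action": "terminate", "time": "2025-01-01T12:00:00Z"}."""
    notice = json.loads(body)
    when = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    return notice["action"], when.replace(tzinfo=timezone.utc)

# Sample body; a real application would fetch it from the metadata
# endpoint and begin draining work before the stated time.
action, when = parse_interruption_notice(
    '{"action": "terminate", "time": "2025-01-01T12:00:00Z"}')
print(action)  # -> terminate
```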

:::

### Slide 33:

![Slide 33](slide_33.png)

::: Notes

Spot Fleets manage a collection of Spot and On-Demand Instances to maintain a target capacity, automatically replacing interrupted Spot Instances from different instance pools. The fleet configuration includes a maximum price, which bounds your costs, but the fleet will stop launching instances when the price ceiling is reached even if target capacity isn't met. Diversifying across multiple instance types and Availability Zones is the key strategy for maintaining capacity despite individual pool interruptions.

#### Instructor notes

#### Student notes

A Spot Fleet is a collection, or fleet, of Spot Instances and, if necessary, On-Demand Instances. The Spot Fleet attempts to launch enough Spot Instances and On-Demand Instances to meet the target capacity that you specified in the Spot Fleet request. The request for Spot Instances is fulfilled if capacity is available and the maximum price that you specified in the request exceeds the current Spot price. If your Spot Instances are interrupted, Spot Fleet attempts to maintain its target capacity. You can also set a maximum amount per hour that you're willing to pay for your fleet, and Spot Fleet launches instances until it reaches that amount. When the maximum amount is reached, the fleet stops launching instances even if it hasn't met the target capacity. For more information, see "Spot Fleet" in the Amazon Elastic Compute Cloud User Guide for Linux Instances at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet.html.
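The interaction between target capacity and the hourly spending cap can be sketched as follows (prices in whole cents to keep the arithmetic exact; this is an illustration, not how Spot Fleet is implemented):

```python
def launch_fleet(target_capacity: int, max_hourly_cents: int,
                 price_cents: int) -> int:
    """Sketch of the Spot Fleet spending cap: keep launching toward the
    target capacity, but stop as soon as the next launch would push the
    fleet past the maximum hourly amount you're willing to pay."""
    launched, spend = 0, 0
    while launched < target_capacity and spend + price_cents <= max_hourly_cents:
        launched += 1
        spend += price_cents
    return launched

# A $0.50/hour cap with $0.08/hour instances stops the fleet at 6 of 10.
print(launch_fleet(target_capacity=10, max_hourly_cents=50, price_cents=8))   # -> 6
# A $2.00/hour cap lets the fleet reach its full target.
print(launch_fleet(target_capacity=10, max_hourly_cents=200, price_cents=8))  # -> 10
```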

:::

### Slide 34:

![Slide 34](slide_34.png)

::: Notes

Mixing On-Demand and Spot Instances in a single Auto Scaling group lets you use On-Demand for baseline capacity and Spot for burstable capacity. The base On-Demand portion provides stability; the Spot portion reduces cost for demand above the baseline. This hybrid approach requires designing the application to tolerate partial fleet interruptions — losing the Spot portion during high Spot prices should not cause an outage, merely a reduction in capacity.

#### Instructor notes

#### Student notes

You can launch and automatically scale a fleet of On-Demand Instances and Spot Instances within a single Auto Scaling group. This helps you optimize your cost savings for Amazon EC2 instances, while making sure that you obtain the desired scale and performance for your application. You can specify the same settings that are used to launch Spot Instances as part of the settings of an Auto Scaling group. When you specify the settings as part of the Auto Scaling group, you can specify additional options. For example, you can specify whether to launch only Spot Instances or a combination of both On-Demand Instances and Spot Instances. For more information, see "Requesting Spot Instances for Fault-Tolerant and Flexible Applications" at https://docs.aws.amazon.com/autoscaling/ec2/userguide/asg-launch-spot-instances.html and "Auto Scaling Groups with Multiple Instance Types and Purchase Options" at https://docs.aws.amazon.com/autoscaling/ec2/userguide/asg-purchase-options.html in the Amazon EC2 Auto Scaling User Guide.
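A sketch of what such a configuration looks like as the `MixedInstancesPolicy` parameter of EC2 Auto Scaling's `CreateAutoScalingGroup` API; the launch template name and instance types are placeholders, and the request itself is not sent here:

```python
# Illustrative MixedInstancesPolicy; "my-template" is a placeholder.
mixed_instances_policy = {
    "LaunchTemplate": {
        "LaunchTemplateSpecification": {
            "LaunchTemplateName": "my-template",
            "Version": "$Latest",
        },
        # Diversify across instance types to keep Spot capacity available.
        "Overrides": [{"InstanceType": t}
                      for t in ("m5.large", "m5a.large", "m4.large")],
    },
    "InstancesDistribution": {
        "OnDemandBaseCapacity": 2,                  # stable On-Demand baseline
        "OnDemandPercentageAboveBaseCapacity": 25,  # 75% Spot above the base
        "SpotAllocationStrategy": "capacity-optimized",
    },
}
print(len(mixed_instances_policy["LaunchTemplate"]["Overrides"]))  # -> 3
```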

:::

### Slide 35:

![Slide 35](slide_35.png)

::: Notes


#### Instructor notes

#### Student notes

:::

### Slide 36:

![Slide 36](slide_36.png)

::: Notes

AWS License Manager centralizes software license compliance across AWS and on-premises environments, but it's only effective if license rules are accurately configured. Creating rules requires understanding the actual terms of your enterprise agreements — which may be complex, version-specific, or ambiguous. License Manager can block instance launches when license limits would be exceeded, which is the right behavior for compliance but can cause operational disruptions if rules are misconfigured or license counts are set too conservatively.

#### Instructor notes

#### Student notes

AWS License Manager makes managing your software licenses from software vendors across AWS and on-premises environments more efficient. Using License Manager, administrators can create customized licensing rules that emulate the terms of their licensing agreements. The rules in License Manager help you prevent a licensing breach by stopping the instance from launching or by notifying administrators about the infringement. Administrators gain control of and visibility into all their licenses with the License Manager dashboard. They can reduce the risk of noncompliance, misreporting, and additional costs as the result of licensing overages.

:::

### Slide 37:

![Slide 37](slide_37.png)

::: Notes

License configurations translate the terms of enterprise agreements into rules that AWS enforces at launch time. Creating accurate configurations requires careful review of your actual license agreements — the rules you define in License Manager are only as correct as your understanding of those agreements. Involve your compliance and legal teams when creating license configurations, because the consequences of misconfiguration can range from unnecessary launch failures to actual licensing violations.

#### Instructor notes

#### Student notes

License configurations are the core of License Manager. They contain licensing rules based on the terms of your enterprise agreements. The rules that you create determine how AWS processes commands that consume licenses. While creating license configurations, work closely with your organization's compliance team to review your enterprise agreements. For more information, see "Self-Managed Licenses in License Manager" in the AWS License Manager User Guide at https://docs.aws.amazon.com/license-manager/latest/userguide/license-configurations.html.
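As a sketch, a license configuration maps to parameters like those of License Manager's `CreateLicenseConfiguration` API. The name, counts, and rule string below are illustrative assumptions, not terms from any real agreement; verify the exact rule syntax against the License Manager documentation before relying on it.

```python
# Illustrative license configuration parameters; every value here is a
# placeholder for terms taken from your actual enterprise agreement.
license_configuration = {
    "Name": "example-db-license",
    "LicenseCountingType": "vCPU",  # other types include Instance, Core, Socket
    "LicenseCount": 100,
    # Hard limit: block launches that would exceed the count, rather than
    # only notifying administrators of the overage.
    "LicenseCountHardLimit": True,
    # Assumed rule syntax; confirm against the License Manager user guide.
    "LicenseRules": ["#minimumVcpus=4"],
}
print(license_configuration["LicenseCountingType"])  # -> vCPU
```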

:::

### Slide 38:

![Slide 38](slide_38.png)

::: Notes

AWS License Manager provides tooling to help translate vendor license terms into enforcement rules, but vendor agreements vary widely in their specificity and complexity. Some metrics are straightforward (vCPU counts); others require judgment about what constitutes a 'socket' or 'physical machine' in a virtualized environment. Building your License Manager rules from vendor documentation is a useful starting point, but it requires verification against your actual agreements and consultation with your compliance team.

#### Instructor notes

#### Student notes

You can create License Manager rule sets based on the language of software vendor licenses. For more information, see "Build License Manager Rules from Vendor Licenses" in the AWS License Manager User Guide at https://docs.aws.amazon.com/license-manager/latest/userguide/licenses-to-rules.html.

:::

### Slide 39:

![Slide 39](slide_39.png)

::: Notes


#### Instructor notes

#### Student notes

:::

### Slide 40:

![Slide 40](slide_40.png)

::: Notes


#### Instructor notes

#### Student notes

:::

### Slide 41:

![Slide 41](slide_41.png)

::: Notes


#### Instructor notes

#### Student notes

:::

### Slide 42:

![Slide 42](slide_42.png)

::: Notes


#### Instructor notes

#### Student notes

:::

### Slide 43:

![Slide 43](slide_43.png)

::: Notes


#### Instructor notes

#### Student notes

For more information, see "AMI `<AMI ID>` Is Pending, and Cannot Be Run. Launching EC2 Instance Failed." in the Amazon EC2 Auto Scaling User Guide at https://docs.aws.amazon.com/autoscaling/ec2/userguide/ts-as-ami.html#ts-as-ami-2.

:::

### Slide 44:

![Slide 44](slide_44.png)

::: Notes


#### Instructor notes

#### Student notes

:::

### Slide 45:

![Slide 45](slide_45.png)

::: Notes


#### Instructor notes

#### Student notes

:::
