Auto Scaling
Auto Scaling#
General information#
The Auto Scaling service allows you to create scalable groups where the required number of instances is automatically deployed to accommodate the current application load. Built-in automatic scaling mechanisms ensure the launch of new instances to replace failed ones and creation/termination of instances when the load changes.
You can create groups with a fixed number of instances as well as groups where the number of instances varies within a given range depending on the load. Health check mechanisms ensure that new instances are launched to replace failed ones, while scaling policies add instances to the group or terminate them when the specified alarms are triggered. For example, the number of instances in a group may change when the CPUUtilization metric in the group crosses a threshold.
Automatic scaling provides the following benefits:
Enhanced fault tolerance. If an instance crashes, automatic scaling mechanisms detect this, delete the instance from the group, and launch a new instance instead.
Improved availability. Automatic scaling allows you to quickly add computing capacity when applications need more resources to accommodate increased load.
Cost-effectiveness. Since the number of running instances changes depending on the load, you only pay for the resources that are actually needed to support the load.
When creating an Auto Scaling group, you can choose subnets in different availability zones to place the instances. Distributing the instances among several availability zones further improves service availability and reliability. The instances in the group are distributed among the availability zones as evenly as possible. If the number of instances in a group is not evenly divisible by the number of subnets, the remaining instances are distributed among subnets in the order in which the availability zones are numbered.
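The distribution rule described above can be illustrated with a short sketch. It is not the service's actual implementation: the function name and the zone names are hypothetical, and the sketch assumes the zones are passed in their numbering order.

```python
def distribute_instances(instance_count, zones):
    """Return {zone: count}, spreading instances as evenly as possible.

    `zones` is assumed to be listed in availability-zone numbering order,
    so the remaining instances go to the first zones in that order.
    """
    if not zones:
        raise ValueError("at least one subnet is required")
    base, remainder = divmod(instance_count, len(zones))
    return {
        zone: base + (1 if index < remainder else 0)
        for index, zone in enumerate(zones)
    }


# Hypothetical zone names, for illustration only
print(distribute_instances(5, ["az-1", "az-2", "az-3"]))
# {'az-1': 2, 'az-2': 2, 'az-3': 1}
```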
Note
Resources the Auto Scaling service needs are prepared and maintained by the system user. The Activity Log registers all API calls to the Auto Scaling service, including system requests to create, modify, and delete resources required for the service to run.
Key concepts#
Auto Scaling group. Instances are organized into a group that is managed and scaled as a single entity. The number of instances in a group changes according to the pre-defined scaling policy to support the current load as it changes.
Launch templates. Launch templates are used to create instances in a group. They describe the configuration of the instances that you want to create. For example, you can choose an image, specify an instance type, add user data, and define other settings. For details, read about launch templates.
Scaling policy. The scaling policy prescribes how the number of instances in a group is to change when an alarm is triggered as a result of the monitored metrics going beyond the preset limits. For details, read about using alarms with scaling policies.
Cooldown. The period after a policy is executed during which the policy cannot be executed again. It takes time to bring instances into and out of service. Until that happens, the tracked metrics may not reflect the actual load and may trigger false alarms about the need to add or remove instances. Blocking policy execution during the cooldown allows you to ignore such alarms.
Health check. The cloud periodically checks the status of the instances in the group. An instance is considered healthy if it is active and has the OK status. This status corresponds to the Running instance state in the instance table and on its page. If the instance is in any other state, then it is labelled as Unhealthy and another instance is launched instead.
Health check grace period. This parameter allows you to postpone the health check until the instance is fully deployed and all necessary services are running on it. This prevents the instance from being considered unhealthy and terminated while it is still being created and has not yet acquired the OK status.
Instance scale-in protection. Instances with protection enabled cannot be deleted when the group is downsized.
Termination policies. Termination policies set the criteria used to select instances for deletion when the group has to be downsized. Multiple termination policies can be set at once. To select the instances to be deleted, all group instances are sorted by applying the selected policies. The sorting order matches the order in which the policies are listed.
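To make the interplay between the health check and the grace period more concrete, here is a minimal sketch. It assumes that no health verdict is issued while the grace period is running and that the period is counted from the instance launch time; the function name and the state strings are hypothetical, not part of the service API.

```python
from datetime import datetime, timedelta


def needs_replacement(state, launched_at, now, grace_period_s=300):
    """Return True if the instance should be treated as unhealthy and replaced."""
    # Assumption: the instance is not judged while the grace period is running
    if now - launched_at < timedelta(seconds=grace_period_s):
        return False
    # Per the text above, only the Running state maps to the OK status
    return state != "Running"


launched = datetime(2024, 1, 1, 12, 0, 0)
print(needs_replacement("Pending", launched, launched + timedelta(seconds=120)))  # False
print(needs_replacement("Stopped", launched, launched + timedelta(seconds=600)))  # True
```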
Using Auto Scaling groups#
Before you begin#
To work with the Auto Scaling service, a user needs the AutoscalingFullAccess project grants. For example, such grants are available to administrators in the AutoScalingAdministrators and CloudAdministrators groups. If necessary, you can create a separate user, add the user to the project, and either attach the AutoscalingFullAccess policy to the user or add the user to one of the administrator groups in this project.
In addition, the project should have the following resources:
A launch template with the desired instance configuration, since instances in the Auto Scaling group are created from a launch template.
Create an Auto Scaling group#
To create an Auto Scaling group, go to the section Virtual machines → Compute → Auto Scaling and click Create.
Step 1. Set the parameters#
To create an Auto Scaling group, specify:
Group name.
VPC and subnets where the group will be created and respective group instances will be launched. You can select up to three subnets, one in each availability zone.
Note
If automatic association is enabled for Elastic IPs in the selected subnet, then an available allocated Elastic IP address will be automatically associated with the instance when it is created in that subnet.
Launch template and its version, which describes configuration of the instances to be created.
If the Security groups option was selected when you created the launch template, you must specify subnets from the same VPC to which the specified security groups belong.
If the Advanced network configuration option was selected when you created the launch template, then all specified parameters of the network interfaces will be ignored. In particular, the instances you create will have only one interface, even if more than one was specified. It will be connected to one of the selected subnets and assigned a default security group from the selected VPC. The VPC specified in the launch template will also be ignored if another VPC is selected.
Note
If the launch template was created using the API, security groups and network interface parameters may be left unconfigured. In this case, the primary interfaces of the instances being created will also be connected to one of the selected subnets and assigned a default security group from the selected VPC.
Once you have set the parameters, click Next to proceed to the next step.
Step 2. Specify settings#
In this step, you can set limits on the Auto Scaling group capacity, the initial number of instances in the group, and the delays before the next scaling policy execution and the instance health check.
Note
If a placement group is specified in a launch template, then the number of instances in the group cannot exceed your project’s quota for the number of instances in one availability zone in the placement group.
Attention
If you need a group with a fixed number of instances, set the maximum size equal to the minimum one.
Maximum capacity — maximum number of instances in the group.
Minimum capacity — minimum number of instances in the group. If this value is non-zero, then there will always be some number of instances in the group to accommodate the load.
Desired capacity — The required number of instances in the group. The desired capacity cannot be more than the maximum capacity or less than the minimum capacity. When creating a group, this parameter defines the initial number of instances in the group. Later, the desired capacity can be changed both manually and automatically according to policies.
Default cooldown — The default delay before the next scaling policy can be executed, used if no other value is specified when the policy is created. The default value is 300 s.
Health check grace period — The delay before the health of an instance is checked; this time is required for the instance to start. The default value is 300 s.
Instance scale-in protection — instances with protection enabled are not deleted when the desired capacity is decreased.
Termination policy — when the group has to be downsized, the instances are selected for deletion in line with the specified policies. The following termination policies are supported:
Oldest instance
Newest instance
Oldest launch template
At least one termination policy should be specified. The Oldest instance policy is selected by default. To sort instances before deletion, the policies are applied from the first to the last. If you select multiple policies, you can change their priorities by dragging them.
Note
The Oldest instance and Newest instance policies contradict each other and should not be selected at the same time.
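The way several termination policies combine can be sketched as a multi-key sort. This is only an illustration: the `Instance` fields are hypothetical, "oldest launch template" is assumed to mean the lowest template version number, and instances with scale-in protection are simply skipped.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    launch_time: int        # seconds since epoch (hypothetical field)
    template_version: int   # launch template version (hypothetical field)
    protected: bool = False  # instance scale-in protection


# One sort key per policy; instances with smaller keys are deleted first
POLICY_KEYS = {
    "Oldest instance": lambda i: i.launch_time,
    "Newest instance": lambda i: -i.launch_time,
    # Assumption: an older launch template means a lower version number
    "Oldest launch template": lambda i: i.template_version,
}


def select_for_termination(instances, policies, count):
    """Return IDs of up to `count` unprotected instances to delete."""
    if not policies:
        raise ValueError("at least one termination policy is required")
    candidates = [i for i in instances if not i.protected]
    # Policies are applied in the listed order: first policy is the primary key
    ordered = sorted(
        candidates,
        key=lambda i: tuple(POLICY_KEYS[p](i) for p in policies),
    )
    return [i.instance_id for i in ordered[:count]]


group = [
    Instance("i-a", 100, 2),
    Instance("i-b", 200, 1),
    Instance("i-c", 300, 1, protected=True),
    Instance("i-d", 50, 2),
]
print(select_for_termination(group, ["Oldest launch template", "Oldest instance"], 2))
# ['i-b', 'i-d']
```

With a tuple key, a later policy only breaks ties left by the earlier ones, which is why the order of policies matters.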
Once you have set the parameters, click Next to proceed to the next step.
Step 3. Assign tags#
Tags specified for the Auto Scaling group can be assigned to the instances created in the group.
Note
Auto Scaling group tags take precedence over launch template tags: if tags with the same key are to be assigned to an instance when it is created, the tag specified in the Auto Scaling group is used.
To add a Name tag, click Add Name tag and set the tag value. To assign an arbitrary tag, click Add tag and set the tag key and value.
If it is not necessary to assign the given tag to instances, clear the Assign to instances checkbox, which is selected by default.
To assign additional tags, click Add tag.
After setting tags, click Overview and create to go to the next step. You can skip this step and set tags later.
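The tag precedence rule from this step can be sketched as a dictionary merge. The function and the tuple layout below are hypothetical and only illustrate the rule; they do not describe the service API.

```python
def instance_tags(template_tags, group_tags):
    """Compute the tags an instance receives when it is created.

    template_tags: dict of tags from the launch template.
    group_tags: list of (key, value, assign_to_instances) tuples, where the
    flag mirrors the Assign to instances checkbox.
    """
    merged = dict(template_tags)
    for key, value, assign_to_instances in group_tags:
        if assign_to_instances:
            # A group tag wins over a template tag with the same key
            merged[key] = value
    return merged


print(instance_tags(
    {"Name": "from-template", "env": "dev"},
    [("Name", "web", True), ("team", "core", False)],
))
# {'Name': 'web', 'env': 'dev'}
```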
Step 4. Check the group configuration#
Check the parameters and settings you specified in the previous steps. If you need to change any parameters or settings, return to the desired step. If everything is OK, click Create.
Modify configuration of an Auto Scaling group#
Once an Auto Scaling group has been created, you can change its parameters and settings, except for the group name.
To modify the Auto Scaling group settings, go to the section Virtual machines → Compute → Auto Scaling, select the desired group in the table and click Modify.
Note
You can also change settings for the Auto Scaling group on the group page.
The Auto Scaling group change wizard is similar to the group creation wizard.
In the Parameters step, you can change the following parameters:
VPC;
subnets;
launch template;
launch template version;
Note
You can change, add, or delete one or more subnets, but an Auto Scaling group can have no more than one subnet in each availability zone, and at least one subnet must be set.
In the Settings step, you can change the following settings:
maximum capacity;
minimum capacity;
desired capacity;
default cooldown;
health check grace period;
termination policy.
Note
Changing the instance scale-in protection setting does not affect existing instances. Instances created after protection is enabled cannot be deleted when the desired capacity is reduced. Conversely, instances created after protection is disabled can be deleted when the group is scaled in.
Once the parameters have been adjusted, click Next to view the changes. If everything is OK, click Change.
Note
If the desired capacity changes, the number of instances in the group will be increased or decreased to match the new value. If several subnets are set, instances will be redistributed among them as evenly as possible. If necessary, instances will be removed from one subnet and added to another.
Attach instances to the group#
You can add already existing instances to an Auto Scaling group provided that:
the instance is in the Running state;
the instance does not belong to any Auto Scaling group;
the instance primary interface is in the same subnet as the Auto Scaling group;
the group is not being deleted.
The desired capacity plus the number of instances being attached must not exceed the maximum capacity. If it does, first increase the maximum capacity so that all the required instances can be added. A maximum of 20 instances can be attached at a time.
Attached instances inherit the scale-in protection settings of the Auto Scaling group.
Instances can be attached in one of the following ways.
From the Auto Scaling group page:
Go to the section Virtual machines → Compute → Auto Scaling.
Select a group in the resource table and go to its page.
Open the Instances tab.
Click Attach.
Select instances from the list.
Click Attach to confirm the operation.
From the Instances subsection:
Go to the Instances subsection.
Select the instances to be attached in the resource table.
Click Attach to Auto Scaling group.
Select an Auto Scaling group from the drop-down list.
Click Attach to confirm the operation.
After the instances are attached, the desired capacity is increased by the number of attached instances.
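The capacity constraints for attaching instances can be checked ahead of time with a small helper. This is a sketch under the rules stated in this section; the function name is hypothetical and the helper does not call the service.

```python
MAX_ATTACH_BATCH = 20  # limit per operation, as stated in this section


def desired_after_attach(desired, maximum, attach_count):
    """Return the new desired capacity, or raise if the attach is not allowed."""
    if attach_count > MAX_ATTACH_BATCH:
        raise ValueError(f"at most {MAX_ATTACH_BATCH} instances can be attached at a time")
    new_desired = desired + attach_count
    if new_desired > maximum:
        raise ValueError(
            f"increase the maximum capacity to at least {new_desired} first"
        )
    return new_desired


print(desired_after_attach(desired=4, maximum=10, attach_count=3))  # 7
```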
Detach instances from the group#
You can remove instances from an Auto Scaling group provided that:
the instance lifecycle has the Active status;
the group is not being deleted.
Important
The remaining instances in the group may not be enough to accommodate the load. When detaching, we recommend that you immediately add instances to the group to replace those being removed.
If no instances are added to the group to replace those being removed, then the number of detached instances cannot exceed the difference between the desired and the minimum capacities of the group. In that case, to detach the required number of instances, reduce the minimum capacity first. Up to 20 instances can be detached at a time.
To detach instances:
Go to the section Virtual machines → Compute → Auto Scaling.
Select a group in the resource table and go to its page.
Open the Instances tab.
Select the instances to be detached in the resource table.
Click Detach.
In the dialog window, check the Add new instances to the group to balance the load option if you want new instances to be added to the group to replace the detached ones.
Click Detach to confirm the operation.
After removing instances from the group, its desired capacity will be reduced by the number of detached instances, unless the Add new instances to the group to balance the load option is checked.
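The detach rules can be sketched in the same spirit. The helper below is hypothetical; `add_replacements` stands for the Add new instances to the group to balance the load option.

```python
MAX_DETACH_BATCH = 20  # limit per operation, as stated in this section


def desired_after_detach(desired, minimum, detach_count, add_replacements=False):
    """Return the desired capacity after detaching, or raise if not allowed."""
    if detach_count > MAX_DETACH_BATCH:
        raise ValueError(f"at most {MAX_DETACH_BATCH} instances can be detached at a time")
    if add_replacements:
        # Detached instances are replaced, so the desired capacity is unchanged
        return desired
    if detach_count > desired - minimum:
        raise ValueError("reduce the minimum capacity or add replacement instances")
    return desired - detach_count


print(desired_after_detach(desired=6, minimum=4, detach_count=2))  # 4
print(desired_after_detach(desired=6, minimum=4, detach_count=3, add_replacements=True))  # 6
```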
Ensure instance scale-in protection#
If you want a particular instance or set of instances not to be deleted when the desired capacity is reduced, then you can enable instance scale-in protection.
Scale-in protection can be configured for an entire group or for specific instances. For a group, it can be enabled when the group is created or later, using the dialog for changing an Auto Scaling group. The group setting is applied by default to all instances created in the group.
Scale-in protection does not prevent instance deletion when:
the Auto Scaling group to which the instance belongs is deleted;
the instance is deleted manually via the web interface or API;
the instance is found to be unhealthy and therefore must be replaced.
Note
If, when the group is scaled in, the desired capacity turns out to be less than the number of instances protected against deletion, then the required number of instances cannot be deleted. The group remains in the Updating capacity status until the protection is disabled for the required number of instances or the desired capacity becomes equal to the number of instances protected against deletion.
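The situation described in this note can be sketched as follows. The function is hypothetical and ignores termination policies for brevity; it only shows why a scale-in may stay incomplete while too many instances are protected.

```python
def plan_scale_in(instance_ids, protected_ids, desired):
    """Return (ids_to_delete, blocked_count) for a scale-in to `desired`."""
    excess = max(len(instance_ids) - desired, 0)
    # Protected instances are never candidates for deletion on scale-in
    deletable = [i for i in instance_ids if i not in protected_ids]
    to_delete = deletable[:excess]
    # blocked_count > 0 corresponds to the group staying in Updating capacity
    blocked_count = excess - len(to_delete)
    return to_delete, blocked_count


print(plan_scale_in(["i-1", "i-2", "i-3", "i-4"], {"i-1", "i-2", "i-3"}, desired=2))
# (['i-4'], 1)
```

Here one instance cannot be removed until protection is disabled for it or the desired capacity is raised to 3.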
Enable scale-in protection for an instance#
To enable scale-in protection for specific instances:
Go to the section Virtual machines → Compute → Auto Scaling.
Click the group name in the resource table to go to the group page.
Open the Instances tab.
In the table, select the instances whose scale-in protection settings you want to change. To enable protection, instances must be in the Active state.
Click Enable scale-in protection to enable the option.
In the dialog window, check the list of instances for which protection is to be enabled, and confirm the action.
Disable scale-in protection for an instance#
To disable scale-in protection for specific instances:
Go to the section Virtual machines → Compute → Auto Scaling.
Click the group name in the resource table to go to the group page.
Open the Instances tab.
In the table, select the instances whose protection settings you want to change. To disable protection, instances must be in the Active state.
Click Disable scale-in protection to disable the feature.
In the dialog window, check the list of instances for which protection is to be disabled, and confirm the action.
Remove scaling constraints#
Scaling (creating and deleting instances) of a particular Auto Scaling group may become impossible. For example, new instances may fail to be created because the launch template uses an outdated image. If this is the case, you must manually remove the cause that is blocking scaling. In this example, you should specify a template or template version with a supported image for the Auto Scaling group.
You can check the scaling status in the resource table in Virtual machines → Compute → Auto Scaling. If instances could not be created or deleted during scaling, the value of the Scaling field of the Auto Scaling group changes from Available to Restricted.
Note
If you have configured notifications of the respective events, you will receive an email saying that the Auto Scaling group cannot be scaled.
For details on why scaling has failed for a particular group, open the Information tab on its page. To go to the Auto Scaling group page, click the group name in the resource table. The Information tab shows the reason why instances could not be created and/or deleted.
Note
Instances may fail to be created for various reasons, but there is only one reason why they cannot be deleted: all remaining instances have scale-in protection enabled.
Delete an Auto Scaling group#
Attention
Deleting a group permanently deletes its instances. This operation cannot be undone. Instances will be deleted, regardless of their scale-in protection setting.
To delete an Auto Scaling group, go to the Auto Scaling subsection, select the desired group in the table and click Delete. To confirm the deletion, enter the group name and click Yes, delete.
Scaling policies#
The Auto Scaling service supports dynamic scaling, i.e. automatically adding instances to a group or terminating them based on instance metrics, such as average CPU utilization.
K2 Cloud currently supports a simple scaling policy only: the number of instances in a group changes by a fixed number, regardless of the target metric deviation. For example, if the average CPU utilization threshold is set to 80%, then the same number of instances will be added regardless of whether the actual utilization is 85%, 90%, or 95%.
Create a scaling policy#
To define a scaling policy for an Auto Scaling group, open the Policies tab on that group’s page. To do this, go to the Auto Scaling subsection, click the group name in the groups table and select the Policies tab. Then click Create and set required parameters in the dialog window.
A scaling policy has the following parameters:
Type — The policy type. Currently, only the SimpleScaling type is supported, and it cannot be changed.
Name — The policy name. It must be unique within an Auto Scaling Group.
Cooldown — The time after which a scaling policy can be executed again. It can be decreased or increased compared to the default value, depending on how long it takes to execute a specific policy.
Action — The action to be performed when the alarm is triggered.
Instances — The number of instances to be added to or terminated in the group. This parameter is available when you select the Add instances, Remove instances, or Set capacity equal to action.
Percentage — The percentage by which the group capacity changes. This parameter is available only when you select the Add capacity percentage or Remove capacity percentage action.
Number of instances, at least — The number of instances by which the group capacity will change if the number of instances calculated by percentage turns out to be zero. This parameter is available only when you select the Add capacity percentage or Remove capacity percentage action.
Once you have set the policy parameters, click Create.
Note
As a result of the policy execution, the group capacity cannot go beyond the specified interval (maximum or minimum capacity). For example, if there are 8 instances in a group, the policy requires adding 3 more instances, but the maximum size is 10, then only 2 instances will be added. If the maximum/minimum group capacity is reached, then further addition/removing of instances is not possible. If you want to make it possible, change the maximum/minimum group capacity.
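The effect of a simple scaling policy on the group capacity can be sketched as below. This is an illustration rather than the service's algorithm: the function name is hypothetical, the percentage is assumed to be taken from the current capacity, and the result of the percentage calculation is assumed to be rounded down.

```python
def apply_simple_policy(current, minimum, maximum, action, value, at_least=1):
    """Return the group capacity after a simple scaling policy is executed."""
    if action == "Add instances":
        target = current + value
    elif action == "Remove instances":
        target = current - value
    elif action == "Set capacity equal to":
        target = value
    elif action in ("Add capacity percentage", "Remove capacity percentage"):
        # Assumption: percentage of the current capacity, rounded down
        step = current * value // 100
        if step == 0:
            # "Number of instances, at least" applies when the result is zero
            step = at_least
        target = current + step if action.startswith("Add") else current - step
    else:
        raise ValueError(f"unknown action: {action}")
    # The capacity never leaves the [minimum, maximum] interval
    return max(minimum, min(maximum, target))


# Example from the note above: 8 instances, policy adds 3, maximum is 10
print(apply_simple_policy(8, 1, 10, "Add instances", 3))  # 10, so only 2 are added
```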
Note
When creating a policy that increases/decreases the number of instances in a group, we recommend also creating another policy that decreases/increases it respectively. For example, if one policy adds instances to a group when CPUUtilization exceeds a specified threshold, then another one should remove instances from the group when the metric drops below a predefined level.
Execute scaling policy automatically#
For the scaling policy to be executed automatically, create an alarm for the selected metric and associate the scaling policy with it. To learn how to create an alarm for an Auto Scaling group and associate a policy, read the alarm documentation.
When you create a scaling policy, it is enabled by default. To enable or disable the policy, go to the Auto Scaling subsection and click the group name in the table. After the group page opens, go to the Policies tab and turn the On/Off switch for this policy to the desired position.
Execute scaling policy manually#
You can execute a scaling policy at any time without waiting for an alarm to trigger. To do this, go to the Auto Scaling subsection and click the group name in the table. After the group page opens, open the Policies tab, select the policy you want to execute, and click Execute.
Note
If you manually execute the policy in the web interface, the Cooldown option of the executed policy is taken into account. If the policy is executed before the specified cooldown period elapses, the group size will not change.
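The cooldown check can be pictured with the sketch below. It assumes the cooldown is counted from the moment the policy was last executed; the function name and the use of plain timestamps in seconds are illustrative only.

```python
def can_execute(last_executed_at, now, cooldown_s=300):
    """Return True if the policy may change the group size at time `now`.

    Timestamps are seconds; `last_executed_at` is None if the policy has
    never been executed.
    """
    if last_executed_at is None:
        return True
    # Assumption: the cooldown starts when the policy was last executed
    return now - last_executed_at >= cooldown_s


print(can_execute(last_executed_at=1000, now=1120))  # False, 120 s < 300 s
print(can_execute(last_executed_at=1000, now=1300))  # True
```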
Change a scaling policy#
To change a scaling policy, go to the Auto Scaling subsection and click the group name in the table. After the group page opens, go to the Policies tab, select the policy you want to change, and click Change. The dialog for policy change is similar to that for policy creation. You can redefine any parameters except for the policy name and type.
To change the policy execution criteria, go to the section Monitoring → Alarms and, in the table, select the alarm with which the policy is associated. Click Modify and change the alarm parameters in the dialog window. In particular, you can update its triggering condition. For example, you can lower the CPUUtilization threshold so that additional instances are launched before CPU utilization reaches critical levels.
Delete a scaling policy#
To delete a scaling policy, go to the Auto Scaling subsection and click the group name in the table. Once the group page opens, go to the Policies tab, select the policy you want to delete, and click Delete. In the dialog window, confirm the deletion of the selected policy (click Delete again).
Note
Don’t forget to also delete the policy from the list of alarm actions it was associated with.
Activity notifications#
You can configure email notifications to be sent when instances are added to or removed from an Auto Scaling group during scaling.
Depending on the activity type, notifications can be sent to different targets. To do this, set up the appropriate notification configurations. If the same email address is included in multiple configurations, the activities from those configurations are aggregated, so only one message is sent to this address.
You can set up to 10 activity notification configurations per Auto Scaling group and specify up to 10 email addresses per configuration. To raise these limits, contact support.
Note
You can set up, modify, or delete notification configurations only when the Auto Scaling group status is Ready.
Set up activity notifications#
Go to the section Virtual machines → Compute → Auto Scaling.
In the resource table, find the Auto Scaling group for which you want to set up activity notifications. Click the group name to go to its page.
Open the Notification configurations tab and click Setup.
In the dialog window that opens, specify the following parameters:
Name — The notification configuration name may only contain Roman letters, digits, and the period (.), hyphen (-), and underscore (_) characters.
Activity types — Choose at least one of the following supported activity types:
Instance launched
Instance launch error
Instance terminated
Instance termination error
Targets — Enter comma-separated email addresses to which notifications of the chosen activity types will be sent. When any of the chosen activities occurs, notifications will be sent to all targets specified in the notification configuration. At least one target must be specified.
Click Setup.
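The constraints on a notification configuration can be checked before submitting the form, for example with the sketch below. The function is hypothetical, "Roman letters" is assumed to mean A-Z and a-z, and the limit of 10 addresses is the default one mentioned earlier.

```python
import re

# Assumption: "Roman letters" means the ASCII letters A-Z and a-z
NAME_PATTERN = re.compile(r"[A-Za-z0-9._-]+")
ACTIVITY_TYPES = {
    "Instance launched",
    "Instance launch error",
    "Instance terminated",
    "Instance termination error",
}
MAX_TARGETS = 10  # default limit per configuration


def validate_notification_config(name, activity_types, targets_csv):
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name contains unsupported characters or is empty")
    chosen = set(activity_types)
    if not chosen:
        problems.append("choose at least one activity type")
    elif not chosen <= ACTIVITY_TYPES:
        problems.append("unsupported activity type")
    # Targets are entered as comma-separated email addresses
    targets = [t.strip() for t in targets_csv.split(",") if t.strip()]
    if not targets:
        problems.append("specify at least one target")
    elif len(targets) > MAX_TARGETS:
        problems.append(f"no more than {MAX_TARGETS} targets are allowed")
    return problems


print(validate_notification_config(
    "scale-events_1", ["Instance launched"], "ops@example.com, dev@example.com"
))  # []
```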
Modify notification configuration#
Go to the section Virtual machines → Compute → Auto Scaling.
In the resource table, find the Auto Scaling group for which you want to modify a notification configuration. Click the group name to go to its page.
Open the Notification configurations tab, select the configuration in the resource table and click Modify.
In the window that opens, you can choose other activity types and add or remove email addresses.
Click Modify to save the changes.
Delete notification configuration#
Go to the section Virtual machines → Compute → Auto Scaling.
In the resource table, find the Auto Scaling group from which you want to delete a notification configuration. Click the group name to go to its page.
Open the Notification configurations tab, select the configuration in the resource table and click Delete.
In the dialog window, confirm the action.
Viewing group history#
Every activity involving instances in an Auto Scaling group is stored in the activity history. An activity is recorded each time an instance is launched in the Auto Scaling group and each time one is terminated.
To view the activity history, go to the Auto Scaling subsection and click the group name in the table. After the group page opens, go to the Activity history tab.
The tab displays the history of launching and removing group instances over the last week, including the activity status (success or failure), brief description, start and end time, and the reason for performing the activity.
Auto Scaling group information#
For information about the group parameters, status and composition, scaling policies and activity history, open the Auto Scaling group page. To go to a specific group page, click the link with its name in the groups table in the Auto Scaling subsection.
The Information tab contains data about the launch template and its version used; current group status and scale-in protection settings; subnet and availability zone where the instances are deployed; default Cooldown and health check delays; desired, minimum and maximum group size; and current number of instances in the group. Here, you can change the group settings and delete the group.
The Activity history tab displays the history of launching and terminating group instances over the last week, including the activity status (success or failure), brief description, start and end time, and the reason for performing the activity.
The Policies tab contains the list of policies you created, including policy name, type, status (On/Off), Cooldown, and the action to execute. Here, you can create a new scaling policy, enable/disable a policy, execute a policy manually or delete an existing policy.
The Instances tab displays information about the group instances. Here, you can also attach and detach instances, and enable or disable scale-in protection for specific instances.