Autoscale

Autoscale AWS services using SNS and CloudWatch Alarms.

Kinesis Data Streams

Streams scale up (out) when stream utilization is greater than 75% of the Kinesis service limits within a 5 minute period, and scale down (in) when stream utilization is less than 25% of the Kinesis service limits within a 60 minute period. In both cases, the shard count changes by 50%.

Stream utilization is based on volume (e.g., 60,000 events per minute) and size (e.g., 10GB data per minute). Each value is converted to a fraction of the corresponding Kinesis service limit (0.0 to 1.0), and the greater of the two is the stream's current utilization.
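
The calculation described above can be sketched as follows. This is an illustration only, not the application's code: the function name and constants are hypothetical, and the per-shard write limits (1,000 records per second and 1 MiB per second) are assumed from the standard Kinesis quotas.

```python
# Assumed per-shard write limits, expressed per minute.
RECORDS_PER_SHARD_PER_MINUTE = 1_000 * 60
BYTES_PER_SHARD_PER_MINUTE = 1024 * 1024 * 60


def stream_utilization(shards, records_per_minute, bytes_per_minute):
    """Return utilization as a fraction of the stream's write limits."""
    volume = records_per_minute / (shards * RECORDS_PER_SHARD_PER_MINUTE)
    size = bytes_per_minute / (shards * BYTES_PER_SHARD_PER_MINUTE)
    # The larger of the two fractions is the stream's utilization.
    return max(volume, size)
```

With 10 shards, 450,000 records per minute is 75% of the assumed record limit, so the stream sits exactly at the upper threshold unless the size fraction is higher.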

By default, streams must be above the upper threshold for all 5 minutes to scale up and below the lower threshold for at least 57 of the 60 minutes to scale down. These values can be overridden by the environment variables SUBSTATION_KINESIS_UPSCALE_DATAPOINTS (cannot exceed 5 minutes) and SUBSTATION_KINESIS_DOWNSCALE_DATAPOINTS (cannot exceed 60 minutes).
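
A minimal sketch of how these datapoint settings could drive a scaling decision, assuming one utilization datapoint per minute. The function names are hypothetical, and clamping an out-of-range override to its maximum is an assumption; the application may handle invalid values differently.

```python
import os

UPPER_THRESHOLD = 0.75
LOWER_THRESHOLD = 0.25


def datapoints_from_env(name, default, maximum, env=None):
    """Read a datapoint override, falling back to the default."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None:
        return default
    # Assumption: values outside 1..maximum are clamped, not rejected.
    return max(1, min(int(raw), maximum))


def decide(utilization_per_minute, upscale_datapoints=5, downscale_datapoints=57):
    """Return "up", "down", or "none" for a list of per-minute utilization values."""
    last_5 = utilization_per_minute[-5:]
    last_60 = utilization_per_minute[-60:]
    above = sum(1 for u in last_5 if u > UPPER_THRESHOLD)
    below = sum(1 for u in last_60 if u < LOWER_THRESHOLD)
    if above >= upscale_datapoints:
        return "up"
    if below >= downscale_datapoints:
        return "down"
    return "none"
```

Lowering SUBSTATION_KINESIS_UPSCALE_DATAPOINTS makes upscaling more sensitive to short bursts; lowering SUBSTATION_KINESIS_DOWNSCALE_DATAPOINTS makes downscaling more aggressive.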

For example:

  • If a stream is configured with 10 shards and it triggers the upscale alarm, then the stream is scaled up to 15 shards
  • If a stream is configured with 10 shards and it triggers the downscale alarm, then the stream is scaled down to 5 shards
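
The arithmetic behind these examples can be sketched as below. The function names are hypothetical, and the handling of odd shard counts (rounding up, never going below one shard) is an assumption that the text above does not specify.

```python
def upscale_target(shards):
    """Increase the shard count by 50%, rounding up."""
    return shards + (shards + 1) // 2


def downscale_target(shards):
    """Decrease the shard count by 50%, rounding up, with a floor of 1."""
    return max(1, (shards + 1) // 2)


print(upscale_target(10))    # 15
print(downscale_target(10))  # 5
```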

Shards will not scale evenly, but the autoscaling functionality follows AWS best practices for resharding streams. The UpdateShardCount API call has many limitations that the application is designed to address, but there may be times when these limits cannot be avoided; if a limit is reached, then users should request a service limit increase from AWS. Reaching a limit is rare, but the service limits that users are most likely to encounter are:

  • Scaling a stream more than 10 times per 24 hour rolling period
  • Scaling a stream beyond 10,000 shards

We recommend using one autoscaling service for an entire Substation deployment, but many can be used if needed. For example, one can be assigned to data pipelines that have predictable traffic (e.g., steady stream utilization) and another can be assigned to data pipelines that have unpredictable traffic (e.g., sporadic stream utilization, bursty stream utilization).