Lambda

Kinesis Data Streams

Scaling is based on stream capacity as determined by the number and size of incoming records written to the stream. By default, the scaling behavior follows this pattern:

If stream utilization is consistently greater than 70% of the Kinesis service limits within a 5-minute period, then scale up
If stream utilization is consistently less than 35% of the Kinesis service limits within a 60-minute period, then scale down
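
As a rough illustration of the default pattern above, the decision can be sketched as a check over recent utilization data points. This is a minimal sketch, not the application's implementation: the function name is hypothetical, it assumes one data point per minute (so 5 data points cover 5 minutes and 60 cover 60 minutes), and it interprets "consistently" as every data point in the window crossing the threshold.

```python
from typing import Sequence

# Defaults taken from the pattern described above.
UPSCALE_THRESHOLD = 0.70    # 70% of the Kinesis service limits
DOWNSCALE_THRESHOLD = 0.35  # 35% of the Kinesis service limits
UPSCALE_DATAPOINTS = 5      # assumed: one data point per minute
DOWNSCALE_DATAPOINTS = 60   # assumed: one data point per minute


def scaling_decision(utilization: Sequence[float]) -> str:
    """Return "up", "down", or "none" for a utilization series.

    `utilization` holds fractions of the service limits (0.0 to 1.0),
    ordered oldest first. Hypothetical helper for illustration only.
    """
    recent_up = utilization[-UPSCALE_DATAPOINTS:]
    if len(recent_up) == UPSCALE_DATAPOINTS and all(
        u > UPSCALE_THRESHOLD for u in recent_up
    ):
        return "up"

    recent_down = utilization[-DOWNSCALE_DATAPOINTS:]
    if len(recent_down) == DOWNSCALE_DATAPOINTS and all(
        u < DOWNSCALE_THRESHOLD for u in recent_down
    ):
        return "down"

    return "none"
```

Under these assumptions, a single data point below 70% within the last 5 minutes is enough to prevent a scale-up, which is why the shorter window reacts quickly to sustained load while the longer window avoids scaling down during brief lulls.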

The scaling behavior is customizable using environment variables:

AUTOSCALE_KINESIS_THRESHOLD: The utilization threshold that triggers a scaling event. The default value is 0.7 (70%), but it can be set to any value between 0.4 (40%) and 0.9 (90%). If the threshold is low, then the stream is more sensitive to scaling up and less sensitive to scaling down. If the threshold is high, then the stream is less sensitive to scaling up and more sensitive to scaling down.
AUTOSCALE_KINESIS_UPSCALE_DATAPOINTS: The number of data points required to scale up. The default value is 5, but it can be set to any value. The number of data points directly affects the evaluation period. Use a higher value to reduce the frequency of scaling up.
AUTOSCALE_KINESIS_DOWNSCALE_DATAPOINTS: The number of data points required to scale down. The default value is 60, but it can be set to any value. The number of data points directly affects the evaluation period. Use a higher value to reduce the frequency of scaling down.
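
To make the defaults and the allowed threshold range concrete, the following sketch reads the three variables from the environment. It is an illustration only: the `AutoscaleSettings` type and `load_settings` function are hypothetical names, and rejecting out-of-range or non-positive values with an error is an assumption, since the text above does not say how the application handles invalid settings.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AutoscaleSettings:
    """Hypothetical container for the documented settings."""
    threshold: float
    upscale_datapoints: int
    downscale_datapoints: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> AutoscaleSettings:
    """Read the autoscaling variables, applying the documented defaults."""
    env = os.environ if env is None else env

    threshold = float(env.get("AUTOSCALE_KINESIS_THRESHOLD", "0.7"))
    # Documented range is 0.4 (40%) to 0.9 (90%); raising here is an assumption.
    if not 0.4 <= threshold <= 0.9:
        raise ValueError("AUTOSCALE_KINESIS_THRESHOLD must be between 0.4 and 0.9")

    upscale = int(env.get("AUTOSCALE_KINESIS_UPSCALE_DATAPOINTS", "5"))
    downscale = int(env.get("AUTOSCALE_KINESIS_DOWNSCALE_DATAPOINTS", "60"))
    # Assumed: a data point count only makes sense as a positive integer.
    if upscale < 1 or downscale < 1:
        raise ValueError("data point counts must be positive integers")

    return AutoscaleSettings(threshold, upscale, downscale)
```

For example, `load_settings({"AUTOSCALE_KINESIS_UPSCALE_DATAPOINTS": "10"})` keeps the default threshold and downscale count while doubling the number of data points required before scaling up.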

Shards will not scale evenly, but the autoscaling functionality follows AWS best practices for resharding streams. The UpdateShardCount API call has many limitations that the application is designed to address, but there may be times when these limits cannot be avoided; if any limits are met, then users should request a service limit increase from AWS. Reaching a limit is rare, but the service limits that users are most likely to encounter are:

Scaling a stream more than 10 times per rolling 24-hour period
Scaling a stream beyond 10,000 shards
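
The two limits above can be expressed as a simple pre-check before a scaling request is made. This is a hedged sketch rather than part of the application: `can_scale` is a hypothetical helper, the constants mirror the numbers listed above, and the quotas that apply to a given AWS account may differ.

```python
from datetime import datetime, timedelta
from typing import Sequence

# Limits as listed above; actual account quotas may differ.
MAX_SCALING_OPS_PER_DAY = 10
MAX_SHARDS = 10_000


def can_scale(
    previous_ops: Sequence[datetime], target_shards: int, now: datetime
) -> bool:
    """Return True if another scaling operation stays within both limits.

    `previous_ops` holds the timestamps of earlier scaling operations
    on the stream. Hypothetical helper for illustration only.
    """
    window_start = now - timedelta(hours=24)
    # Count only the operations inside the rolling 24-hour window.
    recent = [t for t in previous_ops if t > window_start]
    within_rate = len(recent) < MAX_SCALING_OPS_PER_DAY
    within_size = target_shards <= MAX_SHARDS
    return within_rate and within_size
```

Because the window is rolling, an operation that was blocked can become possible again as soon as the oldest of the 10 recent operations is more than 24 hours old.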

We recommend using one autoscaling service for an entire Substation deployment, but multiple services can be used if needed. For example, one can be assigned to data pipelines that have predictable traffic (e.g., steady stream utilization) and another can be assigned to data pipelines that have unpredictable traffic (e.g., sporadic or bursty stream utilization).