AWS Serverless Scaling Considerations for Kinesis Data Stream
Scaling Considerations for Kinesis Data Stream
Kinesis Data Streams are intended to handle massive amounts of data.
Stream throughput and processing depend on the number of shards.
Lambda polls each shard, retrieves records in batches, and by default invokes your function with one batch per shard at a time.
If your function can't process one record in a batch, processing of the whole shard stops.
By default, Lambda retries the batch until it succeeds or the records pass the data retention period.
To keep processing the remaining records, your function should catch errors and log them.
You can use Amazon CloudWatch to store the error logs.
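The catch-and-log approach can be sketched as a minimal Python handler. This is an illustration under assumptions, not a definitive implementation: `process_record` and the returned summary dictionary are hypothetical, and the only event fields used are the base64-encoded `data` and the `sequenceNumber` that Lambda includes in each Kinesis record.

```python
import base64
import json
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)


def process_record(payload: dict) -> None:
    """Hypothetical business logic: reject payloads without an 'id'."""
    if "id" not in payload:
        raise ValueError("payload is missing 'id'")


def handler(event, context):
    """Process every record in the batch and log failures instead of raising."""
    records = event.get("Records", [])
    failed = []
    for record in records:
        kinesis = record.get("kinesis", {})
        sequence_number = kinesis.get("sequenceNumber", "unknown")
        try:
            # Kinesis record data arrives base64-encoded.
            payload = json.loads(base64.b64decode(kinesis["data"]))
            process_record(payload)
        except Exception:
            # Inside Lambda, this log output is sent to CloudWatch Logs.
            logger.exception("Failed to process record %s", sequence_number)
            failed.append(sequence_number)
    # Hypothetical summary, returned only to make the sketch easy to inspect.
    return {"processed": len(records) - len(failed), "failed": failed}
```

Because the handler swallows the error, Lambda treats the batch as successful and moves on, so a failed record is only recoverable from the logs.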
You can adjust failure handling with these settings:
- Splitting the batch on a function error
- A maximum record age
- A maximum number of retry attempts
- A failure destination
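These four options correspond to parameters of a Lambda event source mapping. The sketch below builds them as keyword arguments. The helper names, default values, and the example ARN are assumptions for illustration, and the range checks reflect the limits in the Lambda API reference as understood at the time of writing, so verify them before relying on them.

```python
def build_failure_handling_settings(
    failure_destination_arn: str,
    max_retry_attempts: int = 3,
    max_record_age_seconds: int = 3600,
    bisect_on_error: bool = True,
) -> dict:
    """Return failure-handling keyword arguments for an event source mapping."""
    # -1 means "retry until the record expires" (assumed documented range).
    if not -1 <= max_retry_attempts <= 10000:
        raise ValueError("max_retry_attempts must be between -1 and 10000")
    # -1 means "no age limit"; otherwise 60 seconds to 7 days (assumed range).
    if max_record_age_seconds != -1 and not 60 <= max_record_age_seconds <= 604800:
        raise ValueError("max_record_age_seconds must be -1 or 60..604800")
    return {
        "BisectBatchOnFunctionError": bisect_on_error,
        "MaximumRecordAgeInSeconds": max_record_age_seconds,
        "MaximumRetryAttempts": max_retry_attempts,
        "DestinationConfig": {"OnFailure": {"Destination": failure_destination_arn}},
    }


def apply_settings(mapping_uuid: str, settings: dict) -> dict:
    """Apply the settings to an existing mapping (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without AWS access

    client = boto3.client("lambda")
    return client.update_event_source_mapping(UUID=mapping_uuid, **settings)
```

For example, `build_failure_handling_settings("arn:aws:sqs:us-east-1:123456789012:failed-records")` uses a placeholder SQS queue ARN as the failure destination.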
How many shards you need depends on how much data you intend to write.
Each shard accepts up to 1,000 records per second or 1 MB of data per second.
For example, 4,000 records per second or 4 MB of data per second requires four shards.
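The shard arithmetic can be sketched as a small Python function. It assumes the per-shard write limits implied by the example above (1,000 records per second and 1 MB per second per shard). The function name is hypothetical.

```python
import math

RECORDS_PER_SECOND_PER_SHARD = 1000  # write limit per shard
MB_PER_SECOND_PER_SHARD = 1.0        # write limit per shard


def required_shards(records_per_second: float, mb_per_second: float) -> int:
    """Return the smallest shard count that satisfies both write limits."""
    by_records = math.ceil(records_per_second / RECORDS_PER_SECOND_PER_SHARD)
    by_volume = math.ceil(mb_per_second / MB_PER_SECOND_PER_SHARD)
    # A stream always has at least one shard.
    return max(1, by_records, by_volume)


print(required_shards(4000, 4.0))  # 4
```

Whichever limit is reached first decides the count, so a stream of few but large records can need more shards than its record rate alone suggests.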
Enhanced fan-out was released to remove the read constraints of standard consumers and improve the way customers consume data.
Consumers that subscribe to the stream are called enhanced fan-out consumers.
Once subscribed, the consumer receives data from the shard over a connection that lasts up to 5 minutes.
Data will be pushed to consumers as it comes in.
This decreases latency to 50-70 ms.
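Enhanced fan-out also boosts throughput, because each consumer gets its own 2 MB per second of read throughput per shard instead of sharing it with other consumers.
It also comes at an extra cost.
You should examine your traffic and decide whether standard consumer latency is acceptable.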
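One way to examine the throughput side of this decision is to estimate the read bandwidth each consumer gets. The sketch below assumes the documented read limit of 2 MB per second per shard, shared by all standard consumers but dedicated to each enhanced fan-out consumer. The function name is hypothetical.

```python
SHARD_READ_MB_PER_SECOND = 2.0  # read limit per shard


def read_mb_per_second_per_consumer(
    shards: int, consumers: int, enhanced_fan_out: bool
) -> float:
    """Estimate the read throughput available to each consumer of a stream."""
    if shards < 1 or consumers < 1:
        raise ValueError("shards and consumers must be at least 1")
    stream_total = shards * SHARD_READ_MB_PER_SECOND
    if enhanced_fan_out:
        # Each enhanced fan-out consumer gets the full per-shard rate.
        return stream_total
    # Standard consumers share the per-shard rate between them.
    return stream_total / consumers
```

With four shards and four consumers, this gives 2.0 MB per second per standard consumer and 8.0 MB per second per enhanced fan-out consumer.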
If your Lambda function takes too long or repeatedly fails to process a batch, it can fall behind, and records still waiting in the stream may expire before they are processed.
Related reads: Reading Data from Amazon Kinesis Data Streams
Using Consumers with Enhanced Fan-Out