Using scheduled queries with Amazon Timestream for LiveAnalytics

Timestream for LiveAnalytics is a serverless, purpose-built database for processing time-series data. While it efficiently ingests and stores data, querying large raw datasets frequently can be inefficient, especially for use cases like aggregations, trend analysis, or anomaly detection.
Scheduled Queries help address this by running SQL queries at specified intervals, rolling up raw data into aggregated results and storing them in a destination Timestream table. This approach improves query performance on the target tables and optimizes storage by retaining only the aggregated data.
In this post, we’ll walk through setting up Timestream Scheduled Queries to automate data rollups. We’ll also explore how this setup helps you analyze and detect trends in your data over time.
What are we going to build?
As a use case, we’ll create a basic clickstream API that allows clients to push events, such as clicks, views, and user actions. These events will be ingested into a Timestream table and made available for consumption by downstream systems. A scheduled query will run periodically, performing an hourly rollup of the raw event data. Based on this rolled-up data, we’ll detect trending products.
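To make the event shape concrete, here is a minimal sketch of a producer pushing one event into the stream with boto3; the stream name and the event fields (eventType, productId, userId) are illustrative assumptions, not a fixed contract of the final solution.

```python
import json
import boto3

kinesis = boto3.client("kinesis")

# Hypothetical clickstream event; field names are illustrative only.
event = {
    "eventType": "view",          # click | view | add_to_cart | ...
    "productId": "prod-42",
    "userId": "user-123",
    "timestamp": "2024-01-01T10:15:30Z",
}

# Push the event to the stream that EventBridge Pipes will poll.
kinesis.put_record(
    StreamName="clickstream-events",            # assumed stream name
    Data=json.dumps(event).encode("utf-8"),
    PartitionKey=event["productId"],
)
```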
Let’s dive into this:
Time Series store: With EventBridge Pipes supporting Timestream as a direct target, integrating with a Kinesis stream source is straightforward. This eliminates much of the custom glue code needed for ingestion. In this setup, EventBridge Pipes polls the Kinesis stream, converts the received events into records, and writes them directly to the Timestream table.
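As a rough sketch of that wiring, the pipe could be created with the Pipes API as below. The ARNs, role, dimension names, and JSON paths are placeholders, and the exact TimestreamParameters fields should be double-checked against the Pipes documentation for the Timestream target.

```python
import boto3

pipes = boto3.client("pipes")

# Sketch only: ARNs, role, and mappings are placeholders.
pipes.create_pipe(
    Name="clickstream-to-timestream",
    RoleArn="arn:aws:iam::123456789012:role/pipe-role",
    Source="arn:aws:kinesis:eu-west-1:123456789012:stream/clickstream-events",
    SourceParameters={
        "KinesisStreamParameters": {
            "StartingPosition": "LATEST",
            "BatchSize": 100,
        }
    },
    Target="arn:aws:timestream:eu-west-1:123456789012:database/clickstream/table/raw_events",
    TargetParameters={
        "TimestreamParameters": {
            # Map the record payload onto the Timestream data model.
            # JSON paths assume the decoded record payload is exposed under
            # $.data; verify this against the Pipes docs for Kinesis sources.
            "TimeValue": "$.data.timestamp",
            "TimeFieldType": "TIMESTAMP_FORMAT",
            "TimestampFormat": "yyyy-MM-dd'T'HH:mm:ss'Z'",
            "VersionValue": "1",
            "DimensionMappings": [
                {"DimensionName": "productId", "DimensionValue": "$.data.productId", "DimensionValueType": "VARCHAR"},
                {"DimensionName": "eventType", "DimensionValue": "$.data.eventType", "DimensionValueType": "VARCHAR"},
            ],
            "SingleMeasureMappings": [
                {"MeasureName": "userId", "MeasureValue": "$.data.userId", "MeasureValueType": "VARCHAR"},
            ],
        }
    },
)
```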
A scheduled aggregation query runs at predefined intervals to process raw events, performing an hourly rollup and storing the results in a dedicated table. Once the query completes successfully, it publishes a notification event that triggers a function to query the rollup table and detect the top N trending products, which are then published to a topic. These events could serve various purposes, such as adjusting ad spend or fine-tuning A/B tests. Additionally, the rolled-up data can be fed into dashboards like QuickSight, Tableau, or Grafana for visualizing and monitoring product performance.
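To give a feel for that configuration, here is a hedged sketch using the CreateScheduledQuery API. The database, table, and column names are assumptions, and the SQL simply counts events per product and event type in 1-hour bins.

```python
import boto3

tsq = boto3.client("timestream-query")

# Hourly rollup: count events per product per 1-hour bin.
# Database, table, and column names are placeholders.
ROLLUP_SQL = """
SELECT productId,
       eventType,
       bin(time, 1h) AS hour_bucket,
       COUNT(*) AS event_count
FROM "clickstream"."raw_events"
WHERE time BETWEEN bin(@scheduled_runtime, 1h) - 1h AND bin(@scheduled_runtime, 1h)
GROUP BY productId, eventType, bin(time, 1h)
"""

tsq.create_scheduled_query(
    Name="hourly-clickstream-rollup",
    QueryString=ROLLUP_SQL,
    ScheduleConfiguration={"ScheduleExpression": "rate(1 hour)"},
    NotificationConfiguration={
        # Completion notification that triggers the trending-products function.
        "SnsConfiguration": {"TopicArn": "arn:aws:sns:eu-west-1:123456789012:rollup-complete"}
    },
    TargetConfiguration={
        "TimestreamConfiguration": {
            "DatabaseName": "clickstream",
            "TableName": "hourly_rollup",
            "TimeColumn": "hour_bucket",
            "DimensionMappings": [
                {"Name": "productId", "DimensionValueType": "VARCHAR"},
                {"Name": "eventType", "DimensionValueType": "VARCHAR"},
            ],
            "MultiMeasureMappings": {
                "TargetMultiMeasureName": "metrics",
                "MultiMeasureAttributeMappings": [
                    {"SourceColumn": "event_count", "MeasureValueType": "BIGINT"}
                ],
            },
        }
    },
    ScheduledQueryExecutionRoleArn="arn:aws:iam::123456789012:role/scheduled-query-role",
    ErrorReportConfiguration={
        "S3Configuration": {"BucketName": "my-scheduled-query-errors"}
    },
)
```

The SnsConfiguration topic is what downstream consumers, such as the trending-products function, subscribe to.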
☝️Note: While I won't go into detail in this post, it's worth noting that a common way to reduce writes to the raw table is to pre-aggregate data in small batches, for example using a 5-minute tumbling window. This groups events into fixed, non-overlapping intervals before writing, effectively lowering the write frequency. Managed Apache Flink can help achieve this.
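Just to illustrate the idea outside of Flink, a tumbling window boils down to truncating each event timestamp to its 5-minute boundary and aggregating within that bucket before writing:

```python
from collections import Counter
from datetime import datetime, timezone

WINDOW_SECONDS = 5 * 60

def window_start(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its 5-minute tumbling window."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % WINDOW_SECONDS, tz=timezone.utc)

# Example: events collapse into one pre-aggregated record per (product, window).
events = [
    ("prod-42", datetime(2024, 1, 1, 10, 1, 12, tzinfo=timezone.utc)),
    ("prod-42", datetime(2024, 1, 1, 10, 3, 47, tzinfo=timezone.utc)),
    ("prod-7",  datetime(2024, 1, 1, 10, 6, 5,  tzinfo=timezone.utc)),
]
counts = Counter((product, window_start(ts)) for product, ts in events)
# 3 raw events become 2 writes: (prod-42, 10:00) -> 2, (prod-7, 10:05) -> 1
```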
Let’s see the code
In this section, I will mainly focus on the integration with EventBridge Pipes as well as the scheduled query configuration. You can find the complete end-to-end solution at the following link.
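The trending-products function itself is not the focus of that walkthrough, but a rough sketch of it could look like the following. The table, column, and topic names are assumptions; the function reacts to the scheduled query's completion notification, ranks products by event count over the last hour, and publishes the result.

```python
import json
import os
import boto3

tsq = boto3.client("timestream-query")
sns = boto3.client("sns")

TOP_N = 5
# Assumed topic; in practice this would come from the stack's configuration.
TRENDING_TOPIC_ARN = os.environ.get(
    "TRENDING_TOPIC_ARN", "arn:aws:sns:eu-west-1:123456789012:trending-products"
)

# Placeholder query against the rollup table produced by the scheduled query.
TOP_PRODUCTS_SQL = f"""
SELECT productId, SUM(event_count) AS total_events
FROM "clickstream"."hourly_rollup"
WHERE time > ago(1h)
GROUP BY productId
ORDER BY total_events DESC
LIMIT {TOP_N}
"""

def handler(event, context):
    """Triggered by the scheduled query completion notification."""
    rows = tsq.query(QueryString=TOP_PRODUCTS_SQL)["Rows"]
    trending = [
        {
            "productId": row["Data"][0]["ScalarValue"],
            "totalEvents": int(row["Data"][1]["ScalarValue"]),
        }
        for row in rows
    ]
    # Publish the top N products for downstream consumers.
    sns.publish(TopicArn=TRENDING_TOPIC_ARN, Message=json.dumps({"trending": trending}))
    return trending
```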