An intro to time-series databases

Last updated: Nov 18, 2025

Time-series data is everywhere in modern systems - from IoT sensors and financial markets to application monitoring and user analytics. As organizations collect more temporal data, they need efficient ways to store, process, and analyze it.

This article explores time-series databases, their use cases, and how different database solutions handle time-based data. Whether you're dealing with millions of sensor readings, tracking user behavior, or monitoring system performance, understanding your options for time-series data storage is crucial for building effective data systems.


To get a taste of time-series analysis in action, here’s a sample query that looks at average yearly precipitation across the UK, France, and the US using weather station data from NOAA. It's a simple example, but it shows how powerful time-based queries can be for uncovering trends over time.

SELECT year,
       avg(`precipitation`) AS `avg_precipitation`,
       dictGet(`country`.`country_iso_codes`, 'name', code) AS country
FROM `noaa`.`noaa_v2`
WHERE date > '1990-01-01' AND code IN ('UK', 'FR', 'US')
GROUP BY toStartOfYear(`date`) AS `year`,
         substring(station_id, 1, 2) AS code
HAVING avg_precipitation > 0
ORDER BY country, year ASC
LIMIT 100000;

You can see more queries like this in the "Is ClickHouse a time-series database?" section.

What is time-series data? #

Let’s start by defining time-series data: datasets where observations are captured along a timeline, with each data point carrying a timestamp. These data points are collected at regular intervals—like every second, minute, or day—or at irregular intervals when events occur unpredictably.

For example, while a weather sensor might collect readings at fixed intervals (e.g., every 10 seconds), an error log in a server system records data only when an error event happens, which can be at any time. Both are forms of time-series data because they represent data that changes over time, whether those changes are periodic or sporadic.

Collection methods #

Let’s look at the methods of data collection in a little more detail:

  • Fixed interval sampling: This method captures data at consistent time points, providing a predictable and continuous stream of information. Examples include weather sensors, heart rate monitors, and energy meters. Since the points are evenly distributed, this data is beneficial for seeing trends over specific periods.
  • Event-driven data: Time-series data can also be captured irregularly, triggered by specific occurrences. For instance, a server logs data each time a particular error occurs, which could happen multiple times in one hour or not for several hours. Other examples include clickstream data from websites (where clicks happen unpredictably) or social media posts, which depend entirely on user activity.

When analyzing time-series data, we often slice or group it by different time periods to understand how it changes over time. This ability to analyze change across time is what defines time-series data—any data that changes over time belongs to this category.

Example: Weather sensor data #

Below is an example of a message that a weather sensor might generate:

{
  "device_id": "sensor-12345",
  "timestamp": "2024-12-04T10:15:00Z",
  "location": {
    "latitude": 40.7128,
    "longitude": -74.0060,
    "altitude": 15.5
  },
  "metrics": {
    "temperature": 23.5,
    "humidity": 60,
    "pressure": 1013.25,
    "battery_level": 85,
    "signal_strength": -70
  },
  "status": {
    "operational": true,
    "last_maintenance": "2024-11-20T08:00:00Z"
  }
}

In this example, several metrics, such as temperature, humidity, and pressure, are captured. These metrics change continuously, often at high frequency. Additionally, we have metadata like the device_id and location, which provide context for where and what is being measured but change much less frequently than the metrics, and perhaps not at all. There is also status information, which includes details like the operational flag and the last_maintenance date.

Finally, each data point is associated with a timestamp, representing when the observation was taken. This timestamp is crucial for understanding how the metrics evolve, allowing us to detect patterns or trends.

What are good use cases for time-series data? #

Time-series databases can support a wide range of applications, each benefiting from the ability to store and analyze data as it changes over time. Let's explore some common use cases and the types of questions they help organizations answer.

Product analytics #

Product analytics generates rich time-series data through user interactions, system events, and transactions. Every click, page view, feature interaction, and purchase is timestamped, creating a detailed record of user behavior over time. This temporal data enables teams to answer crucial questions about their product: How do users navigate it? What paths lead to successful conversion? Which behaviors indicate potential churn? When do users typically discover and adopt new features?

The power of time-series analysis in product analytics lies in understanding not just what users do but when and in what sequence they do it. Teams can track user journeys through onboarding, measure time-to-conversion, analyze retention patterns, and identify features that drive engagement. By correlating these behaviors with other metrics like performance data, organizations can build a complete picture of their product's effectiveness and make data-driven decisions about product development.

➡️ Read more about building product analytics with ClickHouse

Financial Markets and Trading #

Financial markets, from traditional stock exchanges to cryptocurrency trading platforms, generate massive volumes of time-series data. Every price change, trade, and order book update must be captured and analyzed in real-time. This data is crucial for generating trading signals, performing technical analysis, and identifying market opportunities.

Time-series analysis is particularly important for creating standard trading tools like candlestick charts showing price movements over specific time intervals. These charts require rapid price data aggregation (open, high, low, close) over various time windows, from minutes to months. Traders also need to analyze market liquidity, calculate technical indicators, and simultaneously detect patterns across multiple assets or trading venues. Processing this data quickly is crucial - even small delays can mean missed trading opportunities or increased risk.
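
To make this concrete, here is a rough sketch of a candlestick (OHLC) aggregation written in ClickHouse-style SQL; the trades table, the symbol value, and the column names are illustrative assumptions rather than a real platform's schema.

SELECT
    toStartOfInterval(trade_time, INTERVAL 5 MINUTE) AS bucket,
    argMin(price, trade_time) AS open,   -- price at the earliest trade in the bucket
    max(price)                AS high,
    min(price)                AS low,
    argMax(price, trade_time) AS close   -- price at the latest trade in the bucket
FROM trades
WHERE symbol = 'BTC-USD'
GROUP BY bucket
ORDER BY bucket;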

For example, cryptocurrency trading platforms must aggregate data from multiple blockchain networks and decentralized exchanges, processing millions of price updates daily while maintaining sub-second query response times. This enables traders to spot arbitrage opportunities, track market trends, and make real-time trading decisions.

➡️ Read more about how Coinhall uses ClickHouse to power its blockchain data platform

System and Application Observability #

Modern applications generate vast amounts of operational data that must be monitored in real time. From server metrics to user behavior, organizations need to track everything from system health to user experience. This observability data typically includes system metrics (CPU, memory, network), application telemetry (response times, error rates), and user interaction data.

Time-series analysis enables teams to visualize this data through real-time dashboards, track performance trends, and quickly identify issues. For example, teams can monitor application performance across different regions, track user engagement metrics, and analyze experiment results from A/B tests. The ability to correlate various metrics - from infrastructure health to user behavior - helps organizations understand how system performance impacts user experience and business outcomes.

By storing this data in a time-series database, teams can not only monitor the current system state but also analyze historical patterns, establish baselines, and detect anomalies that might indicate potential problems. This comprehensive view of system behavior is essential for maintaining reliable services and optimizing user experience.

➡️ Learn how Skool uses ClickHouse to visualize real-time observability and monitor user behavior.

Summary of use case characteristics #

| Use Case | Write Volume | Query Pattern | Retention Needs | Cardinality |
| --- | --- | --- | --- | --- |
| Product Analytics | High | Recent data + historical trends | 90 days hot, years warm | Very high (user IDs, events) |
| Financial Trading | Very high | Real-time + historical analysis | Years at full resolution | High (symbols, exchanges) |
| Observability | Extreme | Real-time dashboards + troubleshooting | 30 days hot, months warm | Extreme (services, hosts, metrics) |

What is a time-series database? #

Now that we've defined time-series data and seen some examples, what does it take to store this data? A time-series database (TSDB) is a database designed to efficiently store, manage, and analyze time-series data.

Such a database will need to have the following characteristics:

  1. Data volume - Time-series data grows rapidly due to the high frequency of measurements, such as sensor readings every second. Traditional databases can struggle to maintain performance when millions of data points are generated in short periods. Instead, we need a database that is optimized for appending new data.
  2. Write and query performance - Time-series data involves frequent writes (inserting new data points continuously) and complex queries for analysis (e.g., aggregations over time). Our database needs efficient time-based indexing or the ability to sort data by timestamp during ingestion.
  3. Efficient storage and compression - Since time-series data often contains many repeated or very similar values, storing it efficiently is crucial. Column-based storage is an advantage here since values in the same column are stored next to each other. We’ll also want to use codecs that store deltas between values rather than the raw values each time. Delta encoding is one such codec that’s often used when storing timestamps (see the sketch after this list).
  4. Time-based aggregations - Time-series analysis typically involves time-based queries, such as calculating daily averages or summing metrics over weeks. We need to be able to run these types of queries over large volumes of data while also being able to filter by time period.
  5. Ability to handle high cardinality - Time-series data often involves tracking metrics across many dimensions - thousands of servers, millions of IoT devices, or billions of user sessions, each with their own tags and labels. When you combine multiple dimensions (server + region + application + environment), the number of unique combinations (cardinality) explodes. A database needs efficient indexing and storage strategies to handle these high-cardinality dimensions without performance degradation. Some time-series databases impose strict cardinality limits that can become bottlenecks as data diversity grows, while others use specialized data structures to maintain performance even with millions of unique dimension combinations.
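
To illustrate points 2 and 3, here is a minimal, hypothetical table definition in ClickHouse-style SQL that sorts data by timestamp at ingestion and applies per-column codecs; the table and column names are made up for this sketch.

-- Hypothetical sensor table: sorted by (device_id, timestamp) for fast time-range scans,
-- with Delta encoding on the timestamp and a floating-point codec on the metrics.
CREATE TABLE sensor_readings
(
    device_id   LowCardinality(String),
    timestamp   DateTime CODEC(Delta, ZSTD),
    temperature Float32  CODEC(Gorilla, ZSTD),
    humidity    Float32  CODEC(Gorilla, ZSTD)
)
ENGINE = MergeTree
ORDER BY (device_id, timestamp);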

Many of these characteristics are the same as those required for real-time analytics databases.

Time-series databases vs transactional databases #

While traditional transactional databases like PostgreSQL or MySQL can store time-stamped data, they're optimized for different workloads than time-series analysis. Understanding these differences helps clarify when specialized time-series capabilities matter.

Transactional (OLTP) vs analytical workloads: Transactional databases are designed for online transaction processing (OLTP)—handling frequent inserts, updates, and deletes with strong consistency guarantees. Think user accounts, order processing, or inventory management. Time-series databases (whether purpose-built or analytical databases like ClickHouse) are optimized for online analytical processing (OLAP)—handling high-volume append-only writes and fast aggregations over large datasets.

Write patterns: OLTP databases balance read and write operations, supporting updates and deletes across rows with ACID guarantees. Time-series workloads are overwhelmingly append-only: new data points arrive continuously, but historical data rarely changes. This append-only pattern allows time-series databases to optimize storage and indexing strategies that would be inefficient in OLTP systems that need to handle updates.

Query patterns: OLTP databases excel at retrieving or updating individual records or small sets of related records - "find this user's order" or "update this account balance." Time-series analysis typically involves scanning millions or billions of rows to compute aggregations: "what's the average response time over the last week?" or "show me the 95th percentile latency by region." These analytical queries require different optimization strategies.

Storage and compression: OLTP databases use row-oriented storage, keeping all fields for a record together for fast retrieval and updates. Time-series databases typically use columnar storage, storing all values for a single metric together. This enables dramatic compression—sequential time-series values often change gradually, allowing delta encoding and other specialized compression techniques to achieve 10-100x compression ratios.

Scalability for time-series data: As time-series data accumulates, OLTP databases can struggle with table sizes reaching billions of rows. Query performance degrades, and traditional indexing strategies become less effective. Databases optimized for analytical workloads handle these large datasets efficiently through partitioning, distributed query execution, and specialized data structures.

When to use each: Use OLTP databases (PostgreSQL, MySQL) for transactional workloads requiring updates, deletes, strong consistency, and complex relational integrity. Use time-series optimized databases for high-volume temporal data requiring fast aggregations and analytical queries. Many organizations run both: OLTP databases for core application data and analytical databases for metrics, logs, and time-series analysis.

Extensions like TimescaleDB bridge this gap by adding time-series optimizations to PostgreSQL, while analytical databases like ClickHouse handle both time-series and other analytical workloads in a single system.

When should you use a time-series database? #

The decision to move from a transactional database to a time-series optimized solution typically comes when your workload characteristics change. Let's have a look at some key indicators:

Data volume is overwhelming #

When your time-series tables contain billions of rows and continue growing rapidly, traditional OLTP databases struggle. Table sizes exceeding hundreds of gigabytes often result in degraded query performance, slower writes, and increasingly complex maintenance operations, such as vacuuming or index rebuilding.

Query patterns favor analytical operations #

If most of your queries scan large portions of your dataset rather than looking up individual rows, you've outgrown OLTP optimization. Typical signs include:

  • Queries that aggregate millions of rows (daily/weekly summaries, averages, percentiles)
  • Queries that only touch a few columns from wide tables
  • Time-range scans that process data from specific periods
  • Minimal use of UPDATE or DELETE operations - your data is primarily append-only

Performance requirements exceed OLTP capabilities #

When users expect sub-second responses for queries scanning millions of rows, or when concurrent analytical queries impact your transactional workload, it's time to separate concerns. Real-time dashboards requiring fresh data with low latency are particularly challenging for transactional databases.

High cardinality becomes problematic #

As you track more unique dimensions (device IDs, user IDs, tags, labels), the number of unique combinations explodes. Traditional database indexes become inefficient, and query performance degrades despite proper indexing strategies.

Storage costs escalate #

Time-series data in row-oriented OLTP databases consumes significantly more storage than in columnar systems optimized for compression. When storage costs become a concern or when implementing complex archival strategies to manage growth, specialized time-series storage offers better economics.

ACID guarantees aren't essential #

If eventual consistency is acceptable for your time-series data - meaning you can tolerate brief delays between writes and reads, and don't require transactional guarantees across multiple tables - you can benefit from the performance advantages of time-series optimized systems.

When to stay with your transactional database #

Stick with PostgreSQL or MySQL when:

  • Your time-series data volume remains manageable (millions, not billions of rows)
  • You frequently update or delete historical data points
  • You need strong transactional consistency with other application data
  • Your queries primarily look up individual records or small ranges
  • Time-series analysis is a small part of a larger transactional application

As mentioned in the previous section, many organizations run both systems: transactional databases for core application data that require ACID guarantees, and time-series-optimized databases for metrics, logs, and analytical workloads. This separation enables each system to excel in its respective area of expertise.

Time-series databases can be categorized into three main types: purpose-built time-series databases, extensions of other databases, and real-time analytics/column-based databases. Here are some popular examples:

| Database | Type | Best For | Query Language | Key Strength |
| --- | --- | --- | --- | --- |
| InfluxDB | Purpose-built TSDB | IoT, DevOps metrics, real-time monitoring | InfluxQL, Flux, SQL | Downsampling and data retention policies |
| QuestDB | Purpose-built TSDB | High-throughput ingestion, fast SQL queries | SQL | Fast writes with low-latency queries |
| Prometheus | Purpose-built TSDB | System and service monitoring, alerting | PromQL | Pull-based metrics collection and alerting |
| TimescaleDB | PostgreSQL extension | Hybrid relational + time-series workloads | SQL (PostgreSQL) | Familiar PostgreSQL ecosystem and tooling |
| Apache Pinot | Real-time analytics | User-facing dashboards, clickstream analysis | SQL | Sub-second query response times |
| ClickHouse | Real-time analytics | Observability, large-scale analytics | SQL | Extreme performance and analytical flexibility |

Purpose-built time-series databases #

Specialized databases engineered from the ground up to efficiently handle time-stamped data, offering optimized temporal data storage and query mechanisms, including as-of joins, specialized math functions, downsampling, grouped gap filling, and more. Some examples are described below:

  • InfluxDB - Specifically designed for time-series data, InfluxDB handles high write loads and provides features like downsampling and data retention policies, making it great for IoT, DevOps metrics, and real-time monitoring.
  • QuestDB - A high-performance open-source time-series database that excels at fast SQL queries and high-throughput ingestion.
  • Prometheus - An open-source monitoring system primarily designed for system and service monitoring. Prometheus excels at scraping metrics from various endpoints, storing them efficiently, and enabling alerting based on those metrics. It’s well-suited for use cases like server health monitoring and application performance metrics.

Extensions of relational databases #

Traditional relational databases can be enhanced with time-series capabilities, combining the familiarity and flexibility of SQL with specialized temporal features.

TimescaleDB is an extension of PostgreSQL. TimescaleDB adds time-based features like automatic partitioning and time-based indexing. It's perfect for scenarios where you want to blend relational data with time-series data, such as in business intelligence or IoT.

Real-time analytics / column-based databases #

Systems optimized for rapid large-scale data analysis, using columnar storage to enable fast aggregations and real-time processing of time-series information. Some examples are described below:

  • Apache Pinot - Designed for real-time, low-latency analytics, Pinot is well-suited for applications like user-facing dashboards or clickstream analysis, offering sub-second query response times.
  • ClickHouse - That’d be us! ClickHouse was initially designed to keep records of all clicks by people from all over the Internet but is now used for various time-centric datasets, with a particular focus on observability.

Querying time-series data #

Time-series databases offer specialized query capabilities designed to handle temporal data efficiently. While many modern time-series databases use SQL with extensions, some have developed their own query languages optimized for time-series operations.

Query language approaches #

Most time-series databases extend standard SQL with specialized functions for temporal analysis. These extensions typically include the following (a short sketch appears after the list):

  • Time-based window functions for analyzing data over specific time intervals
  • Gap filling to handle missing data points
  • Interpolation functions to estimate values between known data points
  • Time bucket operations for grouping data into regular time intervals
  • Specialized mathematical functions for time-series analysis
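
As one example of how these extensions look in practice, ClickHouse expresses time bucketing with functions like toStartOfHour and gap filling with ORDER BY ... WITH FILL; the sensor_readings table and its columns below are assumptions made for illustration.

SELECT toStartOfHour(timestamp) AS hour,   -- time bucket operation
       avg(temperature) AS avg_temp
FROM sensor_readings
WHERE timestamp >= now() - INTERVAL 1 DAY
GROUP BY hour
ORDER BY hour ASC WITH FILL STEP INTERVAL 1 HOUR;   -- fill in missing hourly buckets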

Prometheus uses a domain-specific query language called PromQL, specifically designed for time-series analysis and monitoring use cases. PromQL has built-in support for rate calculations and aggregations over time, native handling of labels and label matching, and vector and range vector selectors. It's particularly well-suited for monitoring scenarios where you must analyze metrics over time windows and create alerting rules.

InfluxDB initially used a query language called InfluxQL, which was SQL-like but explicitly designed for time-series operations. With InfluxDB 2.0, they introduced Flux, a more powerful functional query language, before later adding native SQL support to make the platform more accessible to users familiar with traditional database querying.

Query patterns and performance requirements #

Different types of time-series queries have varying performance requirements based on their use case:

| Query Type | Frequency | Latency Requirement | Example |
| --- | --- | --- | --- |
| Real-time dashboards | Continuous (every 1-5s) | <1 second | Current system health, live metrics |
| Historical analysis | Ad-hoc | 1-5 seconds | "Last month's trends", quarterly reports |
| Alerting queries | Scheduled (every 10-60s) | <1 second | Threshold violations, anomaly detection |
| Downsampling/aggregation | Background (nightly/hourly) | Minutes acceptable | Pre-computing hourly/daily summaries |
| Forensic analysis | Rare | 5-30 seconds | Root cause analysis, incident investigation |

Understanding these requirements helps in selecting the right database and optimizing query patterns for your specific use case.

Data retention and lifecycle management #

As time-series data accumulates, organizations face a fundamental challenge: data grows indefinitely while query patterns typically focus on recent information. A monitoring dashboard might primarily display the last 24 hours of metrics, yet the system continues ingesting data every second. Without a management strategy, storage costs escalate while query performance degrades as tables grow to billions or trillions of rows.

Organizations address this through three main approaches: data expiration, storage tiering, and data rollup. Each solves the same core problem - managing the lifecycle of time-series data - but with different trade-offs between cost, accessibility, and data granularity.

Comparing retention strategies:

| Strategy | Storage Cost | Query Speed | Data Loss | Complexity | Best For |
| --- | --- | --- | --- | --- | --- |
| Data expiration | Lowest (data deleted) | N/A (data gone) | Complete | Low | Short-term logs, temporary metrics |
| Storage tiering | Medium (cheaper storage) | Slower for old data | None | Medium | Compliance, auditing, long-term trends |
| Data rollup | Low (aggregates only) | Fast (smaller data) | Precision loss | Medium-High | Historical analysis, trend monitoring |
| Combination approach | Optimized | Varies by tier | Partial (for rollups) | High | Most production systems |

Data expiration #

Data expiration automatically deletes data after a specified time period has elapsed. This is the most straightforward approach: define a retention policy (e.g., "keep data for 90 days"), and the database automatically removes older data. The advantage is straightforward implementation and genuine storage savings since the data disappears entirely. However, it's also the most destructive approach—once data is deleted, it's gone. This approach works well when historical data has no long-term value, such as short-lived application logs or metrics that are only relevant for immediate troubleshooting.
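
In ClickHouse, for instance, such a retention policy can be declared with a TTL clause; the table below is a minimal sketch with made-up names and a made-up 90-day window.

-- Rows are removed in the background once they are more than 90 days old.
CREATE TABLE app_logs
(
    timestamp DateTime,
    message   String
)
ENGINE = MergeTree
ORDER BY timestamp
TTL timestamp + INTERVAL 90 DAY DELETE;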

Storage tiering #

Storage tiering moves data through different storage layers based on age rather than deleting it. Recent "hot" data is stored on fast SSDs for quick queries, while older "warm" data migrates to less expensive storage, such as object stores (S3, GCS), and the oldest "cold" data may be moved to archival systems. This approach preserves all data while optimizing costs - you're not paying premium storage prices for data that's rarely accessed. The trade-off is complexity: queries spanning multiple tiers may be slower, and you still incur storage costs (albeit at lower rates). Storage tiering shines when you need occasional access to historical data for compliance, auditing, or long-term trend analysis.
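
As a sketch of what tiering can look like in ClickHouse, a TTL move expression shifts older data to a cheaper volume; this assumes a storage policy named 'tiered' with hot and cold volumes has already been configured, and the table name is illustrative.

CREATE TABLE metrics
(
    timestamp DateTime,
    name      LowCardinality(String),
    value     Float64
)
ENGINE = MergeTree
ORDER BY (name, timestamp)
TTL timestamp + INTERVAL 30 DAY TO VOLUME 'cold'   -- move month-old data to cheaper storage
SETTINGS storage_policy = 'tiered';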

Data rollup #

Data rollup (or downsampling) replaces high-resolution raw data with pre-computed aggregations. For example, keep per-second metrics for 7 days, then replace them with per-minute averages for 90 days, then per-hour averages indefinitely. This dramatically reduces storage requirements—a year of per-second data becomes manageable when aggregated to hourly granularity. The significant trade-off is loss of detail: once you've aggregated second-level data to minutes, you can't recover that original precision. Rollup works best when you can anticipate your analysis needs—if you know you'll only ever need hourly trends for historical data, rolling up makes perfect sense.
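
One way to implement a rollup, sketched here with hypothetical table names, is a materialized view that pre-aggregates raw metrics into hourly sums and counts, from which averages can be computed at query time.

-- Hourly rollup target: SummingMergeTree sums the numeric columns when parts merge.
CREATE TABLE metrics_hourly
(
    hour         DateTime,
    name         LowCardinality(String),
    sum_value    Float64,
    sample_count UInt64
)
ENGINE = SummingMergeTree
ORDER BY (name, hour);

-- Populated automatically as raw rows are inserted into the metrics table.
CREATE MATERIALIZED VIEW metrics_hourly_mv TO metrics_hourly AS
SELECT toStartOfHour(timestamp) AS hour,
       name,
       sum(value) AS sum_value,
       count()    AS sample_count
FROM metrics
GROUP BY hour, name;

-- Hourly average at query time: sum(sum_value) / sum(sample_count), grouped by name and hour.

Once the rollup exists, the raw table can carry a much shorter retention policy while the hourly table is kept for long-term analysis.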

Implementation approaches #

Different time-series databases implement these strategies at various levels. Some apply retention policies at the database level, providing simplicity but less granularity - all data follows the same rules. Others allow table-level or even column-level policies, giving you precise control at the cost of increased configuration complexity. Some databases automate rollup processes, while others require you to define and maintain aggregation pipelines explicitly.

The best strategy often combines multiple approaches: expire truly transient data, tier important data to less expensive storage, and roll up metrics where aggregations are sufficient. The key is understanding your query patterns and data value over time - recent data may require millisecond precision on fast storage, while year-old data may serve your needs perfectly as hourly aggregates in object storage.

Is ClickHouse a time-series database? #

While ClickHouse isn't specifically designed as a time-series database, it excels at handling time-series workloads as part of its broader analytical capabilities.

As a columnar OLAP database, ClickHouse provides the performance and features needed for efficient time-series analysis without the limitations of a specialized solution.

ClickHouse's strengths in handling time-series data come from several key capabilities.

Real-time querying of large datasets #

ClickHouse enables the analysis of historical and current data at a large scale through its innovative dual-layer architecture. The system processes billions of rows per second on standard hardware through:

  • Isolated concurrent operations: Data is organized into "table parts" that allow inserts and selects to operate independently without blocking each other
  • Vectorized query execution: Processes data in batches rather than row-by-row, utilizing CPU caches efficiently and applying SIMD instructions
  • Parallel processing: Automatically distributes query execution across multiple CPU cores and can scale horizontally across nodes in a cluster
  • Merge-time computation: Shifts computational work from query time to background merge processes, making queries significantly faster
  • Specialized algorithms and data structures: As noted by CMU Professor Andy Pavlo, ClickHouse has "20 versions of a hash table" and other specialized components optimized for different query patterns

You can read more in the "Why is ClickHouse fast?" developer guide.

These architectural advantages enable organizations to maintain years of historical time-series data while providing sub-second query responses for real-time dashboards and deep historical analysis.

The following query analyzes New York City taxi data that contains over 3 billion records. For January 1, 2014, it groups rides by hour and cab type to show the number of rides, average trip distance, and average fare for each hourly period and taxi category.

SELECT
    toStartOfHour(pickup_datetime) AS hour,
    cab_type,
    count(*) AS rides,
    round(avg(trip_distance), 2) AS avg_distance,
    round(avg(total_amount), 2) AS avg_fare
FROM nyc_taxi.trips
WHERE pickup_date = '2014-01-01'
GROUP BY 1, 2
ORDER BY 1, 2

Comprehensive date/time type support #

ClickHouse provides robust support for time-series data through specialized date and time data types that balance storage efficiency with precision requirements:

  • Versatile date types:
    • Date: Compact 2-byte storage covering [1970-01-01, 2149-06-06], sufficient for most use cases
    • Date32: Extended 4-byte storage covering a wider range [1900-01-01, 2299-12-31]
  • Flexible timestamp types:
    • DateTime: 4-byte storage with second precision, range of [1970-01-01 00:00:00, 2106-02-07 06:28:15]
    • DateTime64: 8-byte storage with configurable sub-second precision (up to nanoseconds), range of [1900-01-01 00:00:00, 2299-12-31 23:59:59.99999999]
  • Time zone awareness:
    • Built-in support for time zones in both DateTime('TimeZone') and DateTime64('TimeZone')
    • Automatic time zone conversion during queries
    • Support for different time zones within the same table
  • Type conversion functions:
    • Seamless conversion between temporal types with functions like toDate, toDateTime, and toDateTime64
    • Precision control when converting between different temporal resolutions

These comprehensive date/time capabilities provide the foundation for sophisticated time-series analysis, enabling precise temporal storage and manipulation across massive datasets while optimizing storage efficiency and query performance.
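
For illustration, the column definitions below combine several of these types in one hypothetical table; the names and precisions are assumptions, not a recommended schema.

CREATE TABLE sensor_events
(
    event_date Date,                            -- 2-byte date, day precision
    event_time DateTime64(3, 'UTC'),            -- millisecond precision with an explicit UTC time zone
    local_time DateTime('America/New_York'),    -- second precision with a different time zone
    value      Float64
)
ENGINE = MergeTree
ORDER BY (event_date, event_time);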

The following query aggregates total daily hits from the Wiki dataset, using the toDate function to convert DateTime values to Date:

SELECT
    sum(hits) AS h,
    toDate(time) AS d
FROM wiki.wikistat_small
GROUP BY d
ORDER BY d
LIMIT 5;

Rich set of temporal functions #

ClickHouse offers a comprehensive suite of temporal functions that can be used for time-series analysis, including date truncation functions such as toStartOfHour and toStartOfYear, date arithmetic, and time-based rounding and conversion functions.

These temporal functions allow analysts to perform sophisticated time-series analyses with concise, readable SQL queries. With ClickHouse's query performance, these functions enable complex time-based aggregations, pattern detection, and anomaly identification across massive datasets with minimal latency.

Query examples #

Let’s have a look at a couple of examples.

The following query computes the yearly average precipitation in the UK, France, and the US from 1990 onwards.

SELECT year,
       avg(`precipitation`) AS `avg_precipitation`,
       dictGet(`country`.`country_iso_codes`, 'name', code) AS country
FROM `noaa`.`noaa_v2`
WHERE date > '1990-01-01' AND code IN ('UK', 'FR', 'US')
GROUP BY toStartOfYear(`date`) AS `year`,
         substring(station_id, 1, 2) AS code
HAVING avg_precipitation > 0
ORDER BY country, year ASC
LIMIT 100000;

The following query uses a window function to calculate the cumulative stars of the deepseek-ai/DeepSeek-R1 repository:

SELECT toDate(created_at) AS day,
       count() AS dailyCount,
       sum(dailyCount) OVER (ORDER BY day ASC) AS cumStars
FROM github.events
WHERE event_type = 'WatchEvent' AND repo_name = 'deepseek-ai/DeepSeek-R1'
GROUP BY ALL
ORDER BY day;

Long-term data management #

Additionally, ClickHouse offers features that are particularly valuable for long-term time-series data management, such as TTL expressions for expiring or tiering data by age and materialized views for rolling up older data.

These capabilities mean you can implement sophisticated time-series storage strategies, such as keeping recent data at full granularity while automatically rolling up older data to save space. This approach provides both detailed recent data for operational needs and efficient storage of historical data for long-term analysis.

Prometheus compatibility #

ClickHouse also expands its time-series capabilities with experimental features like the Time Series table engine, which can be a backing store for Prometheus data. This allows organizations to leverage ClickHouse's analytical capabilities while maintaining compatibility with popular time-series monitoring tools.

Rather than being limited to time-series-specific functionality, ClickHouse allows you to handle time-series workloads alongside other analytical queries, providing a more versatile solution for organizations with diverse data analysis needs.

➡️ Read more in Can I use ClickHouse as a Time-Series Database? and Working with Time Series Data in ClickHouse.
