Estimating resources
The following provides a model for estimating the compute and storage resources required for a ClickStack deployment based on your expected ingest volume. The values produced are estimates only and should be used as an initial baseline - they are not a prescriptive answer. Actual requirements depend on query complexity, concurrency, retention policies, and variance in ingestion throughput. Always monitor resource usage and scale as needed.
Every number on this page - throughput (MB/s, TB/month), CPU sizing, and storage - is derived from uncompressed raw ingest volume, i.e. the size of the data as produced by your applications and sent to the OpenTelemetry collector before any compression is applied.
This is the figure you should estimate from your existing logs, traces, and metrics pipelines. Storage figures in the table below already have the assumed 10x compression ratio applied to this raw volume.
When deploying ClickStack, provision compute to cover two independent workloads: ingest and query.
| Workload | Estimated resources |
|---|---|
| Ingest | 1 vCPU per 10 MB/s of sustained ingest throughput |
| Query | 1 vCPU per 1 QPS for every 10 MB/s of sustained ingest throughput |
In most self-managed deployments, ingest and query share the same nodes. In this case, use the Total CPUs figure as your baseline. Isolated scaling - where ingest and query compute are provisioned independently - is supported in ClickHouse Cloud through separate compute pools, known as Warehouses.
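These rules reduce to simple arithmetic. The sketch below is illustrative only - the function name and the rounding are not part of the model - but the constants follow the table above:

```python
import math

def estimate_cpus(ingest_mb_per_s: float, qps: float = 1.0) -> dict:
    """Estimate vCPUs from sustained raw (uncompressed) ingest throughput and query rate.

    Rules from the table above:
      - Ingest: 1 vCPU per 10 MB/s of sustained ingest
      - Query:  1 vCPU per 1 QPS for every 10 MB/s of sustained ingest
    """
    units = math.ceil(ingest_mb_per_s / 10)  # number of 10 MB/s increments, rounded up
    ingest_cpus = units
    query_cpus = units * qps
    return {"ingest": ingest_cpus, "query": query_cpus, "total": ingest_cpus + query_cpus}

# Illustrative input (not taken from the tables): 30 MB/s of raw ingest at 2 QPS
print(estimate_cpus(30, qps=2))  # {'ingest': 3, 'query': 6, 'total': 9}
```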
Assumptions
- A 10x compression ratio for storage - typically conservative for logs and traces.
- Query SLAs of 1.5 seconds at P50 and 5 seconds at P99.
- We assume most queries occur over recent data, following a log-normal distribution that peaks at around one hour and tails out to around six hours. Users may wish to provision dedicated compute to query older data. In ClickHouse Cloud this compute can remain idle (thus not incurring costs) when not in use.
- While query compute can be scaled independently of ingest compute, it remains intrinsically linked to ingest volume. We assume that as ingest increases, data density grows, resulting in larger scan volumes at query time and consequently higher query compute requirements.
The following table provides example sizings based on increasing ingest throughput in megabytes per second, alongside the corresponding data volumes in terabytes per month. This assumes a sustained average of 1 QPS from ClickStack across all query types (search, dashboards, alerting).
| Ingest (MB/s) | Ingest (TB/month) | Ingest CPUs | Query CPUs | Total CPUs | Storage per month, compressed (GB) |
|---|---|---|---|---|---|
| 10 | 25.92 | 1 | 3 | 4 | 2,592 |
| 20 | 51.84 | 2 | 6 | 8 | 5,184 |
| 50 | 129.6 | 5 | 15 | 20 | 12,960 |
| 100 | 259.2 | 10 | 30 | 40 | 25,920 |
| 200 | 518.4 | 20 | 60 | 80 | 51,840 |
| 500 | 1,296 | 50 | 150 | 200 | 129,600 |
| 1000 | 2,592 | 100 | 300 | 400 | 259,200 |
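The monthly volume and storage columns follow directly from the raw ingest rate. A minimal sketch of that conversion, assuming a 30-day month and the 10x compression ratio:

```python
mb_per_s = 100                                                  # sustained raw ingest throughput
seconds_per_month = 30 * 24 * 60 * 60                           # 2,592,000 seconds in a 30-day month
raw_tb_per_month = mb_per_s * seconds_per_month / 1_000_000     # 259.2 TB/month raw
storage_gb_per_month = raw_tb_per_month * 1_000 / 10            # 25,920 GB after 10x compression
```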
Refining sizing assumptions for your environment
The model assumes a sustained average of 1 QPS from ClickStack, aggregating all query types including search, dashboards, and alerting.
For higher query volumes, scale query CPU requirements linearly by multiplying the baseline query CPUs by the target QPS. For example, a deployment ingesting at 100 MB/s with a target of 9 QPS would require 90 query CPUs (10 × 9) rather than the baseline 10, giving a revised total of 100 CPUs (10 ingest + 90 query).
Storage estimates assume a conservative 10x compression ratio. In practice, logs, traces, and metrics often achieve higher compression. We recommend testing on a sample of your data to establish your compression ratio and storage requirements ahead of production. To compute the required storage for longer retention, multiply the monthly storage figure by the number of months of retention required.
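Applying both adjustments - a higher query rate and a longer retention period - is straightforward arithmetic. A short sketch using the 100 MB/s example above (the 12-month retention figure is purely illustrative):

```python
ingest_mb_per_s = 100
target_qps = 9
retention_months = 12                                           # illustrative retention requirement

ingest_cpus = ingest_mb_per_s / 10                              # 10 vCPUs for ingest
query_cpus = (ingest_mb_per_s / 10) * target_qps                # 90 vCPUs for query at 9 QPS
total_cpus = ingest_cpus + query_cpus                           # 100 vCPUs in total

raw_tb_per_month = ingest_mb_per_s * 2_592_000 / 1_000_000      # 259.2 TB raw per 30-day month
storage_tb = raw_tb_per_month / 10 * retention_months           # 25.92 TB/month x 12 = ~311 TB
```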
The model also assumes a relatively balanced query distribution. Workloads skewed toward heavier historical or archival queries may have significantly different compute requirements and should be validated through load testing. We plan to introduce a more flexible sizing model that allows extrapolation of query compute based on varying query distribution patterns.
Worked example
Requirements: 1.5 PB/month ingest, 5 QPS, 3-month retention.
Converting to MB/s
The sizing model is expressed in MB/s. Converting 1.5 PB/month (1,500 TB) to a sustained throughput:
- 1,500 TB = 1,500,000,000 MB
- Seconds per month (30 days): 30 × 24 × 60 × 60 = 2,592,000
- MB/s = 1,500,000,000 ÷ 2,592,000 ≈ 579 MB/s
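The same conversion as a short snippet (decimal units, 30-day month):

```python
pb_per_month = 1.5
mb_per_month = pb_per_month * 1_000_000_000     # 1.5 PB = 1,500,000,000 MB (decimal units)
seconds_per_month = 30 * 24 * 60 * 60           # 2,592,000
mb_per_s = mb_per_month / seconds_per_month     # ≈ 578.7, rounded up to 579 MB/s
```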
Ingest compute
At 1 vCPU per 10 MB/s of sustained ingest:
579 ÷ 10 = ~58 vCPUs for ingest
Query compute
Query compute scales with both ingest throughput and QPS. At 5 QPS:
(579 ÷ 10) × 5 ≈ 58 × 5 = 290 vCPUs for query
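As a quick check, the two compute figures combined:

```python
mb_per_s, qps = 579, 5
ingest_cpus = round(mb_per_s / 10)        # 579 / 10 ≈ 58 vCPUs for ingest
query_cpus = ingest_cpus * qps            # 58 x 5 = 290 vCPUs for query
total_cpus = ingest_cpus + query_cpus     # 348 vCPUs in total
```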
Storage
At 579 MB/s sustained over 30 days, raw ingest equals 1,500 TB/month. Applying the assumed 10x compression ratio:
- Compressed per month: 1,500 TB ÷ 10 = 150 TB/month
- For 3-month retention: 150 TB × 3 = 450 TB total
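And the corresponding storage figures:

```python
raw_tb_per_month = 1_500                                 # 1.5 PB of raw ingest per month
compressed_tb_per_month = raw_tb_per_month / 10          # 150 TB/month after 10x compression
total_storage_tb = compressed_tb_per_month * 3           # 450 TB for 3-month retention
```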
Summary
| Resource | Value |
|---|---|
| Ingest compute | 58 vCPUs |
| Query compute | 290 vCPUs |
| Total compute | 348 vCPUs |
| Storage per month (compressed) | 150 TB |
| Storage for 3-month retention | 450 TB |
Isolating observability workloads
If you're adding ClickStack to an existing ClickHouse Cloud service that already supports other workloads, such as real-time application analytics, isolating observability traffic is strongly recommended.
Use Managed Warehouses to create a child service dedicated to ClickStack. This allows you to:
- Isolate ingest and query load from existing applications
- Scale observability workloads independently
- Prevent observability queries from impacting production analytics
- Share the same underlying datasets across services when needed
This approach ensures your existing workloads remain unaffected while allowing ClickStack to scale independently as observability data grows.
For larger deployments or custom sizing guidance, please contact support for a more precise estimate.