An intro to column-based storage
Mark Needham
In this video, we explore column storage, the backbone of column-oriented databases like ClickHouse. We compare it with the more familiar row-based storage used in many relational databases, using a practical example of weather data to illustrate the differences.
Column storage isn't just a different way to organize data. Because values from the same column are stored next to each other, they compress well, and analytical queries can read only the columns they need. We'll explore why this approach is particularly well-suited for modern data analysis needs and how it aligns with current CPU architectures.
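To make the two layouts concrete, here is a minimal Python sketch. The weather readings, the field names (`city`, `day`, `temp_c`), and both helper functions are made up for illustration; this is not how ClickHouse stores data internally, only a model of the idea.

```python
# Hypothetical weather readings: (city, day, temperature in degrees Celsius).
rows = [
    ("London", 1, 12.5),
    ("London", 2, 13.0),
    ("Paris", 1, 15.5),
    ("Paris", 2, 16.0),
]

# Row-based layout: the fields of each record sit next to each other.
row_layout = [value for row in rows for value in row]

# Column-based layout: all values of one column sit next to each other.
columns = {
    "city": [r[0] for r in rows],
    "day": [r[1] for r in rows],
    "temp_c": [r[2] for r in rows],
}


def avg_temp_row_store(rows):
    """Average temperature from the row layout: every field of every row is visited."""
    total = 0.0
    for _city, _day, temp in rows:
        total += temp
    return total / len(rows)


def avg_temp_column_store(columns):
    """Average temperature from the column layout: only one column is read."""
    temps = columns["temp_c"]
    return sum(temps) / len(temps)


print(row_layout[:6])
print(avg_temp_row_store(rows), avg_temp_column_store(columns))
```

Both functions return the same answer, but the column version never touches `city` or `day`, which is the property that matters for aggregations over wide tables.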
Key points covered:
- Comparison of row-based vs. column-based storage layouts
- Advantages of column storage for data compression and efficient querying
- Examples of compression techniques like dictionary encoding and delta encoding
- How column storage enables faster analytical queries and aggregations
- The benefits of column storage for CPU cache usage and SIMD operations
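The two compression techniques named above can be sketched in a few lines of Python. The function names and sample values here are assumptions chosen for illustration, not ClickHouse APIs; real column stores implement these encodings as on-disk codecs.

```python
def dictionary_encode(values):
    """Replace each distinct value with a small integer code."""
    dictionary = []  # code -> value
    index = {}       # value -> code
    codes = []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index[v])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Map the integer codes back to the original values."""
    return [dictionary[c] for c in codes]


def delta_encode(values):
    """Keep the first value, then store each value's difference from the previous one."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values with a running sum."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


# Hypothetical column values: a low-cardinality string column and a timestamp column.
cities = ["London", "London", "Paris", "London"]
timestamps = [1000, 1010, 1020, 1035]

dictionary, codes = dictionary_encode(cities)
deltas = delta_encode(timestamps)
print(dictionary, codes)
print(deltas)
```

Dictionary encoding pays off when a column has few distinct values, and delta encoding pays off when consecutive values are close together, as with sorted timestamps. Both patterns are far more likely to appear when a column's values are stored contiguously.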
