Primary key design plays a crucial role in ClickHouse performance. It determines how data is stored, merged, and queried – directly impacting speed, compression, and scalability.
One commonly discussed choice is whether to use a UUID as a primary key instead of a numeric ID.
In this blog, we’ll explore what UUID is, why engineering teams use it, its trade-offs, and – most importantly – how it performs in practice with a real benchmark.
What Is UUID in ClickHouse
A UUID (Universally Unique Identifier) is a 128-bit (16-byte) value that guarantees uniqueness across systems.
In ClickHouse, it’s stored internally as a FixedString(16) – compact and efficient compared to regular strings.
Example table definitions:
CREATE TABLE test.events_uint
(
id UInt64,
user_id UInt64,
event_time DateTime,
event_type String
)
ENGINE = MergeTree()
ORDER BY id;
CREATE TABLE test.events_uuid
(
id UUID,
user_id UInt64,
event_time DateTime,
event_type String
)
ENGINE = MergeTree()
ORDER BY id;
Both tables are identical except for their primary key data type.
Why Engineering Teams Use UUID as a Primary Key
- Global Uniqueness Across Sources
UUIDs ensure globally unique identifiers across distributed pipelines or multi-node ingestion systems. - No Centralized ID Generation
No need for a sequence generator orMAX(id)handling – every record can be generated independently. - Simplified Data Merging
When consolidating data from multiple environments, UUIDs make deduplication and reconciliation easy. - Concurrency-Friendly for Writes
Multiple writers can safely insert data without worrying about ID collisions.
Trade-offs of Using UUID
- Poor Sorting Locality
UUIDs are random, which means inserts are not sequential. This affects how efficiently ClickHouse merges and compresses parts. - Higher Storage Overhead
UUIDs (16 bytes) are larger than typical numeric IDs (8 bytes), resulting in slightly higher index and part sizes. - Reduced Compression Efficiency
Since data isn’t naturally ordered, compression ratios can be lower than for sequential numeric keys. - Not Ideal as a Sorting Key
For analytical tables, it’s better to order by time or another sequential field for efficient merges.
Benchmark Setup
To test the real impact of UUID vs UInt64, we created two tables (events_uint and events_uuid) and inserted 5 million rows into each.
Insert Statements


We then measured:
- Insert speed (rows per second)
- Table size on disk
- Compression ratio
- Query performance
Benchmark Results
| Metric | UInt64 Key | UUID Key |
|---|---|---|
| Insert Speed | 2.3M rows/s | 1.9M rows/s |
| Table Size | 58.13 MiB | 128.40 MiB |
| Compression Ratio | 0.435 | 0.748 |
Query Time (count()) | 0.038 sec. | 0.046 sec. |
Results are based on a single-node ClickHouse instance; actual performance may vary.
Interpretation
- UUIDs simplify ID management in distributed systems but add minor overhead in storage and insert performance.
- Numeric IDs (
UInt64) remain faster for sequential workloads or analytics tables with ordered inserts. - The optimal choice depends on your data model – for event pipelines or decentralized ingestion, UUIDs are worth it.
Best Practices
- Use UUID as a primary key, not necessarily as a sorting key.
- Consider composite sorting keys like
(event_time, event_id)for better compression. - Benchmark using your actual data scale before finalizing schema decisions.
Example:
CREATE TABLE events
(
event_id UUID,
event_time DateTime,
user_id UInt64,
event_type String
)
ENGINE = MergeTree()
ORDER BY (event_time, event_id);
Final Thoughts
Choosing the right primary key in ClickHouse isn’t about following trends – it’s about understanding trade-offs.
UUIDs bring flexibility and scalability, especially in distributed systems, but sequential numeric keys still win in pure performance and compression.
Balancing both gives you the best of both worlds.
Looking for ClickHouse Expertise?
At Quantrail Data, we work closely with teams to make ClickHouse adoption seamless – from smooth deployments and guided migrations to performance benchmarking and scalable data pipelines.
We’ve seen these improvements in action, helping teams unlock bare-metal-level performance for real-time analytics – as shared in our Success Story: Quantrail Baremetal.
Whether you’re optimizing an existing setup or planning a migration, our goal is to make your ClickHouse journey efficient, reliable, and ready for production.
👉 Get in touch to see how we can help you make the most of ClickHouse.
References
ClickHouse Official Documentation – UUID Data Type
ClickHouse MergeTree Engine Overview
