,

Using UUID as a Primary Key in ClickHouse: Benefits, Trade-offs, and Real Benchmarks

Mohamed Hussain S avatar
Using UUID as a Primary Key in ClickHouse: Benefits, Trade-offs, and Real Benchmarks

Primary key design plays a crucial role in ClickHouse performance. It determines how data is stored, merged, and queried – directly impacting speed, compression, and scalability.

One commonly discussed choice is whether to use a UUID as a primary key instead of a numeric ID.
In this blog, we’ll explore what UUID is, why engineering teams use it, its trade-offs, and – most importantly – how it performs in practice with a real benchmark.

A UUID (Universally Unique Identifier) is a 128-bit (16-byte) value that guarantees uniqueness across systems.
In ClickHouse, it’s stored internally as a FixedString(16) – compact and efficient compared to regular strings.

Example table definitions:

CREATE TABLE test.events_uint
(
    id UInt64,
    user_id UInt64,
    event_time DateTime,
    event_type String
)
ENGINE = MergeTree()
ORDER BY id;

CREATE TABLE test.events_uuid
(
    id UUID,
    user_id UInt64,
    event_time DateTime,
    event_type String
)
ENGINE = MergeTree()
ORDER BY id;

Both tables are identical except for their primary key data type.

  1. Global Uniqueness Across Sources
    UUIDs ensure globally unique identifiers across distributed pipelines or multi-node ingestion systems.
  2. No Centralized ID Generation
    No need for a sequence generator or MAX(id) handling – every record can be generated independently.
  3. Simplified Data Merging
    When consolidating data from multiple environments, UUIDs make deduplication and reconciliation easy.
  4. Concurrency-Friendly for Writes
    Multiple writers can safely insert data without worrying about ID collisions.
  1. Poor Sorting Locality
    UUIDs are random, which means inserts are not sequential. This affects how efficiently ClickHouse merges and compresses parts.
  2. Higher Storage Overhead
    UUIDs (16 bytes) are larger than typical numeric IDs (8 bytes), resulting in slightly higher index and part sizes.
  3. Reduced Compression Efficiency
    Since data isn’t naturally ordered, compression ratios can be lower than for sequential numeric keys.
  4. Not Ideal as a Sorting Key
    For analytical tables, it’s better to order by time or another sequential field for efficient merges.

To test the real impact of UUID vs UInt64, we created two tables (events_uint and events_uuid) and inserted 5 million rows into each.

We then measured:

  • Insert speed (rows per second)
  • Table size on disk
  • Compression ratio
  • Query performance
MetricUInt64 KeyUUID Key
Insert Speed2.3M rows/s1.9M rows/s
Table Size58.13 MiB128.40 MiB
Compression Ratio0.4350.748
Query Time (count())0.038 sec.0.046 sec.

Results are based on a single-node ClickHouse instance; actual performance may vary.

  • UUIDs simplify ID management in distributed systems but add minor overhead in storage and insert performance.
  • Numeric IDs (UInt64) remain faster for sequential workloads or analytics tables with ordered inserts.
  • The optimal choice depends on your data model – for event pipelines or decentralized ingestion, UUIDs are worth it.
  • Use UUID as a primary key, not necessarily as a sorting key.
  • Consider composite sorting keys like (event_time, event_id) for better compression.
  • Benchmark using your actual data scale before finalizing schema decisions.

Example:

CREATE TABLE events
(
    event_id UUID,
    event_time DateTime,
    user_id UInt64,
    event_type String
)
ENGINE = MergeTree()
ORDER BY (event_time, event_id);

Choosing the right primary key in ClickHouse isn’t about following trends – it’s about understanding trade-offs.
UUIDs bring flexibility and scalability, especially in distributed systems, but sequential numeric keys still win in pure performance and compression.

Balancing both gives you the best of both worlds.

At Quantrail Data, we work closely with teams to make ClickHouse adoption seamless – from smooth deployments and guided migrations to performance benchmarking and scalable data pipelines.

We’ve seen these improvements in action, helping teams unlock bare-metal-level performance for real-time analytics – as shared in our Success Story: Quantrail Baremetal.

Whether you’re optimizing an existing setup or planning a migration, our goal is to make your ClickHouse journey efficient, reliable, and ready for production.

👉 Get in touch to see how we can help you make the most of ClickHouse.

ClickHouse Official Documentation – UUID Data Type
ClickHouse MergeTree Engine Overview