One of the first surprises engineers encounter when working with ClickHouse® is that updates don’t behave the same way they do in traditional transactional databases.
In PostgreSQL or MySQL, an UPDATE statement modifies a row directly. In ClickHouse®, things work differently because the database is optimized for analytical workloads rather than row-level transactions.
This is where the ReplacingMergeTree engine comes in.
ReplacingMergeTree is one of the most commonly used table engines in ClickHouse® because it provides a practical way to handle updates, corrections, and deduplication while preserving the performance characteristics that make ClickHouse® fast.
In this article, we’ll explore how ReplacingMergeTree works internally, how deduplication actually happens, common misconceptions, and the best practices for using it effectively.
Why ReplacingMergeTree Exists
ClickHouse® is built for high-throughput analytical workloads.
Instead of modifying rows in place, ClickHouse® writes data into immutable parts.
This design provides several advantages:
- Fast inserts
- Efficient compression
- Parallel processing
- High analytical performance
However, immutable storage creates a challenge:
How do you handle updated records?
ReplacingMergeTree solves this problem by allowing multiple versions of the same logical row to exist temporarily, with older versions eventually being removed during merge operations.
Understanding the Core Idea
Consider a table:
CREATE TABLE users
(
id UInt64,
name String
)
ENGINE = ReplacingMergeTree
ORDER BY id;
Insert an initial record:
INSERT INTO users VALUES
(1, 'Alice');
Later, a correction arrives:
INSERT INTO users VALUES
(1, 'Alice Smith');
At this point, both rows physically exist inside the table.
1 Alice
1 Alice Smith
This often surprises new users.
ReplacingMergeTree does not immediately remove duplicates.
Instead, deduplication occurs during background merge operations.
What Actually Gets Replaced?
A common misconception is that ReplacingMergeTree automatically updates rows.
It doesn’t.
The engine simply keeps the newest version of rows that share the same sorting key.
The key phrase here is:
Rows are considered duplicates based on the ORDER BY clause.
Example:
ENGINE = ReplacingMergeTree
ORDER BY id
Rows with the same id are treated as duplicate versions of the same logical record.
If the sorting key is incorrect, deduplication may not work as expected.
Choosing the right sorting key is therefore critical.
The Role of Background Merges
ClickHouse® continuously merges parts in the background.
During these merges:
- Multiple parts are combined.
- Duplicate rows are identified.
- Older versions are removed.
- New merged parts are written.
Until a merge occurs, duplicates remain visible.
Example:
SELECT *
FROM users;
May return:
1 Alice
1 Alice Smith
even though only one version should ultimately survive.
This behavior is completely normal.
Understanding FINAL
To see deduplicated results immediately, ClickHouse® provides the FINAL modifier.
Example:
SELECT *
FROM users
FINAL;
Result:
1 Alice Smith
FINAL forces ClickHouse® to perform deduplication during query execution.
This guarantees correctness but introduces additional work.
Because FINAL requires extra processing, it should be used carefully on large datasets.
Why FINAL Exists
Many newcomers assume that duplicates indicate a problem.
In reality, duplicates are often expected.
ReplacingMergeTree prioritizes write performance by allowing duplicates to exist temporarily.
FINAL provides a mechanism for obtaining fully deduplicated results before background merges complete.
Think of FINAL as:
“Show me the result as if all merges had already finished.”
Adding a Version Column
In many real-world scenarios, determining the newest row requires more than insertion order.
ReplacingMergeTree supports an optional version column.
Example:
CREATE TABLE users
(
id UInt64,
name String,
version UInt64
)
ENGINE = ReplacingMergeTree(version)
ORDER BY id;
Insert records:
INSERT INTO users VALUES
(1, 'Alice', 1);
INSERT INTO users VALUES
(1, 'Alice Smith', 2);
When duplicates are merged, the row with the highest version value survives.
This provides deterministic behavior and is generally recommended for production workloads.
A Real-World Example
Imagine collecting customer records from multiple systems.
Corrections may arrive throughout the day:
Customer A - Version 1
Customer A - Version 2
Customer A - Version 3
ReplacingMergeTree allows all versions to be ingested immediately.
Eventually:
Customer A - Version 3
becomes the surviving record.
This pattern is common in:
- CDC pipelines
- Event-driven architectures
- Data warehousing
- Customer master data systems
- Analytics platforms
Common Misconceptions
“Duplicates Mean Something Is Broken”
Not necessarily.
Duplicates are often expected until background merges occur.
“ReplacingMergeTree Performs Updates”
It does not update rows directly.
It replaces duplicate versions during merge operations.
“FINAL Is Always Required”
Not always.
Many analytical workloads can tolerate eventual deduplication.
“OPTIMIZE FINAL Fixes Everything”
Running:
OPTIMIZE TABLE users FINAL;
forces merges immediately.
However, repeatedly running OPTIMIZE FINAL is rarely a good long-term strategy and can create unnecessary resource pressure.
Designing Good ReplacingMergeTree Tables
A typical production design might look like:
CREATE TABLE orders
(
order_id UInt64,
customer_id UInt64,
amount Float64,
updated_at DateTime,
version UInt64
)
ENGINE = ReplacingMergeTree(version)
ORDER BY order_id;
Why it works:
order_ididentifies logical duplicates.versiondetermines which row survives.- Queries remain efficient.
- Updates are handled naturally.
Best Practices
When using ReplacingMergeTree:
- Choose the sorting key carefully.
- Use a version column whenever possible.
- Understand that deduplication is eventually consistent.
- Avoid assuming inserts immediately replace existing rows.
- Use FINAL only when correctness requires it.
- Monitor merge activity.
- Design around actual query patterns.
When Should You Use ReplacingMergeTree?
ReplacingMergeTree is a good choice when:
- Records may be updated.
- Duplicate versions can temporarily coexist.
- Eventual consistency is acceptable.
- CDC data is being ingested.
- Data corrections occur regularly.
It may not be ideal when:
- Immediate consistency is required.
- Frequent row-level updates dominate workloads.
- Transactional behavior is expected.
Final Thoughts
ReplacingMergeTree is one of the most powerful and frequently misunderstood engines in ClickHouse®.
Many engineers initially expect it to behave like a traditional UPDATE mechanism, only to discover that duplicates remain visible until background merges occur.
Understanding this behavior is the key to using the engine effectively.
Rather than updating rows directly, ReplacingMergeTree embraces ClickHouse®’s immutable storage model and uses background merges to reconcile multiple versions of the same record.
Once you understand the relationship between sorting keys, merges, version columns, and FINAL, the engine becomes significantly easier to reason about and operate in production environments.
Exploring ClickHouse® for Your Analytics?
At Quantrail Data, we help teams run ClickHouse® reliably for real-time analytics – from Kubernetes deployments and migrations to performance tuning in production.
We see these challenges firsthand while supporting demanding analytics workloads. In one recent engagement, a customer achieved near bare-metal performance with ClickHouse® in production – a story we’ve shared here:
Success Story: Quantrail Bare-Metal ClickHouse® Deployment
If you’re evaluating ClickHouse® or trying to get more out of an existing setup, we’re happy to share practical lessons from real-world deployments.
Contact
Quantrail Data
References
ClickHouse® Documentation – ReplacingMergeTree
ClickHouse® Documentation – MergeTree Family
ClickHouse® Introduction to ReplacingMergeTree
