ClickHouse is a high-performance columnar database for analytics at scale. It’s built for speed – processing billions of rows in milliseconds – and is widely used for dashboards, event analytics, and real-time metrics.
With the release of ClickHouse 25.4, the project continues to evolve as a general-purpose analytical platform – not just for high-performance SQL, but also for working with semi-structured data, vector embeddings, and cloud-native data lakes like Iceberg and DeltaLake.
This post will walk through what changed in version 25.4, why it matters, and what to watch out for if you’re upgrading or exploring ClickHouse for the first time.
What’s New in ClickHouse 25.4?
ClickHouse 25.4 brings improvements across several areas: correctness, performance, system stability, and support for modern data lake and vector workloads. Here’s what you need to know.
1. Breaking Changes
ClickHouse doesn’t often introduce breaking changes, but 25.4 has a few you should be aware of:
- Materialized Views Must Match Target Tables
When the setting allow_materialized_view_with_bad_select is set to false (which is now the default), all columns produced by a materialized view's SELECT must match the structure of the target table. Previously, mismatches were allowed and could lead to subtle bugs.
- Legacy MongoDB Integration Removed
The old built-in MongoDB engine has been removed. If you were relying on it, you'll need to switch to external ETL or the newer integrations.
- dateTrunc Now Handles Negative Timestamps Properly
Older versions of ClickHouse truncated negative Date or DateTime values (e.g., timestamps before 1970) incorrectly. This has now been fixed.
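To illustrate the stricter default, here is a minimal sketch of a materialized view whose SELECT columns line up with its target table. The table and column names are illustrative, not from the release notes:

```sql
-- Target table the materialized view writes into.
CREATE TABLE daily_hits
(
    day  Date,
    hits UInt64
)
ENGINE = SummingMergeTree
ORDER BY day;

-- With allow_materialized_view_with_bad_select = false (the new default),
-- the SELECT must produce columns matching the target table's structure:
-- here both "day" and "hits" exist with compatible types.
CREATE MATERIALIZED VIEW daily_hits_mv TO daily_hits AS
SELECT
    toDate(event_time) AS day,
    count()            AS hits
FROM events
GROUP BY day;
```

A view whose SELECT returned, say, a String where the target expects a UInt64 would previously have been accepted and failed later at insert time; under the new default it is rejected at CREATE time.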
2. Smarter Query Scheduling and CPU Management
ClickHouse now includes CPU slot scheduling to better manage query concurrency. This allows ClickHouse to make workload decisions based on system pressure – rejecting low-priority queries if the server is overloaded. This is particularly useful in shared environments or when multiple query workloads compete for resources.
Another improvement is query overload protection: the server will probabilistically reject new queries based on the ratio of CPU wait time to active time. These features aim to make ClickHouse more stable under heavy load.
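As a sketch of how the overload protection is tuned, the 25.4 changelog describes a pair of settings that define the CPU wait-to-busy ratio range over which queries start being rejected. The setting names below are taken from that changelog; verify them against your build before relying on them:

```sql
-- Below the min ratio, queries are never rejected for CPU pressure;
-- above the max ratio, they always are; in between, rejection is
-- probabilistic, scaling with the pressure.
SET min_os_cpu_wait_time_ratio_to_throw = 2;
SET max_os_cpu_wait_time_ratio_to_throw = 6;
```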
3. Improvements for Data Lake Use Cases (Iceberg and DeltaLake)
Data lake support continues to mature:
- Iceberg:
- Time travel queries by timestamp are now supported.
- An in-memory cache for Iceberg metadata was added to speed up reads.
- Trivial count() queries on Iceberg tables are now optimized.
- DeltaLake:
- Now supports partition pruning for better performance.
- Added support for reading from Azure Blob Storage.
- Improvements in schema handling, caching, and correctness.
These features make ClickHouse a viable option for querying open table formats without needing to move data.
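A time-travel read against an Iceberg table might look like the sketch below. The iceberg table function and the iceberg_timestamp_ms setting follow ClickHouse's Iceberg integration docs; the URL and timestamp are placeholders for your own setup:

```sql
-- Read the Iceberg table as of a point in time (milliseconds since epoch).
SELECT count()
FROM iceberg('https://s3.example.com/bucket/warehouse/events')
SETTINGS iceberg_timestamp_ms = 1712500000000;
```

Because trivial count() queries on Iceberg tables are now optimized, a query like this can often be answered from metadata rather than by scanning data files.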
4. Vector Search Gets More Efficient
ClickHouse has added more efficient handling for vector similarity search – commonly used for machine learning or semantic search.
A new cache system for deserialized vector indexes improves repeat query performance and reduces memory usage. The underlying memory allocation strategy was also improved to avoid 2x over-allocation, which was common in earlier versions.
This brings ClickHouse closer to being a practical platform for hybrid analytical + vector workloads.
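For context, a vector search setup in ClickHouse looks roughly like the sketch below. The vector_similarity index is still experimental and its parameters vary between versions, so treat the syntax as illustrative and check the current docs:

```sql
-- Vector indexes are experimental and must be enabled explicitly.
SET allow_experimental_vector_similarity_index = 1;

CREATE TABLE docs
(
    id        UInt64,
    embedding Array(Float32),
    -- HNSW index over the embedding column using Euclidean distance.
    INDEX vec_idx embedding TYPE vector_similarity('hnsw', 'L2Distance')
)
ENGINE = MergeTree
ORDER BY id;

-- Nearest-neighbour search: an ORDER BY distance + LIMIT query
-- is the pattern the index accelerates.
WITH [0.1, 0.2, 0.3] AS query_vec  -- illustrative; use your real embedding
SELECT id
FROM docs
ORDER BY L2Distance(embedding, query_vec)
LIMIT 10;
```

The new deserialized-index cache in 25.4 benefits exactly this pattern: repeated nearest-neighbour queries no longer pay the full cost of loading the index each time.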
5. clickhouse-local Now Supports Persistent Storage
In previous versions, clickhouse-local – a tool used for running ClickHouse without a server – would discard all data and schema on exit.
Now, you can provide a --path argument to retain databases between runs. This makes clickhouse-local more usable for quick experiments, lightweight pipelines, and debugging without spinning up a full ClickHouse cluster.
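In practice this looks like the following shell sketch (the directory name is arbitrary; clickhouse-local must be on your PATH):

```shell
# First run: create a table and insert data, persisted under ./ch_data.
clickhouse-local --path ./ch_data --query "CREATE TABLE t (x UInt32) ENGINE = MergeTree ORDER BY x"
clickhouse-local --path ./ch_data --query "INSERT INTO t VALUES (1), (2), (3)"

# A later, separate run sees the same table and data.
clickhouse-local --path ./ch_data --query "SELECT sum(x) FROM t"
```

Without --path, each invocation starts from an empty state, as in previous versions.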
6. Other Highlights
- Kafka integration now supports SASL authentication directly in CREATE TABLE, reducing dependency on external config files.
- SCRAM-SHA-256 authentication was added to the PostgreSQL compatibility protocol.
- The query condition cache is now enabled by default, improving repeated query performance.
- User-defined functions (UDFs) can now be marked as deterministic, allowing results to be cached when safe.
- Query plans are now serialized for distributed queries, improving efficiency in distributed deployments.
- Added a new function toInterval(value, unit) to simplify working with intervals.
- Many small improvements to JSON, S3/Azure queue engines, logging, and error handling.
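The new toInterval function is handy when the unit is dynamic: instead of choosing between toIntervalMinute, toIntervalHour, and so on, you pass the unit as a value. A quick sketch (check the docs for the accepted unit strings):

```sql
-- One function covers what previously required the
-- toIntervalMinute / toIntervalHour / toIntervalDay family.
SELECT now() + toInterval(90, 'minute') AS deadline;
```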
7. Bug Fixes and Stability Improvements
ClickHouse 25.4 includes a large number of bug fixes – especially in areas like:
- Materialized views not refreshing properly on replicas
- Vector search returning incorrect results in edge cases
- Crashes in S3/Azure queue storage under concurrent access
- Incorrect behavior when working with nullable and low-cardinality types
- Broken MongoDB table function behavior with filters and limits
If you’ve encountered subtle issues in recent versions, it’s worth scanning the changelog to see if they’ve been addressed.
Conclusion
ClickHouse 25.4 is not a revolutionary release, but it is a foundational one. It delivers cleaner, safer, and faster behavior across the board, with particular improvements in data lake interoperability, resource-aware query handling, and vector search performance.
If you're evaluating ClickHouse for your next analytics use case, whether as a backend for dashboards, a data lake query engine, or even a vector database, 25.4 is a stable and capable version to start with.
Looking for Expert ClickHouse Solutions?
At Quantrail, we offer a fully managed ClickHouse service, seamless migration assistance, and dedicated service contracts to help businesses optimize their analytics stack. Whether you need a hassle-free ClickHouse deployment, expert support, or help transitioning from another database, we’ve got you covered. Let’s talk about how we can accelerate your analytics!
References
https://clickhouse.com/docs/whats-new/changelog#254
