ClickHouse 25.2 Release - Quantrail Data

ClickHouse continues to push the boundaries of performance and efficiency with its latest 25.2 release, which introduces several key enhancements to data storage, query execution, and usability. The update adds new features and refines existing ones, making ClickHouse more versatile for analytical workloads.

In this blog, we’ll dive into the most exciting new features, explore how they enhance database performance, and highlight how they can benefit your data pipelines. Let’s take a closer look!

New Features

ClickHouse 25.2 introduces several exciting improvements, enhancing both performance and usability. From expanded JSON support to advanced storage policies, this release makes data management more efficient. Let’s explore some of the most impactful updates.

1. Enhanced JSON Handling with Nullable Support

ClickHouse now supports Nullable(JSON), allowing more flexibility when working with semi-structured data. A JSON column can now hold NULL, so a row whose document is missing can be stored as NULL, which simplifies schema management. This is especially helpful for developers handling inconsistent data sources.

2. Smarter Storage Policies with Read-Only and Read-Write Disk Combinations

The new storage policy update lets you combine read-only and read-write disks within the same volume. ClickHouse prefers the writable disk for inserts while still allowing queries to read from the entire volume. This Copy-on-Write (CoW) storage policy helps balance performance and storage cost.
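The write preference described above can be pictured with a small Python model. This is only an illustrative sketch, not ClickHouse code: the `Disk` and `Volume` classes, the disk names, and the part names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """Hypothetical model of a disk; not a ClickHouse API."""
    name: str
    readonly: bool
    parts: list = field(default_factory=list)


@dataclass
class Volume:
    """Hypothetical volume that mixes read-only and read-write disks."""
    disks: list

    def insert(self, part: str) -> str:
        # New parts go to the first writable disk; read-only disks are skipped.
        for disk in self.disks:
            if not disk.readonly:
                disk.parts.append(part)
                return disk.name
        raise RuntimeError("no writable disk in volume")

    def read_all(self) -> list:
        # Queries see parts from every disk, writable or not.
        return [part for disk in self.disks for part in disk.parts]


volume = Volume([
    Disk("shared_readonly", readonly=True, parts=["part_1", "part_2"]),
    Disk("local_rw", readonly=False),
])
target = volume.insert("part_3")   # lands on the writable disk
visible = volume.read_all()        # includes parts from both disks
```

In this model the read-only disk is never modified, while new data accumulates on the writable disk and remains visible alongside the older parts.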

3. Instant Database Restoration with DatabaseBackup Engine

Restoring backups has never been faster! The new DatabaseBackup engine lets you instantly attach tables and databases from backups, eliminating lengthy recovery times. If you manage large datasets, this feature ensures minimal downtime and quick disaster recovery.

4. Interactive Database Navigation in Web UI

The ClickHouse Web UI now includes interactive database navigation, making schema exploration easier. Users can browse databases and tables directly in the Play UI instead of manually running queries to find them.

5. Parquet Bloom Filters for Faster Query Performance

Bloom filters improve query performance on large datasets, and ClickHouse now writes them in Parquet files by default. Readers can use the filters to skip data that cannot contain a searched value, which reduces unnecessary reads and makes data retrieval more efficient.
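To see why a Bloom filter lets a reader skip data, here is a toy version in Python. It is a sketch of the general technique only: Parquet defines its own Bloom filter layout and hash function, and the filter size, hash choice, and row groups below are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; sizes and hashing are illustrative assumptions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, value: str):
        # Derive several bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value: str) -> None:
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


# Hypothetical row groups, each with its own filter over one column.
row_groups = [["alice", "bob"], ["carol", "dave"], ["erin", "frank"]]
filters = []
for group in row_groups:
    bf = BloomFilter()
    for value in group:
        bf.add(value)
    filters.append(bf)


def groups_to_read(value: str) -> list:
    """Return indexes of row groups that cannot be ruled out."""
    return [i for i, bf in enumerate(filters) if bf.might_contain(value)]
```

A lookup for "carol" always includes row group 1, because a Bloom filter never reports a stored value as absent. Other row groups are skipped unless the filter gives a false positive, which is the trade-off that keeps the filter small.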

6. Attach Tables Without a Database Layer

ClickHouse now lets you attach MergeTree tables stored on Web, S3, and other virtual filesystems without requiring a database layer. This simplifies workflows when dealing with external storage solutions.

7. Track Query Execution Time More Accurately

A new function, initialQueryStartTime, returns the start time of the initial query. The value is the same on every shard of a distributed query, which gives consistent timing and is particularly useful for performance tuning and debugging.

Enhancements

ClickHouse 25.2 delivers substantial performance optimizations, more efficient query execution, and better system management. These updates improve JSON processing, optimize joins, and enhance error handling, making ClickHouse more powerful and user-friendly.

Faster JSON Column Processing from S3

Reading JSON columns from S3 is now significantly faster, thanks to:

- Prefetching subcolumn prefixes to speed up deserialization.
- Caching previously deserialized prefixes to avoid redundant work.
- Parallel deserialization of subcolumns, reducing processing time.

These optimizations result in 4x faster performance for full-table scans and 10x faster queries when using LIMIT 10.

Query Execution Enhancements

Optimized JOIN conditions: Filters in JOIN ON clauses now push down when possible, reducing unnecessary data scans.
Reduced memory usage in window functions: Some window functions now consume less memory, improving efficiency.
Parallel partition fetching: ALTER TABLE FETCH PARTITION now fetches data in parallel, controlled by max_fetch_partition_thread_pool_size, improving replication performance.
More efficient TTL processing: MATERIALIZE TTL now reads only necessary columns to drop expired parts, reducing unnecessary I/O.
Fixed contention in parallel_hash: When max_rows_in_join and max_bytes_in_join are set to 0, contention is now eliminated, improving performance.
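The JOIN ON pushdown listed above rests on a simple equivalence, sketched here in Python. The tables, column names, and both functions are hypothetical, the example assumes an inner join with a unique key on the right side, and it does not reflect how the ClickHouse planner is implemented.

```python
orders = [
    {"order_id": 1, "user_id": 10},
    {"order_id": 2, "user_id": 11},
    {"order_id": 3, "user_id": 12},
]
users = [
    {"user_id": 10, "country": "DE"},
    {"user_id": 11, "country": "US"},
    {"user_id": 12, "country": "DE"},
]


def join_then_filter(left, right):
    # Evaluate the whole ON condition for every candidate pair of rows.
    return [
        {**l, **r}
        for l in left
        for r in right
        if l["user_id"] == r["user_id"] and r["country"] == "DE"
    ]


def filter_then_join(left, right):
    # Push the single-table predicate below the join, then join by key.
    # Assumes user_id is unique on the right side.
    filtered = {r["user_id"]: r for r in right if r["country"] == "DE"}
    return [
        {**l, **filtered[l["user_id"]]}
        for l in left
        if l["user_id"] in filtered
    ]


naive = join_then_filter(orders, users)
pushed = filter_then_join(orders, users)
```

Both functions return the same rows, but the second one discards non-matching `users` rows before the join, so fewer rows take part in the join itself.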

Storage and Replication Improvements

Keeper performance boost: Keeper now skips digest calculation on commit, which significantly improves write performance. The digest is still calculated when preprocessing requests.
Shard naming in cluster configurations: Users can now assign custom names to shards, making cluster management more intuitive.
Better control over freezing data: ALTER TABLE ... FREEZE ... queries can now be canceled via KILL QUERY or automatically via max_execution_time, preventing stuck operations.
Stronger permissions for SYSTEM DROP REPLICA: Users now receive proper error messages instead of silent failures when lacking necessary permissions.

System Insights and Debugging Improvements

Query ID tracking in cache: The query cache now includes a query ID column for better query monitoring.
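Normalized query hash in system.processes: system.processes now includes the normalized query hash. Although the value can be computed on the fly with the normalizedQueryHash function, storing it in the system table prepares for future query optimization improvements.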
Better error messages:
- Large queries now generate clearer error messages, ensuring reasons for failures are not lost in long query fragments.
- UTF-8 characters in error messages are now correctly handled.
- Excessive quoting of query fragments is fixed, making error messages more readable.
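For readers unfamiliar with the normalized query hash mentioned above, the idea is that queries differing only in literal values map to the same hash. The Python below is a simplified sketch of that idea: the regular expressions and the choice of SHA-256 are assumptions made for illustration and are not the algorithm ClickHouse uses.

```python
import hashlib
import re


def normalize_query(query: str) -> str:
    """Replace literals with '?' (a simplified, assumed normalization)."""
    query = re.sub(r"'(?:[^'\\]|\\.)*'", "?", query)   # string literals
    query = re.sub(r"\b\d+(?:\.\d+)?\b", "?", query)    # numeric literals
    return re.sub(r"\s+", " ", query).strip()           # collapse whitespace


def normalized_hash(query: str) -> str:
    # Any stable hash works for the illustration; this is not ClickHouse's hash.
    return hashlib.sha256(normalize_query(query).encode()).hexdigest()[:16]


q1 = "SELECT * FROM events WHERE user_id = 42 AND name = 'login'"
q2 = "SELECT * FROM events  WHERE user_id = 7 AND name = 'logout'"
q3 = "SELECT count() FROM events WHERE user_id = 42"
```

Here `q1` and `q2` share a hash because they differ only in literals and spacing, while `q3` has a different shape and hashes differently. Grouping running queries by such a hash makes it easier to spot which query pattern dominates a workload.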

Other Notable Updates

Iceberg storage engine security: Sensitive catalog credentials are now hidden from configuration settings.
Improved memory tracking: A new config option, memory_worker_correct_memory_tracker, enables periodic correction of the internal memory tracker using information read by a background thread.
More precise backup timestamps: Backup start and end times are now stored with microsecond precision for better auditability.
Refined client-server setting behavior: The new client setting apply_settings_from_server replaces send_settings_to_client. It controls whether client-side code, such as parsing, should use settings defined on the server.

Final Thoughts

ClickHouse 25.2 is a significant step forward, offering a mix of performance gains, operational efficiency, and security enhancements. Faster JSON processing, smarter query execution, and better debugging tools ensure that developers and data engineers can work more efficiently with large-scale analytics workloads. The improvements in cluster management, replication, and error handling make ClickHouse more robust and reliable.

As the ecosystem continues to evolve, these refinements reinforce ClickHouse’s position as a top choice for real-time analytics. Whether you’re optimizing queries, managing large datasets, or ensuring seamless cluster operations, this release brings meaningful improvements that enhance both speed and usability.

References

https://clickhouse.com/docs/whats-new/changelog#-clickhouse-release-252-2025-02-27

Photo by Porapak Apichodilok: https://www.pexels.com/photo/brown-gift-box-360624/