ClickHouse – Dynamic Data Type

Introduction

ClickHouse excels in high-performance analytics, but managing mixed data types in a single column has always been tricky. Enter the Dynamic data type – an experimental feature until ClickHouse 25.2 and officially stable in 25.3 LTS. This game-changing feature lets you store multiple data types in one column without a fixed schema, making it perfect for handling unstructured and semi-structured data with ease.

Defining a Dynamic Column in ClickHouse

A Dynamic column can store multiple types and automatically adapt to the data it receives. The syntax for defining a Dynamic column is:

<column_name> Dynamic(max_types=N)
  • N is an optional parameter (0 to 254) defining the number of distinct types stored as separate subcolumns before falling back to binary storage.
  • If omitted, the default max_types is 32.

Table Creation

CREATE TABLE test 
(d Dynamic)
ENGINE = Memory;

Inserting and Querying Data

The Dynamic type supports storing values of any type within a single column:

INSERT INTO test VALUES (NULL), (42), ('Hello, World!'), ([1, 2, 3]);
SELECT d, dynamicType(d) FROM test;
┌─d─────────────┬─dynamicType(d)─┐
│ ᴺᵁᴸᴸ          │ None           │
│ 42            │ Int64          │
│ Hello, World! │ String         │
│ [1,2,3]       │ Array(Int64)   │
└───────────────┴────────────────┘

You can also use CAST to store values dynamically:

SELECT 'Hello, World!'::Dynamic as d, dynamicType(d);

Reading Nested Data Types as Subcolumns

ClickHouse allows extracting specific types from Dynamic columns using the subcolumn syntax:

SELECT d.String, d.Int64, d.`Array(Int64)` FROM test;

Alternatively, the dynamicElement() function can be used:

SELECT dynamicElement(d, 'String'), dynamicElement(d, 'Int64') FROM test;

Example Output

┌─d.String──────┬─d.Int64─┬─d.Array(Int64)─┐
│ Hello, World! │    ᴺᵁᴸᴸ │ []             │
│ ᴺᵁᴸᴸ          │      42 │ []             │
│ ᴺᵁᴸᴸ          │    ᴺᵁᴸᴸ │ [1,2,3]        │
└───────────────┴─────────┴────────────────┘

Performance Considerations and Best Practices

  • Use max_types wisely to balance performance and flexibility.
  • Extract specific subcolumns when querying to optimize read performance.
  • Avoid excessive type variations in a single column to reduce storage overhead.

Conclusion

The Dynamic data type in ClickHouse is a game-changer for handling mixed-type data in a single column. With subcolumns and dynamicType(), you can efficiently query diverse data while maintaining high performance. Now stable in ClickHouse 25.3, Dynamic unlocks new possibilities for flexible data modeling.

Looking to deploy ClickHouse on-prem, need self-managed solutions, expert support, or seamless migration services? Quantrail provides end-to-end ClickHouse expertise to help you scale with confidence..

Get in touch today!

References

https://clickhouse.com/docs/sql-reference/data-types/dynamic

Photo by MART PRODUCTION: https://www.pexels.com/photo/3-women-in-white-stockings-lying-on-floor-7319794/