ClickHouse® Pod Templates and Resources on Kubernetes: CPU, Memory, and Scheduling

This is the thirteenth article in our series on running the ClickHouse® database on Kubernetes with the Altinity® Kubernetes Operator. We have secured the cluster. Now we control how the pods actually run: how much CPU and memory they get, where they are scheduled, and how their availability is protected. This is where a cluster goes from working to production-ready.

What a pod template is

Everything about how a ClickHouse pod runs is set through a pod template in the CHI. You have seen it already as the place where the container image lives. It is actually a full Kubernetes pod specification that the operator merges into the StatefulSets it builds, so anything you can set on a pod, you can set here: resources, scheduling, labels, volumes, and probes.
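As a rough sketch of how the pieces connect, a pod template is defined once under templates and then referenced by name. The fragment below is illustrative only: the CHI name demo is hypothetical, and making clickhouse-pod the default through spec.defaults is one way to apply it, assumed here for brevity. The cluster layout is omitted.

```yaml
apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
  name: demo                       # hypothetical CHI name
spec:
  defaults:
    templates:
      podTemplate: clickhouse-pod  # used by every pod unless a shard or replica overrides it
  templates:
    podTemplates:
      - name: clickhouse-pod       # the name referenced above
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:26.3
```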

Resource requests and limits: the most important setting

On a shared Kubernetes cluster, a pod with no resource settings can be starved by noisy neighbours or, worse, can consume a whole node and crash other workloads. You prevent this with requests and limits. A request is what the pod is guaranteed and what the scheduler uses to place it. A limit is the ceiling it may not exceed: a container that reaches its CPU limit is throttled, and one that exceeds its memory limit is killed. For a database you set both, generously, on every ClickHouse pod:

spec:
  templates:
    podTemplates:
      - name: clickhouse-pod
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:26.3
              resources:
                requests:
                  cpu: "2"
                  memory: 8Gi
                limits:
                  cpu: "4"
                  memory: 16Gi

This pod is guaranteed two cores and 8 GiB of memory, and may burst to four cores and 16 GiB. Sizing memory matters especially for ClickHouse, because large aggregations and joins use a lot of it. A good practice is to also cap ClickHouse's own memory settings somewhat below the pod's memory limit: max_server_memory_usage for the server as a whole, and max_memory_usage for each individual query. That way the database refuses an oversized query gracefully rather than being killed by Kubernetes for exceeding its limit.
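One way to express such caps in the CHI is sketched below, assuming the 16Gi pod limit from the example above. Note that max_memory_usage applies per query, while max_server_memory_usage bounds the server process as a whole. The specific byte values are illustrative assumptions, not recommendations.

```yaml
spec:
  configuration:
    settings:
      # Server-wide cap: 14 GiB, leaving headroom under the 16Gi pod limit (assumed value)
      max_server_memory_usage: 15032385536
    profiles:
      # Per-query cap for the default profile: 10 GiB (assumed value)
      default/max_memory_usage: 10737418240
```

With caps like these, a query that asks for too much memory fails with an error from ClickHouse well before the container approaches its limit.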

Placing pods on the right nodes

In production you often want database pods on specific machines, for example nodes with fast local disks or more memory, and away from general workloads. Two mechanisms do this. A nodeSelector schedules pods only onto nodes carrying a given label. Tolerations let pods run on nodes that have been reserved with a taint, so only your database lands there. Combine them to dedicate a node pool to ClickHouse:

spec:
  templates:
    podTemplates:
      - name: clickhouse-pod
        spec:
          nodeSelector:
            node-pool: clickhouse
          tolerations:
            - key: "dedicated"
              operator: "Equal"
              value: "clickhouse"
              effect: "NoSchedule"
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:26.3

Here pods schedule only onto nodes labelled node-pool: clickhouse, and they tolerate the dedicated=clickhouse taint that keeps everything else off those nodes. Together with the anti-affinity and zone rules from the scaling article, you get full control over placement.
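For the pod settings above to match anything, the nodes themselves need the corresponding label and taint. The fragment below sketches what such a node would look like; the node name worker-1 is hypothetical, and in practice these fields are usually set through your cloud provider's node pool configuration or with kubectl label and kubectl taint rather than by editing the Node object directly.

```yaml
apiVersion: v1
kind: Node
metadata:
  name: worker-1              # hypothetical node name
  labels:
    node-pool: clickhouse     # matched by the pod's nodeSelector
spec:
  taints:
    - key: dedicated          # matched by the pod's toleration
      value: clickhouse
      effect: NoSchedule      # pods without the toleration are not scheduled here
```

The label attracts ClickHouse pods to the node, and the taint repels everything else. You need both: a toleration alone only permits scheduling on tainted nodes, it does not require it.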

Protecting availability with a PodDisruptionBudget

Kubernetes sometimes needs to evict pods voluntarily, for example when an administrator drains a node for maintenance. Without a guardrail, a drain could take down too many replicas at once. A PodDisruptionBudget tells Kubernetes the minimum that must stay running. The good news is that the operator creates a PodDisruptionBudget for your cluster automatically, so during a node drain it will not voluntarily remove so many pods that a shard loses all its replicas. This is one more piece of correctness you get for free, and it is worth knowing it is there when you plan maintenance.
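For orientation, a PodDisruptionBudget is a small object of the following general shape. This is a generic sketch, not a copy of what the operator generates: the object name, the selector label, and the maxUnavailable value are all assumptions for illustration. Running kubectl get pdb in the cluster's namespace shows the real one.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: demo-cluster                    # hypothetical name
spec:
  maxUnavailable: 1                     # assumed value: at most one matching pod may be evicted at a time
  selector:
    matchLabels:
      clickhouse.altinity.com/chi: demo # assumed label key and value identifying the cluster's pods
```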

Labels, annotations, and probes

Pod templates also carry metadata. Custom labels and annotations are useful for cost allocation, for selecting pods in NetworkPolicies, and for monitoring systems that discover targets by annotation. You set them in the template's metadata:

spec:
  templates:
    podTemplates:
      - name: clickhouse-pod
        metadata:
          labels:
            team: analytics
          annotations:
            prometheus.io/scrape: "true"
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:26.3

The operator also configures sensible liveness and readiness probes for ClickHouse, so Kubernetes knows when a pod is alive and when it is ready to serve. You can override these in the pod template if you have special requirements, but the defaults are good for most clusters.
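If you do need to override them, probes go on the container in the pod template like any other Kubernetes probe. The sketch below assumes ClickHouse's HTTP interface on port 8123 and its /ping endpoint; the timing values are illustrative guesses for a node that is slow to start, not the operator's defaults.

```yaml
spec:
  templates:
    podTemplates:
      - name: clickhouse-pod
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:26.3
              livenessProbe:
                httpGet:
                  path: /ping
                  port: 8123
                initialDelaySeconds: 60   # assumed: allow time for startup
                periodSeconds: 10
                failureThreshold: 10      # assumed: tolerate brief stalls before restarting
              readinessProbe:
                httpGet:
                  path: /ping
                  port: 8123
                periodSeconds: 5
                failureThreshold: 3
```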

Different templates for different roles

Because a cluster can reference multiple pod templates, you can give different parts of it different shapes. A common pattern is a larger template for shards that handle heavy queries and a smaller one for nodes with a lighter role, or different templates per zone as we saw earlier. You define several templates and point each replica or shard at the one it should use, and the operator builds each pod accordingly.
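A minimal sketch of this pattern follows. The template names, cluster name, shard names, and resource figures are all hypothetical; the point is only that each shard names the pod template it should use.

```yaml
spec:
  configuration:
    clusters:
      - name: main                          # hypothetical cluster name
        layout:
          shards:
            - name: heavy                   # hypothetical shard for heavy queries
              replicasCount: 2
              templates:
                podTemplate: clickhouse-pod-large
            - name: light                   # hypothetical shard with a lighter role
              replicasCount: 2
              templates:
                podTemplate: clickhouse-pod-small
  templates:
    podTemplates:
      - name: clickhouse-pod-large
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:26.3
              resources:
                requests: { cpu: "8", memory: 32Gi }
                limits:   { cpu: "8", memory: 32Gi }
      - name: clickhouse-pod-small
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:26.3
              resources:
                requests: { cpu: "2", memory: 8Gi }
                limits:   { cpu: "2", memory: 8Gi }
```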

Sizing guidance for beginners

If you are unsure where to start: give each ClickHouse pod whole CPU cores rather than fractions, set memory requests and limits equal for predictable behaviour on critical clusters, leave headroom on the node beyond the pod limits for the operating system and page cache, and always cap ClickHouse's own memory settings (max_server_memory_usage for the server, max_memory_usage per query) below the pod's memory limit. Measure with the monitoring you set up earlier, then adjust. Right-sizing is iterative, and good observability makes it straightforward.
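As a starting point, the guidance above might translate into a resources block like the following. The figures are placeholders to be replaced by your own measurements. When every container in a pod has CPU and memory requests equal to its limits, Kubernetes assigns the pod the Guaranteed QoS class, which makes it the last candidate for eviction under node memory pressure.

```yaml
spec:
  templates:
    podTemplates:
      - name: clickhouse-pod
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:26.3
              resources:
                requests:
                  cpu: "4"        # whole cores, placeholder value
                  memory: 16Gi    # placeholder value
                limits:
                  cpu: "4"        # equal to the request
                  memory: 16Gi    # equal to the request
```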

What is next

Your pods are sized, placed, and protected. We now have every ingredient for a real deployment. In the next article we assemble them into a single production-grade ClickHouse cluster: multiple shards and replicas, a three-node Keeper, durable storage, Secrets, TLS, placement rules, and resources, all in one manifest.
