cgroups — Linux Control Groups
Also Known As
TL;DR
Explanation
Control groups (cgroups) let the kernel organise processes into hierarchical groups and enforce resource controls per group. Each controller (cpu, memory, io, pids, etc.) accounts for usage and can limit it for the processes in a group. cgroups v1 mounts a separate hierarchy per controller under /sys/fs/cgroup/<ctrl>/, while cgroups v2 (the default on most modern distros) exposes every controller through a single unified hierarchy at /sys/fs/cgroup/. When you run 'docker run --memory 512m' on a v2 host, Docker creates a cgroup for the container and 536870912 (512 MiB in bytes) ends up in that cgroup's memory.max file; on v1 the equivalent file is memory.limit_in_bytes. Kubernetes resource requests and limits, systemd service memory caps, and PHP-FPM pool limits via systemd unit delegation all ultimately configure cgroups. Understanding cgroups is what separates people who can debug 'why is my container being OOM-killed' from people who just add more memory and hope.
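The arithmetic behind the 512m example can be sketched in Python. This is a minimal illustration, not Docker's actual implementation: parse_size and read_memory_max are hypothetical helper names, the suffix table covers only the b, k, m and g suffixes, and the cgroup directory is passed in by the caller because the real path depends on the runtime and cgroup driver.

```python
from pathlib import Path
from typing import Optional

# Binary multipliers for the size suffixes b, k, m, g (assumed set).
_UNITS = {"b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert a Docker-style size such as '512m' into bytes."""
    text = text.strip().lower()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)  # a bare number is treated as bytes


def read_memory_max(cgroup_dir: str) -> Optional[int]:
    """Return the cgroup v2 memory.max value in bytes, or None if unlimited.

    cgroup_dir is a hypothetical caller-supplied path to one cgroup
    directory somewhere under /sys/fs/cgroup/.
    """
    raw = (Path(cgroup_dir) / "memory.max").read_text().strip()
    return None if raw == "max" else int(raw)


if __name__ == "__main__":
    print(parse_size("512m"))  # 536870912
```

Comparing parse_size("512m") with read_memory_max(...) for a container's cgroup is a quick way to confirm that the limit you asked for is the limit the kernel is enforcing.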
Common Misconception
Why It Matters
Common Mistakes
- Confusing cgroups v1 and v2 paths: v1 has /sys/fs/cgroup/memory/, /sys/fs/cgroup/cpu/, etc.; v2 has a single /sys/fs/cgroup/ tree, where cgroup.controllers lists the controllers available to a cgroup and cgroup.subtree_control lists the ones enabled for its children.
- Setting memory.max without considering memory.swap.max: memory.max caps RAM only, so if swap is uncapped the group can keep growing into swap instead of being stopped at the limit.
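- Forgetting that a cgroup OOM kill usually takes down the whole container: the kernel kills a process inside the cgroup (or every process in it when memory.oom.group is 1), and if that is the container's main process the container exits. Set memory limits above typical spikes to avoid restart churn.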
- Reading /proc/<pid>/cgroup inside a container and expecting the host path: with a cgroup namespace the path is shown relative to the namespace root, so on v2 it is often just 0::/.
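- Confusing CPU shares (a relative weight: cpu.shares on v1, cpu.weight on v2) with CPU quota (a hard cap: cpu.cfs_quota_us on v1, cpu.max on v2): shares only matter when CPU is contested; a quota throttles the group even on an otherwise idle machine.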
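Several of the mistakes above come down to misreading kernel files, so a small parsing sketch may help. It assumes the documented formats: /proc/<pid>/cgroup lines of the form hierarchy-ID:controller-list:path, and a v2 cpu.max file containing "QUOTA PERIOD" (or "max PERIOD" when uncapped). The function names are illustrative, not part of any library.

```python
from typing import Dict, Optional


def parse_proc_cgroup(text: str) -> Dict[str, str]:
    """Map controller list -> cgroup path from /proc/<pid>/cgroup content.

    On v2 the single entry is '0::<path>', so its key is the empty string.
    """
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first two colons only; the path may contain colons.
        _hierarchy_id, controllers, path = line.split(":", 2)
        entries[controllers] = path
    return entries


def is_v2_only(text: str) -> bool:
    """True when the process sits in the unified hierarchy alone."""
    return list(parse_proc_cgroup(text)) == [""]


def cpu_limit(cpu_max: str) -> Optional[float]:
    """Turn cpu.max content into a CPU count; None means no hard cap."""
    quota, period = cpu_max.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


if __name__ == "__main__":
    print(parse_proc_cgroup("0::/"))   # what a namespaced container often sees
    print(cpu_limit("50000 100000"))   # half a CPU
    print(cpu_limit("max 100000"))     # weight only, no quota
```

A None result from cpu_limit means the group is governed only by its relative weight, which is the shares-versus-quota distinction from the last bullet.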
Avoid When
- You control the whole machine and trust every process on it: plain per-process ulimit settings may suffice.
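- Running on a non-Linux OS: cgroups are a Linux kernel feature. Native Windows containers rely on a different mechanism (job objects), and Docker on macOS runs Linux containers inside a Linux VM, so cgroups exist only inside that VM.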
When To Use
- You need to cap resource usage per process group: effectively a requirement on multi-tenant hosts.
- Debugging OOM kills or CPU throttling in containerised workloads.
- Building a container runtime or systemd-delegated service manager.
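For the debugging case, the cgroup v2 files memory.events and cpu.stat already hold the counters you need. The sketch below assumes their flat "key value" line format and the oom_kill, max, nr_periods and nr_throttled keys; parse_flat_keyed and diagnose are hypothetical names, and the file contents are passed in as strings so the caller decides which cgroup directory to read.

```python
from typing import Dict


def parse_flat_keyed(text: str) -> Dict[str, int]:
    """Parse 'key value' lines as found in memory.events and cpu.stat."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def diagnose(memory_events: str, cpu_stat: str) -> Dict[str, float]:
    """Summarise OOM kills and CPU throttling for one cgroup."""
    mem = parse_flat_keyed(memory_events)
    cpu = parse_flat_keyed(cpu_stat)
    periods = cpu.get("nr_periods", 0)
    throttled = cpu.get("nr_throttled", 0)
    return {
        # Processes in this cgroup killed by the OOM killer.
        "oom_kills": mem.get("oom_kill", 0),
        # Times usage hit memory.max and the kernel had to reclaim.
        "hit_memory_max": mem.get("max", 0),
        # Fraction of enforcement periods in which the group was throttled.
        "throttled_ratio": throttled / periods if periods else 0.0,
    }


if __name__ == "__main__":
    events = "low 0\nhigh 0\nmax 7\noom 1\noom_kill 1\n"
    stat = "usage_usec 900\nnr_periods 100\nnr_throttled 25\nthrottled_usec 400\n"
    print(diagnose(events, stat))
```

A non-zero oom_kills points at the memory limit, while a high throttled_ratio points at the CPU quota rather than at a lack of CPU on the host.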