libct: prepareCgroupFD: fall back to container init cgroup #5101
Open
kolyshkin wants to merge 3 commits into opencontainers:main
Conversation
Contributor (Author)
TODO: add an integration test case.
Separate the initProcessCgroupPath code out of addIntoCgroupV2, to be used by the next patch. While at it, describe the new scenario in which the container's configured cgroup might not be available.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
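For illustration, here is a minimal sketch of what a helper like initProcessCgroupPath might do, assuming the cgroup v2 unified hierarchy is mounted at /sys/fs/cgroup: resolve a process's current cgroup by reading /proc/&lt;pid&gt;/cgroup. The function name and layout here are hypothetical, not runc's actual code.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// procCgroupPath (hypothetical) returns the cgroup v2 directory that
// the given pid currently belongs to. On the unified hierarchy,
// /proc/<pid>/cgroup contains a single entry of the form "0::/path".
func procCgroupPath(pid int) (string, error) {
	data, err := os.ReadFile(fmt.Sprintf("/proc/%d/cgroup", pid))
	if err != nil {
		return "", err
	}
	for _, line := range strings.Split(strings.TrimSpace(string(data)), "\n") {
		if path, ok := strings.CutPrefix(line, "0::"); ok {
			return "/sys/fs/cgroup" + path, nil
		}
	}
	return "", fmt.Errorf("no cgroup v2 entry found for pid %d", pid)
}
```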
Previously, when prepareCgroupFD could not open the container's cgroup (as configured in config.json and saved to state.json), it returned a fatal error, since we presumed a container can't exist without its own cgroup.

Apparently, it can. When a container is configured without cgroupns (i.e. it uses the host's cgroups) and /sys/fs/cgroup is mounted read-write, a rootful container's init can move itself to an entirely different cgroup (even a new one it just created), after which the original container cgroup is removed by the kernel (or systemd?) as it has no processes left. By the way, from the systemd point of view, the container is gone. And yet it is still there, and users want runc exec to work!

And it worked, thanks to the "let's try container init's cgroup" fallback added by commit c91fe9a ("cgroup2: exec: join the cgroup of the init process on EBUSY"). The fallback was added for an entirely different reason, but it happened to work in this very case, too.

This behavior was broken with the introduction of CLONE_INTO_CGROUP support.

While it is debatable whether a container moving itself into a different cgroup is a valid scenario, this very setup is used by e.g. buildkitd running in a privileged Kubernetes container (see issue #5089).

To restore the way things are expected to work, add the same "try container init's cgroup" fallback to prepareCgroupFD.

Fixes: #5089.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
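A rough sketch of the fallback idea, under the same assumptions as the previous snippet (cgroup v2 at /sys/fs/cgroup). It opens the configured cgroup directory (the fd that would later be passed to clone3 with CLONE_INTO_CGROUP) and, if that fails, retries with the cgroup the container's init actually lives in. All identifiers are illustrative; this is not runc's real prepareCgroupFD.

```go
// initCgroupDir resolves the cgroup v2 directory of a pid; see the
// previous sketch for a commented version of the same lookup.
func initCgroupDir(pid int) (string, error) {
	data, err := os.ReadFile(fmt.Sprintf("/proc/%d/cgroup", pid))
	if err != nil {
		return "", err
	}
	for _, line := range strings.Split(strings.TrimSpace(string(data)), "\n") {
		if p, ok := strings.CutPrefix(line, "0::"); ok {
			return "/sys/fs/cgroup" + p, nil
		}
	}
	return "", fmt.Errorf("no cgroup v2 entry for pid %d", pid)
}

// openCgroupFD (hypothetical) opens the cgroup directory for
// CLONE_INTO_CGROUP, falling back to init's current cgroup when the
// configured one no longer exists.
func openCgroupFD(configuredPath string, initPid int) (*os.File, error) {
	dir, err := os.Open(configuredPath)
	if err == nil {
		return dir, nil
	}
	// The configured cgroup may have been removed after the container's
	// init moved itself elsewhere; try where init actually is now.
	fallback, ferr := initCgroupDir(initPid)
	if ferr != nil {
		return nil, err // report the original error
	}
	return os.Open(fallback)
}
```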
Force-pushed from 1b7acc8 to 6397833.
Contributor (Author)
Added a test case. Without the fix, it fails, as shown in #5102.
Add a test case to reproduce runc issue 5089.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
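To make the reproduced scenario concrete, here is a hypothetical sketch of what the container's init does in the issue 5089 setup: with no cgroupns and /sys/fs/cgroup mounted read-write, a rootful init creates a fresh cgroup and moves itself into it, leaving the configured container cgroup empty (and soon removed). The cgroup name is made up for illustration; the actual integration test may be structured quite differently.

```go
package main

import (
	"os"
	"strconv"
)

func main() {
	newCgroup := "/sys/fs/cgroup/escaped" // illustrative name
	if err := os.Mkdir(newCgroup, 0o755); err != nil && !os.IsExist(err) {
		panic(err)
	}
	// Writing our own PID to cgroup.procs moves this process there,
	// emptying the container's original cgroup.
	pid := strconv.Itoa(os.Getpid())
	if err := os.WriteFile(newCgroup+"/cgroup.procs", []byte(pid), 0o644); err != nil {
		panic(err)
	}
	// From here on, "runc exec" can no longer open the configured
	// cgroup and needs the init-cgroup fallback to succeed.
	select {} // keep init alive so the container stays running
}
```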