Age | Commit message | Author
2024-02-14 | Update kafel [upstream-master] | Wiktor Garbacz
2023-12-02 | make indent | Robert Swiecki
2023-11-28 | Fix deleted brace | happyCoder92
2023-11-28 | Make util::gerlimit compatible with MSAN | happyCoder92
2023-11-19 | Merge pull request #227 from philwo/philwo-1 | robertswiecki
Fix typo (SIKGILL -> SIGKILL)
2023-11-19 | Fix typo (SIKGILL -> SIGKILL). | Philipp Wollermann
2023-10-25 | config: use proper namespace for SetLogHandler | Robert Swiecki
2023-10-22 | net: initialize some structs with {} | Robert Swiecki
2023-10-22 | pid: more logging | Robert Swiecki
2023-10-21 | sandbox: better logging | Robert Swiecki
2023-10-21 | convert strcmp() to util::StrEq | Robert Swiecki
2023-10-20 | Use nullptr where appropriate | Robert Swiecki
2023-10-10 | util: implement rLimName() to use in cmdline/config | Robert Swiecki
2023-10-05 | .clangd: Add -Ikafel/include for kafel | Robert Swiecki
2023-10-04 | update kafel | Wiktor Garbacz
2023-10-04 | .cland: add include to libnl3 | Robert Swiecki
2023-10-04 | cgroups2: make a function declaration-less by moving it earlier | Robert Swiecki
2023-10-03 | .clangd: remove unnecessary empty line | Robert Swiecki
2023-10-03 | mnt: reformat messages for PLOG | Robert Swiecki
2023-10-02 | cmdline: log to stdout if -h or --help was used | Robert Swiecki
2023-10-01 | config: adjust identifiers, so they don't repeat config::config.. in method names | Robert Swiecki
2023-10-01 | .clangd: for nvim/clangd | Robert Swiecki
2023-10-01 | contain: fail of fcntl(F_GETFD) fails for a fd with something else then EBADFD | Robert Swiecki
2023-09-29 | contain: use prlimit64 instead of setrlimit64 which seems to be glibc-specific, so it compiles with musl too | Robert Swiecki
2023-09-22 | .clang-format: proto specific section | Robert Swiecki
2023-09-22 | .clang-format: use formatting based on .clang-format | Robert Swiecki
2023-09-22 | Makefile: indent .proto with the same cmd as *.cc *.h | Robert Swiecki
2023-09-22 | indent: use 'AlignEscapedNewlines: Right' to put backslashed in macros at ends of lines | Robert Swiecki
2023-09-22 | util: put QC() in () | Robert Swiecki
2023-09-21 | make indent | Robert Swiecki
2023-09-21 | Makefile: move to c++17 to use [[maybe_unused]] and remove DEFER (actually not used) from macros.h | Robert Swiecki
2023-09-19 | Makefile/indent: add AlwaysBreakBeforeMultilineStrings:false | Robert Swiecki
2023-09-19 | make indent: clang-format-18 | Robert Swiecki
2023-09-18 | Makefile: simplifications around config.pb.* | Robert Swiecki
2023-09-17 | caps: define new CAP_* unconditionally | Robert Swiecki
2023-09-15 | cmdline: constify structs | Robert Swiecki
2023-09-15 | util/signal: sort signals according to asm/signal.h | Robert Swiecki
2023-09-06 | pid: clear sigaction before use | Robert Swiecki
2023-08-29 | util: missing SIGPWR | Robert Swiecki
2023-08-18 | subproc: mark cloneFunc as [[noreturn]] | Robert Swiecki
2023-08-18 | subproc: support CLONE_CLEAR_SIGHAND | Robert Swiecki
2023-08-09 | subproc: display additional clone3 flags | Robert Swiecki
2023-06-25 | configs/: formatting | Robert Swiecki
2023-06-25 | configs/telegram: telegram is 64 bit only | Robert Swiecki
2023-06-24 | configs/telegram: a new config for the telegram-desktop | Robert Swiecki
2023-06-13 | formatting fix | okunz
2023-06-13 | Better output formatting for --help | okunz
2023-05-30 | Merge pull request #219 from disconnect3d/patch-3 | robertswiecki
cgroup2.cc: improve note about using Docker
2023-05-29 | cgroup2.cc: improve note about using Docker | Disconnect3d
Improve the error log message shown when nsjail fails to write to the `/sys/fs/cgroup/cgroup.subtree_control` file while setting up the cgroupv2 configuration. The previous message looked like this:

```
[E][2023-05-28T21:52:56+0000][8807] writeBufToFile():105 Couldn't write '7' bytes to file '/sys/fs/cgroup/cgroup.subtree_control' (fd='4'): Device or resource busy
[E][2023-05-28T21:52:56+0000][8807] enableCgroupSubtree():95 Could not apply '+memory' to cgroup.subtree_control in '/sys/fs/cgroup'. If you are running in Docker, nsjail MUST be the root process to use cgroups.
[E][2023-05-28T21:52:56+0000][8807] main():354 Couldn't setup parent cgroup (cgroupv2)
```

It could have been confusing because nsjail may have already been running as real root with full capabilities, e.g., when the user ran the container with the `--privileged --user 0:0` flags. In such a case, the issue is that Docker enters new pid, uts, network, ipc, mount and cgroup namespaces (but not user or time namespaces, fwiw) and I believe that if you do so after the cgroupv2 filesystem is mounted, the root of its filesystem hierarchy will start to render only a subtree, or, generally, a limited view of the cgroup. This can be seen below.

On the host, we can see the cgroup sub-hierarchies, and `cgroup.subtree_control` shows us the controllers properly:

```
# ls /sys/fs/cgroup/
cgroup.controllers      cgroup.threads         dev-mqueue.mount  memory.numa_stat               system.slice
cgroup.max.depth        cpu.pressure           init.scope        memory.pressure                user.slice
cgroup.max.descendants  cpuset.cpus.effective  io.cost.model     memory.stat
cgroup.procs            cpuset.mems.effective  io.cost.qos       sys-fs-fuse-connections.mount
cgroup.stat             cpu.stat               io.pressure       sys-kernel-config.mount
cgroup.subtree_control  dev-hugepages.mount    io.stat           sys-kernel-debug.mount
# cat /sys/fs/cgroup/cgroup.subtree_control
cpuset cpu io memory hugetlb pids rdma
```

However, even in a privileged container, we can't see the same:

```
# sudo docker run --rm -it --privileged nsjail ls /sys/fs/cgroup
cgroup.controllers      cpuset.cpus               memory.events.local
cgroup.events           cpuset.cpus.effective     memory.high
cgroup.freeze           cpuset.cpus.partition     memory.low
cgroup.kill             cpuset.mems               memory.max
cgroup.max.depth        cpuset.mems.effective     memory.min
cgroup.max.descendants  hugetlb.2MB.current       memory.numa_stat
cgroup.procs            hugetlb.2MB.events        memory.oom.group
cgroup.stat             hugetlb.2MB.events.local  memory.pressure
cgroup.subtree_control  hugetlb.2MB.max           memory.stat
cgroup.threads          hugetlb.2MB.rsvd.current  memory.swap.current
cgroup.type             hugetlb.2MB.rsvd.max      memory.swap.events
cpu.idle                io.latency                memory.swap.high
cpu.max                 io.max                    memory.swap.max
cpu.max.burst           io.pressure               pids.current
cpu.pressure            io.stat                   pids.events
cpu.stat                io.weight                 pids.max
cpu.weight              memory.current            rdma.current
cpu.weight.nice         memory.events             rdma.max
# sudo docker run --rm -it --privileged nsjail cat /sys/fs/cgroup/cgroup.subtree_control
#
```

Of course, the namespaces themselves can be seen by comparing them like this:

```
// HOST
# ls -la /proc/self/ns
total 0
dr-x--x--x 2 root root 0 May 28 22:17 .
dr-xr-xr-x 9 root root 0 May 28 22:17 ..
lrwxrwxrwx 1 root root 0 May 28 22:17 cgroup -> 'cgroup:[4026531835]'
lrwxrwxrwx 1 root root 0 May 28 22:17 ipc -> 'ipc:[4026531839]'
lrwxrwxrwx 1 root root 0 May 28 22:17 mnt -> 'mnt:[4026531841]'
lrwxrwxrwx 1 root root 0 May 28 22:17 net -> 'net:[4026531840]'
lrwxrwxrwx 1 root root 0 May 28 22:17 pid -> 'pid:[4026531836]'
lrwxrwxrwx 1 root root 0 May 28 22:17 pid_for_children -> 'pid:[4026531836]'
lrwxrwxrwx 1 root root 0 May 28 22:17 time -> 'time:[4026531834]'
lrwxrwxrwx 1 root root 0 May 28 22:17 time_for_children -> 'time:[4026531834]'
lrwxrwxrwx 1 root root 0 May 28 22:17 user -> 'user:[4026531837]'
lrwxrwxrwx 1 root root 0 May 28 22:17 uts -> 'uts:[4026531838]'

// CONTAINER
# sudo docker run --rm -it --privileged nsjail ls -la /proc/self/ns
total 0
dr-x--x--x 2 user user 0 May 28 22:17 .
dr-xr-xr-x 9 user user 0 May 28 22:17 ..
lrwxrwxrwx 1 user user 0 May 28 22:17 cgroup -> 'cgroup:[4026532381]'
lrwxrwxrwx 1 user user 0 May 28 22:17 ipc -> 'ipc:[4026532317]'
lrwxrwxrwx 1 user user 0 May 28 22:17 mnt -> 'mnt:[4026532315]'
lrwxrwxrwx 1 user user 0 May 28 22:17 net -> 'net:[4026532319]'
lrwxrwxrwx 1 user user 0 May 28 22:17 pid -> 'pid:[4026532318]'
lrwxrwxrwx 1 user user 0 May 28 22:17 pid_for_children -> 'pid:[4026532318]'
lrwxrwxrwx 1 user user 0 May 28 22:17 time -> 'time:[4026531834]'
lrwxrwxrwx 1 user user 0 May 28 22:17 time_for_children -> 'time:[4026531834]'
lrwxrwxrwx 1 user user 0 May 28 22:17 user -> 'user:[4026531837]'
lrwxrwxrwx 1 user user 0 May 28 22:17 uts -> 'uts:[4026532316]'
```

Anyway, passing `--cgroupns=host` solves this problem, which can be seen below:

```
# ls -la /proc/self/ns | grep cgroup
lrwxrwxrwx 1 root root 0 May 28 22:18 cgroup -> cgroup:[4026531835]
# sudo docker run --rm -it --cgroupns=host --privileged nsjail ls -la /proc/self/ns | grep cgroup
lrwxrwxrwx 1 user user 0 May 28 22:19 cgroup -> 'cgroup:[4026531835]'
# sudo docker run --rm -it --privileged nsjail ls -la /proc/self/ns | grep cgroup
lrwxrwxrwx 1 user user 0 May 28 22:19 cgroup -> 'cgroup:[4026532381]'
```
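
The manual comparison of `/proc/self/ns/cgroup` above could also be done programmatically. What follows is a minimal C++17 sketch and not code from nsjail: the helper names `readNsLink`, `parseNsInode` and `inHostCgroupNs` are hypothetical, and it assumes that the host's cgroup namespace has the inode `4026531835` seen in the host listing above.

```cpp
#include <cstdint>
#include <optional>
#include <string>

#include <unistd.h>

// Assumption: inode of the host (initial) cgroup namespace, taken from the
// host-side `ls -la /proc/self/ns` output quoted in the commit message.
constexpr uint64_t kHostCgroupNsInode = 4026531835ULL;

// Hypothetical helper: returns the target of a namespace symlink,
// e.g. "cgroup:[4026531835]" for "/proc/self/ns/cgroup".
std::optional<std::string> readNsLink(const std::string& path) {
    char buf[64];
    const ssize_t len = readlink(path.c_str(), buf, sizeof(buf));
    if (len <= 0) {
        return std::nullopt;
    }
    return std::string(buf, static_cast<size_t>(len));
}

// Hypothetical helper: extracts the inode number from a link target of the
// form "<name>:[<digits>]". Returns nullopt for anything else.
std::optional<uint64_t> parseNsInode(const std::string& link) {
    const std::size_t open = link.find(":[");
    if (open == std::string::npos || link.back() != ']') {
        return std::nullopt;
    }
    const std::string digits = link.substr(open + 2, link.size() - open - 3);
    // At most 19 decimal digits always fit into a uint64_t.
    if (digits.empty() || digits.size() > 19) {
        return std::nullopt;
    }
    uint64_t inode = 0;
    for (const char c : digits) {
        if (c < '0' || c > '9') {
            return std::nullopt;
        }
        inode = inode * 10 + static_cast<uint64_t>(c - '0');
    }
    return inode;
}

// Returns true only if the current process appears to share the host's
// cgroup namespace (under the inode assumption stated above).
bool inHostCgroupNs() {
    const auto link = readNsLink("/proc/self/ns/cgroup");
    if (!link) {
        return false;
    }
    const auto inode = parseNsInode(*link);
    return inode && *inode == kHostCgroupNsInode;
}
```

With the outputs quoted above, `inHostCgroupNs()` would be expected to return true on the host and in a container started with `--cgroupns=host`, and false in a container started without that flag.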
2023-05-28 | logs: respect getenv(NO_COLOR) | Robert Swiecki