Add daemonset to disable core dumps #7588
|
Merging this PR will trigger the following deployment actions. Support deployments
Staging deployments
Production deployments
|
The scheduler.alpha.kubernetes.io/critical-pod annotation is deprecated and has been replaced by priorityClassName. This daemonset already uses priorityClassName: system-node-critical, which is the proper way to mark critical pods. See https://kubernetes.io/docs/reference/labels-annotations-taints/#scheduler-alpha-kubernetes-io-critical-pod-deprecated
Adds a readiness probe that checks if kernel.core_pattern is still set to |/bin/false. The pod will be marked not ready if the setting changes, providing observability without active enforcement.
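For reviewers who want to reason about the probe's behaviour, here is a minimal Python sketch of the check it describes. It is an illustration only: the probe in this PR may well be a shell command, and the function names (`core_dumps_disabled`, `probe_exit_code`) are hypothetical. The sketch assumes the sysctl is read from the standard procfs path and that an unreadable file should count as not ready.

```python
from pathlib import Path

# Assumed default: the standard procfs location of the kernel.core_pattern sysctl.
CORE_PATTERN_PATH = "/proc/sys/kernel/core_pattern"
# The value the daemonset is described as setting.
DISABLED_PATTERN = "|/bin/false"


def core_dumps_disabled(path: str = CORE_PATTERN_PATH) -> bool:
    """Return True if core_pattern still pipes dumps to /bin/false."""
    try:
        current = Path(path).read_text().strip()
    except OSError:
        # Assumption: an unreadable file is reported as "not ready" instead of raising.
        return False
    return current == DISABLED_PATTERN


def probe_exit_code(path: str = CORE_PATTERN_PATH) -> int:
    """Map the check onto exec-probe semantics: 0 is ready, non-zero is not ready."""
    return 0 if core_dumps_disabled(path) else 1
```

An exec-style probe could end with `sys.exit(probe_exit_code())`. The check only reports drift; it never writes the sysctl back, which matches the "observability without active enforcement" behaviour described above.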
|
What else is needed to get this through? |
|
I still need to test this on a hub but otherwise this is ready for review |
|
Tested on the VEDA hub and everything is working as expected.

Before deploying the daemonset:

    cat /proc/sys/kernel/core_pattern
    |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h

Triggered a core dump to confirm they were landing on the host node:

    (notebook) jovyan@jupyter-sunu:~$ python
    >>> import os
    >>> import subprocess
    >>> pid = os.getpid()
    >>> subprocess.run(["kill", "-SIGABRT", str(pid)])
    Aborted (core dumped)

And confirmed the core dump file showed up on the host node:

    kubectl debug node/ip-192-168-23-56.us-west-2.compute.internal -it --image=alpine
    / # ls -lh /host/var/lib/systemd/coredump/
    total 1M
    -rw-r----- 1 root root 1.0M Feb 13 07:04 core.python.1000.9cc87ea62b3b426e98884096cb6dd442.9417.1770966244000000.xz

Cleared that file, then deployed the daemonset.

After deploying the daemonset:

    cat /proc/sys/kernel/core_pattern
    |/bin/false

Triggered another core dump the same way, then checked the host node:

    / # ls -lh /host/var/lib/systemd/coredump/
    total 0

No new core dump files created. 👍🏽 |
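If anyone wants to repeat this check on another hub, the manual steps could be scripted roughly as follows. This is a sketch under assumptions, not what was actually run: the helper names are hypothetical, the abort happens in a throwaway child process so the calling interpreter survives, and the listing helper must run somewhere the coredump directory is visible (for example `/host/var/lib/systemd/coredump/` inside a node debug pod, as in the session above).

```python
import signal
import subprocess
import sys
from pathlib import Path
from typing import List, Set


def trigger_abort() -> int:
    """Run a throwaway Python child that aborts itself and return its exit status."""
    child = subprocess.run([sys.executable, "-c", "import os; os.abort()"])
    # On POSIX, a negative return code means the child was killed by that signal.
    return child.returncode


def died_from_sigabrt(returncode: int) -> bool:
    """True if a subprocess return code indicates death by SIGABRT."""
    return returncode == -signal.SIGABRT


def core_files(directory: str) -> Set[str]:
    """Names of entries in `directory` that look like core dumps (prefix 'core.')."""
    root = Path(directory)
    if not root.is_dir():
        return set()
    return {entry.name for entry in root.iterdir() if entry.name.startswith("core.")}


def new_core_files(before: Set[str], after: Set[str]) -> List[str]:
    """Core dump files present in `after` but not in `before`, sorted by name."""
    return sorted(after - before)
```

The intended flow is: snapshot `core_files(...)` on the node, call `trigger_abort()` in the user pod, snapshot again, and expect `new_core_files(before, after)` to be empty once the daemonset is running.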
|
Thanks @sunu — we will address the review request in the next week and a bit! |
|
🎉🎉🎉🎉 Monitor the deployment of the hubs here 👉 https://github.com/2i2c-org/infrastructure/actions/runs/22414717695 |
Addresses #3321 (comment) and https://github.com/NASA-IMPACT/veda-analytics/issues/189