Open
Title
Support for Terminating STATUS of PODs
Description
The BIG-IP has two options for disabling a pool member:
- Disable: Disallows the allocation of new, unpersisted connections. When disabled, the node or pool member continues to process persistent and active connections. It can accept new connections only if the connections belong to an existing persistence session. Those persistence matches for new connections continue until persistence times out.
- Force Offline: Specifies that a node or pool member allows existing connections to time out, but no new connections are allowed.
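As a rough sketch of how the two options differ at the API level, the snippet below maps each mode to the `session` and `state` fields commonly used when patching an LTM pool member through iControl REST. The field values are an assumption to be verified against the BIG-IP version in use, and `member_patch_body` is a hypothetical helper, not part of any existing controller.

```python
import json

# Assumed iControl REST field values for an LTM pool member; verify them
# against the BIG-IP version in use before relying on them.
MEMBER_STATES = {
    "enabled": {"session": "user-enabled", "state": "user-up"},
    "disabled": {"session": "user-disabled", "state": "user-up"},
    "forced_offline": {"session": "user-disabled", "state": "user-down"},
}


def member_patch_body(mode: str) -> str:
    """Return the JSON body for a PATCH on a pool member (hypothetical helper)."""
    if mode not in MEMBER_STATES:
        raise ValueError(f"unknown mode: {mode!r}")
    return json.dumps(MEMBER_STATES[mode], sort_keys=True)


if __name__ == "__main__":
    for mode in MEMBER_STATES:
        print(mode, member_patch_body(mode))
```

Under this mapping, Disable only changes `session`, while Force Offline also marks the member down.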
It is not unusual for PODs to remain in Terminating status for 30 seconds. During this period, no new connections should be sent to the corresponding pool members.
The following example shows a rollout restart where the 3 new PODs have already been running for ~30 seconds and the previous 3 PODs have not terminated yet:
[ucamaro@ocp-jumphost ~]$ oc -n vllm get pods
NAME                           READY   STATUS        RESTARTS   AGE
highend-gpt-74748758cc-l8m7s   1/1     Terminating   0          6d23h
highend-gpt-74748758cc-qngk5   1/1     Terminating   0          6d23h
highend-gpt-74748758cc-ttj8s   1/1     Terminating   0          6d23h
highend-gpt-d58c45cb-8zqr2     1/1     Running       0          27s
highend-gpt-d58c45cb-bxbq5     1/1     Running       0          30s
highend-gpt-d58c45cb-m2g9w     1/1     Running       0          29s
The pool members for PODs in Terminating status should be set to disabled.
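A minimal sketch of the requested behavior, assuming the controller can see Pod objects in the shape returned by `oc get pods -o json`. A Pod displayed as Terminating has `metadata.deletionTimestamp` set, which is what the check below relies on. The function names and the `"enabled"`/`"disabled"` labels are hypothetical and only illustrate the idea.

```python
from typing import Any, Dict, List


def is_terminating(pod: Dict[str, Any]) -> bool:
    """A Pod shown as Terminating has metadata.deletionTimestamp set."""
    return bool(pod.get("metadata", {}).get("deletionTimestamp"))


def desired_member_states(pods: List[Dict[str, Any]]) -> Dict[str, str]:
    """Map each Pod IP to 'disabled' (Terminating) or 'enabled' (otherwise)."""
    states: Dict[str, str] = {}
    for pod in pods:
        ip = pod.get("status", {}).get("podIP")
        if not ip:
            continue  # no IP assigned, so there is no pool member to manage
        states[ip] = "disabled" if is_terminating(pod) else "enabled"
    return states


if __name__ == "__main__":
    # Illustrative data only; the IPs and timestamp are made up.
    sample = [
        {"metadata": {"name": "old", "deletionTimestamp": "2024-01-01T00:00:00Z"},
         "status": {"podIP": "10.0.0.1"}},
        {"metadata": {"name": "new"}, "status": {"podIP": "10.0.0.2"}},
    ]
    print(desired_member_states(sample))
```

A controller that watches EndpointSlices instead of Pods could use the `terminating` endpoint condition for the same decision.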
Actual Problem
Without this, new connections are sent to PODs that are about to disappear.
Solution Proposed
- Disable these pool members.
- This will put more pressure on the control plane, so it should be configurable for large deployments.
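One way the configurability could look, as a sketch only: an opt-in setting gates the feature, and only members whose state actually changes are sent to the BIG-IP, which limits the extra control plane load. The option name `disable_terminating_members` and the `plan_updates` helper are hypothetical and do not correspond to any existing controller flag.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Settings:
    # Hypothetical option name; not an existing controller flag.
    disable_terminating_members: bool = False


def plan_updates(
    settings: Settings,
    current: Dict[str, str],
    desired: Dict[str, str],
) -> Dict[str, str]:
    """Return only the members whose state must change.

    With the feature switched off, no state changes are planned at all.
    """
    if not settings.disable_terminating_members:
        return {}
    return {
        member: state
        for member, state in desired.items()
        if current.get(member) != state
    }


if __name__ == "__main__":
    current = {"10.0.0.1": "enabled", "10.0.0.2": "enabled"}
    desired = {"10.0.0.1": "disabled", "10.0.0.2": "enabled"}
    print(plan_updates(Settings(disable_terminating_members=True), current, desired))
```

Defaulting the option to off keeps the behavior of existing large deployments unchanged unless they opt in.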