kep/42-podgroup-coscheduling/README.md
Lines changed: 21 additions & 1 deletion
@@ -15,6 +15,7 @@
15
15
- [QueueSort](#queuesort)
16
16
- [PreFilter](#prefilter)
17
17
- [PostFilter](#postfilter)
18
+
- [Backoff](#backoff)
18
19
- [Permit](#permit)
19
20
- [Known Limitations](#known-limitations)
20
21
<!-- /toc -->
@@ -158,7 +159,26 @@ For any pod that gets rejected, their pod group would be added to a backoff list
158
159
159
160
#### PostFilter
160
161
161
-
If the gap to reach the quorum of a PodGroup is greater than 10%, we reject the whole PodGroup. Note that this plugin should be configured as the last one among PostFilter plugins.
162
+
PostFilter handles scheduling failures for pods that belong to a PodGroup. When a pod fails Filter, PostFilter evaluates whether the PodGroup should be rejected based on how far it is from meeting its quorum:
163
+
164
+
1. If the number of assigned pods already meets `minMember`, no action is taken.
165
+
2. If the fraction of unassigned pods is at or below `podGroupRejectPercentage` (default: 10%), PostFilter returns `Unschedulable` without rejecting the group, so the remaining pods get another scheduling attempt.
166
+
3. If the fraction of unassigned pods exceeds the threshold, PostFilter rejects all waiting pods in the group and optionally triggers backoff (see below).
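
The three cases above can be sketched as a small decision function. This is a minimal illustration, not the plugin's actual code: `Decision`, `decide`, and the outcome names are hypothetical, and it assumes the "fraction of unassigned pods" is measured against `minMember`, that is `(minMember - assigned) / minMember`.

```go
package main

import "fmt"

// Decision is a hypothetical enum naming the three outcomes listed above.
type Decision int

const (
	NoAction    Decision = iota // quorum already met
	Retry                       // return Unschedulable, keep the group
	RejectGroup                 // reject all waiting pods in the group
)

func (d Decision) String() string {
	switch d {
	case NoAction:
		return "NoAction"
	case Retry:
		return "Retry"
	default:
		return "RejectGroup"
	}
}

// decide is an illustrative helper. Assumption: the unassigned fraction is
// (minMember - assigned) / minMember, compared against rejectPercentage / 100.
func decide(assigned, minMember, rejectPercentage int) Decision {
	if minMember <= 0 || assigned >= minMember {
		return NoAction
	}
	// Integer cross-multiplication avoids floating-point rounding at the
	// boundary: gap/minMember <= pct/100  <=>  gap*100 <= pct*minMember.
	gap := minMember - assigned
	if gap*100 <= rejectPercentage*minMember {
		return Retry
	}
	return RejectGroup
}

func main() {
	// minMember = 10 with the default threshold of 10.
	for _, assigned := range []int{10, 9, 8} {
		fmt.Printf("assigned=%d -> %s\n", assigned, decide(assigned, 10, 10))
	}
}
```

Under this reading, a threshold of `0` can never be satisfied by a group that is short of quorum, and a threshold of `100` is always satisfied, which matches the "always reject" and "never reject" settings.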
167
+
168
+
The `podGroupRejectPercentage` parameter (default: `10`) is configurable in the scheduler's `CoschedulingArgs`. Set it to `0` to always reject on any failure, or `100` to never reject.
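
Because `0` is a meaningful setting here, defaulting has to distinguish "unset" from "explicitly zero". The sketch below shows one way to do that with a pointer field. The struct is a local stand-in for `CoschedulingArgs`, and the Go field name, the pointer type, and the `[0, 100]` validation range are assumptions made for illustration.

```go
package main

import "fmt"

// coschedulingArgs is a hypothetical stand-in for the plugin's
// CoschedulingArgs; only the reject percentage is modeled.
type coschedulingArgs struct {
	// A nil pointer means "not set", so an explicit 0 survives defaulting.
	PodGroupRejectPercentage *int32
}

const defaultRejectPercentage int32 = 10

// setDefaults fills in the documented default of 10 when the field is unset.
func setDefaults(a *coschedulingArgs) {
	if a.PodGroupRejectPercentage == nil {
		v := defaultRejectPercentage
		a.PodGroupRejectPercentage = &v
	}
}

// validate assumes the value is a percentage and must lie in [0, 100].
func validate(a *coschedulingArgs) error {
	if a.PodGroupRejectPercentage == nil {
		return fmt.Errorf("podGroupRejectPercentage is unset; call setDefaults first")
	}
	if p := *a.PodGroupRejectPercentage; p < 0 || p > 100 {
		return fmt.Errorf("podGroupRejectPercentage must be in [0, 100], got %d", p)
	}
	return nil
}

func main() {
	unset := &coschedulingArgs{}
	setDefaults(unset)
	fmt.Println(*unset.PodGroupRejectPercentage, validate(unset))

	zero := int32(0)
	explicit := &coschedulingArgs{PodGroupRejectPercentage: &zero}
	setDefaults(explicit)
	fmt.Println(*explicit.PodGroupRejectPercentage, validate(explicit))
}
```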
169
+
170
+
Note that this plugin should be configured as the last one among PostFilter plugins.
171
+
172
+
#### Backoff
173
+
174
+
When `podGroupBackoffSeconds` is set to a positive value in `CoschedulingArgs`, PostFilter places a failed PodGroup into a time-based backoff cache after rejection. During the backoff window, PreFilter immediately rejects all pods from the PodGroup with `UnschedulableAndUnresolvable`, preventing wasteful scheduling cycles.
175
+
176
+
Backoff is triggered only when all of the following conditions are met:
177
+
- `podGroupBackoffSeconds > 0`
178
+
- The fraction of unassigned pods exceeds `podGroupRejectPercentage`
179
+
- The total number of pods with the PodGroup label is at least `minMember`
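
The conditions listed above combine into a single predicate. The following is a rough sketch: `groupState` and `shouldBackoff` are hypothetical names, and the unassigned fraction is again assumed to be `(minMember - assigned) / minMember`.

```go
package main

import "fmt"

// groupState is a hypothetical snapshot of the inputs to the backoff decision.
type groupState struct {
	BackoffSeconds   int64 // podGroupBackoffSeconds from CoschedulingArgs
	RejectPercentage int   // podGroupRejectPercentage from CoschedulingArgs
	MinMember        int   // quorum size of the PodGroup
	Assigned         int   // pods of the group already assigned
	LabeledPods      int   // pods carrying the PodGroup label
}

// exceedsThreshold reports whether the unassigned fraction is strictly above
// the configured percentage (assumed to be relative to MinMember).
func exceedsThreshold(s groupState) bool {
	if s.MinMember <= 0 || s.Assigned >= s.MinMember {
		return false
	}
	return (s.MinMember-s.Assigned)*100 > s.RejectPercentage*s.MinMember
}

// shouldBackoff is true only when all three listed conditions hold.
func shouldBackoff(s groupState) bool {
	return s.BackoffSeconds > 0 &&
		exceedsThreshold(s) &&
		s.LabeledPods >= s.MinMember
}

func main() {
	s := groupState{BackoffSeconds: 30, RejectPercentage: 10, MinMember: 10, Assigned: 5, LabeledPods: 10}
	fmt.Println(shouldBackoff(s))

	// Fewer labeled pods than minMember: the group is rejected but not backed off.
	s.LabeledPods = 7
	fmt.Println(shouldBackoff(s))
}
```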
180
+
181
+
The backoff state is stored in a TTL-based in-memory cache that auto-evicts entries after the configured duration.
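
To make the cache behavior concrete, here is a minimal standard-library sketch of a TTL-based backoff cache. It is not the plugin's implementation: `backoffCache`, `fakeClock`, and their methods are hypothetical, and expired entries are evicted lazily on lookup rather than by a background sweeper.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fakeClock is a controllable time source so expiry can be shown without sleeping.
type fakeClock struct{ t time.Time }

func (f *fakeClock) Now() time.Time        { return f.t }
func (f *fakeClock) Advance(seconds int64) { f.t = f.t.Add(time.Duration(seconds) * time.Second) }

// backoffCache maps a PodGroup name to the time its backoff window ends.
type backoffCache struct {
	mu      sync.Mutex
	now     func() time.Time
	expires map[string]time.Time
}

func newBackoffCache(now func() time.Time) *backoffCache {
	return &backoffCache{now: now, expires: make(map[string]time.Time)}
}

// Backoff records the group for the given number of seconds
// (what PostFilter would do after rejecting the group).
func (c *backoffCache) Backoff(pgName string, seconds int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.expires[pgName] = c.now().Add(time.Duration(seconds) * time.Second)
}

// InBackoff reports whether the group is still inside its window
// (what PreFilter would check before doing any other work).
func (c *backoffCache) InBackoff(pgName string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	exp, ok := c.expires[pgName]
	if !ok {
		return false
	}
	if !c.now().Before(exp) {
		delete(c.expires, pgName) // lazy eviction of the expired entry
		return false
	}
	return true
}

func main() {
	clk := &fakeClock{}
	cache := newBackoffCache(clk.Now)

	cache.Backoff("default/pg1", 30)
	fmt.Println(cache.InBackoff("default/pg1")) // still inside the window

	clk.Advance(31)
	fmt.Println(cache.InBackoff("default/pg1")) // window has elapsed
}
```

In production code, passing `time.Now` as the clock gives the same behavior against wall-clock time.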