
Documentation for ignore-taint #5251

@hterik

Description

Which component are you using?:
Cluster Autoscaler

Is your feature request designed to solve a problem? If so describe the problem this feature should solve.:
I would like to understand how to best use ignore-taint.
Both the command-line argument --ignore-taint and the taint-key prefix ignore-taint.cluster-autoscaler.kubernetes.io/.
See below for exact use case.

Describe the solution you'd like.:
Behaviour, use cases and examples for ignore-taint explained in cluster-autoscaler/FAQ.md

Describe any alternative solutions you've considered.:
go run main.go --help, which gives only a very brief explanation.

Reading the source code. It gets quite hairy for an outsider, especially once you hit the ClusterSnapshot stuff.

Digging through issues:

  • Here is where I found the best explanation so far (see the sketch after this list): Tainting node using ignored taint causes node groups to become unhealthy #3985 (comment)

    The only use-case supported by ignore-taint is one where a node requires additional custom initialization (ex. installing drivers, starting some DS) before it can accept pods. In this case node can be started with ignore-taint and the taint can be removed once the initialization is done.

    As long as the node has an ignore-taint CA will treat it as still booting up / unready. This is needed to avoid infinite scale-up (node is created with ignore-taint, pods remain pending, CA immediately triggers scale-up again, this cycle repeats until the taint is removed from some nodes). Since a node with ignore-taint is unready as far as CA is concerned, once there are enough such nodes in the NodeGroup CA will start treating the NG as unhealthy (thresholds are controlled by --max-total-unready-percentage, --ok-total-unready-count).

  • PR describing use case waiting for DaemonSets to init: cluster-autoscaler/taints: ignore taints on existing nodes #2758

  • PR adding annotation-prefix: Remove taints with specific prefix from template node #2733

  • PR adding --ignore-taints flag: Add support for passing custom ignored labels #2493

    for example if they have their own readiness labels or other markings that should not affect balancing behaviour.
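
Putting those pieces together, my current mental model is roughly the sketch below (the node name, the key suffix cache-prepared, and the value are made up by me; only the ignore-taint.cluster-autoscaler.kubernetes.io/ prefix itself comes from #2733): a node boots carrying a taint under that prefix, CA strips it when comparing the node against its node-group template, and whatever does the initialization removes the taint once it is done.

```yaml
# Sketch only: a freshly booted node carrying a taint that CA should ignore
# when comparing the node against its node-group template. The key prefix is
# the one from #2733; the node name, key suffix and value are hypothetical.
apiVersion: v1
kind: Node
metadata:
  name: example-node-1
spec:
  taints:
  - key: ignore-taint.cluster-autoscaler.kubernetes.io/cache-prepared
    value: "false"
    effect: NoSchedule
```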


Additional context.:

To explain the situation I'm trying to solve:
We have batch-oriented pods that need to run on nodes with a pre-warmed cache in a shared hostPath; network PVs don't work for this. Each batch takes around 10 minutes to process, while pre-baking the cache takes 30 minutes. Each warm node can handle 3 such pods at a time.
Imagine you have one node N1, already serving 3 pods, P1, P2, P3. Now a fourth pod, P4, is added to the scheduler queue. The autoscaler sees that one pod is unschedulable and starts a new node N2.
Now N2 will not be able to serve the fourth pod until it has prepared its cache, which will take 30 minutes, but we also know that any of P1, P2, P3 will finish in only 10, so when that happens it's better to put P4 onto N1 while N2 is still warming up. (The load pattern is such that even after this, N2 will not be redundant, because the queue is under consistent pressure from additional pods P5, P6, P7, ...)

To do this, I was imagining one could add a taint to the NodePool, CachePrepared=false:NoSchedule. A DaemonSet tolerating this taint would start on the node, prepare the cache, and then remove the taint. All other pods would be blocked from scheduling on the node until the taint is removed.
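
For illustration, a minimal sketch of that DaemonSet (the names, image, and hostPath are all hypothetical; the load-bearing part is the toleration matching the CachePrepared=false:NoSchedule taint, and the warmer would additionally need RBAC to patch the Node when it removes the taint):

```yaml
# Sketch only: the cache-warmer DaemonSet tolerates the CachePrepared taint,
# so it is the only workload that can land on the node before the cache is
# ready. All names and the image are placeholders.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cache-warmer
spec:
  selector:
    matchLabels:
      app: cache-warmer
  template:
    metadata:
      labels:
        app: cache-warmer
    spec:
      tolerations:
      - key: CachePrepared
        operator: Equal
        value: "false"
        effect: NoSchedule
      containers:
      - name: warm
        # Placeholder image: bakes the cache, then removes the taint from its node.
        image: example.com/cache-warmer:latest
        volumeMounts:
        - name: cache
          mountPath: /cache
      volumes:
      - name: cache
        hostPath:
          path: /var/cache/batch  # the shared hostPath mentioned above
```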

From my understanding, --ignore-taint CachePrepared would solve the problem of creating N2 even though it cannot schedule P4 according to its template. What I'm unsure about is what happens after that: will the autoscaler understand that N2 is upcoming, or will it still consider P4 unschedulable and allocate a new N3?

If it makes any difference, we are using AKS with the built-in autoscaler.
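
For completeness, on a self-managed cluster-autoscaler I picture the flag being passed roughly as in the sketch below (the image tag and the azure wiring are assumptions on my part; with the AKS managed autoscaler I expect the equivalent would have to be configured through the AKS API instead):

```yaml
# Sketch only: fragment of a self-managed cluster-autoscaler Deployment
# showing where --ignore-taint would go. Tag and flags are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      containers:
      - name: cluster-autoscaler
        image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.25.0  # placeholder tag
        command:
        - ./cluster-autoscaler
        - --cloud-provider=azure        # assumption: AKS through the azure provider
        - --ignore-taint=CachePrepared  # the taint key from the scenario above
```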
