Keeping Disruptions in Check when using Karpenter
By default, Karpenter can be very aggressive in disrupting your workloads: as soon as it detects a cheaper way to run your pods, it starts to consolidate. If you are running on Spot Instances, a spot interruption that arrives while Karpenter is consolidating your workloads can lead to trouble. Here are some settings and tips that I have found very effective at keeping Karpenter's disruptions in check.
1. Give Karpenter many instance types to choose from
This is probably the most common trap I see people fall into with Karpenter: they create too many NodePools, each containing only a small set of allowed instance types. This usually comes from people wanting the perfect instance for their workload, but the path to hell is paved with good intentions.
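As a rough illustration, a single NodePool with broad requirements might look like the sketch below. It assumes the karpenter.sh/v1 API on AWS; the NodePool name, the EC2NodeClass name, and the chosen categories and generation cutoff are placeholders rather than recommendations, so adjust them to your own cluster.

```yaml
# Sketch only: assumes Karpenter v1 on AWS. Names and values are illustrative.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general-purpose          # hypothetical name
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default            # assumes an EC2NodeClass called "default" exists
      requirements:
        # Constrain by broad properties instead of listing a few exact types.
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["4"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
```

Constraining by category and generation rather than by exact instance type leaves Karpenter a large pool of candidates to pick from.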
2. Increase consolidateAfter
By default, Karpenter sets consolidateAfter to 0s.
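A minimal sketch of raising this value in a NodePool's disruption block is shown below. It assumes the karpenter.sh/v1 API, where consolidateAfter can be combined with the WhenEmptyOrUnderutilized policy; the 5m value is only a placeholder and not a tested recommendation.

```yaml
# Sketch only: fragment of a NodePool spec, assuming Karpenter v1.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general-purpose          # hypothetical name
spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    # Illustrative value; pick one that matches how quickly your workloads churn.
    consolidateAfter: 5m
  # template: ... (omitted here; a real NodePool also needs spec.template)
```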