Auto-scaling reacts too slow or flaps back and forth

Question

Accepted Answer

1. Set scale-out aggressive, scale-in conservative. E.g. scale-out at 60 percent CPU for 2 minutes, scale-in at 30 percent for 15. Otherwise you flap.
2. Pick a metric that reflects your real bottleneck. For web that's often ALB target-tracking on request count, not CPU.
3. Set cooldown or warm pool. If instance startup is 3 minutes, a 1-minute evaluation interval is pointless.
4. For predictable workloads (e-commerce, office-hour spike): predictive scaling based on pattern. Pre-fills for the spike.
5. Load-test scaling with k6 or Locust. A config never tested will fall over during the first peak.

When to bring us in: 
If you breach SLA during peaks despite scaling, a short performance review is worth it. The fix usually sits in pool warmup or the app, not the auto-scaling config.

Auto-scaling reacts too slow or flaps back and forth

Try this first

When to bring us in

See also

None of the above fits?

Who are you?

Or skip the DIY entirely