Skip to content

Auto-scaling reacts too slow or flaps back and forth

Three calibrations: warming time, scale-out must be faster than scale-in, and pick the right metric. Default CPU target rarely works for modern apps.

Try this first

  1. 1Set scale-out aggressive, scale-in conservative. E.g. scale-out at 60 percent CPU for 2 minutes, scale-in at 30 percent for 15. Otherwise you flap.
  2. 2Pick a metric that reflects your real bottleneck. For web that's often ALB target-tracking on request count, not CPU.
  3. 3Set cooldown or warm pool. If instance startup is 3 minutes, a 1-minute evaluation interval is pointless.
  4. 4For predictable workloads (e-commerce, office-hour spike): predictive scaling based on pattern. Pre-fills for the spike.
  5. 5Load-test scaling with k6 or Locust. A config never tested will fall over during the first peak.

When to bring us in

If you breach SLA during peaks despite scaling, a short performance review is worth it. The fix usually sits in pool warmup or the app, not the auto-scaling config.

See also

None of the above fits?

Describe your situation below. We pass your input plus the steps you already saw to our AI and return tailored next-step advice. If it's too risky to DIY, we'll say so.

Who are you?

For the AI question we need your email and company, so we can follow up if the AI gets stuck, and to prevent abuse.

Limited to 2 questions per hour and 5 per day, kept lean so the AI stays useful. For more, contacting us directly works better for you and us.

Or skip the DIY entirely

Our Managed IT clients do not look these things up. One point of contact, a fixed monthly price, resolved within working hours.