CloudWatch alarms or self-hosted Grafana, what fits us?
If you only monitor resources inside a single cloud, the cloud-native tool (CloudWatch, Azure Monitor, or Google Cloud Monitoring) is cheap and good enough. For cross-cloud visibility, custom app metrics, or richer dashboards, Grafana or Datadog is worth the upgrade.
Try this first
1. Under 20 alarms and one cloud: stay cloud-native. No extra tool, no extra licence. Use CloudWatch composite alarms for multi-condition alerts.
2. Cross-cloud or on-prem plus cloud: Grafana Cloud, or self-hosted Grafana with Prometheus. One dashboard for everything.
3. Custom app metrics (request latency, queue depth, business KPIs): OpenTelemetry plus a backend (Grafana, Datadog, or Honeycomb for traces).
4. On-call with alert routing and escalation: PagerDuty or Opsgenie. Sending CloudWatch alarms to SNS and from there to PagerDuty works fine.
5. SMB with fewer than 10 services: Grafana Cloud (free or pro tier) is usually cheaper than running Prometheus yourself. Keep self-hosting for real scale.
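As a rough illustration of the composite-alarm and SNS routing steps above, here is a minimal Python sketch. It assumes the child alarms already exist and that an SNS topic is already subscribed to your paging tool. The alarm names and the topic ARN are hypothetical placeholders, not values from this article. The helper functions only build the request, and the actual call to boto3's `put_composite_alarm` is kept in a separate function so you can review the rule before anything is created.

```python
from typing import Any


def build_alarm_rule(alarm_names: list[str], operator: str = "AND") -> str:
    """Combine existing alarm names into a composite AlarmRule expression."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    if not alarm_names:
        raise ValueError("at least one child alarm is required")
    # Each child is referenced as ALARM("name"), i.e. "this alarm is in ALARM state".
    return f" {operator} ".join(f'ALARM("{name}")' for name in alarm_names)


def composite_alarm_request(
    name: str, child_alarms: list[str], sns_topic_arn: str
) -> dict[str, Any]:
    """Build keyword arguments for CloudWatch's put_composite_alarm call."""
    return {
        "AlarmName": name,
        "AlarmRule": build_alarm_rule(child_alarms),
        "AlarmActions": [sns_topic_arn],  # SNS topic that forwards to on-call
        "ActionsEnabled": True,
    }


def create_composite_alarm(request: dict[str, Any]) -> None:
    """Send the request to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the helpers work without it

    boto3.client("cloudwatch").put_composite_alarm(**request)


# Hypothetical example values: replace with your own alarm names and topic ARN.
request = composite_alarm_request(
    "api-degraded",
    ["api-5xx-high", "api-latency-high"],
    "arn:aws:sns:eu-west-1:123456789012:oncall",
)
print(request["AlarmRule"])
```

The composite alarm only fires when both child alarms are in ALARM state, which tends to cut down on pages for short-lived single-metric spikes.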
When to bring us in
Above 50 services, or once a customer SLA is in place, observability (logs, metrics, and traces together) becomes a design question, and a short session up front prevents a lot of rework.
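The rules of thumb in this article, including the threshold above, can be written down as a small decision helper. This is only a sketch: the `Setup` fields and the `recommend` function are hypothetical names, and folding the "fewer than 10 services" rule into the Grafana choice is an assumption about how the rules combine, not something the article spells out.

```python
from dataclasses import dataclass


@dataclass
class Setup:
    clouds: int = 1                 # number of cloud providers in use
    on_prem: bool = False           # any on-prem systems to monitor?
    alarms: int = 0                 # number of alarms you maintain
    services: int = 0               # number of services you run
    custom_app_metrics: bool = False
    on_call: bool = False           # need routing and escalation?
    customer_sla: bool = False


def recommend(setup: Setup) -> list[str]:
    """Return every rule of thumb from the article that matches the setup."""
    advice: list[str] = []
    mixed = setup.clouds > 1 or setup.on_prem
    if not mixed and setup.clouds == 1 and setup.alarms < 20:
        advice.append("cloud-native monitoring")
    if mixed:
        # Assumption: small setups lean towards the hosted option.
        if setup.services < 10:
            advice.append("Grafana Cloud")
        else:
            advice.append("Grafana Cloud or self-hosted Grafana with Prometheus")
    if setup.custom_app_metrics:
        advice.append("OpenTelemetry plus a backend")
    if setup.on_call:
        advice.append("PagerDuty or Opsgenie")
    if setup.services > 50 or setup.customer_sla:
        advice.append("observability design session")
    return advice


small = recommend(Setup(clouds=1, alarms=12, services=4))
large = recommend(
    Setup(clouds=2, alarms=80, services=60, custom_app_metrics=True, on_call=True)
)
print(small)
print(large)
```

A single-cloud setup with 20 or more alarms matches none of the first two rules here, which mirrors the article: that case is a judgment call rather than a fixed recommendation.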
See also
- Everyone logs in with the AWS root account: Root is for emergencies and billing. Day-to-day work belongs in IAM users or SSO.
- Every developer has AdministratorAccess: AdministratorAccess everywhere is convenient now, painful later. Start with role-based policies.
- Everyone has individual IAM users with their own password: Identity Center (formerly AWS SSO) links to your IdP and issues temporary credentials per session.