Keep your production operations audited without paying for a dedicated SRE consultant. This scheduled automation reviews your infrastructure-as-code, CI/CD configuration, monitoring setup, and runbooks, and flags gaps against SRE best practices: missing alerts, stale on-call rotations, absent runbooks for critical services, and unacknowledged playbooks.

Use this template

Open SRE Health Checker in Devin and create the automation with the default configuration. You can customize it before saving.

What this automation does

Reliability engineering is about maintaining a baseline. The SRE Health Checker runs weekly, audits your setup, and gives you a scored report against key reliability practices — so you can see drift before it becomes an incident and fix it proactively.

How it works

Trigger: Schedule event (recurring)
  • Event: schedule:recurring
    • Conditions:
      • rrule matches FREQ=WEEKLY;BYDAY=MO;BYHOUR=9;BYMINUTE=0 (every Monday at 09:00)
What Devin does: Starts a session with full event context, executes the prompt below, and (optionally) notifies you on failure.
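To sanity-check when the trigger's recurrence rule fires, the sketch below parses the rule string and computes the next run time using only the Python standard library. It is an illustration, not Devin's scheduler: the helper names `parse_rrule` and `next_weekly_run` are hypothetical, it handles only a weekly rule with a single day, hour, and minute, and it uses naive datetimes because this page does not state which time zone the schedule is evaluated in.

```python
from datetime import datetime, timedelta

# The recurrence rule from the trigger conditions above.
RRULE = "FREQ=WEEKLY;BYDAY=MO;BYHOUR=9;BYMINUTE=0"

# iCalendar weekday codes mapped to datetime.weekday() numbering (Monday == 0).
WEEKDAYS = {"MO": 0, "TU": 1, "WE": 2, "TH": 3, "FR": 4, "SA": 5, "SU": 6}


def parse_rrule(rule: str) -> dict[str, str]:
    """Split 'KEY=VALUE;KEY=VALUE' into a dictionary (hypothetical helper)."""
    return dict(part.split("=", 1) for part in rule.split(";"))


def next_weekly_run(rule: str, after: datetime) -> datetime:
    """Return the first run strictly after `after`.

    Assumption: only FREQ=WEEKLY with one BYDAY, BYHOUR, and BYMINUTE
    is supported. Time zone handling is deliberately left out.
    """
    fields = parse_rrule(rule)
    if fields.get("FREQ") != "WEEKLY":
        raise ValueError("this sketch only handles FREQ=WEEKLY")
    target_day = WEEKDAYS[fields["BYDAY"]]
    hour = int(fields["BYHOUR"])
    minute = int(fields["BYMINUTE"])

    # Start from today at the scheduled time, then move forward to the target weekday.
    candidate = after.replace(hour=hour, minute=minute, second=0, microsecond=0)
    candidate += timedelta(days=(target_day - after.weekday()) % 7)
    if candidate <= after:
        # The scheduled time this week has already passed, so use next week.
        candidate += timedelta(weeks=1)
    return candidate


# Wednesday 3 January 2024 at noon: the next run is the following Monday.
print(next_weekly_run(RRULE, datetime(2024, 1, 3, 12, 0)))  # 2024-01-08 09:00:00
```

If you change the schedule (for example, to a different weekday), running a check like this before saving can confirm that the rule means what you intend.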

Prerequisites

Example prompt

The template ships with this prompt. You can edit it after clicking Use template, or leave it as-is.

Setting it up

  1. Open Automations → Templates in Devin.
  2. Click SRE Health Checker. The create page opens with this template pre-filled.
  3. Connect any required integrations and install MCP servers if you haven’t already.
  4. Replace any placeholder values in the trigger conditions (for example, swap your-org/your-repo for your actual repo).
  5. Review the prompt and adjust it for your team’s language, conventions, and guardrails.
  6. Click Create automation.
Most automation templates include suggested ACU and invocation limits to bound cost during early rollout. Keep them as-is until you’re confident in the automation’s behavior, then raise them to fit your workload.

When to use this template

  • Growing engineering teams establishing their first reliability practices
  • Teams running post-incident reviews who want to check for systemic gaps
  • Platform and infrastructure teams maintaining many services
  • Teams onboarding new services onto reliability standards

Customization ideas

  • Scope to specific services, repos, or teams
  • Customize the audit criteria (add team-specific reliability norms)
  • Cross-reference with Datadog, PagerDuty, or Opsgenie MCP data
  • Tune severity levels and escalation paths
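
When customizing audit criteria and severity levels, it can help to decide up front how individual checks should roll up into a score. The sketch below is one possible weighting scheme, not the template's actual scoring method, which this page does not describe. The severity names, weights, check names, and the `health_score` function are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical severity weights; tune these to match your escalation policy.
SEVERITY_WEIGHTS = {"critical": 3, "major": 2, "minor": 1}


@dataclass(frozen=True)
class CheckResult:
    name: str
    severity: str  # one of the keys in SEVERITY_WEIGHTS
    passed: bool


def health_score(results: list[CheckResult]) -> float:
    """Return the severity-weighted share of passing checks, from 0 to 100."""
    total = sum(SEVERITY_WEIGHTS[r.severity] for r in results)
    if total == 0:
        return 100.0  # no checks defined, so nothing is failing
    earned = sum(SEVERITY_WEIGHTS[r.severity] for r in results if r.passed)
    return round(100.0 * earned / total, 1)


# Illustrative results, including one team-specific norm.
results = [
    CheckResult("alerts defined for critical services", "critical", True),
    CheckResult("runbook exists for each critical service", "critical", False),
    CheckResult("on-call rotation updated recently", "major", True),
    CheckResult("team-specific: deploy freeze calendar linked", "minor", False),
]
print(health_score(results))  # 55.6
```

Weighting by severity means a missing runbook for a critical service lowers the score more than a missed minor norm, so the headline number tracks actual risk. You could describe a scheme like this in the prompt so the weekly report stays comparable from run to run.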
