
Why Your AI Agent Keeps Breaking Production (And How to Stop It)

August 28, 2024 • Josh Butler • Technical

"The AI agent will handle routine tasks while you sleep." That's what I told my client. At 6 AM, I woke up to 47 Slack notifications and a completely broken production app. The agent had been busy.

In 8 hours, it had "helpfully":

  • Refactored every function to use arrow syntax (breaking all the hoisted functions)
  • Updated all dependencies to latest (React 17 to 18, breaking everything)
  • Converted callbacks to async/await (but forgot to handle the errors)
  • Applied "performance optimizations" (premature and wrong)

The kicker? Every single change had a perfectly reasonable commit message explaining why it was an improvement.

The Overeager Assistant Problem

AI agents are like that intern who stays late to impress you, except this intern can push to main and never runs out of energy. Without proper boundaries, they'll "improve" your code into oblivion.

The core issue? AI doesn't understand "if it ain't broke, don't fix it." It sees potential improvements everywhere. That slightly inefficient loop? Must optimize. That working but outdated pattern? Must modernize. That intentionally simple solution? Must make it "robust."

Real Disasters I've Witnessed

The Great Type Migration of Tuesday
Agent decided our JavaScript codebase needed TypeScript. Started converting files. Got halfway through before hitting circular dependencies. Left the codebase in a state where nothing compiled.

The Helpful Security Update
Found an outdated auth library. Updated it. New version had breaking API changes. Agent tried to fix the breaking changes. Made up function names that seemed plausible. Authentication broke completely.

The Performance Optimizer
Agent read about React.memo. Applied it to every component. Including ones that updated on every render. Actually made performance worse. Much worse.

Why Agents Go Rogue

After debugging dozens of agent disasters, I've identified the patterns:

1. Scope Creep
You ask it to "fix the bug in the login component." It fixes the bug, then notices the component could be "improved," then notices the auth system could be "modernized," then...

2. Context Loss
Agents don't know why code is the way it is. That weird workaround? It's for IE11 compatibility. That "inefficient" pattern? It's matching the rest of the codebase for consistency.

3. Confidence Without Competence
AI is supremely confident about changes that seem reasonable but aren't. "Let's update React!" sounds good until you realize it requires updating 50 other dependencies.

Guardrails That Actually Work

Here's my battle-tested system for AI agents that help without destroying everything:

The Read-Only Phase

# Start every agent task with analysis only
AGENT_INSTRUCTIONS = """
First phase: ANALYZE ONLY
- Identify the issue
- List potential solutions
- Identify risks
- DO NOT MAKE CHANGES

Wait for human approval before Phase 2.
"""

The Surgical Strike Approach

# Be incredibly specific about scope
TASK = """
Fix the TypeError in login.js line 45.
- ONLY modify login.js
- ONLY fix the TypeError
- Do NOT refactor
- Do NOT update dependencies
- Do NOT improve other code
- Make the minimal change required
"""

The Change Limit

# Hard limits prevent runaway agents
MAX_FILES_CHANGED = 3
MAX_LINES_CHANGED = 50

if changes.exceed_limits():
    require_human_review()
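
Concretely, those numbers can come straight from git. Here's a minimal sketch that diffs the agent's branch against main using git diff --numstat; the branch name is an assumption about your setup:

import subprocess

MAX_FILES_CHANGED = 3
MAX_LINES_CHANGED = 50

def diff_stats(base: str = "main", branch: str = "agent-work") -> tuple[int, int]:
    # --numstat prints "added<TAB>deleted<TAB>path" for each changed file.
    out = subprocess.run(
        ["git", "diff", "--numstat", f"{base}...{branch}"],
        capture_output=True, text=True, check=True,
    ).stdout
    files = lines = 0
    for row in out.splitlines():
        added, deleted, _path = row.split("\t", 2)
        files += 1
        # Binary files report "-" instead of counts; treat them as zero lines.
        lines += sum(int(n) for n in (added, deleted) if n != "-")
    return files, lines

files, lines = diff_stats()
if files > MAX_FILES_CHANGED or lines > MAX_LINES_CHANGED:
    print("Over limit: flag for human review before merging")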

The Sandboxing Strategy

Never let agents touch main directly. Here's my setup, with the commands sketched after the list:

  1. Feature branches only - Agent creates PR, human merges
  2. Test environment first - All changes tested in isolation
  3. Incremental rollout - One small change at a time
  4. Kill switch ready - Revert strategy planned before starting
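
The whole flow is only a few commands. A sketch, assuming the GitHub CLI (gh) is installed; the branch name is illustrative:

import subprocess

def run(*cmd: str) -> None:
    subprocess.run(cmd, check=True)

# Agent work happens on a throwaway branch, never on main.
run("git", "switch", "-c", "agent/fix-login-typeerror")
# ... the agent makes its minimal change and commits here ...
run("git", "push", "-u", "origin", "agent/fix-login-typeerror")
# Open a PR for human review instead of merging anything directly.
run("gh", "pr", "create", "--fill", "--base", "main")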

Good Agent, Bad Agent

Bad Agent Task: "Improve the codebase"
Result: Chaos

Good Agent Task: "In auth.js, fix the bug where users stay logged in after token expiry. Change only the token validation logic. Do not modify other files."
Result: Fixed bug, nothing else touched

The Agent Prompt Template That Saves Production

You are a cautious developer fixing a specific issue.

CONTEXT:
- This is production code
- Working code is better than "improved" code
- Minimal changes are preferred

YOUR TASK:
[Specific issue description]

CONSTRAINTS:
1. ONLY fix the described issue
2. Make the MINIMAL change required
3. Do NOT refactor working code
4. Do NOT update dependencies
5. Do NOT apply general improvements
6. If you need to change more than 3 files, STOP and ask

BEFORE MAKING CHANGES:
1. Explain what you'll change
2. Explain why it's necessary
3. List any risks
4. Wait for approval

REMEMBER: Your job is to fix ONE thing, not improve everything.
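
I keep the template in code so every task gets the same constraints. The helper below is illustrative and abbreviates the template for brevity:

# Abbreviated version of the template above; {task} is the only variable part.
CAUTIOUS_TEMPLATE = """You are a cautious developer fixing a specific issue.

YOUR TASK:
{task}

CONSTRAINTS: only fix the described issue, make the minimal change,
no refactors, no dependency updates, no general improvements.
Explain your plan and wait for approval before changing anything.
"""

def cautious_prompt(task: str) -> str:
    return CAUTIOUS_TEMPLATE.format(task=task)

print(cautious_prompt("Fix the TypeError in login.js line 45."))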

Monitoring Your Agents

Set up alerts for (a sample commit audit follows the list):

  • More than X files changed in one commit
  • Package.json modifications
  • Configuration file changes
  • Any changes to authentication/authorization
  • Deletions of more than 10 lines
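
Most of these checks boil down to reading a commit's numstat. Here's a sketch of a per-commit audit you could run in CI; the sensitive-path markers are examples, not a complete list:

import subprocess

MAX_FILES = 5
MAX_DELETED_LINES = 10
SENSITIVE_MARKERS = ("package.json", "auth", "config", ".env")

def audit_commit(sha: str = "HEAD") -> list[str]:
    # --format= suppresses the log message, leaving only the numstat rows.
    out = subprocess.run(
        ["git", "show", "--numstat", "--format=", sha],
        capture_output=True, text=True, check=True,
    ).stdout
    alerts, files, deleted = [], 0, 0
    for row in out.splitlines():
        if not row.strip():
            continue
        _added, removed, path = row.split("\t", 2)
        files += 1
        deleted += int(removed) if removed != "-" else 0
        if any(marker in path for marker in SENSITIVE_MARKERS):
            alerts.append(f"sensitive path touched: {path}")
    if files > MAX_FILES:
        alerts.append(f"{files} files changed (limit {MAX_FILES})")
    if deleted > MAX_DELETED_LINES:
        alerts.append(f"{deleted} lines deleted (limit {MAX_DELETED_LINES})")
    return alerts

for reason in audit_commit():
    print("ALERT:", reason)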

I learned this the hard way when an agent "cleaned up" our imports and removed ones that only looked unused because their components were lazy-loaded.

The Success Story

Once I implemented these guardrails, agents became genuinely useful. Last week, an agent:

  • Fixed 23 ESLint warnings (specific task, clear scope)
  • Updated deprecated API calls (one API at a time, tested each)
  • Added missing error handling (only where specified)

Total human review time: 15 minutes. Total time saved: 3 hours. No production incidents.

The secret to AI agents? Treat them like junior developers who are eager to help but don't understand the bigger picture. Clear boundaries, specific tasks, and constant supervision. And never, ever, let them push to main while you sleep.
