AI Regex: Now You Have Two Problems

2025-07-10 | Catalypt AI Team | ai-first

"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." - Jamie Zawinski

When you ask AI to write regex, you don't have two problems. You have n+1 problems where n is the number of edge cases the AI enthusiastically tried to handle.

The Email Regex That Broke Production

// Me: "Write a regex for email validation"
// AI: "I'll create a comprehensive RFC 5322 compliant regex!"

const emailRegex = /^[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/;

// Later that day...
// User: "Why can't I use [email protected]?"
// Me: "..."

AI forgot about uppercase letters. Classic.
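If case sensitivity really is the only thing wrong, one low-effort repair is to rebuild the same pattern with the `i` flag. A minimal sketch, assuming that is all that was missing; the sample address and the name `emailRegexIgnoreCase` are made up for illustration:

```javascript
// The pattern from above, unchanged: lowercase only.
const emailRegex = /^[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/;

// Same source, but case-insensitive (hypothetical name).
const emailRegexIgnoreCase = new RegExp(emailRegex.source, 'i');

// Hypothetical address with capital letters.
const sample = 'Jane.Doe@Example.com';

console.log(emailRegex.test(sample));           // false
console.log(emailRegexIgnoreCase.test(sample)); // true
```

Lowercasing the input before matching would work too, as long as the stored value is not expected to preserve the user's capitalization.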

The Phone Number Catastrophe

// Me: "Simple phone number regex please"
// AI: "Here's one that handles international formats!"

const phoneRegex = /^[\+]?[(]?[0-9]{3}[)]?[-\s\.]?[(]?[0-9]{3}[)]?[-\s\.]?[0-9]{4,6}$/;

// What it actually does:
// ✓ (555) 123-4567
// ✓ 555.123.4567
// ✗ +1-555-123-4567 (so much for "international")
// ✓ (555(123)4567 (wait, what?)
// ✓ 555)123)456789 (that's... not a phone number)
// ✗ 555-GET-FOOD (fine, it got one right)

The URL Validator of Doom

// AI's "bulletproof" URL regex
const urlRegex = /^(?:(?:https?|ftp):\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))(?::\d{2,5})?(?:[/?#]\S*)?$/i;

// Browser: *catches fire*
// CPU: "I need a vacation"
// Me: "Maybe just check for 'http' at the start?"
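For what it's worth, "just check the protocol" does not need a regex at all. A minimal sketch, assuming a runtime with the standard WHATWG `URL` constructor (modern browsers and Node.js) and assuming only http and https links should pass. `isHttpUrl` is a hypothetical helper name, and unlike the regex above it makes no attempt to reject private IP ranges or accept ftp:

```javascript
// Let the platform's URL parser do the parsing, then check the scheme.
function isHttpUrl(text) {
  let url;
  try {
    url = new URL(text); // throws a TypeError on unparseable input
  } catch {
    return false;
  }
  return url.protocol === 'http:' || url.protocol === 'https:';
}

console.log(isHttpUrl('https://example.com/path?q=1')); // true
console.log(isHttpUrl('ftp://example.com'));            // false
console.log(isHttpUrl('javascript:alert(1)'));          // false
console.log(isHttpUrl('not a url'));                    // false
```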

Real AI Regex Disasters

The Password Validator That Validates Everything

// AI: "This ensures at least one uppercase, lowercase, number, and special character!"
const passwordRegex = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

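// Looks good until:
"Password1!" // ✓ Valid
"P@ssw0rd" // ✓ Valid
"????????" // ✗ Invalid (8 special chars, nothing else)
"AAAAAAAA1!" // ✗ Invalid (no lowercase)
"Password1#" // ✗ Invalid ('#' isn't on the allowlist)

// The bug: lookaheads work correctly, but AI didn't explain the edge case:
// any character outside [A-Za-z\d@$!%*?&] fails the whole password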

The Credit Card Validator

// AI tried to validate AND identify card types
const cardRegex = /^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|3(?:0[0-5]|[68][0-9])[0-9]{11}|6(?:011|5[0-9]{2})[0-9]{12}|(?:2131|1800|35\d{3})\d{11})$/;

// Validates card numbers!
// Also validates 4000000000000000 (16 digits, fails the Luhn checksum)
// And anything else with the right prefix and length
// But not "4242 4242 4242 4242" (spaces) or 2-series Mastercards
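A prefix-and-length pattern cannot tell a real card number from sixteen plausible digits; that is the job of the Luhn checksum. A minimal sketch of that check, assuming input may contain spaces or hyphens and that 12 to 19 digits is an acceptable length range. `passesLuhn` is a hypothetical helper name, and passing the checksum only means the number is well-formed, not that the card exists:

```javascript
// Luhn checksum: walking from the right, double every second digit,
// subtract 9 from any result above 9, and require the total to be
// divisible by 10.
function passesLuhn(cardNumber) {
  const digits = cardNumber.replace(/[\s-]/g, ''); // drop spaces and hyphens
  if (!/^\d{12,19}$/.test(digits)) return false;   // assumed length range

  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = Number(digits[digits.length - 1 - i]);
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}

console.log(passesLuhn('4242 4242 4242 4242')); // true
console.log(passesLuhn('4000000000000000'));    // false
```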

The Catastrophic Backtracking Special

// AI: "This validates HTML tags!"
const htmlRegex = /<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)/;

// Feed it an unclosed tag with a long run after the name:
const evil = '<div ' + 'a'.repeat(40);
// Your server: "Goodbye cruel world"
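The part that blows up is the nested quantifier `([^<]+)*`: when the overall match fails, the engine retries every way of splitting the same run of characters between the inner `+` and the outer `*`. A sketch of one possible repair, assuming it is acceptable to also exclude `>` from that run (a small behavior change from the original); `safeTagRegex` and the sample inputs are hypothetical:

```javascript
// One quantifier instead of two nested ones, so a failed match can no
// longer be retried in exponentially many ways.
const safeTagRegex = /<([a-z]+)([^<>]*)(?:>(.*)<\/\1>|\s+\/>)/;

console.log(safeTagRegex.test('<b>bold</b>')); // true
console.log(safeTagRegex.test('<br />'));      // true

// An unclosed tag with a long tail now fails quickly.
const hostile = '<div ' + 'a'.repeat(5000);
console.log(safeTagRegex.test(hostile));       // false
```

This only removes the backtracking trap. It is still a regex pretending to understand HTML, and a real parser remains the better tool for that.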

The International Disaster

// Me: "Validate names from any culture"
// AI: "I'll use Unicode categories!"

const nameRegex = /^[\p{L}\p{M}\p{Zs}'-]+$/u;

// Sounds good until:
"José" // ✓
"Māori" // ✓
"null" // ✓ (Actual name in some cultures)
"alert('hi')" // ✗ Good
"👨‍👩‍👧‍👦" // ✗ Fine, emoji are not letters
"---" // ✓ That's... just hyphens
"\u00A0" // ✓ That's an invisible character (a lone no-break space)

The Date Parser From Hell

// AI went full regex warrior
const dateRegex = /^(?:(?:31(\/|-|\.)(?:0?[13578]|1[02]))\1|(?:(?:29|30)(\/|-|\.)(?:0?[13-9]|1[0-2])\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:29(\/|-|\.)0?2\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:0?[1-9]|1\d|2[0-8])(\/|-|\.)(?:(?:0?[1-9])|(?:1[0-2]))\4(?:(?:1[6-9]|[2-9]\d)?\d{2})$/;

// Handles leap years!
// Does not handle:
// 12/31/2024 (day-first only, US dates need not apply)
// 2024-12-31 (ISO 8601? never heard of it)
// 01/01/1599 (history starts in 1600)
// My will to live (gone)
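Calendar rules like month lengths and leap years are already built into `Date`, so the regex only needs to split the string. A minimal sketch, assuming day-first input with a four-digit year and `/`, `-`, or `.` as separators (mixed separators are tolerated here); `isValidDayFirstDate` is a hypothetical helper name:

```javascript
function isValidDayFirstDate(text) {
  // The regex only extracts the three numbers; it knows nothing about calendars.
  const m = /^(\d{1,2})[\/.-](\d{1,2})[\/.-](\d{4})$/.exec(text);
  if (!m) return false;
  const [day, month, year] = m.slice(1).map(Number);

  // Date silently rolls impossible dates forward (31 Feb becomes 2 Mar),
  // so a date is valid only if it survives the round trip unchanged.
  const d = new Date(Date.UTC(year, month - 1, day));
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}

console.log(isValidDayFirstDate('29/02/2024')); // true
console.log(isValidDayFirstDate('29/02/2023')); // false
console.log(isValidDayFirstDate('31/02/2024')); // false
```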

The Regex That Became Sentient

// The ultimate password validator
const passwordRegexFinal = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{12,}(?!.*(.)\1{2})(?!.*password)(?!.*123).*$/i;

// I asked for password validation
// AI delivered existential dread
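The same intent reads far better as a list of small, named checks that each report their own failure. A sketch under assumed rules (a 12-character minimum, any non-alphanumeric character counts as "special"); the rule set and the names `passwordRules` and `passwordProblems` are illustrative, not a recommended password policy:

```javascript
// Each rule is one tiny regex or length check with a human-readable message.
const passwordRules = [
  { message: 'at least 12 characters', test: (pw) => pw.length >= 12 },
  { message: 'a lowercase letter', test: (pw) => /[a-z]/.test(pw) },
  { message: 'an uppercase letter', test: (pw) => /[A-Z]/.test(pw) },
  { message: 'a digit', test: (pw) => /\d/.test(pw) },
  { message: 'a non-alphanumeric character', test: (pw) => /[^A-Za-z0-9]/.test(pw) },
  { message: 'no character three times in a row', test: (pw) => !/(.)\1{2}/.test(pw) },
  { message: 'does not contain "password"', test: (pw) => !/password/i.test(pw) },
];

// Returns the messages of every rule the password fails (empty means it passed).
function passwordProblems(pw) {
  return passwordRules.filter((rule) => !rule.test(pw)).map((rule) => rule.message);
}

console.log(passwordProblems('Correct-Horse-7battery')); // []
console.log(passwordProblems('aaa'));
```

When a rule changes, one line changes, and the user gets told which requirement they missed instead of a bare "invalid".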

How to Not Regex with AI

Option 1: Use a Library

// Just... use a library
import validator from 'validator';
const isValid = validator.isEmail(email);

Option 2: Simple and Sufficient

const simpleEmail = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
// Covers 99% of cases, readable by humans

Option 3: Progressive Enhancement

function validatePhone(phone) {
  // Remove all non-digits
  const digits = phone.replace(/\D/g, '');
  // Check length
  return digits.length >= 10 && digits.length <= 15;
}

AI Regex Prompt That Actually Works

Write a SIMPLE regex for [use case]. 
Requirements:
- Match [specific examples]
- Don't match [counter examples]  
- Prioritize readability over completeness
- Explain what each part does
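The "match" and "don't match" examples from the prompt can double as a test that the answer has to pass before it goes anywhere near production. A small sketch; `checkRegex` is a hypothetical helper, and the sample inputs are invented:

```javascript
// Runs a regex against examples it must accept and examples it must reject,
// and returns a description of every mismatch (empty array means all good).
function checkRegex(regex, shouldMatch, shouldNotMatch) {
  // Drop the g and y flags so test() does not carry lastIndex between calls.
  const re = new RegExp(regex.source, regex.flags.replace(/[gy]/g, ''));
  const failures = [];
  for (const input of shouldMatch) {
    if (!re.test(input)) failures.push(`should match but did not: ${JSON.stringify(input)}`);
  }
  for (const input of shouldNotMatch) {
    if (re.test(input)) failures.push(`should not match but did: ${JSON.stringify(input)}`);
  }
  return failures;
}

const simpleEmail = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
console.log(
  checkRegex(
    simpleEmail,
    ['a@b.co', 'Jane.Doe@Example.com'],
    ['no-at-sign', 'two words@example.com', 'a@b']
  )
); // []
```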

The Debugging Nightmare

When your AI-generated regex fails:

  1. You can't read it
  2. AI can't explain it
  3. Regex debuggers crash
  4. Your coworkers hate you
  5. You switch careers

The Lessons Learned

  1. Simple > Complete - A regex that handles 90% of cases correctly is better than one that handles 99% but nobody understands

  2. Libraries > Regex - Someone already solved this problem better than your AI-generated regex

  3. Test Real Data - AI tests with perfect inputs. Your users will input chaos

  4. Document Everything - Future you will thank present you for explaining the regex

  5. Know When to Stop - If your regex looks like line noise, you've gone too far

My Favorite AI Regex Moment

// Me: "I need to validate numbers"
// AI: "Here's a comprehensive number validator!"

const numberRegex = /^[+-]?(?:(?:\d{1,3}(?:,\d{3})*)|(?:\d+))(?:\.\d+)?(?:[eE][+-]?\d+)?$/;

// Me: "I meant like... \d+"
// AI: "But what about scientific notation?"
// Me: "It's for a ZIP code"
// AI: "...oh"
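For the record, the ZIP code version fits on one line. A sketch, assuming US ZIP codes with an optional ZIP+4 suffix; `zipRegex` is a made-up name, and the number pattern is repeated from above for contrast:

```javascript
// Five digits, optionally followed by a hyphen and four more (assumed US format).
const zipRegex = /^\d{5}(?:-\d{4})?$/;

// The "comprehensive" pattern from above, unchanged.
const numberRegex = /^[+-]?(?:(?:\d{1,3}(?:,\d{3})*)|(?:\d+))(?:\.\d+)?(?:[eE][+-]?\d+)?$/;

console.log(zipRegex.test('90210'));      // true
console.log(zipRegex.test('90210-1234')); // true
console.log(zipRegex.test('1e5'));        // false
console.log(numberRegex.test('1e5'));     // true, which no mail carrier will accept
```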

AI writing regex is like using a Formula 1 car for your morning commute - technically it works, but you'll spend more time fixing problems than solving them. Regular expressions are already write-only code; adding AI to the mix creates write-never-read-never code. Sometimes the best regex is no regex. And if you must use regex, remember: the goal is to match patterns, not to prove the Riemann hypothesis. Keep it simple, keep it readable, and keep your sanity.


Originally published as a React component. Converted to Markdown for consistency.
