Daniel Doubrovkine wanted to clean up stale group DMs in his open-source Slack bot. He handed the job to an AI assistant. What followed is a useful case study in how AI-generated code can look impeccable and still take your production environment down.

Doubrovkine — former CTO of Artsy, currently at Shopify — runs slack-sup2, a Slack standup bot with real-world users. The cleanup job the AI wrote was syntactically clean and logically coherent in isolation. It was also completely blind to one critical constraint: Slack's conversations.close endpoint has a global rate limit of one request per second. Not per-user. Not per-workspace. Global.

Run that job against a workspace with hundreds of stale conversations and Slack starts throttling immediately. Because the rate limit is global, that throttling bleeds into everything else the app is doing — message posting, user info fetching, anything touching the API. The cleanup job didn't just fail; it dragged the rest of the application down with it.
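The arithmetic is easy to sketch. The following is a minimal, hypothetical pacer in plain Ruby (the names `GlobalRateLimiter` and the injected clock are invented for illustration, not code from slack-sup2): with a one-second global interval, closing N conversations takes at least N seconds, and every other API call in the process queues behind the same gate.

```ruby
# Hypothetical sketch: a process-wide pacer for a globally throttled endpoint.
# The clock and sleeper are injectable so the behavior can be simulated.
class GlobalRateLimiter
  def initialize(interval:,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(s) { sleep(s) })
    @interval = interval
    @clock = clock
    @sleeper = sleeper
    @next_allowed = clock.call
  end

  # Blocks until the next request slot is available, then reserves the one after.
  # EVERY caller in the process shares this gate -- that is what "global" means.
  def acquire
    now = @clock.call
    wait = @next_allowed - now
    @sleeper.call(wait) if wait > 0
    @next_allowed = [now, @next_allowed].max + @interval
  end
end

# Simulated run: three calls against a 1-second interval force 2 seconds of waiting.
waits = []
t = 0.0
limiter = GlobalRateLimiter.new(interval: 1.0,
                                clock: -> { t },
                                sleeper: ->(s) { waits << s; t += s })
3.times { limiter.acquire }
waits  # => [1.0, 1.0]
```

Scale that to hundreds of stale conversations and the job monopolizes the app's entire request budget for minutes on end.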

When Doubrovkine asked the AI to fix the rate limiting, things got worse. The rescue block it produced inserted a blocking sleep() call inside a socketry/async fiber context. In that framework, sleep() doesn't just pause the current task — it freezes the entire fiber scheduler, killing all concurrent operations. The correct call was Async::Task#sleep, but even that would have been a patch on a structural wound. Making hundreds of sequential calls against a globally throttled endpoint is a bad idea regardless of how gracefully each individual failure is caught.
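The scheduler-freezing failure can be illustrated without the async gem at all. This toy uses plain Ruby Fibers and a naive round-robin loop (all names here are invented for illustration): a task that yields control each turn interleaves with its neighbors, while a task that runs without yielding — the analogue of a blocking sleep() — starves everything else until it finishes.

```ruby
# Toy cooperative scheduler: resume each live fiber in turn until all finish.
def run(tasks)
  events = []
  fibers = tasks.map { |t| Fiber.new { t.call(events) } }
  fibers.each { |f| f.resume if f.alive? } while fibers.any?(&:alive?)
  events
end

# Cooperative task: one unit of work per turn, then yields back to the
# scheduler -- roughly what an async-aware sleep does.
coop = ->(name) { ->(ev) { 2.times { |i| ev << "#{name}:#{i}"; Fiber.yield } } }

# Blocking task: does all its work in one turn and never yields -- roughly
# what a plain blocking sleep does to a fiber scheduler.
block = ->(name) { ->(ev) { 2.times { |i| ev << "#{name}:#{i}" } } }

run([coop.("a"), coop.("b")])   # => ["a:0", "b:0", "a:1", "b:1"]
run([block.("a"), coop.("b")])  # => ["a:0", "a:1", "b:0", "b:1"]
```

In the second run, task "b" gets nothing done until "a" is entirely finished — the single-process equivalent of the whole bot going quiet while the cleanup job grinds.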

The real fix required thinking about the system, not just the function. Doubrovkine moved the cleanup into an existing 30-minute cron job, spreading API calls naturally over time. He added a feature flag defaulted to off, and wrote a separate slow-drain script to handle the backlog of thousands of already-unclosed DMs for existing users — a migration problem the AI had never considered. He then used GitHub Copilot for a narrower task: refactoring the batching logic so the job structurally cannot close more conversations per cycle than the rate limit allows. Every change is traceable in the public repo.
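The shape of that structural fix can be sketched in a few lines. Everything below is hypothetical — the constant, method, and parameter names are invented, and the real change lives in the public slack-sup2 repo — but it shows the two properties that matter: a hard per-cycle cap, so the job cannot exceed the rate budget no matter how large the backlog grows, and a feature flag that defaults to off.

```ruby
# Well under one request per second when spread over a 30-minute cron cycle.
CLOSE_BUDGET_PER_CYCLE = 60

# One cron cycle of cleanup. Anything beyond the budget simply waits for the
# next run -- the cap is structural, not a retry-on-failure afterthought.
def cleanup_cycle(stale_ids, close:, budget: CLOSE_BUDGET_PER_CYCLE, enabled: false)
  return [] unless enabled            # feature flag, defaulted to off
  batch = stale_ids.first(budget)     # hard cap per cycle
  batch.each { |id| close.call(id) }
  batch
end

closed = []
done = cleanup_cycle((1..500).to_a, close: ->(id) { closed << id }, enabled: true)
done.size  # => 60
```

A backlog of 500 stale DMs drains at 60 per cycle over several cron runs instead of hammering the endpoint in one burst — the same slow-drain idea as the migration script, expressed as an invariant of the code rather than a hope about its timing.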

Doubrovkine calls it "plausible-looking, locally coherent, globally wrong." The phrase nails what makes this failure mode dangerous. There's no syntax error to catch, no missing null check, no obvious bug. The code is fine until it meets real infrastructure at real scale. That's where invisible global invariants live — and where AI code assistants, so far, consistently fail to look.