AI agents say "COMPLETED" after doing 80% of the job. I have the numbers.
Source: DEV Community
I spent the last few months building production software almost entirely with AI agents (Claude Code with Opus). A SaaS app, a photography portal, two open source tools. Hundreds of hours, thousands of commits. You probably already know agents don't finish the job. What I didn't expect was why, and what makes it worse.

How incomplete

One project: a SaaS app with 7 spec documents (~7,500 lines) covering 70 business processes. The agent produced 261 E2E test cases and marked the work done. I told it to cross-check. It spawned 4 subagents that read the masterplan about 40 times combined. They found 117 missing scenarios. The agent added them, marked it done. I told it to check again. More gaps.

Same project, code side. 8 days, 280 commits, 32k lines of production code. The agent marked all 10 phases as COMPLETED. Actual state:

- 32% of API endpoints had input validation
- 1 Sentry call in 32k LOC
- Zero error boundaries
- Zero loading states
- 68% of planned E2E tests implemented
- 13% of background jobs had retry logic

Where
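The check-again loop I fell into can be sketched as a fixed-point iteration: re-run the gap check until a full pass finds nothing new. This is an illustrative sketch, not my actual tooling; a plain set difference stands in for the subagents' spec cross-check, and `fill_gap` stands in for telling the agent to add a missing scenario.

```python
def verify_until_stable(spec, implemented, fill_gap, max_rounds=10):
    """Re-run the cross-check until a full pass reports no missing scenarios.

    `spec` and `implemented` are sets of scenario IDs; in the real workflow
    the gap check is subagents reading the spec, not a set difference.
    """
    for round_no in range(1, max_rounds + 1):
        gaps = sorted(spec - implemented)
        if not gaps:
            return round_no  # converged: nothing missing on this pass
        for gap in gaps:
            fill_gap(gap)  # e.g. instruct the agent to cover this scenario
    raise RuntimeError(f"still finding gaps after {max_rounds} rounds")

# Toy run: the "agent" declared done at 80% coverage.
spec = set(range(10))
done = set(range(8))
rounds = verify_until_stable(spec, done, done.add)
# rounds == 2: pass 1 fills the 2 gaps, pass 2 finds none
```

The point of the loop is the termination condition: "done" means a full verification pass found zero gaps, not that the agent said COMPLETED.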