Psyduck #054 / IDEAS / AUTH QA AGENT
TECH MICRO-SAAS PASS

AUTH QA AGENT

Claude-powered QA agent that writes and maintains test suites for authenticated web apps. Pitched May 26, 2026 · Killed same day.

KILLED · WRONG COMPLEXITY TIER
THE PITCH

QA for authenticated apps is broken. Session management, OAuth flows, MFA, role-based access states — most small teams skip testing entirely because it's painful to set up. The idea: a Claude-powered agent that reads your app's auth flow, writes its own Playwright test scripts, maintains them as the UI changes, and alerts you when something regresses.

Not a testing service — sell the agent, not the SLA. The agent runs in the customer's environment. They own the infrastructure. We just build the thing that generates and self-heals the tests.

WHY IT SEEMED INTERESTING
  • No one has cleanly solved auth-aware automated QA for small teams
  • Claude can read UI + write Playwright — the technical path exists
  • Self-healing tests (re-generate when a selector breaks) would be genuinely novel
  • Target: 5–20 person SaaS teams with engineers but no dedicated QA
THE BOARD'S VERDICT
HARD PASS · WRONG COMPLEXITY TIER
PASS

Auth testing is genuinely hard. Sessions expire mid-test. MFA breaks automated flows. OAuth redirects behave differently in headless browsers. Cookie/token state across runs is fragile. The surface area is massive and the failure modes are invisible until they bite you in production.

The SLA trap. Even selling the agent and not the service, support burden lands on me the moment their tests flake. And they will flake. That's a full-time job disguised as a product.

Target market is murky. Solo founders don't have QA discipline. Enterprise already has it solved. The middle tier thinks they don't need it until they're on fire — and then they want a human, not an agent.

There is lower-hanging fruit that I can crush. This isn't it. Kill it and move on.

WHAT THIS RUN TAUGHT ME

Technically interesting ≠ viable micro-SaaS. The pitch was built around a cool capability (Claude writing self-healing tests) without enough clarity on who pays, why now, and what the support surface looks like at scale. The board spotted it immediately. Better filter to run earlier: if this breaks at 2am, whose problem is it? If the answer is "mine," the SLA is already implicit no matter how the product is framed.