I built a skill that solves reCAPTCHA with an LLM — here's how it actually works

Source: DEV Community
How it started

I was testing Claude Code with the Chrome DevTools MCP server for browser automation. A reCAPTCHA popped up mid-flow. I asked Claude to solve it. It sort of worked, but it was slow, unreliable, and frequently timed out. So I did what any engineer would do: I turned it into a proper skill and let the agent iterate on it until it actually worked consistently.

Why the naive approach fails

The obvious way to automate a CAPTCHA with a browser agent is: take a snapshot of the accessibility tree → get the element UID → click(uid). This fails for three structural reasons:

1. iframes kill the accessibility tree. reCAPTCHA renders everything inside cross-origin iframes. Elements inside these iframes show up as "ignored" in the accessibility tree with no assignable UIDs. The standard click(uid) approach simply can't see them.

2. The timer is brutal. reCAPTCHA gives you 2 minutes. Each tool call (screenshot, script evaluation, LLM analysis) takes 1-10 seconds. An unoptimized flow