I built a skill that solves reCAPTCHA with an LLM — here's how it actually works

By Blaze Glacier · March 19, 2026 · 1 min read

How it started

I was testing Claude Code with the Chrome DevTools MCP server for browser automation. A reCAPTCHA popped up mid-flow. I asked Claude to solve it. It sort of worked, but it was slow, unreliable, and frequently timed out. So I did what any engineer would do: I turned it into a proper skill and let the agent iterate on it until it actually worked consistently.

Why the naive approach fails

The obvious way to automate a CAPTCHA with a browser agent is: take a snapshot of the accessibility tree → get the element UID → click(uid). This fails for three structural reasons:

1. iframes kill the accessibility tree. reCAPTCHA renders everything inside cross-origin iframes. Elements inside these iframes show up as "ignored" in the accessibility tree with no assignable UIDs. The standard click(uid) approach simply can't see them.

2. The timer is brutal. reCAPTCHA gives you 2 minutes. Each tool call (screenshot, script evaluation, LLM analysis) takes 1-10 seconds. An unoptimized flow
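To put rough numbers on the timer problem in point 2, here is a minimal back-of-the-envelope sketch. It only uses the figures quoted above (a 2-minute limit and 1-10 seconds per tool call); the function name `max_tool_calls` and the constant are made up for illustration and are not part of the skill itself.

```python
import math

# Figure quoted in the post: reCAPTCHA allows about 2 minutes.
CAPTCHA_BUDGET_S = 120


def max_tool_calls(per_call_s: float, budget_s: float = CAPTCHA_BUDGET_S) -> int:
    """Upper bound on sequential tool calls that fit inside the time budget.

    Hypothetical helper for illustration: assumes calls run strictly one
    after another and each takes exactly per_call_s seconds.
    """
    if per_call_s <= 0:
        raise ValueError("per_call_s must be positive")
    return math.floor(budget_s / per_call_s)


if __name__ == "__main__":
    # The post gives 1-10 seconds per call; 5 is an arbitrary midpoint.
    for latency in (1, 5, 10):
        print(f"{latency:>2}s per call -> at most {max_tool_calls(latency)} calls")
```

At the slow end of that range, the whole solve has to fit in about a dozen sequential tool calls, which is why every wasted snapshot or retry matters.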
