How do you get an AI agent to reliably see, understand, and act on a webpage? It’s one of the hardest problems builders face today, riddled with challenges from messy DOMs to complex user authentication.
In this episode of AI Tinkerers One Shot, we sit down with Andrew Baker to pull apart the architecture of modern browser-native agents. Andrew shares the hard-won lessons from his journey, including how he used Claude for Chrome to drive a web-based version of Kid Pix and the Playwright tricks he used to handle remote execution and sound streaming.
Andrew and Joe also explore the broader landscape: the future of browser-native agents, how frameworks like Stagehand are transforming automation, the importance of UI accessibility for agents, and why personal evaluation benchmarks matter for builders pushing the limits of these tools.
What Builders Will Take Away
- The evolution of browser automation from simple scripts to complex, agent-driven systems.
- Core challenges for agents: DOM parsing, vision models, and inconsistent page structures.
- A deep dive into using Claude for Chrome to control a web-based version of Kid Pix.
- The technical architecture for remote execution, sound streaming, and advanced Playwright techniques.
- How modern frameworks like Stagehand support the browser automation stack.
- The future of browser agents, including key economic, technical, and ethical hurdles.
Watch the Episode
💡 Resources
- Andrew Baker – https://www.linkedin.com/in/andrewtorkbaker/
- Andrew’s newsletter: https://implausible.ai/
- AI Tinkerers – https://aitinkerers.org
Chapters
| Time | Topic |
|---|---|
| 00:15 | Introduction and AI Tinkerers Community |
| 02:49 | Twilio Origins and Browser Automation Journey |
| 04:50 | Building the Airline Seat Selector |
| 07:51 | Browser Agent Challenges and Vision Models |
| 10:44 | Stagehand Framework and Browser Automation Stack |
| 13:28 | Claude for Chrome and Authentication |
| 16:58 | Kid Pix Origins and Demo Setup |
| 21:33 | Technical Architecture and Playwright Tricks |
| 29:24 | Evaluation Platform and Personal Benchmarks |
| 37:42 | Future of Browser Agents and Web Economics |
—
Submit Your Content Idea
Suggest a topic or submit a draft for a future blog post or video related to AI Tinkerers themes, such as browser agents, automation, or technical architecture. We’re always looking for builders to feature.
Subscribe for more conversations with the builders shaping the future of AI and automation!
Andrew Baker on Browser-Native AI Agents, Playwright Tricks, and the Future of Web Automation