The Complete Guide to WebMCP — Tara Agyemang, Google Chrome — AI Engineer

Intro0:00

Tara Agyemang0:15

Hello, hello. Hello. Can you hear me okay?

Guest0:19

Yes.

Tara Agyemang0:19

Okay, cool. Let's get started. So we're going to be talking a little bit about WebMCP. Just out of curiosity, has anybody already played around with WebMCP? Only a few people. Okay, great. Those few people, you have a bit of a head start, but for everyone else, we'll be going into a bit more of the background: how it works and what it does.

So my name is Tara. I am part of the Google Chrome team. I'm a developer relations engineer, and I'm here with a few of my colleagues from Google Chrome, alongside the DeepMind team too. So we'd be really interested in talking to you afterwards around the DeepMind booth if you have thoughts on web and AI and the intersection between the two.

That is where my focus is these days. So let's get into it. For the, let's say, past few decades, we have been building the web for human actions and human eyes, and we've been trying to optimize for that. But these days, it's not just humans that are using the web. We have agents using the web on humans' behalf too, and we are seeing an increasing number of agents using the web.

But the problem is that agents have to do so much work to do simple actions on the sites that we've built. Just to give you an example of this, this is a website that I've vibe coded, and it's a concert website for selling tickets for concerts. And we have the Gemini in Chrome panel on the side here.

And let's say you've come along to this website, and you've typed this prompt. You want to buy two tickets to the Afro Beats Festival. You've given it the details. The AI agent has to do so much work to make this happen. So it'll probably look at the HTML, because usually the agents will parse the entire DOM just to understand what's happening on your page.

Then it will look into the accessibility tree just to understand the structure of your HTML page. Then maybe it'll take a screenshot of the page, analyze all the different elements that it couldn't see in the HTML and the accessibility tree, and then maybe it will measure how far down it needs to click, how far across, where the exact element that it needs to click is, and then it'll click that element.

And as you can see, this process is quite long. It can be brittle, and I don't even wanna guess at how many tokens you've probably just used trying to do this. It's probably a lot. And then after all that, maybe your ad has loaded at the top of the page, pushed all your content down, and your AI agent couldn't even click the right place in the end.

So there's so much to think about. But before we go into this proposed web standard, it's worth mentioning that you can do so much by improving web foundations first. Making your site accessible for everyone makes it accessible to AI agents by default. So if you improve your semantic HTML, if you focus on robust accessibility standards, if you improve your page performance, make it load really quickly, think about those Core Web Vitals, and then build really good user experience flows through your site, you're already halfway to getting an agent-ready website.

WebMCP Intro3:47

Tara Agyemang3:47

And it's only once you have those in place that it makes sense to start thinking about WebMCP. So if you're not already aware, the Web Model Context Protocol is a proposed web standard that gives you the ability to define your site's capabilities as structured tools for AI agents to use. You might have heard references to this as the USB-C of AI agent interactions.

And that's because instead of any agent guessing what your website does, you're giving the AI agent a menu of tools that it can use and actions that it can take. And because of this, we're seeing that WebMCP significantly improves the performance and the reliability of agents navigating your website.

Maze Demo4:43

Tara Agyemang4:43

So let's see it in action. Hopefully Gemini treats me well today. So this is the Maze Escape game built by our team in Chrome DevRel. And on the side here, we have a Chrome extension. I'll show you a link to that afterwards. This is the Model Context Tool Inspector. It's a standard Chrome extension that lives in your side panel, and it lists all the tools that it finds on your website.

So at the moment, it can only see one tool, and that's the Start Maze Game tool. And then at the bottom down here, it gives you two options to interact with the page. You can interact via a prompt, like a user would normally prompt their AI agent, or you can call tools directly at the bottom, but we won't be looking at that one today.

So this specific maze game is more unique in that you can't browse it by clicking around the UI. You can only use this app with the AI tooling. So let's start a new maze game here. You can also choose your model on the side, so let's stick with Gemini 2.5. You'll see that at the bottom, when you send a prompt, it gives you all the information.

So the new prompt was to start a new maze game, and the AI agent, Gemini in our case, has called that tool Start Game. The tool itself has returned this information, and then the AI has read that and given me this response. And so now we have our maze, and you'll notice that on this page we have a bunch of new tools in the scope of this page, whereas the previous page only had that one tool.

This page, we've got a bunch of tools to help us navigate the maze. So in this maze, you can move around with the north, south, east, west directions. You can look to see where you are in the maze and which directions are open, and then you can pick up items, drop items, use items as you navigate this maze.

And if I pop in some prompts, I can see that I can move down, then maybe after that, right. The AI agent should take my prompt and match it to the specific tool, so in this case, the Move tool. It's taken my directions of down and right, matched those to the north, south, east, west directions, and sent that off to the tool that we have registered on this page, and then it's moved down and right.
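
As a rough illustration of the Move tool described above, here is a minimal sketch of what such a tool could look like on the page side. The maze layout, the tool shape, and every name in it are assumptions made for illustration; this is not the demo's source code. The point is that the schema restricts `direction` to the four compass values, so the agent has to translate "down" or "right" into one of them before calling the tool.

```typescript
// Hypothetical sketch of a maze "move" tool. All names and the maze itself
// are made up for illustration; they are not taken from the demo's code.
type Direction = "north" | "south" | "east" | "west";

interface Position {
  row: number;
  col: number;
}

// 0 = open cell, 1 = wall. A tiny 3x3 maze.
const maze: number[][] = [
  [0, 1, 0],
  [0, 0, 0],
  [1, 1, 0],
];

// Row/column change for each compass direction.
const offsets: Record<Direction, Position> = {
  north: { row: -1, col: 0 },
  south: { row: 1, col: 0 },
  east: { row: 0, col: 1 },
  west: { row: 0, col: -1 },
};

let player: Position = { row: 0, col: 0 };

// A cell is open only if it is inside the grid and not a wall.
function isOpen(p: Position): boolean {
  return maze[p.row]?.[p.col] === 0;
}

const moveTool = {
  name: "move",
  description:
    "Move the player one cell. Up means north, down means south, right means east, left means west.",
  inputSchema: {
    type: "object",
    properties: {
      direction: { type: "string", enum: ["north", "south", "east", "west"] },
    },
    required: ["direction"],
  },
  // Returns a plain string here for brevity; a real tool may return a richer result.
  execute({ direction }: { direction: Direction }): string {
    const step = offsets[direction];
    const next = { row: player.row + step.row, col: player.col + step.col };
    if (!isOpen(next)) {
      return `Blocked: there is a wall to the ${direction}.`;
    }
    player = next;
    return `Moved ${direction} to row ${player.row}, column ${player.col}.`;
  },
};
```

Returning a sentence that says what happened, including the blocked case, gives the agent something to reason over before its next call.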

And because it's an AI agent, it can understand a whole bunch of different things. So I could just say, "R, up," maybe "R" again. Let's try that.

And so the AI agent has seen that R stands for right, mapped that to the direction, and then called the Move tool with that information. And because it's an AI agent, it can just keep repeating tool calls until it thinks that it's done what needs to be done. So I could even say, "Complete the maze."

And then the AI agent should use all the tools available to just keep moving around the maze, to pick up items, to use the items when it needs to, because it has all the information in the tools available. This specific prompt was not the most efficient, so sometimes you'll see it'll go backwards all the way to the start and then go forwards again.

But the more that you refine the prompt, the better the agent knows how to complete the maze in the most efficient way. For example, if you just say, "The exit is in the bottom right corner," it'll be more efficient in its instructions to get there. So I won't continue this, because it can take quite a while to complete this maze.

Tool Inspector8:59

Tara Agyemang8:59

But if we go back to the slides here.

So this is the Model Context Tool Inspector that I mentioned. This is the extension that our team in Chrome DevRel built. The QR code there is if you want to see where that is in the Chrome Web Store, and anyone can grab it from the Web Store. But essentially, WebMCP unlocks this new approach to using the web, where your users don't have to spend a lot of time trying to figure out how to use more complicated sites, and they can figure out their own workflow.

So they can choose to browse your website the normal way for a bit, then they can hand over control to their AI agent, and the AI agent takes steps on their behalf. And then your user can come in at any time to take control again and browse your site again the way they normally would. And so that ability to simplify user journeys and make those user journeys for people easier has been a large part of the reason we've seen interest and excitement in this new standard.

So I want to pause for a minute just to address a question that some people have, and that's: what is the difference between WebMCP and MCP? You can see them as being complementary to each other. MCP enables AI agents to connect to applications on the server side: you need to set up your own server for the agent to access, and then the agent can access the information anywhere, at any time. WebMCP is different in that it's kind of inspired by MCP.

WebMCP vs MCP10:13

Tara Agyemang10:48

I like to think of it as how JavaScript is inspired by Java. In short, WebMCP is the implementation of the tools part of MCP. WebMCP allows engineers to provide tools to in-browser AI agents, and it's very specific to client-side features. So you have to have your browser window open for WebMCP to work, and then you can use it to help your agent interact with the browser.

So all of the tools live in the browser. But you can imagine this for quite a few different types of use cases. So imagine those websites that are really complicated, where there are a lot of steps that a user needs to take, maybe like booking a flight or filtering products on a normal shopping website, or filling in complicated medical forms or financial forms, or to trigger fixes that are hidden on a page.

Use Cases11:21

Tara Agyemang11:54

Or if you're like me, you're just on a normal shopping site, and you're trying to find the right black faux leather clutch bag that can fit your mobile phone in, and instead of going through all the little filters, you just wanna ask your AI agent to do it for you. So these are a bunch of examples where any user can ask whatever AI agent they are using to complete these things on their behalf, so the user doesn't have to manually do this, and they don't have to fill in each input, they don't have to select each checkbox.

And using WebMCP in these cases can mean that you can make those actions much easier for users. So let's look at the APIs. WebMCP proposes two approaches for implementation. So you've got the declarative API and the imperative API. Let's start with the declarative API. So if you have a normal HTML form, you can just add a few attributes to the HTML to get this to work.

APIs12:28

Tara Agyemang12:57

So we've got the tool name and tool description here, and then your browser will automatically generate a JSON schema that the agent can read, using the form fields as parameters for the tool. So here's an example of what the JSON schema would look like for this form HTML. And there are a whole bunch of other attributes that can be used.

So there's an agentInvoked Boolean attribute, so you can tell whether your form was filled in by an agent or by a human. And there are more specific attributes that can be used for things like that too. But essentially, you wanna use the declarative API when you have a standard form element.
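
To make the schema generation described above more concrete, here is a hedged sketch of how a form's fields could map onto a JSON schema. It is an illustration of the idea only: the browser's real algorithm may differ, and `FieldSpec`, `FormToolSpec`, `toToolSchema`, and the example form are all hypothetical names invented for this sketch.

```typescript
// Illustrative only: one way form fields could map to a JSON schema.
// The browser's actual generation logic may differ; all names are hypothetical.
interface FieldSpec {
  name: string;
  inputType: "text" | "number" | "checkbox";
  required: boolean;
  description?: string;
}

interface FormToolSpec {
  toolName: string;
  toolDescription: string;
  fields: FieldSpec[];
}

type JsonType = "string" | "number" | "boolean";

// Map each supported input type to a JSON schema type.
const jsonTypeFor: Record<FieldSpec["inputType"], JsonType> = {
  text: "string",
  number: "number",
  checkbox: "boolean",
};

function toToolSchema(form: FormToolSpec) {
  const properties: Record<string, { type: JsonType; description?: string }> = {};
  for (const field of form.fields) {
    properties[field.name] = {
      type: jsonTypeFor[field.inputType],
      // Only include a description when the field provides one.
      ...(field.description ? { description: field.description } : {}),
    };
  }
  return {
    name: form.toolName,
    description: form.toolDescription,
    inputSchema: {
      type: "object" as const,
      properties,
      // Required form fields become required tool parameters.
      required: form.fields.filter((f) => f.required).map((f) => f.name),
    },
  };
}

// A made-up ticket form, described as data rather than HTML.
const ticketForm: FormToolSpec = {
  toolName: "buyTickets",
  toolDescription: "Buy tickets for a concert by name.",
  fields: [
    { name: "concert", inputType: "text", required: true, description: "Concert name" },
    { name: "quantity", inputType: "number", required: true },
    { name: "vip", inputType: "checkbox", required: false },
  ],
};

const ticketTool = toToolSchema(ticketForm);
```

The takeaway is that each named form field becomes one tool parameter, which is why a plain, well-labelled form is a good fit for the declarative API.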

But when you have something more complicated, that's when we wanna turn to the imperative API. So this is where you can register and define your own custom tools for when you have more complex, maybe multi-step UI flows. So here is an example. At the bottom, we have this registerTool function, and when you call registerTool with an object like this, you need to manually create your own schema, similar to the one that was generated for us in the declarative API.

You name your tool and give it the description, and you wanna make sure you have really descriptive descriptions that enable the AI agent to know when it should be calling this tool.

And then you have the execute block, which is essentially where you call normal JavaScript. Maybe you already have functions that you're using that you can call in here, maybe with a light wrapper. In this addTodoItem example, you can validate and trim the text input, for example, and then you create the DOM elements or DOM nodes and add them to your page.

And then you wanna return some information to the AI agent so it knows what's happened, if everything happened successfully, so it can use that information for its next steps.
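
The slide itself is not reproduced in this transcript, so the following is only a sketch of the addTodoItem pattern just described, under stated assumptions. `registerTool` and `addTodoItem` are named in the talk; the `navigator.modelContext` entry point and the `content` result shape are assumptions based on the public WebMCP explainer at the time of writing and may change, and `renderTodo`, `textResult`, and the in-memory `todos` array are hypothetical stand-ins for real DOM work.

```typescript
// Sketch under assumptions: the entry point and result shape follow the public
// WebMCP explainer at the time of writing. The API is experimental and may change.
interface ToolResult {
  content: { type: "text"; text: string }[];
}

interface ToolDefinition<Input> {
  name: string;
  description: string;
  inputSchema: object;
  execute(input: Input): ToolResult;
}

const todos: string[] = [];

// Hypothetical render hook; on a real page this would create and append DOM nodes.
let renderTodo: (text: string) => void = () => {};

function textResult(text: string): ToolResult {
  return { content: [{ type: "text", text }] };
}

const addTodoItem: ToolDefinition<{ text: string }> = {
  name: "addTodoItem",
  description:
    "Add one item to the user's to-do list. Use when the user asks to add, create, or remember a task.",
  inputSchema: {
    type: "object",
    properties: {
      text: { type: "string", description: "The task, as short plain text" },
    },
    required: ["text"],
  },
  execute({ text }) {
    // Validate and trim before touching the page.
    const trimmed = text.trim();
    if (trimmed.length === 0) {
      return textResult("Error: the to-do text was empty, nothing was added.");
    }
    todos.push(trimmed);
    renderTodo(trimmed);
    // Tell the agent what happened so it can plan its next step.
    return textResult(`Added "${trimmed}". The list now has ${todos.length} item(s).`);
  },
};

// Register only where the experimental API exists (assumed entry point).
const modelContext = (globalThis as any).navigator?.modelContext;
if (typeof modelContext?.registerTool === "function") {
  modelContext.registerTool(addTodoItem);
}
```

Guarding the registration keeps the page working in browsers without the flag, and keeping `execute` a thin wrapper over existing logic means the human UI and the agent share one code path.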

Ticket Demo15:04

Tara Agyemang15:04

So those are the two APIs. The imperative API is probably the one that's most used, because people have more complex UI flows that they want the agent to complete. But if we go back to my vibe coded demo, I have added a few tools here.

So we have a few featured events in the demo, and then all of the events available down here, and then you can go in and purchase tickets on an individual concert page.

So I have noticed that this works much better with Gemini 3.1, so I'm gonna try that one.

If we wanted to buy tickets to one of these festivals... Let's buy tickets to the Summer Vibes Festival. Let's say two VIP tickets, because VIP only for me.

Send that prompt. So the A- the AI saw the tool searchConcerts, which it has called to find the specific concert via the concert name, and the tool returned the information about the concert, including the ID for that concert. Then it has called the second tool, openConcertPage, with the concert ID, and that has opened this Summer Vibes Festival page.

And then this new page has separate tools. This one here called purchaseTicket, and it's called that in the third tool call here with a quantity two and the section name. And then we've got a little notification to say, "Oh, you've bought your tickets. You spent £356." Great. I'll put that on Google's credit card.
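
The three-call chain narrated above can be sketched roughly as follows. The tool names searchConcerts, openConcertPage, and purchaseTicket come from the talk; everything else is assumed for illustration, including the concert IDs, the function signatures, the idea that the listing page exposes only the first two tools, and the prices. The VIP price of 178 is chosen only so that two tickets match the £356 total mentioned in the demo; the other prices are made up.

```typescript
// Hypothetical data and logic modeled on the demo narration, not its source code.
interface Concert {
  id: string;
  name: string;
  sections: Record<string, number>; // section name -> assumed price in pounds
}

const concerts: Concert[] = [
  { id: "c1", name: "Summer Vibes Festival", sections: { General: 60, VIP: 178 } },
  { id: "c2", name: "Afro Beats Festival", sections: { General: 55, VIP: 150 } },
];

// null means the listing page is open; otherwise a concert page is open.
let currentConcertId: string | null = null;

function searchConcerts(query: string): Concert[] {
  const q = query.trim().toLowerCase();
  return concerts.filter((c) => c.name.toLowerCase().includes(q));
}

function openConcertPage(concertId: string): string {
  const concert = concerts.find((c) => c.id === concertId);
  if (!concert) return `No concert with id ${concertId}.`;
  currentConcertId = concert.id;
  return `Opened the page for ${concert.name}.`;
}

function purchaseTicket(quantity: number, section: string): string {
  const concert = concerts.find((c) => c.id === currentConcertId);
  if (!concert) return "Open a concert page first.";
  const price = concert.sections[section];
  if (price === undefined) return `Unknown section ${section}.`;
  if (!Number.isInteger(quantity) || quantity < 1) {
    return "Quantity must be a positive integer.";
  }
  return `Selected ${quantity} ${section} ticket(s) for ${concert.name}: £${quantity * price}.`;
}

// The tools an agent would see depend on which page is open.
function availableTools(): string[] {
  return currentConcertId === null
    ? ["searchConcerts", "openConcertPage"]
    : ["purchaseTicket"];
}
```

Each call returns what the next one needs (the search result carries the ID, opening the page changes which tools exist), which is what lets the agent chain the steps without reading the DOM.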

But you can see as well, in each step, it's updated the UI to make sure the user can also see what's happening. So you always want to make sure that your UI is in sync with the tool calls that are happening. So we've got the VIP selected, we've got the quantity selected, and then, in real life, it would go through to some checkout page.

You'll probably want your user to manually do that step so they know that they're spending real money.
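
One way to keep the UI in sync with tool calls, and to leave the final payment step to the person, is to route both through a single shared state. The sketch below is an assumption about how that could be structured, not how the demo is built; `createStore`, `selectTickets`, `confirmCheckout`, and `uiLog` are hypothetical names.

```typescript
// Hypothetical shared state: the UI and the agent-facing tool use the same store.
interface Selection {
  section: string;
  quantity: number;
  confirmedByUser: boolean;
}

type Listener = (state: Selection) => void;

function createStore(initial: Selection) {
  let state = initial;
  const listeners = new Set<Listener>();
  return {
    get: () => state,
    set(patch: Partial<Selection>) {
      state = { ...state, ...patch };
      // Notify every subscriber so the visible UI never lags behind a tool call.
      listeners.forEach((l) => l(state));
    },
    subscribe(l: Listener) {
      listeners.add(l);
      return () => listeners.delete(l);
    },
  };
}

const store = createStore({ section: "General", quantity: 1, confirmedByUser: false });

// Stand-in for rendering; a real page would update its form controls here.
const uiLog: string[] = [];
store.subscribe((s) => uiLog.push(`${s.quantity} x ${s.section}`));

// Tool handler: updates the visible selection but never confirms payment.
function selectTickets(section: string, quantity: number): string {
  store.set({ section, quantity });
  return "Selection updated. The user must confirm payment on the checkout page.";
}

// Intended to be called only from a real user action, such as a button click.
function confirmCheckout(): string {
  store.set({ confirmedByUser: true });
  const { quantity, section } = store.get();
  return `Order placed: ${quantity} x ${section}.`;
}
```

Because the tool only ever writes the selection, the agent can prepare the order while the step that spends money stays behind an explicit user action.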

Let's head back. So if you're interested in trying this out, it's probably worth just understanding the status of where we're at with WebMCP. So we're still in early preview stage. This API is very experimental. It will change. It has been changing over the past few weeks, and so the code that I've shown might be different next week.

Getting Started17:45

Tara Agyemang18:08

But that's because we want people to try it out. We want feedback. We want to know the best way to use this API. And if you're interested in doing that, these are a few steps to get set up. So WebMCP is enabled in Chrome version 146 upwards. I recommend using Chrome Canary just so you can keep things separate.

Otherwise, in normal Chrome, you have to enable experimental flags, and you might not want to do that on your normal browser. Once you have Chrome Canary, you'll need to enable the WebMCP testing flag by putting this flag in your URL bar, and then install the Model Context Tool Inspector extension from the Chrome Web Store that I mentioned earlier, just so you can play around and debug and see what your tools are doing.

Then these are the two resources that I recommend taking a look at. So this is our main blog post that gives you information on the early preview program for WebMCP. If you sign up there, you get access to all of our initial documentation, and you get extra information about the program, information on best practices, all the extra implementation details that you might want to use while you're testing it out, and all of the API information.

That is the first one, and the second one is the GitHub repository of all the tools. So we've got the inspector tool here. We've got all the demos, so you can see the maze demo code is live there for you to play around with. There are about six or seven different demos you can try out, and there's an evals CLI tool you can use to help you start testing the WebMCP tools on your own sites today.

Wrap-up20:05

Tara Agyemang20:05

So I mentioned we're still in early preview. That's because we're looking for feedback. So try it out. Let us know what you think, if you have any friction points, if you find any bugs. We'd love to know that so we can keep iterating on this API and eventually move on to the next stage and start getting WebMCP in front of more users.

But to wrap up, AI agents are already using the web. We don't have to settle for these token-heavy, brittle screen-scraping processes that we have today. Instead, we can use WebMCP tools to turn every website into a high-performance API for agents and at the same time build incredible user experiences for the users of our sites. So now that you have the tools and the context, please give it a go and try making your websites agent-ready today.

Thank you very much.
