**Update 3/26/2025**
OpenAI added MCP support to their agent SDK.
--------------------------------
On Nov. 25, 2024, Anthropic published a blog post about the Model Context Protocol (MCP). It didn't get much attention at the time because it wasn't an announcement of much significance. Suddenly in 2025, MCP has gotten a lot of hype and attention, and it has also caused a lot of confusion as to why there are so many YouTube talking heads discussing it.
It was nice that Anthropic published how they connect Claude with tools in the Claude Desktop app, even if the post was a bit of marketing to sell it as a standard and to encourage an open community. There is a technical aspect (a protocol) to it, but it felt like a business play to get developers to extend Claude with plugins.
Large Language Models like Claude cannot perform any actions. They're like a brain with no body. They might know that an email should be sent, but they can't actually send the email. Connecting the "thought" (send email) with the action requires basic programming. MCP is how Anthropic does it, but it isn't the only way. Let's take a look at various ways that this is accomplished and then see how MCP fits in.
You Are the Agent
In this scenario, a person goes to claude.ai and has a conversation with Claude about writing an email to invite someone to lunch. Claude generates the email body and tells the person to copy it into an email program to send. The person manually copies that text into their email or calendar app and sends the invitation. The person is the agent because they are performing the action.
Using an AI Assistant
Here, a person uses an app (this can be a web app, desktop app, mobile app, etc.) that acts as a personal assistant, à la Jarvis from Iron Man. The user asks Jarvis to send an email inviting someone to lunch. Jarvis composes the invitation message and sends it through a calendar app so that the event is also recorded on the calendar. So how does Jarvis do this?
Method 1
- Jarvis sends your question to the LLM along with prompts describing the available tool(s).
- Jarvis looks at the response to determine which tools to use.
- Jarvis executes the chosen tool(s) through the tool API.
- The results are sent back to the LLM.
- The LLM formulates a natural language response.
- The response is displayed to you!
In the early days (2023), this might be done like this:
The user speaks to Jarvis, “Jarvis, invite Thor to lunch next Wednesday.”
The Jarvis code passes the text to an LLM along with an additional prompt:
Respond to the user, but if the request is to add to a calendar then respond in JSON:

```json
{
  "tool": "calendar",
  "invitee": "name",
  "date": "date",
  "time": "time",
  "body": "message"
}
```
The Jarvis program gets the response; if it is a tool response, it parses the JSON and calls the calendar API.
The Jarvis code calls the LLM again with the text, "tell the user that the invite was successfully sent," and then returns the response to the user.
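The prompt-and-parse approach above can be sketched in a few lines of Python. This is an illustrative sketch, not Anthropic's actual code: `send_invite` is a hypothetical stand-in for a real calendar API, and the JSON shape is just the one from the example prompt.

```python
import json

# Prompt appended to the user's message (mirrors the example prompt above).
TOOL_PROMPT = (
    "Respond to the user, but if the request is to add to a calendar "
    'then respond only with JSON: {"tool": "calendar", "invitee": ..., '
    '"date": ..., "time": ..., "body": ...}'
)

def send_invite(invitee, date, time, body):
    """Hypothetical stand-in for a real calendar API call."""
    print(f"Calendar API: inviting {invitee} on {date} at {time}: {body}")

def handle_response(llm_text: str) -> str:
    """Decide whether the LLM replied with plain text or a tool request."""
    try:
        payload = json.loads(llm_text)
    except json.JSONDecodeError:
        return llm_text  # ordinary conversational reply, pass it through
    if isinstance(payload, dict) and payload.get("tool") == "calendar":
        send_invite(payload["invitee"], payload["date"],
                    payload["time"], payload["body"])
        return "Invite sent."
    return llm_text
```

The fragility here is obvious: the LLM must emit exactly the JSON the prompt asked for, and every tool means more hand-written prompt text and parsing code.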
Method 2
- Developer registers the available tools.
- Jarvis sends your question to the LLM.
- The LLM analyzes the available tools and decides which one(s) to use.
- Jarvis executes the chosen tool(s) through the tool API.
- The results are sent back to the LLM.
- The LLM formulates a natural language response.
- The response is displayed to you!
An enhancement was added to many LLMs' APIs to allow developers to register tools along with their purpose and parameters. The Jarvis code registers a tool called "calendar", gives it a description such as "Tool to add, update and remove events on the user's calendar.", and declares the parameters it needs.
Now, when Jarvis passes “Jarvis, invite Thor to lunch next week,” to the LLM, it will respond with JSON and Jarvis can call the calendar API.
The Jarvis code calls the LLM again with the text, "tell the user that the invite was successfully sent," and then returns the response to the user.
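A registration in this style typically looks like the sketch below. The exact field names vary by provider, so treat this shape, and the `dispatch` helper, as illustrative assumptions rather than any particular vendor's API.

```python
import json

# A tool registration in the JSON-schema style most LLM tool-calling APIs use.
# Field names are illustrative; each provider's API differs slightly.
TOOLS = [{
    "name": "calendar",
    "description": "Tool to add, update and remove events on the user's calendar.",
    "parameters": {
        "type": "object",
        "properties": {
            "invitee": {"type": "string"},
            "date": {"type": "string"},
            "time": {"type": "string"},
            "body": {"type": "string"},
        },
        "required": ["invitee", "date"],
    },
}]

def dispatch(tool_call: dict) -> str:
    """Route a structured tool call returned by the LLM to the real API."""
    if tool_call["name"] == "calendar":
        # Many APIs return the arguments as a JSON-encoded string.
        args = json.loads(tool_call["arguments"])
        return f"Invited {args['invitee']} on {args['date']}"
    raise ValueError(f"Unknown tool: {tool_call['name']}")
```

The difference from Method 1 is that the LLM provider now guarantees a structured tool-call response, so Jarvis no longer has to coax JSON out of the model with a hand-written prompt.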
Method 3 (MCP)
- User registers the available tools.
- Jarvis sends your question to Claude.
- Claude analyzes the available tools and decides which one(s) to use.
- Jarvis executes the chosen tool(s) through the MCP server, which calls the tool API.
- The results are sent back to Claude.
- Claude formulates a natural language response.
- The response is displayed to you!
With MCP, the user (on desktop/mobile) or developer (in the cloud) registers MCP servers with Jarvis. Jarvis can then get the tool descriptions from the MCP server, which it passes to the LLM.
When Jarvis passes “Jarvis, invite Thor to lunch next week,” to the LLM, the LLM will determine the tool to use.
Jarvis will then call the MCP server to send the calendar invite.
The Jarvis code calls the LLM again with the text, "tell the user that the invite was successfully sent," and then returns the response to the user.
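In Claude Desktop, for example, registering an MCP server is a JSON entry in the app's config file; the app launches the server and asks it for its tool descriptions. The server name, command, and path below are hypothetical:

```json
{
  "mcpServers": {
    "calendar": {
      "command": "node",
      "args": ["/path/to/calendar-mcp-server/index.js"]
    }
  }
}
```

Note what moved: the tool's description and parameter schema now live inside the MCP server, written by the tool developer, instead of in Jarvis's own code.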
Comparison
With MCP, tool registration is pushed to the user and the tool description is handed off to the tool developer, but otherwise the steps remain the same.
Comparing the different methods shows that the steps are the same; only the implementation differs. This is one reason for the confusion: there seems to be very little benefit.
Having a standard protocol can be advantageous, but only if all LLM providers adopt it; otherwise it is just another way to interact with Claude.
MCP servers are potentially reusable and might ease integration, which is a benefit since it would be like having only one API to learn. But this requires wide adoption and availability, which isn't a given even with the backing of one of the big LLM providers.
Shortcomings of the MCP
As a protocol, it has a lot of shortcomings, and the technical benefits are minor.
Some technical shortcomings are:
- There’s no discovery mechanism other than manual registration of MCP servers.
- MCP adds extra server processes to the tech stack for functionality that could be achieved with a library.