@jspahrsummers and I have been working on this for the last few months at Anthropic. I am happy to answer any questions people might have.

I read through several of the top level pages, then SQLite, but still had no idea what was meant by "context" as it's a highly ambiguous word and is never mentioned with any concrete definition, example, or scope of capability that it is meant to imply.
After reading the Python server tutorial, it looks like there is some tool calling going on, in the old terminology. That makes more sense. But none of the examples seem to indicate what the protocol is, whether it's a RAG sort of thing, do I need to prompt, etc.
It would be nice to provide a bit more concrete info about capabilities and what the purpose is before getting into call diagrams. What do the arrows represent? That's more important to know than the order that a host talks to a server talks to a remote resource.
I think this is something that I really want and want to build a server for, but it's unclear to me how much more time I will have to invest before getting the basic information about it!
The gist of it is: you have an llm application such as Claude desktop. You want to have it interact (read or write) with some system you have. MCP solves this.
For example you can give the application the database schema as a “resource”, effectively saying; here is a bunch of text, do whatever you want with it during my chat with the llm. Or you can give the application a tool such as query my database. Now the model itself can decide when it wants to query (usually because you said: hey tell me what’s in the accounts table or something similar).
It’s “bring the things you care about” to any llm application with an mcp client

Or, in short: it's (an attempt to create) a standard protocol to plug tools to LLM app via the good ol' tools/function calling mechanism.

It's not introducing new capabilities, just solving the NxM problem, hopefully leading to more tools being written.

(At least that's how I understand this. Am I far off?)
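To make the resource-vs-tool distinction above concrete, here's a rough sketch of a server along the lines of the Python tutorial. The decorator and type names are my reading of the Python SDK and may be slightly off, and get_schema()/run_query() are hypothetical stand-ins for your own code:

# Hedged sketch; decorator/type names follow my reading of the Python SDK.
from mcp.server import Server
import mcp.types as types

server = Server("my-database")

@server.list_resources()
async def list_resources() -> list[types.Resource]:
    # A resource: "here is a bunch of text, do whatever you want with it"
    return [types.Resource(uri="db://schema", name="Database schema",
                           mimeType="text/plain")]

@server.read_resource()
async def read_resource(uri) -> str:
    return get_schema()  # hypothetical helper returning the schema as text

@server.list_tools()
async def list_tools() -> list[types.Tool]:
    # A tool: the model itself decides when to call this during the chat
    return [types.Tool(
        name="query",
        description="Run a read-only SQL query against my database",
        inputSchema={"type": "object",
                     "properties": {"sql": {"type": "string"}},
                     "required": ["sql"]})]

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    rows = run_query(arguments["sql"])  # hypothetical helper
    return [types.TextContent(type="text", text=str(rows))]

The resource is just text the client can pull into the chat whenever it wants; the tool is something the model can decide to call mid-conversation.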
We definitely hope this will solve the NxM problem.
On tools specifically, we went back and forth about whether the other primitives of MCP ultimately just reduce to tool use, but ultimately concluded that separate concepts of "prompts" and "resources" are extremely useful to express different _intentions_ for server functionality. They all have a part to play!
I think this where the real question is for me. When I read about MCP, the topmost question in my mind is "Why isn't this just tool calling?" I had difficulty finding an answer to this. Below, you have someone else asking "Why not just use GraphQL?" And so on.
It would probably be helpful for many of your readers if you had a focused document that addressed specifically that motivating question, together with illustrated examples. What does MCP provide, and what does it intend to solve, that a tool calling interface or RPC protocol can't?

You can find more information on some design questions like these in https://spec.modelcontextprotocol.io/specification, which is a much more "implementors" focused guide than the user documentation at https://modelcontextprotocol.io

Seems more accurate to state this reshapes the NxM problem rather than solving it.
Yeah even I don't understand how it exactly solves the NxM problem (which translates to having M different prompts for N different llms. Correct me if I'm wrong please)
N (LLM clients/vendors) x M (tools/tool suppliers).
The N×M problem may simply be moved rather than solved:
- Instead of N×M direct integrations
- We now have N MCP client implementations
- M MCP server implementations
This feels similar to SOAP but might be more of a lower level protocol similar to HTTP itself. Hard to tell with the implementation examples being pretty subjective programs in python.

Does it give a standard way to approve changes? I wouldn't want to give an LLM access to my database unless I can approve the changes it applies.
It seems to support your ask, as much as a protocol can. Having read all the docs and looked through some code, my mental model is:
- A host never talks to a server directly, only via a Client (which is presumably a human). The host has or is the LLM (app).
- A server only supplies context data (readonly), in the form of tool call, direct resource URL, or pre populated prompt. It can call back to a client directly, for example to request something from the hosts LLM.
- A client sits in the middle, representing the human in the loop. It manages the requests bidirectionally
It seems mostly modeled around the security boundaries, rather than just AI capabilities domains. The client is always in the loop, the host and server do not directly communicate.

I look at the filesystem server and I don't see any indication of a difference between a tool that is just reading from one that is doing changes:

https://github.com/modelcontextprotocol/servers/blob/main/sr...
{
name: "create_directory",
description:
"Create a new directory or ensure a directory exists. Can create multiple " +
"nested directories in one operation. If the directory already exists, " +
"this operation will succeed silently. Perfect for setting up directory " +
"structures for projects or ensuring required paths exist. Only works within allowed directories.",
inputSchema: zodToJsonSchema(CreateDirectoryArgsSchema) as ToolInput,
},
{
name: "list_directory",
description:
"Get a detailed listing of all files and directories in a specified path. " +
"Results clearly distinguish between files and directories with [FILE] and [DIR] " +
"prefixes. This tool is essential for understanding directory structure and " +
"finding specific files within a directory. Only works within allowed directories.",
inputSchema: zodToJsonSchema(ListDirectoryArgsSchema) as ToolInput,
},

How can an add on that works with arbitrary "servers" tell the difference between these two tools? Without being able to tell the difference you can't really build a generic way to ask for confirmation in the application that is using the server...
Great work on the protocol!!
I am looking for some examples of creating my own custom client with the Anthropic API leveraging MCP, but I could not find any. Pretty much want to understand how Claude Desktop is integrating with MCP Server along with Anthropic API
Can you provide some pointers about the integration?
e.g.

import anthropic

client = anthropic.Anthropic()

response = client.messages.create( model="claude-3-5-sonnet-20241022", max_tokens=1024, [mcp_server]=... ## etc.? ... )
At first glance it seems to be a proposed standard interface and protocol for describing and offering an external system to the function calling faculty of an LLM.
> had no idea what was meant by "context" as it's a highly ambiguous word and is never mentioned with any concrete definition
(forgive me if you know this and are asking a different question, but:)
I don't know how familiar you are with LLMs, but "context" used in that context generally has the pretty clear meaning of "the blob of text you give in between (the text of) the system prompt and (the text of) the user prompt"[1], which acts as context for the user's request (hence the name). Very often this is the conversation history in chatbot-style LLMs, but it can include stuff like the content of text files you're working with, or search/function results.
[1] If you want to be pedantic, technically each instance of "text" should say "tokens" there, and the maximum "context" length includes the length of both prompts.
Here are a couple points of confusion for me:

1. The sampling documentation is confusing. "Sampling" means something very specific in statistics, and I'm struggling to see any connection between the term's typical usage and the usage here. Perhaps "prompt delegation" would be a more obvious term to use.
Another thing that's confusing about the sampling concept is that it's initiated by a server instead of a client, a reversal of how client/server interactions normally work. Without concrete examples, it's not obvious why or how a server might trigger such an exchange.
2. Some information on how resources are used would be helpful. How do resources get pulled into the context for queries? How are clients supposed to determine which resources are relevant? If the intention is that clients are to use resource descriptions to determine which to integrate into prompts, then that purpose should be more explicit.
Perhaps a bigger problem is that I don't see how clients are to take a resource's content into account when analyzing its relevance. Is this framework intentionally moving away from the practice of comparing content and query embeddings? Or is this expected to be done by indices maintained on the client?
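For what it's worth, after staring at the spec my mental model of "sampling" is: the server asks the client to run one LLM completion on its behalf (e.g. a server wanting "summarize this diff" without having its own API key), which is why the direction is reversed and why the human-in-the-loop client sits in the middle. A hypothetical request, with field names taken from my reading of the spec and possibly inexact:

# Hedged sketch of a server-initiated sampling request (JSON-RPC payload shown
# as a Python dict); field names follow my reading of the spec, may be inexact.
create_message_request = {
    "jsonrpc": "2.0",
    "id": 42,
    "method": "sampling/createMessage",  # server -> client, not the usual direction
    "params": {
        "messages": [
            {"role": "user",
             "content": {"type": "text",
                         "text": "Summarize the records fetched above."}},
        ],
        "modelPreferences": {"hints": [{"name": "claude-3-5-sonnet"}]},  # suggestion only
        "maxTokens": 400,
    },
}
# The client (with the human in the loop) decides whether to run this against
# its model and returns the completion to the server as the JSON-RPC response.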
I just want to say kudos for the design of the protocol. Seems inspired by https://langserver.org/ in all the right ways. Reading through it is a delight, there's so many tasteful little decisions.
One bit of constructive feedback: the TypeScript API isn't using the TypeScript type system to its fullest. For example, for tool providers, you could infer the type of a tool request handler's params from the json schema of the corresponding tool's input schema.
I guess that would be assuming that the model is doing constrained sampling correctly, such that it would never generate JSON that does not match the schema, which you might not want to bake into the reference server impl. It'd mean changes to the API too, since you'd need to connect the tool declaration and the request handler for that tool in order to connect their types.
This is a great idea! There's also the matter of requests' result types not being automatically inferred in the SDK right now, which would be great to fix.
Could I convince you to submit a PR? We'd love to include community contributions!
If you were willing to bring additional zod tooling or move to something like TypeBox (https://github.com/sinclairzx81/typebox), the json schema would be a direct derivation of the tools' input schemas in code.
The json-schema-to-ts npm package has a FromSchema type operator that converts the type of a json schema directly to the type of the values it describes. Zod and TypeBox are good options for users, but for the reference implementation I think a pure type solution would be better.
In the case of the Claude Desktop app, I assume the decision of which MCP server's tool to use, based on the end-user's query, is made by the Claude LLM using something like a ReAct loop. Are the prompts and LLM-generated tokens involved in the "Protocol Handshake" phase available for review?
I'd love to develop some MCP servers, but I just learned that Claude Desktop doesn't support Linux. Are there any good general-purpose MCP clients that I can test against? Do I have to write my own?
(Closest I can find is zed/cody but those aren't really general purpose)
Is it at least somewhat in sync with plans from Microsoft, OpenAI and Meta? And is it compatible with the current tool use API and computer use API that you’ve released?

From what I’ve seen, OpenAI attempted to solve the problem by partnering with an existing company that API-fys everything. This looks like a more viable approach compared to effectively starting from scratch.
It seems extremely verbose. Why does the transport mechanism matter? Would have loved a protocol/standard about how best to organize/populate the context. I think MCP touches on that but has too much of other stuff for me.
this is really cool stuff. I just started to write a server and I have a few questions. Not sure if HN is the right place, so where would you suggest to ask them?
Anyway, if there is no place yet, my questions are:
- In the example https://modelcontextprotocol.io/docs/first-server/python , what is the difference between read_resources and call_tool. In both cases they call the fetch_weather function. Would be nice to have that explained better.
I implemented in my own server only the call_tool function and Claude seems to be able to call it.
- Where is inputSchema of Tool specified in the docs? It would be nice if inputSchema would be explained a bit better. For instance how can I make a list of strings field that has a default value.
- How can I view the output of logger? It would be nice to see somewhere an example on how to check the logs. I log some stuff with logger.info and logger.error but I have no clue where I can actually look at it. My work around now is to log to a local file and tail it.
General feedback
- PLEASE add either automatic reload of server (hard) or a reload button in the app (probably easier). It's really disrupting to the flow when you have to restart the app on any change.
- Claude Haiku never calls the tools. It just tells me it can't do it. Sonnet can do it but is really slow.
- The docs are really really version 0.1 obviously :-) Please put some focus on it...
Are there any resources for building the LLM side of MCP so we can use the servers with our own integration? Is there a specific schema for exposing MCP information to tool or computer use?
If you have specific questions, please feel free to start a discussion in the respective https://github.com/modelcontextprotocol repository, and we are happy to help you with integrating MCP.
A few common use cases I've been exploring: connecting a development database in a local docker container to Claude Desktop or any other MCP client (e.g. an IDE assistant panel). I visualized the database layout in Claude Desktop and then created a Django ORM layer in my editor (which has MCP integration).
The Zed editor has just announced support for MCP in some of their extensions, publishing an article showing some possible use cases/ideas: https://zed.dev/blog/mcp
Superb work and super promising! I had wished for a protocol like this.
Is there a recommended resource for building MCP client? From what I've seen it just mentions Claude desktop & co are clients. SDK readme seems to cover it a bit but some examples could be great.
If you run into issues, feel free to open a discussion in the respective SDK repository and we are happy to help.
(I've been fairly successful in taking the spec documentation in markdown, an SDK and giving both to Claude and asking questions, but of course that requires a Claude account, which I don't want to assume)
I'm looking at integrating MCP with desktop app. The spec (https://spec.modelcontextprotocol.io/specification/basic/tra...) mentions "Clients SHOULD support stdio whenever possible.". The server examples seem to be mostly stdio as well. In the context of a sandboxed desktop app, it's often not practical to launch a server as subprocess because:
- sandbox restrictions of executing binaries
- needing to bundle binary leads to a larger installation size
Would it be reasonable to relax this restriction and provide both SSE/stdio for the default server examples?
Having broader support for SSE in the servers repository would be great. Maybe I can encourage you to open a PR or at least an issue.
I can totally see your concern about sandboxed apps, particularly for Flatpak or similar distribution methods. I see you already opened a discussion https://github.com/modelcontextprotocol/specification/discus..., so let's follow up there. I really appreciate the input.
A possible cheap win for servers would be to support the systemd "here's an fd number you get exec'ed with" model - that way server code that's only written to do read/write on a normal fd should be trivial to wire up to unix sockets, TCP sockets, etc.
(and then having a smol node/bun/go/whatever app that can sit in front of any server that handles stdio - or a listening socket for a server that can handle multiple clients - and translates the protocol over to SSE or websockets or [pick thing you want here] lets you support all such servers with a single binary to install)
Not that there aren't advantages to having such things baked into the server proper, but making 'writing a new connector that works at all' as simple as possible while still having access to multiple approaches to talk to it seems like something worthy of consideration.
[possibly I should've put this into the discussion, but I have to head out in a minute or two; anybody who's reading this and engaging over there should feel free to copy+paste anything I've said they think is relevant]
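To make the fd-passing idea concrete, a minimal sketch of the systemd-style convention (inherited sockets start at fd 3, LISTEN_FDS says how many were passed); everything else about the server's read/write loop stays the same:

# Minimal sketch of the systemd-style fd handoff: the supervisor passes a
# listening socket as fd 3 and sets LISTEN_FDS; the server just wraps it and
# reuses its existing read/write loop.
import os, socket

if int(os.environ.get("LISTEN_FDS", "0")) >= 1:
    listener = socket.socket(fileno=3)  # fd 3 == SD_LISTEN_FDS_START
else:
    listener = socket.create_server(("127.0.0.1", 8765))  # plain TCP fallback for dev

conn, _addr = listener.accept()
stream = conn.makefile("rw")
# ...hand `stream` to whatever code normally speaks the protocol over stdio...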
I can see where you're going with this and I can understand why you don't want to get into authorization, but if you're going to be encouraging tool developers to spin up json-rpc servers I hope you have some kind of plan for authorization otherwise you're encouraging a great way to break security models. Just because it's local doesn't mean it's secure. This protocol is dead the moment it becomes an attack vector.
It's not exactly immutable, but any backwards incompatible changes would require a version bump.
We don't have a roadmap in one particular place, but we'll be populating GitHub Issues, etc. with all the stuff we want to get to! We want to develop this in the open, with the community.
Ahh thanks! I was gonna say it's broken, but I now see that you're supposed to notice the sidebar changed and select one of the child pages. Would def recommend changing the sidebar link to that path instead of the index -- I would do it myself but couldn't find the sidebar in your doc repos within 5 minutes of looking.
Thanks for your hard work! "LSP for LLMs" is a fucking awesome idea
Did I misunderstand, or does it not seem to have support for user authentication? It seems your operating model is that the MCP server is configured, at installation time, with authentication for the underlying service. This is fine for non-serious use cases such as weather forecast querying, or for small-scale situations where only a couple of people have access to an LLM that's connected to the MCP server. But in an enterprise setting there are thousands of people, whose levels of access to the service behind the MCP server differ. I think the MCP server needs a way to know the identity of the human behind the LLM, so that it can perform appropriate authentication and authorization.
For Rust, could one leverage the type + docs system to create such a server? I didn't delve into the details but one of the issues of Claude is that it has no knowledge of the methods that are available to it (vs LSP). Will creating such a server make it able to do informed suggestions?
I have a case in mind where I would like to connect to multiple databases. Does the integration endpoint specification in claude_desktop_config.json allow us to pass some description so as to differentiate different databases? How?
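As far as I can tell the config is just a map of named server entries, so one way to differentiate databases is to run the same server twice under different names and refer to those names in your prompt. Something roughly like this - the exact command/args are placeholders, check the sqlite server's README:

{
  "mcpServers": {
    "sales-db": {
      "command": "uvx",
      "args": ["mcp-server-sqlite", "--db-path", "/path/to/sales.db"]
    },
    "analytics-db": {
      "command": "uvx",
      "args": ["mcp-server-sqlite", "--db-path", "/path/to/analytics.db"]
    }
  }
}

I don't believe there's a dedicated description field per entry; as far as I can tell, the entry name is what you get to differentiate them.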
Second, a question. Computer Use and JSON mode are great for creating a quasi-API for legacy software which offers no integration possibilities. Can MCP better help with legacy software interactions, and if so, in what ways?
Probably, yes! You could imagine building an MCP server (integration) for a particular piece of legacy software, and inside that server, you could employ Computer Use to actually use and automate it.
The benefit would be that to the application connecting to your MCP server, it just looks like any other integration, and you can encapsulate a lot of the complexity of Computer Use under the hood.
If you explore this, we'd love to see what you come up with!
The result that the MCP server returns will be transferred to the MCP host (Claude, IDEs, tools). There are some privacy issues, because the process is automatic once permission has been granted.

For instance, if something goes wrong on the MCP host and it queries all data from the database and transfers it to the host, all of that data will be leaked.

It's hard to totally prevent this kind of problem when interacting with local data, but are there some measures in MCP to prevent these kinds of situations?
Your concerns are very valid. This is partly why right now, in Claude Desktop, it's not possible to grant permission permanently. The most you can do is "Allow for this chat," which applies to one tool from one server at a time.
You guys need a professional documentation person on your team, one that specializes in only writing documentation. I say this because the existing documentation is a confusing mess. This is going to cause all kinds of problems purely because it is weakly explained, and I see incorrect usage of words all over. Even the very beginning definitions of client, host and server are nonstandard.
Any ideas on how the concepts here will mesh with the recently released Microsoft.Extensions.AI library released by MS for .NET, that is also supposed to make it easy to work with different models in a standardized way?
Is there any way to give a MCP server access for good? Trying out the demo it asked me every single time for permission which will be annoying for longer usage.
We do want to improve this over time, just trying to find the right balance between usability and security. Although MCP is powerful and we hope it'll really unlock a lot of potential, there are still risks like prompt injection and misconfigured/malicious servers that could cause a lot of damage if left unchecked.
Will this be partially available from the Claude website for connections to other web services? E.g. could the GitHub server be called from https://claude.ai?
Any idea on timelines? I’d love to be able to have generation and tool use contained within a customer’s AWS account using bedrock. Ie I pass a single cdk that can interface with an exposed internet MCP service and an in-VPC service for sensitive data.
I’m glad they're pushing for standards here, literally everyone has been writing their own integrations and the level of fragmentation (as they also mention) and repetition going into building the infra around agents is super high.
We’re building an in terminal coding agent and our next step was to connect to external services like sentry and github where we would also be making a bespoke integration or using a closed source provider. We appreciate that they have mcp integrations already for those services. Thanks Anthropic!
I've been implementing a lot of this exact stuff over the past month, and couldn't agree more. And they even typed the python SDK -- with pydantic!! An exciting day to be an LLM dev, that's for sure. Will be immediately switching all my stuff to this (assuming it's easy to use without their starlette `server` component...)
As someone building a client which needs to sync with a local filesystem (repo) and database, I cannot emphasize how wonderful it is that there is a push to standardize. We're going to implement this for https://srcbook.com
Just tried out the puppeteer server example if anyone is interested in seeing a demo: https://x.com/chxy/status/1861302909402861905. (Todo: add tool use - prompt would be like "go to this website and screenshot")
I appreciate the design which left the implementation of servers to the community which doesn't lock you into any particular implementation, as the protocol seems to be aiming to primarily solve the RPC layer.
One major value add of MCP I think is a capability extension to a vast amount of AI apps.
Hmm I like the idea of providing a unified interface to all LLMs to interact with outside data.
But I don't really understand why this is local only. It would be a lot more interesting if I could connect this to my github in the web app and claude automatically has access to my code repositories.
I guess I can do this for my local file system now?
I also wonder if I build an LLM powered app, and currently simply do RAG and then inject the retrieved data into my prompts, should this replace it? Can I integrate this in a useful way even?
The use case of on your machine with your specific data, seems very narrow to me right now, considering how many different context sources and use cases there are.
However, it's not quite a complete story yet. Remote connections introduce a lot more questions and complexity—related to deployment, auth, security, etc. We'll be working through these in the coming weeks, and would love any and all input!
Will you also create some info on how other LLM providers can integrate this? So far it looks like it's mostly a protocol to integrate with anthropic models/desktop client. That's not what I thought of when I read open-source.
It would be a lot more interesting to write a server for this if this allowed any model to interact with my data. Everyone would benefit from having more integration and you (anthropic) still would have the advantage of basically controlling the protocol.
Note that both Sourcegraph's Cody and the Zed editor support MCP now. They offer other models besides Claude in their respective application.
The Model Context Protocol initial release aims to solve the N-to-M relation of LLM applications (mcp clients) and context providers (mcp servers). The application is free to choose any model they want. We carefully designed the protocol such that it is model independent.
LLM applications just means chat applications here though right?
This doesn't seem to cover use cases of more integrated software. Like a typical documentation RAG chatbot.
Local only solves a lot of problems. Our infrastructure does tend to assume that data and credentials are on a local computer - OAuth is horribly complex to set up and there's no real benefit to messing with that when local works fine.
I'm honestly happy with them starting local-first, because... imagine what it would look like if they did the opposite.
> It would be a lot more interesting if I could connect this to my github in the web app and claude automatically has access to my code repositories.
In which case the "API" would be governed by a contract between Anthropic and Github, to which you're a third party (read: sharecropper).
Interoperability on the web has already been mostly killed by the practice of companies integrating with other companies via back-channel deals. You are either a commercial partner, or you're out of the playground and no toys for you. Them starting locally means they're at least reversing this trend a bit by setting a different default: LLMs are fine to integrate with arbitrary code the user runs on their machine. No need to sign an extra contract with anyone!
> It would be a lot more interesting if I could connect this to my github in the web app and claude automatically has access to my code repositories.
From the link:
> To help developers start exploring, we’re sharing pre-built MCP servers for popular enterprise systems like Google Drive, Slack, GitHub, Git, Postgres, and Puppeteer.
In the "Protocol Handshake" section of what's happening under the hood - it would be great to have more info on what's actually happening.
For example, more details on what's actually happening to translate the natural language to a DB query. How much config do I need to do for this to work? What if the queries it makes are inefficient/wrong and my database gets hammered - can I customise them? How do I ensure sensitive data isn't returned in a query?
One thing I am having a hard time wrapping my head around is how to reliably integrate business logic into a system like this. Just hook up my Rails models etc. and have it use those?
Let’s say I’ve got a “widgets” table and I want the system to tell me how many “deprecated widgets” there are, but there is no convenient “deprecated” flag on the table—it’s defined as a Rails scope on the model or something (business logic).
The DB schema might make it possible to run a simple query to count widgets or whatever, but I just don’t have a good mental model of how these systems might work with “business logic” type things.
This is exactly what I've been trying to figure out. At some point the LLM needs to produce text, even if it is structured outputs, and to do that it needs careful prompting. I'd love to see how that works.
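One way I can see to square it, at least with my current understanding: don't give the model raw SQL for these cases, expose the business rule itself as a named tool, so the Rails scope stays the single source of truth and the model only decides when to call it. A hypothetical sketch (count_deprecated_widgets and call_rails_endpoint are made-up names):

# Hedged sketch: expose the business rule as a named tool instead of raw SQL,
# so the Rails-side `deprecated` scope stays the single source of truth.
# `call_rails_endpoint` is a hypothetical helper that hits your own app.

DEPRECATED_WIDGETS_TOOL = {
    "name": "count_deprecated_widgets",
    "description": "Count widgets considered deprecated by the app's business rules.",
    "inputSchema": {"type": "object", "properties": {}},
}

async def handle_tool(name: str, arguments: dict) -> str:
    if name == "count_deprecated_widgets":
        return str(await call_rails_endpoint("/internal/widgets/deprecated_count"))
    raise ValueError(f"unknown tool: {name}")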
2. If you did this correctly then, after you run Claude Desktop, you should see a small 'hammer' icon (with a number next to it) next to the labs icon, in the bottom right of the 'How can Claude help you today?' box.
I don't trust an open source solution by a major player unless it's published with other major players. Otherwise, the perverse incentives are too great.
Changing license terms, aggressive changes to the API to disallow competition, horrendous user experience that requires a support contract. I really don't think there's a limit to what I've seen other companies do. I generally trust libraries that competitors are maintaining jointly since there is an incentive toward not undercutting anyone.
The Model Context server is similar to what we've built at Spice, but we've focused on databases and data systems. Overall, standards are good. Perhaps we can implement MCP as a data connector and tool.
If function calling is sync, is MCP its async counterpart? Is that the gist of what MCP is?
Open API (aka swagger) based function calling is standard already for sync calls, and it solves the NxM problem. I'm wondering if the proposed value is that MCP is async.
I would love to integrate this into my platform of tools for AI models, Toolhouse [1], but I would love to understand the adoption of this protocol, especially as it seems to only work with one foundational model.
I appreciate the effort, but after spending more than one hour on it, I still don't understand how and why I'd use this.
The Core architecture [1] documentation is given in terms of TypeScript or Python abstractions, adding a lot of unnecessary syntactic noise for someone who doesn't use these languages. Very thin on actual conceptual explanation and full of irrelevant implementation details.
The 'Your first server'[2] tutorial is given in terms of big chunks of python code, with no explanation whatsoever, eg:
Add these tool-related handlers:
...100 lines of undocumented code...
The code doesn't even compile.
I don't think this is ready for prime time yet so I'll move along for now.
L402's (1) macaroon-based authentication would fit naturally with MCP's server architecture. Since MCP servers already define their capabilities and handle tool-specific requests, adding L402 token validation would be straightforward - the server could check macaroon capabilities before executing tool requests. This could enable per-tool pricing and usage limits while maintaining MCP's clean separation between transport and tool implementation. The Aperture proxy could sit in front of MCP servers to handle the Lightning payment flow, making it relatively simple to monetize existing MCP tool servers without significant modifications to their core functionality.
WRT prompts vs sampling: why does the Prompts interface exclude model hints that are present in the Sampling interface? Maybe I am misunderstanding.
It appears that clients retrieve prompts from a server to hydrate them with context only, to then execute/complete somewhere else (like Claude Desktop, using Anthropic models). The server doesn’t know how effective the prompt will be in the model that the client has access to. It doesn’t even know if the client is a chat app, or Zed code completion.
In the sampling interface - where the flow is inverted, and the server presents a completion request to the client - it can suggest that the client uses some model type /parameters. This makes sense given only the server knows how to do this effectively.
Given the server doesn’t understand the capabilities of the client, why the asymmetry in these related interfaces?
There’s only one server example that uses prompts (fetch), and the one prompt it provides returns the same output as the tool call, except wrapped in a PromptMessage. EDIT: looks like there are some capabilities classes in the mcp, maybe these will evolve.
Not sure I understand your point. If it's your client / server, you are controlling how they interact, by implementing the necessaries according to the protocol.
If you're writing an LSP for a language, you're implementing the necessaries according to the protocol (when to show errors, inlay hints, code fixes, etc.) - it's not deciding on its own.
Even if I could make use of it, I wouldn't, because I don't write proprietary code that only works on one AI Service Provider. I use only LangChain so that all of my code can be used with any LLM.
My app has a simple drop down box where users can pick whatever LLM they want to use (OpenAI, Perplexity, Gemini, Anthropic, Grok, etc)
However if they've done something worthy of putting into LangChain, then I do hope LangChain steals the idea and incorporates it so that all LLM apps can use it.
It's an open protocol; where did you get the idea that it would only work with Claude? You can implement it for whatever you want - I'm sure langchain folks are already working on something to accommodate it
Once fully adopted by at least 3 other companies I'll consider it a standard, and would consider it yes, if it solved a problem I have, which it does not.
Lots of companies open source some of their internal code, then say it's "officially a protocol now" that anyone can use, and then no one else ever uses it.
If they have new "tools" that's great however, but only as long as they can be used in LangChain independent of any "new protocol".
For those interested, I've been working on something related to this, Web Applets – which is a spec for creating AI-enabled components that can receive actions & respond with state:
The default transport should have accommodated binary data. Whether it’s tensors of image data, audio waveforms, or pre-tokenized NLP workloads it’s just going to hit a wall where JSON-RPC can’t express it uniquely and efficiently.
Devil's advocating for conversation's sake: at the end of the day, the user and client app want very little persistent data coming from the server - if nothing else than the client is expecting to store chats as text, with external links or Potemkin placeholders for assets like files.
I agree with the devil's advocacy you've posed, and in retrospect I probably should have said "I bet these folks have a plan for binary data". These are clearly very serious people so it might be more accurate to say that I strongly suspect a subsequent revision of the protocol will bake in default transport-level handling of arbitrary tensors in an efficient way.
Building something for this at surferprotocol [dot] org. Imo not every company will expose API's for easily exporting data from their platforms (linkedin, imessage, etc), so devs have to build these themselves
Something is telling me this _might_ turn out to be a huge deal; I can't quite put a finger on what is that makes me feel that, but opening private data and tools via an open protocol to AI apps just feels like a game changer.
It's just function calling with a new name and a big push from the LLM provider, but this time it's in the right direction. Contrast with OpenAI's "GPTs", which are just function calling by another name, but pushed in the wrong direction - towards creating a "marketplace" controlled by OpenAI.
I'd say that thing you're feeling comes from witnessing an LLM vendor, for the first time in history, actually being serious about function calling and actually wanting people to use it.
But either way the interface is just providing a json schema of functions along with your chat completion request, and a server with the ability to parse and execute the response. I’m not really seeing where a new layer of abstraction helps here (much less a new "protocol", as though we need a new transport layer?).
It smells like the thinking is that you (the developer) can grab from a collection of very broad data connectors, and the agent will be able to figure out what to do with them without much custom logic in between. Maybe I’m missing something
> It smells like the thinking is that you (the developer) can grab from a collection of very broad data connectors, and the agent will be able to figure out what to do with them without much custom logic in between.
This has always been the idea behind tools/function calling in LLMs.
What MCP tries to solve is the NxM problem - every LLM vendor has their own slightly different protocols for specifying and calling tools, and every tool supplier has to handle at least one of them, likely with custom code. MCP aims to eliminate custom logic at the protocol level.
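To illustrate the per-vendor drift: the same tool ends up described slightly differently for each API today, whereas an MCP server describes it once and the client does the mechanical translation. Rough shapes from memory, so double-check against each vendor's docs:

# The same "query" tool, described three ways (shapes from memory, may drift):

openai_style = {                      # OpenAI chat completions "tools" entry
    "type": "function",
    "function": {"name": "query", "description": "Run a SQL query",
                 "parameters": {"type": "object",
                                "properties": {"sql": {"type": "string"}}}},
}

anthropic_style = {                   # Anthropic Messages API "tools" entry
    "name": "query", "description": "Run a SQL query",
    "input_schema": {"type": "object",
                     "properties": {"sql": {"type": "string"}}},
}

mcp_style = {                         # what an MCP server returns from tools/list
    "name": "query", "description": "Run a SQL query",
    "inputSchema": {"type": "object",
                    "properties": {"sql": {"type": "string"}}},
}
# With MCP, only the client has to know how to map mcp_style into whichever
# vendor format it is talking to; tool authors write the description once.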
LLMs can potentially query _something_ and receive a concise, high-signal response to facilitate communications with the endpoint, similar to API documentation for us but more programmatic.
This is huge, as long as there's a single standard and other LLM providers don't try to release their own protocol. Which, historically speaking, is definitely going to happen.
> This is huge, as long as there's a single standard and other LLM providers don't try to release their own protocol
Yes, very much this; I'm mildly worried because the competition in this space is huge and there is no shortage of money and crazy people who could go against this.
They will go against this. I don’t want to be that guy, but this moment in time is literally the opening scene of a movie where everyone agrees to work together in the bandit group.
Not necessarily. There’s huge demand to simplify the integration process between frontier models and consumers. If specs like this wind up saving companies weeks or months of developer time, then the MCP-compatible models are going to win over the more complex alternatives. This unlocks value for the community, and therefore the AI companies
One of the biggest issues of LLM is that they have a lossy memory. Say there is a function from_json that accepts 4 arguments. An LLM might predict that it accepts 3 arguments and thus produce non-functional code. However, if you add the docs for the function, the LLM will write correct code.
With the LLM being able to tap up to date context (like LSP), you won't need that back-and-forth dance. This will massively improve code generations.
Any feedback on developer experience is always welcomed (preferably in github discussion/issue form). It's the first day in the open. We have a long long way to go and much ground to cover.
This is great but will be DOA if OpenAI (80% market share) decides to support something else. The industry trend is that everything seems to converge to OpenAI API standard (see also the recent Gemini SDK support for OpenAI API).
There’s clearly a need for this type of abstraction, hooking up these models to various tooling is a significant burden for most companies.
Putting this out there puts OpenAI on the clock to release their own alternative or adopt this, because otherwise they run the risk of engineering leaders telling their C-suite that Anthropic is making headway towards better frontier model integration and OpenAI is the costlier integration to maintain.
True, but you could also frame this as a way for Anthropic to try and break that trend. IMO they've got to try and compete with OpenAI, can't just concede that OpenAI has won yet.
I wonder if they'll have any luck convincing other LLM vendors, such as Google, Meta, xAI, Mistral, etc, to adopt this protocol. If enough other vendors adopt it, it might still see some success even if OpenAI doesn't.
Also, I wonder if you could build some kind of open source mapping layer from their protocol to OpenAI's. That way OpenAI could support the protocol even if they don't want to.
Here, Anthropic is first. If everyone starts using MCP today, any alternative OpenAI comes out with in a few months time probably won’t be able to dislodge it.
If anyone here has an issue with their Claude Desktop app seeing the new MCP tools you've added to your computer, restart it fully. Restarting the Claude Desktop app did NOT work for me, I had to do a full OS restart.
Hm, this shouldn't be the case, something odd is happening here. Normally restarting the app should do it, though on Windows it is easy to think you restarted the app when you really just closed the main window and reopened it (you need to close the app via File => Quit).
My team and I have a desktop product with a very similar architecture (a central app+UI with a constellation of local servers providing functions and data to models for local+remote context)
If this protocol gets adoption we'll probably add compatibility.
Which would bring MCP to local models like LLama 3 as well as other cloud providers competitors like OpenAI, etc
Does aider benefit from this? Big part of aiders special sauce is the way it builds context, so it feels closely related but I don't know how the pieces would fit together here
Possibly marginally, but the "server" components here are ideally tiny bits of glue that just reformat LLM-generated JSON requests into target-native API requests. Nothing interesting "should" be happening in the context protocol. Examining the source may provide you with information on how to get to the real API for the service, however.
Tangential question: Is there any LLM which is capable of preserving the context through many sessions, so it doesn't have to upload all my context every time?
In an ideal world gemini (or any other 1M token context model) would have an internal 'save snapshot' option so one could resume a blank conversation after 'priming' the internal state (activations) with the whole code base.
I'm surprised that there doesn't seem to be a concept of payments or monetization baked into the protocol. I believe there are some major companies to be built around making data and API actions available to AI Models, either as an intermediary or marketplace or for service providers or data owners directly- and they'd all benefit from a standardised payment model on a per transaction level.
We're still in the process of thinking through and fleshing out full details for remote MCP connections. This is definitely a good idea to include in the mix!
After a quick look it seemed to me like they're trying to standardize on how clients call servers, which nobody needs, and nobody is going to use. However if they have new Tools that can be plugged into my LangChain stuff, that will be great, and I can use that, but I have no place for any new client/server models.
One thing I don't understand.. does this rely on vector embeddings? Or how does the AI interact with the data? The example is a sqlite database with prices, and it shows claude being asked to give the average price and to suggest pricing optimizations.

So does the entire db get fed into the context? Or is there another layer in between. What if the database is huge, and you want to ask the AI for the most expensive or best selling items? With RAG that was only vaguely possible and didn't work very well.
This is about tool usage - the thing where an LLM can be told "if you want to run a SQL query, say <sql>select * from repos</sql> - the code harness will then spot that tag, run the query for you and return the results to you in a chat message so you can use them to help answer a question or continue generating text".
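A toy sketch of that harness loop, using the <sql> tag convention from the comment above (illustrative only - real clients use the structured tool-call format rather than regexing tags):

# Toy version of the harness loop: spot the <sql>...</sql> tag in the model's
# reply, run the query, and feed the result back as the next message.
import re, sqlite3

def handle_reply(model_reply: str, db_path: str) -> str | None:
    match = re.search(r"<sql>(.*?)</sql>", model_reply, re.S)
    if match is None:
        return None  # nothing to execute this turn
    rows = sqlite3.connect(db_path).execute(match.group(1)).fetchall()
    return f"Query result: {rows}"  # sent back to the model as a new message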
It never accidentally deletes anything? Or I guess you give it read only access? Is it querying it through this API and some adapter built for it, or does the file get sent through the API, where they recognize it is sqlite and load it on their end?

So this allows you to connect your sqlite to Claude desktop, so it executes sql commands on your behalf instead of you entering them, and it also chooses the right db on its own, similar to what functions do
Let’s see how other relevant players like Meta, Amazon and Mistral react to this. Things like these just make sense with broader adoption and a diverse governance model
are there any examples of using this with the anthropic API to build something like Claude Desktop?
the docs aren't super clear yet wrt. how one might actually implement the connection. do we need to implement another set of tools to provide to the API and then have that tool call the MCP server? maybe i'm missing something here?
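I haven't found an official example either. The shape I'd expect, based on the Python SDK README plus the Messages API tool-use docs, is: list the server's tools over MCP, hand them to the API as tools, and when the model returns a tool_use block, forward it to the MCP server and send the result back as a tool_result. Very much a hedged sketch - the MCP client names (ClientSession, stdio_client, list_tools, call_tool) are my reading of the SDK:

# Hedged sketch of bridging an MCP server to the Anthropic Messages API.
# MCP client names are my reading of the Python SDK; check its README first.
import anthropic
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def ask(question: str) -> str:
    params = StdioServerParameters(command="uvx",
                                   args=["mcp-server-sqlite", "--db-path", "test.db"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            listed = await session.list_tools()
            tools = [{"name": t.name, "description": t.description,
                      "input_schema": t.inputSchema} for t in listed.tools]

            client = anthropic.Anthropic()
            msg = client.messages.create(model="claude-3-5-sonnet-20241022",
                                         max_tokens=1024, tools=tools,
                                         messages=[{"role": "user", "content": question}])
            for block in msg.content:
                if block.type == "tool_use":  # model decided to call a tool
                    result = await session.call_tool(block.name, block.input)
                    return str(result)        # a real loop would send this back as tool_result
            return msg.content[0].text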
I eventually return from every such protocol/framework to SQL, txt files, and the standard library, due to the inefficiency of introducing a meaningless layer. People (myself included, a while ago) often avoid confronting the difficult problems that actually matter. Worse still, frameworks and buzzword technologies are the world of uncompetitive people.
Yeah, we’re using it a lot at Sourcegraph. There are some extra APIs it offers beyond what MCP offers, such as annotations (as you can see on the homepage of https://openctx.org). We worked with Anthropic on MCP because this kind of layer benefits everyone, and we’ve already shipped interoperability.
Interesting. In Cody training sessions given by Sourcegraph, I saw OpenCtx mentioned a few times "casually", and the focus is always on Cody core concepts and features like prompt engineering and manual context etc. Sounds like for enterprise customers, setting up context is meant for infrastructure teams within the company, and end users mostly should not worry about OpenCtx?
Most users won't and shouldn't need to go through the process of adding context sources. In the enterprise, you want these to be chosen by (and pre-authed/configured by) admins, or at least not by each individual user, because that would introduce a lot of friction and inconsistency. We are still working on making that smooth, which is why we haven't been very loud about OpenCtx to end users yet.
But today we already have lots of enterprise customers building their own OpenCtx providers and/or using the `openctx.providers` global settings in Sourcegraph to configure them in the current state. OpenCtx has been quite valuable already here to our customers.
really great to see some standards emerging. i'd love to see something like mindsdb wired up to support this protocol and get a bunch of stuff out of the box.
Is it the data context that is aware, as and when we add columns in the db, of what each one means?

How we can make every schema change that happens on the db context-aware is not clear.
If you run a SaaS and want to rapidly build out a CLI that you could plug into this ~and~ want something that humans can use, check out the project I’ve been working on at https://terminalwire.com
tl;dr—you can build & ship a CLI without needing an API. Just drop Terminalwire into your server, have your users install the thin client, and you’ve got a CLI.
I’m currently focused on getting the distribution and development experience dialed in, which is why I’m working mostly with Rails deployments at the moment, but I’m open to working with large customers who need to ship a CLI yesterday in any language or runtime.
If you need something like this check it out at https://terminalwire.com or ping me brad@terminalwire.com.
I see a good number of comments that seem skeptical or confused about what's going on here or what the value is.
One thing that some people may not realize is that right now there's a MASSIVE amount of effort duplication around developing something that could maybe end up looking like MCP. Everyone building an LLM agent (or pseudo-agent, or whatever) right now is writing a bunch of boilerplate for mapping between message formats, tool specification formats, prompt templating, etc.
Now, having said that, I do feel a little bit like there's a few mistakes being made by Anthropic here. The big one to me is that it seems like they've set the scope too big. For example, why are they shipping standalone clients and servers rather than client/server libraries for all the existing and wildly popular ways to fetch and serve HTTP? When I've seen similar mistakes made (e.g. by LangChain), I assume they're targeting brand new developers who don't realize that they just want to make some HTTP calls.
Another thing that I think adds to the confusion is that, while the boilerplate-ish stuff I mentioned above is annoying, what's REALLY annoying and actually hard is generating a series of contexts using variations of similar prompts in response to errors/anomalies/features detected in generated text. IMO this is how I define "prompt engineering" and it's the actual hard problem we have to solve. By naming the protocol the Model Context Protocol, I assumed they were solving prompt engineering problems (maybe by standardizing common prompting techniques like ReAct, CoT, etc).
Your point about boilerplate is key, and it’s why I think MCP could work well despite some of the concerns raised. Right now, so many of us are writing redundant integrations or reinventing the same abstractions for tool usage and context management. Even if the first iteration of MCP feels broad or clunky, standardizing this layer could massively reduce friction over time.
Regarding the standalone servers, I suspect they’re aiming for usability over elegance in the short term. It’s a classic trade-off: get the protocol in people’s hands to build momentum, then refine the developer experience later.
I don't see why I or any other developer would abandon their homebrew agent implementation for a "standard" which isn't actually a standard yet.
I also don't see any of that implementation as "boilerplate". Yes there's a lot of similar code being written right now but that's healthy co-evolution. If you have a look at the codebases for Langchain and other LLM toolkits you will realize that it's a smarter bet to just roll your own for now.
You've definitely identified the main hurdle facing LLM integration right now and it most definitely isn't a lack of standards. The issue is that the quality of raw LLM responses falls apart in pretty embarrassing ways. It's understood by now that better prompts cannot solve these problems. You need other error-checking systems as part of your pipeline.
The AI companies are interested in solving these problems but they're unable to. Probably because their business model works best if their system is just marginally better than their competitor.
The issue isn’t with who’s hosting, it’s that their SDKs don’t clearly integrate with existing HTTP servers regardless of who’s hosting them. I mean integrate at the source level, of course they could integrate via HTTP call.
I love how they’re pretending to be champions of open source while leaving this gem in their terms of use
“””
You may not access or use, or help another person to access or use, our Services in the following ways:
…
To develop any products or services that compete with our Services, including to develop or train any artificial intelligence or machine learning algorithms or models.
“””
Presumably this doesn't apply to the standard being released here, nor any of its implementations made available. Each of these appears to have its own permissible license.
OpenAI says, "[You may not] Use Output to develop models that compete with OpenAI." That feels more narrow than Anthropic's blanket ban on any machine learning development.
I think open-sourcing your tech for the common person while leaving commercial use behind a paywall or even just against terms is completely acceptable, no?
I can see the value of something like DSPy where there is some higher level abstractions in wiring together a system of llms.
But this seems like an abstraction that doesn't really offer much besides "function calling but you use our python code".
I see the value of language server protocol but I don't see the mapping to this piece of code.
That's actually negative value if you are integrating into an existing software system or just you know... exposing functions that you've defined vs remapping functions you've defined into this intermediate abstraction.
If integrations are required to unlock value, then the platform with the most prebuilt integrations wins.
The bulk of mass adopters don't have the in-house expertise or interest in building their own. They want turnkey.
No company can build integrations, at scale, more quickly itself than an entire community.
If Anthropic creates an integration standard and gets adoption, then it either at best has a competitive advantage (first mover and ownership of the standard) or at worst prevents OpenAI et al. from doing the same to it.
(Also, the integration piece is the necessary but least interesting component of the entire system. Way better to commodify it via standard and remove it as a blocker to adoption)
The secret sauce part is the useful part -- the local vector store. Anthropic is probably not going to release that without competitive pressure. Meanwhile this helps Anthropic build an ecosystem.
When you think about it, function calling needs its own local state (embedded db) to scale efficiently on larger contexts.
I'd like to see all this become open source / standardized.
I'm not sure what you mean - the embedding model is independent of the embeddings themselves. Once generated, the embeddings and vector store should exist 100% locally and thus are not part of any secret sauce
So they want an open protocol, and instead of, say, collaborating with other people that provide models like Google, Microsoft, Mistral, Cohere and the opensource community, they collaborate with an editor team. Quite the protocol. Why should Microsoft implement this? If they implement their own protocol, they win. Why should Google implement this? If they implement their own protocol, they win too. Both giants have way more apps and reach inside businesses than Anthropic could wish for.
I'm a little confused as to the fundamental problem statement. It seems like the idea is to create a protocol that can connect arbitrary applications to arbitrary resources, which seems underconstrained as a problem to solve.
This level of generality has been attempted before (e.g. RDF and the semantic web, REST, SOAP) and I'm not sure what's fundamentally different about how this problem is framed that makes it more tractable.
@jspahrsummers and I have been working on this for the last few months at Anthropic. I am happy to answer any questions people might have.
I read through several of the top level pages, then SQLite, but still had no idea what was meant by "context" as it's a highly ambiguous word and is never mentioned with any concrete definition, example, or scope of capability that it is meant to imply.
After reading the Python server tutorial, it looks like there is some tool calling going on, in the old terminology. That makes more sense. But none of the examples seem to indicate what the protocol is, whether it's a RAG sort of thing, do I need to prompt, etc.
It would be nice to provide a bit more concrete info about capabilities and what the purposes is before getting into call diagrams. What do the arrows represent? That's more important to know than the order that a host talks to a server talks to a remote resource.
I think this is something that I really want and want to build a server for, but it's unclear to me how much more time I will have to invest before getting the basic information about it!
Thank you. That’s good feedback.
The gist of it is: you have an llm application such as Claude desktop. You want to have it interact (read or write) with some system you have. MCP solves this.
For example you can give the application the database schema as a “resource”, effectively saying; here is a bunch of text, do whatever you want with it during my chat with the llm. Or you can give the application a tool such as query my database. Now the model itself can decide when it wants to query (usually because you said: hey tell me what’s in the accounts table or something similar).
It’s “bring the things you care about” to any llm application with an mcp client
Or, in short: it's (an attempt to create) a standard protocol to plug tools to LLM app via the good ol' tools/function calling mechanism.
It's not introducing new capabilities, just solving the NxM problem, hopefully leading to more tools being written.
(At least that's how I understand this. Am I far off?)
We definitely hope this will solve the NxM problem.
On tools specifically, we went back and forth about whether the other primitives of MCP ultimately just reduce to tool use, but ultimately concluded that separate concepts of "prompts" and "resources" are extremely useful to express different _intentions_ for server functionality. They all have a part to play!
I think this where the real question is for me. When I read about MCP, the topmost question in my mind is "Why isn't this just tool calling?" I had difficulty finding an answer to this. Below, you have someone else asking "Why not just use GraphQL?" And so on.
It would probably be helpful for many of your readers if you had a focused document that addressed specifically that motivating question, together with illustrated examples. What does MCP provide, and what does it intend to solve, that a tool calling interface or RPC protocol can't?
You can find more information on some design questions like these in https://spec.modelcontextprotocol.io/specification, which is a much more "implementors" focused guide than the user documentation at https://modelcontextprotocol.io
Seems more accurate to state this reshapes the NxM problem rather than solving it.
Yeah even I don't understand how it exactly solves the NXM problem (which translates to having M different prompts for N different llms. corerct me if I'm wrong please)
N (LLM clients/vendors) x M (tools/tool suppliers).
The N×M problem may simply be moved rather than solved:
This feels similar to SOAP but might be more of a lower level protocol similar to HTTP itself. Hard to tell with the implementation examples being pretty subjective programs in python.Does it give a standard way to approve changes? I wouldn't want to give an LLM access to my database unless I can approve the changes it applies.
It seems to support your ask, as much as a protocol can. Having read all the docs and looked through some code, my mental model is:
It seems mostly modeled around the security boundaries, rather than just AI capabilities domains. The client is always in the loop, the host and server do not directly communicate.I look at the filesystem server and I don't see any indication of a difference between a tool that is just reading from one that is doing changes:
https://github.com/modelcontextprotocol/servers/blob/main/sr...
How can an add-on that works with arbitrary "servers" tell the difference between these two tools? Without being able to tell the difference, you can't really build a generic way to ask for confirmation in the application that is using the server...
Great work on the protocol!! I am looking for some examples of creating my own custom client with the Anthropic API leveraging MCP, but I could not find any. Pretty much want to understand how Claude Desktop is integrating an MCP server along with the Anthropic API. Can you provide some pointers about the integration? e.g.
    import anthropic

    client = anthropic.Anthropic()
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        [mcp_server]=...  ## etc.? ...
    )
At first glance it seems to be a proposed standard interface and protocol for describing and offering an external system to the function calling faculty of an LLM.
> had no idea what was meant by "context" as it's a highly ambiguous word and is never mentioned with any concrete definition
(forgive me if you know this and are asking a different question, but:)
I don't know how familiar you are with LLMs, but "context" used in that context generally has the pretty clear meaning of "the blob of text you give in between (the text of) the system prompt and (the text of) the user prompt"[1], which acts as context for the user's request (hence the name). Very often this is the conversation history in chatbot-style LLMs, but it can include stuff like the content of text files you're working with, or search/function results.
[1] If you want to be pedantic, technically each instance of "text" should say "tokens" there, and the maximum "context" length includes the length of both prompts.
Here are a couple points of confusion for me:
1. The sampling documentation is confusing. "Sampling" means something very specific in statistics, and I'm struggling to see any connection between the term's typical usage and the usage here. Perhaps "prompt delegation" would be a more obvious term to use.
Another thing that's confusing about the sampling concept is that it's initiated by a server instead of a client, a reversal of how client/server interactions normally work. Without concrete examples, it's not obvious why or how a server might trigger such an exchange.
2. Some information on how resources are used would be helpful. How do resources get pulled into the context for queries? How are clients supposed to determine which resources are relevant? If the intention is that clients are to use resource descriptions to determine which to integrate into prompts, then that purpose should be more explicit.
Perhaps a bigger problem is that I don't see how clients are to take a resource's content into account when analyzing its relevance. Is this framework intentionally moving away from the practice of comparing content and query embeddings? Or is this expected to be done by indices maintained on the client?
I just want to say kudos for the design of the protocol. Seems inspired by https://langserver.org/ in all the right ways. Reading through it is a delight, there's so many tasteful little decisions.
One bit of constructive feedback: the TypeScript API isn't using the TypeScript type system to its fullest. For example, for tool providers, you could infer the type of a tool request handler's params from the json schema of the corresponding tool's input schema.
I guess that would be assuming that the model is doing constrained sampling correctly, such that it would never generate JSON that does not match the schema, which you might not want to bake into the reference server impl. It'd mean changes to the API too, since you'd need to connect the tool declaration and the request handler for that tool in order to connect their types.
This is a great idea! There's also the matter of requests' result types not being automatically inferred in the SDK right now, which would be great to fix.
Could I convince you to submit a PR? We'd love to include community contributions!
If you were willing to bring additional zod tooling or move to something like TypeBox (https://github.com/sinclairzx81/typebox), the json schema would be a direct derivation of the tools' input schemas in code.
The json-schema-to-ts npm package has a FromSchema type operator that converts the type of a json schema directly to the type of the values it describes. Zod and TypeBox are good options for users, but for the reference implementation I think a pure type solution would be better.
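For instance (just a sketch of the idea, with a made-up tool), the declared schema can double as the handler's argument type with no codegen:

    import { FromSchema } from "json-schema-to-ts";

    // The same JSON schema you would register as the tool's inputSchema...
    const queryToolSchema = {
      type: "object",
      properties: {
        sql: { type: "string" },
        maxRows: { type: "number" },
      },
      required: ["sql"],
      additionalProperties: false,
    } as const;

    // ...also yields the handler's argument type:
    type QueryToolArgs = FromSchema<typeof queryToolSchema>;
    // => { sql: string; maxRows?: number }

    function handleQueryTool(args: QueryToolArgs): string {
      return `running: ${args.sql}`; // args.sql is statically typed as string
    }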
Looking at https://github.com/modelcontextprotocol/python-sdk?tab=readm... it's clear that there must be a decision connecting, for example, `tools` returned by the MCP server and `call_tool` executed by the host.
In the case of the Claude Desktop app, I assume the decision about which MCP server's tool to use, based on the end user's query, is made by the Claude LLM using something like a ReAct loop. Are the prompts and LLM-generated tokens involved in the "Protocol Handshake" phase available for review?
I'd love to develop some MCP servers, but I just learned that Claude Desktop doesn't support Linux. Are there any good general-purpose MCP clients that I can test against? Do I have to write my own?
(Closest I can find is zed/cody but those aren't really general purpose)
How much did you use LLMs or other AI-like tools to develop the MCP and its supporting materials?
Is it at least somewhat in sync with plans from Microsoft, OpenAI and Meta? And is it compatible with the current tool use API and computer use API that you’ve released?
From what I’ve seen, OpenAI attempted to solve the problem by partnering with an existing company that API-fys everything. This feels like a more viable approach compared to effectively starting from scratch.
What's the name of the company that OpenAI's partnered with? Just curious.
Zapier
It seems extremely verbose. Why does the transport mechanism matter? Would have loved a protocol/standard about how best to organize/populate the context. I think MCP touches on that but has too much of other stuff for me.
Hi,
this is really cool stuff. I just started to write a server and I have a few questions. Not sure if HN is the right place, so where would you suggest to ask them?
Anyway, if there is no place yet, my questions are:
- In the example https://modelcontextprotocol.io/docs/first-server/python , what is the difference between read_resources and call_tool? In both cases they call the fetch_weather function. Would be nice to have that explained better. I implemented only the call_tool function in my own server and Claude seems to be able to call it.
- Where is inputSchema of Tool specified in the docs? It would be nice if inputSchema were explained a bit better. For instance, how can I make a list-of-strings field that has a default value? (I sketch a guess after this list, but it should really be in the docs.)
- How can I view the output of the logger? It would be nice to see an example somewhere of how to check the logs. I log some stuff with logger.info and logger.error but I have no clue where I can actually look at it. My workaround now is to log to a local file and tail it.
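(For the list-of-strings question, my best guess is that it's plain JSON Schema inside the tool's inputSchema, something like

    "tags": { "type": "array", "items": { "type": "string" }, "default": ["a", "b"] }

under "properties" - but whether clients actually honor "default" is exactly the kind of thing the docs should spell out.)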
General feedback
- PLEASE add either automatic reload of the server (hard) or a reload button in the app (probably easier). It's really disruptive to the flow when you have to restart the app on any change.
- Claude Haiku never calls the tools. It just tells me it can't do it. Sonnet can do it but is really slow.
- The docs are really really version 0.1 obviously :-) Please put some focus on it...
Overall, awesome work!
Thanks
Are there any resources for building the LLM side of MCP so we can use the servers with our own integration? Is there a specific schema for exposing MCP information to tool or computer use?
Both the Python and TypeScript SDKs can be used to build a client: https://github.com/modelcontextprotocol/typescript-sdk/tree/... and https://github.com/modelcontextprotocol/python-sdk/tree/main.... The TypeScript client is widely used, while the Python side is more experimental.
In addition, I recommend looking at the specification documentation at https://spec.modelcontextprotocol.io. This should give you a good overview of how to implement a client. If you are looking to see an implemented open source client, Zed implements an MCP client: https://github.com/zed-industries/zed/tree/main/crates/conte...
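Very roughly, the wiring between an MCP client and the Anthropic messages API looks like this. This is only a sketch: the server command/path is made up, types may need small casts, and you should double-check the listTools/callTool convenience methods against the SDK docs:

    import Anthropic from "@anthropic-ai/sdk";
    import { Client } from "@modelcontextprotocol/sdk/client/index.js";
    import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

    // 1. Connect to an MCP server running as a subprocess over stdio
    const mcp = new Client({ name: "my-host", version: "0.1.0" }, { capabilities: {} });
    await mcp.connect(new StdioClientTransport({ command: "node", args: ["./my-server.js"] }));

    // 2. Advertise the server's tools to the model via the normal tool-use API
    const { tools } = await mcp.listTools();
    const anthropic = new Anthropic();
    const response = await anthropic.messages.create({
      model: "claude-3-5-sonnet-20241022",
      max_tokens: 1024,
      tools: tools.map((t) => ({
        name: t.name,
        description: t.description,
        input_schema: t.inputSchema, // MCP's inputSchema maps onto Anthropic's input_schema
      })),
      messages: [{ role: "user", content: "What's in the accounts table?" }],
    });

    // 3. If the model emits a tool_use block, forward it to the MCP server,
    //    then send the result back to the model as a tool_result and loop.
    for (const block of response.content) {
      if (block.type === "tool_use") {
        const result = await mcp.callTool({
          name: block.name,
          arguments: block.input as Record<string, unknown>,
        });
        // ...append result to messages and call messages.create again
      }
    }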
If you have specific questions, please feel free to start a discussion in the respective https://github.com/modelcontextprotocol repository, and we are happy to help you with integrating MCP.
Thanks! Do Anthropic models get extra training/RLHF/fine-tuning for MCP use or is it an extension of tool use?
What is a practical use case for this protocol?
Here's a useful one that I wrote:
https://github.com/anaisbetts/mcp-youtube
Claude doesn't support YouTube summaries. I thought that was annoying! So I added it myself, instead of having to hope Anthropic would do it
Thanks. Added to https://glama.ai/blog/2024-11-25-model-context-protocol-quic...
Thanks, this is the summary I’ve been looking for!
A few common use cases I've found are connecting a development database in a local Docker container to Claude Desktop or any other MCP client (e.g. an IDE assistant panel). I visualized the database layout in Claude Desktop and then created a Django ORM layer in my editor (which has MCP integration).
Internally we have seen people experiment with a wide variety of different integrations from reading data files to managing their Github repositories through Claude using MCP. Alex's post https://x.com/alexalbert__/status/1861079762506252723 has some good examples. Alternatively please take a look at https://github.com/modelcontextprotocol/servers for a set of servers we found useful.
Regarding the first example you mentioned. Is this akin to Django's own InspectDB, but leveled up?
The Zed editor has just announced support for MCP in some of their extensions, publishing an article showing some possible use cases/ideas: https://zed.dev/blog/mcp
Superb work and super promising! I had wished for a protocol like this.
Is there a recommended resource for building an MCP client? From what I've seen it just mentions that Claude Desktop & co are clients. The SDK readme seems to cover it a bit but some examples would be great.
We are still a bit light on documentation on how to integrate MCP into an application.
The best starting point are the respective client parts in the SDK: https://github.com/modelcontextprotocol/typescript-sdk/tree/... and https://github.com/modelcontextprotocol/python-sdk/tree/main..., as well as the official specification documentation at https://spec.modelcontextprotocol.io.
If you run into issues, feel free to open a discussion in the respective SDK repository and we are happy to help.
(I've been fairly successful in taking the spec documentation in markdown, an SDK and giving both to Claude and asking questions, but of course that requires a Claude account, which I don't want to assume)
Thanks for the pointers! Will do. I've fired up https://github.com/modelcontextprotocol/inspector and the code looks helpful too.
I'm looking at integrating MCP with a desktop app. The spec (https://spec.modelcontextprotocol.io/specification/basic/tra...) mentions "Clients SHOULD support stdio whenever possible." The server examples seem to be mostly stdio as well. In the context of a sandboxed desktop app, it's often not practical to launch a server as a subprocess because:
- sandbox restrictions of executing binaries
- needing to bundle a binary leads to a larger installation size
Would it be reasonable to relax this restriction and provide both SSE/stdio for the default server examples?
Having broader support for SSE in the servers repository would be great. Maybe I can encourage you to open a PR or at least an issue.
I can totally see your concern about sandboxed apps, particularly for Flatpak or similar distribution methods. I see you already opened a discussion https://github.com/modelcontextprotocol/specification/discus..., so let's follow up there. I really appreciate the input.
A possible cheap win for servers would be to support the systemd "here's an fd number you get exec'ed with" model - that way server code that's only written to do read/write on a normal fd should be trivial to wire up to unix sockets, TCP sockets, etc.
(and then having a smol node/bun/go/whatever app that can sit in front of any server that handles stdio - or a listening socket for a server that can handle multiple clients - and translates the protocol over to SSE or websockets or [pick thing you want here] lets you support all such servers with a single binary to install)
Not that there aren't advantages to having such things baked into the server proper, but making 'writing a new connector that works at all' as simple as possible while still having access to multiple approaches to talk to it seems like something worthy of consideration.
[possibly I should've put this into the discussion, but I have to head out in a minute or two; anybody who's reading this and engaging over there should feel free to copy+paste anything I've said they think is relevant]
^ asked the question in the discussion: https://github.com/modelcontextprotocol/specification/discus...
I can see where you're going with this and I can understand why you don't want to get into authorization, but if you're going to be encouraging tool developers to spin up json-rpc servers I hope you have some kind of plan for authorization otherwise you're encouraging a great way to break security models. Just because it's local doesn't mean it's secure. This protocol is dead the moment it becomes an attack vector.
Do you have a roadmap for the future of the protocol?
Is it versioned? ie. does this release constitute an immutable protocol for the time being?
You can read how we're implementing versioning here: https://spec.modelcontextprotocol.io/specification/basic/ver...
It's not exactly immutable, but any backwards incompatible changes would require a version bump.
We don't have a roadmap in one particular place, but we'll be populating GitHub Issues, etc. with all the stuff we want to get to! We want to develop this in the open, with the community.
Followup: is this a protocol yet, or just a set of libraries? This page is empty: https://spec.modelcontextprotocol.io/
Sorry, I think that's just the nav on those docs being confusing (particularly on mobile). You can see the spec here: https://spec.modelcontextprotocol.io/specification/
Ahh thanks! I was gonna say it's broken, but I now see that you're supposed to notice the sidebar changed and select one of the child pages. Would def recommend changing the sidebar link to that path instead of the index -- I would do it myself but couldn't find the sidebar in your doc repos within 5 minutes of looking.
Thanks for your hard work! "LSP for LLMs" is a fucking awesome idea
Did I misunderstand, or does it not seem to have support for user authentication? It seems your operating model is that the MCP server is, at installation time, configured with authentication for the underlying service. This is fine for non-serious use cases such as weather forecast querying, or for small-scale situations where only a couple of people have access to an LLM that's connected to the MCP server. But in an enterprise setting, there are thousands of people whose levels of access to the service behind the MCP server differ. I think the MCP server needs a way to know the identity of the human behind the LLM, so that it can perform appropriate authentication and authorization.
Super cool and much needed open standard. Wondering how this will work for websites/platforms that don't have exposed APIs (LinkedIn, for example).
You build an MCP server that does the calling using your own cookies and browser to get around their scraping protections.
Why not use GraphQL instead of inventing a whole new protocol?
That's just quibbling about the details of moving data from point A to point B. You're inventing a new protocol either way.
I agree. GraphQL is highly suitable for this. Anyway, I think just a simple adapter could make it work with this MCP thing.
now you have two problems.
For Rust, could one leverage the type + docs system to create such a server? I didn't delve into the details but one of the issues of Claude is that it has no knowledge of the methods that are available to it (vs LSP). Will creating such a server make it able to do informed suggestions?
For additional context, the PyPI package: https://pypi.org/project/mcp/
And the GitHub repo: https://github.com/modelcontextprotocol
Great work!
I'm looking at a PostgreSQL integration here: https://github.com/modelcontextprotocol/servers/tree/main/sr...
I have a case in mind where I would like to connect to multiple databases. Does the integration endpoint specification in claude_desktop_config.json allow us to pass some description so as to differentiate different databases? How?
First, thank you for working on this.
Second, a question. Computer Use and JSON mode are great for creating a quasi-API for legacy software which offers no integration possibilities. Can MCP better help with legacy software interactions, and if so, in what ways?
Probably, yes! You could imagine building an MCP server (integration) for a particular piece of legacy software, and inside that server, you could employ Computer Use to actually use and automate it.
The benefit would be that to the application connecting to your MCP server, it just looks like any other integration, and you can encapsulate a lot of the complexity of Computer Use under the hood.
If you explore this, we'd love to see what you come up with!
The result that the MCP server returns will be transferred to the MCP host (Claude, IDEs, tools); there are some privacy issues because the process is automatic after a one-time permission is granted.
For instance, if something goes wrong on the MCP host and it queries all data from the database and transfers it to the host, all of that data will be leaked.
It's hard to totally prevent this kind of problem when interacting with local data, but are there any measures in MCP to prevent these kinds of situations?
Your concerns are very valid. This is partly why right now, in Claude Desktop, it's not possible to grant permission permanently. The most you can do is "Allow for this chat," which applies to one tool from one server at a time.
You guys need a professional documentation person on your team, one that specializes in only writing documentation. I say this because the existing documentation is a confusing mess. This is going to cause all kinds of problems purely because it is weakly explained, and I see incorrect usage of words all over. Even the very beginning definitions of client, host and server are nonstandard.
Any ideas on how the concepts here will mesh with the recently released Microsoft.Extensions.AI library released by MS for .NET, that is also supposed to make it easy to work with different models in a standardized way?
Is there any way to give an MCP server access for good? Trying out the demo, it asked me every single time for permission, which will be annoying for longer usage.
We do want to improve this over time, just trying to find the right balance between usability and security. Although MCP is powerful and we hope it'll really unlock a lot of potential, there are still risks like prompt injection and misconfigured/malicious servers that could cause a lot of damage if left unchecked.
@somnium_n: Now, wait a minute, I wrote you!
MCP: I've gotten 2,415 times smarter since then.
Seems from the demo videos like Claude desktop app will soon support MCP. Can you share any info on when it will be rolled out?
Already available in the latest at https://claude.ai/download!
No Linux version :(
Will this be partially available from the Claude website for connections to other web services? E.g. could the GitHub server be called from https://claude.ai?
At the moment only Claude Desktop supports MCP. Claude.ai itself does not.
Any idea on timelines? I’d love to be able to have generation and tool use contained within a customer’s AWS account using bedrock. Ie I pass a single cdk that can interface with an exposed internet MCP service and an in-VPC service for sensitive data.
I'm on the latest Claude desktop for mac (0.7.1, pro plan). Can't see the mcp icon neither in the app nor in the web. How to troubleshoot?
There's a debugging guide here that may be helpful: https://modelcontextprotocol.io/docs/tools/debugging
Same issue here. Is it geolocked maybe?
Definitely not geolocked! Please try the debugging guide here: https://modelcontextprotocol.io/docs/tools/debugging
Was Cursor in any way an inspiration?
I’m glad they're pushing for standards here, literally everyone has been writing their own integrations and the level of fragmentation (as they also mention) and repetition going into building the infra around agents is super high.
We’re building an in terminal coding agent and our next step was to connect to external services like sentry and github where we would also be making a bespoke integration or using a closed source provider. We appreciate that they have mcp integrations already for those services. Thanks Anthropic!
I've been implementing a lot of this exact stuff over the past month, and couldn't agree more. And they even typed the python SDK -- with pydantic!! An exciting day to be an LLM dev, that's for sure. Will be immediately switching all my stuff to this (assuming it's easy to use without their starlette `server` component...)
As someone building a client which needs to sync with a local filesystem (repo) and database, I cannot emphasize how wonderful it is that there is a push to standardize. We're going to implement this for https://srcbook.com
This is a nice 2-minute video overview of this from Matt Pocock (of Typescript fame) https://www.aihero.dev/anthropics-new-model-context-protocol...
Very nice video, thank you.
His high level summary is that this boils down to a "list tools" RPC call, and a "call tool" RPC call.
It is, indeed, very smart and very simple.
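For the curious: on the wire those are just JSON-RPC requests, roughly like this (the tool name and arguments here are made up):

    { "jsonrpc": "2.0", "id": 1, "method": "tools/list" }

    { "jsonrpc": "2.0", "id": 2, "method": "tools/call",
      "params": { "name": "query_database",
                  "arguments": { "sql": "select count(*) from accounts" } } }

Everything else in the spec (resources, prompts, sampling, notifications) rides on the same JSON-RPC plumbing.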
Just tried out the puppeteer server example if anyone is interested in seeing a demo: https://x.com/chxy/status/1861302909402861905. (Todo: add tool use - prompt would be like "go to this website and screenshot")
I appreciate the design which left the implementation of servers to the community which doesn't lock you into any particular implementation, as the protocol seems to be aiming to primarily solve the RPC layer.
One major value-add of MCP, I think, is that it extends capabilities across a vast number of AI apps.
Made tool use work! check out demo here: https://x.com/chxy/status/1861684254297727299
sharing the messy code here just for funsies: https://gist.github.com/xyc/274394031b41ac7e8d7d3aa7f4f7bed9
Hmm I like the idea of providing a unified interface to all LLMs to interact with outside data. But I don't really understand why this is local only. It would be a lot more interesting if I could connect this to my github in the web app and claude automatically has access to my code repositories.
I guess I can do this for my local file system now?
I also wonder: if I build an LLM-powered app, and currently simply do RAG and then inject the retrieved data into my prompts, should this replace it? Can I even integrate this in a useful way?
The use case of on your machine with your specific data, seems very narrow to me right now, considering how many different context sources and use cases there are.
We're definitely interested in extending MCP to cover remote connections as well. Both SDKs already support an SSE transport with that in mind: https://modelcontextprotocol.io/docs/concepts/transports#ser...
However, it's not quite a complete story yet. Remote connections introduce a lot more questions and complexity—related to deployment, auth, security, etc. We'll be working through these in the coming weeks, and would love any and all input!
Will you also create some info on how other LLM providers can integrate this? So far it looks like it's mostly a protocol to integrate with anthropic models/desktop client. That's not what I thought of when I read open-source.
It would be a lot more interesting to write a server for this if this allowed any model to interact with my data. Everyone would benefit from having more integration and you (anthropic) still would have the advantage of basically controlling the protocol.
Note that both Sourcegraph's Cody and the Zed editor support MCP now. They offer other models besides Claude in their respective applications.
The Model Context Protocol initial release aims to solve the N-to-M relation of LLM applications (mcp clients) and context providers (mcp servers). The application is free to choose any model they want. We carefully designed the protocol such that it is model independent.
LLM applications just means chat applications here though right? This doesn't seem to cover use cases of more integrated software. Like a typical documentation RAG chatbot.
OpenAI has Actions which is relevant for this too: https://platform.openai.com/docs/actions/actions-library
Here's one for performing GitHub actions: https://cookbook.openai.com/examples/chatgpt/gpt_actions_lib...
Local only solves a lot of problems. Our infrastructure does tend to assume that data and credentials are on a local computer - OAuth is horribly complex to set up and there's no real benefit to messing with that when local works fine.
I'm honestly happy with them starting local-first, because... imagine what it would look like if they did the opposite.
> It would be a lot more interesting if I could connect this to my github in the web app and claude automatically has access to my code repositories.
In which case the "API" would be governed by a contract between Anthropic and Github, to which you're a third party (read: sharecropper).
Interoperability on the web has already been mostly killed by the practice of companies integrating with other companies via back-channel deals. You are either a commercial partner, or you're out of the playground and no toys for you. Them starting locally means they're at least reversing this trend a bit by setting a different default: LLMs are fine to integrate with arbitrary code the user runs on their machine. No need to sign an extra contract with anyone!
> It would be a lot more interesting if I could connect this to my github in the web app and claude automatically has access to my code repositories.
From the link:
> To help developers start exploring, we’re sharing pre-built MCP servers for popular enterprise systems like Google Drive, Slack, GitHub, Git, Postgres, and Puppeteer.
Yes but you need to run those servers locally on your own machine. And use the desktop client. That just seems... weird?
I guess the reason for this local focus is that it's otherwise hard to provide access to local files. Which is a decently large use case.
Still it feels a bit complicated to me.
For me it's complementary to openai's custom GPTs which are non-local.
Awesome!
In the "Protocol Handshake" section of what's happening under the hood - it would be great to have more info on what's actually happening.
For example, more details on what's actually happening to translate the natural language to a DB query. How much config do I need to do for this to work? What if the queries it makes are inefficient/wrong and my database gets hammered - can I customise them? How do I ensure sensitive data isn't returned in a query?
One thing I am having a hard time wrapping my head around is how to reliably integrate business logic into a system like this. Just hook up my Rails models etc. and have it use those?
Let’s say I’ve got a “widgets” table and I want the system to tell me how many “deprecated widgets” there are, but there is no convenient “deprecated” flag on the table—it’s defined as a Rails scope on the model or something (business logic).
The DB schema might make it possible to run a simple query to count widgets or whatever, but I just don’t have a good mental model of how these systems might work with “business logic” type things.
Sounds like you may want an MCP server for your Rails API instead of connecting directly to db.
This is exactly what I've been trying to figure out. At some point the LLM needs to produce text, even if it is structured outputs, and to do that it needs careful prompting. I'd love to see how that works.
You can use MCP with Sourcegraph's Cody as well
https://sourcegraph.com/blog/cody-supports-anthropic-model-c...
In case anyone else is like me and wanted to try the filesystem server before anything else, you may have found the README insufficient.
You need to know:
1. The claude_desktop_config.json needs a top-level mcpServers key, as described here: https://github.com/modelcontextprotocol/servers/pull/46/comm...
2. If you did this correctly then, after you run Claude Desktop, you should see a small 'hammer' icon (with a number next to it) next to the labs icon, in the bottom right of the 'How can Claude help you today?' box.
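For reference, a working config looks roughly like this (the filesystem server and path are just examples):

    {
      "mcpServers": {
        "filesystem": {
          "command": "npx",
          "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/Desktop"]
        }
      }
    }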
Yeah this was a huge foot gun
I don't trust an open source solution by a major player unless it's published with other major players. Otherwise, the perverse incentives are too great.
What risk do you foresee arising out of perverse incentives in this case?
Changing license terms, aggressive changes to the API to disallow competition, horrendous user experience that requires a support contract. I really don't think there's a limit to what I've seen other companies do. I generally trust libraries that competitors are maintaining jointly since there is an incentive toward not undercutting anyone.
The Model Context server is similar to what we've built at Spice, but we've focused on databases and data systems. Overall, standards are good. Perhaps we can implement MCP as a data connector and tool.
[1] https://github.com/spiceai/spiceai
If function calling is sync, is MCP its async counterpart? Is that the gist of what MCP is?
OpenAPI (aka Swagger) based function calling is already a standard for sync calls, and it solves the NxM problem. I'm wondering if the proposed value is that MCP is async.
The protocol felt unnecessarily complicated till I saw this
https://modelcontextprotocol.io/docs/concepts/sampling
It's crazy. Sadly not yet implemented in Claude Desktop client.
What's use case for this?
what’s crazy about it?
I would love to integrate this into my platform of tools for AI models, Toolhouse [1], but I would love to understand the adoption of this protocol, especially as it seems to only work with one foundational model.
[1] https://toolhouse.AI
This looks pretty awesome.
Would love to chat with you if you are open about possible collab.
I am frank [at] glama.ai
Emailed
So it’s basically a standardized plugin format for LLM apps, and that’s why it doesn’t support auth.
It’s basically a standardized way to wrap your OpenAPI client with a standard tool format, then plug it into your locally running AI tool of choice.
I appreciate the effort, but after spending more than one hour on it, I still don't understand how and why I'd use this.
The Core architecture [1] documentation is given in terms of TypeScript or Python abstractions, adding a lot of unnecessary syntactic noise for someone who doesn't use these languages. Very thin on actual conceptual explanation and full of irrelevant implementation details.
The 'Your first server'[2] tutorial is given in terms of big chunks of python code, with no explanation whatsoever, eg:
The code doesn't even compile. I don't think this is ready for prime time yet so I'll move along for now.
[1] https://modelcontextprotocol.io/docs/concepts/architecture
[2] https://modelcontextprotocol.io/docs/first-server/python
L402's (1) macaroon-based authentication would fit naturally with MCP's server architecture. Since MCP servers already define their capabilities and handle tool-specific requests, adding L402 token validation would be straightforward - the server could check macaroon capabilities before executing tool requests. This could enable per-tool pricing and usage limits while maintaining MCP's clean separation between transport and tool implementation. The Aperture proxy could sit in front of MCP servers to handle the Lightning payment flow, making it relatively simple to monetize existing MCP tool servers without significant modifications to their core functionality.
(1) https://github.com/lightninglabs/aperture
WRT prompts vs sampling: why does the Prompts interface exclude model hints that are present in the Sampling interface? Maybe I am misunderstanding.
It appears that clients retrieve prompts from a server to hydrate them with context only, to then execute/complete somewhere else (like Claude Desktop, using Anthropic models). The server doesn’t know how effective the prompt will be in the model that the client has access to. It doesn’t even know if the client is a chat app, or Zed code completion.
In the sampling interface - where the flow is inverted, and the server presents a completion request to the client - it can suggest that the client uses some model type /parameters. This makes sense given only the server knows how to do this effectively.
Given the server doesn’t understand the capabilities of the client, why the asymmetry in these related interfaces?
There’s only one server example that uses prompts (fetch), and the one prompt it provides returns the same output as the tool call, except wrapped in a PromptMessage. EDIT: looks like there are some capabilities classes in the mcp package, maybe these will evolve.
Our thinking is that prompts will generally be a user initiated feature of some kind. These docs go into a bit more detail:
https://modelcontextprotocol.io/docs/concepts/prompts
https://spec.modelcontextprotocol.io/specification/server/pr...
… but TLDR, if you think of them a bit like slash commands, I think that's a pretty good intuition for what they are and how you might use them.
It's great! I quickly reorganised my custom gpt repo to build a shell agent using MCP.
https://github.com/rusiaaman/wcgw/blob/main/src/wcgw/client/...
Already getting value out of it.
i am curious: why this instead of feeding your LLM an OpenAPI spec?
It's not about the interface to make a request to a server, it's about how the client and server can interact.
For example:
When and how should notifications be sent and how should they be handled?
---
It's a lot more like LSP.
makes sense, thanks for the explanation!
Nobody [who knows what they're doing] wants their LLM API layer controlling anything about how their clients and servers interact though.
Not sure I understand your point. If it's your client / server, you are controlling how they interact, by implementing the necessaries according to the protocol.
If you're writing an LSP for a language, you're implementing the necessaries according to the protocol (when to show errors, inlay hints, code fixes, etc.) - it's not deciding on its own.
Even if I could make use of it, I wouldn't, because I don't write proprietary code that only works on one AI Service Provider. I use only LangChain so that all of my code can be used with any LLM.
My app has a simple drop-down box where users can pick whatever LLM they want to use (OpenAI, Perplexity, Gemini, Anthropic, Grok, etc.)
However if they've done something worthy of putting into LangChain, then I do hope LangChain steals the idea and incorporates it so that all LLM apps can use it.
It's an open protocol; where did you get the idea that it would only work with Claude? You can implement it for whatever you want - I'm sure langchain folks are already working on something to accommodate it
Once fully adopted by at least 3 other companies I'll consider it a standard, and would consider it yes, if it solved a problem I have, which it does not.
Lots of companies open source some of their internal code, then say it's "officially a protocol now" that anyone can use, and then no one else ever uses it.
If they have new "tools" that's great however, but only as long as they can be used in LangChain independent of any "new protocol".
I do
I think OpenAI spec function calls are to this like what raw bytes are to unix file descriptors
They were referring to OpenAPI (formerly Swagger)
Same reason in Emacs we use lsp-mode and eglot these days instead of ad-hoc flymake and comint integrations. Plug and play.
For those interested, I've been working on something related to this, Web Applets – which is a spec for creating AI-enabled components that can receive actions & respond with state:
https://github.com/unternet-co/web-applets/
Moving from langchain interop to protocol interop for tools is great
Curious:
1. Authentication and authorization are left as a TODO: what is the thinking, as that is necessary for most uses?
2. Ultimately, what does MCP already add or will add that makes it more relevant than OpenApI / a pattern on top?
The default transport should have accommodated binary data. Whether it’s tensors of image data, audio waveforms, or pre-tokenized NLP workloads it’s just going to hit a wall where JSON-RPC can’t express it uniquely and efficiently.
This is a really, really, good point.
Devil's advocating for conversation's sake: at the end of the day, the user and client app want very little persistent data coming from the server - if nothing else than the client is expecting to store chats as text, with external links or Potemkin placeholders for assets like files.
I agree with the devil's advocacy you've posed, and in retrospect I probably should have said "I bet these folks have a plan for binary data". These are clearly very serious people so it might be more accurate to say that I strongly suspect a subsequent revision of the protocol will bake in default transport-level handling of arbitrary tensors in an efficient way.
This is awesome. I have an assistant that I develop for my personal use and integrations are the more difficult part - this is a game changer.
Now let's see a similar abstraction on the client side - a unified way of connecting your assistant to Slack, Discord, Telegram, etc.
Building something for this at surferprotocol [dot] org. Imo not every company will expose APIs for easily exporting data from their platforms (LinkedIn, iMessage, etc), so devs have to build these themselves
Something is telling me this _might_ turn out to be a huge deal; I can't quite put a finger on what it is that makes me feel that, but opening private data and tools via an open protocol to AI apps just feels like a game changer.
It's just function calling with a new name and a big push from the LLM provider, but this time it's in the right direction. Contrast with OpenAI's "GPTs", which are just function calling by another name, but pushed in the wrong direction - towards creating a "marketplace" controlled by OpenAI.
I'd say that thing you're feeling comes from witnessing an LLM vendor, for the first time in history, actually being serious about function calling and actually wanting people to use it.
But either way, the interface is just providing a JSON schema of functions along with your chat completion request, and a server with the ability to parse and execute the response. I'm not really seeing where a new layer of abstraction helps here (much less a new "protocol", as though we need a new transport layer?).
It smells like the thinking is that you (the developer) can grab from a collection of very broad data connectors, and the agent will be able to figure out what to do with them without much custom logic in between. Maybe I’m missing something
> It smells like the thinking is that you (the developer) can grab from a collection of very broad data connectors, and the agent will be able to figure out what to do with them without much custom logic in between.
This has always been the idea behind tools/function calling in LLMs.
What MCP tries to solve is the NxM problem - every LLM vendor has their own slightly different protocols for specifying and calling tools, and every tool supplier has to handle at least one of them, likely with custom code. MCP aims to eliminate custom logic at the protocol level.
LLMs can potentially query _something_ and receive a concise, high-signal response to facilitate communications with the endpoint, similar to API documentation for us but more programmatic.
This is huge, as long as there's a single standard and other LLM providers don't try to release their own protocol. Which, historically speaking, is definitely going to happen.
> This is huge, as long as there's a single standard and other LLM providers don't try to release their own protocol
Yes, very much this; I'm mildly worried because the competition in this space is huge and there is no shortage of money and crazy people who could go against this.
They will go against this. I don’t want to be that guy, but this moment in time is literally the opening scene of a movie where everyone agrees to work together in the bandit group.
But, it’s a bandit group.
Not necessarily. There’s huge demand to simplify the integration process between frontier models and consumers. If specs like this wind up saving companies weeks or months of developer time, then the MCP-compatible models are going to win over the more complex alternatives. This unlocks value for the community, and therefore the AI companies
One of the biggest issues of LLM is that they have a lossy memory. Say there is a function from_json that accepts 4 arguments. An LLM might predict that it accepts 3 arguments and thus produce non-functional code. However, if you add the docs for the function, the LLM will write correct code.
With the LLM being able to tap up to date context (like LSP), you won't need that back-and-forth dance. This will massively improve code generations.
This is definitely a huge deal - as long as there's a good developer experience - which IMHO we're not there yet!
Any feedback on developer experience is always welcomed (preferably in github discussion/issue form). It's the first day in the open. We have a long long way to go and much ground to cover.
This is great but will be DOA if OpenAI (80% market share) decides to support something else. The industry trend is that everything seems to converge to OpenAI API standard (see also the recent Gemini SDK support for OpenAI API).
There’s clearly a need for this type of abstraction, hooking up these models to various tooling is a significant burden for most companies.
Putting this out there puts OpenAI on the clock to release their own alternative or adopt this, because otherwise they run the risk of engineering leaders telling their C-suite that Anthropic is making headway towards better frontier model integration and OpenAI is the costlier integration to maintain.
True, but you could also frame this as a way for Anthropic to try and break that trend. IMO they've got to try and compete with OpenAI, can't just concede that OpenAI has won yet.
I wonder if they'll have any luck convincing other LLM vendors, such as Google, Meta, xAI, Mistral, etc, to adopt this protocol. If enough other vendors adopt it, it might still see some success even if OpenAI doesn't.
Also, I wonder if you could build some kind of open source mapping layer from their protocol to OpenAI's. That way OpenAI could support the protocol even if they don't want to.
"OpenAI API" is not a "standard" though. They have no interest in making it a standard, otherwise they would make it too easy to switch AI provider.
Anthropic is playing the "open standard" card because they want to win over some developers. (and that's good from that pov)
OpenAI API is natively supported by several providers (Google, Mistral, to name a few).
That’s only because they were first.
Here, Anthropic is first. If everyone starts using MCP today, any alternative OpenAI comes out with in a few months time probably won’t be able to dislodge it.
> 80% market share
where do you get that number?
If anyone here has an issue with their Claude Desktop app seeing the new MCP tools you've added to your computer, restart it fully. Restarting the Claude Desktop app did NOT work for me, I had to do a full OS restart.
Hm, this shouldn't be the case, something odd is happening here. Normally restarting the app should do it, though on Windows it is easy to think you restarted the app when you really just closed the main window and reopened it (you need to close the app via File => Quit).
How does this work for access-controlled data? I don't see how to pass auth credentials.
Required for
- corporate data sources, e g. Salesforce
- APIs with key limits and non-trivial costs
- personal data sources e.g. email
It appears that all auth is packed into the MCP config, e.g. slack token: https://github.com/modelcontextprotocol/servers/tree/main/sr...
Are there any other Desktop apps other than Claude's supporting this?
My team and I have a desktop product with a very similar architecture (a central app+UI with a constellation of local servers providing functions and data to models for local+remote context)
If this protocol gets adoption we'll probably add compatibility.
Which would bring MCP to local models like LLama 3 as well as other cloud providers competitors like OpenAI, etc
would love to know more
Landing page link is in my bio
We've been keeping quiet, but I'd be happy to chat more if you want to email me (also in bio)
Cody (VS Code plugin) is supporting MCP https://sourcegraph.com/blog/cody-supports-anthropic-model-c...
What about ChatGPT Desktop? Do you think they will add support for this?
I hope so, I use Claude Desktop multiple times a day.
Does aider benefit from this? Big part of aiders special sauce is the way it builds context, so it feels closely related but I don't know how the pieces would fit together here
My guess is more can be done locally. Then again I only understand ~2 of this and aider.
I'm wondering if there will be anything that's actually LLM-specific about these API's. Are they useful for ordinary API integration between websites?
Possibly marginally, but the "server" components here are ideally tiny bits of glue that just reformat LLM-generated JSON requests into target-native API requests. Nothing interesting "should" be happening in the context protocol. Examining the source may provide you with information on how to get to the real API for the service, however.
Tangential question: Is there any LLM which is capable of preserving the context through many sessions, so it doesn't have to upload all my context every time?
it's a bit of a hack but the web UI of ChatGPT has a limited amount of memories you can use to customize your interactions with it.
"remember these 10000 lines of code" ;)
In an ideal world gemini (or any other 1M token context model) would have an internal 'save snapshot' option so one could resume a blank conversation after 'priming' the internal state (activations) with the whole code base.
I'm surprised that there doesn't seem to be a concept of payments or monetization baked into the protocol. I believe there are some major companies to be built around making data and API actions available to AI Models, either as an intermediary or marketplace or for service providers or data owners directly- and they'd all benefit from a standardised payment model on a per transaction level.
I’ve gone looking for services like this but couldn’t find much, any chance you can link to a few platforms?
I'm working on Exfunc [1], which is an API for AI to fetch data and take action on the web. Sent you an email, would love to chat.
[1] https://www.exfunc.com
I think the most concise way to describe Anthropic MCP is that it's ODBC for AI.
A picture is worth 1k words.
Is there any good arch diagram for one of the examples of how this protocol may be used?
I couldn’t find one easily…
Is there any plans to add Well-known URI[1] as a standard? It would be awesome if we can add services just by inputting domain names of the services.
[1]: https://en.wikipedia.org/wiki/Well-known_URI
We're still in the process of thinking through and fleshing out full details for remote MCP connections. This is definitely a good idea to include in the mix!
How is this different from function calling libraries that frameworks like Langchain or Llamaindex have built?
After a quick look it seemed to me like they're trying to standardize on how clients call servers, which nobody needs, and nobody is going to use. However if they have new Tools that can be plugged into my LangChain stuff, that will be great, and I can use that, but I have no place for any new client/server models.
One thing I don't understand: does this rely on vector embeddings? Or how does the AI interact with the data? The example is a SQLite database with prices, and it shows Claude being asked to give the average price and to suggest pricing optimizations.
So does the entire DB get fed into the context? Or is there another layer in between? What if the database is huge, and you want to ask the AI for the most expensive or best selling items? With RAG that was only vaguely possible and didn't work very well.
Sorry I am a bit new but trying to learn more.
Vector embeddings are entirely unrelated to this.
This is about tool usage - the thing where an LLM can be told "if you want to run a SQL query, say <sql>select * from repos</sql>" - the code harness will then spot that tag, run the query for you and return the results to you in a chat message so you can use them to help answer a question or continue generating text.
It doesn't feed the whole DB into the context, it gives Claude the option to QUERY it directly.
It never accidentally deletes anything? Or I guess you give it read-only access? Is it querying through this API and some adapter built for it, or does the file get sent through the API, where they recognize it is SQLite and load it on their end?
It can absolutely accidentally delete things. You need to think carefully about what capabilities you enable for the model.
It took me about 5 jumps before learning what the protocol actually is, beyond learning that it's something awesome, community driven, and open source.
So this allows you to connect your SQLite database to Claude Desktop, so it executes SQL commands on your behalf instead of you entering them. It also chooses the right DB on its own, similar to what functions do.
Could this be used to voice control an android phone using Tasker functions? Just expose all the functions as actions and then let it rip?
Let’s see how other relevant players like Meta, Amazon and Mistral react to this. Things like this just make sense with broader adoption and a diverse governance model.
are there any examples of using this with the anthropic API to build something like Claude Desktop?
the docs aren't super clear yet wrt. how one might actually implement the connection. do we need to implement another set of tools to provide to the API and then have that tool call the MCP server? maybe i'm missing something here?
It is clear this is a wrapper around the function calling paradigm but with some extensions that are specific to this implementation. So it is an SDK.
Can I point this at my existing private framework and start getting Claude 3.5 code suggestions that utilize our framework it has never seen before?
Can someone please give examples of uses for this?
let Claude answer questions about your files and even modify them
This is something new, Good job!
Sensible standards and open protocols. Love to see the industry taking form like this.
I eventually return from every blah-blah protocol/framework to SQL, txt, and the standard library, due to the inefficiency of introducing a meaningless layer. People, myself included a while ago, often avoid confronting the difficult problems which actually matter. Worse, frameworks and buzzword technologies are the world of uncompetitive people.
If it gets traction this could be great. Industry sure could do with some standardisation
Is this similar to what Sourcegraph's OpenCtx tries to do?
Has OpenCtx ever gained much traction?
Yeah, we’re using it a lot at Sourcegraph. There are some extra APIs it offers beyond what MCP offers, such as annotations (as you can see on the homepage of https://openctx.org). We worked with Anthropic on MCP because this kind of layer benefits everyone, and we’ve already shipped interoperability.
Interesting. In Cody training sessions given by Sourcegraph, I saw OpenCtx mentioned a few times "casually", and the focus is always on Cody core concepts and features like prompt engineering and manual context etc. Sounds like for enterprise customers, setting up context is meant for infrastructure teams within the company, and end users mostly should not worry about OpenCtx?
Most users won't and shouldn't need to go through the process of adding context sources. In the enterprise, you want these to be chosen by (and pre-authed/configured by) admins, or at least not by each individual user, because that would introduce a lot of friction and inconsistency. We are still working on making that smooth, which is why we haven't been very loud about OpenCtx to end users yet.
But today we already have lots of enterprise customers building their own OpenCtx providers and/or using the `openctx.providers` global settings in Sourcegraph to configure them in the current state. OpenCtx has been quite valuable already here to our customers.
Is this basically open source data collectors / data integration connectors?
I would probably more think of it as LSP for LLM applications. It is enabling data integrations, but the current implementations are all local.
Thank you for creating this.
really great to see some standards emerging. i'd love to see something like mindsdb wired up to support this protocol and get a bunch of stuff out of the box.
Is it the data context that is aware, as and when we add columns in the DB, of what they mean? How we can make every schema change that happens on the DB context-aware is not clear.
Strange place for WS* to respawn.
LOL, this is basically OpenAI-spec function calls with different semantics.
Spyware-As-A-Service
If you run a SaaS and want to rapidly build out a CLI that you could plug into this ~and~ want something that humans can use, check out the project I’ve been working on at https://terminalwire.com
tl;dr—you can build & ship a CLI without needing an API. Just drop Terminalwire into your server, have your users install the thin client, and you’ve got a CLI.
I’m currently focused on getting the distribution and development experience dialed in, which is why I’m working mostly with Rails deployments at the moment, but I’m open to working with large customers who need to ship a CLI yesterday in any language or runtime.
If you need something like this check it out at https://terminalwire.com or ping me brad@terminalwire.com.
Computer Science: There's nothing that can't be solved by adding another layer.
it's actually software engineering! computer science can solve everything by drawing more graphs with a pencil.
I see a good number of comments that seem skeptical or confused about what's going on here or what the value is.
One thing that some people may not realize is that right now there's a MASSIVE amount of effort duplication around developing something that could maybe end up looking like MCP. Everyone building an LLM agent (or pseudo-agent, or whatever) right now is writing a bunch of boilerplate for mapping between message formats, tool specification formats, prompt templating, etc.
Now, having said that, I do feel a little bit like there's a few mistakes being made by Anthropic here. The big one to me is that it seems like they've set the scope too big. For example, why are they shipping standalone clients and servers rather than client/server libraries for all the existing and wildly popular ways to fetch and serve HTTP? When I've seen similar mistakes made (e.g. by LangChain), I assume they're targeting brand new developers who don't realize that they just want to make some HTTP calls.
Another thing that I think adds to the confusion is that, while the boilerplate-ish stuff I mentioned above is annoying, what's REALLY annoying and actually hard is generating a series of contexts using variations of similar prompts in response to errors/anomalies/features detected in generated text. IMO this is how I define "prompt engineering" and it's the actual hard problem we have to solve. By naming the protocol the Model Context Protocol, I assumed they were solving prompt engineering problems (maybe by standardizing common prompting techniques like ReAct, CoT, etc).
Your point about boilerplate is key, and it’s why I think MCP could work well despite some of the concerns raised. Right now, so many of us are writing redundant integrations or reinventing the same abstractions for tool usage and context management. Even if the first iteration of MCP feels broad or clunky, standardizing this layer could massively reduce friction over time.
Regarding the standalone servers, I suspect they’re aiming for usability over elegance in the short term. It’s a classic trade-off: get the protocol in people’s hands to build momentum, then refine the developer experience later.
I don't see why I or any other developer would abandon their homebrew agent implementation for a "standard" which isn't actually a standard yet.
I also don't see any of that implementation as "boilerplate". Yes there's a lot of similar code being written right now but that's healthy co-evolution. If you have a look at the codebases for Langchain and other LLM toolkits you will realize that it's a smarter bet to just roll your own for now.
You've definitely identified the main hurdle facing LLM integration right now and it most definitely isn't a lack of standards. The issue is that the quality of raw LLM responses falls apart in pretty embarrassing ways. It's understood by now that better prompts cannot solve these problems. You need other error-checking systems as part of your pipeline.
The AI companies are interested in solving these problems but they're unable to. Probably because their business model works best if their system is just marginally better than their competitor.
data security is the reason i'd imagine they're letting others host servers
The issue isn’t with who’s hosting, it’s that their SDKs don’t clearly integrate with existing HTTP servers regardless of who’s hosting them. I mean integrate at the source level, of course they could integrate via HTTP call.
I took time to read everything on Twitter/Reddit/Documentation about this.
I think I have a complete picture.
Here is a quickstart for anyone who is just getting into it.
https://glama.ai/blog/2024-11-25-model-context-protocol-quic...
It would appear the HN hug knocked your host offline, given the 525 TLS Is Bogus and its 502 Bad Gateway friend
I managed to get it to load: https://archive.ph/7DALF
It's actually due to fly.io outage https://news.ycombinator.com/item?id=42241851
Yeah this is a phenomenal resource, so much so I just tried to come back to it. Going to bookmark it and hope it shows back up!
fine. Examples, what can I use this for?
have you read either TFA or their blog post? There are plenty of examples.
I love how they’re pretending to be champions of open source while leaving this gem in their terms of use
“”” You may not access or use, or help another person to access or use, our Services in the following ways: … To develop any products or services that compete with our Services, including to develop or train any artificial intelligence or machine learning algorithms or models. “””
Presumably this doesn't apply to the standard being released here, nor any of its implementations made available. Each of these appears to have its own permissible license.
OpenAI and many other companies have virtually the same language in their T&Cs.
that doesn't absolve any of them
Absolve them of what?
OpenAI says, "[You may not] Use Output to develop models that compete with OpenAI." That feels more narrow than Anthropic's blanket ban on any machine learning development.
Eh, the actual MCP repos seem to just be MIT licensed; AFAIK every AI provider has similar language for their core services.
I think open-sourcing your tech for the common person while leaving commercial use behind a paywall or even just against terms is completely acceptable, no?
I don't understand the value of this abstraction.
I can see the value of something like DSPy where there is some higher level abstractions in wiring together a system of llms.
But this seems like an abstraction that doesn't really offer much besides "function calling but you use our python code".
I see the value of language server protocol but I don't see the mapping to this piece of code.
That's actually negative value if you're integrating into an existing software system, or just, you know, exposing functions you've defined, versus remapping the functions you've defined into this intermediate abstraction.
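By "remapping" I mean something like the following: the function already exists and is directly callable, but the intermediate layer makes you re-describe it as a schema plus a dispatch step (names here are illustrative, not a real SDK).

```python
# The function already exists and is directly callable...
def lookup_invoice(invoice_id: str) -> dict:
    return {"id": invoice_id, "status": "paid"}

# ...but exposing it through the intermediate layer means re-describing it
# as a schema plus a dispatch step (illustrative only, not a real SDK).
TOOL_SPEC = {
    "name": "lookup_invoice",
    "description": "Fetch an invoice by id",
    "inputSchema": {"type": "object",
                    "properties": {"invoice_id": {"type": "string"}},
                    "required": ["invoice_id"]},
}

def dispatch(name, arguments):
    if name == "lookup_invoice":
        return lookup_invoice(arguments["invoice_id"])
    raise ValueError(f"unknown tool {name}")
```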
Here's the play:
If integrations are required to unlock value, then the platform with the most prebuilt integrations wins.
The bulk of mass adopters don't have the in-house expertise or interest in building their own. They want turnkey.
No company can build integrations at scale more quickly by itself than an entire community can.
If Anthropic creates an integration standard and gets adoption, then it either at best has a competitive advantage (first mover and ownership of the standard) or at worst prevents OpenAI et al. from doing the same to it.
(Also, the integration piece is the necessary but least interesting component of the entire system. Way better to commodify it via standard and remove it as a blocker to adoption)
The secret sauce part is the useful part -- the local vector store. Anthropic is probably not going to release that without competitive pressure. Meanwhile this helps Anthropic build an ecosystem.
When you think about it, function calling needs its own local state (embedded db) to scale efficiently on larger contexts.
I'd like to see all this become open source / standardized.
i'm not sure what you mean - the embedding model is independent of the embeddings themselves. Once generated, the embeddings and vector store should exist 100% locally and thus aren't part of any secret sauce
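That matches my understanding: once an embedding model has returned vectors, nearest-neighbour retrieval is just local math, along the lines of this toy sketch (embed() standing in for whatever embedding model or API you call):

```python
# Toy local vector store: once embeddings exist, retrieval is local math.
# embed() is a stand-in for whatever embedding model/API you use.
import numpy as np

class LocalVectorStore:
    def __init__(self):
        self.texts, self.vectors = [], []

    def add(self, text, embed):
        self.texts.append(text)
        self.vectors.append(np.asarray(embed(text), dtype=float))

    def search(self, query, embed, k=3):
        q = np.asarray(embed(query), dtype=float)
        sims = [float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
                for v in self.vectors]
        best = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
        return [(self.texts[i], sims[i]) for i in best]
```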
The Zed editor team collaborated with Anthropic on this, so you can try features of this in Zed as of today: https://zed.dev/blog/mcp
So they want an open protocol, and instead of, say, collaborating with other model providers like Google, Microsoft, Mistral, Cohere and the open-source community, they collaborate with an editor team. Quite the protocol. Why should Microsoft implement this? If they implement their own protocol, they win. Why should Google implement this? If they implement their own protocol, they win too. Both giants have way more apps and reach inside businesses than Anthropic could wish for.
Looks like I need to create a rust extension wrapper for the mcp server I created for Claude?
I'm a little confused as to the fundamental problem statement. It seems like the idea is to create a protocol that can connect arbitrary applications to arbitrary resources, which seems underconstrained as a problem to solve.
This level of generality has been attempted before (e.g. RDF and the semantic web, REST, SOAP) and I'm not sure what's fundamentally different about how this problem is framed that makes it more tractable.
RPC for LLMs with the first client being Claude Desktop. ;-)
Good thread showing how this works: https://x.com/alexalbert__/status/1861079762506252723
Twitter doesn't work anymore unless you are logged in.
https://unrollnow.com/status/1861079762506252723