
What MCP changes is not tool access, but the cost structure of Agents.

The real significance of MCP is not that it unifies tool access, but that it moves a large amount of intermediate work that belongs in the runtime out of the expensive LLM loop. What it changes is not “how many tools can be connected” but how the agent uses context, code execution, and runtime control flow. This article builds on Anthropic’s discussion of code execution with MCP and lays out my full view of direct tool-calling, progressive disclosure, runtime economics, and executable skills.

Meta

Published

3/25/2026

Category

interpretation

Reading Time

12 min read

Original reference: Adam Jones & Conor Kelly, “Code execution with MCP: Building more efficient agents”. This article is an original interpretation, not a translation.


If you have been following MCP discussions in recent months, you will have noticed one particularly popular claim: the point of MCP is that it unifies tool access standards.

This is not wrong, but it is too shallow - shallow enough to lead many teams astray when they actually build systems. They treat MCP as a layer of protocol adapters, busy counting “how many servers we have connected” and “which tools we can now call”, without realizing that what MCP really changes is not the tool catalog but the entire runtime cost structure of the agent.

The most valuable part of Anthropic’s “Code execution with MCP: Building more efficient agents” is not that it explains once more what MCP is, but that it shifts the perspective from “protocol correctness” to “runtime economics”.

It reminds us that when an agent faces many tools, complex processes, large intermediate results, and multi-step control flow, what is really expensive is often not the tools themselves but how all of this enters and exits the LLM loop.

From this perspective, the meaning of MCP changes completely. The question is no longer “is this more standard?” but “how much of the work that used to require the model’s repeated participation can be migrated to a cheaper, more stable, more orchestratable runtime?”

This is where I think MCP is really worth focusing on in 2026.

1. Why is “unified tool protocol” a misleading framing?

The biggest problem with understanding MCP as a tool access standard is not that the statement is false, but that it directs a team’s attention to secondary concerns.

Because once MCP is framed this way, the most natural optimization direction becomes: connect more tools, maintain more schemas, build a more complete tool catalog, improve server discoverability, and count connections.

These things have value, of course, but they only answer “can it be connected?”, not “is the overall system more efficient once it is connected?”

The real cost black hole in agent systems usually sits at another level:

  • too many tool definitions, so merely injecting them into context is expensive;
  • tool results that are too large, so intermediate data repeatedly enters and exits the model;
  • control flow forced into the model loop, producing round after round of redundant reasoning;
  • work that could be handled programmatically handed back to the model to orchestrate, step by step.

In other words, what many agents lack today is not tools, but a runtime structure that can separate the middle-layer work around tool use from the LLM loop.

If you start from here, you will find that the real value of MCP is not “more tools” but “a more sensible division of labor”.

2. Why does direct tool-calling become more and more expensive in real systems?

Many agent systems default to direct tool-calling: tool definitions are given to the model, the model selects a tool, the tool executes, the result is returned to the context, and the model decides the next step.
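
As a minimal sketch of this loop (callModel and executeTool are hypothetical stand-ins for your LLM client and tool dispatcher, not a real SDK):

```typescript
// A minimal sketch of the direct tool-calling loop. `callModel` and
// `executeTool` are hypothetical; the shape mirrors typical tool-use APIs.
type ToolCall = { id: string; name: string; input: unknown };
type ModelTurn = { text?: string; toolCalls: ToolCall[] };

declare function callModel(context: string[], toolSchemas: string[]): Promise<ModelTurn>;
declare function executeTool(name: string, input: unknown): Promise<string>;

async function agentLoop(task: string, toolSchemas: string[]): Promise<string> {
  const context: string[] = [task];
  while (true) {
    // Every iteration re-sends ALL tool schemas and ALL prior results:
    // the context tax and the context churn live here.
    const turn = await callModel(context, toolSchemas);
    if (turn.toolCalls.length === 0) return turn.text ?? "";
    for (const call of turn.toolCalls) {
      const result = await executeTool(call.name, call.input);
      context.push(result); // raw intermediate result re-enters the prompt
    }
  }
}
```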

In small demo scenarios this process is usually fine, even pleasant: few tools, short results, shallow paths, and easy for a human to watch over.

But in a real production environment, problems surface quickly - and not as bugs, but as systemic cost growth.

1. Schema injection becomes a context tax

As the number of tools grows, merely feeding their schemas, descriptions, and parameter definitions to the model starts consuming a large share of the context budget. The more tools you have, the higher the tax.
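
To see the scale, here is an illustrative back-of-envelope calculation (all numbers are assumptions, not measurements):

```typescript
// Illustrative arithmetic only -- these numbers are assumptions, not benchmarks.
const toolCount = 50;        // a handful of MCP servers can easily expose 50 tools
const tokensPerSchema = 500; // name + description + JSON schema parameters
const schemaTax = toolCount * tokensPerSchema; // = 25,000 tokens per request
// This context is paid on *every* model call, before the task even starts.
```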

2. Intermediate results generate context churn

Tool A’s result is returned to the model, the model decides to call B, and B’s result comes back. Data that could have been processed entirely inside the runtime is forced to enter the prompt, leave it, and enter it again.

Not only is this expensive, it also tends to contaminate working memory.

3. Control flow is forced into the model loop

Retry, polling, waiting, pagination, batch filtering, join/aggregate, redacting intermediate values - all of this is better suited to code execution, yet under direct tool-calling it is often driven by the model. Naturally programmatic tasks get repackaged as expensive reasoning.
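
For contrast, here is a hedged sketch of polling with retry once it lives in code; checkJobStatus stands in for a hypothetical MCP-backed call:

```typescript
// Polling + retry as plain code: zero LLM round trips per iteration.
// Under direct tool-calling, each iteration below would be a full model
// call with the entire context re-sent.
declare function checkJobStatus(jobId: string): Promise<{ done: boolean; result?: string }>;

async function waitForJob(jobId: string, maxAttempts = 30): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await checkJobStatus(jobId); // hypothetical MCP-backed API
    if (status.done) return status.result ?? "";
    await new Promise((r) => setTimeout(r, 2_000)); // wait 2s between polls
  }
  throw new Error(`job ${jobId} did not finish after ${maxAttempts} polls`);
}
```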

I increasingly feel this is the most unnecessary and most expensive waste in agent runtimes today.

3. The real value of MCP + code execution is to move the programmatic middle layer out of the model

What really matters in Anthropic’s article is not MCP itself, but how the division of responsibilities changes once MCP is combined with code execution.

This change can be summarized in one sentence:

The model is responsible for deciding “what to do”, and the execution environment is responsible for completing “how to batch, how to loop, how to filter, and how to organize intermediate results”.

Once labor is divided like this, the cost structure of the entire system changes.

Old model: the model not only makes high-value judgments but also repeatedly participates in mechanical intermediate steps.

New model: the model plans and generates code; the code calls MCP-backed APIs in the runtime to complete the bulk of the programmable work, and only the genuinely necessary key information is brought back to the model.
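
Anthropic’s article illustrates this by exposing MCP tools as importable code APIs. A sketch in that spirit, where the module paths, function signatures, and IDs are all illustrative:

```typescript
// Sketch of model-generated code under the code-execution model. The module
// paths and function signatures are illustrative, following the pattern of
// exposing MCP tools as importable code APIs.
import { getTranscript } from "./servers/gdrive/getTranscript";
import { updateRecord } from "./servers/salesforce/updateRecord";

// Thousands of lines can flow through here without ever entering the prompt.
const transcript = await getTranscript({ documentId: "abc123" });
const actionItems = transcript
  .split("\n")
  .filter((line) => line.startsWith("TODO:")); // filtering stays in the runtime

await updateRecord({ recordId: "opp-42", notes: actionItems.join("\n") });

// Only the compressed, decision-relevant summary goes back to the model.
console.log(`Synced ${actionItems.length} action items to Salesforce.`);
```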

This means intermediate results can stay inside the runtime, control flow no longer has to pass through the LLM at every step, privacy handling can move upstream, large results can be aggregated and filtered before they are fed back, and tool use starts to look like a program rather than a conversation.

This is not a minor optimization; it changes the agent’s mode of operation from “an expensive brain manually watching the console at high frequency” to “an execution system with an automated middle layer”.

4. Why is progressive disclosure the practical value of MCP in a large-scale tool environment?

If I had to sum up MCP’s value at scale in one pragmatic sentence, it would be: it makes exposing tools on demand the natural default.

This is what progressive disclosure is about.

In a genuinely large tool ecosystem, the most unreasonable assumption is that the model should see every tool definition from the start. That is not empowering it; it is saddling it with cognitive debt.

A more reasonable approach is layered: first know which servers and capability domains exist, search for relevant tools only when genuinely needed, read the detailed schema just before confirming a call, and let the runtime absorb the mechanical steps during execution.
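
Anthropic’s article implements this by presenting tools as files the agent can explore on demand. A minimal sketch of that layering, assuming a ./servers/<server>/<tool>.ts layout (the layout and helper names are assumptions):

```typescript
// Progressive disclosure as a filesystem the agent explores on demand.
// Layer 1: list capability domains (cheap -- just directory names).
// Layer 2: search for relevant tools only when the task needs them.
// Layer 3: read one tool's full schema just before calling it.
import { readdir, readFile } from "node:fs/promises";

async function listServers(): Promise<string[]> {
  return readdir("./servers"); // one directory per MCP server
}

async function searchTools(server: string, keyword: string): Promise<string[]> {
  const tools = await readdir(`./servers/${server}`);
  return tools.filter((t) => t.toLowerCase().includes(keyword.toLowerCase()));
}

async function readToolDefinition(server: string, tool: string): Promise<string> {
  // Only now does a full schema enter the model's working memory.
  return readFile(`./servers/${server}/${tool}`, "utf8");
}
```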

The benefits of this layering are very practical: lower context occupancy, less noise from irrelevant capabilities, less interference between tool descriptions, and an agent that finds it easier to focus on the task at hand.

This is essentially the same direction as many trends in skills, retrieval, and context compaction today:

Not all capabilities should be injected up front; many should enter working memory only when they are truly needed.

MCP makes this model easier to systematize.

5. In the MCP era, the execution environment matters far more than the protocol layer

Many teams’ MCP discussions still show an obvious skew: too much focus on protocol definition, too little on runtime design.

But once you really accept the code execution + MCP model, the most critical questions move to the runtime:

  • execution environment - where the code runs, how long it may run, what permissions it has, and which MCP servers it is connected to;
  • state management - how intermediate results are persisted, how large outputs are kept inside the runtime, and how sensitive information is prevented from flowing back into the context;
  • fault tolerance - how a failure is retried or recovered from.
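
As a sketch of that design surface, a hypothetical runtime configuration in type form (every field name here is an assumption, not a standard):

```typescript
// Hypothetical runtime configuration -- a checklist in type form, not a spec.
interface AgentRuntimeConfig {
  // Execution environment: where code runs, for how long, with what rights.
  sandbox: { timeoutMs: number; memoryLimitMb: number; network: "none" | "allowlist" };
  allowedMcpServers: string[];      // which servers this run may touch
  // State management: keep large artifacts out of the context window.
  workspaceDir: string;             // persisted intermediate results live here
  maxTokensReturnedToModel: number; // hard cap on what flows back as context
  redactPatterns: RegExp[];         // sensitive values scrubbed before return
  // Fault tolerance: recover without burning model round trips.
  maxRetries: number;
  resumeFromCheckpoint: boolean;
}
```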

In other words, what determines the system’s quality is no longer “whether MCP is followed” but “how MCP is consumed by a suitable execution environment”.

This is why I increasingly understand MCP as a runtime substrate, not just an interoperability protocol.

Protocols matter, of course, but a protocol by itself does not produce a good system. A good runtime does.

6. Not all tools are suitable for direct exposure to models

What I particularly agree with in this article is that it implicitly challenges a common default assumption: that if a capability exists, it is best exposed directly to the model as a tool schema.

I don’t think this is true in a large-scale agent runtime.

When a capability has these characteristics, direct exposure is usually not the best choice: the tool count is high, results are large, long control flows are required, multi-step aggregation is needed, intermediate results are not worth the model inspecting one by one, or sensitive information is involved.

Such capabilities are better routed through the runtime’s code layer first, then returned to the model in a higher-level, compressed form.

In other words, a truly mature agent runtime is layered: direct tool-calling for simple, short-path, small-result tasks; code execution + MCP for complex, multi-step, large-result tasks; and workflow-level orchestration for larger, process-level tasks.
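
A hedged sketch of that routing decision; the thresholds are invented for illustration:

```typescript
// Illustrative routing between the three layers. The thresholds are made up
// for the sketch -- tune them against your own workloads.
type ExecutionLayer = "direct-tool-call" | "code-execution" | "workflow";

function chooseLayer(task: {
  steps: number;             // expected tool invocations
  expectedResultBytes: number;
  needsControlFlow: boolean; // retries, polling, paging, aggregation
}): ExecutionLayer {
  if (task.steps <= 2 && task.expectedResultBytes < 4_000 && !task.needsControlFlow)
    return "direct-tool-call";          // simple, short path, small result
  if (task.steps > 20) return "workflow"; // process-level orchestration
  return "code-execution";              // complex, multi-step, large results
}
```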

This is not “weakening the model”; it is freeing the model from low-value mechanical steps.

7. What MCP really changes is the contextual economics of agents.

I have always felt this is the single most valuable concept to take from the entire article: contextual economics.

Many people still treat context as a container that things can be stuffed into indefinitely. But context is an expensive, scarce, perishable working-memory resource. Everything that enters it carries an opportunity cost.

From this perspective, the real value of MCP is not “connecting a few more tools”, but helping the system re-answer a set of allocation questions: what must enter the context, what can stay in the runtime, what should be compressed by code before coming back, what should be externalized entirely, and what should be exposed to the model only on demand.

This is a very typical system design idea:

Let the expensive resource carry only high-value information, and let cheap resources do the mechanical work.

If you understand MCP from this perspective, you will realize that what it changes is not the tool catalog, but the cost curve of the entire agent system.

8. How skills change as a result: from prompt bundle to executable module

The article also mentions a direction worth exploring: reusable code modules and skills.

I think this matters a great deal. Once an agent can write a stable intermediate module in the runtime, encapsulate the MCP calling logic, and reuse it in later tasks, the boundary of what counts as a skill changes.

In the past, many skills were essentially: a set of instructions, a strategy hint, a list of precautions.

But looking forward, more and more skills may become: instructions + executable code, semantic description + call encapsulation, policy template + runtime asset.

In other words, skills will not only tell the agent “how to think” but increasingly tell it “which intermediate execution skeletons already exist and can be used directly”.
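
A sketch of what such an executable skill might look like; the file layout and function names are assumptions, reusing the illustrative server APIs from earlier:

```typescript
// ./skills/sync-meeting-notes/index.ts -- a hypothetical executable skill:
// the MCP calling logic is encapsulated once and reused across tasks.
import { getTranscript } from "../../servers/gdrive/getTranscript";
import { updateRecord } from "../../servers/salesforce/updateRecord";

/**
 * Reusable execution skeleton: fetch a transcript, extract action items,
 * sync them to a CRM record. Future tasks call this directly instead of
 * re-deriving the orchestration inside the model loop.
 */
export async function syncMeetingNotes(documentId: string, recordId: string) {
  const transcript = await getTranscript({ documentId });
  const actionItems = transcript.split("\n").filter((l) => l.startsWith("TODO:"));
  await updateRecord({ recordId, notes: actionItems.join("\n") });
  return { synced: actionItems.length }; // small summary back to the model
}
```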

This lets a skill evolve from a prompt asset into an execution asset.

9. A more practical decision framework: when to choose a local runtime, and when a remote one

If I were to give a team one practical selection framework, I would want them to at least check these signals first.

Signs a remote runtime fits:

  • the task naturally produces reusable artifacts;
  • a human may need to take over midway;
  • process visibility is itself valuable;
  • the local machine is not an ideal surface for the compute load;
  • the output is not just a conclusion but a workspace to keep working in.

Signs a local runtime fits:

  • feedback loops must be extremely short;
  • there is strong dependence on local repositories and long-lived processes;
  • the task needs many fine-grained combinations of CLI tools;
  • the human developer is the primary operator;
  • the resulting artifacts do not need to persist as a long-term collaboration space.

The point of this framework is not to make every choice perfectly, but to remind the team that the runtime is not a default - it should be the result of matching the task.

10. A truly mature agent runtime does not hand everything to the model - it grows increasingly clear about what should not be handed to the model

If I had to condense the entire article into one blunter sentence, it would be:

A truly mature runtime does not let the model control everything; it becomes increasingly clear about which intermediate steps should be firmly taken back from the model.

On the surface this looks like cost saving, but it does something more important: it frees the model from meaningless intermediate labor so it can spend its reasoning budget on the judgments that matter most.

This is why I think the real value of MCP is not that it gives the system more capability names, but that it finally lets the system know which capabilities belong at which layer.


Who should read this

This article is suitable for the following types of readers:

  • Engineering teams working on MCP server/client/runtime design
  • Anyone who has felt that more tools brought more chaos to their agent system
  • People who want to re-layer code execution, tool use, and runtime orchestration
  • Platform teams designing skills, execution environments, and policy layers
  • Anyone who has experienced firsthand why agents are expensive, slow, and easily dragged down by oversized results
