Original analysis: the MCP protocol, the "USB-C moment" of the Agent ecosystem
An in-depth analysis of the Model Context Protocol's design and why standardization is the key to a thriving Agent ecosystem
📋 Copyright Statement and Disclaimer
This article is an original analysis based on the author's personal hands-on experience, inspired by the Kaggle white paper "Agent Tools & Interoperability with MCP".
Attribution of views:
- All specific cases, practical data, and pitfall stories in this article come from the author's own project experience
- The core methodology and framework are reconstructed from the author's own thinking
- The white paper is referenced only for the academic phrasing of some concept definitions
Original reference:
- Title: "Agent Tools & Interoperability with MCP"
- Link: Read original text
Nature of this work: an independently written practice summary, not a translation or adaptation. The views expressed represent only the author's personal understanding and may differ from the original author's position.
Introduction: the Monday-morning integration nightmare
It was a Monday morning in the spring of 2024, and I arrived an hour earlier than usual. Not out of diligence, but out of anxiety.
For the past two weeks, our team had been working on a "simple" task: enabling our newly developed Agent to call the company's internal systems: query the user database, send notification emails, and create work-order records. By the initial estimate, this should have taken three days: one to read the documentation, one to write the code, one to test.
But two weeks passed and we were stuck.
The first system, the user database, ran on PostgreSQL. We wrote the SQL queries, then found the permission models didn't match: the Agent needed to query as an "application", while our permission system was designed around "users". After rewriting the permission logic, we discovered a connection-pool misconfiguration that exhausted connections under high concurrency.
The second system, the mail service, used SendGrid. The API call itself was simple, but the email templates had to be generated dynamically, and the HTML the Agent produced was often malformed, rendering into a mess in email clients. Later we discovered that batch sends were rate-limited, so we also had to implement retry and backoff logic.
The third system, the work-order system, exposed an internal REST API. The documentation was incomplete; some parameters could only be understood by reading the source code. Worse, the API was mid-upgrade: the endpoint we integrated against was marked "deprecated", but documentation for the new version wasn't finished.
By the end of the third week we had written thousands of lines of "adaptation code": connection-pool management, email template rendering, API error handling, rate limiting, retry logic... None of it had anything to do with the Agent's "intelligence"; it was just tedious plumbing.
Ironically, another team heard about our "integration experience" and wanted to reuse our code. But they used MySQL, not PostgreSQL; Mailgun, not SendGrid; and although their work-order system shared the same origin as ours, it was a different version. Our "experience" was almost impossible to reuse.
At that moment I realized: **an Agent's capability is determined not by how strong the model is, but by how many external systems it can work with, and integration cost is often the main reason projects fail.**
So when Anthropic released MCP (Model Context Protocol), I felt a long-lost excitement: this might be the solution we had been waiting for.
Chapter 1: Why the Agent World Needs “USB-C”
1.1 The painful reality of tool integration
Before we dive into MCP, let’s take a look at what a world without MCP would look like.
Scenario 1: Database query
Each database has a different connection method: PostgreSQL uses psycopg2, MySQL uses mysql-connector, MongoDB uses pymongo, and Snowflake uses a dedicated SDK. The connection parameters vary: some use URL, some use host + port + database name, and some also require warehouse and schema.
Permission management is even more diverse: some use username and password, some use IAM roles, some use OAuth, and some use certificates.
The agent needs to “know” all these differences, hardcoding the adaptation logic in the code. Every time a database is added, a set of adaptation codes must be added.
Scenario 2: API call
External APIs are called in different ways:
- Authentication method: API Key, OAuth, JWT, HMAC signature
- Parameter passing: JSON body, form data, query string
- Error handling: HTTP status code, JSON error body, custom format
- Rate limiting: requests per second, daily quotas, concurrency limits
The Agent needs bespoke client code for each API. Worse, these APIs get upgraded, deprecated, and change behavior over time, so the adaptation code needs continuous maintenance.
Scenario 3: Documentation and discovery
Suppose you are an Agent and want to know what tools are currently available. Under traditional architecture, you need:
- Look through the code to find definitions for all tools
- Read the comments or documentation to understand the purpose of the tool
- View the function signature to understand parameters and return values
- Guess the behavior of certain edge cases
The process is manual, non-standard, and error-prone.
1.2 MCP Analogy: Unification of USB-C
The value of MCP can be understood by analogy with USB-C.
Before USB-C, electronic devices had all kinds of charging ports: Apple's Lightning, Android's Micro-USB, laptops' barrel connectors, and assorted device-specific plugs. Traveling meant carrying a bag of different cables and chargers.
USB-C provides a unified standard: an interface that simultaneously supports charging, data transmission, and video output. Device manufacturers only need to support USB-C to be compatible with various accessories. Consumers only need a USB-C cable to charge various devices.
MCP plays a similar role in the Agent world:
- Unified interface: Agents and tools communicate using standardized protocols.
- Self-description: The tool automatically declares its capabilities and the Agent automatically discovers them.
- Plug and play: any MCP-compliant tool can be used by any MCP client
1.3 The value of standardized protocols
The value of MCP is not only at the technical level, but also at the economic level.
For Agent Developers:
- No need to write adaptation code for each tool
- Quick access to a large number of ready-made tools
- Low tool switching cost (switching from PostgreSQL to MySQL does not require rewriting code)
For tool developers:
- No need to write adaptation code for each Agent platform
- Implement once, use everywhere
- Can be automatically discovered and increase exposure
For the ecosystem as a whole:
- Lower the integration threshold and spur tool innovation
- Form a network effect: the more Agents support MCP, the more willing tool developers are to implement it; the more tools support MCP, the more willing Agent developers are to adopt it
- Eventually, a standardized marketplace of Agent tools takes shape
Chapter 2: Core design of MCP protocol
2.1 Layered architecture of the protocol
MCP adopts a clear layered design, similar to the layering of network protocols.
Transport layer: defines how messages are transmitted
- stdio: standard input/output, suited to local process communication
- SSE: Server-Sent Events, suited to remote services
- HTTP: plain request/response, broadly compatible
Protocol layer: Define message format
- Based on JSON-RPC 2.0
- Standard message types: Request, Response, Notification
- Error handling and timeout mechanisms
Application layer: Define semantic content
- Tool declaration: tool description, parameters, return value
- Tool calling: how to call and how to pass parameters
- Capability negotiation: Capability exchange between client and server
The advantages of this layering are: the transport layer can be flexibly replaced (switching from a local process to a remote service does not require changing the application layer code), the protocol layer ensures interoperability, and the application layer defines business semantics.
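To make the layering concrete, here is a minimal server sketch using the FastMCP helper from the official Python SDK (the `mcp` package). The tool name and body are illustrative placeholders, not a real production tool; only the decorator pattern and the stdio transport reflect the actual SDK.

```python
# Minimal MCP server sketch using the official Python SDK's FastMCP helper.
# The tool itself is a placeholder; a real tool would do actual work.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-tools")  # server name announced to clients

@mcp.tool()
def query_database(table: str, limit: int = 10) -> str:
    """Query data records from the named table, returning up to `limit` rows."""
    # Placeholder body: a real implementation would query an actual database.
    return f"(pretend result: first {limit} rows of {table})"

if __name__ == "__main__":
    # stdio transport: the client launches this script as a subprocess and
    # exchanges JSON-RPC messages over stdin/stdout.
    mcp.run(transport="stdio")
```

Because the transport is an argument rather than baked into the tool, moving this server from a local process to a remote deployment does not touch the tool code, which is exactly the benefit of the layering described above.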
2.2 Tool life cycle
MCP defines the complete life cycle of the tool: declaration, discovery, invocation, response.
Declaration: the tool provider describes the tool's capabilities through a Schema
- Name: unique tool identifier
- Description: what the tool does (written for the Agent to "read")
- Parameters: schema of the input parameters (types, required fields, constraints)
- Return value: schema of the output result
Discovery: After the Agent connects to the MCP server, it automatically obtains the list of available tools.
Call: Agent selects the appropriate tool based on user intent and passes in parameters.
Response: After the tool is executed, structured results are returned
The key to this process is self-description: the tool's capability description is machine-readable, so the Agent can understand and use it without hand-written documentation.
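The same life cycle seen from the client side, as a hedged sketch built on the official python-sdk: it launches the server from the previous sketch (assumed to be saved as `server.py`), negotiates capabilities, discovers tools, and calls one.

```python
# Client-side sketch of the declare/discover/call/respond cycle.
# Assumes a server like the earlier sketch, saved as server.py.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    params = StdioServerParameters(command="python", args=["server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()            # capability negotiation
            tools = await session.list_tools()    # discovery: machine-readable schemas
            for tool in tools.tools:
                print(tool.name, "-", tool.description)
            result = await session.call_tool(     # invocation with structured args
                "query_database", {"table": "users", "limit": 5}
            )
            print(result)                         # structured response

asyncio.run(main())
```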
2.3 Security model
MCP's design includes a full set of security mechanisms.
Authentication: multiple methods are supported
- API Key: simple key-based authentication
- OAuth 2.0: standard authorization flow
- No authentication: for local development or trusted environments
Authorization: fine-grained permission control
- Which agents can use which tools
- Which users can use which features
- Which operations require additional confirmation
Sandboxing: Environment isolation for tool execution
- Resource limits: CPU, memory, disk usage upper limit
- Network restrictions: whether external networks can be accessed
- Timeout control: prevent long hangs
The design principle of this security model is Security by Default: tool providers define the security policies, and the Agent complies with them automatically at execution time.
Chapter 3: From integration dilemma to ecological prosperity
3.1 Integration mode before MCP
Before MCP, there were several modes of Agent tool integration, each with obvious flaws.
Mode 1: Hard-coded integration
The tool's API is called directly from the Agent code. This is the most common pattern, and the most fragile.
Defects:
- Every new tool requires changes to the Agent code
- Changes in a tool's API break the Agent's functionality
- The tool's error-handling logic is scattered everywhere
Mode 2: Configuration-based integration
Tool invocation is defined in configuration files and loaded dynamically at Agent runtime.
Defects:
- There is no standard configuration format, so Agent platforms are mutually incompatible
- Configuration cannot express complex interaction logic
- A tool's semantic information (such as parameter descriptions) is hard to convey
Mode 3: Plug-in integration
Tools are provided as plug-ins, invoked by the Agent through a plug-in interface.
Defects:
- Every Agent platform has a different plug-in interface
- Plug-in development and maintenance costs are high
- Compatibility between plug-ins is hard to guarantee
3.2 Paradigm changes brought about by MCP
MCP changes the paradigm of Agent tool integration from “adaptation” to “plug and play”.
Adaptation mode: the Agent adapts to each tool
- Agent needs to know how to call each tool
- Agents need to handle the special cases of each tool
- Agent needs to maintain tool-related code
Plug and Play Mode: Tools declare their own standard interfaces
- The tool implements the MCP protocol and declares its capabilities
- Agent communicates with any tool through MCP protocol
- The specific implementation of the tool is transparent to the Agent
The core of this transformation is separation of concerns: Agents focus on “what tools to use to solve what problems”, and tools focus on “how to perform tasks efficiently”.
3.3 The flywheel effect of ecological development
MCP has the potential to set a virtuous ecosystem flywheel in motion.
Phase 1: Infrastructure
- MCP protocol definition and SDK release
- Early adopters (Anthropic Claude, etc.) support MCP
- Basic tools (file system, database, etc.) implement MCP interface
Phase 2: Tool richness
- More tool developers join to implement MCP interface
- Tool marketplaces/registries form to facilitate discovery
- Rapid growth in tool quality and variety
Phase 3: Agent adoption
- Agent developers can easily access a large number of tools
- Rapid expansion of agent capability boundaries
- More scenarios can be solved with Agent
Phase 4: Ecosystem prosperity
- Agents and tools create network effects
- Specialized division of labor emerges: some focus on building Agents, others on building tools
- Mature business models form: tools can charge for usage, Agents can become platforms
Chapter 4: The relationship between MCP and Function Calling
4.1 Differences in positioning between the two
Many people ask: What is the difference between MCP and OpenAI/Claude’s Function Calling?
Function Calling is a capability at the model layer:
- Models can generate structured function call requests
- Defined at the model API level
- It is up to the application developer to implement the specific logic of the function
MCP is an application layer protocol:
- Define communication standards between agents and tools
- Cross-model platform compatibility
- Tools can self-declare and agents can automatically discover
The relationship between the two is not competition, but complementarity.
4.2 Collaboration model
Typical collaboration process:
- User input -> Agent understands intent
- Agent queries the MCP server to obtain a list of available tools
- Agent decides that it needs to call the “query weather” tool
- Agent generates call request through Function Calling
- The MCP client converts the request into the MCP protocol and sends it to the tool server
- The tool server executes and returns the results
- The MCP client returns the result to the Agent
- Agent generates final reply
In this process:
- Function Calling is the ability of the model to generate call requests
- MCP is the interoperability protocol of the tool ecosystem
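As a hedged illustration of this division of labor, the sketch below bridges a model-generated function call into an MCP tool call. The `{name, arguments}` shape follows the common function-calling convention and may differ by vendor; `session` is assumed to be an initialized MCP `ClientSession` like the one in the earlier client sketch.

```python
# Sketch: translate one model-generated function call into an MCP tool call.
# The fc dict shape ({name, arguments}) is a common convention, not a spec.
import json

async def dispatch_function_call(session, fc: dict):
    name = fc["name"]
    args = fc["arguments"]
    if isinstance(args, str):   # some model APIs return arguments as a JSON string
        args = json.loads(args)
    # Function Calling produced the request; MCP executes it.
    return await session.call_tool(name, args)
```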
4.3 Migration and coexistence
Agents that have already implemented Function Calling can be smoothly migrated to MCP.
Migration Strategy:
- Retain the capability layer of Function Calling
- Migrate tool implementation to MCP server
- Add MCP client layer to convert Function Calling request to MCP protocol
Coexistence Strategy:
- Core tools are accessed through MCP
- Special tools retain Function Calling direct connection
- Migrate gradually to reduce risk
Chapter 5: Challenges and Responses in MCP Practice
5.1 The Art of Tool Design
Even with MCP, tool design remains an art. Good tool design multiplies an Agent's effectiveness; poor tool design leaves it at a loss.
Principle 1: Atomicity - the power of single responsibility
Each tool only does one thing. Don’t make “query user orders and send email notification” into one tool, but split it into two tools, “query order” and “send email”, and let the agent decide when to use them together.
Why insist on atomicity?
Composability. Atomic tools are like Lego bricks, which can be combined in different ways to solve different problems. If you make “query orders + send emails” into a tool, then this tool cannot be used when the user only wants to check orders and does not want to send emails. But if you split them into two tools, the Agent can decide whether to use only the first one, only the second one, or both according to the specific situation.
Testability. Atomic tools are easier to test. You can independently test the “query order” function without worrying about interference from email sending; you can also independently test the “send email” function without worrying about database problems. Test coverage is simpler and bug location is faster.
Reusability. Atomic tools can be reused in different scenarios. The “Send Email” tool can not only be used for order notifications, but can also be used in various scenarios such as password reset, marketing push, and system alarms.
Principle 2: Self-descriptive - let the agent truly understand the tool
The tool's name and description are written for the Agent to "read", not for humans. Describe the function in a way the Agent can understand.
Common description mistakes:
Too technical:
- Bad name: "execute_sql", with a description like "Call the database API to get data"
- Good name: "query_database", with a description like "Query data records matching given conditions from the database"
Too vague:
- Bad description: "Handles user requests"
- Good description: "Gets user details, including name, contact information, and account status, by user ID"
Leaking implementation details:
- Bad description: "Queries the orders table using the REST API"
- Good description: "Gets the order list for a specified user, with optional filtering by time range and order status"
A good tool description should answer three questions: What does this tool do? What information does it require? What result does it return?
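Here is what those three questions can look like in a tool declaration, sketched in FastMCP style; the field names and the order domain are made up for illustration.

```python
# A tool whose name, description, and docstring answer all three questions.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("orders")

@mcp.tool()
def get_user_orders(user_id: str, start_date: str | None = None,
                    status: str | None = None) -> list[dict]:
    """Get the order list for a given user.

    Requires: user_id. Optional filters: start_date (ISO 8601) and
    status ("pending", "shipped", "completed").
    Returns: order records with id, amount, status, and created_at fields.
    """
    # Stub: a real implementation would query the order store here.
    return []
```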
Principle 3: Idempotence - Guarantee of safe retry
The same input should produce the same result. This way the Agent can retry safely without worrying about side effects.
Why is idempotence especially important in Agent systems?
Agent systems are non-deterministic: tool calls may fail, or succeed without the Agent seeing a proper result. The Agent must be able to retry failed calls safely. If a tool is not idempotent, retries can cause duplicate operations: double charges, duplicate emails, duplicate records.
Ways to achieve idempotence:
Unique identifier: generate a unique ID for each operation; the system uses the ID to determine whether it has already been processed.
Status check: check the current state before acting; if the target state has already been reached, return success directly.
Optimistic locking: when updating data, check that the data version matches to guard against concurrent modification.
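A minimal sketch of the unique-identifier approach: the caller supplies an idempotency key, and a replayed call returns the recorded result instead of repeating the side effect. The in-memory dict stands in for what would be a shared store (e.g., Redis) in production.

```python
# Idempotency-key sketch: same key in, same result out, side effect runs once.
_processed: dict[str, dict] = {}  # production: a shared, persistent store

def send_payment(idempotency_key: str, account: str, amount: float) -> dict:
    if idempotency_key in _processed:
        return _processed[idempotency_key]     # safe replay: no second charge
    # ... perform the real side effect exactly once here ...
    result = {"status": "ok", "account": account, "amount": amount}
    _processed[idempotency_key] = result
    return result
```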
Principle 4: Defensive design - assume the Agent will make mistakes
When designing a tool, assume the Agent may pass in wrong parameters, and guard against it.
Parameter validation: check that required parameters are present, types are correct, and values fall within reasonable ranges.
Default values: provide sensible defaults for optional parameters to reduce the Agent's decision burden.
Error messages: when a parameter is wrong, return clear, actionable error information that helps the Agent correct itself.
Boundary handling: handle edge cases such as empty results, oversized results, and special characters.
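A small sketch of these four habits combined; the table allow-list and the limit bounds are illustrative choices, not MCP requirements.

```python
# Defensive tool sketch: validate early, default sensibly, return
# actionable errors, and treat an empty result as a normal outcome.
def query_database(table: str, limit: int = 10) -> dict:
    ALLOWED_TABLES = {"users", "orders"}           # illustrative allow-list
    if table not in ALLOWED_TABLES:
        return {"error": f"Unknown table '{table}'. "
                         f"Valid values: {sorted(ALLOWED_TABLES)}"}
    if not 1 <= limit <= 1000:
        return {"error": "limit must be between 1 and 1000"}
    rows: list[dict] = []                          # stub: real query goes here
    return {"rows": rows, "count": len(rows)}      # empty result, not an error
```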
Principle 5: Contextual awareness - let the tool understand the environment
A good tool should be context-aware and adapt its behavior to the environment.
For example, a “send notification” tool should be able to:
- Select notification channel (email, SMS, App push) according to user preference
- Choose the right sending time based on time (avoid late night interruptions)
- Choose the appropriate message format according to the length of the content (email for long content, text message for short content)
This context-awareness can be passed through parameters or implemented through state management within the tool.
5.2 Performance optimization: the cost and balance of an abstraction layer
MCP introduces an additional communication layer, which inevitably brings performance overhead. The key is finding a balance between flexibility and efficiency.
Understand the sources of performance overhead
Serialization cost: MCP uses JSON as the message format. Each call needs to serialize the parameters into JSON and deserialize when returning the result. This adds additional CPU overhead compared to direct function calls.
Network Latency: If the MCP server is remote, network round trip time (RTT) can become a bottleneck. A single tool call can require tens to hundreds of milliseconds of network latency.
Connection establishment: If there is no connection pool, each call needs to establish a new connection, which will become a serious performance problem in high concurrency scenarios.
Protocol processing: MCP's message routing, error handling, and timeout management improve reliability, but they also add processing overhead.
Optimization strategies in detail
Connection Pooling: The Art of Reuse
Implementing an MCP connection pool needs to consider:
- Pool Size: Set an appropriate pool size based on concurrency requirements. Too small will cause waiting, too large will waste resources.
- Health Check: Regularly check whether the connection is available and remove failed connections in a timely manner.
- Timeout Management: Set reasonable connection timeout and idle timeout to prevent resource leakage.
- Load Balancing: If there are multiple MCP servers, load balancing needs to be achieved at the pool level.
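A generic pool sketch along these lines (not tied to any specific MCP client API): bounded size, lazy growth, and a health check at checkout. `make_conn` and `conn.is_alive()` are assumed interfaces you would adapt to your transport.

```python
# Generic async connection-pool sketch: bounded, lazily grown, health-checked.
import asyncio

class ConnectionPool:
    def __init__(self, make_conn, size: int = 8):
        self._make_conn = make_conn                 # async factory (assumed)
        self._idle: asyncio.Queue = asyncio.Queue(maxsize=size)
        self._size = size
        self._created = 0

    async def acquire(self):
        while True:
            if self._idle.empty() and self._created < self._size:
                self._created += 1
                return await self._make_conn()      # grow lazily up to size
            conn = await self._idle.get()           # otherwise wait for an idle one
            if conn.is_alive():                     # health check on checkout (assumed)
                return conn
            self._created -= 1                      # discard dead connection, loop

    async def release(self, conn):
        await self._idle.put(conn)                  # return for reuse
```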
Caching: trading space for time
Caching can significantly reduce duplicate calls:
- Tool metadata caching: Schema declarations of tools usually do not change frequently and can be cached for a long time.
- Result Caching: For idempotent query tools, results can be cached to avoid repeated execution.
- Intelligent caching strategy: Design different caching strategies (TTL, LRU, etc.) based on tool characteristics and parameter characteristics.
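A compact sketch of a TTL result cache keyed on the tool name plus canonicalized arguments. This is only safe for idempotent, read-only tools; eviction policy and size bounds are omitted for brevity.

```python
# TTL result-cache sketch for idempotent query tools.
import json
import time

_cache: dict[str, tuple[float, object]] = {}

def cached_call(tool_name: str, args: dict, invoke, ttl: float = 60.0):
    """invoke(tool_name, args) is the real caller; results live for ttl seconds."""
    key = tool_name + ":" + json.dumps(args, sort_keys=True)   # canonical key
    hit = _cache.get(key)
    now = time.monotonic()
    if hit is not None and now - hit[0] < ttl:
        return hit[1]                      # fresh hit: skip the round trip
    result = invoke(tool_name, args)
    _cache[key] = (now, result)
    return result
```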
Batch Processing: Reduce Round Trips
If the Agent needs to call multiple tools continuously, consider:
- Batch call: One request contains multiple tool calls, reducing network round-trips.
- Preloading: Predict the data that may be needed, query and cache it in advance.
- Parallel Calls: Tool calls without dependencies can be executed in parallel.
Localized deployment: eliminate network delays
For frequently called tools:
- Local MCP Server: Deploy the tool on the same machine or network as the Agent to eliminate network delays.
- Edge deployment: Deploy tools closest to users to reduce transmission delays.
Performance degradation strategy
In extreme cases, performance degradation needs to be considered:
- Direct call mode: In performance-sensitive scenarios, it is allowed to directly call tools bypassing MCP.
- Asynchronous processing: Non-critical operations can be executed asynchronously without blocking the main process.
- Downgraded results: When the tool call times out, return cached data or default values.
Key indicators for performance monitoring
Establish a complete performance monitoring system:
- End-to-end latency: the full time from the Agent initiating a call to receiving the result.
- Tool execution latency: pure execution time, with network transfer excluded.
- Success Rate: The proportion of successful tool calls.
- Retry rate: The proportion of calls that need to be retried.
- Queue Depth: Number of calls waiting to be executed.
5.3 Error handling - the philosophy of graceful failure
Tool calls can fail, this is an unavoidable reality in a production environment. MCP defines a standard error format, but how to handle errors still needs to be carefully designed.
The art of error classification
Not all errors should be treated equally. Proper error classification is key to designing robust systems.
Retryable errors: usually transient problems; a retry may succeed
- Network timeouts
- Service temporarily unavailable
- Rate limit triggered
- Connection interrupted
Handling strategy: retry with exponential backoff, cap the number of retries, and treat the error as non-retryable once the cap is exceeded (see the sketch below).
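A sketch of that strategy; which exception types count as retryable is an assumption that should match your client's actual errors, and the jitter guards against synchronized retries.

```python
# Exponential backoff with a retry cap and small random jitter.
import random
import time

def call_with_backoff(fn, max_retries: int = 4, base: float = 0.5):
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except (TimeoutError, ConnectionError):    # assumed retryable classes
            if attempt == max_retries:
                raise                              # cap exceeded: now non-retryable
            delay = base * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)                      # 0.5s, 1s, 2s, 4s (plus jitter)
```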
Non-retryable errors: usually logic problems; retrying will not succeed
- Insufficient permissions
- Invalid parameters
- Resource does not exist
- Business rule violations
Handling strategy: return the error immediately, do not retry, and let the Agent decide how to handle it.
Partial Success: The operation is partially completed and requires special handling
- Batch operation partially successful
- Multi-step operation partially completed
- Data part updated
Processing strategy: Return detailed operation results, allowing the Agent to understand which ones succeeded and which ones failed, and decide whether compensation operations are needed.
Readability of error messages
An error message is not only for the system; it is also for the Agent to "read".
A bad error message:
Error code: 500
Internal server error
A good error message:
Tool call failed: database connection timeout.
Possible cause: database load is too high or the network is unstable.
Suggested action: retry after 30 seconds, or ask the database administrator to check database health.
A good error message should state:
- What error occurred
- Why it happened
- How to resolve or work around it
- Whether manual intervention is required
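One way to make every error answer those four questions is a fixed payload shape. The field names below are a local convention for illustration, not part of the MCP specification.

```python
# Sketch: an error payload that answers what / why / how / escalate.
def make_tool_error(what: str, why: str, suggestion: str,
                    needs_human: bool = False) -> dict:
    return {
        "error": what,              # what error occurred
        "cause": why,               # why it happened
        "suggestion": suggestion,   # how to resolve or work around it
        "needs_human": needs_human, # whether manual intervention is required
    }

err = make_tool_error(
    what="database connection timeout",
    why="database load too high or network unstable",
    suggestion="retry after 30 seconds, or ask a DBA to check database health",
)
```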
Downgrade plan design
When a tool fails, are there alternatives?
Primary/standby failover:
- When the primary database fails, switch to the standby database
- When the primary API fails, switch to the backup API
- Note: the fallback's data may not be fully up to date
Function downgrade:
- When real-time data query fails, cached data is returned
- When complex analysis fails, simplified analysis is returned
- When multi-source data fails, data from available data sources is returned.
Manual intervention:
- When key operations fail, manual processing is performed
- Record the failure context to facilitate manual takeover
- Provide convenient manual intervention interface
User Feedback Strategy
Should the user be told that a tool call failed? It is a matter of trade-offs.
Transparency:
- Tell the user what went wrong
- Describe the remediation being taken
- Offer alternatives or suggestions
Silent handling:
- The system falls back to the degraded path without the user noticing
- Errors are logged and alerted on in the background
- Suitable when surfacing the error would hurt the user experience more than hiding it
Mixed Processing:
- Determine notification strategy based on error type
- Critical errors must be communicated to the user
- Minor errors can be handled silently
Best Practices for Error Handling
Fail fast: if an error is unrecoverable, fail as quickly as possible instead of retrying endlessly.
Graceful Downgrade: Always have a Plan B to ensure that the system can still provide services in the event of partial failure.
Context retention: Preserve complete context information during error propagation to facilitate problem diagnosis.
User First: The primary goal of error handling is to protect the user experience, not mask the problem.
5.4 Security boundaries: the eternal game between convenience and security
MCP provides a security mechanism, but you still need to be careful how to configure it. Security design requires finding a balance between convenience and security.
Practice of the principle of least privilege
Tool Level Permissions:
- Only open necessary tools to the Agent
- Regularly audit tool usage and remove unused tools
- Assign different tool permissions based on the Agent’s role
Operation Level Permissions:
- Distinguish between read-only operations and write operations
- Sensitive operations (deletion, transfer, configuration modification) require additional authorization
- Set upper limits for batch operations to prevent accidental large-scale changes
Data Level Permissions:
- Limit the range of data that the Agent can access
- Mask or redact sensitive data
- Restrict data access based on user identity
Sensitive operation confirmation mechanism
Which operations require additional confirmation?
Financial related:
- Any operation involving funds
- Operations where the amount exceeds the threshold
- Transfer to new payee
Data security related:
- Deletion of data
- Modify key configuration operations
- Batch data export
Compliance related:
- Operations involving personal privacy information
- Data access across data boundaries
- Operations that may violate regulations
Confirmation mechanism design:
- Explicit Confirm: Ask the user to explicitly enter “confirm” or click the confirm button
- Two-step verification: Second-step verification through SMS, email, etc.
- Delayed execution: Delayed execution of sensitive operations, giving the user a time window for cancellation
- Manual review: Key operations are submitted to manual review and executed only after passing
Construction of audit log
Comprehensive audit logs are the basis for post-event tracing and problem diagnosis.
What to record:
- Who (which Agent / which user)
- When
- Which tool was called
- Which parameters were passed in
- What result was returned
- How long execution took
- Whether it succeeded
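A sketch of a structured record covering those fields, written as JSON lines, which is one common storage choice; none of this layout is mandated by MCP.

```python
# Structured audit-record sketch; arguments should be redacted before logging.
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class AuditRecord:
    agent_id: str
    user_id: str
    tool: str
    arguments: dict       # desensitize sensitive fields before writing
    result_summary: str
    duration_ms: float
    success: bool
    timestamp: float

def write_audit(record: AuditRecord, path: str = "audit.jsonl"):
    with open(path, "a") as f:
        f.write(json.dumps(asdict(record)) + "\n")   # one JSON object per line

write_audit(AuditRecord("agent-7", "u123", "query_database",
                        {"table": "users"}, "5 rows", 42.0, True, time.time()))
```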
Log storage strategy:
- Structured storage for easy query and analysis
- Set reasonable retention periods to balance storage costs and audit needs
- Desensitize sensitive information to prevent log leaks from causing security issues
Log Analysis:
- Real-time monitoring of abnormal calling patterns
- Regularly analyze tool usage trends
- Identify potential security threats
Rate limiting and abuse protection
Preventing tool abuse is a must in production environments.
Multi-dimensional rate limiting:
- Per Agent: a cap on each Agent's call frequency
- Per tool: a cap on each tool's concurrent calls
- Per user: a call quota for each user
- Global: overall capacity protection for the system
Rate-limiting strategies (token bucket sketched below):
- Token bucket: smooths traffic while allowing limited bursts
- Leaky bucket: strictly controls the output rate
- Sliding window: precisely bounds the number of calls within a time window
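The token-bucket sketch referenced above: tokens refill continuously at `rate` per second up to `capacity`, so short bursts pass while sustained throughput stays bounded. A per-Agent limiter is then just a mapping from Agent IDs to buckets.

```python
# Token-bucket sketch: allows bursts up to capacity, bounds the average rate.
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True       # admit this call
        return False          # bucket empty: reject or queue the call
```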
Abuse detection:
- Identify abnormal calling patterns (e.g., bursts of calls within a short window)
- Monitor tool-call success rates; a sudden drop may signal an attack
- Maintain a blacklist mechanism to block abusive sources
Circuit breaker mechanism
When a tool keeps failing, a circuit breaker should protect against cascading failures.
Trip conditions:
- Error rate exceeds a threshold (e.g., 50%)
- Consecutive failures exceed a threshold
- Response time exceeds a threshold
Behavior while open:
- Return the error directly and stop calling the tool
- Switch to a backup plan
- Notify the operations team
Recovery:
- Periodically enter a half-open state to probe whether the service has recovered
- Close the breaker automatically once the service is healthy again
- Record breaker events to support root-cause analysis
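A sketch of that trip / open / half-open / close cycle; the thresholds are illustrative defaults, not recommendations from the MCP spec.

```python
# Circuit-breaker sketch: trips after repeated failures, probes after a cooldown.
import time

class CircuitBreaker:
    def __init__(self, fail_threshold: int = 5, reset_after: float = 30.0):
        self.fail_threshold = fail_threshold
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at: float | None = None

    def call(self, fn):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: tool temporarily disabled")
            # cooldown elapsed: half-open, let one probe call through
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.fail_threshold:
                self.opened_at = time.monotonic()   # trip the breaker
            raise
        self.failures = 0                           # success: reset and close
        self.opened_at = None
        return result
```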
Chapter 6: Future Prospects of MCP
6.1 Protocol evolution direction
MCP is still developing rapidly and may add:
- Resource Subscription: Support real-time data push
- Streaming Response: Supports long-running tools returning results incrementally
- Multi-modal: Supports non-text content such as images and audio
6.2 Ecosystem building
Key building blocks of the MCP ecosystem:
- Official tool library: high-quality tools provided by Anthropic and ecosystem partners
- Tool marketplace: a discovery and distribution platform for third-party tools
- Certification system: security and quality certification for tools
6.3 Possibility of industry standards
MCP has the potential to become the de facto standard for agent tools:
- The technical design is reasonable and solves real pain points.
- Have strong promoters (Anthropic)
- Open source, community participation
But whether it becomes a true industry standard also depends on:
- Whether other major vendors follow (OpenAI, Google, etc.)
- Whether the ecosystem forms a network effect
- Validation in real production environments
Chapter 7: Suggestions for Practitioners - Action Guide for MCP Implementation
7.1 Decision-making framework for the initial stage
When you decide to adopt MCP, the following decision framework can help you make an informed choice.
Phase 1: Assessment Phase (1-2 weeks)
**Is it suitable for MCP?**
Ask yourself the following questions:
- How many external tools does your Agent need to call? (Fewer than 3 is probably not worth it)
- Will these tools be reused by multiple Agents? (The more reuse, the greater the value)
- Are the tools' interfaces stable? (Frequent changes make MCP's decoupling more valuable)
- Does the team have the capacity to maintain the protocol layer? (MCP requires extra development and operations investment)
**MCP or other solutions?**
Compare other integration solutions:
- Direct call: Suitable for scenarios with a small number of tools, simple interfaces, and infrequent changes
- Configurable integration: Suitable for scenarios with a medium number of tools and limited team technical capabilities
- MCP: Suitable for scenarios with a large number of tools that need to be shared across teams and maintained for a long time.
Phase 2: Pilot phase (2-4 weeks)
Select pilot tool:
- Choose 1-2 most commonly used tools to pilot
- Give priority to tools with relatively stable interfaces and high frequency of use.
- Avoid selecting business-critical tools as first pilots
Verify value:
- Comparing the development costs of MCP integration versus direct integration
- Test whether the performance of MCP integration meets the requirements
- Collect feedback from developers on MCP development experience
Phase 3: Promotion phase (1-3 months)
Gradual migration:
- Adapt tools and strategies based on pilot experience
- Migrate remaining tools in batches to avoid a one-time overhaul
- Keep the old and new solutions running in parallel for a period of time
Establish specifications:
- Formulate MCP tool development specifications
- Establish tool registration and discovery processes
- Training team members
7.2 In-depth comparison between MCP and Function Calling
Many people are confused about the relationship between MCP and Function Calling. Let’s compare these two concepts in depth.
Positioning and abstraction levels
Function Calling:
- Level: Model capability layer
- Function: Allow the model to generate structured function call requests
- Scope: Defined at the model API level, it is the “language capability” for the model to interact with the external world.
MCP:
- Level: Application layer protocol
- Role: Define the communication standard between Agent and tool
- Scope: Cross-model platform, the “interoperability protocol” of the tool ecosystem
Analogy understanding:
- Function Calling is “the ability to speak”
- MCP is “Content and Format of Speech”
Without Function Calling, the Agent does not know how to "speak" at all; without MCP, the Agent and its tools have no shared language in which to understand each other.
Technical implementation comparison
| Dimension | Function Calling | MCP |
|---|---|---|
| Protocol format | Vendor-specific | Standardized JSON-RPC 2.0 |
| Tool discovery | Hard-coded at the application layer | Automatic discovery (list_tools) |
| Tool description | Defined at the application layer | Self-described by the tool (Schema) |
| Transport | Direct function call | stdio / SSE / HTTP |
| Security model | Implemented by the application | Built-in authentication and authorization |
| Cross-platform | Tied to a specific model | Compatible across model platforms |
Detailed explanation of collaboration mode
Typical collaboration process:
- Intent Understanding: User input → Agent understands the intent
- Tool Selection: The Agent decides which tool to call based on the intent and the list of available tools.
- Call generation: Generate structured call requests through Function Calling
- Protocol conversion: MCP client converts Function Calling request into MCP protocol format
- Service call: MCP server receives the request and calls specific tools
- Result Return: The tool execution result is returned through the MCP protocol
- Response generation: Agent generates final response based on the results
In this process:
- Function Calling is responsible for “generating calls”
- MCP is responsible for "executing the call"
The relationship between the two is upstream and downstream, not a substitution relationship.
Migration Strategy and Coexistence Mode
For Agents that have already implemented Function Calling:
Smooth migration strategy:
- Keep the capability layer of Function Calling unchanged.
- Migrate tool implementation to MCP server
- Add MCP client layer to convert Function Calling request to MCP protocol
- Gradually migrate tools to maintain compatibility
Hybrid Architecture:
- New tools are accessed via MCP
- Legacy tools retain direct connection to Function Calling
- Unified calling through adapter
This hybrid architecture is useful during transition periods and allows for gradual evolution without a one-time overhaul.
Selection guidance
When to use Function Calling alone:
- Few tools (fewer than 5)
- Tools change infrequently
- Rapid prototyping
- Internal use only, with no external sharing
When to use MCP:
- Large number of tools (more than 10)
- Tools need to be reused by multiple agents
- Tools need to be provided externally
- Long-term maintenance system
Mixed Use:
- Core tools are accessed through MCP
- Special tools retain Function Calling direct connection
- Mask differences through unified interface layer
7.3 Team capability building: the skills model of the MCP era
The introduction of MCP is not only a technical choice, but also a challenge to team capabilities.
Three types of key roles
1. MCP Architect
Responsible for the overall design and evolution of the MCP system.
Core Competencies:
- Understand the underlying principles of the MCP protocol
- Ability to design scalable MCP architectures
- Has security design skills
- Knows how to optimize performance
Main Responsibilities:
- Formulate MCP development specifications
- Classification and organization of design tools
- Evaluate and introduce new MCP tools
- Solve complex integration problems
2. Tool Developer
Responsible for packaging existing services into MCP tools.
Core Competencies:
- Familiar with MCP SDK and protocol details
- Have API design and packaging capabilities
- Understand Schema definition and validation
- Have error handling and logging capabilities
Main Responsibilities:
- Implement MCP tool interface
- Write tool documentation and examples
- Maintain tool versions and compatibility
- Handling tool-related bugs
3. MCP Operations Engineer
Responsible for the stable operation of the MCP system.
Core Competencies:
- Familiar with MCP deployment and monitoring
- Ability to diagnose and recover from faults
- Understand performance tuning methods
- Have security audit capabilities
Main Responsibilities:
- Deploy and maintain MCP servers
- Monitor MCP system health status
- Handling MCP related faults
- Conduct regular security audits
Skill Development Suggestions
Theoretical Learning:
- Read the MCP protocol specification in depth
- Study official examples and best practices
- Learn the JSON-RPC 2.0 protocol
Practical Training:
- Start with simple tool packaging
- Participate in the MCP open source project
- Establish internal MCP tools marketplace
Community Engagement:
- Join the MCP Developer Community
- Share experiences and lessons learned from pitfalls
- Contribute tools and tool libraries
7.4 Common pitfalls and avoidance strategies
Trap 1: Over-instrumentation
Symptom: every small function gets wrapped as a tool, and the tool count explodes.
Consequences:
- The Agent struggles to choose among tools
- Tool management costs rise
- Long call chains hurt performance
Avoidance:
- Follow the atomicity principle, but temper it with practicality
- Regularly review whether each tool is needed; merge or remove redundant ones
- Build a tool classification and tagging system
Trap 2: Ignoring backward compatibility
Symptom: a tool upgrade changes the interface directly, breaking every Agent that depends on it.
Consequences:
- Production failures
- Emergency rollbacks
- Damaged trust between teams
Avoidance:
- Follow semantic versioning
- Keep interface changes backward compatible
- Use incremental migration for breaking changes
Trap 3: Missing security design
Symptom: the team focuses only on functionality and ignores security.
Consequences:
- Data breaches
- Unauthorized access
- The system gets attacked
Avoidance:
- Design for security up front
- Run regular security audits
- Establish a security response process
Trap 4: Neglecting performance
Symptom: development focuses only on features, with no performance testing.
Consequences:
- Performance falls short after launch
- User experience degrades
- Massive refactoring is required
Avoidance:
- Make performance testing part of the development process
- Establish performance baselines and monitor against them
- Keep performance optimization in mind during design
Trap 5: Missing monitoring
Symptom: the MCP system runs, but nobody is watching it.
Consequences:
- Problems are discovered late
- Troubleshooting is difficult
- Continuous optimization is impossible
Avoidance:
- Build a comprehensive monitoring system
- Set sensible alert thresholds
- Run regular performance analyses
Appendix: Three real pitfall cases from MCP practice
Case 1: The “standard” but incompatible MCP implementation
Background: We implemented a database query tool strictly according to the MCP protocol specification and confidently published it to the internal tool marketplace. An Agent developer on another team integrated against it per the MCP spec, only to find it unusable.
Symptom: the connection succeeded and the tool was discovered, but every call failed with a "parameter format error".
Troubleshooting:
After two days of investigation, we traced the problem to JSON Schema parsing:
- Our tool used a lenient JSON Schema validator that tolerated certain "approximate" matches
- The consuming Agent used a strict validator that required exact Schema matches
- The MCP protocol defines the standard, but implementations differ in the details
The deeper problem: some fields of the MCP protocol are loosely specified, and different implementations interpret them differently. For example, should the "description" field be plain text or support Markdown? How are "required" fields inherited in nested objects? The spec does not spell out these details.
Solution:
- Conservative Implementation: Implement the protocol according to the strictest interpretation, ensuring compatibility with any compliant client
- Clear Documentation: Clearly state implementation details in tool documentation, especially those related to protocol ambiguities
- Compatibility Test: Compatibility test with mainstream MCP clients
- Version Lock: Clearly declare the supported MCP protocol version to avoid version confusion
After these changes, compatibility issues dropped significantly. But the episode exposed a reality: implementations of a so-called "standard" still diverge in practice.
Lesson: Protocol standards are the starting point, not the end point. Actual interoperability requires more testing and coordination.
Case 2: The performance nightmare MCP call chain
Background: We transformed multiple tools into MCP interfaces, and Agent can call all tools through a unified MCP client. The architecture looks elegant.
Symptom: after launch, the Agent's response time rose from 2 seconds to 8 seconds, and the user experience deteriorated badly.
Troubleshooting:
In-depth analysis located the performance bottlenecks:
- Every tool call established a fresh MCP connection (we had not implemented a connection pool)
- MCP's message serialization/deserialization overhead was three times that of direct API calls
- Passing data between tools required multiple rounds of encoding and decoding
The deeper problem: MCP's abstraction layer buys flexibility, but it also costs overhead. When tools are called frequently, these costs accumulate into a serious problem.
Solution:
- Connection Pool: Implement MCP connection pool and reuse connections instead of creating new ones every time
- Batch call: Batch tool calls as much as possible to reduce the number of round trips
- Local cache: Cache tool metadata to avoid repeated queries
- Performance degradation: In performance-sensitive scenarios, direct calls are allowed to bypass MCP.
After these changes, response time dropped to 3.5 seconds: still slower than direct calls, but within an acceptable range.
Lesson: Abstraction has a cost. In performance-sensitive scenarios, flexibility and efficiency need to be weighed.
Case 3: The abused MCP tool
Background: We open the company’s core database query tool to multiple agents through MCP. The original intention was to improve the standardization of data access.
Symptom: one month after launch, database CPU usage soared, and some queries locked up the database.
Troubleshooting:
The investigation found:
- One Agent called the database query tool in a tight loop without caching results
- Another Agent generated complex SQL without limiting the result size, pulling millions of rows in a single query
- A third Agent put no cap on call frequency under concurrency, exhausting the database connection pool
The deeper problem: MCP makes tools easy to use, and just as easy to abuse. Because an Agent generates calling parameters dynamically, traditional rate-limiting and protection mechanisms struggle to take effect.
Solution:
- Call rate limiting: enforce call-frequency limits at the MCP server layer
- Cost quotas: assign each Agent a query-cost quota; exceeding it requires an application for more
- Query review: statically analyze generated SQL and intercept dangerous queries
- Audit logging: record every tool call and review regularly for abnormal patterns
- Circuit breaking: automatically reject new query requests when database load is too high
After implementation, the database load returned to stability. But this made us realize: MCP’s security model needs a more rigorous design.
Lesson: Convenience and security are often at odds. While lowering the threshold for use, safety control must be strengthened.
Conclusion: Standardization is the prerequisite for scale
Back to the integration nightmare at the start of this article: had MCP been widespread at the time:
- The database would expose an MCP interface, and we would not have written connection-pool management code
- The mail service would expose an MCP interface, and we would not have wrestled with templates and rate limits
- The work-order system would expose an MCP interface, and we would not have chewed through incomplete documentation
Integration work would shrink from "writing thousands of lines of adaptation code" to "configuring a few MCP connections".
**The value of MCP lies not in new capabilities it creates, but in how far it lowers the threshold for integration.**
In the history of software development, standardized protocols are often the starting point for ecological prosperity:
- HTTP let web applications talk to each other
- REST unified API design
- USB made peripherals plug-and-play
MCP is expected to become a similar catalyst for the Agent ecosystem:
- Lower the barriers to tool development and integration
- Promote innovation in Agent applications
- Form a healthy tool market
For Agent developers: Embracing MCP means being able to access a rich tool ecosystem and focus on the intelligence of the Agent itself.
For tool developers: Embracing MCP means developing once and using it everywhere, expanding the influence of tools.
For the entire ecosystem: MCP may be a key step for Agent to move from “proof of concept” to “scale application”.
Standardization is never an end, but a means. The real goal is to enable Agent technology to serve more people, solve more problems, and create greater value.
MCP may not be perfect, but it takes an important step.
Reference resources
MCP official resources:
- MCP official documentation: https://modelcontextprotocol.io
- MCP GitHub: https://github.com/modelcontextprotocol
- Python SDK: https://github.com/modelcontextprotocol/python-sdk
*This article is an original practice summary, written based on personal project experience.*
Last updated: 2026-03-12