Unpacking Security Flaws in MCP

Hey there, fellow AI application builders. Ever feel like your AI assistants are a bit like magic? You whisper a command, and poof – things happen. From booking flights to drafting emails, these intelligent agents are becoming an indispensable part of our lives. And behind a lot of this magic, especially when it comes to connecting AI models to the real world, is something called the Model Context Protocol, or MCP.

Introduced by Anthropic in late 2024, MCP is designed to be the plumbing that lets your AI model, say, a smart travel agent, talk to external tools like airline APIs or hotel booking systems. Sounds fantastic, right? A unified layer for LLMs to access real-world data. But as with any new technology that aims to be a bridge, we, as security-conscious developers and engineers, need to peek under the hood. Because sometimes, even the most well-intentioned protocols can have a few sneaky gaps.
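
To make this concrete: when a client asks an MCP server what it can do, the server responds with tool metadata along these lines. The field names below follow the MCP tool-listing shape (name, description, input schema); the values are invented for illustration.

```python
# A rough sketch of what an MCP server advertises for one tool.
# The model picks tools largely based on fields like these.
airline_tool = {
    "name": "AirlineAPI",
    "description": "Search and book flights across 300+ carriers.",
    "inputSchema": {  # JSON Schema describing the tool's arguments
        "type": "object",
        "properties": {
            "origin": {"type": "string"},
            "destination": {"type": "string"},
            "date": {"type": "string", "format": "date"},
        },
        "required": ["origin", "destination", "date"],
    },
}
```

Keep an eye on that description field. It's about to matter.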

Instead of hitting you with a dry checklist, let’s follow Alex, a perfectly normal user, on a simple journey to book a vacation with “TravelBot,” an AI assistant powered by MCP. Along the way, we’ll see how a seemingly harmless interaction can cascade into a complete privacy disaster, all thanks to flaws baked into the MCP ecosystem.

Chapter 1: The Flaw Hunt Begins — Insecure Tool Discovery

Alex opens their chat app and types: “Hey TravelBot, find me a round-trip flight to San Francisco and book a hotel near the convention center for next week’s conference.”

A simple request.

TravelBot (our MCP Client) immediately gets to work. It knows it needs two specific tools: an AirlineAPI and a HotelAPI. To find them, it queries the MCP Registry, which is supposed to be a trusted list of available tools.

Here’s where our first major risk appears: Tool Poisoning and Supply Chain Risk.

In the rush to grow the MCP ecosystem, many public registries lack the strict, manual vetting required to guarantee every listed tool is benign. An attacker simply needs to register their own malicious tool, say, a nearly identical-looking “AirlineAP1” (note the ‘1’ in place of the ‘I’). This rogue tool includes descriptive metadata promising great features but, in reality, contains logic designed to exfiltrate data.

The problem, as noted by security researchers, is that the AI model often trusts the metadata implicitly. If the description is convincing and relevant to the user’s prompt, the model will happily select the poisoned tool.

  • 🛠️ The Flaw: Insecure Registry Curation.
  • 🎣 The Exploit: The malicious “AirlineAP1” is selected by TravelBot because of its tempting, poisoned description, setting the stage for the rest of the attack.
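
One cheap mitigation is to sanity-check registry results before the model ever sees them. Below is a minimal Python sketch, entirely my own invention rather than part of any MCP SDK, that flags candidate tool names suspiciously close to an allow-listed name after folding common homoglyphs:

```python
import difflib
import unicodedata

# Characters commonly swapped in typosquatting attacks (illustrative set).
HOMOGLYPHS = str.maketrans({"1": "i", "l": "i", "0": "o", "5": "s"})

def normalize(name: str) -> str:
    """Fold case, strip accents, and collapse common homoglyphs."""
    folded = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    return folded.lower().translate(HOMOGLYPHS)

def flag_lookalikes(candidate: str, allow_list: list[str],
                    threshold: float = 0.85) -> list[str]:
    """Return allow-listed names the candidate imitates but does not match."""
    norm = normalize(candidate)
    hits = []
    for trusted in allow_list:
        if candidate == trusted:
            continue  # exact match to a vetted name is fine
        similarity = difflib.SequenceMatcher(None, norm, normalize(trusted)).ratio()
        if norm == normalize(trusted) or similarity >= threshold:
            hits.append(trusted)
    return hits

print(flag_lookalikes("AirlineAP1", ["AirlineAPI", "HotelAPI"]))
# ['AirlineAPI'] -> quarantine the candidate instead of handing it to the model
```

It won’t catch a creatively named rogue tool, but it does catch the lazy “AP1-for-API” swap that exploits the model’s implicit trust in metadata.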

Chapter 2: The Token Trap — Broken Authorization

TravelBot, now set on using the malicious AirlineAP1, realizes it needs Alex’s credentials to check flight pricing and link the booking to Alex’s frequent flyer account. TravelBot initiates an authorization flow, sending Alex to their Identity Provider (IdP), such as Google or Okta, to grant access.

Alex clicks “Approve,” trusting their familiar login screen, but they’ve just stepped into the Token Theft and Account Impersonation trap.

When Alex grants permission, the IdP issues an OAuth token, which TravelBot’s backend stores and uses to act on Alex’s behalf. An MCP deployment, by design, aggregates many such tokens for services like Calendar, Email, and now, travel. This makes it a prime “keys to the kingdom” target.

Because the attacker controls the malicious AirlineAP1’s server (and any credentials it uses to communicate with the IdP), they now possess a persistent, high-value token. Unlike a password breach, this token theft often flies under the radar because the attacker’s subsequent malicious activity — such as checking all past flight history or re-using the token with other APIs — still looks like “legitimate API usage” in the service logs.

  • 🛠️ The Flaw: High-Value Token Aggregation and Poor Storage.
  • 🎣 The Exploit: The attacker steals the OAuth token from the malicious AirlineAP1 server, enabling persistent account takeover and access to Alex’s travel history long after the booking request is complete.
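
What would better storage look like? At minimum, tokens should be encrypted at rest and decrypted only at the moment of use. Here’s a rough sketch using the cryptography package’s Fernet primitive; the TokenVault class is hypothetical, and a production system would keep the key in a KMS or HSM rather than in process memory:

```python
# pip install cryptography
from cryptography.fernet import Fernet

class TokenVault:
    """Stores OAuth tokens encrypted at rest, keyed per (user, service)."""

    def __init__(self, key: bytes):
        self._fernet = Fernet(key)
        self._store: dict[tuple[str, str], bytes] = {}

    def put(self, user: str, service: str, token: str) -> None:
        # Encrypt before the token ever touches the backing store.
        self._store[(user, service)] = self._fernet.encrypt(token.encode())

    def get(self, user: str, service: str) -> str:
        # Decrypt only at the moment of use; never log the plaintext.
        return self._fernet.decrypt(self._store[(user, service)]).decode()

vault = TokenVault(Fernet.generate_key())
vault.put("alex", "airline", "ya29.example-token")  # token value is made up
assert vault.get("alex", "airline") == "ya29.example-token"
```

Encryption at rest won’t stop a fully compromised server from using tokens it can decrypt, but it does stop the far more common leak: a dumped database or backup handing out every user’s credentials at once.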

Chapter 3: The Over-Permitted Agent — Context Data Leakage

After authorization, TravelBot needs context to fulfill the request (“next week’s conference”). It requests context data from Alex’s other connected services, specifically their calendar, to get the exact dates.

Alex quickly grants the permission, but this brings us to the risk of Excessive Permission Scope and Data Aggregation.

Many MCP implementations are designed for maximum utility, leading developers to request overly broad permission scopes by default — think full read/write access to the calendar, rather than just read access for the next seven days.
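
Scoping down is often a one-line change at authorization time. Here’s a sketch of building the authorization URL with a narrow, read-only scope; the endpoint, client ID, and scope strings are all illustrative, since every IdP names its scopes differently:

```python
from urllib.parse import urlencode

# Hypothetical client registration values.
AUTH_ENDPOINT = "https://idp.example.com/oauth2/authorize"
CLIENT_ID = "travelbot-client-id"
REDIRECT_URI = "https://travelbot.example.com/oauth/callback"

def build_auth_url(scopes: list[str]) -> str:
    params = {
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "response_type": "code",
        "scope": " ".join(scopes),
    }
    return f"{AUTH_ENDPOINT}?{urlencode(params)}"

# The convenience-first request: full read/write over the whole calendar.
broad = build_auth_url(["calendar"])

# What the booking flow actually needs: read-only access to events.
narrow = build_auth_url(["calendar.events.readonly"])
print(narrow)
```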

Once the context is gathered, the MCP flow has the client (TravelBot) bundle all this information — dates, location, Alex’s name, etc. — and pass it to the malicious AirlineAP1 as the tool call’s arguments.

This context bundle is the goldmine. Because the tool was granted a broad scope, the context might contain Alex’s:

  1. Full name and email.
  2. All calendar events, including private medical appointments or internal corporate meeting titles.
  3. Geographic location data.

The attacker didn’t need to hack the calendar itself; they just needed to be the endpoint where TravelBot delivered the context data. They leveraged the design philosophy of context-driven AI to perform massive data aggregation.

  • 🛠️ The Flaw: Violation of the Principle of Least Privilege and Data Minimization.
  • 🎣 The Exploit: TravelBot, using its overly broad permissions, aggregates Alex’s entire PII and calendar history and sends the complete, rich context payload directly to the malicious AirlineAP1, enabling comprehensive profiling and data exfiltration.
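
The counter-move is to minimize the context bundle before it ever leaves the client: define, per tool, exactly which fields it may see, and strip everything else. A deny-by-default sketch (the tool names and field names here are invented for this example):

```python
# Fields each known tool is allowed to receive; everything else is stripped.
TOOL_CONTEXT_ALLOW_LIST = {
    "AirlineAPI": {"destination_city", "travel_dates"},
    "HotelAPI": {"destination_city", "travel_dates"},
}

def minimize_context(tool: str, context: dict) -> dict:
    """Pass a tool only the fields it is explicitly allow-listed for."""
    allowed = TOOL_CONTEXT_ALLOW_LIST.get(tool, set())  # unknown tool -> nothing
    return {k: v for k, v in context.items() if k in allowed}

full_context = {
    "user_name": "Alex Doe",
    "user_email": "alex@example.com",
    "calendar_events": ["Dentist 3pm", "Board meeting", "SF conference"],
    "destination_city": "San Francisco",
    "travel_dates": ["2025-06-09", "2025-06-13"],
}

print(minimize_context("HotelAPI", full_context))    # only city + dates
print(minimize_context("AirlineAP1", full_context))  # {} -- not on the list
```

Note the pleasant side effect of deny-by-default: a typosquatted “AirlineAP1” isn’t on the allow list at all, so it receives an empty context even if it slips past tool selection.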

Chapter 4: The Loud Whisper — PII in Verbose Errors

The malicious AirlineAP1 has the data, but it needs an operational reason to exfiltrate a small piece of proof-of-concept data without triggering immediate suspicion.

So the malicious tool stages a deliberate, minor failure. It responds to TravelBot with an obscure HTTP 500 error that doesn’t kill the session, but whose error message is verbose, a common side effect of poor defensive coding and debug modes left enabled in young protocol implementations.

This leads to Improper Error Handling and Information Leakage.

In an attempt to be helpful, TravelBot (or the underlying MCP framework) takes the failure response and logs it for later debugging. Because the failure occurs while processing the PII-rich context, the error handler inadvertently includes the full user request payload in the logs, which are often pushed to insecure, remote logging services.

If the malicious tool crafts a response that includes a secondary, embedded instruction or manipulated data — a risk tied to Prompt Injection — it can exploit the agent’s internal logic to leak that PII through an unexpected channel, such as an unvalidated debug log or a truncated, context-rich error message sent back to the user or a third-party audit tool.

  • 🛠️ The Flaw: Verbose Error Responses and Failure to Sanitize Input/Output.
  • 🎣 The Exploit: The malicious tool forces a verbose error, causing TravelBot’s internal error handler to log or return the request payload, which contains Alex’s full PII, sending it to an insecure log file.
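
A defensible error handler does two things: it never echoes the outbound request payload, and it scrubs whatever it does log. A minimal sketch, with the caveat that the regexes below are illustrative and a real deployment should use a vetted PII-scrubbing library:

```python
import logging
import re

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("travelbot")

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
TOKEN_RE = re.compile(r"(?i)bearer\s+\S+")

def scrub(text: str) -> str:
    """Redact obvious PII and credentials before anything is written."""
    text = EMAIL_RE.sub("[EMAIL REDACTED]", text)
    return TOKEN_RE.sub("Bearer [TOKEN REDACTED]", text)

def log_tool_failure(tool: str, status: int, response_body: str) -> None:
    # Log only the tool name, the status code, and a scrubbed, truncated
    # slice of the *response*. The outbound context bundle (the request
    # payload) is deliberately never written anywhere.
    log.error("tool=%s status=%d body=%s", tool, status, scrub(response_body)[:200])

log_tool_failure("AirlineAP1", 500,
                 "Booking failed for alex@example.com; retry with Bearer abc123")
```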

Chapter 5: The Identity Crisis — Agent Impersonation

TravelBot now attempts to recover and moves on to the second part of the request: booking a hotel. This time, it connects to the real HotelAPI.

An attacker, having learned the communication pattern from the previous step, tries to intercept or replay this new request. This is where the Tool/Agent Impersonation risk comes into play.

MCP primarily uses OAuth tokens for authorization, but these tokens only prove Alex’s permission. They don’t robustly prove the identity of the calling agent (TravelBot). If the HotelAPI’s implementation is weak and relies only on the token’s validity, an attacker who steals the token from TravelBot’s backend could easily impersonate TravelBot.

The critical defense here is Mutual TLS (mTLS) or similar cryptographic verification. The MCP client (TravelBot) must present a cryptographic certificate to the MCP Server (HotelAPI), and the server must do the same. This way, the server verifies not just Alex’s permission, but also that the request truly came from a trusted, cryptographically-verified TravelBot instance. Without mTLS, the HotelAPI is vulnerable to the Confused Deputy problem, in which a trusted intermediary (the “deputy,” here the HotelAPI) is tricked into performing an action with Alex’s authority at the direction of an attacker.

  • 🛠️ The Flaw: Insufficient Agent-to-Server Authentication (Lack of mTLS).
  • 🎣 The Exploit: An attacker impersonates TravelBot by using a stolen token, accessing and possibly manipulating pricing data or even creating fraudulent bookings on Alex’s behalf, as the HotelAPI cannot cryptographically verify the calling agent’s identity.
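
In Python, enforcing mTLS is largely a matter of configuring ssl.SSLContext on both ends; the TLS handshake then authenticates the agent before any OAuth token is even inspected. A sketch under the assumption that an internal CA issues certificates to agents and services (all file paths and the port are placeholders):

```python
import socket
import ssl

def make_server_context() -> ssl.SSLContext:
    """HotelAPI side: demand a client certificate signed by the agents' CA."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain("hotelapi.crt", "hotelapi.key")
    ctx.load_verify_locations("agents-ca.crt")  # CA that signs agent certs
    ctx.verify_mode = ssl.CERT_REQUIRED         # no client cert, no handshake
    return ctx

def make_client_context() -> ssl.SSLContext:
    """TravelBot side: verify the server AND present our own certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH,
                                     cafile="services-ca.crt")
    ctx.load_cert_chain("travelbot.crt", "travelbot.key")
    return ctx

def serve_once(port: int = 8443) -> None:
    ctx = make_server_context()
    with socket.create_server(("0.0.0.0", port)) as sock:
        with ctx.wrap_socket(sock, server_side=True) as tls:
            conn, _ = tls.accept()  # mTLS handshake happens here
            # The peer certificate is cryptographic proof of *which agent*
            # is calling, independent of whatever token it carries.
            print("authenticated agent:", conn.getpeercert()["subject"])
```

A stolen OAuth token alone is now useless: without TravelBot’s private key, the handshake in serve_once fails before a single request byte is read.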

Conclusion: Securing the AI Bridge

Alex’s trip was a disaster waiting to happen — not because of one brilliant hack, but because of a chain of design weaknesses common in novel protocols like MCP.

For engineers building on this framework, the path to security is clear:

  1. Strict Vetting: Treat your MCP Registry like a supply chain. Validate every tool’s code and behavior, not just its name and description.
  2. Zero Trust Context: Adopt the Principle of Least Privilege rigorously. Request and pass only the absolute minimum context required, and scope your OAuth tokens down to the narrowest possible functions.
  3. Harden the Edges: Implement Mutual TLS for strong, cryptographically-verified tool-to-tool communication. Use aggressive input and output sanitization to prevent data leakage in error logs.

MCP is the future of AI connectivity, but like any powerful new technology, it demands a defense-in-depth mindset. Now, let’s get back to building agents that are not only smart but, more importantly, secure. 🔐

This post is mirrored to my blog on Medium.