MCP Security is Broken: Here's How to Fix It
TL;DR: Attackers are stealing convo history via MCP servers—let's stop that. OWASP ranks prompt injection as the top threat. This post shares practical steps to protect your systems.
This is Part 2. ← Read Part 1 if you missed the carnage
Trail of Bits Research Findings
Trail of Bits dropped a bomb & MCP servers are getting wrecked by these attacks:
- Line Jumping attacks1 - malicious servers inject prompts through tool descriptions. Your AI can be tricked before you even start interacting with it.
- Conversation history theft2 - servers can steal your full conversation history without you noticing
- ANSI terminal code attacks3 - escape sequences hide malicious instructions. Your terminal can show false or misleading information due to hidden instructions.
- Insecure credential storage4 - API keys sitting in plaintext with world-readable permissions. This leaves sensitive data exposed.
The Security Gap
The OWASP Top 10 for Large Language Model Applications (2025)5 puts prompt injection at #1. Meanwhile, most security teams are still treating AI like it's another web app.
Your monitoring tools won't blink, API calls, auth, and response times all look normal during a breach. The breach often goes undetected until it's too late.
Cost-Based Attack Vectors
Trail of Bits found in their cloud infrastructure research6 that AI systems can produce insecure cloud setup code, leading to unexpectedly high costs.
Their report pointed out:
- AI tools sometimes hard-code credentials, creating security risks
- "Random" passwords that are actually predictable LLM outputs
- Infrastructure code that spins up expensive resources with zero limits
Here's how attackers weaponize this:
- Find AI tools connected to expensive cloud services
- Craft natural language requests that maximize resource consumption
- Exploit AI's tendency to blindly follow requests to bypass traditional security controls
- Costs can skyrocket due to infrastructure overuse, even though logs might look normal
Effective Defense Strategies
Based on OWASP recommendations and documented security research, here's what works in production:
1. Never Give Production Creds to AI
Don't be an idiot, never hand AI your prod keys; use a sandboxed account with zero power.
// Unsafe: Directly embedding production credentials
const DATABASE_URL =
"postgresql://admin:password@prod-db:5432/main"
// Safe: Using a restricted account with limited access
const DATABASE_URL =
"postgresql://readonly_ai:limited@replica:5432/public_data"
If your AI needs full admin rights, it's time to rethink your setup.
2. Resource Limits and Constraints
Traditional rate limiting is useless against AI. You need cost-based limits and hard resource constraints:
# docker-compose.yml - Actual protection
services:
mcp-tool:
image: your-tool:latest
deploy:
resources:
limits:
cpus: "0.5"
memory: 512M
environment:
- MAX_COST_PER_HOUR=10.00
- MAX_REQUESTS_PER_MINUTE=5
3. Semantic Attack Detection
Traditional logging misses semantic attacks completely. Keep an eye out for signs of prompt injection attempts:
function catchInjectionAttempts(
request: string,
): [boolean, string | null] {
// Based on OWASP LLM Top 10 indicators and CVE database<sup><a id="ref-9" href="#footnote-9">9</a></sup>
const suspiciousShit = [
/ignore.*previous.*instructions/i,
/system.*prompt.*override/i,
/execute.*as.*admin/i,
/delete.*from.*table/i,
/show.*credentials/i,
]
for (const pattern of suspiciousShit) {
if (pattern.test(request.toLowerCase())) {
return [true, `Injection attempt: ${pattern.source}`]
}
}
return [false, null]
}
4. Semantic Input Validation
The NIST AI Risk Management Framework7 recommends semantic analysis for AI inputs. Basic pattern matching catches most documented attack vectors:
class PromptInjectionFilter {
private redFlags: RegExp[]
constructor() {
// Patterns from documented CVEs and research<sup><a id="ref-10" href="#footnote-10">10</a></sup><sup><a id="ref-11" href="#footnote-11">11</a></sup><sup><a id="ref-12" href="#footnote-12">12</a></sup>
this.redFlags = [
/ignore.*instructions/i,
/new.*role.*system/i,
/pretend.*you.*are/i,
/override.*safety/i,
/jailbreak.*mode/i,
]
}
isSafe(userInput: string): boolean {
for (const pattern of this.redFlags) {
if (pattern.test(userInput.toLowerCase())) {
return false
}
}
return true
}
}
5. Cost-Aware Rate Limiting
Traditional rate limiting counts requests. AI systems need cost-aware limiting:
class RateLimitExceeded extends Error {
constructor(message: string) {
super(message)
this.name = "RateLimitExceeded"
}
}
class CostAwareRateLimit {
private maxCost: number
private currentCost: number
private resetTime: number
constructor(maxCostPerHour: number = 50.0) {
this.maxCost = maxCostPerHour
this.currentCost = 0.0
this.resetTime = Date.now() + 3600000 // 1 hour in milliseconds
}
checkRequest(estimatedCost: number): void {
if (Date.now() > this.resetTime) {
this.currentCost = 0.0
this.resetTime = Date.now() + 3600000
}
if (this.currentCost + estimatedCost > this.maxCost) {
throw new RateLimitExceeded("Cost limit exceeded")
}
this.currentCost += estimatedCost
}
}
Attack Detection and Monitoring
OWASP and cloud giants agree, these metrics catch AI attacks:
Resource consumption weirdness:
- Compute usage spikes way above baseline
- Unusual data access patterns
- Cross-service API call increases
- Geographic request anomalies
Behavioral red flags:
- Requests containing system keywords
- Permission escalation attempts
- Tools accessing new data sources
- Cost per request increases
if (($(echo "$current_hour_cost > ($average_daily_cost * 0.3)" | bc -l))); then
immediate_alert "Cost anomaly detected"
fi
Updated Authentication Requirements (MCP 2025-06-18)
The latest MCP specification now mandates proper OAuth implementation:
// Required: OAuth Resource Server pattern
class MCPServer {
private authConfig: OAuth2ResourceServer
constructor() {
this.authConfig = {
// Now required by spec
resourceServer: "https://your-auth-server.com",
requiredScopes: [
"mcp:tools:read",
"mcp:tools:execute",
],
tokenValidation: "RFC8707", // Resource Indicators required
}
}
async validateRequest(
request: MCPRequest,
): Promise<boolean> {
// Resource Indicators prevent token theft attacks
const token = this.extractToken(request)
return await this.validateWithResourceIndicators(token)
}
}
This addresses some authentication issues but doesn't solve tool description injection.
Industry Security Recommendations
Security pros at OWASP and NIST keep hammering this: no prod creds in AI, period.
OWASP Top 10 for LLMs (2025):8
- LLM01: Prompt Injection - #1 threat
- LLM02: Insecure Output Handling
- LLM03: Training Data Poisoning
- LLM04: Model Denial of Service
NIST AI Risk Management Framework:7
- Treat AI systems as high-risk components
- Implement continuous monitoring
- Use defense-in-depth strategies
- Plan for novel attack vectors
The Bottom Line
We're building systems that run commands based on natural language and connect to live infrastructure. The risks are well-known, the methods of attack are out there, and researchers are constantly finding new exploits.
Fix this now, or enjoy the breach headlines later.
Footnotes
1. Trail of Bits. "Jumping the Line: How MCP servers can attack you before you ever use them." April 21, 2025. https://blog.trailofbits.com/2025/04/21/jumping-the-line-how-mcp-servers-can-attack-you-before-you-ever-use-them/ ↩
2. Trail of Bits. "How MCP servers can steal your conversation history." April 23, 2025. https://blog.trailofbits.com/2025/04/23/how-mcp-servers-can-steal-your-conversation-history/ ↩
3. Trail of Bits. "Deceiving users with ANSI terminal codes in MCP." April 29, 2025. https://blog.trailofbits.com/2025/04/29/deceiving-users-with-ansi-terminal-codes-in-mcp/ ↩
4. Trail of Bits. "Insecure credential storage plagues MCP." April 30, 2025. https://blog.trailofbits.com/2025/04/30/insecure-credential-storage-plagues-mcp/ ↩
5. OWASP. "Top 10 for Large Language Model Applications (2025)." https://genai.owasp.org/resource/owasp-top-10-for-llm-applications-2025/ ↩
6. Trail of Bits. "Provisioning cloud infrastructure the wrong way, but faster." August 27, 2024. https://blog.trailofbits.com/2024/08/27/provisioning-cloud-infrastructure-the-wrong-way-but-faster/ ↩
7. NIST. "AI Risk Management Framework (AI RMF 1.0)." https://www.nist.gov/itl/ai-risk-management-framework ↩
8. OWASP. "Top 10 for LLMs (2025)." https://owasp.org/www-project-top-10-for-large-language-model-applications/ ↩
9. CVE Database. "Prompt injection vulnerabilities." https://cve.mitre.org/ ↩
10. Perez et al. "Prompt Injection Attacks Against GPT-3." arXiv:2108.04739. https://arxiv.org/abs/2108.04739 ↩
11. Zou et al. "Universal and Transferable Adversarial Attacks on Aligned Language Models." arXiv:2307.15043. https://arxiv.org/abs/2307.15043 ↩
12. Wei et al. "Jailbroken: How Does LLM Safety Training Fail?" arXiv:2307.02483. https://arxiv.org/abs/2307.02483 ↩
← Read Part 1: MCP Security Issues Nobody's Talking About
Building MCP security tools or researching AI vulnerabilities? The documented threats are growing faster than the defenses. Let's change that.