MCP tools & integration considerations - IBM watsonx Orchestrate ADK

The following table covers the most common MCP tools and integration anti-patterns that affect agent reliability and security.

⭐️ Most common anti-patterns

Anti-pattern	Problem	Impact	Fix
⭐️ The glorified API wrapper	Low-level API operations	High reasoning burden	Business capabilities
Raw schema leakage	Exposed internal schemas	Agent must infer logic	Business abstractions
Wishful delegation	“Model will figure it out”	Fragile agents	Encode complexity in tools
Anything-in, garbage-out	No validation	Silent failures	Strong contracts
The god-mode tool	Unrestricted access	Security risks	Constrained capabilities
⭐️ The data firehose	Large unfiltered datasets	Context exhaustion	Bounded outputs
The trusted output trap	Unvalidated tool output	Prompt injection risk	Sanitize outputs
⭐️ The “kitchen sink” server	80+ tools at initialization	Selection entropy	Contextual discovery
The useless error	Raw stack traces	No self-correction	Structured errors
The stateful tool trap	Implicit session state	Race conditions	Stateless design

The glorified API wrapper

What it is

Exposing low-level API operations instead of business capabilities
Tools like odata_create(entity_set, payload) instead of create_purchase_order(vendor, items, amount)
Treating MCP as an API gateway rather than an agent-ready capability layer

Why it fails

Exposes low-level API mechanics: Agents must understand entity sets, payloads, and API structure
No business semantics: Tool name doesn’t convey what business action it performs
Agents must reconstruct intent: The model has to work out how to map business intent to API calls
Increases reasoning burden: Amplifies hallucination risk and makes the agent fragile to API changes

What to do instead: Business capability tools

Design for business actions: create_purchase_order(vendor_id, items, delivery_date) not odata_create(entity_set, payload)
Encode business logic in tools: Move complexity from prompts to code
Clear semantics: Tool names convey business intent
Reduce reasoning burden: Agent operates business capabilities, not API mechanics

Raw schema leakage

What it is

Exposing internal data schemas instead of business operations
Tools operate on tables, entities, or endpoints—not business concepts
Inputs are generic JSON structures whose field names do not convey business meaning

Why it fails

Relationships are implicit: No guidance on valid combinations
No domain rules: Tool doesn’t encode business constraints
Agent must infer business logic from schema: Highly error-prone task
Reverse-engineering burden: Asking the LLM to do system design work

What to do instead: Business-level abstractions

Hide schema complexity: Expose business operations, not database tables
Encode relationships: Build valid combinations into tool design
Domain rules in code: Validation and constraints in tool implementation
Business parameters: vendor_id, items, delivery_date not generic JSON blobs

Wishful delegation

What it is

“The model will figure it out”: pushing system design responsibility onto the LLM
Multi-step reasoning chains (discover → fetch metadata → construct query → build payload → execute)
Tool proliferation with dozens of generic CRUD operations
Assuming perfect reasoning from probabilistic models

Why it fails

Moves business logic from code to prompts: System responsibility becomes reasoning burden
Increases decision entropy: More tools and steps = more failure modes
Amplifies hallucination risk: Each reasoning step is an opportunity for error
More tools ≠ better agents: Increases entropy in decision space

What to do instead: Encode complexity in tools

Reduce reasoning steps: Combine multi-step operations into single business capability
Limit tool count: Fewer, well-designed tools are better than many generic ones
Business logic in code: Don’t make agent reverse-engineer your system
Clear contracts: Each tool has single, well-defined purpose

Anything-in, garbage-out

What it is

Tools without clear validation constraints or domain rules
No required/optional clarity, ambiguous parameter meanings
Tools are syntactically valid but semantically ambiguous

Why it fails

No guardrails: Invalid combinations accepted
Silent failures: Operations succeed technically but fail semantically
Runtime failures: Problems surface at run time instead of being prevented by design-time clarity
Agent calls tools with incorrect or meaningless inputs

What to do instead: Strong contracts

Explicit validation: Required fields, data types, value ranges
Domain rules: Encode business constraints in tool
Clear parameter semantics: Each parameter has unambiguous meaning
Fail fast: Reject invalid inputs with structured error messages
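The following is a minimal sketch of a strong contract, not an ADK API. It reuses the submit_expense_report(employee_id, amount, category) signature that appears later on this page; the category list, the amount limit, and the ContractViolation class are assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical domain rules, chosen only for illustration.
ALLOWED_CATEGORIES = {"travel", "meals", "equipment"}
MAX_AMOUNT = 10_000.00


class ContractViolation(ValueError):
    """Raised when an input breaks the tool contract."""

    def __init__(self, field: str, message: str):
        super().__init__(f"{field}: {message}")
        self.field = field
        self.message = message


@dataclass(frozen=True)
class ExpenseReport:
    employee_id: str
    amount: float
    category: str

    def __post_init__(self):
        # Required field: reject missing or empty identifiers.
        if not isinstance(self.employee_id, str) or not self.employee_id.strip():
            raise ContractViolation("employee_id", "must be a non-empty string")
        # Data type check (bool is excluded because it is a subclass of int).
        if isinstance(self.amount, bool) or not isinstance(self.amount, (int, float)):
            raise ContractViolation("amount", "must be a number")
        # Value range check.
        if not 0 < self.amount <= MAX_AMOUNT:
            raise ContractViolation("amount", f"must be greater than 0 and at most {MAX_AMOUNT}")
        # Domain rule: only known categories are accepted.
        if self.category not in ALLOWED_CATEGORIES:
            raise ContractViolation("category", f"must be one of {sorted(ALLOWED_CATEGORIES)}")


def submit_expense_report(employee_id: str, amount: float, category: str) -> dict:
    """Validate first (fail fast), then perform the business action."""
    report = ExpenseReport(employee_id, amount, category)
    # A real tool would call the backend here; this sketch only echoes the input.
    return {"status": "submitted", "employee_id": report.employee_id,
            "amount": report.amount, "category": report.category}
```

Because validation runs before any side effect, an invalid call is rejected with the name of the offending field rather than succeeding technically and failing semantically.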

The god-mode tool

What it is

Exposing broad write access or execution capabilities without constraints
Tools allowing arbitrary writes, function execution with user code, unrestricted data access
Too much power at the wrong abstraction level

Why it fails

Security risks: Agents can be manipulated to perform unauthorized actions
No audit trail: Hard to track what the agent actually did
Violates least-privilege principle: Capabilities must be constrained and auditable

What to do instead: Constrained, auditable capabilities

Least privilege: Tools get minimum permissions needed
Explicit constraints: Limit scope of operations
Audit trails: Log all actions with context
Approval gates: High-risk operations require confirmation
Separate read/write: Different tools for different privilege levels

The data firehose

What it is

Tools returning large datasets without filtering or summarization
A manifestation of Tool Data Overload in the MCP context
No pagination, no filtering, and raw data injected without regard for the context window

Why it fails

Context window exhaustion: Tool output competes with instructions, user intent, reasoning steps
Degrades reasoning quality: Large outputs push out critical instructions
Cost explosion: Paying for massive token counts on every turn
Bigger context ≠ better reasoning: Long inputs can degrade performance

What to do instead: Bounded and filtered tool output

Filter at source: Return only needed data using query parameters
Implement pagination: Use continuation tokens for large datasets
Summarize appropriately: Return summaries instead of full content
Set hard limits: Enforce maximum output sizes (e.g., 2000 tokens per call)
Monitor token usage: Track and optimize high-volume tools
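One possible shape for a bounded tool is sketched below. The list_invoices tool, its parameters, the in-memory data, and the row limit are all hypothetical; a row limit is used here as a simple stand-in for a token budget.

```python
import base64
import json
from typing import Optional

MAX_PAGE_SIZE = 50  # Hypothetical hard limit on rows returned per call.

# Stand-in for a backend table; a real tool would query a database or API.
_INVOICES = [{"id": f"INV-{i:04d}", "status": "open" if i % 2 else "paid"}
             for i in range(1, 201)]


def _encode_token(offset: int) -> str:
    """Pack the next offset into an opaque continuation token."""
    return base64.urlsafe_b64encode(json.dumps({"offset": offset}).encode()).decode()


def _decode_token(token: str) -> int:
    return json.loads(base64.urlsafe_b64decode(token.encode()))["offset"]


def list_invoices(status: str, page_size: int = 20,
                  continuation_token: Optional[str] = None) -> dict:
    """Return one bounded page of invoices filtered by status."""
    page_size = max(1, min(page_size, MAX_PAGE_SIZE))  # Enforce the hard limit.
    offset = _decode_token(continuation_token) if continuation_token else 0
    matching = [inv for inv in _INVOICES if inv["status"] == status]  # Filter at source.
    page = matching[offset:offset + page_size]
    next_offset = offset + len(page)
    return {
        "items": page,
        "total_matching": len(matching),  # Summary count instead of the full dataset.
        "continuation_token": _encode_token(next_offset) if next_offset < len(matching) else None,
    }
```

Even if the agent requests 500 rows, the tool returns at most MAX_PAGE_SIZE and hands back a token for the next page, so output size stays predictable on every turn.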

The trusted output trap

What it is

Treating tool output as safe data instead of potential instructions
Security vulnerability: LLMs cannot reliably distinguish trusted instructions from untrusted data
Everything is processed as the same token stream

Why it fails

Prompt injection via tool output: Unvalidated content from external sources (documents, logs, emails, web data) can contain embedded instructions
Can override system behavior: Manipulate reasoning, trigger unintended actions, exfiltrate data
No clear boundary: Nothing separates legitimate system instructions from an attacker’s embedded commands

What to do instead: Treat tool output as untrusted

Content sanitization: Strip/escape instruction markers, remove suspicious markdown, filter patterns
Output validation: Validate against expected schemas before sending to LLM
Structured formats: Use JSON instead of free-form text when possible
Data-instruction separation: Use XML tags, JSON encapsulation, delimiters to separate trusted instructions from untrusted data
Output monitoring: Monitor for suspicious patterns or anomalies
Least privilege: Return only minimum data needed
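The sketch below combines three of these measures: schema validation, a structured JSON format, and data-instruction separation through an envelope. The record shape, the envelope keys, and the regular expression are assumptions for illustration. Pattern matching of this kind is a heuristic and does not by itself prevent prompt injection.

```python
import json
import re

# Hypothetical expected shape of one record returned by an upstream system.
EXPECTED_FIELDS = {"ticket_id": str, "subject": str, "body": str}
MAX_FIELD_CHARS = 500

# Illustrative patterns only; filtering is a heuristic, not a complete defense.
SUSPICIOUS = re.compile(r"(ignore (all )?previous instructions|</?system>)", re.IGNORECASE)


def validate_record(record: dict) -> dict:
    """Keep only expected fields with expected types; drop everything else."""
    clean = {}
    for name, expected_type in EXPECTED_FIELDS.items():
        value = record.get(name)
        if not isinstance(value, expected_type):
            raise ValueError(f"field {name!r} missing or not {expected_type.__name__}")
        clean[name] = value[:MAX_FIELD_CHARS]  # Truncate to a bounded length.
    return clean


def wrap_untrusted(record: dict) -> str:
    """Return a JSON envelope that labels the payload as data, not instructions."""
    clean = validate_record(record)
    flagged = any(SUSPICIOUS.search(v) for v in clean.values())
    envelope = {"type": "untrusted_tool_data", "flagged_suspicious": flagged, "data": clean}
    return json.dumps(envelope)
```

The flagged_suspicious field supports output monitoring: a caller can log, quarantine, or drop flagged records before they reach the model.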

The “kitchen sink” server

What it is

Registering a massive, monolithic list of tools during initialization
A manifestation of Tool Soup in MCP server design
80+ tools presented at initialization regardless of the agent’s task or the user’s permissions

Why it fails

Handshake bloat: MCP discovery phase floods agent with every possible tool choice
Selection entropy: LLMs degrade in tool-selection accuracy with massive lists
Missing least-privilege: Exposes administrative/sensitive tools to sessions that shouldn’t have access
Slower, less accurate selection: Higher tool-selection latency and an increased rate of wrong tool invocations

What to do instead: Contextual tool discovery

Role-based filtering: Expose only tools relevant to user’s role and permissions
Task-based tool sets: Group by business capability, expose only relevant sets
Dynamic registration: Register/unregister tools based on session state or context
Hierarchical organization: Use gateway tools that expose sub-tools only when needed
Metadata optimization: Keep descriptions concise; provide just enough for selection
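Role-based and task-based filtering can be sketched in plain Python as follows. The ToolSpec type, the registry contents, and the role names are hypothetical, and the example does not use any MCP SDK call; how the filtered list is registered with a server depends on the SDK in use.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class ToolSpec:
    name: str
    description: str
    required_role: str
    capability: str  # Business capability group, for example "procurement".


# Hypothetical registry; a real server would hold many more entries.
REGISTRY = [
    ToolSpec("create_purchase_order", "Create a purchase order.", "buyer", "procurement"),
    ToolSpec("approve_purchase_order", "Approve a purchase order.", "manager", "procurement"),
    ToolSpec("submit_expense_report", "Submit an expense report.", "employee", "expenses"),
    ToolSpec("delete_vendor", "Permanently delete a vendor.", "admin", "procurement"),
]


def tools_for_session(roles: Set[str], capability: str) -> List[Dict[str, str]]:
    """Return only the tool descriptors this session may see for the given task area."""
    return [
        {"name": t.name, "description": t.description}  # Concise metadata only.
        for t in REGISTRY
        if t.required_role in roles and t.capability == capability
    ]
```

A session with the roles buyer and employee that is working on procurement sees only create_purchase_order; the administrative delete_vendor tool is never presented to it.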

The useless error

What it is

Returning raw system stack traces or generic error strings instead of structured, LLM-actionable correction hints
Treating the LLM like a human debugger
No actionable feedback on what to fix or how to fix it

Why it fails

Breaks the self-correction loop: Agents are probabilistic; they can fix the payload on the next turn only when told explicitly why the call failed
Generic errors cause hallucination: Model guesses what went wrong instead of being told explicitly
Amplifies reasoning burden: Forces agent to debug instead of correct

What to do instead: Structured, agent-friendly error responses

Structured format: Return JSON with error_code, message, field, correction_hint
Error taxonomy: Define clear codes (VALIDATION_FAILED, RESOURCE_NOT_FOUND, PERMISSION_DENIED, RATE_LIMIT_EXCEEDED, INVALID_STATE)
Examples in hints: Include concrete examples of valid inputs
Reference related tools: Mention tools that can resolve the error
Avoid implementation details: Never expose stack traces, internal variables, system paths
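A structured error response along these lines might look like the sketch below. The field names and error codes come from the list above; the get_purchase_order tool, the in-memory store, and the search_purchase_orders tool named in the hint are hypothetical.

```python
import json
from enum import Enum
from typing import Optional


class ErrorCode(str, Enum):
    VALIDATION_FAILED = "VALIDATION_FAILED"
    RESOURCE_NOT_FOUND = "RESOURCE_NOT_FOUND"
    PERMISSION_DENIED = "PERMISSION_DENIED"
    RATE_LIMIT_EXCEEDED = "RATE_LIMIT_EXCEEDED"
    INVALID_STATE = "INVALID_STATE"


def tool_error(code: ErrorCode, message: str, field: Optional[str] = None,
               correction_hint: Optional[str] = None) -> str:
    """Build an agent-readable error with no stack trace or internal detail."""
    return json.dumps({"error_code": code.value, "message": message,
                       "field": field, "correction_hint": correction_hint})


_PURCHASE_ORDERS = {"PO-1001": {"status": "pending"}}  # Stand-in data store.


def get_purchase_order(po_id: str) -> str:
    try:
        return json.dumps({"po_id": po_id, **_PURCHASE_ORDERS[po_id]})
    except KeyError:
        # Translate the internal exception into a correction hint for the agent.
        return tool_error(
            ErrorCode.RESOURCE_NOT_FOUND,
            f"No purchase order with id {po_id!r}.",
            field="po_id",
            correction_hint="Use an id in the form 'PO-1001'. "
                            "Call search_purchase_orders to look up valid ids.",
        )
```

The hint names the failing field, shows a valid example, and points to a related tool, which gives the agent enough to correct the call on the next turn.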

The stateful tool trap

What it is

Designing MCP tools that implicitly rely on session state or execution order
Tools like set_active_project(project_id) followed by archive_current_project()
Assuming the backend or transport layer preserves a stateful sequence

Why it fails

Asynchronous drift: MCP transports (stdio, HTTP SSE) can experience race conditions, parallel reasoning loops, multi-agent handoffs
Context reset vulnerability: When conversation context is cleared or shifted, the next tool call executes against dead or incorrect references
Parallel execution hazards: Multiple agents/threads interleave tool calls, corrupting shared state

What to do instead: Stateless, self-contained tools

Explicit parameters: Every tool call includes all required context; no implicit “current” or “active” state
Idempotent operations: Tools are safely repeatable with consistent results
Return full context: Responses include enough context for subsequent calls without relying on memory
Avoid “set, then act” patterns: Use do_operation(context_params) instead of set_context() followed by do_operation()
Document state requirements explicitly: If state is genuinely required, make it explicit in the tool contract and provide state tokens

MCP usage patterns

Pattern 1: Direct agent tools (recommended for Enterprise)

Design tools that represent business capabilities with intent-level, constrained inputs, strong contracts with validation, and clear semantics. Examples:

submit_expense_report(employee_id, amount, category)
approve_purchase_order(po_id)
create_supplier_invoice(vendor_id, items, due_date)

Characteristics:

Low reasoning burden
Strong contracts
Safe execution
Clear audit trail
Business-aligned

Pattern 2: Agent-ready tool facade pattern (integration layer)

Use generic MCP tools internally as implementation details, build higher-level business capability tools on top, expose only the business-level interface to agents, and hide complexity behind a simplified, intent-driven facade. Example:

# Internal (not exposed to agent)
odata_create(entity_set, payload)
odata_update(entity_set, key, payload)

# Exposed to agent (the facade)
create_supplier_invoice(vendor_id, items, due_date)
  → internally orchestrates odata_create with validation

Characteristics:

MCP used as plumbing, not interface
You control abstraction and guardrails
Business logic in code, not prompts
Easier to test and maintain
Follows classic facade design pattern principles

Evaluating MCP tool suitability

Quick checks for agent-ready tools:

Check	✅ Agent-Ready	❌ Anti-Pattern
Business action?	YES	NO (likely API wrapper)
Constrained inputs?	Explicit fields with clear semantics	Generic JSON payload
Non-expert friendly?	YES (agent-friendly)	NO (schema leakage)
Safe by default?	Limited, auditable actions	Broad write/execution (high risk)
Logic location?	Logic in system (scalable)	Logic in prompt (fragile)
Reasoning required?	Minimal (good design)	Multi-step inference (anti-pattern)
Designed for agents?	YES	Feels like an API (not for agents)

Related topics

Foundational architecture considerations
Tooling and scalability considerations
Knowledge and document processing considerations
