Why descriptions and instructions matter
Descriptions and instructions are not just metadata; they directly influence orchestration:

- Supervisor agents rely on descriptions to route tasks to the right collaborator.
- Instructions guide the LLM's reasoning, ensuring the agent uses its tools and collaborators effectively.
Naming agents and tools
The name uniquely identifies the agent in the system and UI. Assign clear and unique names to each agent and tool you create. These names should reflect their capabilities and functions, making it easier for users to identify them within watsonx Orchestrate. To write effective names:

- Use snake_case instead of camelCase; it usually works better with routing agents. Agent names must not include spaces or special characters.
- Keep names short and descriptive.
- Avoid generic terms like "helper" or "assistant".
- Use domain-specific language (for example, sales_outreach_agent).
- Good: ibm_historical_knowledge_agent
- Bad: myAgent1
Writing instructions for agents
Instructions define how an agent behaves, including its tone, reasoning style, and decision-making process. They guide the underlying language model to produce consistent, predictable outputs and determine how the agent uses tools and collaborators.

- They act as the agent's persona and operating manual.
- They ensure the agent responds in a way that aligns with user expectations and organizational standards.
- They influence orchestration by determining when and how the agent invokes tools or other agents.
Best Practices for Writing Instructions

- Use natural language
  - Write instructions as clear, conversational directives.
  - Avoid overly technical or ambiguous phrasing.
- Define tone and style
  - Specify whether responses should be professional, friendly, concise, etc.
- Include tool usage rules
  - Tell the agent when and how to use specific tools.
- Set error-handling guidance
  - Define what to do if required information is unavailable.
- Avoid overloading instructions
  - Keep them focused on behavior and decision-making, not on task-specific details.
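As an illustrative sketch of these practices combined (the team name and tool are hypothetical placeholders, not part of the product), an instruction set might look like this:

```text
You are a customer support agent for the billing team.
Respond in a professional, concise tone.
Use the invoice lookup tool only when the user asks about a specific invoice; do not guess invoice details.
If required information, such as an account number, is missing, ask one clarifying question instead of guessing.
Keep instructions focused on behavior and decision-making; do not enumerate task-specific edge cases here.
```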
Choosing your LLM
The model you choose directly influences how an agent behaves and interprets instructions. Different models excel at different types of tasks, and many are trained with specialized capabilities. For example, some models include vision features, meaning they can process images, detect patterns, and answer questions about visual input. Other models are optimized specifically for code generation and may not perform as well in general-purpose tasks. To understand the full capabilities, limitations, and intended use cases of a model, always refer to the model provider's documentation, typically found on platforms such as Hugging Face or watsonx.ai. watsonx Orchestrate offers a variety of models you can select from when creating agents, each suited for different workflows and business needs. You can list all supported models from the watsonx Orchestrate ADK command line.

The default model (groq/openai/gpt-oss-120b)
GPT-OSS-120b is OpenAI's most powerful open-weight large language model, released under the Apache 2.0 license, making it fully suitable for commercial use, customization, and enterprise deployment. It contains approximately 117 billion parameters in a Mixture of Experts (MoE) architecture, which activates only a subset of parameters during inference, enabling strong performance with manageable compute requirements.
Within watsonx Orchestrate, GPT-OSS-120b has recently become the default model for building agents, leveraging its strong reasoning capabilities and reliable tool-use behavior. IBM hosts the model on watsonx.ai, enabling secure enterprise usage, integration with Orchestrate agents, and optional high-speed inference through providers such as Groq.
Here are a few features of this model:
- Advanced Reasoning & Chain-of-Thought: GPT-OSS-120b supports full chain-of-thought reasoning, making it effective for complex instructions, multi-step logic, and orchestrator-style agents that need to autonomously decide which tools or actions to take.
- Strong Agentic Capabilities: The model natively supports:
  - Function/tool calling
  - Structured output formats
  - Web browsing & Python execution (where supported)
  - Interacting with multi-tool agent workflows
- Large Context Window: GPT-OSS-120b offers a roughly 128k-token (131,072, depending on provider) context window, enabling it to process lengthy documents, logs, or multi-turn interactions without losing context.
- Fine-Tuning & Customization: Because it is fully open-weight and under Apache 2.0, the model can be:
  - Fine-tuned
  - Self-hosted
  - Customized with enterprise data
  - Integrated into internal workflows
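To briefly illustrate what function/tool calling and structured output mean in practice: a tool is advertised to the model as a name, description, and JSON-schema parameters, and the model emits a structured call rather than free text. A minimal sketch follows; the tool name and fields are hypothetical, not a real watsonx Orchestrate API.

```python
import json

# A hypothetical tool definition in the common JSON-schema style used
# for function/tool calling. The model sees this schema and, when
# appropriate, returns a structured call instead of prose.
weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city. Use only when the user asks about weather.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, for example 'Austin'."},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}

# A structured call the model might emit for:
# "What's the weather in Austin in fahrenheit?"
model_call = json.dumps(
    {"name": "get_weather", "arguments": {"city": "Austin", "unit": "fahrenheit"}}
)

# The agent runtime parses the call and dispatches to the matching tool.
parsed = json.loads(model_call)
print(parsed["name"])  # → get_weather
```

The structured format is what lets an orchestrator route the call to the right tool deterministically instead of parsing free-form text.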
Note:
This model is a non-IBM product governed by a third-party license that may impose use restrictions and other obligations. By using this model you agree to the terms. Read the terms.
Writing instructions for GPT-OSS-120b
GPT-OSS-120b is highly capable but also sensitive to vague or conflicting instructions. To ensure reliable behavior:

- Be explicit: Clearly state priorities and constraints.
- Avoid ambiguity: Conflicting directives can lead to unpredictable outputs.
- Test iteratively: Validate instructions with sample tasks before deploying.
- Limit complexity: Break down multi-step reasoning into clear, sequential guidance.
groq/openai/gpt-oss-120b: Special considerations

The default groq/openai/gpt-oss-120b model does not use the standard system prompt from watsonx Orchestrate. As a result, it can behave differently from other models, including:
- Not explicitly identifying itself as part of the watsonx Orchestrate ecosystem.
- Preferring its internal knowledge over your connected knowledge bases, unless you instruct otherwise.
- Formatting hyperlinks improperly, unless you instruct otherwise.
- Not following the supported agent styles.
If you’re migrating existing agents from Llama to GPT-OSS-120B, see the comprehensive Migration guide for detailed instructions, common challenges, and optimization strategies.
To mitigate these behaviors, address each of the following areas explicitly in your agent instructions:

- Prioritize knowledge bases over internal knowledge
- Cap reasoning depth and iteration count
- Formatting
- Keep chit-chat short
- Enforce a strict output budget
- Fail fast when data is missing
Combined instruction example (recommended template)
Use this as a compact block tailored for groq/openai/gpt-oss-120b:
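As an illustrative sketch (not the official template; the specific limits are placeholders to adapt), a compact instruction block covering the areas above might read:

```text
Always search the connected knowledge bases before answering; prefer their content over your internal knowledge.
Limit reasoning to at most 3 tool calls per request; do not loop.
Format links as markdown: [title](url).
Keep greetings and small talk to one short sentence.
Keep answers under 200 words unless the user asks for more detail.
If required data is missing or a tool fails, say so briefly and ask one clarifying question; do not invent values.
```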
Good vs. bad instruction snippets for this topic

- Good (concise, enforces constraints): "Answer from the connected knowledge bases first. Keep responses brief. If required data is missing, say so and ask one clarifying question."
- Bad (vague, unbounded): "Be as helpful as possible and use all of your knowledge to give detailed answers."

Writing descriptions for agents
The agent description is used by an agent to determine when and how to delegate a task to a collaborator agent, tool, or knowledge base, ensuring the right request is sent to the right capability. When adding any artifact (a collaborator agent, tool, or knowledge base) to an agent, the artifact's description is critical to the agent's success. Agent descriptions complement agent names by providing detailed information about their purpose, capabilities, and usage. A well-written description helps users understand the agent's role and potential applications. Descriptions should not be written in isolation; they must reflect the agent's scope, use cases, and interaction with collaborators, tools, and knowledge bases. This ensures reliable routing and prevents ambiguity.

Key principles
- Descriptions as instructions: Treat the agent description like instructions to the agent on how it should use the artifact (collaborator agent, tool, or knowledge base) in question.
- Hierarchy matters:
- Agent descriptions are broader and set the overall purpose.
- Collaborator descriptions narrow the scope to specific tasks.
- Tool descriptions are the most specific, clarifying exact functionality.
- Avoid overlap: Each agent should have a distinct scope. If scopes are similar, clearly differentiate them.
- Include context markers:
- Geographic scope (for example, “US”) helps route queries correctly.
- Domain-specific language (for example, “pet-parents and their fur-babies”) sets tone and aligns with instructions.
- Don’t overload descriptions: Mention core capabilities, not every tool. The agent should infer tool usage from context.
- Define trigger conditions: Clarify when the agent should use the artifact, such as “Use this when the user requests supplier risk evaluation or mentions creditworthiness.”
- Specify actions: Be specific about the actions the artifact should perform. When possible, include how the artifact should perform those actions, such as "Access loan data from loan APIs, calculate affordability, and summarize affordability differences."
- Define restrictions: Note desired restrictions and limitations. Most artifacts have a boundary they should stop at, such as "Do not give legal or tax advice; only provide estimates and data-driven comparisons."
- Use appropriate jargon: Define any industry-specific terms or acronyms so they won't be misunderstood.
Description examples
HR agent description
Supplier Analysis tool routing description
Finance agent routing description
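As an illustrative sketch of a routing description that applies the key principles above (the tool and its data sources are hypothetical), a supplier analysis tool might be described as:

```text
Use this tool when the user requests supplier risk evaluation or mentions creditworthiness.
It accesses supplier financial data, calculates risk scores, and summarizes credit exposure.
Scope: US suppliers only. Do not give legal or tax advice; provide only estimates and data-driven comparisons.
```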
Example: agent and collaborator description for a pet store
The following example demonstrates how to write clear, context-rich descriptions for an agent and its collaborators, tools, and knowledge bases. This example also shows how tone and language can align with the agent's persona, which should be reinforced in the instructions for a consistent user experience.

Agent description: USPetTreeAgent
MatchMakerAgent
- SearchPetTreeCats: Search the PetTree registry for cats available for adoption.
- SearchPetTreeDogs: Search the PetTree registry for dogs available for adoption.
- GetPetTreeAnimalProfile: Get the full profile of an individual cat or dog.
- ViewAnimalPhotos: View pictures of cats and dogs available for adoption.
- FileAdoptionApplication: File a pet adoption application for a specific cat or dog.
- Cat Breeds Care: Facts, personalities, and care guides for different cat breeds.
- Dog Breeds Care: Facts, personalities, and care guides for different dog breeds.
Writing descriptions for tools
A good tool description helps agents identify and use the tool effectively. It should include a general overview of the tool's purpose, as well as details about its inputs and outputs. In Python tools, descriptions are defined in docstrings, following the Google docstring style.
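As an illustrative sketch (the function and its data are hypothetical; a real tool would query a live system), a Python tool with a Google-style docstring might look like this:

```python
def get_invoice_total(invoice_id: str, include_tax: bool = True) -> float:
    """Calculate the total amount for a single invoice.

    Use this tool when the user asks for the total of a specific
    invoice. It does not list invoices or modify them.

    Args:
        invoice_id: Unique identifier of the invoice, for example "INV-1042".
        include_tax: Whether to include tax in the returned total.

    Returns:
        The invoice total as a float, in the invoice's currency.
    """
    # Illustrative stub: a real tool would query a billing system here.
    subtotal = 100.0
    return subtotal * 1.07 if include_tax else subtotal


print(get_invoice_total("INV-1042", include_tax=False))  # → 100.0
```

The first line gives the overview, the body clarifies when the tool should and should not be used, and the Args and Returns sections document the inputs and outputs the agent needs for routing.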
Designing agents and tools for best performance
When designing agents and tools, aim for balanced complexity. Components that are too simple may lack utility, while overly complex designs can reduce the model's ability to reason effectively.

Guidelines for Agents
- Agents using Llama-based LLMs perform best with 10 or fewer tools or collaborators.
- For complex use cases requiring many tools, break the problem into smaller subproblems and assign them to collaborator agents.
- This limit may vary for more powerful models.
Guidelines for Tools
- Keep input and output schemas as simple as possible.
- Avoid tools with:
  - A large number of input parameters.
  - Parameters with deeply nested or complex data types.
- Complex schemas make it harder for the LLM to use the tool effectively.
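To illustrate the guideline above, here is a hypothetical contrast between a nested schema that is hard for an LLM to fill in and a flat one that is easier; the function and field names are made up for the example:

```python
from typing import TypedDict


# Harder for an LLM: one deeply nested parameter object the model
# must construct correctly in a single shot.
class Address(TypedDict):
    street: str
    city: str
    country: str


class CustomerFilter(TypedDict):
    name: str
    address: Address  # nested structure increases the chance of malformed calls


def search_customers_complex(customer_filter: CustomerFilter) -> list:
    """Search customers using a nested filter object (harder for the LLM)."""
    return [customer_filter["name"]]


# Easier for an LLM: a few flat, clearly named parameters.
def search_customers_simple(name: str, city: str = "", country: str = "") -> list:
    """Search customers by name, optionally narrowed by city and country.

    Args:
        name: Full or partial customer name.
        city: Optional city to narrow the search.
        country: Optional country to narrow the search.

    Returns:
        A list of matching customer names.
    """
    return [name]


print(search_customers_simple("Ada Lovelace", city="London"))  # → ['Ada Lovelace']
```

Both stubs return the same result, but the flat signature gives the model independently documented parameters it can populate one at a time, rather than one complex object it must assemble correctly.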

