Why descriptions and instructions matter
Descriptions and instructions are not just metadata; they directly influence orchestration:

- Supervisor agents rely on descriptions to route tasks to the right collaborator.
- Instructions guide the LLM's reasoning, ensuring the agent uses its tools and collaborators effectively.
Naming agents and tools
The name uniquely identifies the agent in the system and UI. Assign clear and unique names to each agent and tool you create. These names should reflect their capabilities and functions, making it easier for users to identify them within watsonx Orchestrate. To write effective names:

- Use snake_case instead of camelCase; it usually works better with routing agents. Agent names must not include spaces or special characters.
- Keep names short and descriptive.
- Avoid generic terms like "helper" or "assistant".
- Use domain-specific language (for example, sales_outreach_agent).
- Good: ibm_historical_knowledge_agent
- Bad: myAgent1
Writing instructions for agents
Instructions define how an agent behaves, including its tone, reasoning style, and decision-making process. They guide the underlying language model to produce consistent, predictable outputs and determine how the agent uses tools and collaborators.

- They act as the agent's persona and operating manual.
- They ensure the agent responds in a way that aligns with user expectations and organizational standards.
- They influence orchestration by determining when and how the agent invokes tools or other agents.
Best Practices for Writing Instructions

- Use natural language
  - Write instructions as clear, conversational directives.
  - Avoid overly technical or ambiguous phrasing.
- Define tone and style
  - Specify whether responses should be professional, friendly, concise, etc.
- Include tool usage rules
  - Tell the agent when and how to use specific tools.
- Set error-handling guidance
  - Define what to do if required information is unavailable.
- Avoid overloading instructions
  - Keep them focused on behavior and decision-making, not on task-specific details.
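As an illustrative sketch of these practices combined (the team name and tool are hypothetical placeholders, not part of the product), an instruction set might look like this:

```text
You are a customer support agent for the billing team.
Respond in a professional, concise tone.
Use the invoice lookup tool only when the user asks about a specific invoice; do not guess invoice details.
If required information, such as an account number, is missing, ask one clarifying question instead of guessing.
Keep instructions focused on behavior and decision-making; do not enumerate task-specific edge cases here.
```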
Choosing your LLM
The model you choose directly influences how an agent behaves and interprets instructions. Different models excel at different types of tasks, and many are trained with specialized capabilities. For example, some models include vision features, meaning they can process images, detect patterns, and answer questions about visual input. Other models are optimized specifically for code generation and may not perform as well in general-purpose tasks. To understand the full capabilities, limitations, and intended use cases of a model, always refer to the model provider's documentation, typically found on platforms such as Hugging Face or watsonx.ai. watsonx Orchestrate offers a variety of models you can select from when creating agents, each suited for different workflows and business needs. You can list all supported models from the watsonx Orchestrate ADK command line.

The default model (groq/openai/gpt-oss-120b)
GPT-OSS-120b is OpenAI's most powerful open-weight large language model, released under the Apache 2.0 license, making it fully suitable for commercial use, customization, and enterprise deployment. It contains approximately 117 billion parameters in a Mixture of Experts (MoE) architecture, which activates only a subset of parameters during inference, enabling strong performance with manageable compute requirements.
Within watsonx Orchestrate, GPT-OSS-120b has recently become the default model for building agents, leveraging its strong reasoning capabilities and reliable tool-use behavior. IBM hosts the model on watsonx.ai, enabling secure enterprise usage, integration with Orchestrate agents, and optional high-speed inference through providers such as Groq.
Here are a few features of this model:
- Advanced Reasoning & Chain-of-Thought: GPT-OSS-120b supports full chain-of-thought reasoning, making it effective for complex instructions, multi-step logic, and orchestrator-style agents that need to autonomously decide which tools or actions to take.
- Strong Agentic Capabilities: The model natively supports:
  - Function/tool calling
  - Structured output formats
  - Web browsing & Python execution (where supported)
  - Interacting with multi-tool agent workflows
- Large Context Window: GPT-OSS-120b offers a roughly 128k-token (131,072, depending on provider) context window, enabling it to process lengthy documents, logs, or multi-turn interactions without losing context.
- Fine-Tuning & Customization: Because it is fully open-weight and under Apache 2.0, the model can be:
  - Fine-tuned
  - Self-hosted
  - Customized with enterprise data
  - Integrated into internal workflows
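To briefly illustrate what function/tool calling and structured output mean in practice: a tool is advertised to the model as a name, description, and JSON-schema parameters, and the model emits a structured call rather than free text. A minimal sketch follows; the tool name and fields are hypothetical, not a real watsonx Orchestrate API.

```python
import json

# A hypothetical tool definition in the common JSON-schema style used
# for function/tool calling. The model sees this schema and, when
# appropriate, returns a structured call instead of prose.
weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city. Use only when the user asks about weather.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, for example 'Austin'."},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}

# A structured call the model might emit for:
# "What's the weather in Austin in fahrenheit?"
model_call = json.dumps(
    {"name": "get_weather", "arguments": {"city": "Austin", "unit": "fahrenheit"}}
)

# The agent runtime parses the call and dispatches to the matching tool.
parsed = json.loads(model_call)
print(parsed["name"])  # → get_weather
```

The structured format is what lets an orchestrator route the call to the right tool deterministically instead of parsing free-form text.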
Note:
This model is a non-IBM product governed by a third-party license that may impose use restrictions and other obligations. By using this model you agree to the terms. Read the terms.
Writing instructions for GPT-OSS-120b
GPT-OSS-120b is highly capable but also sensitive to vague or conflicting instructions. To ensure reliable behavior:

- Be explicit: Clearly state priorities and constraints.
- Avoid ambiguity: Conflicting directives can lead to unpredictable outputs.
- Test iteratively: Validate instructions with sample tasks before deploying.
- Limit complexity: Break down multi-step reasoning into clear, sequential guidance.
groq/openai/gpt-oss-120b: Special considerations

The default groq/openai/gpt-oss-120b model does not use the standard system prompt from watsonx Orchestrate. As a result, it can behave differently from other models, including:
- Not explicitly identifying itself as part of the watsonx Orchestrate ecosystem.
- Preferring its internal knowledge over your connected knowledge bases, unless you instruct otherwise.
- Formatting hyperlinks improperly, unless you instruct otherwise.
- Not following the supported agent styles.
If you’re migrating existing agents from Llama to GPT-OSS-120B, see the comprehensive Migration guide for detailed instructions, common challenges, and optimization strategies.
To mitigate these behaviors, address each of the following areas explicitly in your agent instructions:

- Prioritize knowledge bases over internal knowledge
- Cap reasoning depth and iteration count
- Formatting
- Keep chit-chat short
- Enforce a strict output budget
- Fail fast when data is missing
Combined instruction example (recommended template)
Use this as a compact block tailored for groq/openai/gpt-oss-120b:
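As an illustrative sketch (not the official template; the specific limits are placeholders to adapt), a compact instruction block covering the areas above might read:

```text
Always search the connected knowledge bases before answering; prefer their content over your internal knowledge.
Limit reasoning to at most 3 tool calls per request; do not loop.
Format links as markdown: [title](url).
Keep greetings and small talk to one short sentence.
Keep answers under 200 words unless the user asks for more detail.
If required data is missing or a tool fails, say so briefly and ask one clarifying question; do not invent values.
```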
Good vs. bad instruction snippets for this topic

- Good (concise, enforces constraints): "Answer from the connected knowledge bases first. Keep responses brief. If required data is missing, say so and ask one clarifying question."
- Bad (vague, unbounded): "Be as helpful as possible and use all of your knowledge to give detailed answers."

Writing descriptions for agents
The agent description is used by an agent to determine when and how to delegate a task to a collaborator agent, tool, or knowledge base, ensuring the right request is sent to the right capability. When adding any artifact (a collaborator agent, tool, or knowledge base) to an agent, the artifact's description is critical to the agent's success. Agent descriptions complement agent names by providing detailed information about their purpose, capabilities, and usage. A well-written description helps users understand the agent's role and potential applications. Descriptions should not be written in isolation; they must reflect the agent's scope, use cases, and interaction with collaborators, tools, and knowledge bases. This ensures reliable routing and prevents ambiguity.

Key principles
- Descriptions as instructions: Treat the agent description like instructions to the agent on how it should use the artifact (collaborator agent, tool, or knowledge base) in question.
- Hierarchy matters:
- Agent descriptions are broader and set the overall purpose.
- Collaborator descriptions narrow the scope to specific tasks.
- Tool descriptions are the most specific, clarifying exact functionality.
- Avoid overlap: Each agent should have a distinct scope. If scopes are similar, clearly differentiate them.
- Include context markers:
- Geographic scope (for example, “US”) helps route queries correctly.
- Domain-specific language (for example, “pet-parents and their fur-babies”) sets tone and aligns with instructions.
- Don’t overload descriptions: Mention core capabilities, not every tool. The agent should infer tool usage from context.
- Define trigger conditions: Clarify when the agent should use the artifact, such as “Use this when the user requests supplier risk evaluation or mentions creditworthiness.”
- Specify actions: Be specific about the actions the artifact should perform. When possible, include how the artifact should perform those actions, such as "Access loan data from loan APIs, calculate affordability, and summarize affordability differences."
- Define restrictions: Note desired restrictions and limitations. Most artifacts have a boundary they should stop at, such as "Do not give legal or tax advice; only provide estimates and data-driven comparisons."
- Use appropriate jargon: Define any industry-specific terms or acronyms so they won't be misunderstood.
Description examples
HR agent description
Supplier Analysis tool routing description
Finance agent routing description
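As an illustrative sketch of a routing description that applies the key principles above (the tool and its data sources are hypothetical), a supplier analysis tool might be described as:

```text
Use this tool when the user requests supplier risk evaluation or mentions creditworthiness.
It accesses supplier financial data, calculates risk scores, and summarizes credit exposure.
Scope: US suppliers only. Do not give legal or tax advice; provide only estimates and data-driven comparisons.
```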
Example: agent and collaborator description for a pet store
The following example demonstrates how to write clear, context-rich descriptions for an agent and its collaborators, tools, and knowledge bases. This example also shows how tone and language can align with the agent's persona, which should be reinforced in the instructions for a consistent user experience.

Agent description: USPetTreeAgent
MatchMakerAgent
- SearchPetTreeCats: Search the PetTree registry for cats available for adoption.
- SearchPetTreeDogs: Search the PetTree registry for dogs available for adoption.
- GetPetTreeAnimalProfile: Get the full profile of an individual cat or dog.
- ViewAnimalPhotos: View pictures of cats and dogs available for adoption.
- FileAdoptionApplication: File a pet adoption application for a specific cat or dog.
- Cat Breeds Care: Facts, personalities, and care guides for different cat breeds.
- Dog Breeds Care: Facts, personalities, and care guides for different dog breeds.
Writing descriptions for tools
A good tool description helps agents identify and use the tool effectively. It should include a general overview of the tool's purpose, as well as details about its inputs and outputs. In Python tools, descriptions are defined in docstrings, following the Google docstring style.
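As an illustrative sketch (the function and its data are hypothetical; a real tool would query a live system), a Python tool with a Google-style docstring might look like this:

```python
def get_invoice_total(invoice_id: str, include_tax: bool = True) -> float:
    """Calculate the total amount for a single invoice.

    Use this tool when the user asks for the total of a specific
    invoice. It does not list invoices or modify them.

    Args:
        invoice_id: Unique identifier of the invoice, for example "INV-1042".
        include_tax: Whether to include tax in the returned total.

    Returns:
        The invoice total as a float, in the invoice's currency.
    """
    # Illustrative stub: a real tool would query a billing system here.
    subtotal = 100.0
    return subtotal * 1.07 if include_tax else subtotal


print(get_invoice_total("INV-1042", include_tax=False))  # → 100.0
```

The first line gives the overview, the body clarifies when the tool should and should not be used, and the Args and Returns sections document the inputs and outputs the agent needs for routing.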
Designing agents and tools for best performance
When designing agents and tools, aim for balanced complexity. Components that are too simple may lack utility, while overly complex designs can reduce the model's ability to reason effectively.

Guidelines for Agents
- Agents using Llama-based LLMs perform best with 10 or fewer tools or collaborators.
- For complex use cases requiring many tools, break the problem into smaller subproblems and assign them to collaborator agents.
- This limit may vary for more powerful models.
Guidelines for Tools
- Keep input and output schemas as simple as possible.
- Avoid tools with:
  - A large number of input parameters.
  - Parameters with deeply nested or complex data types.
- Complex schemas make it harder for the LLM to use the tool effectively.
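To illustrate the guideline above, here is a hypothetical contrast between a nested schema that is hard for an LLM to fill in and a flat one that is easier; the function and field names are made up for the example:

```python
from typing import TypedDict


# Harder for an LLM: one deeply nested parameter object the model
# must construct correctly in a single shot.
class Address(TypedDict):
    street: str
    city: str
    country: str


class CustomerFilter(TypedDict):
    name: str
    address: Address  # nested structure increases the chance of malformed calls


def search_customers_complex(customer_filter: CustomerFilter) -> list:
    """Search customers using a nested filter object (harder for the LLM)."""
    return [customer_filter["name"]]


# Easier for an LLM: a few flat, clearly named parameters.
def search_customers_simple(name: str, city: str = "", country: str = "") -> list:
    """Search customers by name, optionally narrowed by city and country.

    Args:
        name: Full or partial customer name.
        city: Optional city to narrow the search.
        country: Optional country to narrow the search.

    Returns:
        A list of matching customer names.
    """
    return [name]


print(search_customers_simple("Ada Lovelace", city="London"))  # → ['Ada Lovelace']
```

Both stubs return the same result, but the flat signature gives the model independently documented parameters it can populate one at a time, rather than one complex object it must assemble correctly.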

