This guide describes how to migrate your agents from llama-3.2-90B-vision-instruct to gpt-oss-120b. Based on extensive testing across 256 production agents, GPT-OSS-120B delivers superior performance in response time, error handling, conversational quality, and user experience when properly configured.
Why migrate to GPT-OSS-120B
GPT-OSS-120B offers significant advantages over Llama models:
- Faster response times across all scenarios
- Superior error handling with intelligent recovery and detailed explanations
- Enhanced conversational efficiency through single-turn parameter collection
- Better memory and context retention across multi-turn dialogues
- Improved safety with robust out-of-scope request handling
- More natural interactions that feel less robotic
Before you begin
Prerequisites
- Access to watsonx Orchestrate (SaaS or Developer Edition)
- Existing agents built with Llama models
- Familiarity with agent configuration and prompt engineering
Understanding the differences
GPT-OSS-120B behaves differently from Llama models in several key ways:

| Aspect | Llama 3.2 90B | GPT-OSS-120B |
|---|---|---|
| System prompt | Uses standard watsonx Orchestrate system prompt | Does not use standard system prompt |
| Knowledge preference | Balanced between internal and external knowledge | Prefers internal knowledge unless instructed otherwise |
| Parameter collection | Turn-by-turn questioning | Single-turn collection of multiple parameters |
| Error communication | Generic error messages | Context-aware, detailed explanations |
| Instruction following | Flexible interpretation | Literal, precise following of examples |
| Reasoning approach | Concise execution | More exploratory with additional tool calls |
Migration process
Review your agent configuration
Export your existing Llama-based agent to review its configuration. Document the following elements:
- Agent instructions and tone
- Tool usage patterns
- Knowledge base dependencies
- Expected user interaction flows
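As a point of reference, an exported native-agent definition looks roughly like the sketch below. The field names follow the ADK-style YAML layout and should be treated as assumptions that may differ in your version; the agent and tool names are illustrative.

```yaml
spec_version: v1
kind: native
name: hr_assistant            # illustrative name
llm: watsonx/meta-llama/llama-3-2-90b-vision-instruct   # the field to change during migration
description: Answers HR questions and routes time-off requests.
instructions: |
  You are an HR assistant...
tools:
  - get_timeoff_balance       # illustrative tool
collaborators: []
```

Reviewing the `instructions`, `tools`, and `collaborators` sections of this file gives you the elements listed above in one place.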
Optimize agent instructions
GPT-OSS-120B requires explicit, model-specific instructions. Add the following blocks to your agent’s instructions based on your needs:
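A hedged sketch of what such instruction blocks might look like (the wording and the angle-bracket names are illustrative, not official template text):

```
You are an assistant for <domain> tasks.

Knowledge:
- ALWAYS search the attached knowledge bases before answering.
- Do NOT answer from internal knowledge when a knowledge base covers the topic.

Tools:
- Use the <tool_name> tool when the user asks about <use case>.
- Do NOT call tools that are unrelated to the user's request.

Routing:
- Call the <agent_name> agent to handle <domain> requests.
- Do NOT ask the user for parameters before calling it; collect any missing
  details in a single turn.

Scope:
- If a request is out of scope, say so briefly and state what you can help with.
```

Each block maps to one of the optimizations described in this step; keep only the ones your agent needs.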
Essential instruction template
Build your instructions from a comprehensive template that covers the areas below.
Prioritize knowledge bases (if applicable)
If your agent uses knowledge bases, add an explicit instruction to prefer them over the model's internal knowledge.
Optimize tool usage
For agents with multiple tools, provide clear guidance on when each tool applies.
Control agent routing (for supervisor agents)
If your agent delegates to other agents, use explicit action verbs such as “Call” or “Use” rather than implicit verbs such as “delegate” or “route”.
Remove problematic patterns
GPT-OSS-120B can be constrained by overly specific examples. Review and update your instructions:
✅ Prefer generic patterns with explicit action verbs, for example: “Call the HR agent for employee-related requests.”
❌ Avoid overly specific examples that enumerate exact values or exact phrasings; the model follows them literally.
Test and validate
After updating your agent configuration:
- Test basic interactions: Verify that the agent responds correctly to simple requests.
- Test tool calling: Verify that tools are called correctly with appropriate parameters.
- Test error scenarios: Ensure error messages are clear and recovery is intelligent.
- Test multi-turn conversations: Confirm context retention across multiple exchanges.
- Test edge cases: Validate behavior with incomplete information, out-of-scope requests, and ambiguous queries.
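The validation steps above can be scripted as a lightweight regression harness. The sketch below is a minimal Python version; `invoke_agent` is a hypothetical stand-in for however you call your migrated agent (CLI, SDK, or HTTP API), stubbed here so the harness runs standalone.

```python
# Minimal regression harness for the validation steps above.
# `invoke_agent` is a hypothetical placeholder; replace the stub body
# with a real call to your migrated agent.

def invoke_agent(message, history=None):
    # Stub so the harness runs standalone; swap in a real agent call.
    return f"stub reply to: {message}"

TEST_CASES = [
    ("basic", "Hello, what can you help me with?"),
    ("tool calling", "Show my open tickets"),           # expect a correct tool call
    ("error scenario", "Show ticket #does-not-exist"),  # expect a clear error message
    ("edge case", "asdfgh"),                            # ambiguous / out-of-scope query
]

def run_suite():
    results = {}
    history = []
    for category, message in TEST_CASES:
        reply = invoke_agent(message, history)
        history.append((message, reply))  # accumulate for multi-turn context checks
        results[category] = bool(reply and reply.strip())
    return results

if __name__ == "__main__":
    for category, ok in run_suite().items():
        print(f"{category}: {'PASS' if ok else 'FAIL'}")
```

In practice you would replace the simple non-empty-reply check with assertions specific to each category (e.g., that the error reply names the missing ticket).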
Common migration challenges
Challenge 1: Over-reliance on internal knowledge
Symptom: Agent provides answers from its training data instead of using tools or knowledge bases.
Solution: Add explicit knowledge prioritization instructions (see Prioritize knowledge bases).
Challenge 2: Excessive parameter collection
Symptom: Agent asks for parameters before routing to specialized agents or calling tools.
Solution: Use strong negations in instructions, for example: “Do NOT ask the user for parameters before calling the specialized agent.”
Challenge 3: Literal example following
Symptom: Agent only handles scenarios exactly as shown in examples.
Solution: Remove specific examples and use generic patterns instead.
Challenge 4: Tool call precision issues
Symptom: Agent makes irrelevant tool calls or misses required calls.
Solution:
- Improve tool descriptions with clear use cases
- Add explicit tool usage rules in instructions
- Test iteratively and refine based on results
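To illustrate the first point, a tool description can name its use case and its non-use cases explicitly. The sketch below uses a docstring as the description; the function name, fields, and data are illustrative, and you would register the tool with your tool framework as usual.

```python
# A tool whose docstring doubles as its description. Everything here is
# illustrative; a real tool would call an HR system instead of returning
# static data.

def get_timeoff_balance(employee_id: str) -> dict:
    """Get the remaining paid-time-off balance for one employee.

    Use this tool ONLY when the user asks how much time off they have left.
    Do NOT use it to request or schedule time off.
    """
    # Illustrative static response.
    return {"employee_id": employee_id, "days_remaining": 12}
```

The “Use this tool ONLY when…” / “Do NOT use it to…” pattern gives the model an explicit boundary, which reduces irrelevant calls.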
Challenge 5: Agent routing confusion
Symptom: Agent returns JSON instead of executing agent calls.
Solution: Change implicit instructions to explicit ones. For example, replace “Route HR questions to the HR agent” with “Call the HR agent to answer HR questions.”
Prompt engineering best practices
DO:
- ✅ Use explicit action verbs (“Call”, “Execute”, “Use”)
- ✅ Provide strong negations for unwanted behaviors
- ✅ Keep examples generic and minimal
- ✅ Trust the model’s reasoning capabilities
- ✅ Leverage single-turn parameter collection
- ✅ Test iteratively with real scenarios
DON’T:
- ❌ Rely on implicit instructions (“delegate”, “route”)
- ❌ Provide overly specific examples that constrain behavior
- ❌ List exhaustive value options (the agent may limit itself to them)
- ❌ Use weak negations for critical constraints
- ❌ Force turn-by-turn parameter collection
- ❌ Over-constrain conversational patterns
Performance optimization tips
Reduce response latency
Keep instructions concise and constrain unnecessary exploratory tool calls, so the model does not spend extra turns reasoning before it answers.
Improve conversational flow
Lean on single-turn parameter collection instead of forcing turn-by-turn questioning.
Enhance error recovery
Instruct the agent to explain what went wrong and offer a concrete next step instead of returning a generic error message.
Validation checklist
Before considering your migration complete, verify:
- Agent uses groq/openai/gpt-oss-120b as the LLM
- Instructions include model-specific optimizations
- Knowledge base prioritization is configured (if applicable)
- Tool usage rules are explicit and clear
- Agent routing uses explicit action verbs (if applicable)
- Overly specific examples have been removed
- Basic interactions work as expected
- Tool calling is accurate and efficient
- Error handling is clear and helpful
- Multi-turn conversations maintain context
- Edge cases are handled gracefully
Troubleshooting
Agent is too verbose
Add output constraints to your instructions, such as a length limit or an explicit “be concise” rule.
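A sketch of such a constraint (wording is illustrative):

```
Keep answers under 3 sentences unless the user asks for detail.
Do NOT repeat the user's question back to them.
```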
Agent ignores knowledge bases
Strengthen the knowledge prioritization instruction so the model is told explicitly to prefer the knowledge base over its internal knowledge.
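An illustrative phrasing:

```
ALWAYS search the attached knowledge base first.
Do NOT answer from internal knowledge when the knowledge base covers the topic.
```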
Agent makes too many tool calls
Add reasoning constraints that limit exploratory tool calls.
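For example (illustrative wording):

```
Call a tool only when its result is needed for the answer.
Do NOT call tools to explore; call at most one tool per request
unless a second call is strictly required.
```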
Agent doesn't follow instructions
Ensure instructions are:
- Explicit and unambiguous
- Free of conflicting directives
- Written in strong, clear language
- Tested against multiple scenarios
Next steps
Managing agents
Learn how to update, export, and manage your migrated agents.
Agent descriptions and instructions
Deep dive into writing effective instructions for GPT-OSS-120B.
Managing custom LLMs
Explore advanced LLM configuration options.
Model policies
Set up fallback policies and load balancing for production resilience.

