You can use third-party models from a wide range of supported providers through the AI gateway. The gateway also lets you define policies that route requests between multiple models for use cases such as load balancing and fallback.

Supported providers

Provider        Provider ID
OpenAI          openai
Azure OpenAI    azure-openai
AWS Bedrock     bedrock
Anthropic       anthropic
Google          google
watsonx.ai      watsonx
Mistral         mistral-ai
OpenRouter      openrouter
Ollama          ollama

Configuring custom LLMs

Provider configuration

To add a custom LLM, you must provide a JSON string with the provider configuration, like the following example:
JSON
{
  "custom_host": "https://example.com/v1/api",
  "request_timeout": 500,
}
Each provider supports different JSON string schemas. The following sections detail the supported schema for each provider. Most providers require an api_key value to authenticate with the LLM service. Although you can use this value in the JSON string configuration, the safest way to store secret values, such as API keys, is to use a connection:
BASH
orchestrate connections add -a my_creds
orchestrate connections configure -a my_creds --env draft -k key_value -t team
orchestrate connections set-credentials -a my_creds --env draft -e "api_key=my_api_key"
The following sections include the supported values for each provider in the provider configuration.

OpenAI

  • api_key (Required)
  • custom_host
  • url_to_fetch
  • forward_headers
  • request_timeout
  • transform_to_form_data

Azure OpenAI

  • api_key (Required)
  • azure_resource_name (Required)
  • azure_deployment_id (Required)
  • azure_api_version (Required)
  • azure_model_name (Required)
  • custom_host
  • url_to_fetch
  • forward_headers
  • request_timeout
  • transform_to_form_data
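As a sketch, an Azure OpenAI configuration that supplies only the required fields might look like this; every value is a placeholder that you replace with the details of your Azure resource and deployment:
JSON
{
  "api_key": "<your_azure_openai_key>",
  "azure_resource_name": "<your_resource_name>",
  "azure_deployment_id": "<your_deployment_id>",
  "azure_api_version": "<api_version>",
  "azure_model_name": "<deployed_model_name>"
}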

AWS Bedrock

  • api_key (Required)
  • aws_secret_access_key (Required)
  • aws_access_key_id (Required)
  • custom_host
  • url_to_fetch
  • forward_headers
  • request_timeout
  • transform_to_form_data
  • aws_session_token
  • aws_region
  • aws_auth_type
  • aws_role_arn
  • aws_external_id
  • aws_s3_bucket
  • aws_s3_object_key
  • aws_bedrock_model
  • aws_server_side_encryption
  • aws_server_side_encryption_kms_key_id
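For example, a minimal Bedrock configuration with static credentials might look like the following sketch; all values are placeholders, and whether you also need the role, session, or S3 fields depends on your AWS setup:
JSON
{
  "api_key": "<placeholder_api_key>",
  "aws_access_key_id": "<your_access_key_id>",
  "aws_secret_access_key": "<your_secret_access_key>",
  "aws_region": "us-east-1"
}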

Anthropic

  • api_key (Required)
  • anthropic_beta
  • anthropic_version
  • custom_host
  • url_to_fetch
  • forward_headers
  • request_timeout
  • transform_to_form_data

Google

  • api_key (Required)
  • custom_host
  • url_to_fetch
  • forward_headers
  • request_timeout
  • transform_to_form_data

watsonx.ai

  • api_key (Required)
  • One of the following is required (you don’t need all three):
    • watsonx_space_id
    • watsonx_project_id
    • watsonx_deployment_id
  • watsonx_cpd_url (Required in on-premises environments)
  • watsonx_cpd_username (Required in on-premises environments)
  • watsonx_cpd_password (Required in on-premises environments)
  • watsonx_version
  • custom_host
  • url_to_fetch
  • forward_headers
  • request_timeout
  • transform_to_form_data
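For example, a SaaS configuration that scopes the model to a deployment space might look like the following sketch; both values are placeholders:
JSON
{
  "api_key": "<your_api_key>",
  "watsonx_space_id": "<your_space_id>"
}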

Mistral

  • api_key (Required)
  • mistral_fim_completion
  • custom_host
  • url_to_fetch
  • forward_headers
  • request_timeout
  • transform_to_form_data

OpenRouter

  • api_key (Required)
  • custom_host
  • url_to_fetch
  • forward_headers
  • request_timeout
  • transform_to_form_data

Ollama

  • api_key
  • custom_host
  • url_to_fetch
  • forward_headers
  • request_timeout
  • transform_to_form_data
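Because Ollama typically runs locally and does not require an API key, a minimal configuration might only point the gateway at your Ollama server. The host below assumes Ollama’s default local port and is only an example:
JSON
{
  "custom_host": "http://localhost:11434"
}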

Adding a custom LLM

Run the orchestrate models add command to add a custom LLM to your active environment.
BASH
orchestrate models add --name watsonx/meta-llama/llama-3-2-90b-vision-instruct --app-id watsonx_ai_creds
Arguments:
  • --name (-n): The name of the model that you want to add. This name must follow the pattern <provider>/<model_name>, where provider is exactly as listed in the Supported providers section and model_name is exactly the same as the name that appears in the provider’s API documentation.
  • --description (-d): An optional description to appear alongside the model in the list view.
  • --display-name: An optional display name for the model in the UI.
  • --provider-config: A JSON string of configuration options. These options can also be provided through the connection referenced in --app-id, which is the recommended place for secret values. You can use --provider-config alongside --app-id to provide the remaining non-secret values, as shown in the example after this list.
  • --type: The type of model that is being created. These are the supported types:
    • chat: Model that supports chat capabilities.
    • chat_vision: Model that supports chat and image capabilities.
    • completion: Model used for completion engines.
    • embedding: Embedding model used for transforming data.
  • --app-id (-a): The app ID of a key_value connection containing provider configuration details. These will be merged with the values provided in --provider-config.
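For example, the following sketch keeps the API key in a connection and passes a non-secret option inline; the model name, connection ID, and timeout value are illustrative only:
BASH
orchestrate models add \
  --name openai/gpt-4o \
  --app-id openai_creds \
  --provider-config '{"request_timeout": 300}' \
  --type chat \
  --description "OpenAI chat model routed through the AI gateway"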

Registering a watsonx model by using your watsonx credentials

You can also register a watsonx model that uses your watsonx credentials supplied in your .env file when you start the watsonx Orchestrate Developer Edition. For that, your .env file must contain either:
  • Your watsonx.ai credentials with the WATSONX_APIKEY and WATSONX_SPACE_ID environment variables.
  • Or, your watsonx Orchestrate credentials with the WO_INSTANCE and WO_API_KEY environment variables.
To learn how to configure your .env file with these credentials, see Installing the watsonx Orchestrate Developer Edition. To register the watsonx model through this method, you must create an api_key credential with the value “gateway”. You also don’t need to specify a space_id when you add the model. See the following example:
BASH
orchestrate connections configure -a wx_gw_creds --env draft -k key_value -t team
orchestrate connections set-credentials -a wx_gw_creds --env draft -e "api_key=gateway" 
orchestrate models add --name "watsonx/meta-llama/llama-3-2-90b-vision-instruct"  --app-id wx_gw_creds 

Importing models

If you want more control over your models and the ability to version-control the model configuration, consider using the orchestrate models import command.
BASH
orchestrate models import --file path_to_my_spec --app-id watsonx_ai_creds
Where the spec file follows this structure:
YAML
spec_version: v1
kind: model
name: watsonx/meta-llama/llama-3-2-90b-vision-instruct
display_name: Llama 3.2 Vision Instruct #Optional
description: Meta's Llama 3.2 Vision Instruct with 90b parameters running on WatsonX AI #Optional
tags: #Optional
  - meta
  - llama
model_type: chat #Optional. Defaults to "chat"
provider_config:
  watsonx_space_id: my_wxai_space_id
Arguments:
  • --file (-f): File path of the spec file containing the model configuration
  • --app-id (-a): The app id of a key_value connection containing provider configuration details. These will be merged with the values provided in the provider_config section of the spec.

List all LLMs

Run the orchestrate models list command to see all available LLMs in your active environment.
BASH
orchestrate models list
Note: By default, you’ll see a table of available models. If you prefer raw output, add the --raw (-r) argument.

Removing custom LLMs

Run the orchestrate models remove command and use the --name (-n) argument to specify the LLM you want to remove.
BASH
orchestrate models remove -n <model-name-unique-identifier-to-delete>

Updating a custom LLM

To update a custom LLM, first remove it, then add it again:
BASH
orchestrate models remove -n <model-name-unique-identifier-to-delete>
orchestrate models add --name watsonx/meta-llama/llama-3-2-90b-vision-instruct --app-id watsonx_ai_creds

Configuring model policies

Model policies coordinate multiple models to accomplish tasks such as load balancing and fallback.

Adding model policies

BASH
orchestrate models policy add --name <model_name> --model <provider1>/<model_id1> --model <provider2>/<model_id2> --strategy <strategy_type> --strategy-on-code 500 --retry-on-code 503 --retry-attempts 3
Arguments:
  • --name (-n): The name of the policy you want to add.
  • --description (-d): An optional description to appear alongside the policy in the list view.
  • --display-name: An optional display name for the policy in the UI.
  • --strategy (-s): The policy mode you want to use.
    • loadbalance: The models operate together by distributing requests between them according to their weight values. By default, each model’s weight is 1, so the load is balanced evenly between the models. If you want to customize the weight values, see Importing model policies.
    • fallback: If one of the models is unavailable, the agent will try to use the other one as a fallback alternative.
    • single: Uses only one model, but allows for --retry-on-code and --retry-attempts.
  • --strategy-on-code: A list of HTTP error codes that trigger the strategy. Used with the fallback strategy.
  • --retry-on-code: A list of HTTP error codes for which the model should retry the request.
  • --retry-attempts: The number of retry attempts to make before stopping.
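For example, a fallback policy that prefers the watsonx model registered earlier and falls back to a hypothetical OpenAI model on server errors might look like this sketch:
BASH
orchestrate models policy add \
  --name my_fallback_policy \
  --model watsonx/meta-llama/llama-3-2-90b-vision-instruct \
  --model openai/gpt-4o \
  --strategy fallback \
  --strategy-on-code 500 \
  --retry-on-code 503 \
  --retry-attempts 3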

Importing model policies

BASH
orchestrate models policy import --file my_spec.yaml
Where the my_spec.yaml file follows this structure:
YAML
spec_version: v1
kind: model
name: anygem
description: Balances requests between 2 Gemini models
display_name: Any Gem
policy:
  strategy:
    mode: loadbalance
  retry:
    attempts: 1
    on_status_codes: [503]
  targets:
    - model_name: virtual-model/google/gemini-2.0-flash
      weight: 0.75   # Weights must be greater than 0 and less than or equal to 1  
    - model_name: virtual-model/google/gemini-2.0-flash-lite
      weight: 0.25
Arguments:
  • --file (-f): File path of the spec file containing the model policy configuration.

Update model policy

To update a model policy, run the add or import command again with the name of the policy that you want to update.

Removing model policies

BASH
orchestrate models policy remove -n <name of policy>
Arguments:
  • --name (-n): The name of the model policy that you want to remove.