Managing custom LLMs
You can use third-party models from a wide range of supported providers through the AI gateway system. The gateway also lets you establish policies that handle routing between multiple models, for use cases such as load-balancing and fallback.
Supported providers
| Provider | Provider ID |
| --- | --- |
| OpenAI | openai |
| Anthropic | anthropic |
| Google | google |
| watsonx.ai | watsonx |
| Mistral | mistral |
| OpenRouter | openrouter |
| Ollama | ollama |
Configuring custom LLMs
Provider configuration
To add a custom LLM, you must provide a JSON string with the provider configuration.
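For example, a minimal sketch of an OpenAI provider configuration, using only the fields documented below and placeholder values, might look like this:

```json
{
  "api_key": "<your-openai-api-key>",
  "custom_host": "https://my-openai-compatible-host.example.com"
}
```

Here `custom_host` is optional and is included only to show a non-secret field alongside the required `api_key`.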
Each provider supports different JSON string schemas. The following sections detail the supported schema for each provider.
Most providers require an `api_key` value to authenticate with the LLM service. Although you can include this value in the JSON string configuration, the safest way to store secret values, such as API keys, is to use a connection.
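For example, assuming you have already created a `key_value` connection (hypothetically named `my_llm_creds` here) that stores the `api_key`, you can reference it through the `--app-id` argument of the `orchestrate models add` command described later in this topic, instead of embedding the secret in the JSON string. The model name is illustrative:

```bash
# Sketch: the connection name "my_llm_creds" is hypothetical and must already
# exist as a key_value connection holding the api_key.
orchestrate models add \
  --name openai/gpt-4o \
  --type chat \
  --app-id my_llm_creds
```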
The following sections include the supported values for each provider in the provider configuration.
OpenAI
- `api_key` (Required)
- `custom_host`
- `url_to_fetch`
- `forward_headers`
- `request_timeout`
- `transform_to_form_data`
Anthropic
- `api_key` (Required)
- `anthropic_beta`
- `anthropic_version`
- `custom_host`
- `url_to_fetch`
- `forward_headers`
- `request_timeout`
- `transform_to_form_data`
Google
- `api_key` (Required)
- `custom_host`
- `url_to_fetch`
- `forward_headers`
- `request_timeout`
- `transform_to_form_data`
watsonx.ai
- `api_key` (Required)
- One of the following (you must provide either your Space ID, Project ID, or Deployment ID; you don't need all three):
  - `watsonx_space_id`
  - `watsonx_project_id`
  - `watsonx_deployment_id`
- `watsonx_cpd_url` (Required in on-premises environments)
- `watsonx_cpd_username` (Required in on-premises environments)
- `watsonx_cpd_password` (Required in on-premises environments)
- `watsonx_version`
- `custom_host`
- `url_to_fetch`
- `forward_headers`
- `request_timeout`
- `transform_to_form_data`
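As a sketch, a SaaS watsonx.ai configuration that supplies the required `api_key` and a Space ID (only one of the three IDs is needed; the values are placeholders) might look like this:

```json
{
  "api_key": "<your-watsonx-api-key>",
  "watsonx_space_id": "<your-space-id>"
}
```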
Mistral
- `api_key` (Required)
- `mistral_fim_completion`
- `custom_host`
- `url_to_fetch`
- `forward_headers`
- `request_timeout`
- `transform_to_form_data`
OpenRouter
- `api_key` (Required)
- `custom_host`
- `url_to_fetch`
- `forward_headers`
- `request_timeout`
- `transform_to_form_data`
Ollama
- `api_key`
- `custom_host`
- `url_to_fetch`
- `forward_headers`
- `request_timeout`
- `transform_to_form_data`
Adding custom LLM
Run the `orchestrate models add` command to add a custom LLM to your active environment.
Arguments:
- `--name` (`-n`): The name of the model you want to add. This name must follow the pattern `<provider>/<model_name>`. The provider must be exactly as outlined in the Supported providers section, and the `model_name` must be exactly the same as the name that appears in the provider's API documentation.
- `--description` (`-d`): An optional description to appear alongside the model in the list view.
- `--display-name`: An optional display name for the model in the UI.
- `--provider-config`: A JSON string of configuration options. These can also be provided through the connection referenced in `--app-id`, especially secret values. You can use `--provider-config` alongside an `--app-id` to provide non-required values.
- `--type`: The type of model that is being created. These are the supported types:
  - `chat`: Model that supports chat capabilities.
  - `chat_vision`: Model that supports chat and image capabilities.
  - `completion`: Model used for completion engines.
  - `embedding`: Embedding model used for transforming data.
- `--app-id` (`-a`): The app ID of a `key_value` connection containing provider configuration details. These values are merged with the values provided in `--provider-config`.
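As a sketch of the command in use, with an illustrative model name and a placeholder API key:

```bash
# Add an OpenAI chat model. The api_key passed inline here could instead be
# stored in a key_value connection and referenced with --app-id.
orchestrate models add \
  --name openai/gpt-4o \
  --type chat \
  --description "Example OpenAI chat model" \
  --provider-config '{"api_key": "<your-openai-api-key>"}'
```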
Importing models
If you want more control over your models and the ability to version-control the model configuration, consider using the `orchestrate models import` command.
Arguments:
- `--file` (`-f`): File path of the spec file containing the model configuration.
- `--app-id` (`-a`): The app ID of a `key_value` connection containing provider configuration details. These values are merged with the values provided in the `provider_config` section of the spec.
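The spec structure is not shown in this topic; the following is only a sketch that mirrors the `orchestrate models add` arguments, and every field name other than `provider_config` is an assumption to verify against your product's spec schema:

```yaml
# Illustrative sketch only: field names other than provider_config are assumed.
name: openai/gpt-4o                    # <provider>/<model_name>
description: Example OpenAI chat model
provider_config:
  api_key: <your-openai-api-key>       # better supplied through a connection (--app-id)
```

You would then run something like `orchestrate models import -f model.yaml`, optionally adding `--app-id` so that secret values come from a connection.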
List all LLMs
Run the `orchestrate models list` command to see all available LLMs in your active environment.
Note: By default, you'll see a table of available models. If you prefer raw output, add the `--raw` (`-r`) argument.
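For example:

```bash
# Default table view, then raw output
orchestrate models list
orchestrate models list --raw
```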
Removing custom LLMs
Run the `orchestrate models remove` command and use the `--name` (`-n`) argument to specify the LLM you want to remove.
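For example, with an illustrative model name:

```bash
orchestrate models remove --name openai/gpt-4o
```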
Updating custom LLM
To update a custom LLM, first remove it, then add it again. For more information, see Removing custom LLMs and Adding custom LLM.
Configuring model policies
Model policies allow for the coordination of multiple models to accomplish tasks like load-balancing and fallback.
Adding model policies
Arguments:
- `--name` (`-n`): The name of the policy you want to add.
- `--description` (`-d`): An optional description to appear alongside the policy in the list view.
- `--display-name`: An optional display name for the policy in the UI.
- `--strategy` (`-s`): The policy mode you want to use.
  - `loadbalance`: The models operate together by distributing the load of requests between them, following the distribution of weight values. By default, both weight values are set to `1`, so the load is evenly balanced between the models. If you want to customize the weight values, see Importing model policies.
  - `fallback`: If one of the models is unavailable, the agent tries to use the other one as a fallback alternative.
  - `single`: Uses only one model, but allows for `--retry-on-code` and `--retry-attempts`.
- `--strategy-on-code`: A list of HTTP error codes that trigger the strategy. Used for the `fallback` strategy.
- `--retry-on-code`: A list of HTTP error codes for which the model should retry the request.
- `--retry-attempts`: The number of attempts to make before stopping.
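As a sketch, a fallback policy across two models might be added as follows. The exact subcommand path (`orchestrate models policy add`) and the repeated `--model` flag used to list the member models are assumptions not confirmed by this topic, so check your CLI help before relying on them; the model names are placeholders:

```bash
# Sketch only: subcommand path and --model flag are assumed.
orchestrate models policy add \
  --name my-fallback-policy \
  --strategy fallback \
  --strategy-on-code 503 \
  --model openai/<model_name> \
  --model watsonx/<model_name>
```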
Importing model policies
The import command reads the model policy configuration from a spec file, for example `my_spec.yaml`.
Arguments:
- `--file` (`-f`): File path of the spec file containing the model policy configuration.
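The structure of `my_spec.yaml` is not shown in this topic; the following is only a sketch whose field names are assumptions mirroring the `add` arguments, including per-model `weight` values for the `loadbalance` strategy described above:

```yaml
# Illustrative sketch only: field names are assumed, not taken from this topic.
name: my-loadbalance-policy
display_name: My load-balanced policy
strategy: loadbalance
models:
  - name: openai/<model_name>
    weight: 2
  - name: watsonx/<model_name>
    weight: 1
```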
Update model policy
To update a model policy, use either the `add` or `import` command with the name of the model policy that you want to update.
Removing model policies
Arguments:
- `--name` (`-n`): The name of the model policy that you want to remove.