# Configuring model policies (Public preview)

Model policies allow for the coordination of multiple models to accomplish tasks like load-balancing and fallback.

<Warning>
  Model policies only support virtual models. You must [import models](./managing_llm#cli-reference) as virtual models and reference them in your policies using the `virtual-model/` prefix (e.g., `virtual-model/google/gemini-2.0-flash`). Direct references to provider models (e.g., `groq/openai/gpt-oss-120b` or `bedrock/openai.gpt-oss-120b-1:0`) are not supported.
</Warning>
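
As a quick pre-flight check, the prefix rule above can be verified before a policy is created. The following is a minimal sketch; the helper name `find_unsupported_targets` is hypothetical and is not part of the ADK.

```python
VIRTUAL_PREFIX = "virtual-model/"


def find_unsupported_targets(model_names):
    """Return the model names that lack the `virtual-model/` prefix."""
    return [name for name in model_names if not name.startswith(VIRTUAL_PREFIX)]


candidates = [
    "virtual-model/google/gemini-2.0-flash",
    "groq/openai/gpt-oss-120b",  # direct provider reference, not supported in policies
]
unsupported = find_unsupported_targets(candidates)
print(unsupported)
```

Any name the helper returns would need to be imported as a virtual model before it can be used in a policy.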

## Adding model policies

```bash BASH theme={null}
orchestrate models policy add \
  --name <policy_name> \
  --model virtual-model/<provider1>/<model_id1> \
  --model virtual-model/<provider2>/<model_id2> \
  --strategy <strategy_type> \
  --strategy-on-code 500 \
  --retry-on-code 503 \
  --retry-attempts 3
```

<Expandable title="command flags">
  <ResponseField name="--name / -n" type="string">
    The name of the policy you want to add.
  </ResponseField>

  <ResponseField name="--model" type="string">
    A virtual model to include in the policy, referenced with the `virtual-model/` prefix. Repeat the flag for each model.
  </ResponseField>

  <ResponseField name="--description / -d" type="string">
    An optional description to appear alongside the policy in the list view.
  </ResponseField>

  <ResponseField name="--display-name" type="string">
    An optional display name for the policy in the UI.
  </ResponseField>

  <ResponseField name="--strategy / -s" type="string">
    The policy mode you want to use.<br />`loadbalance`: Distributes requests between models based on weight values (default is `1` for each).<br />`fallback`: Uses another model if one becomes unavailable.<br />`single`: Uses only one model, but allows for `--retry-on-code` and `--retry-attempts`.
  </ResponseField>

  <ResponseField name="--strategy-on-code" type="list[int]">
    A list of HTTP error codes that trigger the strategy. Used with the `fallback` strategy.
  </ResponseField>

  <ResponseField name="--retry-on-code" type="list[int]">
    A list of HTTP error codes on which the request is retried.
  </ResponseField>

  <ResponseField name="--retry-attempts" type="int">
    The number of retry attempts to make before giving up.
  </ResponseField>
</Expandable>
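
To make the interaction between these flags easier to picture, here is a small Python simulation of one plausible reading of the `fallback` strategy: each model is retried on the `--retry-on-code` codes up to `--retry-attempts` times, and the next model is tried only when the final status is in `--strategy-on-code`. This is an illustrative assumption, not the actual routing implementation; `run_fallback`, `scripted`, and the `send` callables are hypothetical names.

```python
def run_fallback(targets, strategy_codes, retry_codes, retry_attempts):
    """Try each target in order and return (target_name, status).

    targets: list of (name, send) pairs, where send() returns an HTTP status.
    """
    name, status = None, None
    for name, send in targets:
        status = send()
        retries_left = retry_attempts
        # Retry the same model while it keeps returning a retryable code.
        while status in retry_codes and retries_left > 0:
            status = send()
            retries_left -= 1
        # Move to the next model only if the final status triggers the strategy.
        if status not in strategy_codes:
            return name, status
    return name, status  # every target was tried


def scripted(statuses):
    """Build a fake model that returns the given statuses, one per call."""
    remaining = iter(statuses)
    return lambda: next(remaining)


targets = [
    ("virtual-model/google/gemini-2.0-flash", scripted([503, 503, 500])),
    ("virtual-model/google/gemini-2.0-flash-lite", scripted([200])),
]
result = run_fallback(targets, strategy_codes={500}, retry_codes={503}, retry_attempts=3)
print(result)
```

In this scripted run, the first model is retried twice after returning 503, then returns 500, which triggers the fallback to the second model.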

## Importing model policies

```bash BASH theme={null}
orchestrate models policy import --file my_spec.yaml
```

<Note>
  After you create a model policy, use the ADK to assign it to your agent in the agent YAML file, the same way that you assign a regular model. You cannot assign a model policy to a Generative Prompt Activity in a flow.
</Note>

The `my_spec.yaml` file follows one of these structures, depending on the strategy:

<Tabs>
  <Tab title="Load balancing">
    ```yaml [my_spec.yaml] theme={null}
    spec_version: v1
    kind: model
    name: anygem
    description: Balances requests between 2 Gemini models
    display_name: Any Gem
    policy:
      strategy:
        mode: loadbalance
        on_status_codes: [503, 504]
      retry:
        attempts: 1
      targets:
        - model_name: virtual-model/google/gemini-2.0-flash
          weight: 0.75   # Weights must be greater than 0 and less than or equal to 1
        - model_name: virtual-model/google/gemini-2.0-flash-lite
          weight: 0.25
    ```
  </Tab>

  <Tab title="Fallback">
    ```yaml [my_spec.yaml] theme={null}
    spec_version: v1
    kind: model
    name: firstgem
    description: Use the first Gemini model that doesn't return 503
    display_name: First Gem
    policy:
      strategy:
        mode: fallback
      retry:
        attempts: 1
        on_status_codes: [503]
      targets:
        - model_name: virtual-model/google/gemini-2.0-flash
        - model_name: virtual-model/google/gemini-2.0-flash-lite
    ```
  </Tab>

  <Tab title="Single">
    ```yaml [my_spec.yaml] theme={null}
    spec_version: v1
    kind: model
    name: retrygem
    description: Gemini model that retries up to 3 times on 503
    display_name: Retry Gem
    policy:
      strategy:
        mode: single
      retry:
        attempts: 3
        on_status_codes: [503]
      targets:
        - model_name: virtual-model/google/gemini-2.0-flash
    ```
  </Tab>
</Tabs>

**Flags**:

* `--file` (`-f`): File path of the spec file containing the model policy configuration.
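
To see what the weights in the load-balancing spec mean in practice, the following sketch simulates weighted random selection over its two targets. It assumes requests are split in proportion to the weights, which is how the `loadbalance` description reads; the actual routing algorithm is not documented here, and `pick_target` is a hypothetical helper.

```python
import random
from collections import Counter

# Targets and weights copied from the load-balancing spec example.
targets = {
    "virtual-model/google/gemini-2.0-flash": 0.75,
    "virtual-model/google/gemini-2.0-flash-lite": 0.25,
}


def pick_target(weights, rng):
    """Pick one model name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


rng = random.Random(0)  # fixed seed so the simulation is repeatable
counts = Counter(pick_target(targets, rng) for _ in range(10_000))
share = counts["virtual-model/google/gemini-2.0-flash"] / 10_000
print(f"gemini-2.0-flash share: {share:.2f}")
```

Over many simulated requests, the share handled by the first target should settle close to its weight of 0.75.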

## Updating model policy

To update a model policy, run either the [`add`](#adding-model-policies) or [`import`](#importing-model-policies) command with the name of the policy that you want to update.

## Exporting model policy

```bash BASH theme={null}
orchestrate models policy export -n <policy_name> -o <path>.zip
```

<Expandable title="command flags">
  | Flag              | Type   | Required | Description                                     |
  | ----------------- | ------ | -------- | ----------------------------------------------- |
  | `--name` (`-n`)   | string | Yes      | The model policy name to export.                |
  | `--output` (`-o`) | string | Yes      | The file path where the exported data is saved. |
</Expandable>
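
Because the export is written as a `.zip` archive, its contents can be inspected with standard tooling. The sketch below lists the entries of an archive with Python's `zipfile` module; the path `anygem.zip` and the helper `list_export` are placeholders, and the layout of the files inside the archive is not specified here.

```python
import os
import zipfile


def list_export(path):
    """Return the names of all entries in an exported policy archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


export_path = "anygem.zip"  # placeholder: the path passed to --output
if os.path.exists(export_path):
    for entry in list_export(export_path):
        print(entry)
```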

## Removing model policies

```bash BASH theme={null}
orchestrate models policy remove -n <policy_name>
```

<Expandable title="command flags">
  <ResponseField name="--name / -n" type="string">
    The name of the policy you want to remove.
  </ResponseField>
</Expandable>
