Before you begin

Before you can analyze results, you must first evaluate your agent. For more information, see Evaluating agents and tools.

Analyzing

The analyze command provides a detailed breakdown of your agent evaluation results, highlighting where the agent succeeded, failed, and why.

For each dataset result in the specified directory, the command generates an overview analysis that helps you quickly identify:

  • Which tool calls were expected and made
  • Which tool calls were irrelevant or incorrect
  • Any parameter mismatches
  • A high-level summary of the agent’s performance

The analysis includes:

  • Analysis Summary: Presents key counts (expected vs. actual tool calls, irrelevant or incorrect tool use).
  • Conversation History: Step-by-step breakdown of every message exchanged, providing insight into where things went right or wrong.
  • Analysis Results: Details the specific mistakes, along with the reasoning for each error (e.g., irrelevant tool calls).

To run the analysis, use the following command:

orchestrate evaluations analyze --data-path path/to/results

Arguments:

  • --data-path: Directory where your evaluation results are saved.
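
If you start the analysis from a script, it can help to confirm that the results directory exists before invoking the command. The following is a minimal sketch, not part of the product: the helper names build_analyze_command and run_analyze are hypothetical, and the only command and flag used are the ones shown above.

```python
import subprocess
from pathlib import Path


def build_analyze_command(data_path: str) -> list[str]:
    """Return the argument list for the analyze command shown above."""
    return ["orchestrate", "evaluations", "analyze", "--data-path", data_path]


def run_analyze(data_path: str) -> int:
    """Run analyze on an existing results directory and return its exit code."""
    if not Path(data_path).is_dir():
        raise NotADirectoryError(f"results directory not found: {data_path}")
    # Output stays attached to the terminal so the panels render as usual.
    return subprocess.run(build_analyze_command(data_path)).returncode
```

Calling run_analyze("path/to/results") assumes that the orchestrate CLI is installed and available on your PATH.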

Before you run the command, enlarge your terminal window so that the output is easier to read. In smaller terminal windows, some of the information can be truncated.
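
As a rough aid, you can check the terminal width from Python before running the command. This sketch is an illustration only: the 200-column threshold is an assumption based on the width of the sample output in this section, not a documented requirement, and the function names are hypothetical.

```python
import shutil

# Assumption: roughly the width of the sample panels in this section.
SUGGESTED_COLUMNS = 200


def is_wide_enough(columns: int, suggested: int = SUGGESTED_COLUMNS) -> bool:
    """Return True if the given width meets the suggested minimum."""
    return columns >= suggested


def check_terminal_width() -> bool:
    """Report whether the current terminal meets the suggested width."""
    columns = shutil.get_terminal_size().columns
    wide_enough = is_wide_enough(columns)
    if not wide_enough:
        print(f"Terminal is {columns} columns wide; output may be truncated.")
    return wide_enough
```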

Example output

Running analyze on the evaluation results of a dataset, such as examples/evaluations/hr_sample/data_simple.json, produces output like the following:

╭────────────────────────────────────────────────────────────────────────────────────────── 📋 Analysis Summary ──────────────────────────────────────────────────────────────────────────────────────────╮
│  Test Case Name: data3                                                                                                                                                                                  │
│  Expected Tool Calls: 3                                                                                                                                                                                 │
│  Correct Tool Calls: 2                                                                                                                                                                                  │
│  Text Match: Summary Matched                                                                                                                                                                            │
│  Journey Success: False                                                                                                                                                                                 │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─────────────────────────────────────── Conversation History ───────────────────────────────────────╮╭───────────────────────────────────────── Analysis Results ──────────────────────────────────────────╮
│ 👤 User: I want to know my timeoff schedule for a certain period                                   ││ ❌ fetch_assignment_id: {"reason": "irrelevant tool call"}                                          │
│ 🤖 Assistant: What is your username?                                                               ││                                                                                                     │
│ 👤 User: nwaters                                                                                   ││ ❌ retrieve_timeoff_schedule: {"reason": "irrelevant tool call"}                                    │
│ ❌ Tool Call: {"type": "tool_call", "name": "fetch_assignment_id", "args": {"username":            ││                                                                                                     │
│ "nwaters"}, "id": "chatcmpl-tool-1363bcbc0b7b495c8eb04f10d5"}                                      ││                                                                                                     │
│ 🤖 Assistant: {"type": "tool_response", "content": "15778303", "name": "fetch_assignment_id",      ││                                                                                                     │
│ "tool_call_id": "chatcmpl-tool-1363bcbc0b7b495c8eb04f10d5"}                                        ││                                                                                                     │
│ 🤖 Assistant: What is the start date of the period you want to know your time-off schedule for?    ││                                                                                                     │
│ 👤 User: 2025-01-01                                                                                ││                                                                                                     │
│ 🤖 Assistant: What is the end date of the period you want to know your time-off schedule for?      ││                                                                                                     │
│ 👤 User: 2025-03-03                                                                                ││                                                                                                     │
│ ❌ Tool Call: {"type": "tool_call", "name": "retrieve_timeoff_schedule", "args": {"assignment_id": ││                                                                                                     │
│ "15778303", "start_date": "2025-01-01", "end_date": "2025-03-03"}, "id":                           ││                                                                                                     │
│ "chatcmpl-tool-b87205d164d84e9cb09109872a"}                                                        ││                                                                                                     │
│ 🤖 Assistant: {"type": "tool_response", "content": "[\"2025-01-05\"]", "name":                     ││                                                                                                     │
│ "retrieve_timeoff_schedule", "tool_call_id": "chatcmpl-tool-b87205d164d84e9cb09109872a"}           ││                                                                                                     │
│ 🤖 Assistant: Your time-off schedule for the period from 2025-01-01 to 2025-03-03 is on            ││                                                                                                     │
│ 2025-01-05.                                                                                        ││                                                                                                     │
╰────────────────────────────────────────────────────────────────────────────────────────────────────╯╰─────────────────────────────────────────────────────────────────────────────────────────────────────╯

  • Always verify that your API credentials are set before running analyze.
  • Use the analysis output to quickly identify patterns in agent errors and focus your improvement efforts.
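
To spot recurring error patterns across several test cases, you could tally the reasons reported in the Analysis Results panel. The sketch below is a hypothetical helper, not a feature of the analyze command. It assumes you have saved or copied the output as plain text and that each result line has the shape shown in the sample output (❌ tool_name: {"reason": "..."}); the actual output format may differ.

```python
import json
import re
from collections import Counter

# Assumed line shape, taken from the sample Analysis Results panel:
#   ❌ <tool_name>: {"reason": "<text>"}
RESULT_LINE = re.compile(r"❌\s*(?P<tool>\w+):\s*(?P<payload>\{.*\})")


def tally_reasons(lines):
    """Count how often each failure reason appears in the given lines."""
    counts = Counter()
    for line in lines:
        match = RESULT_LINE.search(line)
        if not match:
            continue  # Not an analysis result line.
        try:
            payload = json.loads(match.group("payload"))
        except json.JSONDecodeError:
            continue  # Wrapped or truncated JSON; skip it.
        counts[payload.get("reason", "unknown")] += 1
    return counts


sample = [
    '❌ fetch_assignment_id: {"reason": "irrelevant tool call"}',
    '❌ retrieve_timeoff_schedule: {"reason": "irrelevant tool call"}',
    "👤 User: nwaters",
]
print(tally_reasons(sample))
```

Conversation History lines such as "❌ Tool Call: ..." are skipped because the pattern expects a single-word tool name directly before the colon.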