Observability

Once your agent is running, you need to know how it's performing. The Observability tab on the agent details page gives you that visibility, from high-level metrics down to the exact steps the agent took in every execution.

Think of it as three layers you can drill into: the dashboard shows overall performance, sessions show individual interactions, and execution details show exactly what happened step by step.

Dashboard

The Observability tab opens with a dashboard of key metrics for the selected time period (the last 30 days by default, adjustable via the dropdown). Stats refresh automatically every 30 minutes.

Eight metric cards give you the big picture:

  • Total Sessions: How many times the agent was invoked during the selected period.
  • Total Executions: The total number of execution runs across all sessions. A single session can have multiple executions.
  • Agent Execution Latency: The average time the agent takes to complete an execution.
  • p95 Latency (Execution): The 95th percentile execution time: 95% of executions complete within this duration, and the slowest 5% exceed it. Useful for spotting performance outliers.
  • Tool Success Rate: The percentage of tool (API) calls that completed successfully.
  • Execution Success Rate: The percentage of executions that finished without errors.
  • LLM Steps Success Rate: The percentage of LLM calls that returned successfully.
  • Total Tokens In/Out: How many tokens were consumed (input) and generated (output) across all executions.
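To see why the dashboard reports both an average and a p95, here is a rough illustration (not the platform's actual calculation) of how a single slow outlier separates the two, using the nearest-rank percentile method and hypothetical latency samples:

```python
import math

def p95(latencies_ms):
    """95th percentile via the nearest-rank method."""
    ranked = sorted(latencies_ms)
    index = math.ceil(0.95 * len(ranked)) - 1  # smallest rank covering 95% of samples
    return ranked[index]

# Hypothetical sample: mostly fast executions plus one slow outlier.
latencies = [120, 130, 125, 140, 135, 128, 132, 126, 131, 900]

average = sum(latencies) / len(latencies)
print(f"average: {average:.0f} ms, p95: {p95(latencies)} ms")
```

With this sample, the average is about 207 ms while the p95 is 900 ms: the outlier barely moves the average but dominates the p95, which is exactly the gap the dashboard helps you spot.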

Session list

Below the dashboard, you'll see the Session List.

Each row represents a session and shows the Session ID, Agent Version ID, who created it, start and last action time, total tokens processed, and counts for executions, execution steps, LLM steps, and tool steps.

Click any session to open its details.

Session details

The session details panel shows the session's metadata at the top: Session ID, Agent Version ID, Start Time, and Last Action Time. Below that, summary cards show Total Tokens Processed, Total Executions, Total Execution Steps, and Total LLM Execution Steps.

The Execution List table shows every execution within that session. Each row includes the Execution ID, Execution Mode (e.g., Connection), Execution Source (e.g., Zia Agents TestBed), start and end times, time taken, and status (Success, Failed, etc.).

Click any execution to go deeper.
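The drill-down hierarchy described so far (session → executions → steps) can be sketched as plain data structures. These shapes are purely illustrative, mirroring the UI labels rather than any actual API schema:

```python
from dataclasses import dataclass, field

# Hypothetical shapes mirroring the UI's drill-down; not an actual API schema.
@dataclass
class Step:
    step_type: str    # "Request", "Prompt Generation", "LLM Execution", ...
    duration_ms: int

@dataclass
class Execution:
    execution_id: str
    status: str       # "Success", "Failed", ...
    steps: list[Step] = field(default_factory=list)

@dataclass
class Session:
    session_id: str
    executions: list[Execution] = field(default_factory=list)

# One session can contain several executions, each with its own step timeline.
session = Session("sess-1", [
    Execution("exec-1", "Success", [Step("Request", 5), Step("LLM Execution", 850)]),
])
print(len(session.executions), len(session.executions[0].steps))  # 1 2
```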

Execution details

This is where you see exactly what the agent did and how it got there. The panel shows the execution ID, session reference, timestamp, status, total duration, and three counts: Total Steps, Tool Steps, and LLM Calls.

Step Timeline

The Step Timeline is the core of execution details. It lists every step the agent took, in order, with the step type, duration, and timestamp range.

Each step is expandable. Click the dropdown arrow to see what happened inside.

The step types you'll encounter:

  • Request: The initial input that triggered the execution. Expanding this shows the exact prompt or query the agent received. For example: "Please share insights on the deal id 5738779000002537534."
  • Prompt Generation: The agent assembles the prompt it will send to the LLM, incorporating the instructions, context, and any data it has gathered so far.
  • LLM Execution: The model processes the prompt and generates a response. This step shows the time the LLM took to respond.

  • Tool Execution: The agent called an API tool. Expanding this step shows two sections: Input (the exact JSON request sent to the API, including the parameters and their values) and Output Preview (the JSON response returned by the API). This is especially useful for debugging. If a tool call failed or returned unexpected data, you can see exactly what was sent and what came back.

  • Response & Reasoning: The final step. This shows two things: the Response the agent delivered to the user, and the Step-by-Step Reasoning where the agent explains its thought process. The reasoning section walks through how the agent interpreted the request, what data it gathered, which details it prioritized, and how it arrived at its final answer.
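The Input and Output Preview sections of a Tool Execution step are plain JSON. A hypothetical pair for a deal-lookup tool might look like the following; the field names here are illustrative only, not the actual API schema (the deal ID matches the sample request shown earlier):

```python
import json

# Hypothetical Tool Execution payloads; field names are illustrative only.
tool_input = {"deal_id": "5738779000002537534", "fields": ["Stage", "Amount"]}
tool_output = {"code": "SUCCESS", "data": {"Stage": "Negotiation", "Amount": 42000}}

# When a tool call misbehaves, diff what was sent against what came back.
print("Input:\n" + json.dumps(tool_input, indent=2))
print("Output:\n" + json.dumps(tool_output, indent=2))
```

Comparing the two sides like this is usually enough to tell whether a failure came from a bad parameter in the request or an error in the response.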

Reading the timeline

A typical execution flows like this:

Request → Prompt Generation → LLM Execution → Tool Execution → Prompt Generation → LLM Execution → Tool Execution → ... → Response & Reasoning

The agent cycles between thinking (LLM) and acting (Tool) as many times as needed before producing its final response.
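That think/act cycle can be sketched as a simple loop. This is a generic agent-loop illustration, not the platform's internal implementation; the `llm` and `tools` callables and their return shapes are assumptions for the sketch:

```python
def run_execution(request, llm, tools, max_cycles=10):
    """Generic agent loop: alternate LLM 'thinking' with tool 'acting'."""
    context = [request]                          # Request step
    for _ in range(max_cycles):
        prompt = "\n".join(context)              # Prompt Generation step
        decision = llm(prompt)                   # LLM Execution step
        if decision["action"] == "respond":
            return decision["response"]          # Response & Reasoning step
        result = tools[decision["tool"]](decision["input"])  # Tool Execution step
        context.append(str(result))
    return "max cycles reached"
```

Each pass through the loop corresponds to one Prompt Generation → LLM Execution → Tool Execution group in the timeline, which is why those step types repeat before the final Response & Reasoning step.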

The duration shown on each step tells you where time is being spent. If an execution is slow, the timeline makes it easy to identify whether the bottleneck is in LLM calls, tool calls, or prompt assembly.

What to look for

  • A healthy agent shows high success rates across tools, executions, and LLM steps, and consistent latency without a large gap between average and p95.
  • If tool success rate drops, check the Tool Execution steps in the affected executions. The input/output JSON will show you what went wrong: bad parameters, expired connections, or API errors.
  • If execution latency spikes, look at the Step Timeline to find which step is taking longer than expected. Multiple LLM calls in a single execution can add up, especially with larger models.
  • If responses are inaccurate, expand the Response & Reasoning step. The step-by-step reasoning shows you how the agent arrived at its answer, which makes it easier to figure out whether the issue is in the instructions, the knowledge base, or the data returned by tools.
