# Python SDK Runtime Events

> Manual event patterns for the whyops Python package, including sync and async traces, tool spans, prompt caching usage, and hybrid setups.

Use the trace builder when you want the graph to include application-side runtime work that the provider proxy cannot observe directly.

<CardGroup cols={3}>
  <Card title="Quickstart" icon="box-open" href="/integrations/python-sdk">
    Start with installation, agent initialization, and your first proxied OpenAI or Anthropic call.
  </Card>

  <Card title="Proxy Helpers" icon="plug" href="/integrations/python-sdk-proxy">
    Review the proxy key flow before adding runtime events on top of proxied traffic.
  </Card>

  <Card title="Advanced Patterns" icon="sliders" href="/integrations/python-sdk-advanced">
    Continue there after this page for hybrid flows, self-hosting, and common mistakes.
  </Card>
</CardGroup>

## Manual events mode

Pass `sdk.trace()` the same `trace_id` that you send in the `X-Trace-ID` header on proxied OpenAI or Anthropic calls, so that tool events and model events stay on the same thread.

<Tabs>
  <Tab title="Sync">
    ```python theme={null}
    trace = sdk.trace("session-123")

    trace.user_message_sync(
        [{"role": "user", "content": "Where is order 123?"}],
        external_user_id="user_12345",
    )

    span_id = trace.tool_call_request_sync(
        "search_orders",
        [{"name": "search_orders", "arguments": {"orderId": "123"}}],
        latency_ms=11,
    )

    trace.tool_call_response_sync(
        "search_orders",
        span_id,
        [{"name": "search_orders", "arguments": {"orderId": "123"}}],
        {"status": "shipped"},
        latency_ms=88,
    )

    trace.llm_response_sync(
        "openai/gpt-4o-mini",
        "openai",
        "Your order has shipped.",
        finish_reason="stop",
        latency_ms=390,
        usage={"promptTokens": 42, "completionTokens": 16, "totalTokens": 58},
    )
    ```
  </Tab>

  <Tab title="Async">
    ```python theme={null}
    trace = sdk.trace("session-123")

    await trace.user_message(
        [{"role": "user", "content": "Where is order 123?"}],
        external_user_id="user_12345",
    )

    span_id = await trace.tool_call_request(
        "search_orders",
        [{"name": "search_orders", "arguments": {"orderId": "123"}}],
        latency_ms=11,
    )

    await trace.tool_call_response(
        "search_orders",
        span_id,
        [{"name": "search_orders", "arguments": {"orderId": "123"}}],
        {"status": "shipped"},
        latency_ms=88,
    )
    ```
  </Tab>
</Tabs>
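
To keep the ID identical in both places, it can help to create it once and derive the proxy headers from it. The following is a minimal sketch, assuming two hypothetical helpers named `new_trace_id` and `trace_headers` that are not part of the whyops package. `X-Trace-ID` is the header named above, and `X-Thread-ID` is the companion header used in the hybrid pattern on this page.

```python
import uuid


def new_trace_id(prefix: str = "session") -> str:
    """Return a unique trace ID with the given prefix (hypothetical helper)."""
    return f"{prefix}-{uuid.uuid4().hex}"


def trace_headers(trace_id: str) -> dict[str, str]:
    """Build headers that tie proxied calls to one trace (hypothetical helper)."""
    return {"X-Trace-ID": trace_id, "X-Thread-ID": trace_id}


trace_id = new_trace_id()
headers = trace_headers(trace_id)
# Pass `headers` to the proxied client and `trace_id` to sdk.trace(trace_id).
```

Generating the ID in one place avoids the case where a typo in one of the two call sites silently splits a conversation across two traces.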

## Linking events to users

Use `external_user_id` to associate events with your application user IDs:

```python theme={null}
trace.user_message_sync(
    [{"role": "user", "content": "Reset my password."}],
    external_user_id="user_12345",
)
```

The `external_user_id` is stored on every event and trace, allowing you to filter and analyze traces by your own user identifiers.

## Prompt caching usage

```python theme={null}
await trace.llm_response(
    "anthropic/claude-sonnet-4-5",
    "anthropic",
    "Done.",
    usage={
        "promptTokens": 1200,
        "completionTokens": 240,
        "totalTokens": 9940,
        "cacheReadTokens": 8200,
        "cacheCreationTokens": 300,
    },
    latency_ms=860,
)
```

Pass `cacheReadTokens` and `cacheCreationTokens` whenever your runtime reports cache-aware usage, so the trace cost breakdown stays accurate.
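
In the example above, `totalTokens` equals the sum of the other four counts (1200 + 240 + 8200 + 300 = 9940). The sketch below builds the usage dictionary that way. `build_usage` is a hypothetical helper, not part of the whyops package, and it assumes your provider reports cached tokens separately from prompt tokens, which is worth verifying against the provider's response format.

```python
def build_usage(prompt, completion, cache_read=0, cache_creation=0):
    """Assemble a usage dict whose total is the sum of all four counts.

    Hypothetical helper; assumes cached tokens are reported separately
    from promptTokens, as in the example above.
    """
    usage = {
        "promptTokens": prompt,
        "completionTokens": completion,
        "totalTokens": prompt + completion + cache_read + cache_creation,
    }
    # Only include cache fields when the runtime actually reported them.
    if cache_read:
        usage["cacheReadTokens"] = cache_read
    if cache_creation:
        usage["cacheCreationTokens"] = cache_creation
    return usage


usage = build_usage(1200, 240, cache_read=8200, cache_creation=300)
```

The resulting `usage` value can be passed as the `usage=` argument of `llm_response()` or `llm_response_sync()`.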

## Method map

| Sync method                 | Async method           | Purpose                                             |
| --------------------------- | ---------------------- | --------------------------------------------------- |
| `user_message_sync()`       | `user_message()`       | Log assembled user or system-facing messages        |
| `llm_response_sync()`       | `llm_response()`       | Log model output, finish reason, usage, and latency |
| `llm_thinking_sync()`       | `llm_thinking()`       | Log explicit reasoning blocks                       |
| `embedding_request_sync()`  | `embedding_request()`  | Log embedding inputs                                |
| `embedding_response_sync()` | `embedding_response()` | Log embedding result summary                        |
| `tool_call_request_sync()`  | `tool_call_request()`  | Start a tool call span and return `span_id`         |
| `tool_call_response_sync()` | `tool_call_response()` | Close the tool call span                            |
| `tool_result_sync()`        | `tool_result()`        | Record tool output returned to the model            |
| `error_sync()`              | `error()`              | Record runtime or provider failures                 |

## Hybrid pattern

```python theme={null}
from openai import OpenAI

# Assumes `sdk` is already initialized as described in the Quickstart.
trace_id = "checkout-8841"
client = sdk.openai(OpenAI(api_key="YOUR_WHYOPS_API_KEY"))
client.default_headers = {
    **(client.default_headers or {}),
    "X-Trace-ID": trace_id,
    "X-Thread-ID": trace_id,
}
trace = sdk.trace(trace_id)

span_id = trace.tool_call_request_sync(
    "charge_card",
    [{"name": "charge_card", "arguments": {"amount": 4999, "currency": "usd"}}],
)

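# charge_card is your own application function; pass the arguments you logged above.
result = charge_card(amount=4999, currency="usd")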

trace.tool_call_response_sync(
    "charge_card",
    span_id,
    [{"name": "charge_card", "arguments": {"amount": 4999, "currency": "usd"}}],
    result,
)
```

Use this pattern when you want the proxy to capture provider calls automatically while you record explicit application-side timing for tools, jobs, queues, and failures.
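
For the application-side timing mentioned above, one option is to measure the tool call yourself and pass the result as `latency_ms`, the keyword shown in the manual events examples. This is a minimal sketch, assuming a hypothetical `timed_call` helper and a stand-in `charge_card` function; neither is provided by the whyops package.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed whole milliseconds). Hypothetical helper."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    latency_ms = int((time.perf_counter() - start) * 1000)
    return result, latency_ms


def charge_card(amount, currency):
    """Stand-in for your own payment function."""
    return {"status": "succeeded", "amount": amount, "currency": currency}


result, latency_ms = timed_call(charge_card, amount=4999, currency="usd")
# latency_ms can then be passed along, e.g.
# trace.tool_call_response_sync(..., result, latency_ms=latency_ms)
```

`time.perf_counter()` is used here because it is a monotonic clock, so the measured duration is not affected by system clock adjustments.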