Use the runtime trace builder when you want visibility beyond what the proxy can infer from provider traffic alone.
Quickstart: Start with installation, agent initialization, and your first proxied OpenAI or Anthropic call.
Proxy Helpers: Review the proxy key flow before adding runtime events on top of proxied traffic.
Advanced Patterns: Continue there after this page for hybrid flows, self-hosting, and common mistakes.
Minimal trace
To keep tool events and model events on the same thread, use the same traceId here that you pass in the X-Trace-ID header on proxied OpenAI or Anthropic calls.
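One way to avoid the two IDs drifting apart is to derive the proxy headers from the same value you hand to the trace builder. The sketch below is illustrative only: `traceHeaders` is a hypothetical helper, not part of the SDK, and the header names are the ones used in the hybrid example on this page.

```typescript
// Hypothetical helper (not an SDK function): builds the trace headers for
// proxied provider calls from a single traceId.
function traceHeaders(traceId: string): Record<string, string> {
  if (traceId.trim() === '') {
    // An empty ID would silently split events across threads, so fail early.
    throw new Error('traceId must be a non-empty string');
  }
  return { 'X-Trace-ID': traceId, 'X-Thread-ID': traceId };
}

// Usage idea: pass traceHeaders(id) to the proxied client and the same `id`
// to whyops.trace(id), so both sides share one thread.
```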
Prompt + response
const trace = whyops.trace('session-123');
await trace.userMessage(
  [{ role: 'user', content: 'Reset my password.' }],
  { metadata: { systemPrompt: 'You are a support assistant.' } },
);
await trace.llmResponse('openai/gpt-4o-mini', 'openai', 'I can help with that.', {
  finishReason: 'stop',
  latencyMs: 420,
  usage: {
    promptTokens: 42,
    completionTokens: 16,
    totalTokens: 58,
  },
});

Thinking block
await trace.llmThinking('I should verify the order before replying.', {
  signature: 'anthropic-thinking-signature',
});
Linking events to users
Use externalUserId to associate events with your application’s user IDs:
await trace.userMessage(
  [{ role: 'user', content: 'Reset my password.' }],
  { externalUserId: 'user_12345' },
);
The externalUserId is stored on every event and trace, allowing you to filter and analyze traces by your own user identifiers.
const spanId = await trace.toolCallRequest(
  'search_orders',
  [{ name: 'search_orders', arguments: { orderId: '123' } }],
  { latencyMs: 12 },
);
await trace.toolCallResponse(
  'search_orders',
  spanId,
  [{ name: 'search_orders', arguments: { orderId: '123' } }],
  { status: 'shipped' },
  { latencyMs: 91 },
);
await trace.toolResult(
  'search_orders',
  { status: 'shipped', eta: '2026-03-29' },
  { spanId: 'tool-span-123' },
);
toolCallRequest() returns a spanId. Reuse that same value in toolCallResponse() so the UI can treat the execution as a single tool span.
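The request/response pairing described above can be wrapped in a small helper so the spanId is never dropped between the two calls. This is a minimal sketch, not part of the SDK: `withToolSpan`, `ToolCall`, and `ToolTrace` are hypothetical names, and the interface only mirrors the two call shapes shown on this page.

```typescript
// Hypothetical structural types: they mirror only the call shapes shown on this page.
type ToolCall = { name: string; arguments: Record<string, unknown> };

interface ToolTrace {
  toolCallRequest(
    tool: string,
    calls: ToolCall[],
    options?: { latencyMs?: number },
  ): Promise<string>;
  toolCallResponse(
    tool: string,
    spanId: string,
    calls: ToolCall[],
    result: unknown,
    options?: { latencyMs?: number },
  ): Promise<unknown>;
}

// Hypothetical helper: opens a span, runs the tool, then closes the span
// with the same spanId and the measured execution time.
async function withToolSpan<T>(
  trace: ToolTrace,
  tool: string,
  args: Record<string, unknown>,
  run: () => Promise<T>,
): Promise<T> {
  const calls: ToolCall[] = [{ name: tool, arguments: args }];
  const spanId = await trace.toolCallRequest(tool, calls);
  const startedAt = Date.now();
  const result = await run();
  await trace.toolCallResponse(tool, spanId, calls, result, {
    latencyMs: Date.now() - startedAt,
  });
  return result;
}
```

If `run()` throws, this sketch leaves the span open. Recording the failure is left out on purpose, because the call shape of error() is not shown on this page.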
Prompt caching usage
await trace.llmResponse('anthropic/claude-sonnet-4-5', 'anthropic', 'Done.', {
  usage: {
    promptTokens: 1200,
    completionTokens: 240,
    totalTokens: 9940,
    cacheReadTokens: 8200,
    cacheCreationTokens: 300,
  },
  latencyMs: 860,
});
Use cacheReadTokens for tokens served from cache and cacheCreationTokens for tokens written into cache when your runtime exposes those values.
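If your runtime calls Anthropic directly, the usage object on the response can be mapped onto these fields. The sketch below is one possible mapping, not an SDK function: `toTraceUsage` and both interfaces are hypothetical names, and it assumes totalTokens is the sum of all four counts, which is consistent with the example above (1200 + 240 + 8200 + 300 = 9940).

```typescript
// Usage fields as reported on an Anthropic Messages API response.
interface AnthropicUsage {
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens?: number | null;
  cache_creation_input_tokens?: number | null;
}

// Shape of the `usage` option as used in the llmResponse() example above.
interface TraceUsage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
  cacheReadTokens: number;
  cacheCreationTokens: number;
}

// Hypothetical mapper. Assumption: totalTokens includes cached tokens.
function toTraceUsage(usage: AnthropicUsage): TraceUsage {
  // Missing or null cache counts are treated as zero.
  const cacheReadTokens = usage.cache_read_input_tokens ?? 0;
  const cacheCreationTokens = usage.cache_creation_input_tokens ?? 0;
  return {
    promptTokens: usage.input_tokens,
    completionTokens: usage.output_tokens,
    totalTokens:
      usage.input_tokens + usage.output_tokens + cacheReadTokens + cacheCreationTokens,
    cacheReadTokens,
    cacheCreationTokens,
  };
}
```

The result can be passed as the `usage` option of llmResponse(). If your provider reports totals differently, adjust the sum rather than relying on this convention.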
Event map
| Method | Purpose | Key options |
| --- | --- | --- |
| userMessage() | Log assembled chat input | metadata.systemPrompt, metadata.tools, spanId, stepId, externalUserId |
| llmResponse() | Log model output or tool calls | toolCalls, finishReason, usage, latencyMs |
| llmThinking() | Log exposed thinking blocks | signature |
| embeddingRequest() | Log embedding inputs | spanId, stepId |
| embeddingResponse() | Log embedding result summary | totalTokens, latencyMs |
| toolCallRequest() | Start a tool call span | requestedAt, latencyMs; returns spanId |
| toolCallResponse() | Close the tool call span | respondedAt, latencyMs |
| toolResult() | Record tool output returned to the model | spanId, stepId |
| error() | Record runtime or provider failures | status, stack |
All methods accept externalUserId as an optional parameter to link the event to your application user.
Hybrid pattern
import OpenAI from 'openai';

// Assumes `whyops` is the SDK client initialized as in the Quickstart,
// and `chargeCard` is your own application function.
const traceId = 'checkout-8841';

// Proxied client: the proxy captures the LLM exchange under this trace ID.
const openai = whyops.openai(new OpenAI({ apiKey: process.env.WHYOPS_API_KEY }));
openai.defaultHeaders = {
  ...(openai as any).defaultHeaders,
  'X-Trace-ID': traceId,
  'X-Thread-ID': traceId,
};

// Runtime trace: application-side tool events join the same thread.
const trace = whyops.trace(traceId);
const spanId = await trace.toolCallRequest('charge_card', [
  { name: 'charge_card', arguments: { amount: 4999, currency: 'usd' } },
]);
const result = await chargeCard();
await trace.toolCallResponse(
  'charge_card',
  spanId,
  [{ name: 'charge_card', arguments: { amount: 4999, currency: 'usd' } }],
  result,
);
Use this when the proxy already captures the LLM exchange, but you still want application-side latency and outcomes for tools, queue jobs, or downstream APIs.