Name: aiconfig-ai-metrics
Author: launchdarkly

SKILL.md

Workflow

1. Explore the existing call site

Before picking a tier, find the provider call and answer these questions:

Shape? Is it a chat loop (history + turn-based), a one-shot completion, an agent step, or something else? → drives Tier 1 vs 2.

Framework? Raw provider SDK? LangChain / LangGraph? Vercel AI SDK? CrewAI? Strands? → drives which Tier-2 provider package (if any) applies.

Provider? OpenAI, Anthropic, Bedrock, Gemini, Azure, custom HTTP? → cross-reference with the package availability matrix below.

Streaming? If yes, you'll need TTFT tracking, which means Tier 4 for the TTFT part even if the rest is Tier 2.

Language? Python or Node? Provider-package coverage differs between them.

Already using an AI Config? If not, route to aiconfig-create first — tracking requires a tracker, which is obtained by calling create_tracker() / createTracker() on the config object returned by completion_config() / completionConfig() / createModel().

On the current SDK API? If the call site uses aiclient.config(...) / aiClient.config(...) or constructs an AIConfig(...) / LDAIConfig default, it's on the pre-0.20 surface. Migrate it as part of this work before adding tracking:

aiclient.config(...) → aiclient.completion_config(...) for one-shot/chat or aiclient.agent_config(...) for agent mode (mirror the call signature). Node is the same with camelCase.

AIConfig(...) default → AICompletionConfigDefault(...) or AIAgentConfigDefault(...) (Node: LDAICompletionConfigDefault / LDAIAgentConfigDefault). AIConfig is the base class the SDK returns; it isn't a valid default-value constructor — the typed *Default variants are.

If the result was being tuple-unpacked (config, tracker = aiclient.config(...)), drop the unpack — the new methods return a single config object. Obtain the tracker via config.create_tracker() / aiConfig.createTracker().

For deeper rewrites (call sites with hardcoded model/prompt as well), hand off to aiconfig-migrate instead of doing the full migration here.
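To spot call sites that are still on the pre-0.20 surface, a quick text scan is often enough before reading the code by hand. The sketch below is one possible way to do that. The function name `find_legacy_usage` and the regexes are illustrative assumptions, not part of the SDK, and they will miss aliased imports or differently named client variables.

```python
import re

# Signals taken from the checklist above. The regexes are an illustrative
# assumption: they only catch the literal names used in this document.
LEGACY_PATTERNS = {
    "legacy config() call": re.compile(r"\bai[cC]lient\.config\("),
    "AIConfig(...) used as a default": re.compile(r"\bAIConfig\("),
    "tuple-unpacked result": re.compile(
        r"^\s*\w+\s*,\s*\w+\s*=\s*\w+\.config\(", re.MULTILINE
    ),
}


def find_legacy_usage(source: str) -> list[str]:
    """Return the names of the legacy patterns that appear in `source`."""
    return [name for name, pattern in LEGACY_PATTERNS.items() if pattern.search(source)]
```

An empty result does not prove the call site is current; it only means none of these three literal patterns appeared.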

2. Look up your Tier-2 option

Use this matrix to decide whether Tier 2 (provider package) is available for your situation. If it's not, drop to Tier 3 (custom extractor). If the shape is chat-loop, go to Tier 1 first regardless of what's in this matrix.

| Framework / provider | Python provider package | Node provider package | Reference |
| --- | --- | --- | --- |
| OpenAI (direct SDK) | launchdarkly-server-sdk-ai-openai | @launchdarkly/server-sdk-ai-openai | openai-tracking.md |
| LangChain / LangGraph | launchdarkly-server-sdk-ai-langchain | @launchdarkly/server-sdk-ai-langchain | langchain-tracking.md |
| Vercel AI SDK | none | @launchdarkly/server-sdk-ai-vercel | (use the Vercel provider docs) |
| AWS Bedrock (Converse or InvokeModel) | none (use LangChain-aws or custom extractor) | | bedrock-tracking.md |
| Anthropic direct SDK | none | | anthropic-tracking.md |
| Gemini / Google GenAI | none | | gemini-tracking.md |
| Strands Agents | none (Tier 3 custom extractor) | | strands-tracking.md |
| Cohere, Mistral, custom HTTP | none | | Tier 3 custom extractor |
| Any provider, streaming + TTFT | none (Tier 4 only) | trackStreamMetricsOf (no TTFT) + manual TTFT | streaming-tracking.md |
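The tier decision can be read as a small lookup: chat loops go to Tier 1, a matching provider package means Tier 2, and everything else falls to Tier 3. The following is a minimal sketch of that rule. The function name `pick_tier` and the keys (`"openai"`, `"langchain"`, `"vercel"`, `"python"`, `"node"`, `"chat-loop"`) are hypothetical labels chosen for illustration; only the package names come from the matrix.

```python
from typing import Optional

# Package names copied from the matrix; None means no provider package
# exists for that language. The dictionary keys are hypothetical labels.
PROVIDER_PACKAGES = {
    "openai": {
        "python": "launchdarkly-server-sdk-ai-openai",
        "node": "@launchdarkly/server-sdk-ai-openai",
    },
    "langchain": {
        "python": "launchdarkly-server-sdk-ai-langchain",
        "node": "@launchdarkly/server-sdk-ai-langchain",
    },
    "vercel": {"python": None, "node": "@launchdarkly/server-sdk-ai-vercel"},
}


def pick_tier(shape: str, framework: str, language: str) -> tuple[int, Optional[str]]:
    """Return (tier, provider package or None) for a call site."""
    if shape == "chat-loop":
        return 1, None  # chat loops start at Tier 1 regardless of the matrix
    package = PROVIDER_PACKAGES.get(framework, {}).get(language)
    if package:
        return 2, package  # a provider package exists for this combination
    return 3, None  # otherwise write a custom extractor
```

Streaming is deliberately left out of the return value: TTFT is tracked manually at Tier 4 in addition to whatever tier this returns.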

3. Implement from the matching reference

Once you know the tier and the provider, open the reference file and follow the pattern. The references are written so Tier 1 is always the first example, Tier 2/3 next, and Tier 4 last. Stop at the first tier that matches the app's shape.

Guardrails that apply to every tier:

**Always check config.enabled** before making the tracked call. A disabled config means the feature has been switched off in LaunchDarkly, so short-circuit to whatever fallback the app uses (cached response, error, degraded path) instead of making the provider call at all.

Wrap the existing call, don't rewrite it. Tier 2 and Tier 3 are designed to slot around an unmodified provider call. If you find yourself rewriting the call to fit the tracker, you're at the wrong tier — drop down one.

**Errors are handled inside trackMetricsOf.** The wrapper catches exceptions, records trackError() internally, and re-raises. Do not add except: tracker.trackError() on top; it is a no-op that also trips the at-most-once guard. Tier 1 handles both paths automatically. At Tier 4 (manual, streaming, track_duration_of), the caller owns the error-tracking call.

Always flush before close. Call flush() on the LaunchDarkly client (Python: ldclient.get().flush(); Node: await ldClient.flush()) before closing it. Otherwise trailing events can be lost, in short-lived scripts and long-running services alike. In Node, ldClient.close() returns a Promise; await it.
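The enabled check and the "no extra except" rule can be sketched together. The code below runs without the SDK: `StubTracker` is a hypothetical stand-in that imitates the wrapper behavior described in the guardrails (record an error and re-raise, otherwise record duration, success, and tokens), and `answer`, `call_provider`, and `fallback` are illustrative names. The real tracker's internals may differ.

```python
import time
from types import SimpleNamespace


class StubTracker:
    """Hypothetical stand-in for the SDK tracker so the sketch runs offline."""

    def __init__(self):
        self.events = []

    def track_metrics_of(self, extractor, fn):
        # Imitates the described wrapper: time the call, record an error and
        # re-raise on failure, otherwise record duration, success, and tokens.
        start = time.perf_counter()
        try:
            result = fn()
        except Exception:
            self.events.append("error")
            raise
        duration_ms = int((time.perf_counter() - start) * 1000)
        self.events.append(("duration_ms", duration_ms))
        self.events.append("success")
        self.events.append(("tokens", extractor(result)))
        return result


def answer(config, tracker, call_provider, fallback):
    """Guarded, tracked call: skip the provider entirely when the config is off."""
    if not config.enabled:
        return fallback
    # No try/except here: the wrapper already records the error and re-raises.
    return tracker.track_metrics_of(lambda r: r["usage"], call_provider)


# Stand-in for the config object; a real one comes from the SDK.
example_config = SimpleNamespace(enabled=True)
```

In a real application, the flush call from the last guardrail (ldclient.get().flush() in Python) would still run before the client is closed.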

4. Verify

Confirm the Monitoring tab fills in:

Run one real request through the instrumented path.

Open the AI Config in LaunchDarkly → Monitoring tab. Duration, token counts, and generation counts should appear within 1–2 minutes.

Force an error (bad API key, zero max_tokens, whatever) and confirm the error count increments.

If streaming: verify TTFT appears. If it doesn't, you probably wrapped the stream creation with trackMetricsOf but didn't add the manual trackTimeToFirstToken call — see streaming-tracking.md.
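Manual TTFT tracking comes down to starting a timer before the request and reporting the elapsed milliseconds when the first chunk arrives. A minimal sketch follows, assuming a stream of text chunks. `StubTracker`, `consume_stream`, and `open_stream` are hypothetical names; only `track_time_to_first_token(ms)` comes from the method list in this skill.

```python
import time


class StubTracker:
    """Hypothetical stand-in that stores the TTFT instead of sending an event."""

    def __init__(self):
        self.ttft_ms = []

    def track_time_to_first_token(self, ms):
        self.ttft_ms.append(ms)


def consume_stream(open_stream, tracker):
    """Collect a text stream, recording time-to-first-token exactly once."""
    start = time.perf_counter()  # start the clock before the request is issued
    parts = []
    for chunk in open_stream():
        if not parts:  # first chunk only
            elapsed_ms = int((time.perf_counter() - start) * 1000)
            tracker.track_time_to_first_token(elapsed_ms)
        parts.append(chunk)
    return "".join(parts)
```

Passing a callable rather than an already-open stream keeps request setup inside the measured interval.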

Quick reference: tracker methods

Obtain a tracker via the factory on the config object: tracker = config.create_tracker() (Python) or const tracker = aiConfig.createTracker() (Node). Call the factory once per execution and reuse the returned tracker for every call. Each factory invocation mints a new runId that tags every tracking event emitted by that tracker, so events from a single execution can be correlated (via exported events or downstream systems). Today the Monitoring tab aggregates events rather than grouping them by run, so the runId matters mainly when events are exported or queried outside the UI; it is also the identifier the SDK's at-most-once guards are keyed on.

The methods below are the raw API surface. Most of the time you should not call them individually; use trackMetricsOf or a Tier-1 managed runner. The list is here so you can recognize the methods in existing code and reach for the right one when you genuinely need Tier 4.
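The "one tracker per execution" rule is easier to see with a toy model. In the sketch below, `StubConfig` and `StubTracker` are hypothetical stand-ins, and the `run_id` attribute and the use of `uuid4` are assumptions made for illustration; the SDK's actual runId format and storage are not specified here.

```python
import uuid


class StubTracker:
    """Hypothetical tracker: every event it records carries its run id."""

    def __init__(self):
        self.run_id = str(uuid.uuid4())  # assumed format, for illustration only
        self.events = []

    def track_tool_call(self, name):
        self.events.append({"run_id": self.run_id, "tool": name})


class StubConfig:
    """Hypothetical config object exposing the tracker factory."""

    def create_tracker(self):
        return StubTracker()  # each factory call mints a new run id


def run_execution(config, tool_names):
    """Create the tracker once, then reuse it for every call in this execution."""
    tracker = config.create_tracker()
    for name in tool_names:
        tracker.track_tool_call(name)
    return tracker
```

Calling the factory inside the loop instead would give each event its own run id, which defeats the correlation.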

| Method (Python ↔ Node) | Tier | What it does |
| --- | --- | --- |
| track_metrics_of(extractor, fn) / trackMetricsOf(extractor, fn) | 2 / 3 | Wraps a provider call, captures duration + success/error, calls your extractor for tokens. This is the default generic tracker. |
| track_metrics_of_async(extractor, fn) (Python) | 2 / 3 | Async variant of the above. |
| trackStreamMetricsOf(extractor, streamFn) (Node only) | 2 / 3 | Streaming variant. Captures per-chunk usage when the extractor handles chunks. Does not auto-capture TTFT. |
| track_duration(ms) / trackDuration(ms) | | Record latency in milliseconds. |
| track_duration_of(fn) / trackDurationOf(fn) | | Wraps a callable and records duration automatically. Does not capture tokens or success; pair with explicit calls. |
| track_tokens(TokenUsage) / trackTokens({input, output, total}) | | Record token usage. |
| track_time_to_first_token(ms) / trackTimeToFirstToken(ms) | | Record TTFT for streaming responses. |
| track_success() / trackSuccess() | | Mark the generation as successful. Required for the Monitoring tab to count it. |
| track_error() / trackError() | | Mark the generation as failed. Do not also call trackSuccess() in the same request. |
| track_feedback({kind}) / trackFeedback({kind}) | any | Record thumbs-up / thumbs-down from a feedback UI. Independent of the success/error path. |
| track_tool_call(name) / trackToolCall(name) | any | Record a single tool invocation by name. Available on both SDKs. |
| track_tool_calls([names]) / trackToolCalls([names]) | any | Batch variant: record a list of tool invocations in one call. |
| track_judge_result(result) / trackJudgeResult(result) | any | Record a programmatic judge evaluation. result.sampled indicates whether evaluation ran. |
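When Tier 4 is genuinely needed, the manual methods are combined by hand: wrap the call for duration, then record tokens and exactly one of success or error. The sketch below shows that pairing under stated assumptions. `StubTracker` and the local `TokenUsage` dataclass are hypothetical stand-ins so the code runs offline, and the stub records duration even when the call raises, which is an assumption about `track_duration_of`.

```python
import time
from dataclasses import dataclass


@dataclass
class TokenUsage:
    """Local stand-in for the SDK's token-usage type."""
    input: int
    output: int
    total: int


class StubTracker:
    """Hypothetical tracker that logs (name, value) pairs instead of sending events."""

    def __init__(self):
        self.events = []

    def track_duration_of(self, fn):
        start = time.perf_counter()
        try:
            return fn()
        finally:
            # Assumption: duration is recorded whether or not fn raised.
            self.events.append(("duration_ms", int((time.perf_counter() - start) * 1000)))

    def track_tokens(self, usage):
        self.events.append(("tokens", usage))

    def track_success(self):
        self.events.append(("success", None))

    def track_error(self):
        self.events.append(("error", None))


def tracked_call(tracker, call_provider):
    """Tier 4 pattern: the caller owns the success and error calls."""
    try:
        result = tracker.track_duration_of(call_provider)
    except Exception:
        tracker.track_error()  # never paired with track_success()
        raise
    tracker.track_tokens(result["usage"])
    tracker.track_success()
    return result
```

Keeping track_success() outside the try block ensures a failure in the provider call can never produce both outcomes for one request.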

Related skills

aiconfig-create — prerequisite if the app doesn't have an AI Config yet

aiconfig-custom-metrics — business metrics (conversion, resolution, retention) layered on top of the AI metrics this skill captures

aiconfig-online-evals — automatic quality scoring (LLM-as-judge) on sampled live requests; complementary to the metrics here

aiconfig-migrate — Stage 4 of the hardcoded-to-AI-Configs migration delegates to this skill
