SKILL.md
Prerequisites That Must Exist
Before extraction, verify:
- Data 360 is enabled
- Session Tracing is enabled
- the Salesforce Standard Data Model version is sufficient
- Einstein / Agentforce capabilities are enabled in the org
- JWT / ECA auth for Data 360 access is configured
If auth is missing, hand off to:
Deep setup guide:
What This Skill Works With
Core storage / analysis model
- extraction via Data 360 APIs
- Parquet for storage efficiency
- Polars for large-scale lazy analysis
Core STDM entities
At minimum, expect to work with these Session Tracing Data Model (STDM) entities:
- session
- interaction / turn
- interaction step
- moment
- message
GenAI Trust Layer / audit records may also be relevant for content-quality and generation debugging.
Full schema:
Required Context to Gather First
Ask for or infer:
- target org alias
- time window or date range
- agent filter, if any
- whether the goal is extraction, summary analysis, or single-session debugging
- output location for extracted data
- whether the user already has Parquet files on disk
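One way to keep this context together is a small record that is checked before any extraction starts. This is only a sketch: the class `ObservabilityRequest`, its field names, and the goal labels are assumptions made for illustration, not part of the skill's scripts.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Hypothetical goal labels, mirroring extraction / summary analysis / session debugging.
VALID_GOALS = {"extract", "analyze", "debug-session"}


@dataclass(frozen=True)
class ObservabilityRequest:
    """Hypothetical container for the context gathered from the user."""

    org_alias: str
    start: date
    end: date
    goal: str
    output_dir: str
    agent: Optional[str] = None       # no agent filter when None
    has_local_parquet: bool = False   # True if Parquet files are already on disk

    def problems(self) -> List[str]:
        """Return human-readable gaps to resolve before doing any work."""
        issues = []
        if not self.org_alias:
            issues.append("missing target org alias")
        if self.start > self.end:
            issues.append("date range starts after it ends")
        if self.goal not in VALID_GOALS:
            issues.append(f"unknown goal: {self.goal}")
        return issues


if __name__ == "__main__":
    req = ObservabilityRequest(
        "my-org", date(2025, 1, 1), date(2025, 1, 7), "extract", "./out"
    )
    print(req.problems())
```

An empty list from `problems()` means the request is complete enough to proceed; anything else is a question to put back to the user.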
Recommended Workflow
1. Verify setup and auth
Confirm that Session Tracing is enabled in Data 360 and that JWT/ECA auth works.
2. Choose the extraction mode
| Need | Default approach |
| --- | --- |
| recent telemetry snapshot | extract last N days |
| focused investigation | filtered extraction by date and agent |
| one broken conversation | extract or debug a single session tree |
| ongoing usage analytics | incremental extraction |
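The "last N days" and incremental modes both reduce to picking a time window. The sketch below shows one way to compute it; `extraction_window` and its parameters are hypothetical and are meant only to produce values to pass to the provided scripts, not to replace them.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def extraction_window(
    now: datetime,
    last_n_days: Optional[int] = None,
    last_extracted: Optional[datetime] = None,
) -> Tuple[datetime, datetime]:
    """Return (start, end) for a snapshot or an incremental extraction.

    last_n_days:    recent telemetry snapshot
    last_extracted: incremental run that resumes from the previous watermark
    """
    if last_extracted is not None:
        # Incremental mode takes priority: resume where the last run stopped.
        return last_extracted, now
    if last_n_days is None or last_n_days <= 0:
        raise ValueError("give a positive last_n_days or a last_extracted watermark")
    return now - timedelta(days=last_n_days), now


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    start, end = extraction_window(now, last_n_days=7)
    print(start.isoformat(), end.isoformat())
```

Given ingestion lag, an incremental run may want to start slightly before the stored watermark and de-duplicate afterwards; that refinement is left out here.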
3. Extract to Parquet
Use the provided scripts under scripts/ rather than reimplementing extraction logic.
4. Analyze with Polars
Common analysis goals:
- session volume and duration
- topic distribution
- action step failures
- latency hotspots
- abandonment / escalation patterns
- session-level timeline reconstruction
5. Convert findings into next actions
Typical outcomes:
- topic mismatch → improve routing or descriptions
- action failure → inspect Flow / Apex implementation
- latency issue → optimize downstream action path
- test gap → add targeted agent tests
High-Signal Operational Rules
- treat STDM as read-only telemetry
- expect ingestion lag; recent sessions may not appear yet, so this is not real-time debugging
- use date filters and focused extraction to avoid unnecessary volume / query cost
- prefer Parquet over ad hoc JSON for durable analysis
- use lazy Polars patterns for large datasets
Common pitfalls:
- assuming missing data means no issue, when tracing may simply not be enabled
- running huge broad queries without date or agent filters
- trying to fix the agent inside this skill instead of handing off to authoring / testing skills
Output Format
When finishing, report in this order:
- What data was extracted or analyzed
- Scope (org, dates, agent filter, session IDs)
- Key findings
- Likely root causes
- Recommended next skill / next action
Suggested shape:
Observability task: <extract / analyze / debug-session>
Scope: <org, dates, agents, session ids>
Artifacts: <directories / parquet files>
Findings: <latency, routing, action, quality, abandonment patterns>
Root cause: <best current explanation>
Next step: <testing, agent fix, flow fix, apex fix>
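The suggested shape can be produced mechanically so that every report keeps the same field order. A minimal sketch follows; the class `ObservabilityReport` and its default values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ObservabilityReport:
    """Hypothetical holder for the six report fields, rendered in a fixed order."""

    task: str    # extract / analyze / debug-session
    scope: str   # org, dates, agents, session ids
    artifacts: List[str] = field(default_factory=list)
    findings: List[str] = field(default_factory=list)
    root_cause: str = "unknown"
    next_step: str = "gather more telemetry"

    def render(self) -> str:
        """Return the report as six labelled lines."""
        lines = [
            f"Observability task: {self.task}",
            f"Scope: {self.scope}",
            f"Artifacts: {', '.join(self.artifacts) or 'none'}",
            f"Findings: {'; '.join(self.findings) or 'none'}",
            f"Root cause: {self.root_cause}",
            f"Next step: {self.next_step}",
        ]
        return "\n".join(lines)


if __name__ == "__main__":
    report = ObservabilityReport(
        task="debug-session",
        scope="org=my-org, session=abc",
        artifacts=["out/sessions.parquet"],
        findings=["refund action fails"],
        root_cause="Flow fault",
        next_step="flow fix",
    )
    print(report.render())
```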
Cross-Skill Integration
| Need | Delegate to | Reason |
| --- | --- | --- |
| auth / JWT setup | | Data 360 access |
| fix agent routing / behavior | | authoring corrections |
| formal regression / coverage tests | | reproducible test loops |
| Flow-backed action debugging | | declarative repair |
| Apex-backed action debugging | | code / log investigation |
Reference Map
Start here
Data model / querying
Analysis / debugging
Auth / troubleshooting
Score Guide
| Score | Meaning |
| --- | --- |
| 90+ | strong telemetry-backed diagnosis |
| 75–89 | useful analysis with minor gaps |
| 60–74 | partial visibility only |
| < 60 | insufficient evidence; gather more telemetry |
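For tooling that needs the band as text, the score guide maps directly to a lookup. This is a sketch: `score_meaning` is a hypothetical helper, and treating fractional scores such as 89.5 as belonging to the lower band is an assumption, since the guide only lists whole-number ranges.

```python
def score_meaning(score: float) -> str:
    """Map a numeric score to the meaning given in the score guide."""
    if score >= 90:
        return "strong telemetry-backed diagnosis"
    if score >= 75:
        return "useful analysis with minor gaps"
    if score >= 60:
        return "partial visibility only"
    return "insufficient evidence; gather more telemetry"


if __name__ == "__main__":
    for value in (95, 80, 65, 40):
        print(value, "->", score_meaning(value))
```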