Core Concepts
What Are Experiments?
Experiments in LaunchDarkly let you measure the impact of feature flag variations on key metrics. An experiment consists of:
- Treatments: The flag variations being compared (control vs. test)
- Metrics: What you're measuring (conversion rate, latency, revenue, etc.)
- Iterations: Data collection periods — start an iteration to begin collecting data
- Holdout (optional): A percentage of traffic excluded from the experiment for baseline measurement
Experiment Lifecycle
- Create the experiment with metrics and treatments
- Start an iteration to begin data collection
- Monitor results as data accumulates
- Stop the iteration when you have statistical significance
- Ship the winning variation
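The lifecycle above can be read as a small state machine. The sketch below is only an illustration: the `Stage` names and the `advance` helper are hypothetical and are not LaunchDarkly status values or API calls. It assumes a stopped experiment can either start a new iteration or be shipped.

```python
from enum import Enum


class Stage(Enum):
    # Illustrative stage names, not LaunchDarkly statuses.
    CREATED = "created"
    RUNNING = "running"
    STOPPED = "stopped"
    SHIPPED = "shipped"


# Allowed moves between stages, following the lifecycle list.
TRANSITIONS = {
    Stage.CREATED: {Stage.RUNNING},                 # start an iteration
    Stage.RUNNING: {Stage.STOPPED},                 # stop the iteration
    Stage.STOPPED: {Stage.RUNNING, Stage.SHIPPED},  # new iteration, or ship the winner
    Stage.SHIPPED: set(),                           # terminal stage
}


def advance(current: Stage, target: Stage) -> Stage:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target
```

For example, `advance(Stage.CREATED, Stage.SHIPPED)` raises `ValueError`, because nothing should ship before an iteration has collected data.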
Core Principles
- Metrics First: Ensure your metrics exist before creating the experiment
- Clear Hypothesis: Know what you expect to improve and by how much
- Proper Controls: Always include a control treatment (the current behavior)
- Sufficient Sample Size: Let experiments run long enough for statistical significance
- One Change at a Time: Test one variable per experiment for clear attribution
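To make "Clear Hypothesis" and "Sufficient Sample Size" concrete, it can help to estimate roughly how much traffic a conversion experiment needs before starting it. The sketch below uses the classical two-proportion normal approximation. This is a planning heuristic only, not how LaunchDarkly computes its results, and the function name and defaults (`alpha=0.05`, `power=0.8`) are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_treatment(baseline_rate: float, expected_rate: float,
                              alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate units needed per treatment to detect baseline_rate -> expected_rate."""
    if baseline_rate == expected_rate:
        raise ValueError("expected_rate must differ from baseline_rate")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    # Sum of the Bernoulli variances of the two treatments.
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    effect = expected_rate - baseline_rate
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothesis: the new variation lifts conversion from 10% to 12%.
print(sample_size_per_treatment(0.10, 0.12))
```

Halving the expected lift roughly quadruples the required sample, which is why the hypothesis should state the expected size of the improvement up front.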
Workflow
Step 1: Prepare Metrics
Before creating an experiment, ensure the metrics you want to measure exist:
- Use `list-metrics` to check for existing metrics
- If needed, use `create-metric` to create new ones
- Note the metric keys — you'll need them for the experiment
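The check above amounts to a set difference between the keys the experiment needs and the keys that already exist. A minimal sketch, assuming the existing keys have already been collected from a `list-metrics` result into a plain list (the helper name and the sample keys are hypothetical):

```python
def missing_metric_keys(required: list[str], existing: list[str]) -> list[str]:
    """Return the required keys that are not in existing, preserving order."""
    known = set(existing)
    return [key for key in required if key not in known]


# Keys as they might be gathered from a list-metrics result (shape assumed).
existing_keys = ["checkout-completed", "page-load-time-ms"]
required_keys = ["checkout-completed", "checkout-time-seconds"]

to_create = missing_metric_keys(required_keys, existing_keys)
print(to_create)  # ['checkout-time-seconds']
```

Anything left in `to_create` would need a `create-metric` call before the experiment is created.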
Common metric types:
| Goal | Metric Type | Example |
| --- | --- | --- |
| Conversion | Custom conversion | checkout-completed |
| Performance | Custom numeric | page-load-time-ms |
| Engagement | Custom conversion | feature-clicked |
| Revenue | Custom numeric | order-value |
Step 2: Create the Experiment
Use create-experiment with:
- `projectKey` and `environmentKey` -- where to run the experiment
- `name` -- descriptive name for the experiment
- `flagKey` -- the feature flag being experimented on
- `metrics` -- array of metric objects with `key` and `isGroup` fields
- `treatments` -- array of treatments, each with a `name`, a `baseline` flag, and `parameters`
- `holdout` (optional) -- percentage of traffic to exclude
{
  "projectKey": "my-project",
  "environmentKey": "production",
  "name": "Checkout Flow v2 Experiment",
  "flagKey": "checkout-flow-v2",
  "metrics": [
    {"key": "checkout-completed", "isGroup": false},
    {"key": "checkout-time-seconds", "isGroup": false}
  ],
  "treatments": [
    {
      "name": "Control",
      "baseline": true,
      "parameters": {
        "flagKey": "checkout-flow-v2",
        "variationId": "variation-a-id"
      }
    },
    {
      "name": "New Checkout",
      "baseline": false,
      "parameters": {
        "flagKey": "checkout-flow-v2",
        "variationId": "variation-b-id"
      }
    }
  ]
}
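Before sending a payload like the one above, a quick local check can catch the most common mistakes. The sketch below is a hypothetical helper, not part of any LaunchDarkly tooling. It assumes the field names shown in this guide, reads "one treatment must be marked baseline" as exactly one, and assumes each treatment's `parameters.flagKey` should match the top-level `flagKey`, as it does in the example.

```python
def validate_experiment_payload(payload: dict) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    for field in ("projectKey", "environmentKey", "name", "flagKey"):
        if not payload.get(field):
            problems.append(f"missing {field}")

    metrics = payload.get("metrics", [])
    if not metrics:
        problems.append("no metrics attached")
    for metric in metrics:
        if "key" not in metric or "isGroup" not in metric:
            problems.append(f"metric needs key and isGroup: {metric}")

    treatments = payload.get("treatments", [])
    if len(treatments) < 2:
        problems.append("need at least two treatments (control and test)")
    # Assumption: exactly one treatment carries baseline: true.
    baselines = [t for t in treatments if t.get("baseline") is True]
    if len(baselines) != 1:
        problems.append(
            f"expected exactly one baseline treatment, found {len(baselines)}")
    # Assumption: every treatment targets the flag named at the top level.
    for t in treatments:
        if t.get("parameters", {}).get("flagKey") != payload.get("flagKey"):
            problems.append(f"treatment {t.get('name')!r} targets a different flag")
    return problems
```

Returning a list of messages rather than raising on the first failure lets all problems be reported in one pass.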
Step 3: Start Data Collection
Use start-experiment-iteration to begin collecting data:
{
  "projectKey": "my-project",
  "environmentKey": "production",
  "experimentKey": "checkout-flow-v2-experiment"
}
Optionally set reshuffle: true to redistribute traffic across treatments.
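As a sketch of how the optional flag fits in, the helper below builds the arguments for `start-experiment-iteration`. The function name is hypothetical, and placing `reshuffle` at the top level of the arguments, next to `experimentKey`, is an assumption rather than something this guide confirms.

```python
def start_iteration_args(project_key: str, environment_key: str,
                         experiment_key: str, reshuffle: bool = False) -> dict:
    """Build the arguments for a start-experiment-iteration call."""
    args = {
        "projectKey": project_key,
        "environmentKey": environment_key,
        "experimentKey": experiment_key,
    }
    # Assumption: reshuffle is a top-level argument and is omitted when not requested.
    if reshuffle:
        args["reshuffle"] = True
    return args


print(start_iteration_args("my-project", "production",
                           "checkout-flow-v2-experiment", reshuffle=True))
```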
Step 4: Verify
- Use `get-experiment` to confirm the experiment is running
- Check that all treatments are listed correctly
- Verify metrics are attached
- Confirm the iteration status shows as active
Report results:
- Experiment created and iteration started
- N treatments with M metrics configured
- Data collection is active
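The report can be assembled from values read off the `get-experiment` result. A minimal sketch follows. The helper is hypothetical, and it takes the counts and the iteration state as plain arguments so that it does not assume any particular response shape.

```python
def format_report(name: str, treatment_count: int, metric_count: int,
                  iteration_active: bool) -> str:
    """Summarize the verification step as a three-line report."""
    if iteration_active:
        headline = f"Experiment '{name}' created and iteration started"
        collection = "Data collection is active"
    else:
        headline = f"Experiment '{name}' created, but no iteration is running"
        collection = "Data collection is NOT active"
    configured = f"{treatment_count} treatments with {metric_count} metrics configured"
    return "\n".join([headline, configured, collection])


print(format_report("Checkout Flow v2 Experiment", 2, 2, True))
```

Reporting the inactive case explicitly avoids claiming that data is being collected when the iteration did not start.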
Edge Cases
| Situation | Action |
| --- | --- |
| Metric doesn't exist | Create it first with `create-metric` |
| Flag has no variations | Create flag variations before setting up treatments |
| Experiment already exists | Use `list-experiments` to find it, then `get-experiment` for details |
| Need to change metrics mid-experiment | Stop the current iteration, update, then start a new one |
What NOT to Do
- Don't start an experiment without clearly defined metrics
- Don't stop experiments too early — wait for statistical significance
- Don't run multiple experiments on the same flag simultaneously without careful holdout design
- Don't forget to set a baseline treatment: one treatment must be marked `baseline: true`