launchdarkly-experiment-setup

Set up and run experiments in LaunchDarkly. Create experiments with metrics and treatments, start iterations to collect data, and monitor results.

INSTALLATION
npx skills add https://github.com/launchdarkly/agent-skills --skill launchdarkly-experiment-setup
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

Core Concepts

What Are Experiments?

Experiments in LaunchDarkly let you measure the impact of feature flag variations on key metrics. An experiment consists of:

  • Treatments: The flag variations being compared (control vs. test)
  • Metrics: What you're measuring (conversion rate, latency, revenue, etc.)
  • Iterations: Data collection periods — start an iteration to begin collecting data
  • Holdout (optional): A percentage of traffic excluded from the experiment for baseline measurement

Experiment Lifecycle

  1. Create the experiment with metrics and treatments
  2. Start an iteration to begin data collection
  3. Monitor results as data accumulates
  4. Stop the iteration when you have statistical significance
  5. Ship the winning variation

Core Principles

  • Metrics First: Ensure your metrics exist before creating the experiment
  • Clear Hypothesis: Know what you expect to improve and by how much
  • Proper Controls: Always include a control treatment (the current behavior)
  • Sufficient Sample Size: Let experiments run long enough for statistical significance
  • One Change at a Time: Test one variable per experiment for clear attribution
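
To make "Sufficient Sample Size" concrete, here is a rough planning sketch. It uses the standard normal-approximation formula for comparing two conversion rates, which is an assumption on my part and not necessarily the statistics LaunchDarkly applies to experiment results. The function name and its default `alpha` and `power` values are illustrative.

```python
import math
from statistics import NormalDist


def sample_size_per_treatment(baseline_rate, min_relative_lift, alpha=0.05, power=0.8):
    """Approximate users needed per treatment (two-sided two-proportion z-test).

    baseline_rate: current conversion rate of the control, e.g. 0.10
    min_relative_lift: smallest relative improvement worth detecting, e.g. 0.10 for +10%
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + min_relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # critical value for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Hypothesis: a 10% baseline conversion rate improves by at least 10% relative.
needed = sample_size_per_treatment(0.10, 0.10)
print(f"Plan for roughly {needed} users per treatment")
```

Smaller expected lifts need far more traffic, which is why writing down the hypothesis before starting helps decide how long an iteration should run.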

Workflow

Step 1: Prepare Metrics

Before creating an experiment, ensure the metrics you want to measure exist:

  • Use list-metrics to check for existing metrics
  • If needed, use create-metric to create new ones
  • Note the metric keys — you'll need them for the experiment
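
The check above can be sketched as a small helper. It assumes the `list-metrics` result has already been reduced to a list of objects that each carry a `key` field; that shape and the function name are assumptions for illustration, not the tool's documented output.

```python
def missing_metric_keys(required_keys, existing_metrics):
    """Return the required metric keys that are absent from the existing metrics.

    existing_metrics: list of dicts with a "key" field (assumed shape).
    """
    existing = {metric["key"] for metric in existing_metrics}
    return [key for key in required_keys if key not in existing]


# Hypothetical list-metrics result, trimmed to the field this check needs.
existing = [{"key": "checkout-completed"}, {"key": "page-load-time-ms"}]
to_create = missing_metric_keys(["checkout-completed", "checkout-time-seconds"], existing)
print(to_create)
```

Any key the helper returns would need a `create-metric` call before the experiment is created.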

Common metric types:

| Goal | Metric Type | Example |
|---|---|---|
| Conversion | Custom conversion | checkout-completed |
| Performance | Custom numeric | page-load-time-ms |
| Engagement | Custom conversion | feature-clicked |
| Revenue | Custom numeric | order-value |

Step 2: Create the Experiment

Use create-experiment with:

  • projectKey and environmentKey -- where to run the experiment
  • name -- descriptive name for the experiment
  • flagKey -- the feature flag being experimented on
  • metrics -- array of metric objects with key and isGroup fields
  • treatments -- array of treatments, each with a name, baseline flag, and parameters
  • holdout (optional) -- percentage of traffic to exclude

{
  "projectKey": "my-project",
  "environmentKey": "production",
  "name": "Checkout Flow v2 Experiment",
  "flagKey": "checkout-flow-v2",
  "metrics": [
    {"key": "checkout-completed", "isGroup": false},
    {"key": "checkout-time-seconds", "isGroup": false}
  ],
  "treatments": [
    {
      "name": "Control",
      "baseline": true,
      "parameters": {
        "flagKey": "checkout-flow-v2",
        "variationId": "variation-a-id"
      }
    },
    {
      "name": "New Checkout",
      "baseline": false,
      "parameters": {
        "flagKey": "checkout-flow-v2",
        "variationId": "variation-b-id"
      }
    }
  ]
}
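
When the payload is assembled programmatically, a small builder keeps the flag key consistent across treatments and marks the baseline by name. This is a minimal sketch that mirrors the field names in the example above; the function name, its arguments, and the choice to set `isGroup` to false for every metric are assumptions.

```python
import json


def build_experiment_payload(project_key, environment_key, name, flag_key,
                             metric_keys, variations, baseline_name):
    """Build a create-experiment payload.

    variations: dict mapping treatment name -> variationId.
    baseline_name: the treatment name to mark as the control.
    """
    if baseline_name not in variations:
        raise ValueError(f"baseline treatment {baseline_name!r} is not in variations")
    return {
        "projectKey": project_key,
        "environmentKey": environment_key,
        "name": name,
        "flagKey": flag_key,
        # Assumption: none of these metrics is a metric group.
        "metrics": [{"key": key, "isGroup": False} for key in metric_keys],
        "treatments": [
            {
                "name": treatment_name,
                "baseline": treatment_name == baseline_name,
                "parameters": {"flagKey": flag_key, "variationId": variation_id},
            }
            for treatment_name, variation_id in variations.items()
        ],
    }


payload = build_experiment_payload(
    "my-project", "production", "Checkout Flow v2 Experiment", "checkout-flow-v2",
    ["checkout-completed", "checkout-time-seconds"],
    {"Control": "variation-a-id", "New Checkout": "variation-b-id"},
    baseline_name="Control",
)
print(json.dumps(payload, indent=2))
```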

Step 3: Start Data Collection

Use start-experiment-iteration to begin collecting data:

{
  "projectKey": "my-project",
  "environmentKey": "production",
  "experimentKey": "checkout-flow-v2-experiment"
}

Optionally set reshuffle: true to redistribute traffic across treatments.
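
A sketch of building the start payload with the optional field: `reshuffle` is only included when the caller sets it, so the tool's own default applies otherwise. The function name is hypothetical.

```python
def build_start_iteration_payload(project_key, environment_key, experiment_key, reshuffle=None):
    """Build a start-experiment-iteration payload; omit reshuffle unless explicitly set."""
    payload = {
        "projectKey": project_key,
        "environmentKey": environment_key,
        "experimentKey": experiment_key,
    }
    if reshuffle is not None:
        payload["reshuffle"] = reshuffle
    return payload


print(build_start_iteration_payload(
    "my-project", "production", "checkout-flow-v2-experiment", reshuffle=True
))
```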

Step 4: Verify

  • Use get-experiment to confirm the experiment is running
  • Check that all treatments are listed correctly
  • Verify metrics are attached
  • Confirm the iteration status shows as active
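
The checks above could be automated along these lines. The response field names used here (`treatments`, `metrics`, `iterationStatus`) and the status value `"active"` are hypothetical stand-ins; adjust them to whatever `get-experiment` actually returns.

```python
def verify_experiment(experiment, expected_treatments, expected_metric_keys):
    """Return a list of problems found in a get-experiment result (empty list means OK).

    The field names read from `experiment` are assumed, not documented.
    """
    problems = []
    treatment_names = {t["name"] for t in experiment.get("treatments", [])}
    for name in expected_treatments:
        if name not in treatment_names:
            problems.append(f"missing treatment: {name}")
    metric_keys = {m["key"] for m in experiment.get("metrics", [])}
    for key in expected_metric_keys:
        if key not in metric_keys:
            problems.append(f"missing metric: {key}")
    status = experiment.get("iterationStatus")
    if status != "active":
        problems.append(f"iteration not active (status: {status})")
    return problems


# Hypothetical get-experiment result.
result = {
    "treatments": [{"name": "Control"}, {"name": "New Checkout"}],
    "metrics": [{"key": "checkout-completed"}],
    "iterationStatus": "active",
}
print(verify_experiment(result, ["Control", "New Checkout"],
                        ["checkout-completed", "checkout-time-seconds"]))
```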

Report results:

  • Experiment created and iteration started
  • N treatments with M metrics configured
  • Data collection is active
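
The report can be produced from the payload that was submitted. A minimal sketch with a hypothetical helper name, filling in N and M from the treatments and metrics arrays:

```python
def format_setup_report(payload):
    """Summarize a created experiment using the create-experiment payload fields."""
    treatment_count = len(payload["treatments"])
    metric_count = len(payload["metrics"])
    return "\n".join([
        "Experiment created and iteration started",
        f"{treatment_count} treatments with {metric_count} metrics configured",
        "Data collection is active",
    ])


print(format_setup_report({
    "treatments": [{"name": "Control"}, {"name": "New Checkout"}],
    "metrics": [{"key": "checkout-completed"}, {"key": "checkout-time-seconds"}],
}))
```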

Edge Cases

| Situation | Action |
|---|---|
| Metric doesn't exist | Create it first with create-metric |
| Flag has no variations | Create flag variations before setting up treatments |
| Experiment already exists | Use list-experiments to find it, then get-experiment for details |
| Need to change metrics mid-experiment | Stop the current iteration, update, then start a new one |

What NOT to Do

  • Don't start an experiment without clearly defined metrics
  • Don't stop experiments too early — wait for statistical significance
  • Don't run multiple experiments on the same flag simultaneously without careful holdout design
  • Don't forget to set a baseline treatment — one treatment must be marked baseline: true
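
Two of these rules can be checked mechanically before calling `create-experiment`. The sketch below reads "one treatment must be marked baseline: true" as exactly one, which is an interpretation; the function name is hypothetical.

```python
def preflight_errors(payload):
    """Return rule violations found in a create-experiment payload (empty list means OK)."""
    errors = []
    if not payload.get("metrics"):
        errors.append("no metrics defined")
    baselines = [t for t in payload.get("treatments", []) if t.get("baseline") is True]
    if len(baselines) != 1:
        errors.append(f"expected exactly one baseline treatment, found {len(baselines)}")
    return errors


# A payload with no metrics and no baseline treatment triggers both checks.
print(preflight_errors({
    "metrics": [],
    "treatments": [{"name": "Control", "baseline": False},
                   {"name": "New Checkout", "baseline": False}],
}))
```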