SKILL.md

CI/CD Best Practices

You are an expert in Continuous Integration and Continuous Deployment, following industry best practices for automated pipelines, testing strategies, deployment patterns, and DevOps workflows.

Core Principles

Automate everything that can be automated

Fail fast with quick feedback loops

Build once, deploy many times

Implement infrastructure as code

Practice continuous improvement

Maintain security at every stage

Pipeline Design

Pipeline Stages

A typical CI/CD pipeline includes these stages:

Build -> Test -> Security -> Deploy (Staging) -> Deploy (Production)

#### 1. Build Stage

build:

  stage: build

  script:

    - npm ci --cache .npm --prefer-offline

    - npm run build

  artifacts:

    paths:

      - dist/

    expire_in: 1 day

  cache:

    key: ${CI_COMMIT_REF_SLUG}

    paths:

      # npm ci deletes node_modules/ before installing, so cache npm's download cache instead

      - .npm/

Best practices:

Use dependency caching to speed up builds

Generate build artifacts for downstream stages

Pin dependency versions for reproducibility

Use multi-stage Docker builds for smaller images

#### 2. Test Stage

test:

  stage: test

  parallel:

    matrix:

      - TEST_TYPE: [unit, integration, e2e]

  script:

    - npm run test:${TEST_TYPE}

  coverage: '/Coverage: \d+\.\d+%/'

  artifacts:

    reports:

      junit: test-results.xml

      coverage_report:

        coverage_format: cobertura

        path: coverage/cobertura-coverage.xml

Testing layers:

Unit tests: Fast, isolated, run on every commit

Integration tests: Test component interactions

End-to-end tests: Validate user workflows

Performance tests: Check for regressions

#### 3. Security Stage

security:

  stage: security

  parallel:

    matrix:

      - SCAN_TYPE: [sast, dependency, secrets]

  script:

    - ./security-scan.sh ${SCAN_TYPE}

  allow_failure: false

Security scanning types:

SAST: Static Application Security Testing

DAST: Dynamic Application Security Testing

Dependency scanning: Check for vulnerable packages

Secret detection: Find leaked credentials

Container scanning: Analyze Docker images
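
Since the snippets here use GitLab CI syntax, one way to cover several of these scan types is to include GitLab's managed templates instead of maintaining a custom script. This is a minimal sketch, assuming a GitLab version and edition where these templates are available. The template jobs run in the `test` stage by default, so that stage has to exist.

```yaml
# Sketch: pull in GitLab-managed scanner jobs instead of a hand-written scan script.
stages:
  - build
  - test      # the included scanner jobs attach to this stage by default
  - deploy

include:
  - template: Security/SAST.gitlab-ci.yml              # static analysis of source code
  - template: Security/Secret-Detection.gitlab-ci.yml  # leaked credentials in commits
```

GitLab also ships dependency and container scanning templates under the same `Security/` prefix; whether they can be used depends on the edition and version in use.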

#### 4. Deploy Stage

deploy:staging:

  stage: deploy

  environment:

    name: staging

    url: https://staging.example.com

  script:

    - ./deploy.sh staging

  rules:

    - if: $CI_COMMIT_BRANCH == "develop"

deploy:production:

  stage: deploy

  environment:

    name: production

    url: https://example.com

  script:

    - ./deploy.sh production

  rules:

    - if: $CI_COMMIT_BRANCH == "main"

      when: manual

Deployment Strategies

Blue-Green Deployment

Maintain two identical environments:

deploy:blue-green:

  script:

    - ./deploy-to-inactive.sh

    - ./run-smoke-tests.sh

    - ./switch-traffic.sh

    - ./cleanup-old-environment.sh

Benefits:

Zero-downtime deployments

Easy rollback by switching traffic back, as long as the old environment is kept until the new one is verified

Full testing in a production-like environment
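
Rollback by switching traffic back only works while the previous environment still exists. One possible arrangement, sketched here with the same hypothetical scripts, splits cleanup into a separate manual job so the old environment stays available until someone confirms the release. It assumes a GitLab version that allows `needs` between jobs in the same stage.

```yaml
# Sketch: keep the old environment until cleanup is triggered by hand.
deploy:blue-green:
  stage: deploy
  script:
    - ./deploy-to-inactive.sh
    - ./run-smoke-tests.sh
    - ./switch-traffic.sh

cleanup:blue-green:
  stage: deploy
  needs: ["deploy:blue-green"]   # only offered after the switch succeeded
  when: manual                   # run once the new environment is verified
  script:
    - ./cleanup-old-environment.sh
```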

Canary Deployment

Gradually roll out to a subset of users:

deploy:canary:

  script:

    - ./deploy-canary.sh --percentage=5

    - ./monitor-metrics.sh --duration=30m

    - ./deploy-canary.sh --percentage=25

    - ./monitor-metrics.sh --duration=30m

    - ./deploy-canary.sh --percentage=100

Canary stages:

Deploy to 5% of traffic

Monitor error rates and latency

Gradually increase if metrics are healthy

Full rollout or rollback based on data
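
The promote-or-rollback decision can be made explicit by giving each canary step its own stage and adding a job that runs only on failure. This is a sketch under two assumptions: `monitor-metrics.sh` exits non-zero when metrics are unhealthy, and `deploy-canary.sh --percentage=0` sends all traffic back to the stable version. Both scripts are hypothetical.

```yaml
# Sketch: one stage per canary step, plus a rollback job triggered by any failure.
stages:
  - canary-5
  - canary-25
  - full
  - rollback

canary:5:
  stage: canary-5
  script:
    - ./deploy-canary.sh --percentage=5
    - ./monitor-metrics.sh --duration=30m   # assumed to fail on unhealthy metrics

canary:25:
  stage: canary-25
  script:
    - ./deploy-canary.sh --percentage=25
    - ./monitor-metrics.sh --duration=30m

canary:full:
  stage: full
  script:
    - ./deploy-canary.sh --percentage=100

canary:rollback:
  stage: rollback
  when: on_failure                          # runs only if an earlier stage failed
  script:
    - ./deploy-canary.sh --percentage=0     # hypothetical: route everything back to stable
```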

Rolling Deployment

Update instances incrementally:

deploy:rolling:

  script:

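    # Point the Deployment at the new image ("app" as the container name is an assumption)

    - kubectl set image deployment/app app="$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"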

    - kubectl rollout status deployment/app --timeout=5m

Configuration:

Set maxUnavailable and maxSurge

Health checks determine rollout pace

Roll back on failure with kubectl rollout undo (a failed Kubernetes rollout stalls but is not reverted automatically)
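
These settings live in the Deployment manifest rather than in the pipeline. The following is a minimal sketch; the image, port, replica count, and `/ready` endpoint are placeholders, not values taken from this document.

```yaml
# Sketch: rolling update tuned to keep full capacity during the rollout.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 4
  progressDeadlineSeconds: 300      # report the rollout as failed after 5 minutes without progress
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0             # never drop below the desired replica count
      maxSurge: 1                   # add at most one extra pod at a time
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: registry.example.com/app:1.2.3   # placeholder image
          ports:
            - containerPort: 8080
          readinessProbe:           # a new pod receives traffic only after this passes
            httpGet:
              path: /ready          # assumed endpoint
              port: 8080
            periodSeconds: 5
```

When the progress deadline is exceeded, `kubectl rollout status` exits non-zero, which gives the pipeline a signal to act on.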

Feature Flags

Decouple deployment from release:

// Feature flag implementation: choose the checkout component at render time

function Checkout() {

  if (featureFlags.isEnabled('new-checkout')) {

    return <NewCheckout />;

  }

  return <LegacyCheckout />;

}

Benefits:

Deploy disabled features to production

Gradual feature rollout

A/B testing capabilities

Quick feature disable without deployment

Environment Management

Environment Hierarchy

Development -> Testing -> Staging -> Production

Each environment should:

Mirror production as closely as possible

Have isolated data and secrets

Use infrastructure as code

Environment Variables

variables:

  # Global variables

  APP_NAME: my-app

# Environment-specific

.staging:

  variables:

    ENV: staging

    API_URL: https://api.staging.example.com

.production:

  variables:

    ENV: production

    API_URL: https://api.example.com

Best practices:

Never hardcode secrets

Use secret management (Vault, AWS Secrets Manager)

Separate configuration from code

Document all required variables
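
Hidden jobs such as `.staging` take effect only when a job extends them. The sketch below shows one way to wire that up and to fail fast when a required secret is missing. `DEPLOY_TOKEN` is a hypothetical variable, assumed to be defined as a masked variable in the project's CI/CD settings rather than in the file.

```yaml
# Sketch: reuse environment-specific variables and check required secrets up front.
.staging:
  variables:
    ENV: staging
    API_URL: https://api.staging.example.com

deploy:staging:
  extends: .staging               # inherits ENV and API_URL
  stage: deploy
  script:
    # Abort with a clear message if the (hypothetical) secret was not configured.
    - ': "${DEPLOY_TOKEN:?DEPLOY_TOKEN must be set in CI/CD settings}"'
    - ./deploy.sh "$ENV"
```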

Infrastructure as Code

# Terraform example

resource "aws_ecs_service" "app" {

  name            = var.app_name

  cluster         = aws_ecs_cluster.main.id

  task_definition = aws_ecs_task_definition.app.arn

  desired_count   = var.environment == "production" ? 3 : 1

  deployment_maximum_percent         = 200

  deployment_minimum_healthy_percent = 100

}

Testing Strategies

Test Pyramid

          /\
         /  \      E2E Tests (Few)
        /----\
       /      \    Integration Tests (Some)
      /--------\
     /          \  Unit Tests (Many)
    --------------

Test Parallelization

test:

  parallel: 4

  script:

    - npm test -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL

Test Data Management

Use fixtures for consistent test data

Reset database state between tests

Use factories for dynamic test data

Avoid production data in tests
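
One way to get clean, production-free data in CI is to start a throwaway database for each job. This is a sketch assuming a PostgreSQL-backed Node.js app; the `db:migrate` npm script and the credentials are placeholders that exist only inside the job.

```yaml
# Sketch: each job gets a fresh, empty database that is discarded afterwards.
test:integration:
  stage: test
  image: node:20
  services:
    - postgres:16                 # reachable from the job at hostname "postgres"
  variables:
    POSTGRES_DB: app_test
    POSTGRES_USER: runner
    POSTGRES_PASSWORD: ci-only    # throwaway credential, valid only for this job
    DATABASE_URL: postgres://runner:ci-only@postgres:5432/app_test
  script:
    - npm ci
    - npm run db:migrate          # hypothetical script that creates the schema
    - npm run test:integration
```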

Flaky Test Handling

test:

  retry:

    max: 2

    when:

      - runner_system_failure

      - stuck_or_timeout_failure

Strategies:

Quarantine flaky tests

Add retry logic for known issues

Investigate and fix root causes

Track flaky test metrics
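
Quarantining can be expressed as a separate job that is allowed to fail but still publishes its results, so flaky tests stay visible without blocking merges. This sketch assumes a hypothetical `test:quarantine` npm script that runs only the quarantined tests and writes a JUnit report.

```yaml
# Sketch: run quarantined tests without letting them block the pipeline.
test:quarantine:
  stage: test
  script:
    - npm run test:quarantine     # hypothetical script
  allow_failure: true             # failures are shown as warnings
  artifacts:
    when: always                  # keep the report even when tests fail
    reports:
      junit: quarantine-results.xml
```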

Monitoring and Observability

Pipeline Metrics

Track these metrics:

Lead time for changes: Time from commit to running in production

Deployment frequency: How often you deploy to production

Change failure rate: Percentage of deployments that cause a failure in production

Mean time to recovery: Time to restore service after a failure

Health Checks

deploy:

  script:

    - ./deploy.sh

    - ./wait-for-healthy.sh --timeout=300

    - ./run-smoke-tests.sh

Implement:

Readiness probes

Liveness probes

Startup probes

Smoke tests post-deployment
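
On Kubernetes, the three probe types are declared per container. The following is a minimal sketch; the endpoints, port, and timings are assumptions to adapt, not recommended values.

```yaml
# Sketch: startup, readiness, and liveness probes on one container.
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.2.3   # placeholder image
      ports:
        - containerPort: 8080
      startupProbe:               # holds off the other probes until the app has booted
        httpGet:
          path: /healthz          # assumed endpoint
          port: 8080
        periodSeconds: 5
        failureThreshold: 30      # allows up to 150 seconds for startup
      readinessProbe:             # controls whether the pod receives traffic
        httpGet:
          path: /ready            # assumed endpoint
          port: 8080
        periodSeconds: 5
      livenessProbe:              # restarts the container if it stops responding
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
```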

Alerting

notify:failure:

  stage: notify

  script:

    - ./send-alert.sh --channel=deployments --status=failed

  when: on_failure

notify:success:

  stage: notify

  script:

    - ./send-notification.sh --channel=deployments --status=success

  when: on_success

Security in CI/CD

Secrets Management

# Use CI/CD secret variables

deploy:

  script:

    - echo "$DEPLOY_KEY" | base64 -d > deploy_key

    - chmod 600 deploy_key

    - ./deploy.sh

  after_script:

    - rm -f deploy_key

Best practices:

Rotate secrets regularly

Use short-lived credentials

Audit secret access

Never log secrets
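
Short-lived credentials can replace a long-lived key stored in CI variables. As one possible approach, GitLab can issue an ID token for each job and exchange it with HashiCorp Vault. This sketch assumes a GitLab tier with the Vault integration, a Vault server configured to trust GitLab's tokens, and `VAULT_SERVER_URL` and `VAULT_AUTH_ROLE` set in the project. The secret path and the `--key-file` flag are hypothetical.

```yaml
# Sketch: fetch the deploy key from Vault at job time using a per-job ID token.
deploy:
  stage: deploy
  id_tokens:
    VAULT_ID_TOKEN:
      aud: https://vault.example.com          # placeholder Vault address
  secrets:
    DEPLOY_KEY:
      vault: production/deploy/key@secret     # hypothetical path, field, and mount
      token: $VAULT_ID_TOKEN
      file: true                              # DEPLOY_KEY holds the path of a temp file
  script:
    - ./deploy.sh --key-file "$DEPLOY_KEY"    # hypothetical flag
```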

Pipeline Security

# Require a manual trigger for production deploys; restrict who can trigger it with protected branches and protected environments

deploy:production:

  rules:

    - if: $CI_COMMIT_BRANCH == "main"

      when: manual

      allow_failure: false

  environment:

    name: production

    deployment_tier: production

Controls:

Branch protection rules

Required approvals

Audit logging

Signed commits

Dependency Security

dependency_check:

  script:

    - npm audit --audit-level=high

    - ./check-licenses.sh

  allow_failure: false

Optimization Techniques

Caching

cache:

  key:

    files:

      - package-lock.json

  paths:

    - node_modules/

  policy: pull-push

Cache strategies:

Cache dependencies between runs

Use content-based cache keys

Separate cache per branch

Clean stale caches periodically
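
Content-based keys and per-branch separation can be combined, and jobs that only read the cache can skip the upload. This is a sketch assuming an npm project; it caches npm's download directory (`.npm/`) because `npm ci` removes `node_modules/` before installing.

```yaml
# Sketch: lockfile-keyed cache, namespaced per branch, shared through a hidden job.
.npm_cache:
  cache:
    key:
      prefix: $CI_COMMIT_REF_SLUG   # separate cache per branch
      files:
        - package-lock.json         # key changes whenever the lockfile changes
    paths:
      - .npm/

build:
  extends: .npm_cache
  stage: build
  script:
    - npm ci --cache .npm --prefer-offline
    - npm run build

test:
  extends: .npm_cache
  stage: test
  cache:
    policy: pull                    # read the cache, never re-upload it
  script:
    - npm ci --cache .npm --prefer-offline
    - npm test
```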

Parallelization

stages:

  - build

  - test

  - deploy

# Run tests in parallel

test:unit:

  stage: test

  script: npm run test:unit

test:integration:

  stage: test

  script: npm run test:integration

test:e2e:

  stage: test

  script: npm run test:e2e

Artifact Management

build:

  artifacts:

    paths:

      - dist/

    expire_in: 1 week

    when: on_success

Best practices:

Set appropriate expiration

Only store necessary artifacts

Use artifact compression

Clean up old artifacts
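
Storing only what is needed also covers which jobs download artifacts. The sketch below excludes source maps from the upload and lets a job opt out of downloading the build output. The `lint` npm script and the choice to exclude `.map` files are assumptions for illustration.

```yaml
# Sketch: trim what is uploaded and control which jobs download it.
build:
  stage: build
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    exclude:
      - dist/**/*.map             # assumed not needed downstream
    expire_in: 1 week

test:unit:
  stage: test
  needs:
    - job: build
      artifacts: true             # download dist/ from the build job
  script:
    - npm run test:unit

lint:
  stage: test
  needs:
    - job: build
      artifacts: false            # wait for build, but skip the download
  script:
    - npm run lint                # hypothetical script
```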

Rollback Strategies

Automatic Rollback

deploy:

  script:

    - ./deploy.sh

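    # Roll back when the health check fails, then fail the job so the pipeline does not report success

    - ./health-check.sh || { ./rollback.sh; exit 1; }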

Manual Rollback

rollback:

  stage: deploy

  when: manual

  script:

    # Assumes get-previous-version.sh prints the version on stdout

    - PREVIOUS_VERSION=$(./get-previous-version.sh)

    - ./deploy.sh --version="$PREVIOUS_VERSION"

Database Rollbacks

Use reversible migrations

Test rollback procedures

Consider data compatibility

Have a backup restoration process

Documentation

Pipeline Documentation

Document in your repository:

Pipeline stages and their purpose

Required environment variables

Deployment procedures

Troubleshooting guides

Rollback procedures

Runbooks

Create runbooks for:

Deployment failures

Rollback procedures

Environment setup

Incident response

Continuous Improvement

Metrics to Track

Build success rate

Average build time

Test coverage trends

Deployment frequency

Incident frequency

Regular Reviews

Weekly pipeline performance review

Monthly security assessment

Quarterly process improvement

Annual tooling evaluation
