senior-ml-engineer

Production-grade ML engineering expertise for deploying models, building MLOps systems, and scaling AI infrastructure. Covers model deployment, feature stores, monitoring, and distributed computing with PyTorch, TensorFlow, Spark, and Kubernetes. Includes LLM integration patterns, RAG system architecture, and fine-tuning workflows using LangChain and LlamaIndex. Provides production patterns for scalable data processing, real-time inference, A/B testing, and automated retraining pipelines. Addresses security, compliance, performance optimization, and team leadership responsibilities for enterprise AI systems.

INSTALLATION
npx skills add https://github.com/davila7/claude-code-templates --skill senior-ml-engineer
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

Core Expertise

This skill covers the following areas:

  • Advanced production patterns and architectures
  • Scalable system design and implementation
  • Performance optimization at scale
  • MLOps and DataOps best practices
  • Real-time processing and inference
  • Distributed computing frameworks
  • Model deployment and monitoring
  • Security and compliance
  • Cost optimization
  • Team leadership and mentoring

Tech Stack

Languages: Python, SQL, R, Scala, Go

ML Frameworks: PyTorch, TensorFlow, Scikit-learn, XGBoost

Data Tools: Spark, Airflow, dbt, Kafka, Databricks

LLM Frameworks: LangChain, LlamaIndex, DSPy

Deployment: Docker, Kubernetes, AWS/GCP/Azure

Monitoring: MLflow, Weights & Biases, Prometheus

Databases: PostgreSQL, BigQuery, Snowflake, Pinecone

Reference Documentation

1. MLOps Production Patterns

Comprehensive guide available in references/mlops_production_patterns.md covering:

  • Advanced patterns and best practices
  • Production implementation strategies
  • Performance optimization techniques
  • Scalability considerations
  • Security and compliance
  • Real-world case studies

2. LLM Integration Guide

Complete workflow documentation in references/llm_integration_guide.md including:

  • Step-by-step processes
  • Architecture design patterns
  • Tool integration guides
  • Performance tuning strategies
  • Troubleshooting procedures

3. RAG System Architecture

Technical reference guide in references/rag_system_architecture.md with:

  • System design principles
  • Implementation examples
  • Configuration best practices
  • Deployment strategies
  • Monitoring and observability

Production Patterns

Pattern 1: Scalable Data Processing

Enterprise-scale data processing with distributed computing:

  • Horizontal scaling architecture
  • Fault-tolerant design
  • Real-time and batch processing
  • Data quality validation
  • Performance monitoring
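
As a minimal sketch of the data quality validation step, the following splits incoming records into valid rows and failures with reasons. The `Rule` helper, the field names (`user_id`, `amount`), and the bounds are all hypothetical and are not taken from the skill's reference files; at enterprise scale the same idea would normally run inside a distributed job rather than a plain loop.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class Rule:
    """A named check applied to one field of a record (hypothetical helper)."""
    field: str
    check: Callable[[Any], bool]
    message: str


def validate(records: Iterable[dict], rules: list[Rule]) -> tuple[list[dict], list[tuple[int, str]]]:
    """Split records into valid rows and (row_index, reason) failures."""
    valid, failures = [], []
    for i, record in enumerate(records):
        # A missing field counts as a failure of every rule on that field.
        reasons = [
            f"{r.field}: {r.message}"
            for r in rules
            if r.field not in record or not r.check(record[r.field])
        ]
        if reasons:
            failures.extend((i, reason) for reason in reasons)
        else:
            valid.append(record)
    return valid, failures


# Illustrative rules; field names and bounds are assumptions.
RULES = [
    Rule("user_id", lambda v: isinstance(v, int) and v > 0, "must be a positive integer"),
    Rule("amount", lambda v: isinstance(v, (int, float)) and 0 <= v < 1e6, "out of range"),
]

rows = [
    {"user_id": 1, "amount": 19.99},
    {"user_id": -5, "amount": 10.0},
    {"user_id": 2},  # missing "amount"
]
valid_rows, failed = validate(rows, RULES)
```

Returning failures alongside valid rows, instead of raising on the first bad record, lets a pipeline quarantine bad data and keep processing the rest.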

Pattern 2: ML Model Deployment

Production ML system with high availability:

  • Model serving with low latency
  • A/B testing infrastructure
  • Feature store integration
  • Model monitoring and drift detection
  • Automated retraining pipelines
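
One common way to implement the drift detection item is the Population Stability Index (PSI), which compares the binned distribution of a feature in live traffic against a reference sample. The sketch below is an illustration under assumptions, not the method the skill's reference files prescribe: the function name, the equal-width binning, and the `eps` floor for empty bins are all choices made here.

```python
import math
from bisect import bisect_right


def psi(expected: list[float], actual: list[float], bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a reference sample and a live sample."""
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    # Interior edges of equal-width bins over the reference sample's range.
    edges = [lo + (hi - lo) * k / bins for k in range(1, bins)]

    def proportions(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the reference range fall into the first or last bin.
            counts[bisect_right(edges, v)] += 1
        # Floor empty bins at eps so the logarithm below stays defined.
        return [max(c / len(values), eps) for c in counts]

    p, q = proportions(expected), proportions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


reference = [i / 100 for i in range(100)]      # roughly uniform on [0, 1)
shifted = [0.5 + i / 200 for i in range(100)]  # mass moved to the upper half
score_same = psi(reference, reference)
score_shifted = psi(reference, shifted)
```

A frequently quoted rule of thumb treats PSI above roughly 0.25 as a significant shift. That threshold is a convention and not something this document specifies, so any cutoff that triggers automated retraining should be tuned per feature.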

Pattern 3: Real-Time Inference

High-throughput inference system:

  • Batching and caching strategies
  • Load balancing
  • Auto-scaling
  • Latency optimization
  • Cost optimization
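
To make the caching item concrete, here is a small sketch of a bounded prediction cache with time-based expiry, so repeated requests for the same features skip the model call. The `TTLCache` class, its parameters, and the `fake_model` stand-in are hypothetical; a production system would more likely use a shared cache service, and batching would be handled separately by the serving layer.

```python
import time
from collections import OrderedDict
from typing import Callable, Hashable


class TTLCache:
    """Bounded LRU cache whose entries expire after ttl_seconds."""

    def __init__(self, max_size: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock                      # injectable for testing
        self._data: OrderedDict = OrderedDict()  # key -> (expires_at, value)

    def get_or_compute(self, key: Hashable, compute: Callable[[], object]) -> object:
        now = self._clock()
        entry = self._data.get(key)
        if entry is not None and entry[0] > now:
            self._data.move_to_end(key)          # mark as most recently used
            return entry[1]
        value = compute()                        # cache miss or expired entry
        self._data[key] = (now + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max_size:
            self._data.popitem(last=False)       # evict least recently used
        return value


calls = []


def fake_model(features: tuple) -> float:
    """Stand-in for an expensive model call (assumption for illustration)."""
    calls.append(features)
    return float(sum(features))


cache = TTLCache(max_size=2, ttl_seconds=60.0)
first = cache.get_or_compute((1, 2), lambda: fake_model((1, 2)))
second = cache.get_or_compute((1, 2), lambda: fake_model((1, 2)))  # served from cache
```

The TTL bounds how stale a cached prediction can be after a model or feature update, while the size limit bounds memory. Both values trade latency and cost against freshness.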

Best Practices

Development

  • Test-driven development
  • Code reviews and pair programming
  • Documentation as code
  • Version control everything
  • Continuous integration

Production

  • Monitor everything critical
  • Automate deployments
  • Feature flags for releases
  • Canary deployments
  • Comprehensive logging
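
Feature flags and canary deployments both need a stable way to decide which requests see the new version. A minimal sketch, assuming users are identified by a string ID: hash the ID with a per-rollout salt and compare the resulting bucket with the rollout percentage. The function name and the salt value `"model-v2"` are hypothetical.

```python
import hashlib


def in_canary(user_id: str, rollout_percent: float, salt: str = "model-v2") -> bool:
    """Deterministically assign a user to the canary for a given rollout percentage."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999, i.e. 0.01% steps
    return bucket < rollout_percent * 100


users = [f"user-{i}" for i in range(10_000)]
canary_share = sum(in_canary(u, 10) for u in users) / len(users)
```

Because the bucket depends only on the salt and the user ID, a user admitted at 5% stays in the canary when the rollout widens to 20%. Changing the salt reshuffles the assignment for the next rollout.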

Team Leadership

  • Mentor junior engineers
  • Drive technical decisions
  • Establish coding standards
  • Foster learning culture
  • Cross-functional collaboration

Performance Targets

Latency:

  • P50: < 50ms
  • P95: < 100ms
  • P99: < 200ms

Throughput:

  • Requests/second: > 1000
  • Concurrent users: > 10,000

Availability:

  • Uptime: 99.9%
  • Error rate: < 0.1%
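
The targets above can be checked mechanically. The sketch below computes nearest-rank percentiles from a list of latency samples and compares them with the P50/P95/P99 limits, then converts the uptime target into a downtime budget. The helper names and the 30-day window are assumptions, and real systems usually read percentiles from a metrics backend instead of raw samples.

```python
# Latency targets copied from the list above, in milliseconds.
LATENCY_TARGETS_MS = {50: 50.0, 95: 100.0, 99: 200.0}


def percentile(samples: list[float], p: int) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceiling of p * n / 100
    return ordered[rank - 1]


def check_latency(samples_ms: list[float]) -> dict[int, bool]:
    """Map each target percentile to True when the measured value is under its limit."""
    return {p: percentile(samples_ms, p) < limit for p, limit in LATENCY_TARGETS_MS.items()}


def downtime_budget_minutes(uptime_percent: float, days: int = 30) -> float:
    """Allowed downtime, in minutes, over a window of `days` at the given uptime."""
    return (1 - uptime_percent / 100) * days * 24 * 60


samples = [10.0] * 98 + [150.0, 250.0]
report = check_latency(samples)
budget = downtime_budget_minutes(99.9)  # about 43.2 minutes per 30 days
```

At 99.9% uptime the budget works out to 0.001 × 30 × 24 × 60 ≈ 43.2 minutes per 30-day window, which is a useful number to keep in mind when sizing deployment and incident-response windows.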

Security & Compliance

  • Authentication & authorization
  • Data encryption (at rest & in transit)
  • PII handling and anonymization
  • GDPR/CCPA compliance
  • Regular security audits
  • Vulnerability management
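
For the PII handling item, two simple building blocks are keyed hashing of identifiers and redaction of emails in free text. The sketch below uses only the Python standard library; the function names, the email pattern, and the key value are illustrative assumptions, and a real key should come from a secrets manager. Note that keyed hashing is pseudonymization, not full anonymization, so the output may still count as personal data under GDPR.

```python
import hashlib
import hmac
import re

# Deliberately simple pattern; it will not match every valid email address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Keyed hash: the same input maps to the same token without exposing the raw value."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def redact_emails(text: str) -> str:
    """Replace email addresses in free text with a fixed placeholder."""
    return EMAIL_RE.sub("[EMAIL]", text)


key = b"example-key-do-not-use"  # placeholder value for illustration only
token = pseudonymize("jane.doe@example.com", key)
clean = redact_emails("Contact jane.doe@example.com today")
```

Using HMAC with a secret key instead of a bare hash makes it harder to reverse tokens by hashing a list of known identifiers, while still allowing joins on the pseudonymized column.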

Common Commands

# Development

python -m pytest tests/ -v --cov

python -m black src/

python -m pylint src/

# Training

python scripts/train.py --config prod.yaml

python scripts/evaluate.py --model best.pth

# Deployment

docker build -t service:v1 .

kubectl apply -f k8s/

helm upgrade service ./charts/

# Monitoring

kubectl logs -f deployment/service

python scripts/health_check.py

Resources

  • Advanced Patterns: references/mlops_production_patterns.md
  • Implementation Guide: references/llm_integration_guide.md
  • Technical Reference: references/rag_system_architecture.md
  • Automation Scripts: scripts/ directory

Senior-Level Responsibilities

A senior ML engineer is expected to cover the following areas:

1. Technical Leadership

  • Drive architectural decisions
  • Mentor team members
  • Establish best practices
  • Ensure code quality

2. Strategic Thinking

  • Align with business goals
  • Evaluate trade-offs
  • Plan for scale
  • Manage technical debt

3. Collaboration

  • Work across teams
  • Communicate effectively
  • Build consensus
  • Share knowledge

4. Innovation

  • Stay current with research
  • Experiment with new approaches
  • Contribute to community
  • Drive continuous improvement

5. Production Excellence

  • Ensure high availability
  • Monitor proactively
  • Optimize performance
  • Respond to incidents