SKILL.md
Transformers and LLMs
- Leverage the Transformers library for pre-trained models
- Correctly implement attention mechanisms and positional encodings
- Use efficient fine-tuning techniques (LoRA, P-tuning)
- Handle tokenization and sequences properly
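
As a rough illustration of the attention and positional-encoding bullets above, here is a minimal sketch in plain PyTorch. It assumes the fixed sinusoidal encoding and scaled dot-product attention from the original Transformer design. The function names, tensor sizes, and causal mask are illustrative choices, not part of this skill.

```python
import math
import torch

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Return a (seq_len, d_model) table of fixed sine/cosine position encodings."""
    if d_model % 2 != 0:
        raise ValueError("d_model must be even for this sketch")
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)  # (seq_len, 1)
    # Frequencies 1 / 10000^(2i / d_model) for i = 0 .. d_model/2 - 1
    div_term = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32) * (-math.log(10000.0) / d_model)
    )
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even feature indices
    pe[:, 1::2] = torch.cos(position * div_term)  # odd feature indices
    return pe

def scaled_dot_product_attention(q, k, v, mask=None):
    """Compute softmax(Q K^T / sqrt(d_k)) V; mask is True where attention is allowed."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Disallowed positions get -inf so softmax assigns them zero weight.
        scores = scores.masked_fill(~mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v, weights

# Toy self-attention over a batch of 2 sequences of length 5 (sizes are arbitrary).
torch.manual_seed(0)
pe = sinusoidal_positional_encoding(5, 16)
x = torch.randn(2, 5, 16) + pe
causal_mask = torch.tril(torch.ones(5, 5, dtype=torch.bool))  # no attending to the future
out, attn = scaled_dot_product_attention(x, x, x, mask=causal_mask)
```

In practice, the attention layers that ship with PyTorch or the Transformers library are usually preferable. The sketch is only meant to make the shapes and the masking convention explicit.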
Diffusion Models
- Employ the Diffusers library for diffusion model work
- Correctly implement forward/reverse diffusion processes
- Utilize appropriate noise schedulers and sampling methods
- Understand different pipelines (StableDiffusionPipeline, StableDiffusionXLPipeline)
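
The forward/reverse bullet above can be made concrete with a small sketch of the DDPM-style closed-form forward process. It assumes a linear beta schedule with 1000 steps, a common default rather than something this skill prescribes. The helper names (`q_sample`, `predict_x0`) are hypothetical, and a real project would normally rely on a Diffusers scheduler instead.

```python
import torch

T = 1000  # number of diffusion steps (assumed for this sketch)
betas = torch.linspace(1e-4, 0.02, T)                # linear noise schedule (assumed)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)   # cumulative product of (1 - beta_t)

def _gather(values: torch.Tensor, t: torch.Tensor, ndim: int) -> torch.Tensor:
    """Pick values[t] per sample and reshape so it broadcasts over an ndim-dimensional batch."""
    return values[t].reshape(-1, *([1] * (ndim - 1)))

def q_sample(x0: torch.Tensor, t: torch.Tensor, noise: torch.Tensor) -> torch.Tensor:
    """Forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * noise."""
    sqrt_ab = _gather(alphas_cumprod.sqrt(), t, x0.ndim)
    sqrt_one_minus_ab = _gather((1.0 - alphas_cumprod).sqrt(), t, x0.ndim)
    return sqrt_ab * x0 + sqrt_one_minus_ab * noise

def predict_x0(x_t: torch.Tensor, t: torch.Tensor, predicted_noise: torch.Tensor) -> torch.Tensor:
    """Invert the forward equation given a noise estimate (used inside reverse steps)."""
    sqrt_ab = _gather(alphas_cumprod.sqrt(), t, x_t.ndim)
    sqrt_one_minus_ab = _gather((1.0 - alphas_cumprod).sqrt(), t, x_t.ndim)
    return (x_t - sqrt_one_minus_ab * predicted_noise) / sqrt_ab

# Toy batch of 4 "images"; each sample gets its own timestep.
torch.manual_seed(0)
x0 = torch.randn(4, 3, 8, 8)
t = torch.tensor([0, 100, 500, 900])
noise = torch.randn_like(x0)
x_t = q_sample(x0, t, noise)
x0_rec = predict_x0(x_t, t, noise)  # with the true noise, x0 is recovered
```

During training, a network would predict `noise` from `x_t` and `t`. Here the true noise is passed back in only to check that the two equations are consistent.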
Training and Evaluation
- Implement efficient PyTorch DataLoaders
- Use proper train/validation/test splits
- Apply early stopping and learning rate scheduling
- Use task-appropriate evaluation metrics
- Implement gradient clipping and NaN/Inf handling
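
The training bullets above can be combined into one loop. This is a sketch on synthetic regression data. The split sizes, patience values, learning rate, and clipping norm are arbitrary assumptions chosen to keep the example small.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Synthetic data, used only so the sketch is self-contained.
X = torch.randn(600, 10)
y = X @ torch.randn(10, 1) + 0.1 * torch.randn(600, 1)
train_set, val_set, test_set = random_split(
    TensorDataset(X, y), [420, 90, 90], generator=torch.Generator().manual_seed(0)
)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=64)

model = nn.Linear(10, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.5, patience=2)
loss_fn = nn.MSELoss()

def evaluate(loader: DataLoader) -> float:
    """Mean loss over a loader, without tracking gradients."""
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for xb, yb in loader:
            total += loss_fn(model(xb), yb).item() * len(xb)
            count += len(xb)
    return total / count

best_val, bad_epochs, patience = float("inf"), 0, 5
history = []
for epoch in range(50):
    model.train()
    for xb, yb in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        if not torch.isfinite(loss):
            continue  # skip the batch instead of propagating NaN/Inf
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
    val_loss = evaluate(val_loader)
    history.append(val_loss)
    scheduler.step(val_loss)  # lower the learning rate when validation loss plateaus
    if val_loss < best_val - 1e-4:
        best_val, bad_epochs = val_loss, 0
        best_state = {k: v.clone() for k, v in model.state_dict().items()}
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # early stopping

model.load_state_dict(best_state)  # restore the best checkpoint before testing
test_loss = evaluate(DataLoader(test_set, batch_size=64))
```

The test split is touched only once, after the best validation checkpoint is restored, so the reported test loss is not influenced by model selection.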
Gradio Integration
- Create interactive demos for inference and visualization
- Build user-friendly interfaces with proper error handling
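
One possible shape for a demo with error handling is sketched below. `classify_sentiment` is a hypothetical keyword-counting stand-in for real model inference, and the two-output layout (prediction plus status message) is an assumption, not a Gradio requirement.

```python
import logging

logger = logging.getLogger("demo")

POSITIVE_WORDS = {"good", "great", "love"}   # toy vocabulary for the stand-in model
NEGATIVE_WORDS = {"bad", "awful", "hate"}

def classify_sentiment(text: str) -> dict:
    """Hypothetical stand-in for model inference; returns label -> score."""
    if not isinstance(text, str) or not text.strip():
        raise ValueError("Please enter some text.")
    words = text.lower().split()
    pos = sum(w in POSITIVE_WORDS for w in words)
    neg = sum(w in NEGATIVE_WORDS for w in words)
    score = (pos + 1) / (pos + neg + 2)  # smoothed fraction of positive hits
    return {"positive": score, "negative": 1.0 - score}

def safe_predict(text):
    """Wrap inference so the UI shows a readable message instead of a stack trace."""
    try:
        return classify_sentiment(text), ""
    except ValueError as exc:
        return None, f"Input error: {exc}"
    except Exception:
        logger.exception("Inference failed")
        return None, "Unexpected error; see the server logs."

def build_demo():
    # Imported lazily so the functions above can be tested without Gradio installed.
    import gradio as gr
    return gr.Interface(
        fn=safe_predict,
        inputs=gr.Textbox(label="Text", lines=3),
        outputs=[gr.Label(label="Prediction"), gr.Textbox(label="Status")],
        title="Sentiment demo",
    )
```

Calling `build_demo().launch()` would start the local server. Keeping `safe_predict` separate from the interface makes the error paths easy to unit test.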
Error Handling
- Use try-except blocks for error-prone operations
- Implement proper logging
- Leverage PyTorch's debugging tools (for example, torch.autograd.detect_anomaly) when tracking down NaN gradients
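
The error-handling bullets above might look like the following in practice. This is a sketch: the logger name, the checkpoint helper, and the decision to return `None` or `False` rather than re-raise are illustrative assumptions.

```python
import logging
import torch

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger("train")

def load_checkpoint(path: str):
    """Return the checkpoint object, or None if the file is missing or unreadable."""
    try:
        return torch.load(path, map_location="cpu")
    except FileNotFoundError:
        logger.warning("No checkpoint at %s; starting from scratch", path)
    except Exception:
        logger.exception("Could not read checkpoint %s", path)
    return None

def backward_with_anomaly_check(loss: torch.Tensor) -> bool:
    """Run backward under anomaly detection; return False if a NaN gradient is found.

    Anomaly detection slows training noticeably, so it is best enabled only while debugging.
    """
    try:
        with torch.autograd.detect_anomaly():
            loss.backward()
        return True
    except RuntimeError:
        logger.exception("Anomaly detected during backward pass")
        return False

# sqrt of a negative number yields NaN, which anomaly detection reports in backward.
bad_input = torch.tensor([-1.0], requires_grad=True)
bad_ok = backward_with_anomaly_check(torch.sqrt(bad_input).sum())

good_input = torch.tensor([4.0], requires_grad=True)
good_ok = backward_with_anomaly_check(torch.sqrt(good_input).sum())
```

Catching specific exceptions first and logging the unexpected ones with `logger.exception` keeps the traceback in the logs while letting the caller decide how to recover.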
Performance Optimization
- Utilize DataParallel/DistributedDataParallel for multi-GPU training (PyTorch recommends DistributedDataParallel, even on a single machine)
- Implement gradient accumulation to reach large effective batch sizes when GPU memory is limited
- Use mixed precision training with torch.cuda.amp
- Profile code to identify bottlenecks
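
Gradient accumulation and mixed precision from the list above are often used together. The sketch below assumes four accumulation steps and enables mixed precision only when a CUDA device is present. The model, data, and hyperparameters are placeholders. Recent PyTorch releases also expose the same utilities under the `torch.amp` namespace and may warn about the `torch.cuda.amp` spelling.

```python
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
use_amp = device.type == "cuda"  # this sketch enables mixed precision only on GPU

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)  # a no-op when disabled
loss_fn = nn.MSELoss()

accum_steps = 4  # effective batch size = 8 * 4 = 32 (assumed values)
micro_batches = [(torch.randn(8, 32), torch.randn(8, 1)) for _ in range(8)]

optimizer_steps = 0
optimizer.zero_grad(set_to_none=True)
for step, (xb, yb) in enumerate(micro_batches, start=1):
    xb, yb = xb.to(device), yb.to(device)
    with torch.autocast(device_type=device.type, enabled=use_amp):
        # Divide so the accumulated gradient matches one large-batch average.
        loss = loss_fn(model(xb), yb) / accum_steps
    scaler.scale(loss).backward()  # gradients accumulate across micro-batches

    if step % accum_steps == 0:
        scaler.unscale_(optimizer)  # unscale before clipping so the norm is meaningful
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        scaler.step(optimizer)
        scaler.update()
        optimizer.zero_grad(set_to_none=True)
        optimizer_steps += 1
```

With eight micro-batches and four accumulation steps, the optimizer updates twice. On a CPU-only machine the scaler and autocast are disabled and the loop reduces to plain accumulation.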
Required Dependencies
- torch
- transformers
- diffusers
- gradio
- numpy
- tqdm
- tensorboard or wandb (experiment tracking)
Project Conventions
- Begin with clear problem definition and dataset analysis
- Create modular code with separate files for models, data loading, training, evaluation
- Use YAML configuration files for hyperparameters
- Implement experiment tracking and model checkpointing
- Use version control for code and configuration tracking
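
For the YAML configuration convention above, one lightweight approach is to parse the file into a typed, immutable object at startup. The sketch assumes the PyYAML package (not in the dependency list above) and uses hypothetical keys. The YAML text is inlined here only so the example is self-contained; a project would read it from a file under version control.

```python
from dataclasses import dataclass

import yaml  # provided by the PyYAML package (an assumption of this sketch)

# Hypothetical config contents; normally read from something like a config file on disk.
CONFIG_TEXT = """
model:
  hidden_size: 256
  dropout: 0.1
training:
  batch_size: 32
  learning_rate: 0.0003
  max_epochs: 20
  checkpoint_dir: checkpoints
"""

@dataclass(frozen=True)
class TrainingConfig:
    batch_size: int
    learning_rate: float
    max_epochs: int
    checkpoint_dir: str

def load_training_config(text: str) -> TrainingConfig:
    """Parse YAML text and validate the 'training' section into a typed object."""
    raw = yaml.safe_load(text)
    section = raw["training"]
    # Explicit casts catch values that YAML parsed as an unexpected type.
    return TrainingConfig(
        batch_size=int(section["batch_size"]),
        learning_rate=float(section["learning_rate"]),
        max_epochs=int(section["max_epochs"]),
        checkpoint_dir=str(section["checkpoint_dir"]),
    )

config = load_training_config(CONFIG_TEXT)
```

A frozen dataclass makes accidental mutation of hyperparameters during a run an error, which helps keep logged configurations and actual behavior in sync.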