# Gemini Image Analysis
Analyze images using Gemini Pro's vision capabilities.
## Prerequisites
pip install google-generativeai
export GEMINI_API_KEY=your_api_key
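A script that wraps the CLI can fail early when the key is missing instead of surfacing an authentication error later. The following is a minimal Python sketch, assuming the key is read from the `GEMINI_API_KEY` environment variable exported above; the helper name `get_api_key` is hypothetical and not part of any library.

```python
import os
from typing import Mapping, Optional


def get_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return GEMINI_API_KEY, or raise a clear error if it is unset or blank.

    `env` defaults to the process environment; passing a dict makes the
    check easy to exercise without touching real environment variables.
    """
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export it before running gemini"
        )
    return key
```

Calling `get_api_key()` at the top of a wrapper script is enough to stop before any image is sent.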
## CLI Reference
### Basic Image Analysis
# Analyze an image
gemini -m pro -f /path/to/image.png "Describe this image in detail"
# With a specific question
gemini -m pro -f screenshot.png "What error message is shown?"
# Multiple images
gemini -m pro -f image1.png -f image2.png "Compare these two images"
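When these commands are driven from a script, building the argument list programmatically avoids shell quoting problems with long prompts. The sketch below mirrors the flag layout shown above (`-m` for the model, one `-f` per image, prompt last); the function name `build_gemini_args` is hypothetical, and the flags are taken from this document rather than verified against a particular CLI release.

```python
from typing import List, Sequence


def build_gemini_args(
    images: Sequence[str], prompt: str, model: str = "pro"
) -> List[str]:
    """Build an argv list of the form: gemini -m MODEL -f IMG [-f IMG ...] PROMPT."""
    if not images:
        raise ValueError("at least one image path is required")
    args = ["gemini", "-m", model]
    for image in images:
        # Each image gets its own -f flag, as in the multi-image example.
        args += ["-f", image]
    args.append(prompt)
    return args
```

The resulting list can be passed to `subprocess.run(args, check=True)`. Because no shell is involved, multi-line prompts and quotes inside the prompt need no escaping.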
## Analysis Operations
### General Description
gemini -m pro -f image.png "Describe this image comprehensively:
- Main subject/content
- Colors and composition
- Text visible (if any)
- Context and purpose
- Notable details"
### Extract Text (OCR)
gemini -m pro -f screenshot.png "Extract all text from this image.
Format as plain text, preserving layout where possible.
Include any text in buttons, labels, or UI elements."
### Code from Screenshot
gemini -m pro -f code-screenshot.png "Extract the code from this screenshot.
Provide as properly formatted code with correct indentation.
Note any parts that are unclear or partially visible."
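Replies to a code-extraction prompt often mix commentary with the code itself. As a rough sketch, assuming the model wraps the code in a Markdown fence (which is not guaranteed), the hypothetical helper `extract_code` below returns the contents of the first fenced block and falls back to the whole reply when no fence is found.

```python
import re

# Build the three-backtick fence marker without writing it literally.
FENCE = "`" * 3

# Opening fence with an optional language tag, then the shortest body
# up to the next closing fence.
_FENCED_BLOCK = re.compile(
    re.escape(FENCE) + r"[^\n]*\n(.*?)" + re.escape(FENCE), re.DOTALL
)


def extract_code(reply: str) -> str:
    """Return the first fenced code block in `reply`, or the reply itself."""
    match = _FENCED_BLOCK.search(reply)
    code = match.group(1) if match else reply
    # Trim surrounding blank lines only, so indentation inside is preserved.
    return code.strip("\n")
```

Only leading and trailing newlines are trimmed, so the indentation requested in the prompt survives.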
### UI Analysis
gemini -m pro -f ui-screenshot.png "Analyze this UI:
- What application/website is this?
- What page/screen is shown?
- Main UI elements and their purpose
- User flow/actions available
- Any UX issues or suggestions"
### Error Analysis
gemini -m pro -f error-screenshot.png "Analyze this error:
- What error is shown?
- What is the likely cause?
- How to fix it?
- Any related information visible?"
### Diagram Understanding
gemini -m pro -f diagram.png "Explain this diagram:
- What type of diagram is this?
- Main components and their relationships
- Data/process flow
- Key takeaways"
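The prompts in this section share one shape: an introductory line followed by a bulleted checklist. A small helper can assemble that shape so the checklists live in data rather than in long quoted strings. This is an illustrative sketch; `checklist_prompt` is a hypothetical name.

```python
from typing import Iterable


def checklist_prompt(intro: str, items: Iterable[str]) -> str:
    """Join an intro line and checklist items into one multi-line prompt."""
    lines = [intro.rstrip()]
    # One "- item" line per checklist entry, matching the prompts above.
    lines += [f"- {item.strip()}" for item in items]
    return "\n".join(lines)


DIAGRAM_PROMPT = checklist_prompt(
    "Explain this diagram:",
    [
        "What type of diagram is this?",
        "Main components and their relationships",
        "Data/process flow",
        "Key takeaways",
    ],
)
```

`DIAGRAM_PROMPT` reproduces the Diagram Understanding prompt and can be supplied as the final argument to the `gemini` command.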
## Specific Use Cases
### Debug Screenshot
gemini -m pro -f debug-screen.png "I'm debugging an issue. From this screenshot:
- What is the current state?
- What errors or warnings are visible?
- What should I look at?
- Suggested next steps"
### Compare Before/After
gemini -m pro -f before.png -f after.png "Compare these before and after images:
- What changed?
- Is this an improvement?
- Any issues in the 'after' version?
- Anything missing?"
### Design Feedback
gemini -m pro -f design.png "Provide design feedback:
- Visual hierarchy
- Color usage
- Typography
- Spacing and alignment
- Accessibility concerns
- Suggestions for improvement"
### Data Extraction
gemini -m pro -f chart.png "Extract data from this chart:
- Chart type
- Data series and values
- Axes labels and ranges
- Key trends or insights
- Output as structured data if possible"
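If the prompt asks for JSON specifically, the reply may still carry prose around the data. The sketch below, assuming the reply contains at least one valid JSON object or array, scans for the first one and decodes it with the standard library's `json.JSONDecoder.raw_decode`. The function name `first_json_value` is hypothetical.

```python
import json
from typing import Any


def first_json_value(reply: str) -> Any:
    """Decode the first JSON object or array embedded in `reply`."""
    decoder = json.JSONDecoder()
    for index, char in enumerate(reply):
        if char in "{[":
            try:
                # raw_decode parses one value starting at `index` and
                # ignores whatever text follows it.
                value, _end = decoder.raw_decode(reply, index)
            except json.JSONDecodeError:
                continue  # Not valid JSON from here; keep scanning.
            return value
    raise ValueError("no JSON object or array found in reply")
```

Values read from a chart are estimates by the model, so treat the decoded data as approximate.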
### Form Analysis
gemini -m pro -f form.png "Analyze this form:
- Form purpose
- Fields and their types
- Required vs optional
- Validation rules visible
- UX suggestions"
## Workflow Patterns
### Screenshot to Issue
# Capture screenshot (macOS)
screencapture -i /tmp/bug.png
# Analyze and format as an issue
gemini -m pro -f /tmp/bug.png "Create a bug report from this screenshot:
Summary
[One-line description]
Steps to Reproduce
[Inferred from screenshot]
Expected Behavior
[What should happen]
Actual Behavior
[What the screenshot shows]
Environment
[Any visible system info]"
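The two steps above can be chained in one script. The following is a minimal Python sketch, assuming macOS `screencapture` is available and that the `gemini` command prints its reply to standard output (an assumption, not verified here). The function name `screenshot_to_issue` and its `run` parameter are hypothetical; `run` exists so the command runner can be swapped out in tests.

```python
import subprocess
from typing import Any, Callable


def screenshot_to_issue(
    prompt: str,
    path: str = "/tmp/bug.png",
    run: Callable[..., Any] = subprocess.run,
) -> str:
    """Capture a screenshot interactively, then ask gemini for a bug report."""
    # Step 1: interactive capture (-i) to `path`, as in the command above.
    run(["screencapture", "-i", path], check=True)
    # Step 2: send the screenshot and the prompt; capture the reply text.
    result = run(
        ["gemini", "-m", "pro", "-f", path, prompt],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

With `check=True`, a cancelled capture or a failed CLI call raises `subprocess.CalledProcessError` instead of returning an empty report.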
### UI to Code
gemini -m pro -f ui-design.png "Generate React component code that recreates this UI:
- Use Tailwind CSS for styling
- Make it responsive
- Include proper TypeScript types
- Add appropriate accessibility attributes"
### Documentation
gemini -m pro -f app-screen.png "Write user documentation for this screen:
- What this screen is for
- How to use each feature
- Common tasks
- Tips and notes"