data-analysis

Dataset exploration, cleaning, statistical analysis, and visualization in Python or SQL. Supports CSV, JSON, and SQL data sources with pandas DataFrames and direct database queries Covers the full analysis pipeline: data loading, missing value handling, outlier detection, grouping, correlation analysis, and pivot tables Includes visualization templates for histograms, boxplots, heatmaps, and time series using matplotlib and seaborn Generates structured markdown reports with dataset overview, key findings, statistical summaries, and actionable recommendations

INSTALLATION
npx skills add https://github.com/supercent-io/skills-template --skill data-analysis
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

Data Analysis

When to use this skill

  • Data exploration: Understand a new dataset
  • Report generation: Derive data-driven insights
  • Quality validation: Check data consistency
  • Decision support: Make data-driven recommendations

Instructions

Step 1: Load and explore data

Python (Pandas):

import pandas as pd

import numpy as np

Load CSV

df = pd.read_csv('data.csv')

Basic info

print(df.info())

print(df.describe())

print(df.head(10))

Check missing values

print(df.isnull().sum())

Data types

print(df.dtypes)

**SQL**:

-- Inspect table schema

DESCRIBE table_name;

-- Sample data

SELECT * FROM table_name LIMIT 10;

-- Basic stats

SELECT

COUNT(*) as total_rows,

COUNT(DISTINCT column_name) as unique_values,

MIN(numeric_column) as min_val,

MAX(numeric_column) as max_val,

AVG(numeric_column) as avg_val

FROM table_name;


### Step 2: Data cleaning

Handle missing values

df['column'].fillna(df['column'].mean(), inplace=True)

df.dropna(subset=['required_column'], inplace=True)

Remove duplicates

df.drop_duplicates(inplace=True)

Type conversions

df['date'] = pd.to_datetime(df['date'])

df['category'] = df['category'].astype('category')

Remove outliers (IQR method)

Q1 = df['value'].quantile(0.25)

Q3 = df['value'].quantile(0.75)

IQR = Q3 - Q1

df = df[(df['value'] >= Q1 - 1.5IQR) &#x26; (df['value'] <= Q3 + 1.5IQR)]


### Step 3: Statistical analysis

Descriptive statistics

print(df['numeric_column'].describe())

Grouped analysis

grouped = df.groupby('category').agg({

'value': ['mean', 'sum', 'count'],

'other': 'nunique'

})

print(grouped)

Correlation

correlation = df[['col1', 'col2', 'col3']].corr()

print(correlation)

Pivot table

pivot = pd.pivot_table(df,

values='sales',

index='region',

columns='month',

aggfunc='sum'

)


### Step 4: Visualization

import matplotlib.pyplot as plt

import seaborn as sns

Histogram

plt.figure(figsize=(10, 6))

df['value'].hist(bins=30)

plt.title('Distribution of Values')

plt.savefig('histogram.png')

Boxplot

plt.figure(figsize=(10, 6))

sns.boxplot(x='category', y='value', data=df)

plt.title('Value by Category')

plt.savefig('boxplot.png')

Heatmap (correlation)

plt.figure(figsize=(10, 8))

sns.heatmap(correlation, annot=True, cmap='coolwarm')

plt.title('Correlation Matrix')

plt.savefig('heatmap.png')

Time series

plt.figure(figsize=(12, 6))

df.groupby('date')['value'].sum().plot()

plt.title('Time Series of Values')

plt.savefig('timeseries.png')


### Step 5: Derive insights

Top/bottom analysis

top_10 = df.nlargest(10, 'value')

bottom_10 = df.nsmallest(10, 'value')

Trend analysis

df['month'] = df['date'].dt.to_period('M')

monthly_trend = df.groupby('month')['value'].sum()

growth = monthly_trend.pct_change() * 100

Segment analysis

segments = df.groupby('segment').agg({

'revenue': 'sum',

'customers': 'nunique',

'orders': 'count'

})

segments['avg_order_value'] = segments['revenue'] / segments['orders']


## Output format

### Analysis report structure

Data Analysis Report

1. Dataset overview

  • Dataset: [name]
  • Records: X,XXX
  • Columns: XX
  • Date range: YYYY-MM-DD ~ YYYY-MM-DD

2. Key findings

  • Insight 1
  • Insight 2
  • Insight 3

3. Statistical summary

MetricValue
MeanX.XX
MedianX.XX
Std devX.XX

4. Recommendations

  1. [Recommendation 1]
  1. [Recommendation 2]
BrowserAct

Let your agent run on any real-world website

Bypass CAPTCHA & anti-bot for free. Start local, scale to cloud.

Explore BrowserAct Skills →

Stop writing automation&scrapers

Install the CLI. Run your first Skill in 30 seconds. Scale when you're ready.

Start free
free · no credit card