statsmodels

Statistical modeling toolkit. OLS, GLM, logistic, ARIMA, time series, hypothesis tests, diagnostics, AIC/BIC, for rigorous statistical inference and…

INSTALLATION
npx skills add https://github.com/davila7/claude-code-templates --skill statsmodels
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

$2a

Quick Start Guide

Linear Regression (OLS)

import statsmodels.api as sm

import numpy as np

import pandas as pd

# Prepare data - ALWAYS add constant for intercept

X = sm.add_constant(X_data)

# Fit OLS model

model = sm.OLS(y, X)

results = model.fit()

# View comprehensive results

print(results.summary())

# Key results

print(f"R-squared: {results.rsquared:.4f}")

print(f"Coefficients:\\n{results.params}")

print(f"P-values:\\n{results.pvalues}")

# Predictions with confidence intervals

predictions = results.get_prediction(X_new)

pred_summary = predictions.summary_frame()

print(pred_summary)  # includes mean, CI, prediction intervals

# Diagnostics

from statsmodels.stats.diagnostic import het_breuschpagan

bp_test = het_breuschpagan(results.resid, X)

print(f"Breusch-Pagan p-value: {bp_test[1]:.4f}")

# Visualize residuals

import matplotlib.pyplot as plt

plt.scatter(results.fittedvalues, results.resid)

plt.axhline(y=0, color='r', linestyle='--')

plt.xlabel('Fitted values')

plt.ylabel('Residuals')

plt.show()

Logistic Regression (Binary Outcomes)

from statsmodels.discrete.discrete_model import Logit

# Add constant

X = sm.add_constant(X_data)

# Fit logit model

model = Logit(y_binary, X)

results = model.fit()

print(results.summary())

# Odds ratios

odds_ratios = np.exp(results.params)

print("Odds ratios:\\n", odds_ratios)

# Predicted probabilities

probs = results.predict(X)

# Binary predictions (0.5 threshold)

predictions = (probs > 0.5).astype(int)

# Model evaluation

from sklearn.metrics import classification_report, roc_auc_score

print(classification_report(y_binary, predictions))

print(f"AUC: {roc_auc_score(y_binary, probs):.4f}")

# Marginal effects

marginal = results.get_margeff()

print(marginal.summary())

Time Series (ARIMA)

from statsmodels.tsa.arima.model import ARIMA

from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Check stationarity

from statsmodels.tsa.stattools import adfuller

adf_result = adfuller(y_series)

print(f"ADF p-value: {adf_result[1]:.4f}")

if adf_result[1] > 0.05:

    # Series is non-stationary, difference it

    y_diff = y_series.diff().dropna()

# Plot ACF/PACF to identify p, q

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 8))

plot_acf(y_diff, lags=40, ax=ax1)

plot_pacf(y_diff, lags=40, ax=ax2)

plt.show()

# Fit ARIMA(p,d,q)

model = ARIMA(y_series, order=(1, 1, 1))

results = model.fit()

print(results.summary())

# Forecast

forecast = results.forecast(steps=10)

forecast_obj = results.get_forecast(steps=10)

forecast_df = forecast_obj.summary_frame()

print(forecast_df)  # includes mean and confidence intervals

# Residual diagnostics

results.plot_diagnostics(figsize=(12, 8))

plt.show()

Generalized Linear Models (GLM)

import statsmodels.api as sm

# Poisson regression for count data

X = sm.add_constant(X_data)

model = sm.GLM(y_counts, X, family=sm.families.Poisson())

results = model.fit()

print(results.summary())

# Rate ratios (for Poisson with log link)

rate_ratios = np.exp(results.params)

print("Rate ratios:\\n", rate_ratios)

# Check overdispersion

overdispersion = results.pearson_chi2 / results.df_resid

print(f"Overdispersion: {overdispersion:.2f}")

if overdispersion > 1.5:

    # Use Negative Binomial instead

    from statsmodels.discrete.count_model import NegativeBinomial

    nb_model = NegativeBinomial(y_counts, X)

    nb_results = nb_model.fit()

    print(nb_results.summary())

Core Statistical Modeling Capabilities

1. Linear Regression Models

Comprehensive suite of linear models for continuous outcomes with various error structures.

Available models:

  • OLS: Standard linear regression with i.i.d. errors
  • WLS: Weighted least squares for heteroskedastic errors
  • GLS: Generalized least squares for arbitrary covariance structure
  • GLSAR: GLS with autoregressive errors for time series
  • Quantile Regression: Conditional quantiles (robust to outliers)
  • Mixed Effects: Hierarchical/multilevel models with random effects
  • Recursive/Rolling: Time-varying parameter estimation

Key features:

  • Comprehensive diagnostic tests
  • Robust standard errors (HC, HAC, cluster-robust)
  • Influence statistics (Cook's distance, leverage, DFFITS)
  • Hypothesis testing (F-tests, Wald tests)
  • Model comparison (AIC, BIC, likelihood ratio tests)
  • Prediction with confidence and prediction intervals

When to use: Continuous outcome variable, want inference on coefficients, need diagnostics

Reference: See references/linear_models.md for detailed guidance on model selection, diagnostics, and best practices.

2. Generalized Linear Models (GLM)

Flexible framework extending linear models to non-normal distributions.

Distribution families:

  • Binomial: Binary outcomes or proportions (logistic regression)
  • Poisson: Count data
  • Negative Binomial: Overdispersed counts
  • Gamma: Positive continuous, right-skewed data
  • Inverse Gaussian: Positive continuous with specific variance structure
  • Gaussian: Equivalent to OLS
  • Tweedie: Flexible family for semi-continuous data

Link functions:

  • Logit, Probit, Log, Identity, Inverse, Sqrt, CLogLog, Power
  • Choose based on interpretation needs and model fit

Key features:

  • Maximum likelihood estimation via IRLS
  • Deviance and Pearson residuals
  • Goodness-of-fit statistics
  • Pseudo R-squared measures
  • Robust standard errors

When to use: Non-normal outcomes, need flexible variance and link specifications

Reference: See references/glm.md for family selection, link functions, interpretation, and diagnostics.

3. Discrete Choice Models

Models for categorical and count outcomes.

Binary models:

  • Logit: Logistic regression (odds ratios)
  • Probit: Probit regression (normal distribution)

Multinomial models:

  • MNLogit: Unordered categories (3+ levels)
  • Conditional Logit: Choice models with alternative-specific variables
  • Ordered Model: Ordinal outcomes (ordered categories)

Count models:

  • Poisson: Standard count model
  • Negative Binomial: Overdispersed counts
  • Zero-Inflated: Excess zeros (ZIP, ZINB)
  • Hurdle Models: Two-stage models for zero-heavy data

Key features:

  • Maximum likelihood estimation
  • Marginal effects at means or average marginal effects
  • Model comparison via AIC/BIC
  • Predicted probabilities and classification
  • Goodness-of-fit tests

When to use: Binary, categorical, or count outcomes

Reference: See references/discrete_choice.md for model selection, interpretation, and evaluation.

4. Time Series Analysis

Comprehensive time series modeling and forecasting capabilities.

Univariate models:

  • AutoReg (AR): Autoregressive models
  • ARIMA: Autoregressive integrated moving average
  • SARIMAX: Seasonal ARIMA with exogenous variables
  • Exponential Smoothing: Simple, Holt, Holt-Winters
  • ETS: Innovations state space models

Multivariate models:

  • VAR: Vector autoregression
  • VARMAX: VAR with MA and exogenous variables
  • Dynamic Factor Models: Extract common factors
  • VECM: Vector error correction models (cointegration)

Advanced models:

  • State Space: Kalman filtering, custom specifications
  • Regime Switching: Markov switching models
  • ARDL: Autoregressive distributed lag

Key features:

  • ACF/PACF analysis for model identification
  • Stationarity tests (ADF, KPSS)
  • Forecasting with prediction intervals
  • Residual diagnostics (Ljung-Box, heteroskedasticity)
  • Granger causality testing
  • Impulse response functions (IRF)
  • Forecast error variance decomposition (FEVD)

When to use: Time-ordered data, forecasting, understanding temporal dynamics

Reference: See references/time_series.md for model selection, diagnostics, and forecasting methods.

5. Statistical Tests and Diagnostics

Extensive testing and diagnostic capabilities for model validation.

Residual diagnostics:

  • Autocorrelation tests (Ljung-Box, Durbin-Watson, Breusch-Godfrey)
  • Heteroskedasticity tests (Breusch-Pagan, White, ARCH)
  • Normality tests (Jarque-Bera, Omnibus, Anderson-Darling, Lilliefors)
  • Specification tests (RESET, Harvey-Collier)

Influence and outliers:

  • Leverage (hat values)
  • Cook's distance
  • DFFITS and DFBETAs
  • Studentized residuals
  • Influence plots

Hypothesis testing:

  • t-tests (one-sample, two-sample, paired)
  • Proportion tests
  • Chi-square tests
  • Non-parametric tests (Mann-Whitney, Wilcoxon, Kruskal-Wallis)
  • ANOVA (one-way, two-way, repeated measures)

Multiple comparisons:

  • Tukey's HSD
  • Bonferroni correction
  • False Discovery Rate (FDR)

Effect sizes and power:

  • Cohen's d, eta-squared
  • Power analysis for t-tests, proportions
  • Sample size calculations

Robust inference:

  • Heteroskedasticity-consistent SEs (HC0-HC3)
  • HAC standard errors (Newey-West)
  • Cluster-robust standard errors

When to use: Validating assumptions, detecting problems, ensuring robust inference

Reference: See references/stats_diagnostics.md for comprehensive testing and diagnostic procedures.

Formula API (R-style)

Statsmodels supports R-style formulas for intuitive model specification:

import statsmodels.formula.api as smf

# OLS with formula

results = smf.ols('y ~ x1 + x2 + x1:x2', data=df).fit()

# Categorical variables (automatic dummy coding)

results = smf.ols('y ~ x1 + C(category)', data=df).fit()

# Interactions

results = smf.ols('y ~ x1 * x2', data=df).fit()  # x1 + x2 + x1:x2

# Polynomial terms

results = smf.ols('y ~ x + I(x**2)', data=df).fit()

# Logit

results = smf.logit('y ~ x1 + x2 + C(group)', data=df).fit()

# Poisson

results = smf.poisson('count ~ x1 + x2', data=df).fit()

# ARIMA (not available via formula, use regular API)

Model Selection and Comparison

Information Criteria

# Compare models using AIC/BIC

models = {

    'Model 1': model1_results,

    'Model 2': model2_results,

    'Model 3': model3_results

}

comparison = pd.DataFrame({

    'AIC': {name: res.aic for name, res in models.items()},

    'BIC': {name: res.bic for name, res in models.items()},

    'Log-Likelihood': {name: res.llf for name, res in models.items()}

})

print(comparison.sort_values('AIC'))

# Lower AIC/BIC indicates better model

Likelihood Ratio Test (Nested Models)

# For nested models (one is subset of the other)

from scipy import stats

lr_stat = 2 * (full_model.llf - reduced_model.llf)

df = full_model.df_model - reduced_model.df_model

p_value = 1 - stats.chi2.cdf(lr_stat, df)

print(f"LR statistic: {lr_stat:.4f}")

print(f"p-value: {p_value:.4f}")

if p_value < 0.05:

    print("Full model significantly better")

else:

    print("Reduced model preferred (parsimony)")

Cross-Validation

from sklearn.model_selection import KFold

from sklearn.metrics import mean_squared_error

kf = KFold(n_splits=5, shuffle=True, random_state=42)

cv_scores = []

for train_idx, val_idx in kf.split(X):

    X_train, X_val = X.iloc[train_idx], X.iloc[val_idx]

    y_train, y_val = y.iloc[train_idx], y.iloc[val_idx]

    # Fit model

    model = sm.OLS(y_train, X_train).fit()

    # Predict

    y_pred = model.predict(X_val)

    # Score

    rmse = np.sqrt(mean_squared_error(y_val, y_pred))

    cv_scores.append(rmse)

print(f"CV RMSE: {np.mean(cv_scores):.4f} ± {np.std(cv_scores):.4f}")

Best Practices

Data Preparation

  • Always add constant: Use sm.add_constant() unless excluding intercept
  • Check for missing values: Handle or impute before fitting
  • Scale if needed: Improves convergence, interpretation (but not required for tree models)
  • Encode categoricals: Use formula API or manual dummy coding

Model Building

  • Start simple: Begin with basic model, add complexity as needed
  • Check assumptions: Test residuals, heteroskedasticity, autocorrelation
  • Use appropriate model: Match model to outcome type (binary→Logit, count→Poisson)
  • Consider alternatives: If assumptions violated, use robust methods or different model

Inference

  • Report effect sizes: Not just p-values
  • Use robust SEs: When heteroskedasticity or clustering present
  • Multiple comparisons: Correct when testing many hypotheses
  • Confidence intervals: Always report alongside point estimates

Model Evaluation

  • Check residuals: Plot residuals vs fitted, Q-Q plot
  • Influence diagnostics: Identify and investigate influential observations
  • Out-of-sample validation: Test on holdout set or cross-validate
  • Compare models: Use AIC/BIC for non-nested, LR test for nested

Reporting

  • Comprehensive summary: Use .summary() for detailed output
  • Document decisions: Note transformations, excluded observations
  • Interpret carefully: Account for link functions (e.g., exp(β) for log link)
  • Visualize: Plot predictions, confidence intervals, diagnostics

Common Workflows

Workflow 1: Linear Regression Analysis

  • Explore data (plots, descriptives)
  • Fit initial OLS model
  • Check residual diagnostics
  • Test for heteroskedasticity, autocorrelation
  • Check for multicollinearity (VIF)
  • Identify influential observations
  • Refit with robust SEs if needed
  • Interpret coefficients and inference
  • Validate on holdout or via CV

Workflow 2: Binary Classification

  • Fit logistic regression (Logit)
  • Check for convergence issues
  • Interpret odds ratios
  • Calculate marginal effects
  • Evaluate classification performance (AUC, confusion matrix)
  • Check for influential observations
  • Compare with alternative models (Probit)
  • Validate predictions on test set

Workflow 3: Count Data Analysis

  • Fit Poisson regression
  • Check for overdispersion
  • If overdispersed, fit Negative Binomial
  • Check for excess zeros (consider ZIP/ZINB)
  • Interpret rate ratios
  • Assess goodness of fit
  • Compare models via AIC
  • Validate predictions

Workflow 4: Time Series Forecasting

  • Plot series, check for trend/seasonality
  • Test for stationarity (ADF, KPSS)
  • Difference if non-stationary
  • Identify p, q from ACF/PACF
  • Fit ARIMA or SARIMAX
  • Check residual diagnostics (Ljung-Box)
  • Generate forecasts with confidence intervals
  • Evaluate forecast accuracy on test set

Reference Documentation

This skill includes comprehensive reference files for detailed guidance:

references/linear_models.md

Detailed coverage of linear regression models including:

  • OLS, WLS, GLS, GLSAR, Quantile Regression
  • Mixed effects models
  • Recursive and rolling regression
  • Comprehensive diagnostics (heteroskedasticity, autocorrelation, multicollinearity)
  • Influence statistics and outlier detection
  • Robust standard errors (HC, HAC, cluster)
  • Hypothesis testing and model comparison

references/glm.md

Complete guide to generalized linear models:

  • All distribution families (Binomial, Poisson, Gamma, etc.)
  • Link functions and when to use each
  • Model fitting and interpretation
  • Pseudo R-squared and goodness of fit
  • Diagnostics and residual analysis
  • Applications (logistic, Poisson, Gamma regression)

references/discrete_choice.md

Comprehensive guide to discrete outcome models:

  • Binary models (Logit, Probit)
  • Multinomial models (MNLogit, Conditional Logit)
  • Count models (Poisson, Negative Binomial, Zero-Inflated, Hurdle)
  • Ordinal models
  • Marginal effects and interpretation
  • Model diagnostics and comparison

references/time_series.md

In-depth time series analysis guidance:

  • Univariate models (AR, ARIMA, SARIMAX, Exponential Smoothing)
  • Multivariate models (VAR, VARMAX, Dynamic Factor)
  • State space models
  • Stationarity testing and diagnostics
  • Forecasting methods and evaluation
  • Granger causality, IRF, FEVD

references/stats_diagnostics.md

Comprehensive statistical testing and diagnostics:

  • Residual diagnostics (autocorrelation, heteroskedasticity, normality)
  • Influence and outlier detection
  • Hypothesis tests (parametric and non-parametric)
  • ANOVA and post-hoc tests
  • Multiple comparisons correction
  • Robust covariance matrices
  • Power analysis and effect sizes

When to reference:

  • Need detailed parameter explanations
  • Choosing between similar models
  • Troubleshooting convergence or diagnostic issues
  • Understanding specific test statistics
  • Looking for code examples for advanced features

Search patterns:

# Find information about specific models

grep -r "Quantile Regression" references/

# Find diagnostic tests

grep -r "Breusch-Pagan" references/stats_diagnostics.md

# Find time series guidance

grep -r "SARIMAX" references/time_series.md

Common Pitfalls to Avoid

  • Forgetting constant term: Always use sm.add_constant() unless no intercept desired
  • Ignoring assumptions: Check residuals, heteroskedasticity, autocorrelation
  • Wrong model for outcome type: Binary→Logit/Probit, Count→Poisson/NB, not OLS
  • Not checking convergence: Look for optimization warnings
  • Misinterpreting coefficients: Remember link functions (log, logit, etc.)
  • Using Poisson with overdispersion: Check dispersion, use Negative Binomial if needed
  • Not using robust SEs: When heteroskedasticity or clustering present
  • Overfitting: Too many parameters relative to sample size
  • Data leakage: Fitting on test data or using future information
  • Not validating predictions: Always check out-of-sample performance
  • Comparing non-nested models: Use AIC/BIC, not LR test
  • Ignoring influential observations: Check Cook's distance and leverage
  • Multiple testing: Correct p-values when testing many hypotheses
  • Not differencing time series: Fit ARIMA on non-stationary data
  • Confusing prediction vs confidence intervals: Prediction intervals are wider

Getting Help

For detailed documentation and examples:

BrowserAct

Let your agent run on any real-world website

Bypass CAPTCHA & anti-bot for free. Start local, scale to cloud.

Explore BrowserAct Skills →

Stop writing automation&scrapers

Install the CLI. Run your first Skill in 30 seconds. Scale when you're ready.

Start free
free · no credit card