SKILL.md
import subprocess, sys
subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", "yfinance", "pandas", "numpy"])
If already installed, skip and proceed.
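One way to honor the "skip if already installed" rule is to check which packages are importable before calling pip. This is a minimal sketch using only the standard library; the helper name `missing_packages` is illustrative and not part of the skill.

```python
import importlib.util

def missing_packages(names):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Only the packages reported here would need a pip install.
to_install = missing_packages(["yfinance", "pandas", "numpy"])
print(to_install)
```

If the printed list is empty, the install step can be skipped entirely.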
Step 2: Route to the Correct Sub-Skill
Classify the user's request and jump to the matching section. If the user asks a general question about an ETF's premium or discount without specifying a particular analysis type, default to Sub-Skill A (Single ETF Snapshot).
| User Request | Route To | Examples |
|---|---|---|
| Single ETF premium/discount | Sub-Skill A: Single ETF Snapshot | "is SPY at a premium?", "AGG premium to NAV", "BITO premium" |
| Compare multiple ETFs | Sub-Skill B: Multi-ETF Comparison | "compare bond ETF discounts", "which has bigger premium IBIT or BITO", "rank these ETFs by premium" |
| Screener / find extreme premiums | Sub-Skill C: Premium Screener | "which ETFs have biggest discount", "find ETFs trading below NAV", "premium screener" |
| Deep analysis with context | Sub-Skill D: Premium Deep Dive | "why is HYG at a discount", "is ARKK premium normal", "ETF premium analysis with context" |
| Sudden premium surge / gamma squeeze | Sub-Skill E: Premium Surge Decomposition | "why did KWEB jump 13% today", "is this ETF rally driven by gamma", "decompose today's ETF move", "dealer GEX for SOXL", "how long until the premium converges" |
Defaults
| Parameter | Default |
|---|---|
| Data source | yfinance `navPrice` field |
| Price field | `regularMarketPrice` (falls back to `previousClose`) |
| Screener universe | Common ETF list by category (see Sub-Skill C) |
Sub-Skill A: Single ETF Snapshot
Goal: Show the current premium/discount for one ETF with context about what's normal, plus a peer comparison to show how it stacks up against similar ETFs.
A1: Fetch and compute
import yfinance as yf

# Peer groups by category, used to automatically compare the target ETF against its closest peers
CATEGORY_PEERS = {
    "Digital Assets": ["IBIT", "BITO", "FBTC", "ETHA", "ARKB", "GBTC"],
    "Intermediate Core Bond": ["AGG", "BND", "SCHZ"],
    "High Yield Bond": ["HYG", "JNK", "USHY"],
    "Long Government": ["TLT", "VGLT", "SPTL"],
    "Emerging Markets Bond": ["EMB", "VWOB", "PCY"],
    "Large Growth": ["QQQ", "VUG", "IWF", "SCHG"],
    "Large Blend": ["SPY", "VOO", "IVV", "VTI"],
    "Commodities Focused": ["GLD", "IAU", "SLV", "DBC"],
    "China Region": ["KWEB", "FXI", "MCHI"],
    "Trading--Leveraged Equity": ["TQQQ", "UPRO", "SOXL", "JNUG"],
    "Trading--Inverse Equity": ["SQQQ", "SPXU", "SOXS", "JDST"],
    "Derivative Income": ["JEPI", "JEPQ", "QYLD"],
    "Large Value": ["SCHD", "VYM", "DVY", "HDV"],
}

def etf_premium_snapshot(ticker_symbol):
    ticker = yf.Ticker(ticker_symbol)
    info = ticker.info
    # Verify this is an ETF
    quote_type = info.get("quoteType", "")
    if quote_type != "ETF":
        return {"error": f"{ticker_symbol} is not an ETF (quoteType={quote_type})"}
    price = info.get("regularMarketPrice") or info.get("previousClose")
    nav = info.get("navPrice")
    if not price or not nav or nav <= 0:
        return {"error": f"NAV data not available for {ticker_symbol}"}
    premium_pct = (price - nav) / nav * 100
    premium_dollar = price - nav
    # Additional context
    result = {
        "ticker": ticker_symbol,
        "name": info.get("longName") or info.get("shortName", ""),
        "market_price": round(price, 4),
        "nav": round(nav, 4),
        "premium_discount_pct": round(premium_pct, 4),
        "premium_discount_dollar": round(premium_dollar, 4),
        "status": "PREMIUM" if premium_pct > 0 else "DISCOUNT" if premium_pct < 0 else "AT NAV",
        "category": info.get("category", "N/A"),
        "fund_family": info.get("fundFamily", "N/A"),
        "total_assets": info.get("totalAssets"),
        "net_expense_ratio": info.get("netExpenseRatio"),
        "avg_volume": info.get("averageVolume"),
        "bid": info.get("bid"),
        "ask": info.get("ask"),
        "yield_pct": info.get("yield"),
        "ytd_return": info.get("ytdReturn"),
    }
    # Bid-ask spread as context for whether the premium is meaningful
    bid = info.get("bid")
    ask = info.get("ask")
    if bid and ask and bid > 0:
        spread_pct = (ask - bid) / ((ask + bid) / 2) * 100
        result["bid_ask_spread_pct"] = round(spread_pct, 4)
    return result
A2: Fetch peer comparison
After computing the target ETF's snapshot, look up its category and pull premium data for peers in the same category. This gives the user immediate context on whether the premium is ETF-specific or market-wide.
def get_peer_premiums(target_ticker, target_category):
    """Fetch premium/discount for peers in the same category."""
    peers = CATEGORY_PEERS.get(target_category, [])
    # Remove the target itself from peers
    peers = [p for p in peers if p.upper() != target_ticker.upper()]
    if not peers:
        return []
    peer_data = []
    for sym in peers:
        try:
            t = yf.Ticker(sym)
            info = t.info
            p = info.get("regularMarketPrice") or info.get("previousClose")
            n = info.get("navPrice")
            if p and n and n > 0:
                prem = (p - n) / n * 100
                peer_data.append({
                    "ticker": sym,
                    "name": info.get("shortName", ""),
                    "price": round(p, 2),
                    "nav": round(n, 2),
                    "premium_pct": round(prem, 4),
                    "expense_ratio": info.get("netExpenseRatio"),
                })
        except Exception:
            # Skip peers whose data cannot be fetched
            pass
    return peer_data
Present the peer comparison as a small table after the main snapshot. This helps the user see whether the premium is unique to their ETF or shared across the category — for example, if all crypto ETFs are at ~1.5% premium, the user's ETF isn't an outlier.
A3: Interpret the result
Use this framework to explain whether the premium/discount is meaningful:
| Premium/Discount | Interpretation |
|---|---|
| Within +/- 0.05% | Essentially at NAV; normal for large, liquid ETFs |
| +/- 0.05% to 0.25% | Minor deviation; common and usually not actionable |
| +/- 0.25% to 1.0% | Notable; worth mentioning. Check bid-ask spread and category |
| +/- 1.0% to 3.0% | Significant; common for less liquid, international, or specialty ETFs |
| Beyond +/- 3.0% | Large; may indicate stress, illiquidity, or structural issues |
Context matters by category:
- US large-cap equity (SPY, QQQ, IVV): premiums > 0.10% are unusual
- Bond ETFs (AGG, HYG, LQD, TLT): discounts of 0.5-2% happen during volatility
- International/EM (EEM, VWO, KWEB): time-zone mismatch causes regular 0.3-1% deviations
- Leveraged/Inverse (TQQQ, SQQQ, JNUG): 0.3-1.5% is normal due to daily reset mechanics
- Crypto (IBIT, BITO): 1-3% premiums are common, especially for newer funds
- Commodity (GLD, USO, UNG): depends on contango/backwardation in futures
Also compare the premium/discount to the bid-ask spread: if the premium is smaller than the spread, it's noise, not signal.
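The bands and the spread check above can be sketched as a small pure function. The function name `classify_premium` and the choice to treat each band's upper bound as inclusive are assumptions made for illustration; the table itself does not say which side a boundary value falls on.

```python
def classify_premium(premium_pct, spread_pct=None):
    """Map a premium/discount (in percent) to the interpretation bands above."""
    size = abs(premium_pct)
    if size <= 0.05:
        band = "essentially at NAV"
    elif size <= 0.25:
        band = "minor deviation"
    elif size <= 1.0:
        band = "notable"
    elif size <= 3.0:
        band = "significant"
    else:
        band = "large"
    # A deviation smaller than the bid-ask spread is treated as noise.
    within_spread = spread_pct is not None and size < spread_pct
    return {"band": band, "within_spread": within_spread}

print(classify_premium(-0.45, spread_pct=0.02))
# {'band': 'notable', 'within_spread': False}
```

The category notes still apply on top of this: a "notable" 0.45% discount reads very differently for SPY than for a high-yield bond fund.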
Sub-Skill B: Multi-ETF Comparison
Goal: Compare premium/discount across multiple ETFs side by side.
B1: Fetch and rank
import yfinance as yf
import pandas as pd

def compare_etf_premiums(tickers):
    rows = []
    for sym in tickers:
        try:
            t = yf.Ticker(sym)
            info = t.info
            if info.get("quoteType") != "ETF":
                rows.append({"ticker": sym, "error": "Not an ETF"})
                continue
            price = info.get("regularMarketPrice") or info.get("previousClose")
            nav = info.get("navPrice")
            if price and nav and nav > 0:
                prem = (price - nav) / nav * 100
                bid = info.get("bid", 0)
                ask = info.get("ask", 0)
                spread = (ask - bid) / ((ask + bid) / 2) * 100 if bid and ask and bid > 0 else None
                rows.append({
                    "ticker": sym,
                    "name": info.get("shortName", ""),
                    "price": round(price, 2),
                    "nav": round(nav, 2),
                    "premium_pct": round(prem, 4),
                    # Compare against None so a genuine 0.0 spread is not dropped
                    "spread_pct": round(spread, 4) if spread is not None else None,
                    "category": info.get("category", "N/A"),
                    "total_assets": info.get("totalAssets"),
                })
            else:
                rows.append({"ticker": sym, "error": "NAV unavailable"})
        except Exception as e:
            rows.append({"ticker": sym, "error": str(e)})
    df = pd.DataFrame(rows)
    if "premium_pct" in df.columns:
        # Most discounted first; rows that only carry an error sort to the end
        df = df.sort_values("premium_pct", ascending=True)
    return df
B2: Present as a ranked table
Sort by premium/discount (most discounted first). Highlight:
- Which ETFs are at the deepest discount
- Which are at the highest premium
- Whether the premium/discount exceeds the bid-ask spread (if it doesn't, it's market microstructure noise)
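A ranked table with the spread check can be rendered without extra dependencies. The sketch below is illustrative: the helper name `premium_table`, the column headings, and the sample tickers and numbers are all made up, and the input is assumed to be a list of dicts with `ticker`, `premium_pct`, and `spread_pct` keys like the rows built in B1.

```python
def premium_table(rows):
    """Render rows as a markdown table, most discounted first."""
    ranked = sorted(rows, key=lambda r: r["premium_pct"])
    lines = [
        "| Ticker | Premium/Discount (%) | Spread (%) | Exceeds spread? |",
        "|---|---|---|---|",
    ]
    for r in ranked:
        spread = r.get("spread_pct")
        if spread is None:
            spread_text, exceeds = "n/a", "n/a"
        else:
            spread_text = f"{spread:.2f}"
            # A premium inside the spread is microstructure noise.
            exceeds = "yes" if abs(r["premium_pct"]) > spread else "no (noise)"
        lines.append(f"| {r['ticker']} | {r['premium_pct']:+.2f} | {spread_text} | {exceeds} |")
    return "\n".join(lines)

# Made-up numbers for illustration only.
sample = [
    {"ticker": "AAA", "premium_pct": 0.01, "spread_pct": 0.02},
    {"ticker": "BBB", "premium_pct": -0.80, "spread_pct": 0.05},
    {"ticker": "CCC", "premium_pct": 1.20, "spread_pct": None},
]
print(premium_table(sample))
```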
Sub-Skill C: Premium Screener
Goal: Scan a universe of common ETFs to find those with the largest premiums or discounts.
C1: Define the universe and scan
Use this default universe organized by category. The user can supply their own list instead.
DEFAULT_ETF_UNIVERSE = {
    "US Equity": ["SPY", "QQQ", "IVV", "VOO", "VTI", "DIA", "IWM", "ARKK"],
    "Bond": ["AGG", "BND", "TLT", "HYG", "LQD", "VCIT", "VCSH", "BNDX", "EMB", "JNK", "MUB", "TIP"],
    "International": ["EFA", "EEM", "VWO", "IEMG", "KWEB", "FXI", "INDA", "VEA", "EWZ", "EWJ"],
    "Commodity": ["GLD", "SLV", "USO", "UNG", "DBC", "IAU", "PDBC", "GSG"],
    "Crypto": ["IBIT", "BITO", "FBTC", "ETHA", "ARKB", "GBTC"],
    "Leveraged/Inverse": ["TQQQ", "SQQQ", "SPXU", "UPRO", "JNUG", "JDST", "SOXL", "SOXS"],
    "Sector": ["XLF", "XLE", "XLK", "XLV", "XLI", "XLP", "XLU", "XLRE", "XLC", "XLB", "XLY"],
    "Sector - Semis/Tech": ["SOXX", "SMH", "IGV", "XSD"],
    "Sector - Healthcare": ["XBI", "IBB", "IHI"],
    "Thematic": ["ARKW", "ARKG", "HACK", "CLOU", "WCLD", "BUG", "BOTZ", "LIT", "ICLN", "TAN"],
    "Income": ["JEPI", "JEPQ", "SCHD", "VYM", "DVY", "DIVO", "HDV", "QYLD"],
}

import yfinance as yf
import pandas as pd

def screen_etf_premiums(universe=None, min_abs_premium=0.0):
    if universe is None:
        universe = DEFAULT_ETF_UNIVERSE
    all_tickers = []
    for category, tickers in universe.items():
        for sym in tickers:
            all_tickers.append((sym, category))
    rows = []
    for sym, category_label in all_tickers:
        try:
            t = yf.Ticker(sym)
            info = t.info
            price = info.get("regularMarketPrice") or info.get("previousClose")
            nav = info.get("navPrice")
            if price and nav and nav > 0:
                prem = (price - nav) / nav * 100
                if abs(prem) >= min_abs_premium:
                    rows.append({
                        "ticker": sym,
                        "name": info.get("shortName", ""),
                        "category": category_label,
                        "price": round(price, 2),
                        "nav": round(nav, 2),
                        "premium_pct": round(prem, 4),
                        # `or 0` guards against the key being present with a None value,
                        # which would otherwise raise and silently drop the ticker
                        "total_assets_B": round((info.get("totalAssets") or 0) / 1e9, 2),
                        "expense_ratio": info.get("netExpenseRatio"),
                    })
        except Exception:
            # Skip tickers whose data cannot be fetched
            pass
    df = pd.DataFrame(rows)
    if not df.empty:
        df = df.sort_values("premium_pct", ascending=True)
    return df
C2: Present the results
Show a ranked table sorted by premium (most discounted first). Group by category if the list is long. Call out:
- Top 5 deepest discounts — potential buying opportunities (or signs of stress)
- Top 5 highest premiums — overpaying risk
- Category patterns — are all bond ETFs at a discount? Are all crypto ETFs at a premium?
Note: this screener takes time because it fetches data one ticker at a time. For large universes (60+ ETFs), warn the user it may take 1-2 minutes.
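The three call-outs above (deepest discounts, highest premiums, category patterns) can be pulled from the screener rows with a short helper. This is a sketch under assumptions: `summarize_screen` is a hypothetical name, the input is a list of dicts such as `df.to_dict("records")` from the screener output, and the sample rows are invented.

```python
from collections import defaultdict

def summarize_screen(rows, top_n=5):
    """Pick the deepest discounts, highest premiums, and per-category averages."""
    ranked = sorted(rows, key=lambda r: r["premium_pct"])
    discounts = [r["ticker"] for r in ranked if r["premium_pct"] < 0][:top_n]
    premiums = [r["ticker"] for r in reversed(ranked) if r["premium_pct"] > 0][:top_n]
    by_category = defaultdict(list)
    for r in rows:
        by_category[r["category"]].append(r["premium_pct"])
    category_avg = {c: round(sum(v) / len(v), 4) for c, v in by_category.items()}
    return {
        "deepest_discounts": discounts,
        "highest_premiums": premiums,
        "category_avg_premium_pct": category_avg,
    }

# Invented rows for illustration only.
rows = [
    {"ticker": "A", "category": "Bond", "premium_pct": -0.5},
    {"ticker": "B", "category": "Bond", "premium_pct": -0.3},
    {"ticker": "C", "category": "Crypto", "premium_pct": 1.5},
    {"ticker": "D", "category": "Crypto", "premium_pct": 0.5},
]
print(summarize_screen(rows))
```

A category whose average sits clearly on one side of zero suggests a category-wide pattern rather than an issue specific to one fund.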
Sub-Skill D: Premium Deep Dive
Goal: Combine premium/discount data with additional context to help the user understand why the premium exists and whether it's likely to persist.
D1: Gather comprehensive data
Run the Sub-Skill A snapshot, then add:
import yfinance as yf
import numpy as np

def premium_deep_dive(ticker_symbol):
    ticker = yf.Ticker(ticker_symbol)
    info = ticker.info
    price = info.get("regularMarketPrice") or info.get("previousClose")
    nav = info.get("navPrice")
    if not price or not nav or nav <= 0:
        return {"error": "NAV data not available"}
    premium_pct = (price - nav) / nav * 100
    # Historical price data for volatility context
    hist = ticker.history(period="3mo")
    if not hist.empty:
        returns = hist["Close"].pct_change().dropna()
        daily_vol = returns.std()
        annualized_vol = daily_vol * np.sqrt(252)
        avg_volume = hist["Volume"].mean()
        dollar_volume = (hist["Close"] * hist["Volume"]).mean()
        # Price range context
        high_3m = hist["Close"].max()
        low_3m = hist["Close"].min()
        pct_from_high = (price - high_3m) / high_3m * 100
    else:
        daily_vol = annualized_vol = avg_volume = dollar_volume = None
        high_3m = low_3m = pct_from_high = None
    # Compare against None below so that legitimate zeros
    # (for example, a price sitting exactly at the 3-month high) are kept
    result = {
        "ticker": ticker_symbol,
        "name": info.get("longName", ""),
        "price": round(price, 4),
        "nav": round(nav, 4),
        "premium_pct": round(premium_pct, 4),
        "category": info.get("category", "N/A"),
        "fund_family": info.get("fundFamily", "N/A"),
        "total_assets": info.get("totalAssets"),
        "expense_ratio": info.get("netExpenseRatio"),
        "yield_pct": info.get("yield"),
        "ytd_return": info.get("ytdReturn"),
        "beta_3y": info.get("beta3Year"),
        "annualized_vol": round(annualized_vol * 100, 2) if annualized_vol is not None else None,
        "avg_daily_dollar_volume": round(dollar_volume, 0) if dollar_volume is not None else None,
        "pct_from_3m_high": round(pct_from_high, 2) if pct_from_high is not None else None,
    }
    # Bid-ask spread
    bid = info.get("bid")
    ask = info.get("ask")
    if bid and ask and bid > 0:
        spread_pct = (ask - bid) / ((ask + bid) / 2) * 100
        result["bid_ask_spread_pct"] = round(spread_pct, 4)
        result["premium_exceeds_spread"] = abs(premium_pct) > spread_pct
    return result
D2: Explain the why
After gathering data, explain the premium/discount using this diagnostic framework:
Common causes of premiums:
- Demand surge — more buyers than authorized participants can create shares (common for new/hot ETFs like crypto)
- Time-zone mismatch — international ETF trading when underlying markets are closed; price reflects anticipated moves
- Creation mechanism bottleneck — when authorized participants face constraints on creating new shares
- Sentiment premium — retail demand pushes price above fair value during hype cycles
Common causes of discounts:
- Liquidity stress — during sell-offs, bond and credit ETFs often trade at discounts because underlying bonds are harder to price/trade than the ETF itself
- Redemption pressure — heavy outflows but slow authorized participant response
- Stale NAV — the official NAV may not reflect after-hours news or events
- Structural issues — contango in futures-based ETFs (USO, UNG) creates persistent drag
Is the premium likely to persist?
- For liquid US equity ETFs: No — arbitrage corrects deviations within minutes
- For bond ETFs during stress: Discounts can persist for days or weeks
- For crypto ETFs: Premiums tend to narrow as the fund matures and APs become more active
- For international ETFs: Resets daily as underlying markets open
Sub-Skill E: Premium Surge Decomposition (Gamma Squeeze Analysis)
Goal: When an ETF has just experienced a dramatic intraday move that diverges from its underlying holdings, decompose the move into (1) a fundamental NAV-driven component and (2) an "excess premium" driven by structural forces — most commonly options dealer gamma hedging, AP arbitrage breakdowns, or sentiment surges. Then assess how long the premium will likely take to converge.
This sub-skill is appropriate when the user reports or asks about:
- An ETF moving 5%+ in a single session
- A divergence between the ETF and its named underlyings (e.g., "MSTR jumped 13% but BTC only rose 3%")
- A suspected gamma squeeze in an ETF or single name
- Whether dealer hedging is amplifying a move
Read references/gamma_squeeze_reference.md for the full GEX formula derivation, dealer-positioning conventions, and worked examples before running E2.
E1: Decompose today's move into NAV-driven vs excess premium
The static navPrice field gives only the most recent end-of-day NAV — it cannot tell you how much of today's move is NAV-driven. Estimate the NAV return from the holdings' returns instead:
import yfinance as yf
import pandas as pd
import numpy as np

def decompose_etf_move(ticker_symbol, holdings_weights=None, window="2d"):
    """
    Decompose the ETF's most recent daily move into NAV-driven vs excess premium.
    holdings_weights: dict like {"MU": 0.20, "005930.KS": 0.22, "000660.KS": 0.27, ...}
    If None, attempts to fetch via yfinance's funds_data;
    falls back to user-supplied weights for ETFs where it isn't available.
    """
    etf = yf.Ticker(ticker_symbol)
    info = etf.info
    # ETF return over the most recent session
    etf_hist = etf.history(period=window, auto_adjust=False)
    if len(etf_hist) < 2:
        return {"error": "Not enough history"}
    etf_close_today = etf_hist["Close"].iloc[-1]
    etf_close_prev = etf_hist["Close"].iloc[-2]
    etf_return_pct = (etf_close_today / etf_close_prev - 1) * 100
    # Try to auto-fetch holdings if not supplied
    if holdings_weights is None:
        try:
            top_holdings = etf.funds_data.top_holdings  # DataFrame
            holdings_weights = dict(zip(top_holdings.index, top_holdings["Holding Percent"]))
        except Exception:
            holdings_weights = {}
    if not holdings_weights:
        return {
            "error": "Holdings weights unavailable: supply manually via holdings_weights={'TICKER': weight, ...}",
            "etf_return_pct": round(etf_return_pct, 4),
        }
    # Weighted return of underlying holdings (proxy for NAV move)
    weighted_return = 0.0
    coverage = 0.0
    holding_returns = {}
    for sym, w in holdings_weights.items():
        try:
            h = yf.Ticker(sym).history(period=window, auto_adjust=False)
            if len(h) >= 2:
                r = (h["Close"].iloc[-1] / h["Close"].iloc[-2] - 1) * 100
                holding_returns[sym] = round(r, 4)
                weighted_return += w * r
                coverage += w
        except Exception:
            # Skip holdings whose prices cannot be fetched; coverage reflects the gap
            pass
    # Normalize to coverage so partial holdings still give a sensible NAV proxy
    nav_return_proxy = weighted_return / coverage if coverage > 0 else None
    excess_premium_pct = (
        etf_return_pct - nav_return_proxy if nav_return_proxy is not None else None
    )
    return {
        "ticker": ticker_symbol,
        "etf_return_pct": round(etf_return_pct, 4),
        # Compare against None so that a genuine 0.0 return is reported rather than dropped
        "nav_return_proxy_pct": round(nav_return_proxy, 4) if nav_return_proxy is not None else None,
        "excess_premium_pct": round(excess_premium_pct, 4) if excess_premium_pct is not None else None,
        "holdings_coverage_pct": round(coverage * 100, 2),
        "holding_returns": holding_returns,
        "interpretation": (
            "Most of the move is NAV-driven; limited structural component"
            if excess_premium_pct is not None and abs(excess_premium_pct) < 1
            else "Significant excess premium: investigate dealer hedging, AP bottlenecks, or sentiment"
            if excess_premium_pct is not None
            else "Cannot conclude without holdings data"
        ),
    }
Caveat: For international ETFs whose underlyings trade in a closed session (e.g., Asian holdings during US hours), the holdings' US-listed proxies (ADRs) or futures must be used. If neither is available, flag this to the user — the NAV proxy will be stale.
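The coverage normalization in E1 is easiest to follow with numbers. The sketch below repeats the same arithmetic on hypothetical inputs (three invented holdings covering 60% of the fund, and an ETF up 13%); the helper name `decompose` and every figure are illustrative.

```python
def decompose(etf_return_pct, holdings):
    """holdings: {symbol: (weight_fraction, return_pct)}; returns (nav_proxy, excess)."""
    coverage = sum(w for w, _ in holdings.values())
    weighted = sum(w * r for w, r in holdings.values())
    nav_proxy = weighted / coverage  # normalize to the covered weight
    return nav_proxy, etf_return_pct - nav_proxy

# Hypothetical inputs: weights are fractions, returns are in percent.
nav_proxy, excess = decompose(
    13.0, {"H1": (0.30, 4.0), "H2": (0.20, 2.0), "H3": (0.10, 6.0)}
)
print(round(nav_proxy, 2), round(excess, 2))
# 3.67 9.33
```

Here the covered holdings contribute 2.2 percentage points on 60% coverage, so the NAV proxy is about 3.67% and roughly 9.33 points of the move remain unexplained. Low coverage makes the proxy less reliable, which is why `holdings_coverage_pct` is reported alongside it.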
E2: Compute dealer gamma exposure (GEX) from the options chain
GEX quantifies how much hedge buying or selling dealers must do per 1% move in the underlying. A large and growing call-side GEX during a rally, with dealers short those calls, indicates a gamma squeeze in progress.
import numpy as np
import pandas as pd
import yfinance as yf
from datetime import datetime, timezone
from math import log, sqrt, exp, pi

def _norm_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2 * pi)

def _bsm_gamma(S, K, T, r, sigma):
    """Black-Scholes gamma. Returns 0 for degenerate inputs."""
    if S <= 0 or K <= 0 or T <= 0 or sigma <= 0:
        return 0.0
    d1 = (log(S / K) + (r + 0.5 * sigma * sigma) * T) / (sigma * sqrt(T))
    return _norm_pdf(d1) / (S * sigma * sqrt(T))

def compute_gex(ticker_symbol, risk_free_rate=0.045, max_expirations=8):
    """
    Compute gross and net dealer gamma exposure.
    Conventions:
    - Per contract, dollar gamma per 1% move = OI * 100 * gamma * spot * (spot * 0.01)
      = OI * gamma * spot^2 (with multiplier=100)
    - SqueezeMetrics convention (assumes dealers LONG calls, SHORT puts):
      net_gex = call_gamma_$ - put_gamma_$
      Positive net_gex = stabilizing (dealers sell rallies, buy dips)
      Negative net_gex = destabilizing (dealers buy rallies, sell dips → squeeze)
    - "Customer-net-long-everything" convention (dealers SHORT both):
      gross_hedge = call_gamma_$ + put_gamma_$
      This is the maximum hedging pressure assumption.
    """
    t = yf.Ticker(ticker_symbol)
    info = t.info
    spot = info.get("regularMarketPrice") or info.get("previousClose")
    if not spot:
        return {"error": "No spot price"}
    expirations = t.options[:max_expirations]
    if not expirations:
        return {"error": "No options chain available"}
    now = datetime.now(timezone.utc)
    rows = []
    for exp_str in expirations:
        try:
            chain = t.option_chain(exp_str)
        except Exception:
            continue
        exp_date = datetime.strptime(exp_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
        T = max((exp_date - now).total_seconds() / (365.25 * 86400), 1e-6)
        for side, df in [("call", chain.calls), ("put", chain.puts)]:
            for _, row in df.iterrows():
                K = row.get("strike")
                iv = row.get("impliedVolatility")
                oi = row.get("openInterest")
                # Missing values can arrive as NaN, which is truthy, so test for them explicitly
                if pd.isna(K) or pd.isna(iv) or pd.isna(oi) or iv <= 0 or oi <= 0:
                    continue
                gamma = _bsm_gamma(spot, K, T, risk_free_rate, iv)
                # Dollar value per 1% spot move:
                gamma_dollars_per_1pct = oi * gamma * spot * spot
                rows.append({
                    "expiration": exp_str,
                    "side": side,
                    "strike": K,
                    "iv": iv,
                    "oi": oi,
                    "gamma": gamma,
                    "gamma_$_per_1pct": gamma_dollars_per_1pct,
                })
    if not rows:
        return {"error": "No usable contracts"}
    df = pd.DataFrame(rows)
    call_gex = df[df["side"] == "call"]["gamma_$_per_1pct"].sum()
    put_gex = df[df["side"] == "put"]["gamma_$_per_1pct"].sum()
    # Top concentration: which expiration & strike dominate
    top_strikes = (
        df.groupby(["expiration", "strike", "side"])["gamma_$_per_1pct"]
        .sum()
        .sort_values(ascending=False)
        .head(10)
        .reset_index()
    )
    total_call_oi = df[df["side"] == "call"]["oi"].sum()
    total_put_oi = df[df["side"] == "put"]["oi"].sum()
    cp_ratio = total_call_oi / total_put_oi if total_put_oi > 0 else None
    # Median IV of the 20 contracts closest to the money, across all analyzed
    # expirations, as a single representative number
    df["moneyness"] = abs(df["strike"] / spot - 1)
    near_atm = df.sort_values("moneyness").head(20)
    atm_iv_pct = near_atm["iv"].median() * 100 if len(near_atm) else None
    return {
        "ticker": ticker_symbol,
        "spot": spot,
        "call_gex_per_1pct_$": call_gex,
        "put_gex_per_1pct_$": put_gex,
        "net_gex_squeezemetrics_$": call_gex - put_gex,
        "gross_hedge_pressure_$": call_gex + put_gex,
        "total_call_oi": int(total_call_oi),
        "total_put_oi": int(total_put_oi),
        "call_put_oi_ratio": round(cp_ratio, 2) if cp_ratio is not None else None,
        "atm_iv_pct": round(atm_iv_pct, 2) if atm_iv_pct is not None else None,
        "expirations_analyzed": len(expirations),
        "top_concentrations": top_strikes,
    }
Interpret the output:
- **`net_gex_squeezemetrics_$` highly negative** → dealers are short gamma; rallies will be amplified by their hedging buys. Classic gamma-squeeze fuel.
- Concentration on a single near-dated strike (e.g., the article's "June $45 calls") → squeeze is fragile and concentrated. When that strike expires or the spot moves past it, the gamma decays sharply.
- ATM IV well above the recent average (article example: 78 vs typical ~30–40) → market is pricing in continued large moves; option premium decay alone will provide some convergence pressure over days.
- Call/Put OI ratio > 2.5 → call-heavy positioning, consistent with a bullish gamma squeeze setup.
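To see where a single contract's contribution to these figures comes from, the sketch below computes Black-Scholes gamma for one hypothetical option and converts it to dollars per 1% move. All inputs (spot 45, strike 45, 30 days, 78% IV, 10,000 contracts) are made up, and the helper names are illustrative. The finite-difference check is there only to confirm that the gamma formula is the slope of delta.

```python
from math import erf, exp, log, pi, sqrt

def bsm_d1(S, K, T, r, sigma):
    return (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))

def bsm_call_delta(S, K, T, r, sigma):
    """N(d1), using the error function for the normal CDF."""
    return 0.5 * (1.0 + erf(bsm_d1(S, K, T, r, sigma) / sqrt(2.0)))

def bsm_gamma(S, K, T, r, sigma):
    d1 = bsm_d1(S, K, T, r, sigma)
    return exp(-0.5 * d1 * d1) / sqrt(2.0 * pi) / (S * sigma * sqrt(T))

def dollar_gamma_per_1pct(oi, gamma, spot, multiplier=100):
    # Shares of delta change for a 1% move = OI * multiplier * gamma * (0.01 * spot);
    # multiplying by spot expresses that in dollars.
    return oi * multiplier * gamma * (0.01 * spot) * spot

# Hypothetical contract, for illustration only.
S, K, T, r, sigma = 45.0, 45.0, 30 / 365, 0.045, 0.78
g = bsm_gamma(S, K, T, r, sigma)

# Sanity check: gamma should match the slope of delta with respect to spot.
h = 0.01
fd = (bsm_call_delta(S + h, K, T, r, sigma) - bsm_call_delta(S - h, K, T, r, sigma)) / (2 * h)
print(abs(g - fd) < 1e-4, dollar_gamma_per_1pct(10_000, g, S) > 0)
# True True
```

With the default multiplier of 100, the 100 and the 0.01 cancel, which is why the chain loop in E2 can use `oi * gamma * spot * spot` directly.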
E3: Compare structural buying pressure to actual volume
The article's most concrete claim was that ~35% of the day's buying was dealer-driven. Reproduce this comparison:
def estimate_dealer_share_of_volume(ticker_symbol, gex_per_1pct_dollars, etf_return_pct):
    """
    Implied dealer-driven $ buying = |gex_per_1pct| * |etf_return_pct|
    Compare to actual dollar volume.
    """
    t = yf.Ticker(ticker_symbol)
    hist = t.history(period="2d", auto_adjust=False)
    if hist.empty:
        return None
    today = hist.iloc[-1]
    actual_dollar_volume = today["Close"] * today["Volume"]
    implied_dealer_buying = abs(gex_per_1pct_dollars) * abs(etf_return_pct)
    share = implied_dealer_buying / actual_dollar_volume if actual_dollar_volume > 0 else None
    return {
        "actual_dollar_volume_$": round(actual_dollar_volume, 0),
        "implied_dealer_buying_$": round(implied_dealer_buying, 0),
        # Compare against None so a computed share of 0.0 is still reported
        "dealer_share_of_volume_pct": round(share * 100, 2) if share is not None else None,
    }
This is a rough estimate — it assumes every contract's full gamma was hedged in a single direction during the move. Real hedging is incremental, and not all dealers hedge identically. Treat as an upper-bound heuristic, not a precise figure. Always present it alongside the assumptions.
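The arithmetic behind this heuristic is short enough to check by hand. The sketch below uses hypothetical inputs ($10M of hedging per 1% move, a 13% rally, $400M traded); the function name `dealer_share` and all figures are illustrative, not taken from any real session.

```python
def dealer_share(gex_per_1pct_dollars, etf_return_pct, dollar_volume):
    """Upper-bound share of volume attributable to dealer hedging, in percent."""
    implied = abs(gex_per_1pct_dollars) * abs(etf_return_pct)
    return implied, implied / dollar_volume * 100

# Hypothetical inputs for illustration only.
implied, share_pct = dealer_share(10e6, 13.0, 400e6)
print(f"${implied:,.0f} implied, {share_pct:.1f}% of volume")
# $130,000,000 implied, 32.5% of volume
```

Because the estimate scales linearly with both the GEX figure and the size of the move, any error in either input carries straight through to the reported share.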
E4: Assess premium convergence timeline
The article's three-tier convergence framework:
| Time scale | Mechanism | What to check |
|---|---|---|
| Hours | AP creation/redemption arbitrage | Is the underlying market open? Are creation units restricted? Is the spread between bid/ask widening (suggests AP stepping back)? |
| Days | Options expiration / gamma decay | When does the dominant strike's expiration land? Is OI rolling forward or being closed? Is IV starting to compress? |
| Weeks | Net flow normalization | Is the ETF receiving large daily inflows (signals demand outpacing creation capacity)? Is short interest building (potential additional squeeze fuel)? |
from datetime import datetime
import yfinance as yf

def assess_convergence(ticker_symbol, top_concentrations_df):
    """Returns a dict of qualitative convergence signals."""
    t = yf.Ticker(ticker_symbol)
    info = t.info
    # 1. AP arbitrage: market hours of underlying.
    # Caution: the `market` field appears to describe where the ETF itself is listed,
    # not where its holdings trade, so treat this note as a rough first pass.
    market = (info.get("market") or "").lower()
    underlying_session_note = (
        "International: check whether underlying market overlaps US trading hours; "
        "AP arbitrage may be blocked when underlying market is closed"
        if "us_market" not in market
        else "US-listed underlying: AP arbitrage active during US hours"
    )
    # 2. Options expiration: the largest gamma concentration
    # (top_concentrations_df is sorted by dollar gamma, largest first)
    if not top_concentrations_df.empty:
        next_major_exp = top_concentrations_df.iloc[0]["expiration"]
        days_to_exp = (datetime.strptime(next_major_exp, "%Y-%m-%d") - datetime.now()).days
        exp_note = f"Largest gamma concentration expires in {days_to_exp} days ({next_major_exp})"
    else:
        exp_note = "No clear strike concentration"
    # 3. Flow proxy: AUM trajectory (very rough)
    aum = info.get("totalAssets")
    aum_note = f"Total AUM: ${aum/1e9:.2f}B" if aum else "AUM unavailable"
    return {
        "ap_arbitrage": underlying_session_note,
        "options_window": exp_note,
        "flows": aum_note,
    }
E5: Present the decomposition
Format the answer in this order:
- Headline number: today's ETF move, NAV-proxy move, and the excess premium (in pp).
- Decomposition table:

  | Component | Contribution |
  |---|---|
  | NAV-driven (holdings × weights) | +X.X% |
  | Excess premium (residual) | +Y.Y% |
  | Total ETF move | +Z.Z% |

- Dealer hedging quantification:
  - Net GEX (SqueezeMetrics convention)
  - Implied dealer $ buying for the day vs actual $ volume
  - Estimated dealer share of buying pressure
- Risk indicators: ATM IV, call/put OI ratio, top-3 strike/expiration concentrations.
- Convergence outlook: list each of the hours/days/weeks mechanisms with the current state of each.
- Caveats: the GEX estimate assumes uniform dealer positioning; the NAV proxy is stale during overnight sessions; this is not a forecast of future price.
Step 3: Respond to the User
Always include
- The ETF name and ticker
- Market price and NAV with the calculation shown
- Premium/discount percentage clearly labeled
- Context: is this deviation normal for this ETF category?
Always caveat
- NAV data from Yahoo Finance reflects the most recent official NAV (typically end of prior trading day) — it is not real-time
- Market price may have a 15-minute delay depending on the exchange
- Premium/discount can change rapidly during market hours — this is a snapshot, not a live feed
- Small premiums/discounts (< bid-ask spread) are market microstructure noise, not real mispricing
- Never recommend buying or selling based on premium/discount alone — present the data and let the user decide
Formatting
- Use markdown tables for multi-ETF comparisons
- Show the formula:
Premium/Discount = (Market Price - NAV) / NAV x 100
- Use color indicators in text: "trading at a 0.45% discount" or "at a 1.2% premium"
- Round percentages to 2-4 decimal places depending on magnitude
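The formula and the wording convention above can be combined in a few lines. This is a minimal sketch: the function names `premium_discount_pct` and `describe` are illustrative, and the prices are invented.

```python
def premium_discount_pct(market_price, nav):
    """Premium/Discount = (Market Price - NAV) / NAV x 100."""
    if nav <= 0:
        raise ValueError("NAV must be positive")
    return (market_price - nav) / nav * 100

def describe(market_price, nav):
    """Phrase the result the way the formatting guidance suggests."""
    pct = premium_discount_pct(market_price, nav)
    if pct == 0:
        return "trading at NAV"
    label = "premium" if pct > 0 else "discount"
    return f"trading at a {abs(pct):.2f}% {label}"

# Invented prices for illustration only.
print(describe(49.775, 50.0))
# trading at a 0.45% discount
```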
Reference Files
- references/etf_premium_reference.md: Detailed formulas, category-specific benchmarks, common ETF universe list, and background on the creation/redemption mechanism that drives premiums
- references/gamma_squeeze_reference.md: Premium decomposition framework, Black-Scholes gamma + GEX formulas with both SqueezeMetrics and customer-net-long conventions, convergence-timeline framework (hours/days/weeks), gamma-squeeze vs routine-rally diagnostic table, and a worked example. Read this before running Sub-Skill E.
Read the reference files for deeper technical detail on ETF premium/discount mechanics, historical context, and the gamma-squeeze decomposition methodology.