Strategy Memo — Idiosyncratic Volatility and the Cross-Section of Stock Returns, 1990-2024

Author: strategist agent
Date: 2026-05-11
Paper type: Descriptive / measurement / predictive. No causal claim. The unit of analysis is a stock-month return; the right-hand-side variable of interest is lagged realised idiosyncratic volatility. We test whether a stylised fact, the AHXZ (2006) negative idiovol-return spread, is preserved in the 1990-2024 US universe.

Pre-Strategy Report

I read the three required inputs before designing the strategy. Key takeaways feeding the memo are summarised below.

1. Literature review (`quality_reports/literature-review.md`)

Four-bucket frontier map: (i) does the puzzle exist; (ii) why; (iii) does it still exist; (iv) FM methods. Bucket (i) anchors on ang2006cross (foundational, 1963-2000) with replication/measurement critiques in bali2008idiosyncratic, han2011investor, and time-series backdrop in campbell2001have.
Bucket (ii) explanations we cite but do not test: fu2009idiosyncratic (expected vs. realised idiovol), boyer2010expected, bali2011maxing (MAX lottery proxy), huang2010return (return reversal), stambaugh2015arbitrage (arbitrage asymmetry), chen2012does, liu2018absolving, asness2020betting.
Bucket (iii), whether the puzzle still exists, is unsettled. mclean2016does predicts post-publication decay. hou2020replicating finds survival under VW + NYSE breakpoints with smaller magnitude. chu2020idiosyncratic shows sensitivity to the short-sale environment (Reg SHO).
The gap statement is explicit: no transparent decade-by-decade AHXZ replication on 1990-2024 exists; our contribution is descriptive and methodological — an updated stylised fact, not an adjudication of explanations.
Methods bucket (famamacbeth1973, shanken1992estimation, petersen2009estimating, jegadeesh2019empirical) sets the FM specification with Newey-West SEs and month clustering.

2. Data assessment (`quality_reports/data-assessment.md`)

Coverage grade: A. Four RDS files (crsp.dsf 67.6M rows; crsp.msenames 83.8k; comp.funda 404k; crsp.ccmxpf_lnkhist 32.8k). After universe filters the analytical panel should exceed 4M stock-months — orders of magnitude above the inferential threshold.
Filters justified by Explorer: shrcd ∈ {10,11}, exchcd ∈ {1,2,3}, exclude SIC 6000-6999 (financials) and SIC 4900-4999 (utilities), require ≥17 daily observations per (permno, month).
Market return constructible from crsp.dsf as beginning-of-day market-cap-weighted aggregate over the same filtered universe; validates ≥5 decimals against crsp.dsi vwretd.
Explorer-critic recommendation: add curcd == 'USD' to the Compustat funda filter alongside indfmt=='INDL' & datafmt=='STD' & popsrc=='D' & consol=='C'. Adopted in Section 3 below.
Open caveats: delisting returns (dlret lives in msedelist, not dsf), flagged as a documented limitation; Han-Lesmond microstructure noise, addressed in robustness; January seasonality, addressed in robustness; negative book equity, handled by dropping the affected firm-years.

3. Project conventions (`CLAUDE.md` + content-invariants + working-paper-format)

R language, here::here() paths, set.seed(20260519) once at top, R exports bare tabular to output/, no titles in ggplot, double-spaced 12pt working paper.
INV-8 binding: this is a descriptive paper — no causal claims. INV-7: notation consistent across sections. INV-11: numbers in text must match tables.
Strategy phase severity is medium (constructive criticism per quality.md). Missing robustness check costs -5 at this phase.

I have inputs from all three required sources and noted the explorer-critic’s curcd recommendation. No discovery inputs are missing.

Research Question and Hypotheses

We document — descriptively — whether sorting US common stocks by lagged-month idiosyncratic volatility produces the AHXZ (2006) cross-sectional return spread on the 1990-2024 sample. Three pre-specified predictive hypotheses follow.

H1 (main, magnitude preservation). Conditional on standard pricing controls, the next-month average return spread between the low-idiovol quintile (Q1) and the high-idiovol quintile (Q5) is positive: $\mathbb{E}[r_{Q1,t+1} - r_{Q5,t+1} \mid \mathcal{F}_t] > 0$. Equivalently, the FF3 alpha of the long-Q1/short-Q5 portfolio is positive and economically meaningful. This preserves the AHXZ sign convention (low idiovol predicts high returns).

H2 (heterogeneity, sub-period). The Q1-minus-Q5 spread is larger in the post-1999 sub-period (2000-2024) than in the pre-2000 sub-period (1990-1999), consistent with the listings-decline composition shift documented by Doidge-Karolyi-Stulz (and the time-trend trajectory implied by Hou-Loh 2016).

H3 (heterogeneity, size). The Q1-minus-Q5 spread is concentrated in small-cap stocks, defined as below the NYSE 20th percentile of market capitalisation. The spread is statistically and economically smaller — possibly indistinguishable from zero — among NYSE-large stocks.

None of these hypotheses is causal. They are statements about conditional predictive moments of returns given lagged characteristics, in the spirit of Cochrane’s “characteristic-based” cross-sectional pricing.

Sample Construction

Universe. Begin with crsp.dsf joined to crsp.msenames on permno with the validity condition namedt <= date <= coalesce(nameendt, today). Apply:

shrcd %in% c(10, 11) (domestic common stocks)
exchcd %in% c(1, 2, 3) (NYSE, AMEX, Nasdaq)
!(siccd %in% 6000:6999) (exclude financials)
!(siccd %in% 4900:4999) (exclude utilities)
Date range: 1990-01-01 <= date <= 2024-12-31
Stock-month requirement: at least 17 valid daily returns within the calendar month (AHXZ convention; ≈80% of the 21 trading days)
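The filters above can be sketched in base R as follows. This is a minimal illustration on a toy panel, not the project's build script: the helper `make_stock()` and all toy values are hypothetical, and the column names simply mirror the CRSP fields named in the list.

```r
# Hypothetical helper: build a toy daily panel for one stock in January 1990.
make_stock <- function(permno, n_days, shrcd, exchcd, siccd) {
  data.frame(
    permno = permno,
    date   = seq(as.Date("1990-01-02"), by = "day", length.out = n_days),
    ret    = 0.001,
    shrcd  = shrcd, exchcd = exchcd, siccd = siccd
  )
}

dsf <- rbind(
  make_stock(1, 18, 10, 1, 3571),  # passes every filter
  make_stock(2, 10, 11, 3, 2834),  # too few daily returns in the month
  make_stock(3, 18, 10, 1, 6021),  # financial, excluded by SIC
  make_stock(4, 18, 12, 2, 3571)   # shrcd 12, not in {10, 11}
)

# Row-level universe filters, written exactly as in the list above.
keep <- with(dsf,
  shrcd %in% c(10, 11) &
  exchcd %in% c(1, 2, 3) &
  !(siccd %in% 6000:6999) &
  !(siccd %in% 4900:4999) &
  date >= as.Date("1990-01-01") & date <= as.Date("2024-12-31")
)
universe <- dsf[keep, ]

# Stock-month requirement: at least 17 non-missing daily returns.
universe$ym <- format(universe$date, "%Y-%m")
universe$n_valid <- ave(as.numeric(!is.na(universe$ret)),
                        universe$permno, universe$ym, FUN = sum)
panel <- universe[universe$n_valid >= 17, ]

unique(panel$permno)  # only permno 1 survives
```

On the real 67.6M-row file a data.table implementation would be the natural choice; the logic is the same.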

Compustat link. Join comp.funda to the CRSP panel via crsp.ccmxpf_lnkhist. Compustat filters: indfmt == 'INDL' & datafmt == 'STD' & popsrc == 'D' & consol == 'C' & curcd == 'USD' (the curcd clause is per the explorer-critic recommendation; prevents non-USD filers from contaminating book equity construction). Link filters: linktype %in% c('LU', 'LC'), linkprim %in% c('P', 'C'), with linkdt <= datadate <= coalesce(linkenddt, today).

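Book equity (Fama-French convention). $BE = SE + txditc - PS$, where $SE = \text{coalesce}(seq,\ ceq + pstk,\ at - lt)$ is stockholders' equity and $PS = \text{coalesce}(pstkrv, pstkl, pstk, 0)$. Stockholders' equity rather than $ceq$ is the starting point because $ceq$ is already net of preferred stock, so subtracting $PS$ from it would deduct preferred stock twice. Drop firm-years with $BE \le 0$ for BE/ME-based controls (Fama-French 1992 convention). For sorts and controls, book equity for the fiscal year ending in calendar year $y$ is matched to monthly returns from July of year $y+1$ through June of year $y+2$ (Fama-French 1992 timing, a minimum six-month lag).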
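A minimal base-R sketch of two mechanical pieces of this step: the preferred-stock fallback chain and the match from return month to fiscal year. `coalesce_num()` and `be_fiscal_year()` are hypothetical helper names, and the toy values are made up for illustration.

```r
# Hypothetical helper: first non-missing value across its arguments, elementwise.
coalesce_num <- function(...) {
  Reduce(function(a, b) ifelse(is.na(a), b, a), list(...))
}

funda <- data.frame(
  pstkrv = c(NA, 5, NA),
  pstkl  = c(2, 1, NA),
  pstk   = c(3, 1, NA)
)
# PS = coalesce(pstkrv, pstkl, pstk, 0)
funda$ps <- coalesce_num(funda$pstkrv, funda$pstkl, funda$pstk, 0)
funda$ps  # 2 5 0

# Hypothetical helper: which fiscal year's book equity applies to a return month.
# Returns from July of y+1 through June of y+2 use the fiscal year ending in y.
be_fiscal_year <- function(ret_date) {
  yr <- as.integer(format(ret_date, "%Y"))
  mo <- as.integer(format(ret_date, "%m"))
  ifelse(mo >= 7L, yr - 1L, yr - 2L)
}

be_fiscal_year(as.Date(c("2001-07-31", "2002-06-28", "2002-07-31")))
# 2000 2000 2001
```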

Market equity. $ME_{i,t} = |prc_{i,t}| \cdot shrout_{i,t}$, which is in thousands of dollars because CRSP reports shrout in thousands of shares; divide by 1,000 to express it in millions of dollars.

Self-constructed market return. From the filtered dsf panel, for each trading day $d$: \[r_{M,d} = \frac{\sum_i ME_{i,d-1} \cdot r_{i,d}}{\sum_i ME_{i,d-1}}.\] Using the previous day's market cap as the weight avoids look-ahead. Validate against crsp.dsi vwretd if available. Risk-free rate: Ken French daily $r_f$ file.
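A minimal base-R sketch of the lagged-cap-weighted market return on a two-stock, three-day toy panel. The integer `day` index and all values are assumptions for illustration; the real panel would be keyed on `date`.

```r
panel <- data.frame(
  permno = rep(c(1, 2), each = 3),
  day    = rep(1:3, times = 2),
  prc    = c(10, 11, -12, 20, 20, 19),    # CRSP flags bid/ask midpoints with a negative sign
  shrout = c(100, 100, 100, 50, 50, 50),  # thousands of shares
  ret    = c(NA, 0.10, 0.09, NA, 0.00, -0.05)
)

# Market equity in millions of dollars.
panel$me <- abs(panel$prc) * panel$shrout / 1000

# Previous day's market equity within each permno (NA on the first day).
panel <- panel[order(panel$permno, panel$day), ]
panel$me_lag <- ave(panel$me, panel$permno,
                    FUN = function(x) c(NA, head(x, -1)))

# Value-weighted return per day, using only rows with both inputs present.
ok  <- !is.na(panel$ret) & !is.na(panel$me_lag)
num <- tapply(panel$me_lag[ok] * panel$ret[ok], panel$day[ok], sum)
den <- tapply(panel$me_lag[ok], panel$day[ok], sum)
r_mkt <- num / den
r_mkt  # day 2: 0.05; day 3: 0.049 / 2.1
```

Day 1 drops out because no lagged weight exists, which is the price of avoiding look-ahead.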

Idiovol Construction (Main Specification)

For each (permno, month) pair with $\ge 17$ valid daily returns:

Compute daily excess return $r^e_{i,d} = r_{i,d} - r_{f,d}$.
Compute daily market excess return $r^e_{M,d} = r_{M,d} - r_{f,d}$.
Run the within-month CAPM time-series regression \[r^e_{i,d} = \alpha_{i,m} + \beta_{i,m} r^e_{M,d} + \varepsilon_{i,d}, \qquad d \in \text{month } m.\]
Store $\widehat{\text{IVOL}}_{i,m} = \text{sd}(\widehat{\varepsilon}_{i,d})$ as the monthly idiovol (raw daily residual standard deviation).
Optional secondary measure: monthly-scaled equivalent $\widehat{\text{IVOL}}^{(21)}_{i,m} = \widehat{\text{IVOL}}_{i,m} \cdot \sqrt{21}$ (21 trading days per month; this rescales to a monthly horizon and is not annualised). Primary reporting uses the raw daily-std measure.

Output object: one $\widehat{\text{IVOL}}_{i,m}$ value per (permno, month).
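The construction steps can be sketched in base R as below. This is a minimal illustration on simulated data, assuming excess returns are already computed; `ivol_one()` is a hypothetical helper, and the planned production code uses data.table with fixest rather than a `split()` loop.

```r
set.seed(20260519)
n_days <- 21
mkt_ex <- rnorm(n_days, 0, 0.01)  # simulated daily market excess return

daily <- rbind(
  data.frame(permno = 1, ym = "1990-01", mkt_ex = mkt_ex,
             ret_ex = 0.8 * mkt_ex + rnorm(n_days, 0, 0.005)),
  data.frame(permno = 2, ym = "1990-01", mkt_ex = mkt_ex,
             ret_ex = 1.2 * mkt_ex + rnorm(n_days, 0, 0.030))
)

# Hypothetical helper: within-month CAPM regression, then sd of the residuals.
# Note sd() divides by n - 1, not by the regression degrees of freedom n - 2.
ivol_one <- function(df, min_obs = 17) {
  df <- df[complete.cases(df[, c("ret_ex", "mkt_ex")]), ]
  if (nrow(df) < min_obs) return(NA_real_)
  fit <- lm(ret_ex ~ mkt_ex, data = df)
  sd(residuals(fit))
}

groups <- split(daily, list(daily$permno, daily$ym), drop = TRUE)
ivol <- data.frame(
  permno = sapply(groups, function(g) g$permno[1]),
  ym     = sapply(groups, function(g) g$ym[1]),
  ivol   = sapply(groups, ivol_one),
  row.names = NULL
)
ivol$ivol_21 <- ivol$ivol * sqrt(21)  # optional monthly-scaled measure
ivol
```

The result has one row per (permno, month); a month with fewer than 17 valid days yields `NA` and would be dropped before sorting.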

Predictive lag. $\widehat{\text{IVOL}}_{i,m}$ predicts return $r_{i,m+1}$. We never use contemporaneous idiovol as a regressor.

Identification Frame

This is a predictive, not causal, exercise. We make four explicit non-claims.

We do not claim that idiovol causes low future returns. An observationally equivalent reading is that anticipated low future returns raise current idiovol; both readings imply the same conditional moment, and a predictive regression cannot separate them.
We do not test any of the bucket-(ii) mechanisms (lottery preferences, arbitrage asymmetry, return reversal, mispricing). Those require either an instrument or an experiment we do not have.
We do not interpret the Fama-MacBeth slope on idiovol as a “price of risk.” Following jegadeesh2019empirical, individual-stock FM coefficients are subject to errors-in-variables; we report the slope as a conditional predictive coefficient, not as a structural risk premium.
We do not claim external validity beyond US common stocks 1990-2024.

What we DO test. Whether the AHXZ research design (quintile portfolios, monthly rebalancing, EW and VW returns, CAPM/FF3 alphas, FM cross-sectional slopes), applied to the 1990-2024 sample with the design choices fixed in this memo, yields a positive Q1-minus-Q5 spread. The hypothesis is operationally a sign test: in the 35-year sample, the time-series mean of the Q1-minus-Q5 portfolio return exceeds zero at conventional significance.

Main Specifications

Spec 1 — Quintile portfolio sorts

At the end of each month $m$:

Compute $\widehat{\text{IVOL}}_{i,m}$ as above using daily data from month $m$.
Compute NYSE breakpoints: the 20th, 40th, 60th, 80th percentiles of $\widehat{\text{IVOL}}_{i,m}$ over the subset of permnos with $exchcd = 1$.
Assign every stock in the analytical universe (NYSE + AMEX + Nasdaq) to quintile Q1…Q5 according to those NYSE breakpoints.
Form Q1…Q5 portfolios; hold for month $m+1$.
Report:
- Equal-weighted monthly excess returns of Q1, Q2, Q3, Q4, Q5, and the long-Q1/short-Q5 spread.
- Value-weighted monthly excess returns of Q1…Q5 and the long-Q1/short-Q5 spread (VW preferred per AHXZ and hou2020replicating).
Tabulate average excess return, standard deviation, Sharpe ratio, and Newey-West $t$-statistic (12 lags) for each portfolio and the Q1-minus-Q5 spread.
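A minimal base-R sketch of the sort for a single month, assuming a toy cross-section of five NYSE and three Nasdaq stocks. `assign_quintile()` is a hypothetical helper and the numbers are invented; breakpoints come from NYSE stocks only and are then applied to every stock.

```r
# Hypothetical helper: quintile 1..5 from NYSE-only 20/40/60/80 breakpoints.
assign_quintile <- function(ivol, exchcd) {
  brk <- quantile(ivol[exchcd == 1], probs = c(0.2, 0.4, 0.6, 0.8),
                  na.rm = TRUE, names = FALSE)
  # left.open = TRUE puts a value equal to a breakpoint in the lower quintile.
  findInterval(ivol, brk, left.open = TRUE) + 1L
}

xs <- data.frame(
  exchcd   = c(1, 1, 1, 1, 1, 3, 3, 3),
  ivol     = c(0.01, 0.02, 0.03, 0.04, 0.05, 0.005, 0.06, 0.10),
  me       = c(500, 400, 300, 200, 100, 50, 20, 10),  # end of month m
  ret_next = c(0.012, 0.010, 0.009, 0.007, 0.004, 0.011, -0.002, -0.010)
)
xs$q <- assign_quintile(xs$ivol, xs$exchcd)
xs$q  # 1 2 3 4 5 1 5 5

# Month m+1 portfolio returns: equal- and value-weighted.
ew <- tapply(xs$ret_next, xs$q, mean)
vw <- sapply(split(xs, xs$q), function(g) weighted.mean(g$ret_next, g$me))

# Long Q1, short Q5.
spread_ew <- unname(ew["1"] - ew["5"])
spread_vw <- unname(vw["1"] - vw["5"])
c(EW = spread_ew, VW = spread_vw)
```

Both high-IVOL Nasdaq stocks land in Q5 even though they lie beyond the NYSE range, which is why the NYSE-breakpoint quintiles are not equally populated.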

Spec 2 — Factor-model alphas

Run the time-series regressions on the long-Q1/short-Q5 monthly portfolio return $r^{LS}_{m+1}$:

\[r^{LS}_{m+1} = \alpha^{CAPM} + \beta^{MKT} \, \text{MKT}_{m+1} + u_{m+1},\]

\[r^{LS}_{m+1} = \alpha^{FF3} + \beta^{MKT} \, \text{MKT}_{m+1} + \beta^{SMB} \, \text{SMB}_{m+1} + \beta^{HML} \, \text{HML}_{m+1} + u_{m+1}.\]

Report $\widehat{\alpha}$ in monthly percent, Newey-West $t$-statistic with 12 lags, adjusted $R^2$, and the number of monthly observations.
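A minimal sketch of the alpha regressions with Newey-West standard errors, written in base R on simulated factor data. The memo names sandwich::NeweyWest for production; the hand-rolled `nw_ols()` below is a hypothetical stand-in so the sketch runs without packages. It uses Bartlett weights with 12 lags and applies no prewhitening or small-sample adjustment, so its numbers need not match a packaged estimator's defaults.

```r
# Hypothetical helper: OLS with an intercept and Newey-West (Bartlett) SEs.
nw_ols <- function(y, X, lags = 12) {
  X <- cbind(alpha = 1, X)
  n <- nrow(X)
  XtX_inv <- solve(crossprod(X))
  b <- XtX_inv %*% crossprod(X, y)
  u <- as.vector(y - X %*% b)
  g <- X * u                       # score contributions, one row per month
  S <- crossprod(g)                # lag-0 term
  for (l in seq_len(lags)) {
    w <- 1 - l / (lags + 1)        # Bartlett kernel weight
    G <- crossprod(g[(l + 1):n, , drop = FALSE], g[1:(n - l), , drop = FALSE])
    S <- S + w * (G + t(G))
  }
  V  <- XtX_inv %*% S %*% XtX_inv  # sandwich covariance
  se <- sqrt(diag(V))
  data.frame(term = colnames(X), estimate = as.vector(b),
             nw_se = as.vector(se), t_stat = as.vector(b) / as.vector(se),
             row.names = NULL)
}

# Simulated monthly factors and long-short return (values are made up).
set.seed(20260519)
n <- 420
factors <- cbind(MKT = rnorm(n, 0.006, 0.045),
                 SMB = rnorm(n, 0.001, 0.030),
                 HML = rnorm(n, 0.002, 0.030))
r_ls <- 0.004 - 0.5 * factors[, "MKT"] - 0.6 * factors[, "SMB"] +
  rnorm(n, 0, 0.03)

capm <- nw_ols(r_ls, factors[, "MKT", drop = FALSE])
ff3  <- nw_ols(r_ls, factors)
ff3  # first row is the FF3 alpha with its Newey-West t-statistic
```

Multiplying the `alpha` estimate by 100 gives the monthly-percent figure the tables report.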

Spec 3 — Fama-MacBeth cross-sectional regressions

For each month $t$, run the cross-sectional regression

\[r_{i,t} = \lambda_{0,t} + \lambda_{1,t} \widehat{\text{IVOL}}_{i,t-1} + \lambda_{2,t} \log(\text{ME}_{i,t-1}) + \lambda_{3,t} \log(\text{BE/ME}_{i,t-1}) + \lambda_{4,t} \text{MOM}_{i,t-12, t-2} + \lambda_{5,t} \text{STR}_{i,t-1} + e_{i,t},\]

where MOM is the cumulative return from $t-12$ to $t-2$ (skipping $t-1$, Jegadeesh-Titman convention) and STR is the prior-month return (short-term reversal, huang2010return). Report time-series means $\bar{\lambda}_k$ and Newey-West-adjusted $t$-statistics with 12 lags. Cluster observations within months for the cross-sectional fit per petersen2009estimating. Report Shanken-adjusted SEs alongside the standard FM SEs as a secondary diagnostic.
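The two-pass procedure can be sketched in base R as follows. This is a minimal illustration on simulated data with only two regressors (lagged IVOL and log size); the remaining controls would enter the same formula. `nw_se_mean()` is a hypothetical helper for the Newey-West standard error of a time-series mean, and all simulated coefficients are invented.

```r
set.seed(20260519)
n_months <- 60
n_stocks <- 200

# Simulated stock-month panel with regressors already lagged one month.
panel <- do.call(rbind, lapply(seq_len(n_months), function(t) {
  ivol_lag <- runif(n_stocks, 0.005, 0.06)
  log_me   <- rnorm(n_stocks, 6, 1.5)
  data.frame(month = t, ivol_lag = ivol_lag, log_me = log_me,
             ret = 0.01 - 0.10 * ivol_lag - 0.001 * log_me +
               rnorm(n_stocks, 0, 0.08))
}))

# Pass 1: one cross-sectional OLS per month; rows of `lambdas` are months.
lambdas <- t(sapply(split(panel, panel$month), function(df) {
  coef(lm(ret ~ ivol_lag + log_me, data = df))
}))

# Hypothetical helper: Newey-West SE of a mean with Bartlett weights.
nw_se_mean <- function(x, lags = 12) {
  n <- length(x)
  e <- x - mean(x)
  v <- sum(e^2)
  for (l in seq_len(lags)) {
    v <- v + 2 * (1 - l / (lags + 1)) * sum(e[(l + 1):n] * e[1:(n - l)])
  }
  sqrt(v) / n
}

# Pass 2: time-series mean of each slope and its Newey-West t-statistic.
fm <- data.frame(
  term     = colnames(lambdas),
  mean     = colMeans(lambdas),
  nw_se    = apply(lambdas, 2, nw_se_mean),
  row.names = NULL
)
fm$t_stat <- fm$mean / fm$nw_se
fm
```

Under H1 the row for `ivol_lag` should show a negative mean slope.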

Robustness Map

The descriptive paper’s signal is the stability of the spread under design perturbations. Pre-committed robustness suite:

ID	Perturbation	Reference	What we check
R1	Replace CAPM-residual IVOL with FF3-residual IVOL	`fama1993common`, `ang2006cross`	Robustness to factor benchmark in idiovol construction (the Fu 2009 measurement debate)
R2	Split sample at January 2000: pre-2000 (1990-1999) vs. post-1999 (2000-2024)	`hou2016have`, `mclean2016does`	Whether the spread has grown or decayed in the more recent sub-period (H2 heterogeneity test)
R3	Small-cap (below the NYSE 20th size percentile) vs. large-cap (at or above it) subsamples	`bali2008idiosyncratic`	Whether the spread concentrates in small stocks (H3 test)
R4	Conditional double sorts: 5x5 size $\times$ IVOL and 5x5 BE/ME $\times$ IVOL	AHXZ Table VI	Whether the IVOL spread survives within each size/value bin
R5	Replace IVOL with MAX = mean of five highest daily returns in prior month	`bali2011maxing`	Whether the lottery proxy subsumes IVOL; report both in horse-race FM
R6	Drop stock-months with $prc_{i,m-1} < \$5$	`han2011investor`	Microstructure-noise robustness
R7	Drop January observations from portfolio time series	`han2011investor`	January-seasonality robustness
R8	Drop 2008-01 through 2009-12 (financial-crisis window)	n/a	Crisis-window robustness — verifies the spread is not driven by 2008-09
R9	Report CAPM, FF3, FF4 (Carhart) and FF5 (Fama-French 2015) alphas where the factor file is available; EW vs. VW	`fama1993common`, `carhart1997persistence`, `hou2020replicating`	Alpha is not an artefact of factor-model choice

That is nine robustness checks against the user-required threshold of seven; the redundancy is deliberate because the strategist-critic deducts 5 points per missing robustness check at strategy-phase severity.
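As an illustration of how R5 could be computed, here is a minimal base-R sketch of the MAX variable as defined in the table (mean of the five highest daily returns in a month). `max5()` is a hypothetical helper and the return vector is invented.

```r
# Hypothetical helper: mean of the k highest daily returns in one stock-month.
max5 <- function(ret, k = 5) {
  top <- head(sort(ret, decreasing = TRUE), k)  # sort() drops NA by default
  mean(top)
}

r <- c(0.01, 0.05, -0.02, 0.03, 0.00, 0.04, 0.02, NA)
max5(r)  # mean of 0.05, 0.04, 0.03, 0.02, 0.01
```

Applied per (permno, month), this yields a characteristic that can replace or sit alongside lagged IVOL in the Fama-MacBeth regression.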

Falsification Predictions

Pre-committed falsifiers — what we would have to observe to reject each hypothesis.

Falsifies H1. Neither the 35-year time-series mean of the VW long-Q1/short-Q5 portfolio return nor its FF3 alpha is significantly positive (both $t < 1.96$). In the FM regression the corresponding falsifier is a non-negative slope on lagged IVOL: because IVOL enters in levels, H1 (low IVOL predicts high returns) implies $\lambda_1 < 0$ in Spec 3.
Falsifies H2. The Q1-Q5 spread in the post-1999 sub-period (2000-2024) is smaller than, or within 1 SE of, the pre-2000 (1990-1999) spread. Either outcome counts as falsifying.
Falsifies H3. The Q1-Q5 spread among NYSE-large stocks (size at or above the NYSE 20th percentile) is as large as or larger than in the small-cap subsample. Either outcome counts as falsifying.

We will not p-hack our way to non-falsification. The robustness map is fixed before estimation.

Threats and Mitigations

Threat	Source	Mitigation in our design
T1: Data snooping / multiple testing. Nine robustness checks $\times$ three hypotheses generate many tests; some will be “significant” by chance.	Lo-MacKinlay (1990)	Pre-commit to the robustness list. Headline H1-H3 are the only confirmatory tests; robustness is supportive, not confirmatory. We do not report joint $p$-values; we report consistency of sign across robustness checks.
T2: Idiovol mismeasurement. Residual std with ≈21 observations is noisy; sampling error attenuates the cross-sectional slope.	`jegadeesh2019empirical`, `han2011investor`	(i) Use NYSE breakpoints to make portfolio assignment robust to within-quintile measurement noise; (ii) report Shanken-corrected SEs in the FM; (iii) the ≥17-obs filter reduces variance of the daily-residual std.
T3: Microstructure noise. Bid-ask bounce inflates IVOL for low-priced/illiquid stocks.	`han2011investor`	Robustness R6 (drop $prc < \$5$) and R3 (large-cap subsample) directly address this.
T4: Fu (2009) expected-vs-realised critique. Conditioning on lagged realised IVOL may capture mean reversion in volatility, not a pricing relationship.	`fu2009idiosyncratic`, with `guo2014` corrective	We explicitly adopt the AHXZ realised-IVOL convention and state that choice openly. We do not claim to estimate the price of expected idiovol. Robustness R1 (FF3-residual IVOL) speaks to the same residual definition Fu critiques.
T5: Transaction-cost realisation of the spread. A Q1-Q5 spread of 100 bps/month may not survive bid-ask costs for high-IVOL stocks.	`novymarx2016taxes`	We report gross returns. Net-of-cost feasibility is a separate research question; we flag this as a limitation, not a result we claim.

Pre-Registration Note

Even though this paper is descriptive, we pre-commit to the following design choices before running estimation. Deviations would be disclosed and explained in any revision.

Quintile breakpoints. NYSE breakpoints (NYSE-only sample, $exchcd = 1$), 20/40/60/80 percentiles, recomputed monthly.
Long-short construction. Long Q1, short Q5 (low-minus-high). This is the AHXZ sign convention — positive expected spread under H1.
Portfolio weighting. Equal-weighted and value-weighted, both reported. VW is the primary specification (per hou2020replicating).
Standard errors. Newey-West with 12 monthly lags for all time-series tests. Petersen-style month clustering plus Newey-West-adjusted FM-coefficient SEs for the Fama-MacBeth regressions. Shanken-adjusted SEs reported as secondary.
Sample period. 1990-01-01 through 2024-12-31. No in-sample-optimisation of start or end dates.
Hypothesis sign. H1 predicts $\bar r_{Q1} - \bar r_{Q5} > 0$ (equivalently, the FM slope on $\widehat{\text{IVOL}}_{i,t-1}$ is negative).
Lookback convention. IVOL in month $m$ predicts return in month $m+1$. No skip-a-month variant in the main specification.

Critic-Anticipated Weaknesses

The strategist-critic will flag at least the following. We list each with a defensible response.

“Why CAPM-residual IVOL as the main spec, not FF3-residual?” AHXZ (2006) define their headline idiovol relative to FF3, so the CAPM-residual main spec is a deliberate departure from their main-text measure rather than a copy of it. We use the CAPM residual as the benchmark because the FF3 residual conflates exposures to SMB and HML (which are themselves volatile), and we restore the FF3-residual measure in R1. Response: this is a convention choice, not an optimisation; we report both.
“You don’t have delisting returns (dlret).” Confirmed limitation per the data assessment. Affects $\le 5\%$ of stock-months, and the bias is well-known to be small for monthly-frequency portfolio-level results (Shumway 1997). Response: documented in the limitations sub-section; quantified by reporting the delisting-flagged fraction of each quintile in summary stats.
“H2 (post-1999 larger) contradicts McLean-Pontiff post-publication decay.” Yes — the two hypotheses are competing. AHXZ was published in 2006, so MP-style decay would predict a smaller post-publication spread. Our H2 instead conditions on listings-composition shifts (doidge2017-style stylised fact: post-1999 the public-firm population skews toward smaller, riskier firms). Response: H2 is a stated horserace between two interpretations; the data adjudicates.
“Individual-stock FM ignores the errors-in-variables problem that the Shanken correction addresses.” This is the jegadeesh2019empirical critique. Response: we report both Shanken-adjusted and unadjusted SEs and interpret FM slopes as conditional predictive coefficients, not structural risk premia.
“Why no MAX-orthogonalised IVOL in main spec?” The MAX-vs-IVOL horserace is the central question of bali2011maxing and is one paper’s worth of work to do properly. We report MAX as R5 (a robustness, not a confirmatory test) and explicitly disclaim adjudication of the lottery-preference mechanism.

Estimation Approach Summary

Headline portfolios: quintile sorts with NYSE breakpoints (Spec 1).
Headline alphas: CAPM and FF3 (Spec 2) on the long-Q1/short-Q5 portfolio. Newey-West SEs (12 lags).
Headline cross-section: Fama-MacBeth with month clustering and Newey-West-adjusted SEs (Spec 3); lagged IVOL plus four controls (size, BE/ME, MOM, STR).
Software: R, data.table + fixest::feols for the within-month CAPM regressions (vectorised), sandwich::NeweyWest for SEs.
Output organisation: tables (.tex bare-tabular) and figures (.pdf, .png) to output/. No table/figure caption in the .tex output; LaTeX wraps with threeparttable per INV-13. Notation across paper, tables, and talk uses the symbols defined in this memo per INV-7 and INV-20.

Deliverables

quality_reports/strategy-memo.md (this file)

Companion artifacts to be produced by paired creators in subsequent dispatches:

scripts/build_factors.R and scripts/analysis.R (coder)
output/table_*.tex and output/figure_*.pdf (coder)
paper/main.tex, paper/references.bib (writer)