A couple of days ago, Rawley Heimer from Boston College visited our discipline and gave a seminar on research methods and journal publication insights for us PhD students. He’s an interesting guy, and I was very impressed by a method he introduced to us: the **Specification Curve**.

## Motivation

More often than not, empirical researchers need to argue that their chosen model specification reigns. If they can't, they need to run a battery of tests on alternative specifications and report them. The problem is that a paper can fit at most a few tables, each with a few models, so it’s extremely hard for readers to know whether the reported results are cherry-picked.

So, *why not run all possible model specifications and find a concise way to report them all?*

## The Specification Curve

The specification curve, proposed by Simonsohn, Simmons and Nelson (2015), is a direct answer to this question.

Simonsohn, Uri and Simmons, Joseph P. and Nelson, Leif D., Specification Curve: Descriptive and Inferential Statistics on All Reasonable Specifications (October 29, 2019). Available at SSRN: https://ssrn.com/abstract=2694998 or http://dx.doi.org/10.2139/ssrn.2694998

To intuitively explain this concept, below is a demo plot I created.

The plot is made up of two parts. The upper panel shows the coefficient estimates of the variable of interest in ascending order, together with their confidence intervals. The lower panel indicates the model specification using colored dots. Both panels share the same x-axis of model number.
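To make the two-panel layout concrete, here is a minimal sketch of the data behind each panel. The `Result` type, the `build_panels` helper, the demo numbers and the 1.96 normal-approximation interval are all my own assumptions for illustration, not the code behind the plot.

```python
from typing import NamedTuple


class Result(NamedTuple):
    coef: float          # point estimate of the variable of interest
    se: float            # its standard error
    choices: frozenset   # labels of the specification choices used


def build_panels(results, z=1.96):
    """Return the data for both panels, sharing one model-number x-axis."""
    ordered = sorted(results, key=lambda r: r.coef)  # ascending estimates
    upper = []  # (model number, estimate, CI lower bound, CI upper bound)
    lower = []  # (model number, choice label): one dot per active choice
    for x, r in enumerate(ordered, start=1):
        upper.append((x, r.coef, r.coef - z * r.se, r.coef + z * r.se))
        for label in sorted(r.choices):
            lower.append((x, label))
    return upper, lower


# Made-up demo results, purely illustrative.
demo = [
    Result(0.30, 0.05, frozenset({"controls", "firm FE"})),
    Result(0.10, 0.04, frozenset({"firm FE"})),
    Result(0.20, 0.06, frozenset({"controls"})),
]
upper, lower = build_panels(demo)
```

Each tuple in `upper` becomes a point with an error bar, and each tuple in `lower` becomes a dot at the same x position, which is what keeps the two panels aligned.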

This simple demo reports only 7 specifications and is not meant to exhaust all possible combinations of the specification choices as indicated by the y-axis of the lower panel. Theoretically, there should be:

- 2 choices of control variables (included or not)
- 5 choices of fixed effects (if mutually exclusive and much more if not)
- 3 choices of standard error clustering (assuming mutually exclusive)
- 3 other mutually exclusive optional choices

Altogether, there are at least $2\times5\times3\times3=90$ possible specifications. We should estimate all of these models and add the corresponding data points to the specification curve plot.
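As a quick sanity check on that count, the full grid can be enumerated with `itertools.product`. The labels below are placeholders I made up; only the number of options per dimension comes from the list above.

```python
from itertools import product

# Placeholder labels; only the counts (2, 5, 3, 3) come from the text.
controls = ["no controls", "controls"]
fixed_effects = [f"FE choice {i}" for i in range(1, 6)]
clustering = [f"cluster choice {i}" for i in range(1, 4)]
other = [f"option {i}" for i in range(1, 4)]

# One tuple per specification, assuming the choices within each
# dimension are mutually exclusive.
specs = list(product(controls, fixed_effects, clustering, other))
print(len(specs))  # 90
```

Each tuple in `specs` can then be turned into one regression command.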

Beyond reporting all estimates from hundreds or thousands of models, the more appealing feature of the specification curve is that it lets us identify the most impactful choices in specifying the model.

As the models are sorted by the coefficient estimates, the distribution of dots in the lower panel can reveal whether certain specification choices drive the results.

An example is provided below.

This plot is partially taken from a working project of mine. There are some insights we can draw from the curve:

- When the sample period is shorter and more balanced around the event year, the estimated treatment effect tends to be lower.
- When the sample is unrestricted or restricted based on condition 2, the estimated treatment effect tends to be lower.
- Whether or not control variables are included seems to have no impact.
- Fixed effect choices 1 to 3 are included in all specifications.
- When fixed effect choice 6 is included, the estimated treatment effect tends to be lower.
- The choice of standard error clustering method seems to have no impact.

The most impactful factor appears to be fixed effect choice 6, which calls for more thought and discussion.
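Reading the dots by eye works for a handful of choices, but the same idea can be roughly quantified. The sketch below is my own illustration, not part of the original analysis: it ranks the models by estimate and reports the mean rank of the models that use each choice. The helper name and the demo numbers are made up.

```python
from collections import defaultdict


def mean_rank_by_choice(results):
    """results: list of (coef, set of choice labels). Returns {label: mean rank}."""
    ordered = sorted(results, key=lambda r: r[0])  # rank 1 = smallest estimate
    ranks = defaultdict(list)
    for rank, (_, choices) in enumerate(ordered, start=1):
        for label in choices:
            ranks[label].append(rank)
    return {label: sum(v) / len(v) for label, v in ranks.items()}


# Made-up estimates, purely illustrative.
demo = [
    (0.05, {"FE 6", "controls"}),
    (0.08, {"FE 6"}),
    (0.21, {"controls"}),
    (0.25, set()),
]
summary = mean_rank_by_choice(demo)
```

With $n$ models the average rank is $(n+1)/2$, so a choice whose mean rank sits far from that value clusters at one end of the curve. In this toy data `"FE 6"` has a mean rank of 1.5 out of 4 models, while `"controls"` sits at 2.0, closer to the middle value of 2.5.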

## How to Make a Specification Curve?

So far there seems to be no available software or package to conduct specification curve analysis easily. I did all the above analysis and plotting using a combination of STATA, Python and JavaScript:

- Use STATA to estimate all the models and `-estout-` the results to csv files.
- Run a Python script to compile the csv files into a JSON file of a particular structure.
- Run a JavaScript script to parse the JSON file and plot it using a modified Chart.js lib I built.

For the first part, I wrote a post on programmatically tabulating regression results in STATA earlier. The code can be modified to estimate tons of models and save the results in respective csv files.

The second part is easy. I defined a standard format for the JSON file to be used in the last step, and here I just need to extract relevant information from the csv files and put them together.
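A minimal sketch of what this compile step could look like is below. Both the csv layout (columns `term`, `coef`, `se`, `choices`) and the JSON structure are hypothetical stand-ins that I made up for illustration; the actual `-estout-` output and my JSON format differ. The csv content is passed in as text so the example runs on its own.

```python
import csv
import io
import json


def compile_results(csv_texts, variable="treat"):
    """Collect the row for `variable` from each model's csv text into one JSON string.

    csv_texts maps a model name to csv text with the hypothetical columns
    term, coef, se, choices (choices separated by ';').
    """
    models = []
    for name, text in csv_texts.items():
        for row in csv.DictReader(io.StringIO(text)):
            if row["term"] != variable:
                continue  # keep only the variable of interest
            models.append({
                "model": name,
                "coef": float(row["coef"]),
                "se": float(row["se"]),
                "choices": row["choices"].split(";") if row["choices"] else [],
            })
    models.sort(key=lambda m: m["coef"])  # ascending, as on the curve
    return json.dumps({"variable": variable, "models": models}, indent=2)


# Made-up csv content standing in for two exported result files.
csv_texts = {
    "m1": "term,coef,se,choices\ntreat,0.30,0.05,controls;firm FE\nsize,0.01,0.02,controls;firm FE\n",
    "m2": "term,coef,se,choices\ntreat,0.10,0.04,firm FE\n",
}
payload = compile_results(csv_texts)
```

In practice the text would come from reading each csv file on disk, and `payload` would be written to the JSON file that the plotting script loads.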

The last part is a little bit challenging but I figured it out anyway using Chart.js. I forked and modified the source code of the latest release version of Chart.js (2.9.3) to allow for different label font styles on the y-axis in the lower panel.

I’m going to publish a tool to plot the specification curve first and later maybe write a package to streamline everything.