

Adding Another Factor to Principal-Agent Model

In a traditional principal-agent model, firm output is a function of the agent's effort, and the principal observes only the output, not the agent's effort. The principal carefully designs the agent's compensation package, especially the sensitivity of the agent's pay to firm output, to maximize firm value. Now, what if we add another factor to the relationship between firm output and the agent's effort? How would the optimal pay sensitivity change?

Estimate Organization Capital

As in Eisfeldt and Papanikolaou (2013), we obtain firm-year accounting data from Compustat and compute each firm's stock of organization capital (OC) using the perpetual inventory method, which recursively accumulates the deflated value of SG&A expenses.
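The recursion can be sketched in a few lines. This is a minimal illustration for a single firm, not the exact code behind the post: the function name `organization_capital`, the 15% depreciation rate, the 10% growth rate used for the initial stock, and the use of a CPI series as the deflator are all assumptions made here.

```python
def organization_capital(sga, cpi, delta=0.15, growth=0.10):
    """Accumulate deflated SG&A into a stock of organization capital.

    sga, cpi: equal-length sequences for ONE firm, ordered by fiscal year.
    delta (depreciation) and growth (used only for the initial stock) are
    assumed values, not taken from the post.
    """
    if len(sga) != len(cpi):
        raise ValueError("sga and cpi must have the same length")
    if len(sga) == 0:
        return []
    real_sga = [s / p for s, p in zip(sga, cpi)]
    # Initial stock: first-year real SG&A divided by (growth + delta).
    stock = [real_sga[0] / (growth + delta)]
    # Perpetual inventory: depreciate last year's stock, add this year's flow.
    for flow in real_sga[1:]:
        stock.append((1.0 - delta) * stock[-1] + flow)
    return stock
```

In practice the function would be applied firm by firm after sorting by fiscal year, and missing SG&A values would need a rule of their own (for example, treating them as zero).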

Download M&A Deals from SDC Platinum

Thomson One Banker SDC Platinum database provides comprehensive M&A transaction data from the early 1980s onward, and is perhaps the most widely used M&A database in the world.

This post documents the steps of downloading M&A deals from the SDC Platinum database. Specifically, I show how to download the complete M&A data where:

  • both the acquiror and the target are US firms,
  • the acquiror is a public firm or a private firm,
  • the target is a public firm, a private firm, or a subsidiary,
  • the deal value is at least $1m, and
  • the form of the deal is an acquisition, a merger, or an acquisition of majority interest.

Specification Curve Analysis


More often than not, empirical researchers need to argue that their chosen model specification reigns. If not, they need to run a battery of tests on alternative specifications and report them. The problem is that a paper can fit at most a few tables, each with a few models, so it's extremely hard for readers to know whether the reported results have been cherry-picked.

So, why not run all possible model specifications and find a concise way to report them all?
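As a rough sketch of the idea, the snippet below estimates one OLS regression for every subset of a set of candidate controls and collects the coefficient of interest, sorted so that it can be plotted as a curve. The function name `specification_curve` and its arguments are hypothetical, and varying only the control set is a simplifying assumption; a full analysis would also vary samples, variable definitions, fixed effects, and so on.

```python
from itertools import combinations

import numpy as np


def specification_curve(y, x, controls):
    """Estimate y ~ 1 + x + (each subset of controls) by OLS.

    controls: dict mapping a control name to a 1-D array.
    Returns a list of (control_names, coefficient_on_x) sorted by coefficient.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    names = list(controls)
    results = []
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            cols = [np.ones_like(y), x]
            cols += [np.asarray(controls[c], dtype=float) for c in subset]
            X = np.column_stack(cols)
            beta, *_ = np.linalg.lstsq(X, y, rcond=None)
            # beta[0] is the intercept, beta[1] the coefficient on x.
            results.append((subset, float(beta[1])))
    return sorted(results, key=lambda r: r[1])
```

With m candidate controls this runs 2^m regressions, so the number of specifications grows quickly; the sorted output is what one would plot, together with an indicator panel showing which controls each specification includes.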

Firm Historical Headquarter State from SEC 10K/Q Filings

Why the need to use SEC filings?

In the Compustat database, a firm's headquarter state (and other identifying information) is in fact only the current record. This means that once a firm relocates (or updates its state of incorporation, address, etc.), all historical observations are overwritten and no longer record the historical state information.

To resolve this issue, an effective way is to use the firm's historical SEC filings. You can follow my previous post Textual Analysis on SEC filings to extract the header information, which includes a wide range of metadata. Alternatively, the University of Notre Dame's Software Repository for Accounting and Finance provides an augmented 10-X header dataset.

2023 March Update

In this update I use 1,491,368 8-K filings of U.S. firms from 2004 to Dec 2022 and extract their HQ state and zipcode.
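To illustrate the extraction step, here is a minimal sketch that pulls the state and zip code out of a filing's plain-text header. It assumes the header contains a "BUSINESS ADDRESS:" block with "STATE:" and "ZIP:" fields, followed by a "MAIL ADDRESS:" block, a "FORMER COMPANY:" block, or the end of the header; the function name `hq_state_zip` is hypothetical, and real filings may deviate from this layout.

```python
import re

# Assumed layout: a "BUSINESS ADDRESS:" block that ends at the next block
# label, the closing header tag, or the end of the text.
_BUSINESS = re.compile(
    r"BUSINESS ADDRESS:(.*?)(?:MAIL ADDRESS:|FORMER COMPANY:|</SEC-HEADER>|\Z)",
    re.S,
)
_STATE = re.compile(r"^\s*STATE:\s*(\S+)", re.M)
_ZIP = re.compile(r"^\s*ZIP:\s*(\S+)", re.M)


def hq_state_zip(header_text):
    """Return (state, zipcode) from the business address block, or None for missing fields."""
    block = _BUSINESS.search(header_text)
    if block is None:
        return None, None
    state = _STATE.search(block.group(1))
    zipcode = _ZIP.search(block.group(1))
    return (
        state.group(1) if state else None,
        zipcode.group(1) if zipcode else None,
    )
```

Restricting the search to the business address block matters because the mailing address and the state of incorporation can differ from the headquarter state.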

Compute Jackknife Coefficient Estimates in SAS

In certain scenarios, we want to estimate a model's parameters on the sample for each observation with itself excluded. This can be achieved by estimating the model repeatedly on the leave-one-out samples but is very inefficient. If we estimate the model on the full sample, however, the coefficient estimates will certainly be biased. Thankfully, we have the Jackknife method to correct for the bias, which produces the Jackknifed coefficient estimates for each observation.
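For an OLS model, the leave-one-out estimates do not require re-estimating the model n times. A standard identity gives the coefficients without observation i as the full-sample estimate minus (X'X)^{-1} x_i e_i / (1 - h_ii), where e_i is the full-sample residual and h_ii the leverage. The sketch below uses Python with NumPy rather than SAS and is only an illustration of that identity; the function name `leave_one_out_coefs` is hypothetical, and it assumes X has full column rank and no observation has leverage equal to one.

```python
import numpy as np


def leave_one_out_coefs(X, y):
    """Return an (n, k) array whose row i is the OLS estimate without observation i."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y          # full-sample OLS estimate
    resid = y - X @ beta              # full-sample residuals e_i
    # Leverage h_ii = x_i' (X'X)^{-1} x_i for every observation.
    leverage = np.einsum("ij,jk,ik->i", X, xtx_inv, X)
    # Row i of `shift` is (X'X)^{-1} x_i * e_i / (1 - h_ii).
    shift = (X @ xtx_inv) * (resid / (1.0 - leverage))[:, None]
    return beta - shift
```

The whole computation costs one matrix inversion plus a few matrix products, compared with n separate regressions for the brute-force approach.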

Python Shared Memory in Multiprocessing

Python 3.8 introduced a new module multiprocessing.shared_memory that provides shared memory for direct access across processes. My test shows that it significantly reduces the memory usage, which also speeds up the program by reducing the costs of copying and moving things around.
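A minimal sketch of the pattern, using only the standard library: the parent creates a named block and writes floats into it, and a worker process attaches to the same block by name and modifies the values in place, so nothing is pickled or copied between processes. The function names `double_in_place` and `run_demo` are hypothetical, and the data layout (a flat run of float64 values) is an assumption made for the example.

```python
import struct
from multiprocessing import Process, shared_memory


def double_in_place(name, n):
    """Attach to an existing shared block by name and double its n float64 values."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        values = struct.unpack_from(f"{n}d", shm.buf, 0)
        struct.pack_into(f"{n}d", shm.buf, 0, *(2.0 * v for v in values))
    finally:
        shm.close()  # detach only; the creator is responsible for unlink()


def run_demo(values):
    """Create a shared block, let a worker process modify it, and read it back."""
    n = len(values)
    shm = shared_memory.SharedMemory(create=True, size=8 * n)
    try:
        struct.pack_into(f"{n}d", shm.buf, 0, *values)
        worker = Process(target=double_in_place, args=(shm.name, n))
        worker.start()
        worker.join()
        return list(struct.unpack_from(f"{n}d", shm.buf, 0))
    finally:
        shm.close()
        shm.unlink()  # free the block once every process is done with it
```

Calling `run_demo([1.0, 2.0, 3.0])` from under an `if __name__ == "__main__":` guard returns `[2.0, 4.0, 6.0]`. Only the block's name and the element count cross the process boundary; with large arrays, that is where the memory and time savings come from.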

Textual Analysis on SEC Filings

Nowadays top journals favour more granular studies. Sometimes it's useful to dig into the raw SEC filings and perform textual analysis. This note documents how I download all historical SEC filings via EDGAR and conduct some textual analyses.

Kyle's Lambda

Kyle's lambda is a measure of market impact cost from Kyle (1985), which can be interpreted as the cost of demanding a certain amount of liquidity over a given time period.
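One simple way to estimate it, sketched below, is to regress price changes on net signed order flow over the same intervals and take the slope as lambda. This particular specification (a linear regression with an intercept, using raw signed volume) is an assumption made for illustration; empirical work uses several variants, and the function name `kyle_lambda` is hypothetical.

```python
def kyle_lambda(price_changes, signed_volume):
    """OLS slope of price changes on net signed order flow (with an intercept)."""
    n = len(price_changes)
    if n != len(signed_volume) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_dp = sum(price_changes) / n
    mean_q = sum(signed_volume) / n
    cov = sum(
        (q - mean_q) * (dp - mean_dp)
        for q, dp in zip(signed_volume, price_changes)
    )
    var = sum((q - mean_q) ** 2 for q in signed_volume)
    if var == 0:
        raise ValueError("signed_volume has no variation")
    return cov / var
```

A larger slope means that a given amount of net buying or selling moves the price more, that is, the market is less liquid.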