Probability of Informed Trading (PIN)
This post was originally published on my old blog in March 2018 in Chinese. Translation is provided by ChatGPT-4.
In the market microstructure literature, Easley et al. (1996) proposed a trading model that decomposes the bid-ask spread. Its most notable contribution is the “Probability of Informed Trading,” or PIN, which measures the informational component of the spread. As the name suggests, under ideal conditions PIN reflects the probability of informed trading in a market with a market maker.
In this post, I walk through the modeling in Easley et al. (1996) and discuss how to handle the objective function in maximum likelihood estimation so that it does not overflow during computation.
Model
Assume that the buy and sell orders of informed and uninformed traders follow independent Poisson processes, and the following tree diagram describes the entire trading process:
- On each trading day, there is a probability $\alpha$ that new information appears, and a probability $1-\alpha$ that there is no new information.
  - The probability of new information being bearish is $\delta$, and the probability of it being bullish is $1-\delta$.
    - If the news is bearish, the arrival rate of buy orders on that day is $\varepsilon$, and the arrival rate of sell orders is $\varepsilon+\mu$, where $\varepsilon$ comes from uninformed traders and $\mu$ from informed traders.
    - If the news is bullish, the arrival rate of buy orders on that day is $\varepsilon+\mu$, and the arrival rate of sell orders is $\varepsilon$.
  - When there is no new information, the arrival rate of both buy and sell orders is $\varepsilon$.
Trading Process
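To make the tree concrete, here is a minimal simulation sketch. It assumes the standard notation for this model ($\alpha$, $\delta$, $\varepsilon$, $\mu$), a day length of $T = 1$, and illustrative parameter values; the function name `simulate_days` is hypothetical and not part of any library.

```python
import numpy as np

def simulate_days(alpha, delta, eps, mu, n_days, seed=0):
    """Draw daily (buys, sells) counts from the trading tree, with T = 1 (assumption)."""
    rng = np.random.default_rng(seed)
    news = rng.random(n_days) < alpha           # does an information event occur?
    bad = news & (rng.random(n_days) < delta)   # bearish, given that there is news
    good = news & ~bad                          # bullish, given that there is news
    buy_rate = eps + mu * good                  # informed traders buy on good news
    sell_rate = eps + mu * bad                  # informed traders sell on bad news
    buys = rng.poisson(buy_rate)
    sells = rng.poisson(sell_rate)
    return buys, sells, news

# Illustrative parameter values, not estimates from real data.
buys, sells, news = simulate_days(alpha=0.4, delta=0.5, eps=50.0, mu=30.0, n_days=250)
print(buys[:5], sells[:5], news[:5])
```

On news days one side of the order flow is visibly heavier than the other, which is the pattern the estimation later exploits.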
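Next, assume that the market maker is a Bayesian: he updates his beliefs about the state of the market, in particular whether there is new information on that day, by observing trades and trading rates. Suppose each trading day is independent, and let $P(t) = (P_n(t), P_b(t), P_g(t))$ be his belief at time $t$ that the day has no news ($n$), bad news ($b$) or good news ($g$), so that $P(0) = (1-\alpha,\ \alpha\delta,\ \alpha(1-\delta))$.
Let $S_t$ denote the arrival of a sell order at time $t$, and let $P(t \mid S_t)$ be the belief updated on that observation. Sell orders arrive at rate $\varepsilon$ when there is no news or good news and at rate $\varepsilon+\mu$ when there is bad news, so by Bayes' theorem the posterior probability of no news after a sell order is
$$P_n(t \mid S_t) = \frac{P_n(t)\,\varepsilon}{\varepsilon + P_b(t)\,\mu} \tag{1}$$
Similarly, the posterior probability of bearish information after the market maker observes a sell order at time $t$ is
$$P_b(t \mid S_t) = \frac{P_b(t)\,(\varepsilon+\mu)}{\varepsilon + P_b(t)\,\mu} \tag{2}$$
The posterior probability of bullish information after the market maker observes a sell order at time $t$ is
$$P_g(t \mid S_t) = \frac{P_g(t)\,\varepsilon}{\varepsilon + P_b(t)\,\mu} \tag{3}$$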
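As a small numerical illustration of this belief update, the sketch below applies Bayes' theorem to the prior $(1-\alpha,\ \alpha\delta,\ \alpha(1-\delta))$ after a single sell order. The function names and parameter values are illustrative assumptions; the buy-side update is not written out in the text and is included here only as the mirror image.

```python
def update_after_sell(p_n, p_b, p_g, eps, mu):
    """Posterior (no news, bad news, good news) after observing one sell order."""
    # Total weight p_n*eps + p_b*(eps + mu) + p_g*eps, using p_n + p_b + p_g = 1.
    denom = eps + p_b * mu
    return p_n * eps / denom, p_b * (eps + mu) / denom, p_g * eps / denom

def update_after_buy(p_n, p_b, p_g, eps, mu):
    """Mirror image for one buy order (assumed by symmetry)."""
    denom = eps + p_g * mu
    return p_n * eps / denom, p_b * eps / denom, p_g * (eps + mu) / denom

alpha, delta, eps, mu = 0.4, 0.5, 50.0, 30.0   # illustrative values
prior = (1 - alpha, alpha * delta, alpha * (1 - delta))
after_sell = update_after_sell(*prior, eps, mu)
print(prior)
print(after_sell)
```

A sell order shifts weight toward bad news and away from the other two states, while the posterior still sums to one.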
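Thus, the expected zero-profit bid price at time $t$ on day $i$ is the expected asset value conditional on the trading history and on a sell order arriving at $t$:
$$b(t) = \frac{P_n(t)\,\varepsilon\,V_i^{*} + P_b(t)\,(\varepsilon+\mu)\,\underline{V}_i + P_g(t)\,\varepsilon\,\overline{V}_i}{\varepsilon + P_b(t)\,\mu} \tag{4}$$
Here, $V_i$ is the value of the asset at the end of day $i$: it equals $\underline{V}_i$ after bad news, $\overline{V}_i$ after good news and $V_i^{*}$ when there is no news, with $\underline{V}_i < V_i^{*} < \overline{V}_i$.
At this point, the ask price should be:
$$a(t) = \frac{P_n(t)\,\varepsilon\,V_i^{*} + P_b(t)\,\varepsilon\,\underline{V}_i + P_g(t)\,(\varepsilon+\mu)\,\overline{V}_i}{\varepsilon + P_g(t)\,\mu} \tag{5}$$
Let’s associate these bid and ask prices with the expected asset value at time $t$. Since
$$E[V_i \mid t] = P_n(t)\,V_i^{*} + P_b(t)\,\underline{V}_i + P_g(t)\,\overline{V}_i \tag{6}$$
we can write the above $b(t)$ and $a(t)$ as
$$b(t) = E[V_i \mid t] - \frac{\mu P_b(t)}{\varepsilon + \mu P_b(t)}\left(E[V_i \mid t] - \underline{V}_i\right) \tag{7}$$
$$a(t) = E[V_i \mid t] + \frac{\mu P_g(t)}{\varepsilon + \mu P_g(t)}\left(\overline{V}_i - E[V_i \mid t]\right) \tag{8}$$
Thus, the bid-ask spread is
$$a(t) - b(t) = \frac{\mu P_g(t)}{\varepsilon + \mu P_g(t)}\left(\overline{V}_i - E[V_i \mid t]\right) + \frac{\mu P_b(t)}{\varepsilon + \mu P_b(t)}\left(E[V_i \mid t] - \underline{V}_i\right) \tag{9}$$
This indicates that the bid-ask spread at time $t$ is the probability of a buy order being an informed trade times the expected loss to the informed buyer, plus the probability of a sell order being an informed trade times the expected loss to the informed seller.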
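Therefore, the probability that any trade at time $t$ is information-based is the arrival rate of informed orders relative to the arrival rate of all orders:
$$PIN(t) = \frac{\mu\,(1 - P_n(t))}{\mu\,(1 - P_n(t)) + 2\varepsilon} \tag{10}$$
If no information event occurs ($P_n(t) = 1$) or there are no informed traders ($\mu = 0$), both PIN and the bid-ask spread are zero. When good and bad news are equally likely ($\delta = 1/2$), the spread at the start of the day simplifies to
$$a - b = \frac{\alpha\mu}{\alpha\mu + 2\varepsilon}\left(\overline{V}_i - \underline{V}_i\right) \tag{11}$$
And our PIN measure becomes
$$PIN = \frac{\alpha\mu}{\alpha\mu + 2\varepsilon} \tag{12}$$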
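The quotes and the PIN measure are easy to check numerically. The sketch below evaluates the bid, the ask and PIN at the start of a day, using the prior beliefs $(1-\alpha,\ \alpha\delta,\ \alpha(1-\delta))$. The asset values 90, 100 and 110 and the parameter values are made up for illustration, and `quotes_and_pin` is a hypothetical helper name.

```python
def quotes_and_pin(p_n, p_b, p_g, eps, mu, v_low, v_mid, v_high):
    """Zero-profit bid and ask, plus the probability that a trade is informed."""
    ev = p_n * v_mid + p_b * v_low + p_g * v_high              # expected value
    bid = ev - mu * p_b / (eps + mu * p_b) * (ev - v_low)      # hit by informed sellers
    ask = ev + mu * p_g / (eps + mu * p_g) * (v_high - ev)     # lifted by informed buyers
    pin = mu * (1 - p_n) / (mu * (1 - p_n) + 2 * eps)
    return bid, ask, pin

alpha, delta, eps, mu = 0.4, 0.5, 50.0, 30.0   # illustrative values
bid, ask, pin = quotes_and_pin(
    1 - alpha, alpha * delta, alpha * (1 - delta), eps, mu,
    v_low=90.0, v_mid=100.0, v_high=110.0,
)
print(bid, ask, ask - bid, pin)
```

With $\delta = 1/2$ the opening spread equals PIN times the gap between the high and low values, so a wider spread here is entirely a reflection of more informed trading.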
Model Estimation
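After the model is established, let’s talk about parameter estimation. The parameters we need to estimate, $\theta = (\alpha, \delta, \mu, \varepsilon)$, are not directly observable. What we do observe is the number of buy and sell orders each day, so we estimate $\theta$ by maximum likelihood.
First, according to the trading model shown in the diagram, assume that there is bad news on a certain day. Then the arrival rate of sell orders is $\mu+\varepsilon$ and the arrival rate of buy orders is $\varepsilon$, so the probability of observing a sequence of trades with $B$ buys and $S$ sells over a period of length $T$ is
$$e^{-\varepsilon T}\frac{(\varepsilon T)^{B}}{B!}\; e^{-(\mu+\varepsilon) T}\frac{[(\mu+\varepsilon) T]^{S}}{S!} \tag{13}$$
If there is good news on a certain day, the probability of observing a sequence of trades with $B$ buys and $S$ sells is
$$e^{-(\mu+\varepsilon) T}\frac{[(\mu+\varepsilon) T]^{B}}{B!}\; e^{-\varepsilon T}\frac{(\varepsilon T)^{S}}{S!} \tag{14}$$
If there is no new information on a certain day, the probability of observing a sequence of trades with $B$ buys and $S$ sells is
$$e^{-\varepsilon T}\frac{(\varepsilon T)^{B}}{B!}\; e^{-\varepsilon T}\frac{(\varepsilon T)^{S}}{S!} \tag{15}$$
So, the probability of observing a total of $B$ buys and $S$ sells on a trading day is the average of these three cases, weighted by their probabilities:
$$L(\theta \mid B, S) = (1-\alpha)\, e^{-\varepsilon T}\frac{(\varepsilon T)^{B}}{B!}\, e^{-\varepsilon T}\frac{(\varepsilon T)^{S}}{S!} + \alpha\delta\, e^{-\varepsilon T}\frac{(\varepsilon T)^{B}}{B!}\, e^{-(\mu+\varepsilon) T}\frac{[(\mu+\varepsilon) T]^{S}}{S!} + \alpha(1-\delta)\, e^{-(\mu+\varepsilon) T}\frac{[(\mu+\varepsilon) T]^{B}}{B!}\, e^{-\varepsilon T}\frac{(\varepsilon T)^{S}}{S!} \tag{16}$$
Hence, with $I$ independent trading days and data $M = \big((B_1, S_1), \dots, (B_I, S_I)\big)$, the objective function for maximum likelihood estimation is
$$L(\theta \mid M) = \prod_{i=1}^{I} L(\theta \mid B_i, S_i) \tag{17}$$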
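For reference, a direct transcription of the daily likelihood into Python looks like the sketch below: each state contributes a product of two Poisson probabilities, weighted by $1-\alpha$, $\alpha\delta$ and $\alpha(1-\delta)$. The function names, the default $T = 1$ and the parameter values are assumptions for illustration. The second call uses counts and rates of the size seen in liquid stocks.

```python
import math

def poisson_pmf(k, rate):
    """Poisson probability written literally: e^{-rate} * rate^k / k!."""
    return math.exp(-rate) * rate**k / math.factorial(k)

def naive_daily_likelihood(alpha, delta, eps, mu, b, s, T=1.0):
    """Mixture over no news, bad news and good news for one day with b buys, s sells."""
    no_news = poisson_pmf(b, eps * T) * poisson_pmf(s, eps * T)
    bad = poisson_pmf(b, eps * T) * poisson_pmf(s, (eps + mu) * T)
    good = poisson_pmf(b, (eps + mu) * T) * poisson_pmf(s, eps * T)
    return (1 - alpha) * no_news + alpha * delta * bad + alpha * (1 - delta) * good

# Small counts: the literal formula evaluates without trouble.
small = naive_daily_likelihood(0.4, 0.5, 5.0, 3.0, b=4, s=9)
print(small)

# Counts in the hundreds: rate**k exceeds the range of a float.
try:
    naive_daily_likelihood(0.4, 0.5, 500.0, 300.0, b=480, s=830)
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)
```

Pure-Python floats raise `OverflowError` here; array libraries typically return `inf` or `nan` instead, which is harder to notice inside an optimizer.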
Bottomline
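The problem seems to end here. With the objective function in hand, it looks as if all that remains is to program it and respect the parameter bounds. However, the real challenge comes next: if you write the objective function exactly like this and run it, you will almost certainly hit an overflow error, because the function is full of powers and factorials. Even if the time interval is chosen to be very short, some highly liquid assets still have hundreds of transactions within a few seconds. Therefore, both $B$ and $S$ can be large enough that the power and factorial terms exceed floating-point range.
By observing equation (16), the three terms in the likelihood function share a common factor $e^{-2\varepsilon T}(\varepsilon T)^{B+S}/(B!\,S!)$, so the daily likelihood can be written as
$$L(\theta \mid B, S) = \frac{e^{-2\varepsilon T}(\varepsilon T)^{B+S}}{B!\,S!}\left[(1-\alpha) + \alpha\delta\, e^{-\mu T}\left(1+\frac{\mu}{\varepsilon}\right)^{S} + \alpha(1-\delta)\, e^{-\mu T}\left(1+\frac{\mu}{\varepsilon}\right)^{B}\right] \tag{18}$$
and the log-likelihood over all days is
$$\ln L(\theta \mid M) = \sum_{i=1}^{I}\left[-2\varepsilon T + (B_i+S_i)\ln(\varepsilon T) + \ln\!\left((1-\alpha) + \alpha\delta\, e^{-\mu T}\left(1+\frac{\mu}{\varepsilon}\right)^{S_i} + \alpha(1-\delta)\, e^{-\mu T}\left(1+\frac{\mu}{\varepsilon}\right)^{B_i}\right) - \ln(B_i!\,S_i!)\right] \tag{19}$$
Now, since the last term $\ln(B_i!\,S_i!)$ does not depend on $\theta$, it can be dropped from the objective without changing the maximizer, which removes the factorials entirely.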
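One way to evaluate a log-likelihood of this kind safely is sketched below. It pulls the common factor $e^{-2\varepsilon T}(\varepsilon T)^{B+S}/(B!\,S!)$ out of the three terms of the daily likelihood, drops the constant $\ln(B!\,S!)$, and evaluates the log of the remaining three-term sum with the log-sum-exp trick, so that powers such as $(1+\mu/\varepsilon)^{S}$ are never formed directly. The log-sum-exp step is my own choice for this sketch and is not necessarily what the implementation linked at the end of this post does; the function name, the default $T = 1$ and the sample data are illustrative assumptions.

```python
import math

def log_likelihood(theta, buys, sells, T=1.0):
    """Log-likelihood up to the constant -sum(ln(B_i! S_i!)).

    Requires 0 < alpha < 1, 0 < delta < 1 and eps > 0.
    """
    alpha, delta, eps, mu = theta
    log_ratio = math.log1p(mu / eps)          # ln(1 + mu/eps)
    total = 0.0
    for b, s in zip(buys, sells):
        # Logs of the three terms left inside the bracket after factoring.
        terms = [
            math.log(1 - alpha),                                   # no news
            math.log(alpha * delta) - mu * T + s * log_ratio,      # bad news
            math.log(alpha * (1 - delta)) - mu * T + b * log_ratio,  # good news
        ]
        m = max(terms)                        # log-sum-exp: subtract the max first
        log_sum = m + math.log(sum(math.exp(x - m) for x in terms))
        total += -2 * eps * T + (b + s) * math.log(eps * T) + log_sum
    return total

# Made-up daily counts of the size that breaks the literal formula.
buys = [480, 520, 810, 495]
sells = [830, 505, 490, 510]
ll_a = log_likelihood((0.5, 0.5, 500.0, 300.0), buys, sells)
ll_b = log_likelihood((0.5, 0.5, 400.0, 100.0), buys, sells)
print(ll_a, ll_b)
```

The negative of this function can be handed to a bounded numerical optimizer, for example `scipy.optimize.minimize` with `bounds`, to obtain estimates of $\theta$ and then PIN.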
Python code
See my implementation here: https://frds.io/measures/probability_of_informed_trading/.