# Probability of Informed Trading (PIN)

The PIN of an asset measures the probability that a trade in that asset is initiated by an informed trader. It was developed in a series of papers by Easley, O’Hara, and co-authors (1992, 1996a, 1996b, 1997a, 1997b, 2002).

## Definition

This short description is from WRDS.

To estimate the PIN for any asset over a particular period of time, a count of the buyer- and seller-initiated trades over that period is required. To identify whether a particular trade is buyer- or seller-initiated, the Lee and Ready (1991) test is commonly used in the literature.
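As a rough illustration of how the daily counts might be built, the sketch below signs trades in the spirit of Lee and Ready (1991): a trade above the quote midpoint is treated as a buy, a trade below it as a sell, and a trade at the midpoint falls back to a tick test. The function name `classify_trades`, the tuple layout, and the use of integer cents are assumptions made for this example. The quote-timing adjustment discussed in the original paper is left out, so each trade is assumed to already carry its prevailing quote.

```python
def classify_trades(trades):
    """Sign trades given in time order as (price, bid, ask) tuples.

    Returns +1 for buyer-initiated, -1 for seller-initiated, and 0 when
    neither the quote rule nor the tick test can decide.
    Prices are assumed to be integers (e.g. cents) so that the midpoint
    comparison is exact.
    """
    signs = []
    last_price = None
    last_tick = 0  # sign of the most recent non-zero price change
    for price, bid, ask in trades:
        if last_price is not None and price != last_price:
            last_tick = 1 if price > last_price else -1
        mid = (bid + ask) / 2
        if price > mid:
            sign = 1           # quote rule: above the midpoint
        elif price < mid:
            sign = -1          # quote rule: below the midpoint
        else:
            sign = last_tick   # tick test for trades at the midpoint
        signs.append(sign)
        last_price = price
    return signs


# Hypothetical trades, prices in cents.
example = [(1002, 1000, 1002), (1000, 1000, 1002),
           (1001, 1000, 1002), (1001, 1000, 1002)]
signs = classify_trades(example)
buys, sells = signs.count(1), signs.count(-1)
```

Summing the signs per day in this way gives the buy and sell counts that the model takes as input.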

The research application produces estimates of the parameters from a stylized version (Easley, Engle, O’Hara, and Wu, 2002) of the Easley, O’Hara, et al. models. In this stylized version, the probability of observing $B$ buy orders and $S$ sell orders on day $t$, conditional on the parameter vector of the model $\theta \equiv [\mu,\epsilon,\alpha,\delta]$, is given by

$$ \begin{align} P&\left[\mathbf{y}_t=\left(B,S\right)|\theta\right] = \alpha(1-\delta)e^{-(\mu+2\epsilon)}\frac{(\mu+\epsilon)^B \epsilon^S}{B!S!} \newline &+\alpha\delta e^{-(\mu+2\epsilon)}\frac{(\mu+\epsilon)^S \epsilon^B}{B!S!} + (1-\alpha)e^{-2\epsilon} \frac{\epsilon^{B+S}}{B!S!} \end{align} $$

where $\mu$ is the arrival rate of informed trades, $\epsilon$ is the arrival rate of uninformed buy orders and, equally, of uninformed sell orders, $\alpha$ is the probability of an information event, $\delta$ is the probability of the signal being “low” (bad news), and $\mathbf{y}_t=\left(B,S\right)$ is the vector of buy and sell counts on day $t$.
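The daily probability is a mixture of three scenarios, each a product of two independent Poisson distributions: good news (buys arrive at rate $\mu+\epsilon$, sells at $\epsilon$), bad news (buys at $\epsilon$, sells at $\mu+\epsilon$), and no news (both at $\epsilon$). A minimal sketch of that mixture, evaluated in logs, is shown below. The function names are made up for this example, and the parameters are assumed to lie strictly inside their ranges ($0<\alpha<1$, $0<\delta<1$, $\mu>0$, $\epsilon>0$).

```python
import math


def log_poisson(k, rate):
    """Log of the Poisson probability of observing k events at the given rate."""
    return -rate + k * math.log(rate) - math.lgamma(k + 1)


def daily_log_prob(buys, sells, alpha, delta, mu, eps):
    """Log probability of one day's (buys, sells) under the three-scenario mixture.

    Assumes 0 < alpha < 1, 0 < delta < 1, mu > 0 and eps > 0.
    """
    good = (math.log(alpha) + math.log1p(-delta)
            + log_poisson(buys, mu + eps) + log_poisson(sells, eps))
    bad = (math.log(alpha) + math.log(delta)
           + log_poisson(buys, eps) + log_poisson(sells, mu + eps))
    none = (math.log1p(-alpha)
            + log_poisson(buys, eps) + log_poisson(sells, eps))
    # log-sum-exp keeps the mixture stable when the scenarios differ by many orders of magnitude
    peak = max(good, bad, none)
    return peak + math.log(sum(math.exp(v - peak) for v in (good, bad, none)))
```

A quick sanity check on such a function is that the probabilities sum to one over all $(B, S)$ pairs, which is easy to confirm numerically for small arrival rates.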

To estimate the parameters of this model from data over $T$ days, the product of the daily likelihoods above is maximized. However, some of the quantities in the likelihood function, such as the powers, factorials, and exponentials, are extremely large or extremely small and can cause overflow or underflow errors. Therefore, the log of the likelihood function is taken and factorized to produce the following factorized log likelihood function:
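The numerical problem is easy to reproduce. The snippet below uses hypothetical values (rates of 50 and 35 and a day with 500 buys) to show a power term overflowing and an exponential underflowing in double precision, while the same quantity in logs stays well within range.

```python
import math

mu, eps, buys = 50.0, 35.0, 500  # hypothetical rates and buy count

try:
    naive = (mu + eps) ** buys        # 85^500 is far beyond the float range
except OverflowError:
    naive = math.inf                  # Python raises instead of returning inf

tiny = math.exp(-800.0)               # underflows silently to 0.0

log_term = buys * math.log(mu + eps)  # the same power in logs is an ordinary number
```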

$$ \begin{align} \mathcal{L}&\left(\{ \mathbf{y}_t \}_{t=1}^{T}|\theta \right) = \sum_{t=1}^{T} \left[-2\epsilon+ M \ln x + (B+S) \ln(\mu+\epsilon) \right] \newline &+ \sum_{t=1}^{T} \ln\left[ \alpha(1-\delta)e^{-\mu} x^{S-M} + \alpha\delta e^{-\mu} x^{B-M} + (1-\alpha)x^{B+S-M} \right] \end{align} $$

where $M \equiv (\min(B,S)+\max(B,S))/2$ and $x \equiv \epsilon/(\mu+\epsilon) \in [0,1]$ (Easley, Engle, O’Hara, and Wu, 2002). The term $-\ln(B!\,S!)$ does not depend on $\theta$ and is therefore omitted. This log-likelihood function is maximized over the parameter space to produce the maximum-likelihood estimates of the parameters. The probability of informed trading that emerges from the estimated parameters is
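The factorized log-likelihood can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not code from the cited papers: the function name `factorized_loglik`, the parameter order `(alpha, delta, mu, eps)`, and the choice to return minus infinity outside the interior of the parameter space are all made up for this example. The inner logarithm is evaluated with a log-sum-exp step so that large negative exponents of $x$ do not overflow.

```python
import math


def factorized_loglik(theta, days):
    """Factorized log-likelihood for a list of daily (buys, sells) counts.

    theta = (alpha, delta, mu, eps). The constant -ln(B! S!) is omitted,
    as in the formula above. Returns -inf outside the assumed parameter
    region 0 < alpha < 1, 0 < delta < 1, mu >= 0, eps > 0.
    """
    alpha, delta, mu, eps = theta
    if not (0.0 < alpha < 1.0 and 0.0 < delta < 1.0 and mu >= 0.0 and eps > 0.0):
        return -math.inf
    log_rate = math.log(mu + eps)
    log_x = math.log(eps) - log_rate          # ln x with x = eps / (mu + eps)
    total = 0.0
    for buys, sells in days:
        m = (min(buys, sells) + max(buys, sells)) / 2
        lead = -2.0 * eps + m * log_x + (buys + sells) * log_rate
        terms = [
            math.log(alpha) + math.log1p(-delta) - mu + (sells - m) * log_x,  # good news
            math.log(alpha) + math.log(delta) - mu + (buys - m) * log_x,      # bad news
            math.log1p(-alpha) + (buys + sells - m) * log_x,                  # no news
        ]
        peak = max(terms)
        total += lead + peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total
```

The value can be passed, with its sign flipped, to any numerical minimizer. Because the surface can have several local maxima, trying a number of starting points is a sensible precaution.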

$$ \begin{align} PIN=\frac{\alpha\mu}{\alpha\mu+2\epsilon} \end{align} $$
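The ratio has a direct reading: $\alpha\mu$ is the expected arrival rate of informed orders and $2\epsilon$ is the combined arrival rate of uninformed buys and sells, so PIN is the informed share of expected order flow. A minimal helper, with a name and example values chosen only for illustration, could look like this:

```python
def pin(alpha, mu, eps):
    """Probability of informed trading from estimated parameters.

    Assumes eps > 0 so that the denominator is positive.
    """
    informed = alpha * mu              # expected informed arrival rate
    return informed / (informed + 2.0 * eps)


# Hypothetical estimates: alpha = 0.4, mu = 50, eps = 35
# give 20 / (20 + 70), roughly 0.22.
example_pin = pin(0.4, 50.0, 35.0)
```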

## Source Code

This example Python code is not optimized for speed and is for demonstration purposes only. It may contain errors.

```
# PIN.py
```