# A random walk away finance

## Identify Retail Investors

Retail investors and their trading behaviour attract considerable research interest. One strand of the literature uses proprietary datasets to identify retail investors; the other uses algorithms. A recent JF paper, Boehmer et al. (2021), proposes a simple algorithm based only on the trade price, which also signs the trade direction effectively. Even more interestingly, I just read a follow-up paper by Barber et al. (2023), forthcoming in JF, whose authors placed 85,000 retail trades themselves to validate the Boehmer et al. (2021) algorithm.
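To make the price-based rule concrete, here is a minimal sketch of the sub-penny classification as I understand it from Boehmer et al. (2021): take the fraction of a cent in the trade price; a fraction strictly between 0 and 0.4 flags a retail sell, and one strictly between 0.6 and 1 flags a retail buy. The function name and labels are hypothetical, and the sketch assumes the input has already been restricted to off-exchange trades, which it does not check.

```python
import math


def classify_retail_trade(price: float) -> str:
    """Label a trade by the sub-penny part of its price (hypothetical helper).

    Assumes the trade is already known to be an off-exchange trade.
    """
    # Price in cents; rounding absorbs floating-point noise.
    cents = round(price * 100, 6)
    # Fraction of a cent, in [0, 1).
    frac = round(cents - math.floor(cents), 6)
    if 0 < frac < 0.4:
        return "retail sell"  # small price improvement above a round penny
    if 0.6 < frac < 1:
        return "retail buy"   # small price improvement below a round penny
    return "unclassified"     # round penny or near the half-penny


for p in (10.0001, 10.0099, 10.00, 10.005):
    print(p, classify_retail_trade(p))
```

Prices at a round penny or close to the half-penny are left unclassified, so the rule deliberately trades coverage for accuracy.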

## Translog Cost Function Estimation

This post focuses on the translog cost function. I discuss the linear homogeneity constraint, the technique to impose the constraint, and its estimation via

• Ordinary Least Squares (OLS)
• Stochastic Frontier Analysis (SFA)

Code examples are provided, too.
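As a rough illustration of the OLS route, the sketch below simulates data for a two-input translog cost function and imposes linear homogeneity in input prices by dividing cost and the first input price by the second input price. This normalization is one common way to impose the constraint; the post itself may do it differently. All variable names, parameter values and the simulated data are made up for illustration, and SFA is not covered here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated (made-up) data: log output and log input prices.
ln_y = rng.normal(0.0, 1.0, n)
ln_w1 = rng.normal(0.0, 0.5, n)
ln_w2 = rng.normal(0.0, 0.5, n)

# Illustrative "true" parameters of the normalized equation:
# ln(C/w2) = a0 + ay*ln(y) + a1*ln(w1/w2)
#            + 0.5*byy*ln(y)^2 + 0.5*b11*ln(w1/w2)^2 + g1*ln(y)*ln(w1/w2)
true = {"a0": 1.0, "ay": 0.8, "a1": 0.4, "byy": 0.1, "b11": 0.05, "g1": -0.03}

ln_w = ln_w1 - ln_w2  # ln(w1 / w2)
ln_c_norm = (
    true["a0"]
    + true["ay"] * ln_y
    + true["a1"] * ln_w
    + 0.5 * true["byy"] * ln_y**2
    + 0.5 * true["b11"] * ln_w**2
    + true["g1"] * ln_y * ln_w
    + rng.normal(0.0, 0.01, n)  # noise
)

# OLS on the normalized equation; column order matches the keys of `true`.
X = np.column_stack(
    [np.ones(n), ln_y, ln_w, 0.5 * ln_y**2, 0.5 * ln_w**2, ln_y * ln_w]
)
coef, *_ = np.linalg.lstsq(X, ln_c_norm, rcond=None)
est = dict(zip(true, coef))

# Recover the parameters that the homogeneity constraint pins down.
a2_hat = 1.0 - est["a1"]   # a1 + a2 = 1
b12_hat = -est["b11"]      # b11 + b12 = 0
b22_hat = est["b11"]       # b12 + b22 = 0
g2_hat = -est["g1"]        # g1 + g2 = 0

print(est)
```

Because the constraint is built into the regression equation, the restricted parameters satisfy homogeneity by construction rather than being tested after estimation.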

## Translog Production and Cost Functions

In this post, I'll carefully explain the derivation of the cost function from a CES production function, as well as the derivation of the translog (transcendental logarithmic) production and cost functions.

```mermaid
flowchart TB
    subgraph Production
        A[Production Function] -. approximation .-> D(Translog Production Function)
    end
    subgraph Cost
        B[Cost Function] -. approximation .-> C(Translog Cost Function)
    end
    A == Conversion via Duality ==> B
```

Before I start, the graph above illustrates the relations. Specifically, we can derive the cost function from a CES production function via the duality theorem. The translog production and translog cost functions are approximations to the production function and the corresponding cost function, respectively, via Taylor expansion.
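The duality step can be checked numerically. The sketch below assumes a two-input CES production function y = (a1·x1^ρ + a2·x2^ρ)^(1/ρ) with 0 < ρ < 1, for which cost minimization gives the closed-form cost function C = y·(a1^σ·w1^(1−σ) + a2^σ·w2^(1−σ))^(1/(1−σ)), where σ = 1/(1−ρ). The function names and parameter values are hypothetical, and the brute-force grid search is only for illustration.

```python
import numpy as np


def ces_cost_closed_form(y, w1, w2, a1, a2, rho):
    """Cost function dual to the CES production function (closed form)."""
    sigma = 1.0 / (1.0 - rho)  # elasticity of substitution
    s = a1**sigma * w1 ** (1.0 - sigma) + a2**sigma * w2 ** (1.0 - sigma)
    return y * s ** (1.0 / (1.0 - sigma))


def ces_cost_numeric(y, w1, w2, a1, a2, rho, n_grid=200_001):
    """Minimize w1*x1 + w2*x2 along the isoquant by grid search (0 < rho < 1)."""
    # Largest x1 that still leaves the isoquant reachable with x2 >= 0.
    x1_max = (y**rho / a1) ** (1.0 / rho)
    x1 = np.linspace(0.0, x1_max, n_grid)[1:-1]  # interior points only
    # x2 implied by the isoquant for each x1.
    x2 = ((y**rho - a1 * x1**rho) / a2) ** (1.0 / rho)
    return float(np.min(w1 * x1 + w2 * x2))


params = dict(y=2.0, w1=1.0, w2=3.0, a1=0.4, a2=0.6, rho=0.5)
c_dual = ces_cost_closed_form(**params)
c_grid = ces_cost_numeric(**params)
print(c_dual, c_grid)
```

The two numbers should agree up to the grid resolution, which is what the duality theorem predicts: the cost function carries the same information as the production function.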

## GARCH-Constant Conditional Correlation (CCC)

This post details a multivariate GARCH Constant Conditional Correlation (CCC) model. I was somewhat surprised not to find a good Python implementation of GARCH-CCC, so I wrote my own; see the documentation on frds.io. It performs very well and often produces (marginally) better estimates than Stata, judged by log-likelihood.

## GARCH Estimation

This post details the GARCH(1,1) model and its manual estimation in Python, compared with estimation using libraries and in Stata. For GJR-GARCH(1,1), see my documentation on frds.io.
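As a minimal sketch of what "manual" estimation involves, the code below simulates a zero-mean GARCH(1,1) series and evaluates the Gaussian log-likelihood for a given parameter vector; estimation then amounts to maximizing this function with a numerical optimizer. The function names are hypothetical, the parameter values are illustrative, and starting the variance recursion at the sample variance is an assumption (one of several common choices).

```python
import numpy as np


def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate zero-mean GARCH(1,1) residuals with standard normal shocks."""
    rng = np.random.default_rng(seed)
    eps = np.empty(n)
    sigma2 = np.empty(n)
    sigma2[0] = omega / (1.0 - alpha - beta)  # unconditional variance
    eps[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
    for t in range(1, n):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    return eps


def garch11_loglik(params, eps):
    """Gaussian log-likelihood of residuals under GARCH(1,1)."""
    omega, alpha, beta = params
    n = len(eps)
    sigma2 = np.empty(n)
    sigma2[0] = np.var(eps)  # assumed starting value: sample variance
    for t in range(1, n):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    return float(-0.5 * np.sum(np.log(2.0 * np.pi) + np.log(sigma2) + eps**2 / sigma2))


eps = simulate_garch11(2000, omega=0.1, alpha=0.1, beta=0.8, seed=1)
ll_true = garch11_loglik((0.1, 0.1, 0.8), eps)
ll_bad = garch11_loglik((0.5, 0.1, 0.8), eps)  # omega far too large
print(ll_true, ll_bad)
```

In practice one would pass the negative of `garch11_loglik` to an optimizer with the constraints omega > 0, alpha ≥ 0, beta ≥ 0 and alpha + beta < 1.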

This post documents how to download SEC filings from EDGAR using edgar-analyzer, a Python program I wrote. It features:

• Only 3 commands to download any type of filing for any period of time

## Difference-in-Differences Estimation

Empirical researchers have been using difference-in-differences (DiD) estimation to identify an event's Average Treatment effect on the Treated entities (ATT). This post is a non-technical note on my understanding of the DiD approach as it has evolved over the past years, especially on the problems and solutions when multiple treatment events are staggered.
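For reference, the canonical two-group, two-period case can be sketched in a few lines: the DiD estimate is the change in the treated group's mean outcome minus the change in the control group's mean outcome, which identifies the ATT under parallel trends. The function name and the toy numbers are made up, and this sketch deliberately ignores the staggered-treatment issues the post is concerned with.

```python
import numpy as np


def did_2x2(y, treated, post):
    """Two-group, two-period difference-in-differences (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    post = np.asarray(post, dtype=bool)
    # Before/after change in mean outcome for each group.
    change_treated = y[treated & post].mean() - y[treated & ~post].mean()
    change_control = y[~treated & post].mean() - y[~treated & ~post].mean()
    return change_treated - change_control


# Toy data: treated units move from 2.0 to 3.5, control units from 1.0 to 1.5.
y = [2.0, 2.0, 3.5, 3.5, 1.0, 1.0, 1.5, 1.5]
treated = [1, 1, 1, 1, 0, 0, 0, 0]
post = [0, 0, 1, 1, 0, 0, 1, 1]

att_hat = did_2x2(y, treated, post)
print(att_hat)
```

Here the treated group's change is 1.5 and the control group's change is 0.5, so the estimate is their difference.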

## CRSP Missing Codes

A note on the missing codes in CRSP.

## FRED - Federal Reserve Economic Data

Since Stata 15, we can search, browse and import almost a million U.S. and international economic and financial time series made available by the St. Louis Federal Reserve's Federal Reserve Economic Data (FRED). This post briefly explains this great feature.

## Correlated Random Effects

Can we estimate the coefficient of gender while controlling for individual fixed effects? This sounds impossible as an individual's gender typically does not vary and hence would be absorbed by individual fixed effects. However, Correlated Random Effects (CRE) may actually help.

At last year's FMA Annual Meeting, I learned this CRE estimation technique when discussing a paper titled "Gender Gap in Returns to Publications" by Piotr Spiewanowski, Ivan Stetsyuk and Oleksandr Talavera. Let me recall what I learned and summarize the technique in this post.
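To give a flavour of the idea, here is a minimal simulation sketch of one common CRE formulation (the Mundlak approach): instead of absorbing the individual effect with dummies, add each individual's time average of the time-varying covariate as a regressor, which leaves room for a time-invariant variable such as gender. All names and parameter values are made up; the sketch uses pooled OLS rather than random-effects GLS for simplicity, and it assumes that whatever remains of the individual effect after conditioning on the time average is unrelated to gender.

```python
import numpy as np

rng = np.random.default_rng(42)
n_ind, n_per = 2000, 5
beta, gamma, delta = 1.0, -0.5, 0.8  # illustrative true values

# Time-invariant dummy (e.g. gender) and a time-varying covariate x.
female = rng.integers(0, 2, n_ind).astype(float)
x = rng.normal(0.0, 1.0, (n_ind, 1)) + rng.normal(0.0, 1.0, (n_ind, n_per))
x_bar = x.mean(axis=1)  # individual time averages (the Mundlak terms)

# Individual effect correlated with x through its time average.
c = delta * x_bar + rng.normal(0.0, 0.5, n_ind)
y = beta * x + gamma * female[:, None] + c[:, None] + rng.normal(0.0, 0.5, (n_ind, n_per))

# Pooled OLS of y on [1, x, x_bar, female]; rows are stacked individual by individual.
X = np.column_stack(
    [
        np.ones(n_ind * n_per),
        x.ravel(),
        np.repeat(x_bar, n_per),
        np.repeat(female, n_per),
    ]
)
coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
_, beta_hat, delta_hat, gamma_hat = coef
print(beta_hat, delta_hat, gamma_hat)
```

The coefficient on the dummy is estimable here because the individual effect is modelled through `x_bar` rather than swept out, which is exactly what a fixed-effects regression cannot offer.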