Compute Weekly Return from Daily CRSP Data¶
Computing the weekly returns from the CRSP daily stock data is a common task but may be tricky sometimes. Let's discuss a few different ways to get it done incorrectly and correctly.
TL;DR Take me to the final solution!
Surely -> The solution
INCORRECT ways¶
Let me start with a few incorrect ways, which may seem perfectly okay at first glance. This part is important because it shows you how a small mistake can lead to hard-to-discover bugs.
Weekly index return from daily data¶
Using intnx()
, we can derive the Friday of the week given a date, as shown
below.
Note that intnx("weekday.6", date, 0)
will give the last Friday, which is
not what we want. We want the next Friday of the week for a given date, so
we use intnx("weekday.6", date, 1)
. The case...when...
statement ensures
that if the given date is already a Friday, we don't go for the next one. Below
is a sample output of the mktret_weekly
table generated.
Example output of mktret_weekly
Obs | Date | Year | Week | FridayOfWeek | mktret |
---|---|---|---|---|---|
1 | 19860102 | 1986 | 0 | 03JAN1986 | -0.1893222 |
2 | 19860103 | 1986 | 0 | 03JAN1986 | -0.1893222 |
3 | 19860106 | 1986 | 1 | 10JAN1986 | -2.333080418 |
4 | 19860107 | 1986 | 1 | 10JAN1986 | -2.333080418 |
5 | 19860108 | 1986 | 1 | 10JAN1986 | -2.333080418 |
6 | 19860109 | 1986 | 1 | 10JAN1986 | -2.333080418 |
7 | 19860110 | 1986 | 1 | 10JAN1986 | -2.333080418 |
8 | 19860113 | 1986 | 2 | 17JAN1986 | 1.1992620931 |
9 | 19860114 | 1986 | 2 | 17JAN1986 | 1.1992620931 |
We can verify that the FridayOfWeek
indeed gives the Friday of the week.
Therefore, the final weekly dataset using Friday as the date identifier just
need to keep FridayOfWeek
and mktret
.
Example output of mktret_weekly
Obs | date | mktret |
---|---|---|
1 | 03JAN1986 | -0.1893222 |
2 | 10JAN1986 | -2.333080418 |
3 | 17JAN1986 | 1.1992620931 |
4 | 24JAN1986 | -0.959555101 |
5 | 31JAN1986 | 2.5916781551 |
6 | 07FEB1986 | 1.3126828796 |
Note here that it's tempting to use having weekday(date)=6
to make sure the
dates are all Friday. However, if Friday in a week is not the last trading day,
then the weekly return will be missing. This is why here I use date=max(date)
to ensure non-missing weekly returns. The date is the last trading day in any
given week, consistent with the CRSP's daily stock file.
The caveat here is that since the dates are the weekly last trading days, when merged with other weekly datasets, you should be very careful about whether the other dataset is using Friday or the last trading day per week as its date variable.
Weekly stock return from daily data¶
Following the same logic, we can calculate the weekly stock returns from daily CRSP data, where dates are aligned to the Friday of the week.
What's wrong?¶
The code above seems okay. We know that CRSP daily stock file contains many observations where the daily trading volume is 0, in which case the price is recorded as the negative bid-ask midpoint. Therefore, we restrict to only those with positive stock prices. So what's the problem?
The problem is that a week can span two calendar years.
For example, check out the last week of 2019:
Mon | Tue | Wed | Thu | Fri | Sat | Sun |
---|---|---|---|---|---|---|
30 | 31 | 1 | 2 | 3 | 4 | 5 |
- Dec30 and Dec31 belong to week 53 of 2019, while the code above will use these two days' returns to compute the weekly return and align the date to Jan03 of 2020.
- Jan01 to Jan03 belong to week 0 of 2020, so the code above will use these three days' returns to compute the weekly return and align the date to Jan03 of 2020.
Now we have a mistake. A single week is broken into two because of the use of
week()
function in SAS. Another consequence is that when there're many years
of data, there will be a lot of duplicates.
CORRECT ways¶
Now let's explore two ways that avoid this mistake. Although both generate the same result (there can be a few differences, see the caveat), the second one is much faster.
1. Start with a list of dates (slow version)¶
Now we can write some correct code to compute the weekly returns. We'll generate a series of Fridays first, then we merge based on the past 5 calendar days. This will ensure all trading days with non-missing data will be included in the weekly return calculation, and correct the mistake mentioned above.
Note that this version is inefficient and takes a long time to run.
2. Group using aligned dates (fast version with caveat)¶
This version uses a similar logic from the previous incorrect
one, but it groups based on the aligned
dates instead of year(date)
and week(date)
.
Caveat
If the beginning and ending dates, "01Jan1986"d and "31Dec2019"d
in the
example, are not Fridays, then the first and last weekly returns for all
stocks will be incorrect, because they are not using all the daily data in
those weeks.
To fix this minor issue, simply extend the beginning and ending dates beyond your sample period by a few weeks.