A Simple Event Study

1. Example
2. Overview
3. Implementation
4. Hypothesis Test
- 4.1. Test Statistic
- 4.2. Test for Significantly Different Means
5. Counterarguments

1. Example

Here we are going to cover how to test for a differential effect of one event on two assets. As an example, our event will be the results of the 2020 Presidential election, and our assets will be oil (XLE) and tech (XLK) ETFs. Our null hypothesis is:

There is no difference in the reactions of XLE and XLK to the outcome of the 2020 Presidential election.

If we are able to reject the null then we have found evidence in favor of a different reaction.

2. Overview

To calculate the effect of the event, we need to be able to calculate what each asset would have done if the event had not occurred. To do this, in an event study there is an estimation period and an event period.

Estimation period:

Begins at a date of your choosing and ends, say, 10 days prior to the event.
Estimate a market model regression (\(R_{E} = \alpha + \beta R_M + e\)) for each ETF over this period, and save the alpha and beta coefficients. \(R_E\) is the return on the ETF, and \(R_M\) is the return on the market.

Event period:

Calculate daily abnormal returns for each ETF. Abnormal returns are actual returns minus expected returns, \(R_E - \alpha - \beta R_M\), where the alpha and beta are the estimates for that particular ETF.
Sum the daily abnormal returns into Cumulative Abnormal Returns from 10 days before the event to 10 days after, CAR(-10, 10).

Once you have the CARs for each ETF you can run a t-test for significantly different CARs.
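
The steps above can be sketched end to end in plain Python, without downloading any data, so that the arithmetic is visible. Everything in this sketch is an illustrative assumption: the helper names `fit_market_model` and `cumulative_abnormal_return` are hypothetical, and the return series are made up so that the asset tracks the market exactly apart from one injected shock.

```python
def fit_market_model(asset, market):
    """OLS estimates (alpha, beta) of asset = alpha + beta * market + e."""
    n = len(market)
    mean_m = sum(market) / n
    mean_a = sum(asset) / n
    cov = sum((m - mean_m) * (a - mean_a) for m, a in zip(market, asset))
    var = sum((m - mean_m) ** 2 for m in market)
    beta = cov / var
    alpha = mean_a - beta * mean_m
    return alpha, beta


def cumulative_abnormal_return(asset, market, alpha, beta):
    """Sum of daily abnormal returns (actual minus expected) over the event period."""
    return sum(a - alpha - beta * m for a, m in zip(asset, market))


# Made-up estimation-period returns: the asset follows 0.001 + 1.5 * market exactly.
est_market = [0.01, -0.02, 0.005, 0.015, -0.01]
est_asset = [0.001 + 1.5 * m for m in est_market]
alpha, beta = fit_market_model(est_asset, est_market)

# Made-up event-period returns: same relationship plus a 2% shock on the second day.
evt_market = [0.004, -0.006, 0.002]
evt_asset = [0.001 + 1.5 * m for m in evt_market]
evt_asset[1] += 0.02

car = cumulative_abnormal_return(evt_asset, evt_market, alpha, beta)
print(round(alpha, 6), round(beta, 6), round(car, 6))
```

Because the toy asset has no noise, the regression recovers the alpha and beta used to build it, and the CAR equals the injected 2% shock. With real data the CAR also contains noise, which is why a test statistic is needed.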

3. Implementation

We'll implement the event study in Python, though R is arguably a better choice of language given its built-in statistical capabilities. You can also do this in Excel, though it is more tedious.

First let's load the libraries we are going to use:

import yfinance as yf
from time import sleep
import pandas as pd
import numpy as np
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

3.1. Data Preparation

Now we can pull the data and convert it into returns:

spy = yf.Ticker("SPY")
spy_price = spy.history(period="max")
sleep(5)
xle = yf.Ticker("XLE")
xle_price = xle.history(period="max")
sleep(5)
xlk = yf.Ticker("XLK")
xlk_price = xlk.history(period="max")

Convert to returns:

spy_ret = spy_price["Close"].pct_change()[1:]
xle_ret = xle_price["Close"].pct_change()[1:]
xlk_ret = xlk_price["Close"].pct_change()[1:]
all_rets = pd.concat([spy_ret, xle_ret, xlk_ret], axis=1)
all_rets.columns = ["spy", "xle", "xlk"]

Split into estimation and event windows:

est = all_rets.loc['2020-09-01':'2020-10-19']
event = all_rets.loc['2020-10-20':'2020-11-16']

Let's look at the cumulative returns over the estimation and event windows. This is not required, though it is good to take a look at your data from time to time.
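
One way to compute the cumulative returns mentioned above is to compound the daily returns. The following is a minimal sketch in plain Python on made-up numbers; the helper name `cumulative_returns` is hypothetical. On the DataFrames defined earlier, the equivalent pandas expression would be `(1 + est).cumprod() - 1`, which can then be plotted.

```python
from itertools import accumulate
from operator import mul


def cumulative_returns(daily_returns):
    """Compound daily simple returns into a running cumulative return."""
    growth = accumulate((1 + r for r in daily_returns), mul)
    return [g - 1 for g in growth]


# Made-up daily returns: +10%, then -10%, then +5%.
cum = cumulative_returns([0.10, -0.10, 0.05])
print([round(c, 4) for c in cum])
```

Note that the second value is negative: a 10% gain followed by a 10% loss leaves you 1% down, which is why returns are compounded here rather than summed.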

3.2. Regression Estimation

3.2.1. XLE

\[R_{XLE} = \alpha_{XLE} + \beta_{XLE} R_M + e\]

Where \(R_{XLE}\) and \(R_M\) are the returns on the XLE and SPY. Note, \(R_M\) denotes the return on the market portfolio, and we generally use the SPY ETF to proxy for the market.

xle_reg = LinearRegression().fit(est["spy"].values.reshape(-1,1), est["xle"])
xle_alpha = xle_reg.intercept_
xle_beta = xle_reg.coef_[0]
print(xle_alpha)
print(xle_beta)
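xle_pred = xle_reg.predict(est["spy"].values.reshape(-1, 1))
# Take the square root explicitly; the squared=False argument is deprecated in recent scikit-learn releases
xle_rmse = np.sqrt(mean_squared_error(y_true=est["xle"], y_pred=xle_pred))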
print(xle_rmse)

-0.004507238730465367
0.7730119705798506
0.017148482011230558

3.2.2. XLK

\[R_{XLK} = \alpha_{XLK} + \beta_{XLK} R_M + e\]

Where \(R_{XLK}\) and \(R_M\) are the returns on the XLK and SPY.

xlk_reg = LinearRegression().fit(est["spy"].values.reshape(-1,1), est["xlk"])
xlk_alpha = xlk_reg.intercept_
xlk_beta = xlk_reg.coef_[0]
print(xlk_alpha)
print(xlk_beta)
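xlk_pred = xlk_reg.predict(est["spy"].values.reshape(-1, 1))
# Take the square root explicitly; the squared=False argument is deprecated in recent scikit-learn releases
xlk_rmse = np.sqrt(mean_squared_error(y_true=est["xlk"], y_pred=xlk_pred))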
print(xlk_rmse)

-0.00027787940826697445
1.4045779196999413
0.00659011881272826

3.3. Event Abnormal Returns

3.3.1. XLE

xle_ars = event["xle"] - xle_alpha - xle_beta * event["spy"]
#print(xle_ars)
xle_cars = xle_ars.cumsum()
print(xle_cars)

Date
2020-10-20    0.013255
2020-10-21    0.000162
2020-10-22    0.041684
2020-10-23    0.038655
2020-10-26    0.021580
2020-10-27    0.015783
2020-10-28    0.004871
2020-10-29    0.032197
2020-10-30    0.050365
2020-11-02    0.080331
2020-11-03    0.065466
2020-11-04    0.054051
2020-11-05    0.043483
2020-11-06    0.026523
2020-11-09    0.164075
2020-11-10    0.202081
2020-11-11    0.190591
2020-11-12    0.170330
2020-11-13    0.200538
2020-11-16    0.261216
dtype: float64

3.3.2. XLK

xlk_ars = event["xlk"] - xlk_alpha - xlk_beta * event["spy"]
#print(xlk_ars)
xlk_cars = xlk_ars.cumsum()
print(xlk_cars)

Date
2020-10-20   -0.001900
2020-10-21   -0.000472
2020-10-22   -0.012600
2020-10-23   -0.018187
2020-10-26   -0.013651
2020-10-27   -0.003526
2020-10-28    0.002337
2020-10-29    0.005020
2020-10-30   -0.002200
2020-11-02   -0.015139
2020-11-03   -0.022025
2020-11-04   -0.014235
2020-11-05   -0.010286
2020-11-06   -0.006138
2020-11-09   -0.030748
2020-11-10   -0.047051
2020-11-11   -0.033484
2020-11-12   -0.028487
2020-11-13   -0.039168
2020-11-16   -0.046769
dtype: float64

4. Hypothesis Test

4.1. Test Statistic

scar_xle = xle_cars.values[-1] / np.sqrt((xle_cars.shape[0] * xle_ars.values.var()))
print(scar_xle)

1.5797904365289115

scar_xlk = xlk_cars.values[-1] / np.sqrt((xlk_cars.shape[0] * xlk_ars.values.var()))
print(scar_xlk)

-1.086763354109323

These test statistics follow a t-distribution with degrees of freedom equal to the length of the event window minus two. With a longer event window (say, over 50 days) we could use a standard normal instead. We can use them to test whether each CAR is significantly different from 0.
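
As a sketch of that test, the snippet below recomputes a standardized CAR in plain Python and compares the two statistics printed above with two-sided critical values for 18 degrees of freedom (20 event days minus two). The critical values 2.101 (5%) and 2.878 (1%) are copied from a standard t-table, and the helper names `standardized_car` and `is_significant` are hypothetical.

```python
from math import sqrt
from statistics import pvariance

# Two-sided critical values of the t-distribution with 18 degrees of freedom,
# copied from a standard t-table (assumption: 20 event days minus two).
T_CRIT_18 = {0.05: 2.101, 0.01: 2.878}


def standardized_car(abnormal_returns):
    """CAR divided by sqrt(n * population variance of the daily abnormal returns)."""
    n = len(abnormal_returns)
    car = sum(abnormal_returns)
    return car / sqrt(n * pvariance(abnormal_returns))


def is_significant(stat, level=0.05):
    """True if |stat| exceeds the two-sided critical value at the given level."""
    return abs(stat) > T_CRIT_18[level]


# Toy check of the formula: CAR = 0.04, variance = 0.0001, n = 4.
toy_stat = standardized_car([0.02, 0.0, 0.02, 0.0])

# The statistics printed above for XLE and XLK.
results = {}
for name, stat in [("XLE", 1.5797904365289115), ("XLK", -1.086763354109323)]:
    results[name] = is_significant(stat)
    print(name, results[name])
```

Both lines print False: on this 20-day window, neither CAR is individually distinguishable from zero at the 5% level. `pvariance` is the population variance, matching the default of the NumPy `.var()` call used above.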

4.2. Test for Significantly Different Means

\[t = \frac{CAR_{XLE} - CAR_{XLK}}{\sqrt{\frac{\hat{\sigma}_{XLE}^{2} + \hat{\sigma}_{XLK}^{2}}{2}}\sqrt{\frac{2}{n}}}\]

# The pooled variance adds the two CAR variances (the formula above uses a plus sign)
n = xle_cars.shape[0]
test_stat = (xle_cars.values[-1] - xlk_cars.values[-1]) / (np.sqrt((n * xle_ars.values.var() + n * xlk_ars.values.var()) / 2) * np.sqrt(2 / n))
print(round(test_stat, 2))

8.06

This is significant at better than the 1% level (see a table of critical values for the t-distribution). So we conclude that our evidence indicates that XLE outperformed XLK around the 2020 election results.

5. Counterarguments

But take a look at the daily CARs: XLE appears to begin outperforming on 11/9, which is about 6 days after the election results were released. So I think the event window is too long. We should calculate CAR(-5, 5) and re-test for significance.
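
A minimal sketch of that re-test, assuming the daily abnormal returns are held in plain lists ordered by trading day. The function names `window` and `car_difference_stat` and the toy numbers are hypothetical. The statistic follows the formula in section 4.2, with each \(\hat{\sigma}^2\) taken as the window length times the variance of the daily abnormal returns.

```python
from math import sqrt
from statistics import pvariance


def window(values, event_index, before, after):
    """Slice the values from `before` days before to `after` days after the event."""
    start = event_index - before
    if start < 0 or event_index + after >= len(values):
        raise ValueError("window extends beyond the available data")
    return values[start:event_index + after + 1]


def car_difference_stat(ars_a, ars_b):
    """t-statistic for CAR_a - CAR_b using the pooled CAR variances."""
    n = len(ars_a)
    var_a = n * pvariance(ars_a)  # variance of asset a's CAR
    var_b = n * pvariance(ars_b)  # variance of asset b's CAR
    pooled_sd = sqrt((var_a + var_b) / 2)
    return (sum(ars_a) - sum(ars_b)) / (pooled_sd * sqrt(2 / n))


# Toy abnormal returns for 7 trading days; the event is at index 3.
ars_a = [0.00, 0.01, 0.01, 0.03, 0.01, 0.02, 0.00]
ars_b = [0.00, 0.00, 0.01, 0.00, 0.01, 0.00, 0.00]

# CAR(-1, 1): one day before the event to one day after.
short_a = window(ars_a, 3, 1, 1)
short_b = window(ars_b, 3, 1, 1)
stat = car_difference_stat(short_a, short_b)
print(short_a, short_b, round(stat, 3))
```

With the pandas Series used in this article, the same slice would be `xle_ars.iloc[i - 5 : i + 6]` for CAR(-5, 5), where `i` is the integer position of the event date. The degrees of freedom for the t-test shrink along with the window.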