library(tidyverse)

soccer <- read_csv("https://raw.githubusercontent.com/cobriant/teaching-datasets/refs/heads/main/soccer.csv") %>%
  mutate(
    # "tie" is listed first, so it becomes the reference level in the regressions
    outcome = factor(outcome, levels = c("tie", "win", "lose")),
    year = year(date)
  ) %>%
  # Flag each team's last listed match in a tournament-year:
  # a loss there counts as elimination, a win as winning the tournament
  group_by(year, tournament, team) %>%
  mutate(
    eliminated = outcome == "lose" & date == last(date),
    won_tournament = outcome == "win" & date == last(date)
  ) %>%
  ungroup()

9 Edmans, Garcia, and Norli (2007)
In class today, we’ll replicate the paper Sports Sentiment and Stock Returns by Alex Edmans, Diego García, and Øyvind Norli.
The paper asks whether people’s moods influence financial markets. When a country’s national soccer team wins an important match, fans experience a boost to their mood. That mood shock might spill over into their financial decision-making: they may become more optimistic about the future, more confident in their judgments, and more willing to take risks.
As a result, they may be more likely to buy stocks rather than hold safer assets, increasing demand and pushing stock prices up. Since returns reflect changes in prices, this could lead to higher stock returns following wins.
What about when teams lose an important match? The opposite pattern might occur: worse mood leads to more risk aversion, selling pressure, and lower returns.
To test this idea, the authors get data from two sources:
- International soccer match results, focusing on emotionally significant games like World Cup matches.
- Stock market data for the corresponding countries: daily returns on broad indices (like the S&P 500 for the US).
The empirical strategy is to compare stock returns immediately following wins and losses. In this assignment, you’ll work with similar data and methods to recreate their main finding: that something as seemingly unrelated as a soccer game can have measurable effects on financial markets.
There are two sources of data for this replication:
- International soccer match results
- Daily national stock-market returns
1) Soccer Results
I started with this data set, International soccer scores since 1872 (Kaggle/GitHub), and prepared it a little for you.
Explore the soccer data set by writing queries or visualizations to answer each of these questions:
# 1. What are the variable names?
soccer %>% ____
# 2. How many observations are there?
soccer %>% ____
# 3. What is the minimum and maximum date?
soccer %>%
  ____(date_min = min(date), date_max = max(date))
# 4. What countries are in the data set?
soccer %>%
  count(____)
# 5. What years had a FIFA World Cup?
soccer %>%
  filter(____) %>%
  count(year = year(____))
# 6. Visualize the distribution of outcomes (win, lose, and tie) for Brazil.
soccer %>%
  filter(team == ____) %>%
  ggplot(____(x = ____)) +
  geom_____()

2) Stock Market Returns (1973-present)
The second data set we’ll need is a daily national equity index for as many countries as possible.
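For intuition about what a daily return measures, here is a minimal base-R sketch using made-up index levels. It assumes simple (not log) returns; the vector `index_level` and the helper `simple_return` are hypothetical names for illustration and are not part of the course data, which may define returns differently.

```r
# Hypothetical closing levels of an equity index on four consecutive trading days
index_level <- c(100, 102, 99.96, 99.96)

# Simple return: the change from the previous day's close, as a fraction of
# that previous close. The first day has no previous close, so its return is NA.
simple_return <- function(level) {
  c(NA, diff(level) / head(level, -1))
}

ret <- simple_return(index_level)
print(round(ret, 4))
```

A positive value means the index closed higher than on the previous trading day, a negative value means it closed lower, and zero means it was unchanged.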
Explore the stocks data set by writing queries or visualizations to answer each of these questions:
stocks <- read_csv("international_stocks.csv")
# 7) What countries are we able to get national equity indices for?
stocks %>% count(____)
# 8) How many Sundays, Mondays, etc. are in `stocks`?
# Which days of the week are underrepresented?
stocks %>% count(____)
# 9) Create a "global equity index" that averages national equity indices for each day.
stocks <- stocks %>%
  group_by(____) %>%
  mutate(global_index = mean(____)) %>%
  ungroup()
# 10) Visualize the global equity index over time. Add red vertical lines for:
# - Black Monday (Oct 19, 1987): the largest one-day percentage drop in the
#   history of the Dow Jones Industrial Average
# - Global financial crisis (September 2008)
# - COVID-19 crash (March 2020)
stocks %>%
  ggplot(aes(x = ____, y = ____)) +
  geom_vline(color = ____, xintercept = ymd(____), linewidth = 2, alpha = .5) +
  geom_line()
# 11) Visualize the global equity index over time, this time assuming an
# initial investment of $1000, with returns compounding. Include the same
# red vertical lines as in question 10.
stocks %>%
  distinct(date, global_index) %>%
  arrange(date) %>%
  mutate(compounded_return = 1000 * ____(1 + global_index)) %>%
  ggplot(aes(x = ____, y = compounded_return)) +
  geom_vline(color = ____, xintercept = ymd(____), linewidth = 2, alpha = .5) +
  geom_line()

3) Join the two data sets
We’ll join soccer matches to the next available stock-market return for the same country.
# 12) Fill in the blanks to create a new data set `joined_data`. What 2 variables do you need to join by?
joined_data <- soccer %>%
  filter(date >= ymd(19860701)) %>%
  mutate(
    weekday = wday(date, label = TRUE),
    # Markets are closed on weekends, so roll the match date forward
    # to the next trading day
    next_trading_day = case_when(
      weekday == "Fri" ~ date %m+% days(____),
      weekday == "Sat" ~ date %m+% days(____),
      .default = date %m+% days(____)
    )
  ) %>%
  left_join(
    stocks %>%
      mutate(country = str_to_title(country)) %>%
      # Recompute the daily global index, then express each country's
      # return as an excess return over that index
      group_by(date) %>%
      mutate(global_index = mean(ret)) %>%
      ungroup() %>%
      mutate(ret = ret - global_index),
    join_by(____ == ____, ____ == ____)
  ) %>%
  select(date, tournament, team, outcome, next_ret = ret, global_index, won_tournament, eliminated) %>%
  drop_na(next_ret)

4) Visualize the Main Result
# 13) Create a ggplot to visualize how a win, loss, or tie affects
# the next day's stock market returns in a country. Is there evidence
# of a pattern here?
joined_data %>%
  ggplot(aes(x = ____, y = ____)) +
  geom_boxplot() +
  labs(
    title = "Next-Day Stock Returns After Soccer Matches",
    x = "Match outcome",
    y = "Next trading day's excess return"
  )

5) Estimate the Main Result
Estimate the effect of a game’s outcome on the next trading day’s returns. Control for team.
In these regressions, tie is the omitted category of outcome (it is the first level of the factor), which avoids the dummy variable trap. That means the coefficients on outcomelose and outcomewin tell us how next-day stock returns differ after losses or wins, compared to ties.
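To see what an omitted category means in practice, here is a minimal base-R sketch on made-up numbers. The data frame `toy` and its values are hypothetical, and the model leaves out the team control to keep the arithmetic visible.

```r
# Made-up next-day returns (in percent) for six hypothetical matches
toy <- data.frame(
  outcome = factor(
    c("tie", "tie", "win", "win", "lose", "lose"),
    levels = c("tie", "win", "lose")  # "tie" comes first, so it is the reference
  ),
  next_ret = c(0.10, 0.30, 0.40, 0.60, -0.30, -0.10)
)

fit <- lm(next_ret ~ outcome, data = toy)
coefs <- coef(fit)
print(round(coefs, 2))

# Compare with the raw group means: each outcome's mean return minus
# the mean return after ties
group_means <- tapply(toy$next_ret, toy$outcome, mean)
print(round(group_means - group_means[["tie"]], 2))
```

With only the outcome dummies in the model, the intercept is the mean return after ties, and each coefficient is the gap between that outcome's mean return and the mean after ties. Once team is added as a control, the coefficients are still read relative to ties, but they no longer equal raw differences in means.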
joined_data %>%
  lm(next_ret ~ outcome + team, data = .) %>%
  broom::tidy()
# 14) Interpret these results:
#
# A loss (increases/decreases) the stock market returns in
# a country by _____, which (is/is not) statistically significant
# at the 5% level.
# A win (increases/decreases) the stock market returns in
# a country by _____, which (is/is not) statistically significant
# at the 5% level.
# 15) What do these results say?
joined_data %>%
  lm(next_ret ~ eliminated + team, data = .) %>%
  broom::tidy()
# 16) What do these results say?
joined_data %>%
  lm(next_ret ~ won_tournament + team, data = .) %>%
  broom::tidy()

Download this assignment
Here’s a link to download this assignment.