---
title: "10 Functional Programming with map()"
format:
  html:
    self-contained: TRUE
---

By the end of this assignment, you should be able to:

- Explain what an anonymous function is and why we use it.
- Use `ggplot() + stat_function()` with anonymous functions as a "graphing calculator."
- Use `map()` to repeat a task many times and combine the results into a tibble.
- Run an OLS simulation.

Run this code to attach the tidyverse to your current session and get started.

```{r, message = FALSE}
library(tidyverse)
```

## Anonymous Functions

An anonymous function is a function without a name. You write it when you want a quick "one-time" function.

```{r, eval = FALSE}
# do not run

function(input) {
  # do something with input
}
```

### Example: Apply a function immediately

This creates a function `function(x) {3 * x}` and immediately calls it with `x = 9`.

```{r}
(function(x){3 * x})(9)
```

(Of course, if you want to know what 3 * 9 is, `3 * 9` is a much simpler way of solving the problem.)
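
To see why anonymous functions are convenient, here is a minimal sketch comparing a named function with two anonymous equivalents. The name `add_one` and the input values are made up for illustration, and the `\(x)` shorthand assumes R 4.1.0 or later.

```{r}
# Named version: the function is stored under the (hypothetical) name `add_one`.
add_one <- function(x) {
  x + 1
}
named_result <- add_one(c(10, 20, 30))

# Anonymous version: same body, never given a name, called right away.
anonymous_result <- (function(x) {x + 1})(c(10, 20, 30))

# Shorthand version: `\(x)` is an abbreviation for `function(x)` (R >= 4.1.0).
shorthand_result <- (\(x) x + 1)(c(10, 20, 30))

named_result
anonymous_result
shorthand_result
```

All three give the same result. The anonymous versions simply skip the step of naming a function you only plan to use once.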

### Question 1

Write an anonymous function that divides any input by 3, and then call that function on the vector `c(1, 3, 9)`.

```{r}

```

## R as a Graphing Calculator

You can graph functions with ggplot by using `stat_function()` paired with an anonymous function.

```{r}
ggplot() +
  stat_function(fun = function(x) 2 * x + 1, color = "red")
```

Zoom in or out using `xlim()` and `ylim()`.

```{r}
ggplot() +
  stat_function(fun = function(x) 2 * x + 1, color = "red") +
  xlim(0, 3) +
  ylim(0, 6)
```
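
You can also layer several `stat_function()` calls on one plot. This sketch, with functions and colors chosen only for illustration, draws a sine curve and a straight line together; a single `xlim()` sets the x range for both.

```{r}
library(ggplot2)

# Each stat_function() layer gets its own anonymous function and color.
two_curves <- ggplot() +
  stat_function(fun = function(x) sin(x), color = "red") +
  stat_function(fun = function(x) 0.5 * x, color = "darkgreen") +
  xlim(-5, 5)

two_curves
```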

### Question 2

Use ggplot to graph the function $y = x^2 + 4x - 1$ in blue. Set x and y limits appropriately.

```{r}

```

## map() for a Simple Task

### Why `map()`?

Sometimes you want to do the same task many times, once for each element of a list or vector. `map()` maps each input to an output, as defined by the function you give it. Its two main arguments are:

- `.x`: a vector or list of inputs.
- `.f`: the function you want to call on each element of `.x`.

```{r, eval = FALSE}
# do not run
map(.x = <vector_or_list>, .f = <function>)
```
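
As a minimal sketch of that call shape, here `.x` is a list rather than a vector. The list `scores` and its contents are invented for illustration; `.f` is an anonymous function that takes one element of the list (a numeric vector) and returns its mean.

```{r}
library(purrr)

# Hypothetical inputs: a named list whose elements have different lengths.
scores <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)

# map() calls the anonymous function once per list element.
means <- map(
  .x = scores,
  .f = function(v) {
    mean(v)
  }
)

means
```

The output keeps the names of the input list, with one result per element.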

### Example: square a bunch of numbers

A silly example: use `map()` to square every number from 1 to 5. `.x` is the vector `1:5`; `.f` is a function that squares a number.

```{r}
map(
  .x = 1:5,
  .f = function(number) {
    number^2
  }
)
```

Notice that `map()` returns a **list**: similar to a vector, but much more flexible. Lists can be nested and lists can contain different data types (like a character string as one element and a number as the next). If you want to return a vector at the end, just pipe the result into `as_vector()`.

The example above is silly because many operations, including squaring, are vectorized in R. A much simpler way to do this task is just:

```{r}
(1:5)^2
```
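
To make the list-versus-vector distinction concrete, here is a small sketch. The object names are made up for illustration: `mixed` shows that a list can be nested and hold different types, and `cubes_vector` shows `as_vector()` flattening a `map()` result into an ordinary numeric vector.

```{r}
library(purrr)

# A list can mix types and contain other lists.
mixed <- list("econ", 42, list(TRUE, 1:3))

# map() always returns a list, even when every element is a single number.
cubes_list <- map(.x = 1:3, .f = function(number) {number^3})

# as_vector() collapses that list into a plain vector.
cubes_vector <- as_vector(cubes_list)

str(mixed)
cubes_list
cubes_vector
```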

### Question 3

Use `map()` to divide every number from 1 to 100 by 2. Make sure that the output is a vector.

```{r}

```

Of course, division is also vectorized. So a much simpler way to do this is:

```{r}
(1:100)/2
```

## `map()` OLS Simulation

Now we'll explore a more useful way to use `map()`, on functions that are not vectorized.
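
Before turning to OLS, here is a minimal sketch of the general pattern on a simpler task: repeat a random draw many times and collect the results in a tibble. The column name `draw_mean`, the sample size of 30, the 20 repetitions, and the fixed seed are all assumptions chosen for illustration.

```{r}
library(tidyverse)

set.seed(1) # assumption: fix the seed so the sketch is reproducible

sample_means <- map(
  .x = 1:20,
  .f = function(s) {
    # `s` only counts repetitions; it is not used in the body.
    tibble(draw_mean = mean(rnorm(n = 30, mean = 50, sd = 10)))
  }
) %>%
  bind_rows()

sample_means
```

Each call to the anonymous function returns a one-row tibble, and `bind_rows()` stacks the 20 of them into a single tibble with 20 rows.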

### Question 4

First, I'll generate some random data using `rnorm()`, which generates random numbers from the normal distribution. It takes `n` (number of observations to generate), `mean`, and `sd` (standard deviation). My `x` is pure noise around a mean of 50, and `y` depends partially on `x` and partially on its own random noise (we'd call this "u" or epsilon in econometrics).

Read the code closely: what are the true values for $\beta_0$ and $\beta_1$?

```{r}
tibble(
  x = rnorm(n = 100, mean = 50, sd = 10),
  y = 50 + 5 * x + rnorm(n = 100, mean = 0, sd = 100)
)
```

### Question 5

Take my data set and pipe it into `lm()` to estimate the line of best fit. Do you get estimates that are close to the true values for $\beta_0$ and $\beta_1$? Run the code several times: you'll get new estimates each time because `rnorm()` generates new random numbers on every run.

```{r}
tibble(
  ___
    ) %>%
  lm(___) %>%
  broom::tidy()
```

### Question 6

Now use `map()` to run your code above 100 times, recording the estimate for $\beta_1$ each time. Let your `.x` be `1:100` so the simulation runs 100 times. Let your `.f` take the variable `s` as its input, but don't use `s` in the body of the function: nothing about the simulation should change from one run to the next. Use `slice()` and `select()` to keep only the estimate for $\beta_1$. After `map()`, pipe the result into `bind_rows()` to return a tibble instead of a list.

```{r}
map(
  .x = ___,
  .f = function(s) {
    tibble(
      ___
    ) %>%
      lm(___) %>%
      broom::tidy() %>%
      slice(___) %>%
      select(___)
  }
) %>%
  bind_rows()
```

### Question 7

Copy and paste your code from Question 6, then pipe the resulting tibble into a ggplot histogram to visualize the distribution of the $\beta_1$ estimates you found.

```{r}


```