Activity: simulation

Probability simulation: the birthday problem

Suppose we have a class of 30 students. What is the probability that there is at least one shared birthday?

  • Assume there are 365 days in a year
  • Assume that each day is equally likely as a birthday
  • Assume there are no multiple-birth siblings (e.g. twins, triplets, etc.) in the class

  1. Use a for loop to repeat the experiment nsim = 10000 times, storing the result of each run. What is the estimated probability of at least one shared birthday?

Solution:

For loop approach:

library(tidyverse)

set.seed(213)

days <- 1:365 # days of the year
n_students <- 30

nsim <- 10000
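results <- rep(NA, nsim) # store the simulation results (TRUE = shared birthday)
for (i in seq_len(nsim)) {
  # draw one birthday per student, with replacement so repeats are possible
  birthdays <- sample(days, n_students, replace = TRUE)
  # fewer unique birthdays than students means at least one match
  results[i] <- length(unique(birthdays)) < n_students
}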

mean(results)
[1] 0.7108

Map approach:

set.seed(213)

nsim <- 10000
n_students <- 30

# returns TRUE if at least two of n students share a birthday
birthday_match <- function(n){
  days <- 1:365
  birthdays <- sample(days, n, replace = TRUE)
  length(unique(birthdays)) < n
}

map_lgl(1:nsim, function(i) birthday_match(n_students)) |>
  mean()
[1] 0.7108
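
As a check on the simulated value, the exact probability under the same three assumptions can be computed from the complement, P(no shared birthday). The sketch below is not part of the original activity; the names p_no_match and p_exact are introduced here only for illustration.

```r
n_students <- 30

# P(no shared birthday): student k (k = 0, ..., n - 1) must avoid the
# k birthdays already taken, which has probability (365 - k) / 365
p_no_match <- prod((365 - 0:(n_students - 1)) / 365)

# P(at least one shared birthday) is the complement
p_exact <- 1 - p_no_match
round(p_exact, 4)
```

This gives about 0.7063, so the simulated 0.7108 is within roughly one Monte Carlo standard error (about 0.0045 for nsim = 10000) of the exact value.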

  2. How many students do we need for the probability to be approximately 50%?

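Solution: 23. In the simulations below, 23 is the smallest class size whose estimated probability exceeds 50% (about 0.51, versus about 0.48 for 22 students).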

map_lgl(1:nsim, function(i) birthday_match(22)) |>
  mean()
[1] 0.4842
map_lgl(1:nsim, function(i) birthday_match(23)) |>
  mean()
[1] 0.5116
map_lgl(1:nsim, function(i) birthday_match(24)) |>
  mean()
[1] 0.5399
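
Rather than trying class sizes one at a time, the search can also be done over a range of sizes. The sketch below uses the exact complement formula instead of simulation, so it is a cross-check rather than the activity's own method; p_shared, class_sizes, and probs are hypothetical names, and the upper limit of 40 students is an arbitrary choice.

```r
# Exact probability of at least one shared birthday among n students,
# assuming 365 equally likely days (p_shared is a hypothetical helper)
p_shared <- function(n) {
  1 - prod((365 - seq_len(n) + 1) / 365)
}

class_sizes <- 1:40
probs <- sapply(class_sizes, p_shared)

# smallest class size whose probability reaches 50%
class_sizes[which(probs >= 0.5)[1]]

# exact values to compare with the simulated 0.4842, 0.5116, 0.5399
round(probs[22:24], 4)
```

The first expression returns 23, and the exact probabilities for 22, 23, and 24 students are about 0.4757, 0.5073, and 0.5383, in line with the simulated estimates.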