Practical 1

Introducing purrr and map

  1. Write a function that calculates the square of a number (i.e., \(x^2\)).
  2. Create a vector x containing the numbers 1 to 10.
  3. Use map() to apply your function to each number in the vector.
  4. Repeat, but this time use a temporary function.
library(tidyverse)  # load the tidyverse package (or just purrr)

# Write a function to calculate the square of a number
sq <- function(x) { x^2 }

# Define a vector containing the numbers 1 to 10
x <- 1:10

# Use map() to apply the function to each element of the vector
map(x, sq)

# Use a temporary (anonymous) function to calculate the square of each number
map(x, \(x) x^2)
  1. Modify your previous answer so that the result is returned as a numeric vector instead of a list.
map_dbl(x, \(x) x^2)
 [1]   1   4   9  16  25  36  49  64  81 100
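To make the difference between the two answers concrete, the following minimal sketch compares the return types of map() and map_dbl(). The variable names as_list and as_vec are illustrative only and are not part of the exercise.

```r
library(purrr)

x <- 1:10

# map() always returns a list, with one element per input
as_list <- map(x, \(x) x^2)

# map_dbl() returns a plain double vector of the same length
as_vec <- map_dbl(x, \(x) x^2)

is.list(as_list)   # TRUE
is.double(as_vec)  # TRUE

# Flattening the list gives the same values as map_dbl()
identical(unlist(as_list), as_vec)
```

map_dbl() requires that every call returns a single number; if that cannot be guaranteed, map() is the safer choice.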
  1. Write a vector of names (e.g., c("Alice", "Bob")).
  2. Use map_chr() to convert this vector into greetings (e.g., “Hello, Alice!”).
people <- c("Alice", "Bob", "Charlie")
map_chr(people, \(x) paste0("Hello, ", x, "!"))
[1] "Hello, Alice!"   "Hello, Bob!"     "Hello, Charlie!"
# We can also use `str_glue` for string interpolation:
map_chr(people, \(x) str_glue("Hello, {x}!"))
[1] "Hello, Alice!"   "Hello, Bob!"     "Hello, Charlie!"
  1. Given the below list of numeric vectors, use map_dbl() to compute the mean of each vector.
numbers <- list(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9))
map_dbl(numbers, mean)
[1] 2 5 8
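Passing mean by name is shorthand for wrapping it in a temporary function. The sketch below shows that the two forms agree, and how the temporary-function form leaves room for extra arguments such as na.rm. The list with_na is a hypothetical input that is not part of the exercise.

```r
library(purrr)

numbers <- list(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9))

# Passing the function by name and wrapping it in a temporary function are equivalent
by_name <- map_dbl(numbers, mean)
by_lambda <- map_dbl(numbers, \(v) mean(v))
identical(by_name, by_lambda)

# Hypothetical input containing a missing value
with_na <- list(c(1, NA, 3), c(4, 5, 6))

# The temporary function lets us pass na.rm = TRUE to mean()
na_means <- map_dbl(with_na, \(v) mean(v, na.rm = TRUE))
na_means
```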
  1. Use the code below to generate 10 datasets and save them into a subfolder of your current working directory.
generate_data <- function() {
  # Create a matrix from a vector of 500 random numbers and convert it to a data frame
  d <- as.data.frame(matrix(rnorm(500), ncol = 5))
  # Set the names of the data frame to var1, var2, etc.
  names(d) <- paste0("var", seq_len(ncol(d)))
  return(d)
}

# Create a new folder "datasets" in the current working directory
dir.create("datasets")

# For each letter from A to J, generate a dataset and save it as a CSV file
walk(LETTERS[1:10], \(i) {
  write_csv(generate_data(),
            file = paste0("datasets/", i, ".csv"))
})
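walk() is used here instead of map() because writing a file is a side effect and there is no result worth collecting. As a minimal sketch of that behaviour, the example below writes three small files to a temporary folder and checks what walk() returns. The folder name datasets_demo and the use of base write.csv() are assumptions made to keep the example self-contained.

```r
library(purrr)

# Hypothetical output folder inside the session's temporary directory
out_dir <- file.path(tempdir(), "datasets_demo")
dir.create(out_dir, showWarnings = FALSE)

ids <- LETTERS[1:3]

# walk() calls the function once per element, purely for its side effect
returned <- walk(ids, \(i) {
  write.csv(data.frame(var1 = rnorm(5)),
            file = file.path(out_dir, paste0(i, ".csv")),
            row.names = FALSE)
})

# walk() returns its input invisibly, so it can sit in the middle of a pipeline
identical(returned, ids)

# One CSV file was written per element
n_files <- length(list.files(out_dir, pattern = "\\.csv$"))
n_files
```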

  1. Use map to import all CSV files from this folder as a list of data frames.

  2. Use list_rbind to convert the list into a single data frame. Use the names_to argument to retain the dataset names.

library(fs)  # fs is a powerful, flexible package for working with the file system

result <- dir_ls("datasets") |>   # get a list of files in the "datasets" folder
  map(read_csv) |>                # apply read_csv to each filename
  list_rbind(names_to = "file")   # combine the list of data frames by row-binding

# To extract the dataset name (i.e., A, B, C, ...)
result |>
  mutate(file = str_sub(file, 10, 10))
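Extracting the dataset name by character position works because every path has the form datasets/X.csv with a one-letter name. As a possible alternative, the sketch below uses path_file() and path_ext_remove() from fs, which do not depend on the length of the folder or file name. The paths in files are hypothetical stand-ins for what dir_ls("datasets") returns.

```r
library(fs)

# Hypothetical file paths, shaped like the ones dir_ls("datasets") returns
files <- c("datasets/A.csv", "datasets/B.csv", "datasets/long_name.csv")

# path_file() drops the directory; path_ext_remove() drops the ".csv" extension
dataset_names <- as.character(path_ext_remove(path_file(files)))
dataset_names
```

In the pipeline above, the same idea would replace the str_sub() call with mutate(file = path_ext_remove(path_file(file))).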