Using across()
In this practical, you will learn how to:
Apply functions across multiple columns using across()
Perform row-wise calculations using c_across()
These exercises will use functions and datasets from the tidyverse package.
library(tidyverse)
data(starwars)
We’ll start with an example using the starwars dataset.
Use across inside summarise to calculate the mean of height, mass, and birth_year.
starwars |>
  summarise(
    across(c(height, mass, birth_year), \(x) mean(x, na.rm = TRUE))
  )
# A tibble: 1 × 3
height mass birth_year
<dbl> <dbl> <dbl>
1 175. 97.3 87.6
What happens if you try to calculate the mean of all columns using everything()?
The code below gives warnings because mean is applied to non-numeric columns, for which it returns NA.
starwars |>
  summarise(
    across(everything(), \(x) mean(x, na.rm = TRUE))
  )
Warning: There were 11 warnings in `summarise()`.
The first warning was:
ℹ In argument: `across(everything(), function(x) mean(x, na.rm = TRUE))`.
Caused by warning in `mean.default()`:
! argument is not numeric or logical: returning NA
ℹ Run `dplyr::last_dplyr_warnings()` to see the 10 remaining warnings.
# A tibble: 1 × 14
name height mass hair_color skin_color eye_color birth_year sex gender
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 NA 175. 97.3 NA NA NA 87.6 NA NA
# ℹ 5 more variables: homeworld <dbl>, species <dbl>, films <dbl>,
# vehicles <dbl>, starships <dbl>
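One way to avoid these warnings is to select only the numeric columns before applying mean. The following is a minimal sketch, assuming the starwars data shipped with dplyr; the object name numeric_means is only illustrative.

```r
library(dplyr)

# where(is.numeric) keeps only numeric columns, so mean() is never
# called on character or list columns.
numeric_means <- starwars |>
  summarise(across(where(is.numeric), \(x) mean(x, na.rm = TRUE)))

numeric_means
```

The result has one column per numeric variable (height, mass, and birth_year) and no warnings are raised.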
Repeat the first exercise, but calculate both the mean and the standard deviation (sd).
mean_cc <- function(x) mean(x, na.rm = TRUE)
sd_cc <- function(x) sd(x, na.rm = TRUE)
starwars |>
  summarise(
    across(
      c(height, mass, birth_year),
      list(mean = mean_cc, sd = sd_cc)
    )
  )
# A tibble: 1 × 6
height_mean height_sd mass_mean mass_sd birth_year_mean birth_year_sd
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 175. 34.8 97.3 169. 87.6 155.
Modify the code to include min and max.
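A possible solution is sketched below. It assumes the same "complete case" helper style as the previous answer; the names min_cc, max_cc, and summary_stats are illustrative, not part of the original practical.

```r
library(dplyr)

# Helpers that ignore missing values.
mean_cc <- function(x) mean(x, na.rm = TRUE)
sd_cc   <- function(x) sd(x, na.rm = TRUE)
min_cc  <- function(x) min(x, na.rm = TRUE)
max_cc  <- function(x) max(x, na.rm = TRUE)

# Each function in the named list yields one output column per input column,
# named <column>_<function name>.
summary_stats <- starwars |>
  summarise(
    across(
      c(height, mass, birth_year),
      list(mean = mean_cc, sd = sd_cc, min = min_cc, max = max_cc)
    )
  )

summary_stats
```

With three columns and four functions, the result should have twelve columns, such as height_min and birth_year_max.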
Use pivot_longer to convert the result into a ‘tidy’ data frame.
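One way to do this is sketched below, using only the mean and sd to keep the example short. Because birth_year itself contains an underscore, splitting the names on "_" would break it apart; the sketch instead assumes a names_pattern regular expression whose greedy first group captures everything up to the last underscore. The object names summary_wide and summary_long are illustrative.

```r
library(dplyr)
library(tidyr)

mean_cc <- function(x) mean(x, na.rm = TRUE)
sd_cc   <- function(x) sd(x, na.rm = TRUE)

# Wide summary: one row, columns named <variable>_<statistic>.
summary_wide <- starwars |>
  summarise(
    across(c(height, mass, birth_year), list(mean = mean_cc, sd = sd_cc))
  )

# Long ("tidy") summary: one row per variable/statistic pair.
summary_long <- summary_wide |>
  pivot_longer(
    everything(),
    names_to = c("variable", "statistic"),
    names_pattern = "(.*)_(.*)",  # split at the last underscore
    values_to = "value"
  )

summary_long
```

An alternative design is to set .names = "{.col}.{.fn}" inside across and then split on the dot, which avoids the regular expression.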
Use the mtcars data frame for the following two questions.
Use summarise and across to compute the median for all columns starting with d.
mtcars |>
  summarise(across(starts_with("d"), median))
Compute the mean for all numeric columns.
mtcars |>
  summarise(across(where(is.numeric), mean))
mpg cyl disp hp drat wt qsec vs am
1 20.09062 6.1875 230.7219 146.6875 3.596563 3.21725 17.84875 0.4375 0.40625
gear carb
1 3.6875 2.8125
For more details, refer to:
Using ‘tidy evaluation’
Write a function to produce a grouped summary of a given dataset.
Your function:
Should take a data frame (.data) and the grouping column (.group) as inputs;
Should use summarise to calculate a summary (e.g., median or mean) of all numeric columns.
Use across and where to apply the summary function to all numeric columns.
Test your function with the mtcars dataset.
get_summary <- function(.data, .group) {
  .data |>
    group_by({{ .group }}) |>
    summarise(across(where(is.numeric), mean))
}
get_summary(mtcars, cyl)
# A tibble: 3 × 11
cyl mpg disp hp drat wt qsec vs am gear carb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 4 26.7 105. 82.6 4.07 2.29 19.1 0.909 0.727 4.09 1.55
2 6 19.7 183. 122. 3.59 3.12 18.0 0.571 0.429 3.86 3.43
3 8 15.1 353. 209. 3.23 4.00 16.8 0 0.143 3.29 3.5
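The exercise mentions that the summary could be a median or a mean. As a hedged extension, the sketch below passes the summary function in as an argument; the function name get_summary_by and the summary_fn argument are hypothetical and not part of the original answer.

```r
library(dplyr)

# Grouped summary of all numeric columns, with the summary function
# supplied by the caller (defaults to mean).
get_summary_by <- function(.data, .group, summary_fn = mean) {
  .data |>
    group_by({{ .group }}) |>
    summarise(across(where(is.numeric), summary_fn))
}

median_by_cyl <- get_summary_by(mtcars, cyl, median)
median_by_cyl
```

The grouping column is embraced with {{ }} because it is a bare column name, whereas summary_fn is an ordinary R object and can be passed on as is.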
Write a function to calculate the square root (sqrt) of a given column. Your function should have two arguments: the data frame and the column name.
calc_sqrt <- function(.data, .col) {
  # := is needed because the name on the left-hand side comes from the caller
  .data |>
    mutate({{ .col }} := sqrt({{ .col }}))
}
mtcars |>
  calc_sqrt(wt)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160.0 110 3.90 1.618641 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160.0 110 3.90 1.695582 17.02 0 1 4 4
Datsun 710 22.8 4 108.0 93 3.85 1.523155 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258.0 110 3.08 1.793042 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360.0 175 3.15 1.854724 17.02 0 0 3 2
Valiant 18.1 6 225.0 105 2.76 1.860108 20.22 1 0 3 1
Duster 360 14.3 8 360.0 245 3.21 1.889444 15.84 0 0 3 4
Merc 240D 24.4 4 146.7 62 3.69 1.786057 20.00 1 0 4 2
Merc 230 22.8 4 140.8 95 3.92 1.774824 22.90 1 0 4 2
Merc 280 19.2 6 167.6 123 3.92 1.854724 18.30 1 0 4 4
Merc 280C 17.8 6 167.6 123 3.92 1.854724 18.90 1 0 4 4
Merc 450SE 16.4 8 275.8 180 3.07 2.017424 17.40 0 0 3 3
Merc 450SL 17.3 8 275.8 180 3.07 1.931321 17.60 0 0 3 3
Merc 450SLC 15.2 8 275.8 180 3.07 1.944222 18.00 0 0 3 3
Cadillac Fleetwood 10.4 8 472.0 205 2.93 2.291288 17.98 0 0 3 4
Lincoln Continental 10.4 8 460.0 215 3.00 2.328948 17.82 0 0 3 4
Chrysler Imperial 14.7 8 440.0 230 3.23 2.311926 17.42 0 0 3 4
Fiat 128 32.4 4 78.7 66 4.08 1.483240 19.47 1 1 4 1
Honda Civic 30.4 4 75.7 52 4.93 1.270827 18.52 1 1 4 2
Toyota Corolla 33.9 4 71.1 65 4.22 1.354622 19.90 1 1 4 1
Toyota Corona 21.5 4 120.1 97 3.70 1.570032 20.01 1 0 3 1
Dodge Challenger 15.5 8 318.0 150 2.76 1.876166 16.87 0 0 3 2
AMC Javelin 15.2 8 304.0 150 3.15 1.853375 17.30 0 0 3 2
Camaro Z28 13.3 8 350.0 245 3.73 1.959592 15.41 0 0 3 4
Pontiac Firebird 19.2 8 400.0 175 3.08 1.960867 17.05 0 0 3 2
Fiat X1-9 27.3 4 79.0 66 4.08 1.391043 18.90 1 1 4 1
Porsche 914-2 26.0 4 120.3 91 4.43 1.462874 16.70 0 1 5 2
Lotus Europa 30.4 4 95.1 113 3.77 1.230041 16.90 1 1 5 2
Ford Pantera L 15.8 8 351.0 264 4.22 1.780449 14.50 0 1 5 4
Ferrari Dino 19.7 6 145.0 175 3.62 1.664332 15.50 0 1 5 6
Maserati Bora 15.0 8 301.0 335 3.54 1.889444 14.60 0 1 5 8
Volvo 142E 21.4 4 121.0 109 4.11 1.667333 18.60 1 1 4 2
Update your function so it creates a new column ({.col}_sqrt) instead of overwriting the existing one.
You can extend the column name by putting the embraced argument inside a quoted string on the left-hand side of :=. For example:
"{{.col}}_sqrt"
calc_sqrt <- function(.data, .col) {
  # The string is interpolated, so passing wt creates a column named wt_sqrt
  .data |>
    mutate("{{ .col }}_sqrt" := sqrt({{ .col }}))
}
mtcars |>
  calc_sqrt(wt)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
wt_sqrt
Mazda RX4 1.618641
Mazda RX4 Wag 1.695582
Datsun 710 1.523155
Hornet 4 Drive 1.793042
Hornet Sportabout 1.854724
Valiant 1.860108
Duster 360 1.889444
Merc 240D 1.786057
Merc 230 1.774824
Merc 280 1.854724
Merc 280C 1.854724
Merc 450SE 2.017424
Merc 450SL 1.931321
Merc 450SLC 1.944222
Cadillac Fleetwood 2.291288
Lincoln Continental 2.328948
Chrysler Imperial 2.311926
Fiat 128 1.483240
Honda Civic 1.270827
Toyota Corolla 1.354622
Toyota Corona 1.570032
Dodge Challenger 1.876166
AMC Javelin 1.853375
Camaro Z28 1.959592
Pontiac Firebird 1.960867
Fiat X1-9 1.391043
Porsche 914-2 1.462874
Lotus Europa 1.230041
Ford Pantera L 1.780449
Ferrari Dino 1.664332
Maserati Bora 1.889444
Volvo 142E 1.667333
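The same idea can be combined with across to handle several columns at once. The following is a minimal sketch; the function name calc_sqrt_cols and its .cols argument are hypothetical. It assumes that an embraced selection can be passed to across and that the .names argument of across controls the output names.

```r
library(dplyr)

# Add a <column>_sqrt column for every selected column,
# leaving the original columns unchanged.
calc_sqrt_cols <- function(.data, .cols) {
  .data |>
    mutate(across({{ .cols }}, sqrt, .names = "{.col}_sqrt"))
}

mtcars_sqrt <- calc_sqrt_cols(mtcars, c(wt, hp))
head(mtcars_sqrt)
```

Note that .names uses single braces ("{.col}_sqrt") because it is a glue specification interpreted by across, whereas the "{{ .col }}_sqrt" form is used on the left-hand side of := to inject a function argument.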