12  purrr

Published

March 17, 2023

Modified

January 8, 2024

12.1 purrr resources

12.2 Map family

12.2.1 Functions

  • map() always returns a list.
  • map_lgl(), map_int(), map_dbl(), and map_chr() return an atomic vector of the indicated type (or die trying).
  • map_vec(.ptype = NULL) simplifies to the common type of the output. It works with most types of simple vectors like Date, POSIXct, factors, etc.
  • walk(): calls .f for its side-effect and returns the input .x.
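
The sections below show map() and the typed variants, but not map_vec() or walk(). A minimal sketch of both, assuming purrr 1.0.0 or later; the date and the printed text are arbitrary values chosen for illustration:

```r
library(purrr)

# map_vec() keeps richer vector types; here the result stays a Date vector
dates <- map_vec(1:3, \(i) as.Date("2024-01-01") + i)
class(dates)
#> [1] "Date"

# walk() calls .f for its side effect (printing) and invisibly returns its input
out <- walk(1:3, \(i) cat("item", i, "\n"))
identical(out, 1:3)
#> [1] TRUE
```

Because walk() returns its input, it can sit in the middle of a pipeline without changing what flows through it.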

12.2.2 Arguments

  • .f options
    • A named function: mean
    • Anonymous function: \(x) x + 1
    • Formula: ~ .x + 1
    • A string, integer, or list as shorthand for pluck()
  • .progress: Whether to have a progress bar
  • .default: specifies value for elements that are missing or NULL.

Since purrr 1.0.0, the recommended style is an anonymous function rather than a formula or passing extra arguments to .f through the ... argument.
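
The three styles can be compared side by side. This is a small sketch with made-up data; the variable names are illustrative only. All three calls give the same result, but only the first keeps the extra argument visible next to the function that uses it.

```r
library(purrr)

x <- list(c(1, NA, 3), c(4, 5, NA))

# Preferred: anonymous function, extra arguments written where they are used
by_lambda <- map_dbl(x, \(v) mean(v, na.rm = TRUE))

# Formula shorthand: still supported, but specific to purrr
by_formula <- map_dbl(x, ~ mean(.x, na.rm = TRUE))

# Extra arguments passed through ...: still works, no longer recommended
by_dots <- map_dbl(x, mean, na.rm = TRUE)

by_lambda
#> [1] 2.0 4.5
```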

12.2.3 map()

# Generate 10 random normal values for each mean from 1 to 5
1:5 |>
  map(\(x) rnorm(n = 10, mean = x))
#> [[1]]
#>  [1] -1.2330706  0.7444393  2.4552033  2.3080622 -1.3509777  1.2861627
#>  [7] -0.8937846  1.1150449  0.5775427  1.0976453
#> 
#> [[2]]
#>  [1]  1.7014398  1.0040024  2.0220898 -0.4113552  2.0727388  1.4441715
#>  [7] -0.6858499  2.3231469  2.6053517  2.7881240
#> 
#> [[3]]
#>  [1] 3.374639 1.078416 3.468923 3.416896 4.082143 4.825970 2.098419 3.404699
#>  [9] 2.186966 3.668954
#> 
#> [[4]]
#>  [1] 3.145787 4.350311 3.202518 4.013772 3.605510 3.251420 1.686825 3.362793
#>  [9] 3.681291 4.796950
#> 
#> [[5]]
#>  [1] 6.305505 5.302494 6.199307 5.612322 6.773631 5.688255 5.063160 7.106426
#>  [9] 3.801539 5.028287

# Simplify output to a vector instead of a list
# by computing the mean of the distributions
1:5 |>
  map(\(x) rnorm(n = 10, mean = x)) |> 
  map_dbl(mean)
#> [1] 0.4142382 2.3074282 2.9051373 4.5171249 5.3758250

12.2.4 pluck() style

Use a string, integer, or list as shorthand for pluck(). See Section 12.5.1 on using pluck().

  • "idx" is shorthand for \(x) pluck(x, "idx")
  • 1 is shorthand for \(x) pluck(x, 1)
  • list("idx", 1) is shorthand for \(x) pluck(x, "idx", 1)
  • Use the .default argument to specify the value returned for elements that are missing or NULL
# Extract by name or position
l1 <- list(list(a = 1L), list(a = NULL, b = 2L), list(b = 3L))

# name: elements named "b"
l1 |> map_int("b", .default = NA)
#> [1] NA  2  3

# position: 2nd element
l1 |> map_int(2, .default = NA)
#> [1] NA  2 NA

# Supply multiple values to index deeply into a list
l2 <- list(
  list(num = 1:3,     letters[1:3]),
  list(num = 101:103, letters[4:6]),
  list()
)
# map vs pluck
l2 |> map(c(2, 2))
#> [[1]]
#> [1] "b"
#> 
#> [[2]]
#> [1] "e"
#> 
#> [[3]]
#> NULL
l2 |> map(\(x) pluck(x, 2, 2))
#> [[1]]
#> [1] "b"
#> 
#> [[2]]
#> [1] "e"
#> 
#> [[3]]
#> NULL

# Use a list to mix numeric indices and names
l2 |> map_int(list("num", 3), .default = NA)
#> [1]   3 103  NA

12.2.5 map() with data frames

# Apply a function to each data frame column; map_dbl() returns a named numeric vector
mtcars |> map_dbl(sum)
#>      mpg      cyl     disp       hp     drat       wt     qsec       vs 
#>  642.900  198.000 7383.100 4694.000  115.090  102.952  571.160   14.000 
#>       am     gear     carb 
#>   13.000  118.000   90.000

12.3 Map variants

12.3.1 map_if() and map_at()

Conditionally apply a function to some elements of .x.

  • map_if(.x, .p, .f, .else = NULL) and map_at(.x, .at, .f)
iris |> map_if(is.factor, as.character, .else = as.integer) |> str()
#> List of 5
#>  $ Sepal.Length: int [1:150] 5 4 4 4 5 5 4 5 4 4 ...
#>  $ Sepal.Width : int [1:150] 3 3 3 3 3 3 3 3 2 3 ...
#>  $ Petal.Length: int [1:150] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ Petal.Width : int [1:150] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ Species     : chr [1:150] "setosa" "setosa" "setosa" "setosa" ...

# Use a numeric vector of positions to select the elements to change:
iris |> map_at(c(4, 5), is.numeric) |> str()
#> List of 5
#>  $ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#>  $ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#>  $ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#>  $ Petal.Width : logi TRUE
#>  $ Species     : logi FALSE

# Use vector of names to specify which elements to change:
iris |> map_at("Species", toupper) |> str()
#> List of 5
#>  $ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#>  $ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#>  $ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#>  $ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#>  $ Species     : chr [1:150] "SETOSA" "SETOSA" "SETOSA" "SETOSA" ...

12.3.2 map_depth(.x, .depth, .f)

Map or modify elements at a given depth

  • map_depth() or modify_depth()
  • .depth argument
    • map_depth(x, 0, fun) is equivalent to fun(x).
    • map_depth(x, 1, fun) is equivalent to x <- map(x, fun)
    • map_depth(x, 2, fun) is equivalent to x <- map(x, \(y) map(y, fun))
x <- list(a = list(foo = 1:2, bar = 3:4), b = list(baz = 5:6))

x |> map_depth(2, sum) |> str()
#> List of 2
#>  $ a:List of 2
#>   ..$ foo: int 3
#>   ..$ bar: int 7
#>  $ b:List of 1
#>   ..$ baz: int 11

12.3.3 map2(.x, .y, .f)

Map over two inputs

x <- list(1, 1, 1)
y <- list(10, 20, 30)

map2(x, y, \(x, y) x + y)
#> [[1]]
#> [1] 11
#> 
#> [[2]]
#> [1] 21
#> 
#> [[3]]
#> [1] 31

map2_dbl(x, y, \(x, y) x + y)
#> [1] 11 21 31

12.3.4 pmap(.l, .f)

Map over multiple inputs simultaneously

x <- list(1, 1, 1)
y <- list(10, 20, 30)
z <- list(100, 200, 300)

pmap(list(x, y, z), sum)
#> [[1]]
#> [1] 111
#> 
#> [[2]]
#> [1] 221
#> 
#> [[3]]
#> [1] 331

pmap_dbl(list(x, y, z), sum)
#> [1] 111 221 331
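
When .l is a named list or a data frame, pmap() matches the names to the arguments of .f, so each row of a data frame becomes one call. A short sketch with a made-up data frame (the column names and values are illustrative only):

```r
library(purrr)

people <- data.frame(
  first = c("Ada", "Alan"),
  last  = c("Lovelace", "Turing")
)

# Columns are matched to the arguments of .f by name, one call per row
full_names <- pmap_chr(people, \(first, last) paste(last, first, sep = ", "))
full_names
#> [1] "Lovelace, Ada" "Turing, Alan"
```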

12.3.5 imap(.x, .f)

Apply a function to each element of a vector and its index.

# Note that the order is value then index
imap_chr(sample(10), paste)
#>  [1] "5 1"  "8 2"  "6 3"  "4 4"  "2 5"  "9 6"  "10 7" "7 8"  "1 9"  "3 10"

# Use anonymous function to reverse value/index order in output
imap_chr(sample(10), \(x, idx) paste0(idx, ": ", x))
#>  [1] "1: 7"  "2: 4"  "3: 2"  "4: 8"  "5: 3"  "6: 9"  "7: 1"  "8: 10" "9: 6" 
#> [10] "10: 5"

# Name of column and calculated value of column
iwalk(mtcars, \(x, idx) cat(idx, ": ", median(x), "\n", sep = ""))
#> mpg: 19.2
#> cyl: 6
#> disp: 196.3
#> hp: 123
#> drat: 3.695
#> wt: 3.325
#> qsec: 17.71
#> vs: 0
#> am: 0
#> gear: 4
#> carb: 2

12.3.6 modify(.x, .f)

Modify elements of .x by .f. Unlike the map() family, modify() always returns an object of the same type as .x.
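
A minimal sketch of the difference, using a small made-up data frame: map() drops the data frame structure and returns a list, while modify() returns a data frame.

```r
library(purrr)

df <- data.frame(x = c(1, 2, 3), y = c(1.5, 2.5, 3.5))

# map() always returns a list
as_list <- map(df, \(col) col * 2)
class(as_list)
#> [1] "list"

# modify() returns the same type as its input, here a data frame
as_df <- modify(df, \(col) col * 2)
class(as_df)
#> [1] "data.frame"
```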

12.4 Predicate functionals

A predicate function is a function that returns a single TRUE or FALSE. Predicate functionals take a vector and a predicate function and use the results to filter, search, or summarize the vector.

12.4.1 keep(), discard(), compact()

By predicate function

set.seed(24)
# Create a list of 10 vectors, each holding 5 values sampled from 1:10
x <- rep(10, 10) |>
  map(\(x) sample(x, size = 5))

# Keep
x |> keep(\(x) mean(x) > 6)
#> [[1]]
#> [1] 10  6  9  8  2
#> 
#> [[2]]
#> [1]  3  6  9 10  4

# Discard
x |> discard(\(x) mean(x) > 6) |> length()
#> [1] 8

By name or position

x <- c(a = 1, b = 2, cat = 10, dog = 15, elephant = 5, e = 10)
x |> keep_at(letters)
#>  a  b  e 
#>  1  2 10
x |> discard_at(letters)
#>      cat      dog elephant 
#>       10       15        5

compact(): discards elements that are NULL or that have length zero

list(a = "a", b = NULL, c = integer(0), d = NA, e = list()) |>
  compact()
#> $a
#> [1] "a"
#> 
#> $d
#> [1] NA

12.4.2 detect()

Find the value or position of the first match

  • detect()
  • detect_index()
  • .dir argument
    • "forward": starts at the beginning of the vector and moves towards the end.
    • "backward": starts at the end of the vector and moves towards the beginning.
is_even <- function(x) x %% 2 == 0

3:10 |> detect(is_even)
#> [1] 4
3:10 |> detect_index(is_even)
#> [1] 2

# If you need to find all values, use keep()
3:10 |> keep(is_even)
#> [1]  4  6  8 10

# If you need to find all positions, use map_lgl()
3:10 |> map_lgl(is_even) |> which()
#> [1] 2 4 6 8
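
The examples above all search forward. A short sketch of the backward direction, which in purrr's signature is the .dir argument; it reuses the same is_even() helper, redefined here so the block stands alone:

```r
library(purrr)

is_even <- function(x) x %% 2 == 0

# Search from the end of the vector: the last even value and its position
last_even <- 3:10 |> detect(is_even, .dir = "backward")
last_even
#> [1] 10

last_pos <- 3:10 |> detect_index(is_even, .dir = "backward")
last_pos
#> [1] 8
```

The position is still counted from the start of the vector; only the search order changes.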

12.4.3 every(), some(), none()

Do every, some, or none of the elements of a list satisfy a predicate?

  • some(): TRUE when .p is TRUE for at least one element.
  • every(): TRUE when .p is TRUE for all elements.
  • none(): TRUE when .p is FALSE for all elements.
x <- list(0:10, 5.5)
every(x, is.integer)
#> [1] FALSE
some(x, is.integer)
#> [1] TRUE
none(x, is.character)
#> [1] TRUE

12.4.4 has_element()

has_element(.x, .y): Does a list contain an object?

x <- list(1:10, 5, 9.9)
x |> has_element(1:10)
#> [1] TRUE
x |> has_element(3)
#> [1] FALSE

12.5 Plucking

12.5.1 pluck()

pluck(.x, ...) implements a generalized form of [[ that allows you to index deeply and flexibly into data structures. It always succeeds, returning .default (NULL unless you set it) if the index you are trying to access does not exist or is NULL.

You can use a combination of numeric positions, vector or list names, and accessor functions.

  • pluck(x, 1) is equivalent to x[[1]]
  • pluck(x, 1, 2) is equivalent to x[[1]][[2]]
x <- list(
  list("a", list(1, elt = "foo")),
  list("b", list(2, elt = "bar"))
)
# equivalent to `x[[1]][[2]]`
pluck(x, 1, 2)
#> [[1]]
#> [1] 1
#> 
#> $elt
#> [1] "foo"

# Combine numeric positions with names
pluck(x, 1, 2, "elt")
#> [1] "foo"

# By default returns `NULL` when an element does not exist
pluck(x, 10)
#> NULL

# You can supply a default value for non-existing elements
pluck(x, 10, .default = NA)
#> [1] NA
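
Accessor functions are mentioned above but not shown. A brief sketch: get_elt() is a hypothetical helper written for this example, attr_getter() is purrr's own helper for reading attributes, and nested is a copy of the list used above.

```r
library(purrr)

nested <- list(
  list("a", list(1, elt = "foo")),
  list("b", list(2, elt = "bar"))
)

# A function in the index is applied to whatever has been plucked so far
get_elt <- function(obj) obj[[2]]$elt
by_function <- pluck(nested, 1, get_elt)
by_function
#> [1] "foo"

# attr_getter() builds an accessor that reads an attribute
tagged <- structure(list(1, 2), label = "pair")
by_attr <- pluck(list(tagged), 1, attr_getter("label"))
by_attr
#> [1] "pair"
```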

You can also assign a value to a pluck location with pluck(x, ...) <- value.

pluck(x, 1, 2, 2) <- "kook"
pluck(x, 1, 2, 2)
#> [1] "kook"

12.5.2 chuck()

chuck() is the same as pluck() but throws an error if the index does not exist.

chuck(x, 1, 2, 2)
#> [1] "kook"

# pluck vs chuck
pluck(x, 10)
#> NULL
chuck(x, 10)
#> Error in `chuck()`:
#> ! Index 1 exceeds the length of plucked object (10 > 2).

12.5.3 assign_in() and modify_in()

Modify values in a pluck location.

  • assign_in(x, where, value): Assigns a value to a pluck location.
  • modify_in(.x, .where, .f): Applies a function to a pluck location.
  • where/.where arguments are a pluck location as a numeric vector of positions, a character vector of names, or a list combining both.
# Return "kook" back to "foo"
pluck(x, 1, 2, 2)
#> [1] "kook"
x <- assign_in(x, c(1, 2, 2), "foo")
pluck(x, 1, 2, 2)
#> [1] "foo"

# Modify a location by a function
pluck(x, 1, 2, "elt")
#> [1] "foo"
x <- modify_in(x, list(1, 2, "elt"), toupper)
pluck(x, 1, 2, "elt")
#> [1] "FOO"

12.6 Transforming lists and vectors

12.6.1 list_flatten()

Removes a single level of hierarchy from a list; the output is always a list. Flattening works from the outside in: the elements of second-level lists are spliced into the top-level list, while more deeply nested lists are kept as they are.

list_flatten() supersedes flatten() in purrr 1.0.0.

x <- list(1, list(2, 3), list(4, list(5)))
str(x)
#> List of 3
#>  $ : num 1
#>  $ :List of 2
#>   ..$ : num 2
#>   ..$ : num 3
#>  $ :List of 2
#>   ..$ : num 4
#>   ..$ :List of 1
#>   .. ..$ : num 5

# 2nd-level lists are flattened
# 3rd-level lists made into 2nd level
x |> list_flatten() |> str()
#> List of 5
#>  $ : num 1
#>  $ : num 2
#>  $ : num 3
#>  $ : num 4
#>  $ :List of 1
#>   ..$ : num 5

# Flat lists are left as is
list(1, 2, 3, 4, 5) |> list_flatten() |> str()
#> List of 5
#>  $ : num 1
#>  $ : num 2
#>  $ : num 3
#>  $ : num 4
#>  $ : num 5

12.6.2 list_c()

Concatenates the elements of a list to produce a vector. Elements may have different lengths, so the output need not be the same length as the input, which breaks the one-to-one mapping between input and output of the map() family of functions. The optional ptype argument provides a prototype that ensures the output type is always the same.

list_c() supersedes flatten_lgl(), flatten_int(), flatten_dbl(), and flatten_chr() in purrr 1.0.0.

list_c(list(1, 1:2, 1:3))
#> [1] 1 1 2 1 2 3
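
A small sketch of the ptype argument, with made-up integer input: without ptype the common type of the elements is used, while ptype = double() forces a double vector.

```r
library(purrr)

ints <- list(1:2, 3:5)

# Default: the common type of the elements (integer here)
as_common <- list_c(ints)
typeof(as_common)
#> [1] "integer"

# ptype fixes the output type regardless of the input
as_double <- list_c(ints, ptype = double())
typeof(as_double)
#> [1] "double"
```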

12.6.3 list_simplify()

Reduces a list to a homogeneous vector; the output is always the same length as the input, so every element of the list must be length 1. Use ptype to specify the type of the resulting vector.

list_simplify() supersedes simplify(), simplify_all(), and as_vector() in purrr 1.0.0.

list_simplify(list(1, 2, 3))
#> [1] 1 2 3

# Error with more than one element
list_simplify(list(1, 2, 1:3))
#> Error in `list_simplify()`:
#> ! `x[[3]]` must have size 1, not size 3.
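
Two options change this behaviour. The sketch below, using made-up input, sets the output type with ptype and uses strict = FALSE to get the list back unchanged instead of an error when simplification is not possible.

```r
library(purrr)

# ptype sets the type of the simplified vector
as_double <- list_simplify(list(1L, 2L, 3L), ptype = double())
typeof(as_double)
#> [1] "double"

# strict = FALSE returns the input unchanged when it cannot be simplified
unchanged <- list_simplify(list(1, 2, 1:3), strict = FALSE)
is.list(unchanged)
#> [1] TRUE
```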

12.6.4 list_rbind() and list_cbind()

Combine data frames together to create a larger data frame either by row or column. x must be a list of data frames.

list_rbind() and list_cbind() supersede flatten_dfr() and flatten_dfc() in purrr 1.0.0.

x <- list(
  a = data.frame(x = 1:2),
  b = data.frame(y = "a")
)
# rbind
list_rbind(x)
#>    x    y
#> 1  1 <NA>
#> 2  2 <NA>
#> 3 NA    a
list_rbind(x, names_to = "id")
#>   id  x    y
#> 1  a  1 <NA>
#> 2  a  2 <NA>
#> 3  b NA    a

# cbind
list_cbind(x)
#>   x y
#> 1 1 a
#> 2 2 a

12.6.5 accumulate() and reduce()

# accumulate() with `+` is equivalent to cumsum()
1:5 |> accumulate(`+`)
#> [1]  1  3  6 10 15

# reduce() with `+` is equivalent to sum()
1:5 |> reduce(`+`)
#> [1] 15

# with paste
accumulate(letters[1:5], paste, sep = ".")
#> [1] "a"         "a.b"       "a.b.c"     "a.b.c.d"   "a.b.c.d.e"
reduce(letters[1:5], paste, sep = ".")
#> [1] "a.b.c.d.e"
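
Both functions work with any two-argument function, not only arithmetic. A sketch with made-up vectors: reduce() folds intersect() across a list to find the values common to every element, accumulate() keeps each intermediate step, and .init supplies a starting value that is also the result for empty input.

```r
library(purrr)

sets <- list(c(1, 2, 3, 4), c(2, 3, 4, 5), c(3, 4, 6))

# Values present in every vector
common <- reduce(sets, intersect)
common
#> [1] 3 4

# Each intermediate intersection, from the first vector to the final result
steps <- accumulate(sets, intersect)
length(steps)
#> [1] 3

# .init is the starting value and the result when the input is empty
empty_sum <- reduce(integer(), `+`, .init = 0L)
empty_sum
#> [1] 0
```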