12 purrr
12.1 purrr resources
12.2 Map family
12.2.1 Functions
- map() always returns a list.
- map_lgl(), map_int(), map_dbl(), and map_chr() return an atomic vector of the indicated type (or die trying).
- map_vec(.ptype = NULL) simplifies to the common type of the output. It works with most types of simple vectors like Date, POSIXct, factors, etc.
- walk(): calls .f for its side-effect and returns the input .x.
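The differences in return type can be seen side by side. This is a minimal sketch, assuming purrr 1.0.0 or later is installed; the variable names are illustrative and not part of the original notes.

```r
library(purrr)

# map() always returns a list, one element per input
squares_list <- map(1:3, \(i) i^2)

# map_dbl() returns a double vector of the same length
squares_dbl <- map_dbl(1:3, \(i) i^2)

# map_vec() keeps richer vector types such as Date
dates <- map_vec(1:3, \(i) as.Date("2024-01-01") + i)

# walk() runs .f for its side effect and returns .x invisibly
walked <- walk(1:3, \(i) message("processing item ", i))
```

Because walk() returns its input, it can sit in the middle of a pipeline without changing the data that flows through it.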
12.2.2 Arguments
- .f options
  - A named function: mean
  - Anonymous function: \(x) x + 1
  - Formula: ~ .x + 1
  - A string, integer, or list as short hand for pluck()
- .progress: Whether to have a progress bar
- .default: specifies the value for elements that are missing or NULL when .f is a pluck()-style short hand.
The preference from purrr 1.0.0 is to use anonymous function style instead of formula style or passing additional arguments through ... .
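The three ways of supplying .f can be compared on the same input. This is a small sketch with made-up data, assuming purrr 1.0.0 or later; all three calls are expected to give the same result, with the anonymous function being the style preferred above.

```r
library(purrr)

x <- list(c(1, NA, 3), c(4, 5, NA))

# Named function with extra arguments passed through ... (older style)
named_style <- map_dbl(x, mean, na.rm = TRUE)

# Anonymous function (preferred style): arguments are explicit at the call site
anon_style <- map_dbl(x, \(v) mean(v, na.rm = TRUE))

# Formula style, where .x refers to the current element
formula_style <- map_dbl(x, ~ mean(.x, na.rm = TRUE))
```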
12.2.3 map()
# Draw 10 random values from normal distributions with means 1 to 5
1:5 |>
map(\(x) rnorm(n = 10, mean = x))
#> [[1]]
#> [1] -1.2330706 0.7444393 2.4552033 2.3080622 -1.3509777 1.2861627
#> [7] -0.8937846 1.1150449 0.5775427 1.0976453
#>
#> [[2]]
#> [1] 1.7014398 1.0040024 2.0220898 -0.4113552 2.0727388 1.4441715
#> [7] -0.6858499 2.3231469 2.6053517 2.7881240
#>
#> [[3]]
#> [1] 3.374639 1.078416 3.468923 3.416896 4.082143 4.825970 2.098419 3.404699
#> [9] 2.186966 3.668954
#>
#> [[4]]
#> [1] 3.145787 4.350311 3.202518 4.013772 3.605510 3.251420 1.686825 3.362793
#> [9] 3.681291 4.796950
#>
#> [[5]]
#> [1] 6.305505 5.302494 6.199307 5.612322 6.773631 5.688255 5.063160 7.106426
#> [9] 3.801539 5.028287
# Simplify output to a vector instead of a list
# by computing the mean of the distributions
1:5 |>
map(\(x) rnorm(n = 10, mean = x)) |>
map_dbl(mean)
#> [1] 0.4142382 2.3074282 2.9051373 4.5171249 5.3758250
12.2.4 pluck() style
Use a string, integer, or list as short hand for pluck(). See Section 12.5.1 on using pluck().
- "idx" is short hand for \(x) pluck(x, "idx")
- 1 is short hand for \(x) pluck(x, 1)
- list("idx", 1) is short hand for \(x) pluck(x, "idx", 1)
- Use the .default argument to specify a value for elements that are missing or NULL
# Extract by name or position
l1 <- list(list(a = 1L), list(a = NULL, b = 2L), list(b = 3L))
# name: elements named "b"
l1 |> map_int("b", .default = NA)
#> [1] NA 2 3
# position: 2nd element
l1 |> map_int(2, .default = NA)
#> [1] NA 2 NA
# Supply multiple values to index deeply into a list
l2 <- list(
list(num = 1:3, letters[1:3]),
list(num = 101:103, letters[4:6]),
list()
)
# map vs pluck
l2 |> map(c(2, 2))
#> [[1]]
#> [1] "b"
#>
#> [[2]]
#> [1] "e"
#>
#> [[3]]
#> NULL
l2 |> map(\(x) pluck(x, 2, 2))
#> [[1]]
#> [1] "b"
#>
#> [[2]]
#> [1] "e"
#>
#> [[3]]
#> NULL
# Use a list to mix numeric indices and names
l2 |> map_int(list("num", 3), .default = NA)
#> [1] 3 103 NA
12.2.5 map() with data frames
# Calculate on data frame columns and turn into list or vector
mtcars |> map_dbl(sum)
#> mpg cyl disp hp drat wt qsec vs
#> 642.900 198.000 7383.100 4694.000 115.090 102.952 571.160 14.000
#> am gear carb
#> 13.000 118.000 90.000
12.3 Map variants
12.3.1 map_if() and map_at()
Conditionally apply a function to some elements of x.
- map_if(.x, .p, .f, .else = NULL) and map_at(.x, .at, .f)
iris |> map_if(is.factor, as.character, .else = as.integer) |> str()
#> List of 5
#> $ Sepal.Length: int [1:150] 5 4 4 4 5 5 4 5 4 4 ...
#> $ Sepal.Width : int [1:150] 3 3 3 3 3 3 3 3 2 3 ...
#> $ Petal.Length: int [1:150] 1 1 1 1 1 1 1 1 1 1 ...
#> $ Petal.Width : int [1:150] 0 0 0 0 0 0 0 0 0 0 ...
#> $ Species : chr [1:150] "setosa" "setosa" "setosa" "setosa" ...
# Use a numeric vector of positions to select elements to change:
iris |> map_at(c(4, 5), is.numeric) |> str()
#> List of 5
#> $ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#> $ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#> $ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#> $ Petal.Width : logi TRUE
#> $ Species : logi FALSE
# Use vector of names to specify which elements to change:
iris |> map_at("Species", toupper) |> str()
#> List of 5
#> $ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#> $ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#> $ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#> $ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#> $ Species : chr [1:150] "SETOSA" "SETOSA" "SETOSA" "SETOSA" ...
12.3.2 map_depth(.x, .depth, .f)
Map or modify elements at a given depth
- map_depth() or modify_depth()
- .depth argument
  - map_depth(x, 0, fun) is equivalent to fun(x).
  - map_depth(x, 1, fun) is equivalent to x <- map(x, fun)
  - map_depth(x, 2, fun) is equivalent to x <- map(x, \(y) map(y, fun))
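The depth equivalences can be checked on a small nested list. This is a sketch with an invented list `x`, assuming purrr 1.0.0 or later.

```r
library(purrr)

# A list of lists: the integer vectors sit at depth 2
x <- list(a = list(1:3, 4:6), b = list(7:9))

# Apply sum() to every element at depth 2
depth2 <- map_depth(x, 2, sum)

# The same computation written as nested map() calls
nested <- map(x, \(y) map(y, sum))

# Depth 1 is an ordinary map() over the top level
depth1 <- map_depth(x, 1, length)
```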
12.3.3 map2(.x, .y, .f)
Map over two inputs
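As an illustration, map2() walks two vectors in parallel and passes one element from each to .f. This is a minimal sketch with made-up values; the typed variant map2_dbl() is used so the result is a plain numeric vector.

```r
library(purrr)

widths <- c(2, 3, 4)
heights <- c(10, 20, 30)

# .f receives the i-th element of each input as its first two arguments
areas <- map2_dbl(widths, heights, \(w, h) w * h)
```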
12.3.4 pmap(.l, .f)
Map over multiple inputs simultaneously
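A short sketch of pmap(), using invented inputs: .l is a list of equal-length vectors, and when its elements are named they are matched to the arguments of .f by name. Since a data frame is a list of columns, the same call can iterate over rows.

```r
library(purrr)

# Three parallel inputs, matched to the arguments of .f by name
params <- list(x = c(1, 2, 3), y = c(10, 20, 30), z = c(100, 200, 300))
totals <- pmap_dbl(params, \(x, y, z) x + y + z)

# A data frame is a list of columns, so pmap() iterates over its rows
df <- data.frame(name = c("a", "b"), times = c(2, 3), stringsAsFactors = FALSE)
repeated <- pmap_chr(df, \(name, times) strrep(name, times))
```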
12.3.5 imap(.x, .f)
Apply a function to each element of a vector and its index.
- Short hand for map2(x, names(x), ...) if x has names, or map2(x, seq_along(x), ...) if it does not.
- imap_lgl(), imap_chr(), imap_int(), imap_dbl(), and iwalk()
# Note that the order is value then index
imap_chr(sample(10), paste)
#> [1] "5 1" "8 2" "6 3" "4 4" "2 5" "9 6" "10 7" "7 8" "1 9" "3 10"
# Use anonymous function to reverse value/index order in output
imap_chr(sample(10), \(x, idx) paste0(idx, ": ", x))
#> [1] "1: 7" "2: 4" "3: 2" "4: 8" "5: 3" "6: 9" "7: 1" "8: 10" "9: 6"
#> [10] "10: 5"
# Name of column and calculated value of column
iwalk(mtcars, \(x, idx) cat(idx, ": ", median(x), "\n", sep = ""))
#> mpg: 19.2
#> cyl: 6
#> disp: 196.3
#> hp: 123
#> drat: 3.695
#> wt: 3.325
#> qsec: 17.71
#> vs: 0
#> am: 0
#> gear: 4
#> carb: 2
12.3.6 modify(.x, .f)
Modify elements of .x by .f. Unlike the map() family, modify() always returns an object of the same type as its input.
- modify() is a shortcut for x[[i]] <- f(x[[i]]); return(x)
- modify2(), modify_if(), modify_at()
  - Very similar to the map() family.
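The difference from map() is easiest to see with a data frame. This is a minimal sketch with a made-up data frame: map() returns a plain list of columns, while modify() hands back a data frame.

```r
library(purrr)

df <- data.frame(a = c(1, 2, 3), b = c(1.5, 2.5, 3.5))

# map() drops the data frame class and returns a list of columns
doubled_list <- map(df, \(col) col * 2)

# modify() applies the same function but keeps the data frame
doubled_df <- modify(df, \(col) col * 2)
```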
12.4 Predicate functionals
A predicate function is a function that returns either TRUE or FALSE. Predicate functionals take a vector and a predicate function and use the predicate to filter, search, or test the elements of the vector.
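12.4.1 keep(), discard(), compact()
By predicate function
- keep(): keep elements for which the predicate is TRUE
- discard(): discard elements for which the predicate is TRUE
By name or position
- keep_at(): keep by name or position
- discard_at(): discard by name or position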
x <- c(a = 1, b = 2, cat = 10, dog = 15, elephant = 5, e = 10)
x |> keep_at(letters)
#> a b e
#> 1 2 10
x |> discard_at(letters)
#> cat dog elephant
#> 10 15 5
compact(): discards elements that are NULL or that have length zero
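The predicate-based versions can be sketched on a small mixed list. The list `items` below is invented for illustration; is.numeric() serves as the predicate.

```r
library(purrr)

items <- list(a = 1:3, b = NULL, c = "z", d = integer(0), e = 10)

# keep() retains elements for which the predicate is TRUE
numeric_items <- keep(items, is.numeric)

# discard() is the complement of keep()
other_items <- discard(items, is.numeric)

# compact() drops NULL and zero-length elements, whatever their type
non_empty <- compact(items)
```

Note that the zero-length integer `d` passes is.numeric(), so keep() retains it while compact() removes it.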
12.4.2 detect()
Find the value or position of the first match
- detect(): returns the value of the first match
- detect_index(): returns the position of the first match
- .dir argument
  - "forward": starts at the beginning of the vector and moves towards the end.
  - "backward": starts at the end of the vector and moves towards the beginning.
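A minimal sketch of both functions and both search directions, using a made-up numeric vector. It assumes the direction argument is spelled `.dir`, as in current purrr releases.

```r
library(purrr)

x <- c(3, 8, 1, 12, 5)
is_big <- \(v) v > 4

# First match searching from the start of the vector
first_value <- detect(x, is_big)
first_index <- detect_index(x, is_big)

# First match searching from the end of the vector
last_value <- detect(x, is_big, .dir = "backward")
last_index <- detect_index(x, is_big, .dir = "backward")
```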
12.4.3 every(), some(), none()
Do every, some, or none of the elements of a list satisfy a predicate?
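For example, with an illustrative list of numbers and a hypothetical `is_even` predicate, the three functions answer the question as follows.

```r
library(purrr)

x <- list(2, 4, 7)
is_even <- \(v) v %% 2 == 0

all_even <- every(x, is_even)  # FALSE: 7 is odd
any_even <- some(x, is_even)   # TRUE: 2 and 4 are even
no_even <- none(x, is_even)    # FALSE: at least one element is even
```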
12.4.4 has_element()
has_element(.x, .y): Does a list contain an object?
x <- list(1:10, 5, 9.9)
x |> has_element(1:10)
#> [1] TRUE
x |> has_element(3)
#> [1] FALSE
12.5 Plucking
12.5.1 pluck()
pluck(.x, ...) implements a generalized form of [[ that allows you to index deeply and flexibly into data structures. It always succeeds, returning .default if the index you are trying to access does not exist or is NULL.
You can use a combination of numeric positions, vector or list names, and accessor functions.
- pluck(x, 1) is equivalent to x[[1]]
- pluck(x, 1, 2) is equivalent to x[[1]][[2]]
x <- list(
list("a", list(1, elt = "foo")),
list("b", list(2, elt = "bar"))
)
# equivalent to `x[[1]][[2]]`
pluck(x, 1, 2)
#> [[1]]
#> [1] 1
#>
#> $elt
#> [1] "foo"
# Combine numeric positions with names
pluck(x, 1, 2, "elt")
#> [1] "foo"
# By default returns `NULL` when an element does not exist
pluck(x, 10)
#> NULL
# You can supply a default value for non-existing elements
pluck(x, 10, .default = NA)
#> [1] NA
Values can also be assigned at a pluck location with pluck(x, ...) <- value.
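A small sketch of the assignment form, using an invented nested list `cfg`: the same indices that read a value with pluck() can be used on the left-hand side to replace it.

```r
library(purrr)

cfg <- list(a = list(b = 1, c = "old"))

# Replace the value at the location a -> c
pluck(cfg, "a", "c") <- "new"

updated <- pluck(cfg, "a", "c")
untouched <- pluck(cfg, "a", "b")
```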
12.5.2 chuck()
chuck() is the same as pluck() but throws an error if the index does not exist.
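The contrast with pluck() can be sketched as follows. The list `x` is made up, and tryCatch() is used only so that the example runs to completion instead of stopping at the error.

```r
library(purrr)

x <- list(a = 1)

# pluck() quietly returns NULL for a missing name
plucked <- pluck(x, "b")

# chuck() signals an error instead; capture it to inspect the outcome
chucked <- tryCatch(
  chuck(x, "b"),
  error = \(e) "index not found"
)
```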
12.5.3 assign_in() and modify_in()
Modify values in a pluck location.
- assign_in(x, where, value): Assigns a value to a pluck location.
- modify_in(.x, .where, .f): Applies a function to a pluck location.
- where/.where arguments are a pluck location as a numeric vector of positions, a character vector of names, or a list combining both.
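A minimal sketch of both functions on an invented nested list. Each call returns a modified copy and leaves the original object unchanged.

```r
library(purrr)

x <- list(a = list(b = 1, c = 2))

# Assign a new value at the location a -> b
assigned <- assign_in(x, list("a", "b"), 100)

# Apply a function to the existing value at the location a -> c
modified <- modify_in(x, list("a", "c"), \(v) v * 10)
```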
12.6 Transforming lists and vectors
12.6.1 list_flatten()
Removes a single level of hierarchy from a list; the output is always a list. Lists are flattened from outside to inside so the outermost list is flattened and inner lists are maintained.
list_flatten() supersedes flatten() in purrr 1.0.0.
x <- list(1, list(2, 3), list(4, list(5)))
str(x)
#> List of 3
#> $ : num 1
#> $ :List of 2
#> ..$ : num 2
#> ..$ : num 3
#> $ :List of 2
#> ..$ : num 4
#> ..$ :List of 1
#> .. ..$ : num 5
# 2nd-level lists are flattened
# 3rd-level lists made into 2nd level
x |> list_flatten() |> str()
#> List of 5
#> $ : num 1
#> $ : num 2
#> $ : num 3
#> $ : num 4
#> $ :List of 1
#> ..$ : num 5
# Flat lists are left as is
list(1, 2, 3, 4, 5) |> list_flatten() |> str()
#> List of 5
#> $ : num 1
#> $ : num 2
#> $ : num 3
#> $ : num 4
#> $ : num 5
12.6.2 list_c()
Concatenate the elements of a list to produce a vector. This allows elements in a list to be different lengths, breaking the one-to-one mapping between input and output of the map() family of functions. The optional ptype argument provides a prototype to ensure that the output type is always the same.
list_c() supersedes flatten_lgl(), flatten_int(), flatten_dbl(), and flatten_chr() in purrr 1.0.0.
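A short sketch of list_c() with made-up inputs, assuming purrr 1.0.0 or later. It shows elements of different lengths being concatenated and the ptype argument fixing the output type.

```r
library(purrr)

# Elements of different lengths, including an empty one
pieces <- list(1:2, 3:5, integer(0))
combined <- list_c(pieces)

# map() followed by list_c() when each step returns a variable number of values
sequences <- map(1:3, \(n) seq_len(n)) |> list_c()

# ptype pins the output type; here doubles are cast to integer
as_int <- list_c(list(1, 2), ptype = integer())
```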
12.6.3 list_simplify()
Reduces a list to a homogeneous vector; the output is always the same length as the input. Thus, all elements of the list must be length 1. Use ptype to specify the type of the resulting vector.
list_simplify() supersedes simplify(), simplify_all(), and as_vector() in purrr 1.0.0.
list_simplify(list(1, 2, 3))
#> [1] 1 2 3
# Error when an element has length greater than 1
list_simplify(list(1, 2, 1:3))
#> Error in `list_simplify()`:
#> ! `x[[3]]` must have size 1, not size 3.
12.6.4 list_rbind() and list_cbind()
Combine data frames together to create a larger data frame either by row or column. x must be a list of data frames.
list_rbind() and list_cbind() supersede flatten_dfr() and flatten_dfc() in purrr 1.0.0.
x <- list(
a = data.frame(x = 1:2),
b = data.frame(y = "a")
)
# rbind
list_rbind(x)
#> x y
#> 1 1 <NA>
#> 2 2 <NA>
#> 3 NA a
list_rbind(x, names_to = "id")
#> id x y
#> 1 a 1 <NA>
#> 2 a 2 <NA>
#> 3 b NA a
# cbind
list_cbind(x)
#> x y
#> 1 1 a
#> 2 2 a
12.6.5 accumulate() and reduce()
- accumulate(): Accumulate intermediate results of a vector reduction.
- reduce(): Reduce a list to a single value by iteratively applying a binary function.
# accumulate is equivalent to cumsum
1:5 |> accumulate(`+`)
#> [1] 1 3 6 10 15
# reduce is equivalent to sum
1:5 |> reduce(`+`)
#> [1] 15
# with paste
accumulate(letters[1:5], paste, sep = ".")
#> [1] "a" "a.b" "a.b.c" "a.b.c.d" "a.b.c.d.e"
reduce(letters[1:5], paste, sep = ".")
#> [1] "a.b.c.d.e"