29 rlang: Metaprogramming
29.1 rlang
metaprogramming vignettes
29.2 Defusing R expressions
Defusing is the act of capturing code and returning an expression in a tree-like structure that provides a recipe for how to compute the value.
Defuse your own R expressions with expr(); defuse expressions supplied by the user of a function you write with enquo() or enquos(); and evaluate them with eval() or eval_tidy().
“The most common use case for defusing expressions is to resume its evaluation in a data mask. This makes it possible for the expression to refer to columns of a data frame as if they were regular objects.”
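As a minimal sketch of the capture-then-evaluate cycle described above, the following uses only rlang and the built-in mtcars data. The names `e` and `masked` are illustrative and do not come from the vignette.

```r
library(rlang)

# Defuse: capture the code instead of running it
e <- expr(cyl * 10)
e
#> cyl * 10

# The captured object is a call, not a value
is_call(e)
#> [1] TRUE

# Resume evaluation in a data mask so that `cyl` refers to a column of mtcars
masked <- eval_tidy(e, data = mtcars)
head(masked, 3)
#> [1] 60 60 40
```

Without the `data` argument, `eval_tidy(e)` would look for `cyl` in the calling environment and fail unless such an object exists there.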
29.2.1 The booby trap analogy
With lazy evaluation, arguments are like booby traps: they are only evaluated when touched. “Defusing an argument can be seen as defusing the booby trap.” The argument is captured rather than evaluated, so the booby trap is never set off.
29.2.2 Types of defused expressions
There are three basic types of defused expressions. See Wickham, Advanced R, Chapter 18: Expressions for more details.
- Calls: calling a function
- Symbols: named objects
  - Environment-variable: object defined in the global environment or a function.
  - Data-variable: object is a column in a data frame.
- Constants: Either NULL or an atomic vector of length 1.
Another way to create defused expressions is to assemble them from data.
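The sketch below checks the three basic types with rlang predicates and then assembles a call from data, here a column name stored as a string. The variable names `col` and `assembled` are assumptions made for illustration.

```r
library(rlang)

# The three basic types of defused expressions
is_call(expr(mean(x)))           # a call
#> [1] TRUE
is_symbol(expr(x))               # a symbol
#> [1] TRUE
is_syntactic_literal(expr(1))    # a constant
#> [1] TRUE

# Assemble a call from data: a function name and a column name given as strings
col <- "mpg"
assembled <- call2("mean", sym(col), na.rm = TRUE)
assembled
#> mean(mpg, na.rm = TRUE)

# The assembled call evaluates like any other defused expression
eval_tidy(assembled, data = mtcars)
```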
29.2.3 Defuse and inject
The defuse and inject pattern is to defuse an argument and inject the expression into another function in the context of a data mask. This can be done in two steps with enquo() and !! or in a single defuse-and-inject step with the embrace operator {{. The two-step process can be useful in more complex settings where you need access to the defused expression rather than just passing it on.
# Defuse-and-inject: two steps
my_summarise2 <- function(data, arg) {
  # Defuse the user expression in `arg`
  arg <- enquo(arg)
  # Inject the expression contained in `arg`
  # inside a `summarise()` argument
  data |>
    dplyr::summarise(mean = mean(!!arg, na.rm = TRUE))
}
# Defuse and inject in a single step with the embracing operator
my_summarise1 <- function(data, arg) {
  data |>
    dplyr::summarise(mean = mean({{ arg }}, na.rm = TRUE))
}
29.2.4 Defused arguments and quosures
expr() returns a defused expression, while enquo() returns a quosure, an expression along with an environment. See Section 29.5.
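The difference can be inspected directly. In this sketch, `capture()` is a hypothetical helper written only to show what enquo() returns; it is not part of rlang.

```r
library(rlang)

# Hypothetical helper: returns the defused argument as a quosure
capture <- function(arg) enquo(arg)

e <- expr(x + 1)
q <- capture(x + 1)

is_quosure(e)
#> [1] FALSE
is_quosure(q)
#> [1] TRUE

# The quosure wraps the same expression...
identical(quo_get_expr(q), e)
#> [1] TRUE

# ...plus the environment in which `x + 1` was written
quo_get_env(q)
```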
29.3 Injecting with !!, !!!, and glue syntax
There are two main families of injection operators that are used to modify code before R processes it:
- Dynamic dots operators: !!! and "{"
- Metaprogramming operators: !!, splicing with !!!, {{, and "{{"
29.3.1 Dots injection
Dynamic dots make ... programmable with injection operators.
Splicing with !!!
You can use list2() to turn ... into dynamic dots. In base R, do.call() turns a list into a set of arguments passed through ... to a function. If the function collects its dots with list2() inside the do.call() call, callers can also splice a list of arguments with !!!.
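A minimal sketch of this idea follows, defining rbind2() as a thin wrapper around base::rbind(). The list `rows` is illustrative data.

```r
library(rlang)

# Collect `...` with list2() so that `!!!` is understood
rbind2 <- function(...) {
  do.call(base::rbind, list2(...))
}

rows <- list(a = 1:2, b = 3:4)

# Splice the list: each element becomes a separate named argument
rbind2(!!!rows, c = 5:6)
#>   [,1] [,2]
#> a    1    2
#> b    3    4
#> c    5    6
```

Splicing and ordinary arguments can be mixed freely in the same call, which do.call() alone does not allow.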
Injecting names with “{”
Dynamic dots also allow you to use an argument name that is stored in a variable. In the case of rbind2(), a wrapper around base::rbind() that collects its arguments with list2(), this makes it possible to use a variable to name the row.
name <- "foo"
rbind2("{name}" := 1:2, bar = 3:4)
#> [,1] [,2]
#> foo 1 2
#> bar 3 4
rbind2("prefix_{name}" := 1:2, bar = 3:4)
#> [,1] [,2]
#> prefix_foo 1 2
#> bar 3 4
29.3.2 Metaprogramming injection
Embracing with {{
The embracing operator is made for dealing with function arguments. “It defuses the expression supplied as argument and immediately injects it in place.” The evaluation usually takes place in the context of a data mask.
Injecting with !!
!! is meant to inject a single object in place. For example, it can inject a data-symbol object stored in an environment variable into a data-masking context to ensure that it is evaluated.
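The following sketch shows the effect of !! on a data-symbol, using eval_tidy() as a stand-in for a data-masking function so that the example depends only on rlang. The names `var` and `e` are illustrative.

```r
library(rlang)

# An environment variable holding a data-symbol that names a column
var <- data_sym("disp")

# Without `!!`, the expression refers to the variable `var` itself
expr(mean(var))
#> mean(var)

# With `!!`, the data-symbol is injected in place
e <- expr(mean(!!var))
e
#> mean(.data$disp)

# Evaluate in a data mask; `.data` is the pronoun for the masked data frame
eval_tidy(e, data = mtcars)
```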
Splicing with !!!
The splice operator !!! can be used in data-masking contexts and inside inject(). For example, rbind2() could be rewritten with inject() so that the function can also use the splice operator and no longer needs do.call().
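One way such a rewrite might look is sketched below. The intermediate variable `rows` is an assumption made for readability.

```r
library(rlang)

# rbind2() without do.call(): splice the collected dots into the call
rbind2 <- function(...) {
  rows <- list2(...)
  inject(base::rbind(!!!rows))
}

# Callers can still splice their own lists thanks to list2()
extra <- list(a = 1:2, b = 3:4)
rbind2(!!!extra, c = 5:6)
#>   [,1] [,2]
#> a    1    2
#> b    3    4
#> c    5    6
```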
29.4 Metaprogramming patterns
This vignette is meant to present more theoretical and advanced patterns than those discussed in Data mask programming patterns.
29.4.1 Forwarding patterns
Defuse and inject
The defuse and inject pattern can be done in either one or two steps. Using the embracing operator and passing the dots is the simpler form. However, sometimes you might want to inspect or modify the expressions before injecting them in the target context. This is made possible by the two-step patterns of enquo() and !! or enquos() and !!!.
- {{ is the combination of enquo() and !!.
- Passing ... is equivalent to the combination of enquos() and !!!.
# One step: embrace
my_summarise <- function(data, var) {
  data %>% dplyr::summarise({{ var }})
}
# Two steps: enquo() and !!
my_summarise <- function(data, var) {
  data %>% dplyr::summarise(!!enquo(var))
}
# One step: pass the dots
my_group_by <- function(.data, ...) {
  .data %>% dplyr::group_by(...)
}
# Two steps: enquos() and !!!
my_group_by <- function(.data, ...) {
  .data %>% dplyr::group_by(!!!enquos(...))
}
Inspecting input labels
Use as_label() or englue() to create an automatic name for one or more defused arguments.
With multiple arguments you can use enquos() and set the .named argument to TRUE to automatically call as_label() on the unnamed inputs, while any names the user provides are kept.
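The labelling behaviour can be checked without any data-masking function. In this sketch, `label_of()` and `names_of()` are hypothetical helpers that only report the labels.

```r
library(rlang)

# Hypothetical helpers that report how inputs would be labelled
label_of <- function(var) as_label(enquo(var))
names_of <- function(...) names(enquos(..., .named = TRUE))

label_of(cyl * 100)
#> [1] "cyl * 100"

# User-supplied names are kept; unnamed inputs get an as_label() default
names_of(cyl, am = am * 100)
#> [1] "cyl" "am"
```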
29.4.2 Names patterns: Symbolize and inject
You can use a symbolize and inject pattern when across(all_of()) is not supported. In this pattern you create defused expressions that refer to column names and then inject them into a data-masking context.
You can cast a string to a symbol with sym() and syms(), which return simple symbols, or with data_sym() and data_syms(), which return calls to $ that subset the .data pronoun. The latter functions can only be used in a tidy eval context.
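The contrast between the two families is easiest to see by printing their results. This is a small sketch using only rlang; eval_tidy() stands in for a tidy eval context.

```r
library(rlang)

# A simple symbol
sym("cyl")
#> cyl

# A call to `$` on the `.data` pronoun
data_sym("cyl")
#> .data$cyl

# The plural variants take a character vector and return a list
vars_plain <- syms(c("cyl", "am"))
vars_data <- data_syms(c("cyl", "am"))
length(vars_data)
#> [1] 2

# Both forms resolve to the same column inside a data mask
identical(
  eval_tidy(sym("cyl"), data = mtcars),
  eval_tidy(data_sym("cyl"), data = mtcars)
)
#> [1] TRUE
```

The `.data$cyl` form can only ever refer to a column, whereas a bare `cyl` symbol falls back to the surrounding environment when no such column exists.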
This pattern can be used to create a group_by() variant that takes a vector of names that is captured by data_syms() and then injected with the splice operator !!!.
my_group_by <- function(data, vars) {
  data %>% dplyr::group_by(!!!data_syms(vars))
}
# Column names supplied as a character vector
vars <- c("cyl", "am")
mtcars %>% my_group_by(vars)
#> # A tibble: 32 × 11
#> # Groups: cyl, am [6]
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
#> 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
#> 4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
#> 5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
#> 6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
#> 7 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4
#> 8 24.4 4 147. 62 3.69 3.19 20 1 0 4 2
#> 9 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2
#> 10 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4
#> # ℹ 22 more rows
29.4.3 Bridge patterns
mutate() can act as a bridge from data masking to selection. This accomplishes the same task as in Data masking patterns - Bridge patterns, but uses enquos() to defuse the inputs and inspect their names.
my_pivot_longer <- function(data, ...) {
  # Defuse the dots and inspect the names
  dots <- enquos(..., .named = TRUE)
  names <- names(dots)
  # Pass the inputs to `mutate()`
  data <- data %>% dplyr::mutate(!!!dots)
  # Select `...` inputs by name with `all_of()`
  data %>%
    tidyr::pivot_longer(cols = all_of(names))
}
mtcars %>% my_pivot_longer(cyl, am = am * 100)
#> # A tibble: 64 × 11
#> mpg disp hp drat wt qsec vs gear carb name value
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
#> 1 21 160 110 3.9 2.62 16.5 0 4 4 cyl 6
#> 2 21 160 110 3.9 2.62 16.5 0 4 4 am 100
#> 3 21 160 110 3.9 2.88 17.0 0 4 4 cyl 6
#> 4 21 160 110 3.9 2.88 17.0 0 4 4 am 100
#> 5 22.8 108 93 3.85 2.32 18.6 1 4 1 cyl 4
#> 6 22.8 108 93 3.85 2.32 18.6 1 4 1 am 100
#> 7 21.4 258 110 3.08 3.22 19.4 1 3 1 cyl 6
#> 8 21.4 258 110 3.08 3.22 19.4 1 3 1 am 0
#> 9 18.7 360 175 3.15 3.44 17.0 0 3 2 cyl 8
#> 10 18.7 360 175 3.15 3.44 17.0 0 3 2 am 0
#> # ℹ 54 more rows
29.4.4 Transformation patterns
It is also possible to recreate my_mean() from Data masking patterns - Transformation patterns. “The pattern consists in defusing the input expression, building larger calls around them, and finally inject the modified expressions inside the data-masking functions.” With ... to take in multiple arguments, you need to use purrr::map() to loop over the arguments to construct the call.
my_mean <- function(.data, ...) {
  # Defuse the dots. Make sure they are automatically named.
  vars <- enquos(..., .named = TRUE)
  # Map over each defused expression and wrap it in a call to `mean()`
  vars <- purrr::map(vars, ~ expr(mean(!!.x, na.rm = TRUE)))
  # Inject the expressions
  .data %>% dplyr::summarise(!!!vars)
}
mtcars %>% my_mean(cyl, mpg)
#> cyl mpg
#> 1 6.1875 20.09062
The difference with the previous version of my_mean() is that the function does not inherit tidy selection helpers and syntax. However, it does gain the ability to create new vectors on the fly as in summarise().
mtcars %>% my_mean(cyl = cyl * 100, mpg)
#> cyl mpg
#> 1 618.75 20.09062
29.5 What are quosures and when are they needed?
A quosure is a special type of defused expression that keeps track of the original context in which the expression was written. The ability to keep track of the original context helps to interface multiple data-masking functions that might come from two unrelated environments, like two different packages.
29.5.1 Blending environments
Consider a function call that uses a function from one package, which uses a function from another package, which in turn is built on data-masking functions in dplyr. This creates a number of different contexts, or environments, that the code has to pass through before it is evaluated. The role of quosures is to ensure that each variable is evaluated in the correct context.
# Function call
dplyr::starwars %>%
  foo::summarise_bmi(mass, div100(height))
# Context 1: global environment of user
div100 <- function(x) {
  x / 100
}
# Context 2: foo package
bmi <- function(mass, height) {
  mass / height^2
}
summarise_bmi <- function(data, mass, height) {
  data %>%
    bar::summarise_stats(bmi({{ mass }}, {{ height }}))
}
# Context 3: bar package
check_numeric <- function(x) {
  stopifnot(is.numeric(x))
  x
}
summarise_stats <- function(data, var) {
  # Context 4: dplyr package
  data %>%
    dplyr::transmute(
      var = check_numeric({{ var }})
    ) %>%
    dplyr::summarise(
      mean = mean(var, na.rm = TRUE),
      sd = sd(var, na.rm = TRUE)
    )
}
# Final expression with quosures identified by ^
dplyr::transmute(
  var = ^check_numeric(^bmi(^mass, ^div100(height)))
)
29.5.2 When should I create quosures?
{{ and dynamic dots create quosures for you, and so tidy eval documentation has moved away from directly discussing quosures.
As a rule of thumb, quosures are only needed for defused arguments that come from another environment (often the user environment), not your own. Local expressions created within a function do not need quosures because there is no exchange of environment. Thus, local expressions can be created with expr(), which does not carry an environment and so does not return a quosure. These expressions can be injected with !! or evaluated with eval() or eval_tidy().
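The rule of thumb can be sketched with a function that mixes both kinds of expression. Here `scaled_mean()` is a hypothetical helper, and eval_tidy() stands in for a data-masking function.

```r
library(rlang)

# Hypothetical helper: only the user's argument needs a quosure
scaled_mean <- function(data, var) {
  # Comes from the caller's environment: defuse as a quosure
  var <- enquo(var)
  # Built locally: a plain expression is enough
  scaled <- expr(mean(!!var) / 100)
  eval_tidy(scaled, data = data)
}

scaled_mean(mtcars, hp)

# The user's expression may refer to objects in the user's environment
offset <- 50
scaled_mean(mtcars, hp + offset)
```

The second call works because the quosure for `hp + offset` remembers where `offset` lives, even though `scaled_mean()` has no object of that name.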
29.5.3 Technical description of quosures
- Quosures are made up of an expression and an environment.
- Quosures are:
  - Callable: evaluation produces a result
  - Hygienic: evaluated in the tracked environment
  - Maskable: can be evaluated in a data mask such that the mask comes first in scope before the quosure environment.
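These properties can be observed on a small quosure. In this sketch, `make_quo()` and `scale_by` are hypothetical names chosen for illustration.

```r
library(rlang)

# Hypothetical factory: the quosure tracks this function's environment
make_quo <- function() {
  scale_by <- 10
  quo(cyl * scale_by)
}
q <- make_quo()

# The two components of a quosure
quo_get_expr(q)
#> cyl * scale_by
is_environment(quo_get_env(q))
#> [1] TRUE

# A different `scale_by` in the calling environment is ignored (hygienic),
# while `cyl` is found in the data mask first (maskable)
scale_by <- 1000
head(eval_tidy(q, data = mtcars), 3)
#> [1] 60 60 40
```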