29  rlang: Metaprogramming

Published

February 17, 2023

Modified

January 8, 2024

29.1 rlang metaprogramming vignettes

29.2 Defusing R expressions

Defusing is the act of capturing code and returning an expression in a tree-like structure that provides a recipe for how to compute the value.

Defuse your own R expressions with expr(); defuse expressions supplied by the user of a function you write with enquo() or enquos(); and evaluate defused expressions with eval() or eval_tidy().

# Return the result of `1 + 1`
1 + 1
#> [1] 2

# Defuse the code and return the expression `1 + 1`
expr(1 + 1)
#> 1 + 1

# Evaluate the defused code and return the result
eval(expr(1 + 1))
#> [1] 2

“The most common use case for defusing expressions is to resume its evaluation in a data mask. This makes it possible for the expression to refer to columns of a data frame as if they were regular objects.”

e <- expr(mean(cyl))
eval(e, mtcars)
#> [1] 6.1875
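A minimal sketch of what "data mask" means for scoping, assuming rlang is attached. The variable `cyl` defined here and the use of the `.env` pronoun are illustrative additions, not part of the original example.

```r
library(rlang)

# An environment-variable that shares its name with a column of `mtcars`
cyl <- 1000

# The data mask comes first in scope, so `cyl` resolves to the column
masked <- eval_tidy(expr(mean(cyl)), data = mtcars)
masked

# The `.env` pronoun bypasses the mask and reads the environment-variable
unmasked <- eval_tidy(expr(mean(.env$cyl)), data = mtcars)
unmasked
```

The first call returns the mean of the column, the second returns the value of the variable defined outside the data frame.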

29.2.1 The booby trap analogy

With lazy evaluation, arguments are like booby traps: they are only evaluated when touched. “Defusing an argument can be seen as defusing the booby trap.” The argument is captured as an expression instead of being evaluated, so the trap is never set off.
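The analogy can be sketched with an argument that would throw an error if it were ever evaluated. The helper names `ignore()`, `capture()`, and `touch()` are hypothetical and exist only for this illustration.

```r
library(rlang)

# `arg` is never touched, so the error is never triggered
ignore <- function(arg) "not touched"
untouched <- ignore(stop("boom"))

# enquo() defuses the argument: the code is captured, not run
capture <- function(arg) enquo(arg)
defused <- capture(stop("boom"))
quo_get_expr(defused)

# Touching the argument sets off the trap
touch <- function(arg) arg
tripped <- tryCatch(
  touch(stop("boom")),
  error = function(e) conditionMessage(e)
)
tripped
```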

29.2.2 Types of defused expressions

There are three basic types of defused expressions. See Wickham, Advanced R, Chapter 18: Expressions for more details.

  1. Calls: calling a function
  2. Symbols: named objects
    • Environment-variable: object defined in the global environment or a function.
    • Data-variable: object is a column in a data frame.
  3. Constants: either NULL or an atomic vector of length 1.

# 1. Create a call representing the computation of the mean of `foo`
expr(mean(foo, na.rm = TRUE))
#> mean(foo, na.rm = TRUE)

# 2. Create a symbol representing objects called `foo`
expr(foo)
#> foo

# 3. Return a constant
expr(1)
#> [1] 1

Another way to create defused expressions is to assemble them from data.

# Assemble a symbol from a string
var <- "foo"
sym(var)
#> foo

# Assemble a call from strings, symbols, and constants
call("mean", sym(var), na.rm = TRUE)
#> mean(foo, na.rm = TRUE)
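As a rough check of the three types, rlang's predicate functions can classify each defused expression. This sketch also uses call2(), rlang's counterpart to base call(), which is not used in the original example.

```r
library(rlang)

# Classify each kind of defused expression
is_call(expr(mean(foo, na.rm = TRUE)))  # a call
is_symbol(expr(foo))                    # a symbol
is_syntactic_literal(expr(1))           # a constant
is_syntactic_literal(expr(NULL))        # NULL also counts as a constant

# Assemble the same call from data with rlang's call2()
built <- call2("mean", sym("foo"), na.rm = TRUE)
built
```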

29.2.3 Defuse and inject

The defuse and inject pattern is to defuse an argument and inject the expression into another function in the context of a data mask. This can be done in two steps with enquo() and !! or in a single defuse-and-inject step with the embrace operator {{. The two-step process can be useful in more complex settings where you need access to the defused expression rather than just passing it on.

# Defuse-and-inject: two steps
my_summarise2 <- function(data, arg) {
  # Defuse the user expression in `arg`
  arg <- enquo(arg)

  # Inject the expression contained in `arg`
  # inside a `summarise()` argument
  data |> 
    dplyr::summarise(mean = mean(!!arg, na.rm = TRUE))
}

# Defuse and inject in a single step with the embracing operator
my_summarise1 <- function(data, arg) {
  data |> 
    dplyr::summarise(mean = mean({{ arg }}, na.rm = TRUE))
}
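One case where the two-step form helps is when the defused expression is needed for something besides injection, such as building a column name. The function `my_summarise3()` and the `mean_` prefix below are hypothetical, and the sketch assumes dplyr is installed.

```r
library(rlang)

my_summarise3 <- function(data, arg) {
  # Step 1: defuse, which gives access to the expression itself
  arg <- enquo(arg)

  # Use the defused expression to build a label
  name <- paste0("mean_", as_label(arg))

  # Step 2: inject both the name and the expression
  dplyr::summarise(data, !!name := mean(!!arg, na.rm = TRUE))
}

res <- my_summarise3(mtcars, cyl)
res
```

The result is a one-row data frame whose single column is named after the user's input.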

29.2.4 Defused arguments and quosures

expr() returns a defused expression, while enquo() returns a quosure, an expression along with an environment. See Section 29.5.
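The difference can be inspected directly, as in the following sketch. The wrapper `capture()` is a hypothetical helper, and the symbols `a` and `b` do not need to exist because nothing is evaluated.

```r
library(rlang)

capture <- function(x) enquo(x)

q <- capture(a + b)  # quosure: expression plus environment
e <- expr(a + b)     # bare expression, no environment attached

is_quosure(q)
is_quosure(e)

# A quosure can be taken apart into its two components
quo_get_expr(q)
quo_get_env(q)
```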

29.3 Injecting with !!, !!!, and glue syntax

There are two main families of injection operators that are used to modify code before R processes it:

  1. Dynamic dots operators: !!! and "{"
  2. Metaprogramming operators: !!, {{, and "{{". Splicing with !!! also works in metaprogramming contexts.

29.3.1 Dots injection

Dynamic dots make ... programmable with injection operators.

Splicing with !!!

You can use list2() to turn ... into dynamic dots. In base R, do.call() is the usual way to pass a list as the set of arguments taken in by .... If the function collects its dots with list2() before handing them to do.call(), the caller can instead splice a list of arguments with !!!.

# Create an rbind function that takes dynamic dots
rbind2 <- function(...) {
  do.call("rbind", list2(...))
}

rows <- list(a = 1:2, b = 3:4)
rbind2(!!!rows, c = 5:6)
#>   [,1] [,2]
#> a    1    2
#> b    3    4
#> c    5    6

Injecting names with “{”

Dynamic dots also allow you to use an argument name that is stored in a variable. In the case of rbind2(), this makes it possible to use a variable to name the row.

name <- "foo"

rbind2("{name}" := 1:2, bar = 3:4)
#>     [,1] [,2]
#> foo    1    2
#> bar    3    4

rbind2("prefix_{name}" := 1:2, bar = 3:4)
#>            [,1] [,2]
#> prefix_foo    1    2
#> bar           3    4
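Both dots operators can be combined in a single call. The sketch below calls list2() directly rather than through rbind2(); the list `extra` and the `_total` suffix are made up for illustration, and name injection assumes the glue package is installed.

```r
library(rlang)

name <- "foo"
extra <- list(a = 2, b = 3)

# Inject a name with "{", splice a list with !!!, and inject another name
args <- list2("{name}" := 1, !!!extra, "{name}_total" := 6)
names(args)
```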

29.3.2 Metaprogramming injection

Embracing with {{

The embracing operator is made for dealing with function arguments. “It defuses the expression supplied as argument and immediately injects it in place.” The evaluation usually takes place in the context of a data mask.
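A small sketch of embracing on both sides of := follows, assuming dplyr is installed. The function name `my_mean_named()` is hypothetical. The "{{" form inside the string injects a label for the argument into the column name, while {{ on the right-hand side injects the expression itself.

```r
my_mean_named <- function(data, var) {
  dplyr::summarise(
    data,
    # "{{" injects the label of `var` into the result name
    "mean_{{ var }}" := mean({{ var }}, na.rm = TRUE)
  )
}

res <- my_mean_named(mtcars, cyl)
res
```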

Injecting with !!

!! is meant to inject a single object in place. For example, it can inject a data-symbol stored in an environment-variable into a data-masking context, so that the column it names is evaluated rather than the variable itself.

var <- data_sym("disp")

mtcars %>%
  dplyr::summarise(avg = mean(!!var, na.rm = TRUE))
#>        avg
#> 1 230.7219

Splicing with !!!

The splice operator !!! can be used in data-masking contexts and inside inject(). For example, rbind2() could be rewritten with inject() so that the function can also use the splice operator and no longer needs do.call().

rbind2 <- function(...) {
  inject(rbind(!!!list2(...)))
}
rbind2(!!!rows, c = 5:6)
#>   [,1] [,2]
#> a    1    2
#> b    3    4
#> c    5    6
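inject() is not tied to rbind() or to data masking. As a minimal sketch, it can splice a list of arguments into any ordinary function call; the list `args` here is an arbitrary example.

```r
library(rlang)

args <- list("a", "b", sep = "-")

# Splice the list into a regular call to paste()
joined <- inject(paste(!!!args))
joined

# The base R equivalent
do.call(paste, args)
```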

29.4 Metaprogramming patterns

This vignette is meant to present more theoretical and advanced patterns than those discussed in Data mask programming patterns.

29.4.1 Forwarding patterns

Defuse and inject

The defuse and inject pattern can be done in either one or two steps. Using the embracing operator and passing the dots is the simpler form. However, sometimes you might want to inspect or modify the expressions before injecting them in the target context. This is made possible by the two-step patterns of enquo() and !! or enquos() and !!!.

  • {{ is the combination of enquo() and !!.
  • Passing ... is equivalent to the combination of enquos() and !!!.

my_summarise <- function(data, var) {
  data %>% dplyr::summarise({{ var }})
}
my_summarise <- function(data, var) {
  data %>% dplyr::summarise(!!enquo(var))
}

my_group_by <- function(.data, ...) {
  .data %>% dplyr::group_by(...)
}
my_group_by <- function(.data, ...) {
  .data %>% dplyr::group_by(!!!enquos(...))
}

Inspecting input labels

Use as_label() or englue() to create an automatic name for one or more defused arguments.

# as_label()
f <- function(var) {
  var <- enquo(var)
  as_label(var)
}

# englue()
f2 <- function(var) {
  englue("{{ var }}")
}

f(cyl)
#> [1] "cyl"

f2(1 + 1)
#> [1] "1 + 1"

With multiple arguments you can use enquos() and set the .named argument to TRUE to automatically call as_label() on the inputs. Names provided by the user take precedence over the automatic ones.

g <- function(...) {
  vars <- enquos(..., .named = TRUE)
  names(vars)
}

# automatic names with as_label
g(cyl, 1 + 1)
#> [1] "cyl"   "1 + 1"

# user provided names
g(x = cyl, y = 1 + 1)
#> [1] "x" "y"

29.4.2 Names patterns: Symbolize and inject

You can use a symbolize and inject pattern when across(all_of()) is not supported. In this pattern defused expressions are created that refer to column names that are then injected into a data-mask context.

You can cast a string to a symbol with sym() and syms(), which return simple symbols, or with data_sym() and data_syms(), which return calls to $ that subset the .data pronoun. The latter functions can only be used in a tidy eval context.

var <- "cyl"
vars <- c("cyl", "am")

sym(var)
#> cyl
syms(vars)
#> [[1]]
#> cyl
#> 
#> [[2]]
#> am

data_sym(var)
#> .data$cyl
data_syms(vars)
#> [[1]]
#> .data$cyl
#> 
#> [[2]]
#> .data$am

This pattern can be used to create a group_by() variant that takes a character vector of column names, converts it with data_syms(), and injects the result with the splice operator !!!.

my_group_by <- function(data, vars) {
  data %>% dplyr::group_by(!!!data_syms(vars))
}

mtcars %>% my_group_by(vars)
#> # A tibble: 32 × 11
#> # Groups:   cyl, am [6]
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1  21       6  160    110  3.9   2.62  16.5     0     1     4     4
#>  2  21       6  160    110  3.9   2.88  17.0     0     1     4     4
#>  3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
#>  4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1
#>  5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
#>  6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1
#>  7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
#>  8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2
#>  9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
#> 10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4
#> # ℹ 22 more rows
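The practical difference between sym() and data_sym() shows up when the string does not name a column. The following sketch assumes dplyr is installed; the variable `not_a_column` is a made-up environment-variable used only to show the fallback.

```r
library(rlang)

var <- "cyl"

# When the column exists, both forms give the same result
with_sym <- dplyr::summarise(mtcars, avg = mean(!!sym(var)))
with_data_sym <- dplyr::summarise(mtcars, avg = mean(!!data_sym(var)))

# A name that is not a column but happens to exist in the environment
missing_col <- "not_a_column"
not_a_column <- 1:3

# sym() silently falls back to the environment-variable
from_env <- dplyr::summarise(mtcars, avg = mean(!!sym(missing_col)))

# data_sym() only looks in the data, so this is an error
outcome <- tryCatch(
  dplyr::summarise(mtcars, avg = mean(!!data_sym(missing_col))),
  error = function(e) "error"
)
outcome
```

Because data_sym() restricts the lookup to the data frame, a misspelled column name fails loudly instead of picking up an unrelated object.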

29.4.3 Bridge patterns

mutate() can serve as a data-mask to selection bridge. The function below accomplishes the same task as in Data masking patterns - Bridge patterns, but uses enquos() to defuse the inputs and inspect their names.

my_pivot_longer <- function(data, ...) {
  # Defuse the dots and inspect the names
  dots <- enquos(..., .named = TRUE)
  names <- names(dots)

  # Pass the inputs to `mutate()`
  data <- data %>% dplyr::mutate(!!!dots)

  # Select `...` inputs by name with `all_of()`
  data %>%
    tidyr::pivot_longer(cols = all_of(names))
}

mtcars %>% my_pivot_longer(cyl, am = am * 100)
#> # A tibble: 64 × 11
#>      mpg  disp    hp  drat    wt  qsec    vs  gear  carb name  value
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
#>  1  21     160   110  3.9   2.62  16.5     0     4     4 cyl       6
#>  2  21     160   110  3.9   2.62  16.5     0     4     4 am      100
#>  3  21     160   110  3.9   2.88  17.0     0     4     4 cyl       6
#>  4  21     160   110  3.9   2.88  17.0     0     4     4 am      100
#>  5  22.8   108    93  3.85  2.32  18.6     1     4     1 cyl       4
#>  6  22.8   108    93  3.85  2.32  18.6     1     4     1 am      100
#>  7  21.4   258   110  3.08  3.22  19.4     1     3     1 cyl       6
#>  8  21.4   258   110  3.08  3.22  19.4     1     3     1 am        0
#>  9  18.7   360   175  3.15  3.44  17.0     0     3     2 cyl       8
#> 10  18.7   360   175  3.15  3.44  17.0     0     3     2 am        0
#> # ℹ 54 more rows

29.4.4 Transformation patterns

It is also possible to recreate my_mean() from Data masking patterns - Transformation patterns. “The pattern consists in defusing the input expression, building larger calls around them, and finally inject the modified expressions inside the data-masking functions.” With ... to take in multiple arguments, you need to use purrr::map() to loop over the arguments to construct the call.

my_mean <- function(.data, ...) {
  # Defuse the dots. Make sure they are automatically named.
  vars <- enquos(..., .named = TRUE)

  # Map over each defused expression and wrap it in a call to `mean()`
  vars <- purrr::map(vars, ~ expr(mean(!!.x, na.rm = TRUE)))

  # Inject the expressions
  .data %>% dplyr::summarise(!!!vars)
}

mtcars %>% my_mean(cyl, mpg)
#>      cyl      mpg
#> 1 6.1875 20.09062

The difference with the previous version of my_mean() is that the function does not inherit tidy selection helpers and syntax. However, it does gain the ability to create new vectors on the fly as in summarise().

mtcars %>% my_mean(cyl = cyl * 100, mpg)
#>      cyl      mpg
#> 1 618.75 20.09062
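To see what is actually injected, the wrapping step can be run on its own. This sketch uses exprs() and base lapply() in place of enquos() and purrr::map() so that it runs outside of a function; that substitution is an assumption made for illustration, and the final step assumes dplyr is installed.

```r
library(rlang)

# Defuse two expressions directly, with automatic names
vars <- exprs(cyl, mpg, .named = TRUE)

# Wrap each expression in a call to `mean()`
calls <- lapply(vars, function(x) expr(mean(!!x, na.rm = TRUE)))
calls

# Splicing the named list creates one summary column per call
res <- dplyr::summarise(mtcars, !!!calls)
res
```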

29.5 What are quosures and when are they needed?

A quosure is a special type of defused expression that keeps track of the original context in which the expression was written. The ability to keep track of the original context helps to interface multiple data-masking functions that might come from two unrelated environments, like two different packages.

29.5.1 Blending environments

The example below shows a function call that uses a function from one package, which uses a function from another package, which, in turn, is built on data-masking functions in dplyr. This creates a number of different contexts, or environments, through which the code needs to pass to be evaluated. The role of quosures is to ensure that each variable is evaluated in the correct context.

# Function call
dplyr::starwars %>%
  foo::summarise_bmi(mass, div100(height))

# Context 1: global environment of user
div100 <- function(x) {
  x / 100
}

# Context 2: foo package
bmi <- function(mass, height) {
  mass / height^2
}

summarise_bmi <- function(data, mass, height) {
  data %>%
    bar::summarise_stats(bmi({{ mass }}, {{ height }}))
}

# Context 3: bar package
check_numeric <- function(x) {
  stopifnot(is.numeric(x))
  x
}

summarise_stats <- function(data, var) {
  # Context 4: dplyr package
  data %>%
    dplyr::transmute(
      var = check_numeric({{ var }})
    ) %>%
    dplyr::summarise(
      mean = mean(var, na.rm = TRUE),
      sd = sd(var, na.rm = TRUE)
    )
}

# Final expression with quosures identified by ^
dplyr::transmute(
  var = ^check_numeric(^bmi(^mass, ^div100(height)))
)

29.5.2 When should I create quosures?

{{ and dynamic dots create quosures for you, and so tidy eval documentation has moved away from directly discussing quosures.

As a rule of thumb, quosures are only needed for defused arguments that come from another environment (often the user's environment), not your own. Local expressions created within a function do not need quosures because no environment is exchanged. They can therefore be created with expr(), which does not carry an environment and so does not return a quosure. These expressions can be injected with !! or evaluated with eval() or eval_tidy(). In the examples below, {{ still wraps the user's argument in a quosure inside the local expression, which is why eval_tidy() is used rather than eval().

my_mean <- function(data, var) {
  # `expr()` is sufficient
  expr <- expr(mean({{ var }}))
  dplyr::summarise(data, !!expr)
}
my_mean(mtcars, cyl)
#>   mean(cyl)
#> 1    6.1875

my_mean <- function(data, var) {
  expr <- expr(mean({{ var }}))
  eval_tidy(expr, data)
}
my_mean(mtcars, cyl)
#> [1] 6.1875

29.5.3 Technical description of quosures

  • Quosures are made up of an expression and an environment.
  • Quosures are:
    • Callable: evaluation produces a result
    • Hygienic: evaluated in the tracked environment
    • Maskable: can be evaluated in a data mask such that the mask comes first in scope before the quosure environment.
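These three properties can be observed with a small sketch. The function `make_quo()` and the values of `x` are hypothetical and chosen only to make the scoping visible.

```r
library(rlang)

make_quo <- function() {
  x <- "local to make_quo()"
  quo(x)
}

# A different `x` exists where the quosure is evaluated
x <- "global"
q <- make_quo()

# Callable and hygienic: evaluation uses the tracked environment
hygienic <- eval_tidy(q)
hygienic

# Maskable: a data mask comes first in scope
masked <- eval_tidy(q, data = list(x = "from the mask"))
masked
```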