Constant Propagation

Background

Constant propagation is the process of substituting the values of known constants in expressions. Constant propagation eliminates cases in which values are copied from one location or variable to another, in order to simply assign their value to another variable.

For example, consider the following code:

x <- 14
y <- 7 - x / 2
z <- y * (28 / x + 2) - x

Here, x is assigned a constant, and thus, can be propagated (three times). Propagating yields:

x <- 14
y <- 7 - 14 / 2
z <- y * (28 / 14 + 2) - 14

Constant propagation enables the code to assign static values, which is faster than looking up and copying the value of a variable, and also saves time by eliminating assigning a value to a variable that is itself subsequently used only to propagate that value throughout the code. In some cases, copy propagation itself may not provide direct optimizations, but simply facilitates other transformations, such as constant folding, code motion, and dead code elimination.

Example

A simple example would be a code that converts the unit of many temporary samples, from hours to miliseconds miliseconds <- 1000 * 60 * 60 * hours.

code <- paste(
  "n <- 1000",
  "hours_vector <- runif(n, 0, 24)",
  "ms_vector <- numeric(n)",
  "hs_to_mins <- 60",
  "mins_to_secs <- 60",
  "secs_to_ms <- 1000",
  "# of course it would be much efficient to do vectorized operations xP",
  "for (i in 1:n) {",
  "  ms_vector[i] <- secs_to_ms * mins_to_secs * hs_to_mins * hours_vector[i]",
  "}",
  sep = "\n"
)
cat(code)
## n <- 1000
## hours_vector <- runif(n, 0, 24)
## ms_vector <- numeric(n)
## hs_to_mins <- 60
## mins_to_secs <- 60
## secs_to_ms <- 1000
## # of course it would be much efficient to do vectorized operations xP
## for (i in 1:n) {
##   ms_vector[i] <- secs_to_ms * mins_to_secs * hs_to_mins * hours_vector[i]
## }

Then, the automatically optimized code would be:

opt_code <- opt_constant_propagation(list(code))
cat(opt_code$codes[[1]])
## n <- 1000
## hours_vector <- runif(n, 0, 24)
## ms_vector <- numeric(n)
## hs_to_mins <- 60
## mins_to_secs <- 60
## secs_to_ms <- 1000
## # of course it would be much efficient to do vectorized operations xP
## for (i in 1:n) {
##   ms_vector[i] <- 1000 * 60 * 60 * hours_vector[i]
## }

And if we measure the execution time of each one, and the speed-up:

bmark_res <- microbenchmark({
  eval(parse(text = code))
}, {
  eval(parse(text = opt_code))
})
autoplot(bmark_res)

plot of chunk unnamed-chunk-6

speed_up(bmark_res)
##            Min. 1st Qu.   Median    Mean  3rd Qu.     Max.
## Expr_2 29.33842 29.0092 27.40792 27.9919 28.49016 29.41789

Implementation

The opt_constant_propagation optimizer analyzes the code from top to bottom. As it goes through the code, it performs two tasks:

Depending on what expression it finds, it applies one of the following criteria:

VAR <- CONST

If it is an assignment of a constant value: then it keeps the same expression, and saves VAR = CONST in values.

This criteria also includes, multi-assign ( VAR <- VAR <- CONST ).

VAR * CONST + VAR

If it is only operators, variables, constants, and precedence operators: then it will check if any of the variables is stored in values vector, and would replace them in the expression. It will return the modified expression.

FUN({FUN_PARAMS})

If it is a function call, and opt_constant_propagation's in_fun_call parameter is set to TRUE: then it will try to constant propagate in {FUN_PARAMS}. Consider the case that the function call is seq_len(x+30), it could be replaced by seq_len(-3+30).

Then, it empties the values vector ( values <- c() ). It should be noted that in R, calling a function can modify the current environment, being the simplest example rm(list = ls()), or assign("x", 4).

LOOP (COND) { BODY }

If it is a loop ( repeat, while, or for ): it will get which variables are being assigned in the loop and remove them from values, and then propagate in COND and BODY.

In-loop assigned variables are removed from values as, if not, it would try to propagate them, and these variables might change in next execution of BODY. For instance:

x <- 3
y <- 1 + 1
while (y < 5) {
  y <- x
  x <- x + 1
}

Would try to propagate to:

x <- 3
y <- 1 + 1
while (y < 5) {
  y <- 3
  x <- 3 + 1
}

which is not equivalent.

IF (COND) { BODY } ELSE { BODY }

If it is an if: it will propagate values in COND and BODY, then it will get which variables are being assigned in the if/else BODY and remove them from values.

In-loop assigned variables are removed from values, as it is possible that the BODY is never executed (FALSE condition), and thus, these variables would not be updated or assigned.

VAR <- EXPR

If it is an assignment (of a non constant value): it will keep VAR <- and try to propagate on EXPR. Moreover, it will remove VAR from values as it is assigned a non-constant value.

FUN ({FUN_PARAMS}) { BODY }

If it is a function definition: it will propagate on the BODY but with a new empty values vector (the global values vector would not be changed). Note that in R, a function definition will have another environment, and thus it should not use the current values.

Other cases

In any other case, it should keep propagating on sub-expressions.

To-Do