Wednesday, May 30, 2018

Use R Environment Object to Share and Modify (Big) Data Within Functions

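An R environment object can be used much like a list: values are stored and retrieved by name with $.  One advantage is that an environment has reference semantics.  When it is passed to a function and modified there, no copy is made, so the change is visible to the caller.  This makes it a convenient container for large data that several functions need to read and update.  Refer to Environment for a more detailed explanation.  An example follows below.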
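To make the contrast with a list concrete, here is a minimal sketch.  The helper add_n and the names lst and env are hypothetical and only serve the illustration.  The same assignment is applied to a list and to an environment; only the environment keeps the change after the function returns.

```r
## Hypothetical helper: adds an element `n` to whatever container it receives.
add_n <- function(x) {
  x$n <- 1L
  invisible(NULL)
}

lst <- list()
env <- new.env(parent = emptyenv())

add_n(lst)   # the list is copied on modification; the caller's lst is untouched
add_n(env)   # the environment is modified in place

## The list is still empty.
length(lst)
## [1] 0

## The environment now holds n.
exists("n", envir = env, inherits = FALSE)
## [1] TRUE
```

To get the same effect with a list, the function would have to return the modified list and the caller would have to reassign it.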


## Create an environment to hold the data.  It has reference semantics,
## so functions can modify it in place.
dat <- new.env(parent = emptyenv())

## Put in some data
dat$iris <- iris
dat$cars <- cars

## Checking the original object. 
ls(dat)                                
## [1] "cars" "iris"

## An example function: stores the combined row count in d$N as a side effect.
f <- function(d) d$N <- nrow(d$iris) + nrow(d$cars)

## Passing the data by reference: It can be modified inside the function.
f(dat)

## Checking the modified object. 
ls(dat)                                
## [1] "cars" "iris" "N"   
dat$N
## [1] 200
identical(dat$N, nrow(dat$cars) + nrow(dat$iris))
## [1] TRUE

## In practice, we can call f1(dat), f2(dat), f3(dat), ... in sequence
## to successively update the information/state maintained in "dat".
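
The f1(dat), f2(dat), ... pattern mentioned above could look like the following sketch.  The names state, add_counts and add_total are illustrative assumptions, not part of the original example.  Each step reads what earlier steps wrote into the shared environment and adds its own result.

```r
## Shared state; all names here (state, add_counts, add_total) are illustrative.
state <- new.env(parent = emptyenv())
state$iris <- iris
state$cars <- cars

## Step 1: record the row count of each data set.
add_counts <- function(d) {
  d$n_iris <- nrow(d$iris)
  d$n_cars <- nrow(d$cars)
  invisible(d)
}

## Step 2: derive a total from the values written by step 1.
add_total <- function(d) {
  d$N <- d$n_iris + d$n_cars
  invisible(d)
}

add_counts(state)
add_total(state)

state$N
## [1] 200
```

Returning invisible(d) is a design choice rather than a requirement: it lets the steps be nested, as in add_total(add_counts(state)), while the updates still happen in place.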
