I am applying a user-defined function to individual cells of a 3D array. The contents of each cell are one of the following possibilities, all of which are character vectors because of prior formatting:

"N"
"A"
""
"1"
"0"

I want to create a new 3D array of the same dimensions, where cells contain either NA or a numeric vector containing 1 or 0. Thus, I wrote a function named Numericize and used aaply to apply it to the entire array. However, it takes forever to apply it.

Numericize <- function(x){
  if(!is.na(x)){
    x[x=="N"] <- NA; x
    x[x=="A"] <- NA; x
    x[x==""] <- NA; x
    x <- as.integer(x)
  }
  return(x)
}

The dimensions of the original array are 480x866x366, and the following call takes forever to run:

Final.Daily.Array <- aaply(.data = Complete.Daily.Array,
                           .margins = c(1,2,3),
                           .fun = Numericize,
                           .progress = "text")

I am unsure if the speed issue comes from an inefficient Numericize, an inefficient aaply, or something else entirely. I considered trying to set up parallel computing using the plyr package but I wouldn't think that such a simple command would require parallel processing.

On one hand I am concerned that I created a stack overflow for myself (see this for more), but I have applied other functions to similar arrays without problems.

ex.array <- array(dim = c(3,3,3))

ex.array[,,1] <- c("N","A","","1","0","N","A","","1")
ex.array[,,2] <- c("0","N","A","","1","0","N","A","")
ex.array[,,3] <- c("1","0","N","A","","1","0","N","A")

desired.array <- array(dim = c(3,3,3))

desired.array[,,1] <- c(NA,NA,NA,1,0,NA,NA,NA,1)
desired.array[,,2] <- c(0,NA,NA,NA,1,0,NA,NA,NA)
desired.array[,,3] <- c(1,0,NA,NA,NA,1,0,NA,NA)

ex.array
desired.array

Any suggestions?

  • Maybe just array(as.numeric(ex.array), dim = dim(ex.array)), because identical(array(as.numeric(ex.array), dim = dim(ex.array)), desired.array) is TRUE. Commented Sep 18, 2017 at 19:10
  • Thank you for this elegantly simple answer. This approach comes with a warning that NA values are introduced by coercion but that is to be expected based on what goes in and what is expected to come out. Commented Sep 18, 2017 at 19:18
  • Yep, I'd ignore that warning (or rather take note of it, because oftentimes it's valuable). Commented Sep 18, 2017 at 19:20

1 Answer

You can just use a vectorized approach:

ex.array[ex.array %in% c("", "N", "A")] <- NA
storage.mode(ex.array) <- "integer"

The second line alone is also enough: any string that cannot be parsed as an integer ("N", "A", "") is coerced to NA, with a warning that NAs were introduced by coercion.
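
As a minimal sketch of the vectorized approach, the two lines can be wrapped in a small helper and tried on the example array from the question. The function name numericize_array and its na_strings argument are hypothetical, introduced here only for illustration. The likely reason this is so much faster is that aaply with .margins = c(1,2,3) calls the function once per cell (480 x 866 x 366 is roughly 152 million calls), whereas the indexing below handles all cells in one pass.

```r
# Rebuild the small example array from the question (filled column-major).
ex.array <- array(c("N", "A", "",  "1", "0", "N", "A", "",  "1",
                    "0", "N", "A", "",  "1", "0", "N", "A", "",
                    "1", "0", "N", "A", "",  "1", "0", "N", "A"),
                  dim = c(3, 3, 3))

# Hypothetical helper (not from the original answer): convert a character
# array to an integer array in one vectorized pass.
numericize_array <- function(a, na_strings = c("", "N", "A")) {
  # Logical index over every cell at once; no per-cell function call.
  a[a %in% na_strings] <- NA
  # Change the storage type while keeping the dim attribute intact.
  # The marker strings are already NA here, so no coercion warning is raised.
  storage.mode(a) <- "integer"
  a
}

result <- numericize_array(ex.array)
dim(result)     # same dimensions as the input
typeof(result)  # storage type after conversion
```

Replacing the marker strings explicitly before converting keeps the coercion warning meaningful: if it does appear, the array contains some value other than the ones listed in na_strings.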
