I have a data frame with 18 columns. Columns 2 to 13 contain numeric values such as 0, 1, 2, 4, ... I want to recode them by range into three categories:

if a value in columns 2:13 is 0 -> 0
if a value in columns 2:13 is between 1 and 5 -> 1
otherwise -> 2

My attempt works, but it is not efficient:

df[,2:13][df[,2:13] == 1 | df[,2:13] == 2 | df[,2:13] == 3 | df[,2:13] == 4 | df[,2:13] == 5] <- 1

I appreciate your help.

1 Answer

Try findInterval:

dplyr

library(dplyr)
df %>%
  mutate(
    across(2:13, ~ findInterval(., c(0, 1, 5), rightmost.closed = TRUE) - 1L)
  )
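
To see why the breakpoints c(0, 1, 5) with rightmost.closed = TRUE give the requested mapping, here is a minimal base R sketch on a plain vector. The sample values in x are made up for illustration and are not from the question's data; it also assumes non-negative whole numbers, as in the question.

```r
# Hypothetical sample values (not the asker's data)
x <- c(0, 1, 3, 5, 6, 12)
breaks <- c(0, 1, 5)

# findInterval() returns the index i with breaks[i] <= x < breaks[i + 1].
# rightmost.closed = TRUE makes the last interval [1, 5] closed, so 5
# falls in the same interval as 1 to 4 instead of starting a new one.
idx <- findInterval(x, breaks, rightmost.closed = TRUE)

# Shift the 1-based interval index down to the 0/1/2 codes
recoded <- idx - 1L
print(recoded)  # 0 1 1 1 2 2
```

Note that a negative input would come out as -1 under this scheme, and a value strictly between 0 and 1 would be coded 0, so it is worth checking the data first if either can occur.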

If this gets any more complex (such as non-consecutive recoded values), we might switch to case_when:

df %>%
  mutate(
    across(2:13, ~ case_when(
      . == 0           ~ 0L,
      between(., 1, 5) ~ 1L,
      TRUE             ~ 2L
    ))
  )

base R

df[,2:13] <- lapply(df[,2:13], function(z) findInterval(z, c(0, 1, 5), rightmost.closed = TRUE) - 1L)
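
A small self-contained check of the base R approach, as a sketch only. The toy data frame, its column names (id, v1, v2, v3), and the helper name recode_range are all hypothetical. It uses three value columns in positions 2:4 rather than the question's twelve in 2:13.

```r
# Hypothetical toy data: an id column plus three numeric columns
df <- data.frame(
  id = c("a", "b", "c", "d"),
  v1 = c(0, 1, 5, 6),
  v2 = c(2, 0, 9, 4),
  v3 = c(7, 3, 0, 5)
)

cols <- 2:4  # would be 2:13 for the data frame in the question

# Helper (name is illustrative): 0 -> 0, 1..5 -> 1, above 5 -> 2
recode_range <- function(z) {
  findInterval(z, c(0, 1, 5), rightmost.closed = TRUE) - 1L
}

# lapply() recodes each selected column; assigning back keeps the rest intact
df[cols] <- lapply(df[cols], recode_range)
print(df)
```

Because findInterval is vectorised, each column is recoded in a single call, which avoids the chain of == comparisons in the original attempt.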

2 Comments

Awesome! Thank you so much. I was not aware of findInterval function.
It's very similar to cut, which is useful when you want strings/labels back (including labels that look like number ranges).
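
For comparison, a minimal sketch of the cut alternative mentioned in the comment. The sample vector and the label strings are made up for illustration, and the breaks assume whole-number inputs as in the question.

```r
# Hypothetical sample values (not the asker's data)
x <- c(0, 1, 3, 5, 6, 12)

# cut() uses right-closed intervals by default: (-Inf, 0], (0, 5], (5, Inf)
groups <- cut(
  x,
  breaks = c(-Inf, 0, 5, Inf),
  labels = c("0", "1-5", ">5")
)

# cut() returns a factor; convert to character if plain strings are wanted
labels <- as.character(groups)
print(labels)  # "0" "1-5" "1-5" "1-5" ">5" ">5"
```

Unlike the findInterval version, these breaks would place a fractional value such as 0.5 in the "1-5" group, so the two only agree for whole numbers.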
