2

A relatively trivial question that has been bothering me for a while, but to which I have not yet found an answer - perhaps because I have trouble verbalizing the problem for search engines.

Here is a column of a data frame that contains identifiers.

data <- data.frame("id" = c("D78", "L30", "F02", "A23", "B45", "T01", "Q38", "S30", "K84", "O04", "P12", "Z33")) 

Based on a lookup table, outdated identifiers are to be recoded into new ones. Here is an example look up table.

recode_table <- data.frame("old" = c("A23", "B45", "K84", "Z33"),
                           "new" = c("A24", "B46", "K88", "Z33"))

What I need now can be done with a merge or a loop. Here a loop example:

for(ID in recode_table$old) {
  data[data$id == ID, "id"] <- recode_table[recode_table$old == ID, "new"]
}

But I am looking for a dplyr solution without having to use the " join" family. I would like something like this.

    data <- mutate(data, id = ifelse(id %in% recode_table$old, filter(recode_table, old == id) %>% pull(new), id))

Obviously though, I can't use the column name ("id") of the table in order to identify the new ID. References to corresponding passages in documentations or manuals are also appreciated. Thanks in advance!

2 Answers 2

6

You can use recode with unquote splicing (!!!) on a named vector

library(dplyr)

# vector of new IDs
recode_vec <- recode_table$new
# named with old IDs
names(recode_vec) <- recode_table$old

data %>% 
  mutate(id = recode(id, !!!recode_vec))
#     id
# 1  D78
# 2  L30
# 3  F02
# 4  A24
# 5  B46
# 6  T01
# 7  Q38
# 8  S30
# 9  K88
# 10 O04
# 11 P12
# 12 Z33
Sign up to request clarification or add additional context in comments.

Comments

0

Using left_join()

data |> 
  left_join(recode_table, by = join_by(id == old)) |>
  mutate(id = coalesce(new, id), new = NULL)

# id
# 1  D78
# 2  L30
# 3  F02
# 4  A24
# 5  B46
# 6  T01
# 7  Q38
# 8  S30
# 9  K88
# 10 O04
# 11 P12
# 12 Z33

Converting to vector (as Ric did) and using subsetting:

recode_vec <- with(recode_table, setNames(new, old))
data |> 
  mutate(id = coalesce(recode_vec[id], id))

Same as above but with only base R

update_rows <- data$id %in% recode_vec
data$id[update_rows] <- recode_vec[data$id[update_rows]]

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.