I need help with a large wide-format data set about illnesses. It covers 54 different illnesses, each with a block of 18 questions (the data are nested within illnesses and participants).
Since the same variables/questions are repeated for each illness, I am looking for a fast way to calculate rowMeans for the scales (maybe with a loop).
So basically I have the variables epi_scm.1 - epi_scm.18, ms_scm.1 - ms_scm.18, autism_scm.1 - autism_scm.18 and so on (the beginning of the column name indicates the illness and the end indicates the item number). I need to calculate the rowMeans of the items for each illness, e.g. Morality = rowMeans(df4[, c("epi_scm.1", "epi_scm.2", ...)]), but I do not want to do that manually for every illness, as there are many.
Do you know how to do this more efficiently? (I hope it is clear what I mean.)
Thanks and best regards!
L
I tried to subset each illness, but that takes too much time and isn't really suitable for my main analyses:
# Subset data to include only 42 columns
subset_epi <- df4[, 1:42] # Replace 1:42 with the indices or column names of the columns I want to keep
# Keep only participants without missing values in these columns
subset_epi <- subset_epi[complete.cases(subset_epi), ]
# Reset the row names of the subset
rownames(subset_epi) <- seq_len(nrow(subset_epi))
dim(subset_epi)
# New variable Morality = mean score for Morality
# (na.omit() inside rowMeans() can drop rows and break the assignment; na.rm = TRUE is the safer equivalent here)
subset_epi$Morality <- rowMeans(subset_epi[, c("epi_scm_1", "epi_scm_2", "epi_scm_3", "epi_scm_4", "epi_scm_5")], na.rm = TRUE)
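For illustration, here is a rough sketch of the kind of loop I have in mind, run on made-up toy data instead of my real df4. It assumes the columns follow the pattern <illness>_scm.<item> (with a dot; my attempt above uses an underscore, so the separator would need adjusting) and that Morality is made up of items 1 to 5. The scales list and the output names such as epi_Morality are hypothetical placeholders, not my actual coding scheme.

```r
# Toy data standing in for df4: 2 illnesses x 18 items, 5 participants (made-up values)
set.seed(1)
illnesses <- c("epi", "ms")
item_cols <- as.vector(outer(illnesses, 1:18,
                             function(p, i) paste0(p, "_scm.", i)))
df4 <- as.data.frame(matrix(sample(1:7, 5 * length(item_cols), replace = TRUE),
                            nrow = 5, dimnames = list(NULL, item_cols)))

# Hypothetical item-to-scale mapping; further scales would be added here
scales <- list(Morality = 1:5)

# Derive the illness prefixes from the column names instead of typing them out
scm_cols <- grep("_scm\\.[0-9]+$", names(df4), value = TRUE)
prefixes <- unique(sub("_scm\\.[0-9]+$", "", scm_cols))

# One new column per illness and scale, e.g. epi_Morality, ms_Morality
for (p in prefixes) {
  for (s in names(scales)) {
    cols <- paste0(p, "_scm.", scales[[s]])
    df4[[paste0(p, "_", s)]] <- rowMeans(df4[, cols, drop = FALSE], na.rm = TRUE)
  }
}
```

With na.rm = TRUE the mean is taken over whatever items a participant answered; na.rm = FALSE would instead give NA for anyone with a missing item, which is closer to what complete.cases() does in my attempt.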