I want to plot the SHAP values for my RSF model; here is the code and error:
xvars <- c("RIDRETH1", "RXDLIPID", "DRXTKCAL", "DRXTPROT", "DRXTCARB", "DRXTCHOL", "DRXTFIBE", "DRXTVARA", "DRXTATOC", "DRXTSODI", "DRXTPOTA", "DRXTM161", "DRXTM181", "DRXTM201", "DRXTM221", "DRXTP182", "DRXTP183", "DRXTP184", "DRXTP204", "DRXTP205", "DRXTP225", "DRXTP226", "DRXTRET", "DRXT_G_TOTAL", "DRXT_V_STARCHY_TOTAL", "DRXTS160", "DRXTS180", "DRXTsumSFA", "INDFMPIR", "LBXCOT", "GENDERRC")
X <- Data[sample(nrow(Data), 1000), xvars]
bg_X <- Data[sample(nrow(Data), 200), ]
system.time(
ks <- kernelshap(rf_mort_nutrients_withoutage_1018_all, X, bg_X = bg_X, type = 'prob')
)
ks
ks <- shapviz(ks)
sv_importance(ks, kind = "bee")
Error:
Error in align_pred(pred_fun(object, bg_X, ...)) : Predictions must be numeric!
Timing stopped at: 0.03 0.05 0.11
These are my predictions:
rf_mort_nutrients_withoutage_1018_all$predicted
[1] 81.31376 75.82491 99.35944 58.63055 67.65847 98.32906 75.33934 107.81604 62.22175 75.69875 69.99881 83.67161 81.39735 65.59381
I am not sure why it is not working. Does anyone have an idea?


What does predict(rf_mort_nutrients_withoutage_1018_all, X, type = 'prob') return? You should get numeric output. Without a reproducible example, we won't be able to help.

Sample size of test (predict) data: 1000
Number of grow trees: 200
Average no. of grow terminal nodes: 35.705
Total no. of grow variables: 31
Resampling used to grow trees: swor
Resample size used to grow trees: 37243
Analysis: RSF
Family: surv

rf_mort_nutrients <- rfsrc(Surv(endage, mortality_status) ~ . , data = data, ntree = 200, nodesize = 1000, importance = TRUE)

I would like to plot the SHAP values to understand which variables contribute most to mortality and in which direction, i.e., how a feature's value alters the prediction. Is there a method compatible with random survival forests that does this?
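
One likely cause of the error: predict() on an rfsrc model returns a prediction object (a list), not a numeric vector, so kernelshap's default pred_fun has nothing numeric to work with. A minimal sketch of a workaround is to pass a custom pred_fun that extracts the $predicted component, which for survival forests is the ensemble mortality. The sketch below is an assumption-laden illustration, not a tested fix for the data above: it uses the veteran data shipped with randomForestSRC as a stand-in, the helper name pred_mortality is hypothetical, and the type = 'prob' argument is dropped on the assumption that it is not needed for a survival forest.

```r
library(randomForestSRC)
library(kernelshap)
library(shapviz)

set.seed(1)

# Stand-in data: the veteran lung cancer data from randomForestSRC
data(veteran, package = "randomForestSRC")
fit <- rfsrc(Surv(time, status) ~ ., data = veteran, ntree = 100)

# Keep only the feature columns in both the rows to explain and the background
xvars <- setdiff(names(veteran), c("time", "status"))
X     <- veteran[sample(nrow(veteran), 50), xvars]
bg_X  <- veteran[sample(nrow(veteran), 30), xvars]

# Hypothetical helper: return a plain numeric vector (ensemble mortality)
# instead of the full rfsrc prediction object
pred_mortality <- function(model, newdata, ...) {
  predict(model, newdata = newdata)$predicted
}

# Kernel SHAP on the mortality scale, then a beeswarm importance plot
ks <- kernelshap(fit, X = X, bg_X = bg_X, pred_fun = pred_mortality)
sv <- shapviz(ks)
sv_importance(sv, kind = "bee")
```

With this setup, positive SHAP values push the predicted mortality up and negative values push it down, so the beeswarm plot shows both the importance and the direction of each variable. For the model in the question, the same pattern would apply with the model object, xvars, and Data swapped in; with 31 features and 1000 rows, expect a considerably longer run time.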