709 questions
5
votes
1
answer
107
views
step_rename does not work like dplyr::rename
With dplyr::rename I can rename columns if they exist:
library(dplyr)
df <- data.frame(a_old = 1:3, b_new = 11:13)
lkp <- c(a_new = "a_old", b_new = "b_old")
df %>% rename(...
-1
votes
2
answers
62
views
Recipe steps on response cannot be replicated with the test data set
I defined a workflow_set with a couple of different models with a common recipe. The recipe is created via the following function (irrelevant steps removed):
library(tidymodels)
preprocess_data <- ...
0
votes
0
answers
80
views
tidymodels lightgbm hyperparameters training issue
I have an issue with training lightgbm models through tidymodels.
There seems to be some sort of issue in how the hyperparameters are translated between tidymodels and lightgbm.
This is my code:
...
1
vote
1
answer
98
views
Parsnip::set_engine("xgboost") - passing direct arguments vs list
I am training an xgboost model using parsnip::set_engine() in which I am trying to pass additional arguments to the engine.
I have tried both
model_list$xgboost <- parsnip::boost_tree(
trees ...
0
votes
0
answers
45
views
Stratified (group-wise) imputation with tidymodels::recipes
In my application, the data-generating process requires stratified handling, as the data was sampled within known strata (e.g., by country), and each stratum is assumed to follow a structurally ...
1
vote
1
answer
59
views
How to prevent new factor being coerced to NA (warning) when predicting unlabelled data using tidymodels?
Context
I used the tidymodels package to build a model following the steps in the example below. I would like to use the trained model to predict values for new, unlabelled data (object df_nolabel in ...
0
votes
1
answer
95
views
`future` and `rsample` for parallel bootstrapping
I am trying to use future parallelization via furrr or future.apply in combination with the rsample package to bootstrap the estimates of a model. Sequential estimation works. Parallel estimation with ...
0
votes
1
answer
42
views
Tidymodels: Use step_dummy() for multiple binary classifications?
I am a little bit lost in tidymodels. I have some data from topic modeling:
prevalent_topic: factor variable with most prevalent topic, ranging from "Topic_1" to "Topic_5"
value1 ...
0
votes
0
answers
51
views
TidySDM non-conformable arguments model evaluation error in Ecological Niche Model with subsetted data
I am making ecological niche models in TidySDM in R. I recently made a range-wide model using 5-fold random cross-validation, grid set to 100, and algorithms that included GLM, GBM, MAXENT, and MARS. ...
1
vote
1
answer
57
views
R tidymodels: how to pass argument `initial` using workflow_map()?
I am trying to pass the results from tune_grid() into the argument initial of tune_bayes(). This works when calling tune_bayes() directly (see example in ?tune_bayes). However, I don't see how I can ...
0
votes
1
answer
73
views
jackknife (leave-one-out) cross validation with tidySDM and SpatialSample package in R
I am working with a very low sample size (10-15) trying to create an ensemble SDM with the package TidySDM. Cross-validation won't work with the random k-fold method and in these cases of low sample ...
1
vote
1
answer
132
views
Error Downloading Chelsa Climate Data with pastclim
I attempted to download the current chelsa 2.1 climate variables using pastclim. However, I keep getting the error:
Error: buildvrt: options error.
terra, gdal, and pastclim are up-to-date. Is this ...
1
vote
1
answer
101
views
Problem running classification Gam models with Additive Function
I am having a problem running a classification (binary outcome) gam model using the additive package in combination with tidymodels. I have tried to follow the creator of additive's GetStarted.Rmd ...
1
vote
1
answer
93
views
Setting maxent parameters manually within TidySDM Package in R
Within the TidySDM package, you can define what models you wish to include in your ensemble with the "workflow_set" function. When including a maxent model you can use the argument "...
0
votes
1
answer
74
views
Why does tune_grid find character variables instead of factors?
Below is a self-contained code example.
I have test_data with character columns (name, id, gender)
I convert them all to factors
I mark name and id as "informational" (i.e. not to be used ...
1
vote
1
answer
65
views
Why is step_cut from R's recipes package complaining about a factor in my data frame?
This may be something simple I'm missing. I'm new to recipes.
Below is some code that tries to apply recipes::step_cut to two variables: xpo, then xpr. The prep after the second step_cut errors out, ...
0
votes
1
answer
97
views
How to deploy R tidymodels ML model to GCP
I want to deploy my Tidymodels ML model to GCP so it can serve up predictions to others.
I am following along with this video from Julia Silge where she uses Vetiver and Docker to deploy to RStudio ...
0
votes
2
answers
115
views
How to use column names in a list of values in a loop?
The code below is what I am currently using. It will take a tibble of population data and join a separate tibble with three different variables. I then am only running the process of taking three ...
1
vote
1
answer
75
views
What is the difference between setting the engine parameter in linear_reg() implicitly vs. explicitly?
I am doing a course on linear regression with R using the tidymodels package and the sample code for specifying the linear regression model is:
linear_model_specification <- linear_reg() %>%
...
0
votes
1
answer
39
views
Tidymodel grid search for semi-supervised algorithms
I need to use tidymodels to perform a grid search of hyperparameters for a few semi-supervised algorithms as implemented in the package SSLR https://dicits.ugr.es/software/SSLR/index.html.
Let us take ...
1
vote
1
answer
71
views
How to get the training error from fit_resamples and hyperparameter tuning?
During cross-validation, fit_resamples returns the average of the metric from the validation set.
lr_model <-
linear_reg() |>
set_engine('lm')
lr_wf <-
workflow() |>
add_recipe(...
0
votes
0
answers
90
views
SVM tuning process: Caused by error in `vectbl_recycle_rhs_rows()`:
I don't understand the output of the tuning process of the SVM model.
Here is my code.
basic_recipe <-
recipe(target ~
loan_type + New_versus_Repeat +
Total_Amount + ...
1
vote
1
answer
147
views
Access to Lasso coefficients using Tidymodels
I'm trying to follow along with one of Julia Silge's blog posts about lasso regression using tidymodels; however, I am getting stuck when trying to access the coefficients.
I wish to plot something ...
1
vote
1
answer
135
views
tidymodels workflow errors when trying to use workflow_map()
Thanks for your help in advance! I'm new to tidymodels (and modeling in general) and am having a hard time identifying what's going wrong so I can troubleshoot my workflow setup.
I'm running four different ...
0
votes
2
answers
62
views
View which were the nearest neighbors in tidymodel workflow
I have a question similar to this one: How to view the nearest neighbors in R?
Code example:
library(tidymodels)
knn_rec <- recipe(Species ~ ., data = iris)
knn_lookup <- workflow() %>%
...
0
votes
1
answer
114
views
How can I fit and predict each model from workflow_map results?
I am using tidymodels to train various classification models and to tune hyperparameters within these models using parsnip. As a newcomer to tidymodels, I have completed some tasks by following ...
0
votes
1
answer
127
views
Multistate Survival Models with Censored and Tidymodels
I am trying to use tidymodels' newish censored functionality to create a multistate survival model and am running into issues. I am seeing the below error when I try to fit my model. Normally this can ...
2
votes
1
answer
67
views
How can I pass the retain parameter, when fitting a tidymodels workflow
I can pass the retain=TRUE parameter to the prep() function to store the preprocessed training data in the recipe. The help page of prep() highly recommends using workflows. But how can I pass the retain=...
0
votes
2
answers
61
views
step_ns() in tidymodels: Error in `step_ns()`: Caused by error in `qr.default()`: ! NA/NaN/Inf in foreign function call (arg 1)
I want to fit a linear model and I want to add splines of all numerical variables using tidyverse for evaluation, and ideally for penalization.
While this sounds clever, I cannot figure out how to ...
0
votes
1
answer
172
views
Calibration Plots for Survival Analysis
I am unable to create calibration plots for my survival analysis project (oesophageal cancer dataset).
I have finalised my model (AORSF) after tuning:
aorsf_fit <- last_fit(
final_aorsf_wf,
...
1
vote
3
answers
178
views
Error with H2O in R - No active connection to an H2O cluster despite "Successful Connection"
I would like to switch the parsnip engine to h2o and use the h2o and agua packages to fit models. The following code is from the standard help site.
With tune::tune_grid, I bump into the "Warning: ...
1
vote
0
answers
160
views
Conducting LCA using tidySEM Package
I attempted to estimate the number of classes in an exploratory Latent Class Analysis (LCA) using the tidySEM package:
res <- mx_lca(data = df, classes = 1:7)
Unfortunately, the process took an ...
0
votes
0
answers
177
views
Issues with step_interact in a recipe in R
I am trying to fit a model but keep getting this error about step_interact. I have no idea what the issue is.
code:
recipe1 <- recipe(survived ~ pclass + sex + age + sib_sp + parch + fare, data = training) %>%
...
1
vote
1
answer
154
views
Error in validate_column_names(): Missing required columns after applying recipe in Tidymodels workflow with XGBoost
I'm encountering an issue when using tidymodels with xgboost in a workflow. After applying a recipe that includes step_dummy() to convert categorical variables into dummy variables, I receive the ...
2
votes
0
answers
38
views
tidymodels: how to update workflow set to remove parameters leading to same RMSE?
I am using the tidymodels workflow_map(), and inspecting the resulting "workflow set/tibble" I see many parameters leading to the same fit (same rmse, same rsq). (how) can I remove ...
0
votes
1
answer
30
views
tidymodels / shapviz::shapviz: Error in s[, nms, drop = FALSE] : incorrect number of dimensions
When I run shapviz::shapviz, I get the error:
Error in s[, nms, drop = FALSE] : incorrect number of dimensions
I believe I am using the correct predictors for the model, which are being "...
0
votes
1
answer
94
views
Case weights for roc auc and brier score in tidymodels
I am testing the performance of a prediction model (binary scenario: 0 or 1) using tidymodels in R. I have created importance weights before the fitting process for all individuals in my dataset. I ...
0
votes
1
answer
65
views
tidymodels step_corr() fails to remove highly correlated columns?
I can't seem to get step_corr() to function inside a recipe.
Minimal example:
df <- data.frame(x1=runif(10)) %>%
mutate(x2=x1+1) %>%
mutate(y=x1+rnorm(10))
cor(df)
rec <- recipe(y~...
1
vote
1
answer
156
views
How can I get Variable Importance Plot for a categorical SVM in tidymodels
I would like to get VIP info on an SVM model that is doing classification. I found this useful post, "Variable importance plot for support vector machine with tidymodel framework is not working", which ...
0
votes
0
answers
31
views
workflow_sets is generating columns with different lengths
I am trying to define a workflow_set to try some models on a machine learning task.
base_recipe = recipe(high_traffic ~ ., data = recipes) %>%
# remove recipe from list of predictors
update_role(...
1
vote
2
answers
56
views
Can I create new features from a matrix?
I'd like to create features from an ispline in tidymodels. One way to do this would be to use step_mutate as follows:
library(tidymodels)
library(tidyverse)
library(splines2)
data <- data.frame(x ...
1
vote
1
answer
272
views
How to fit kernelshap and shapviz into a tidymodels workflow?
I am using the mtcars dataset and tuning an xgboost model to predict mpg:
# Load required packages
library(tidymodels)
library(xgboost)
library(DALEX)
library(DALEXtra)
# Load example data
data(mtcars)
# ...
0
votes
1
answer
50
views
Subset data into analysis and assessment data to generate and apply grouped prediction models in R [closed]
This question is related to the following, but I am unable to make it work:
Linear Regression model building and prediction by group in R
Fit a model on each group and evaluate it using data from all ...
0
votes
1
answer
56
views
R tidymodels recipes step - from a column with multiple values, create a new column for each of these values (one hot encoding)
Say I have this data frame:
library(tidyverse)
# Sample data frame
df <- data.frame(
id = 1:3,
fruits = c("apple | oranges", "apple | bananas", "bananas | oranges")
...
0
votes
1
answer
152
views
Tidymodels and DALEX Error: The class of model must have a model_type method
The following code illustrates a tuned xgboost with mtcars data. I want to work with surrogate models using DALEX later on.
# Load required packages
library(tidymodels)
library(xgboost)
library(DALEX)
...
0
votes
1
answer
65
views
R Tidyclust: Tune a K-prototypes model
I'm trying to tune a K-prototypes clustering model using the tidyclust framework. I'm modifying the guide at tidymodels.org to use K-prototypes instead of k-means. However, I get the below error when ...
1
vote
0
answers
43
views
Is nest + rsample::bootstraps + unnest the same as rsample::group_bootstraps?
I noticed that group_bootstraps is fairly slow compared to bootstraps.
As a workaround, I am nesting and unnesting, which is a lot faster.
Are version 1 and version 2 equivalent or am I missing ...
0
votes
2
answers
141
views
rsample group_bootstrap is ~2000 times slower than bootstrap. Why?
The package rsample contains a function for bootstrapping and another function which allows bootstrapping on groups of the data.
The grouped version is a lot slower (~2000 times). I was expecting it to ...
0
votes
1
answer
284
views
How to Extract Preprocessed Variables in Recipes from Tidymodels
I’m using the recipes package from tidymodels to preprocess my data. I’ve applied a series of preprocessing steps using recipe() to two different groups. I’m trying to compare which variables were ...
0
votes
1
answer
92
views
How to calculate the confidence intervals of performance metrics in tidymodels?
I'm learning to get the confidence intervals of performance metrics following this article: Confidence Intervals for Performance Metrics
but I always get this error when I run mars_boot <- int_pctl(...