0 votes
0 answers
27 views

Problem: In sklearn Random Forest classifier, the class labelling influences the gini importance of features. I would expect the labelling of classes should not influence the importance values and ...
Micha's user avatar
  • 1
0 votes
0 answers
88 views

I have a trained random forest regressor from scikit-learn: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html I then want to make use of (but not train ...
Tor's user avatar
  • 803
0 votes
0 answers
55 views

I am trying to build a multilabel classification model using a random forest classifier. However, for some reason iterative_train_test_split does not return the same split even though it should use np....
MisterJAcobz's user avatar
0 votes
0 answers
35 views

I have created 3 models (a RF, CNN-LSTM and a MLP) to be deployed in the FRDM MCXN947 microcontroller. They have sizes of 2.18 GB, 7.75 MB and 0.186 MB respectively. I have saved the RF model as ....
min_een's user avatar
6 votes
2 answers
103 views

I am trying to replicate a specific decision tree trained by a RandomForestRegressor class, using DecisionTreeRegressor. However, I cannot get the exact results, even with using the exact same ...
NOnaMe's user avatar
  • 99
0 votes
2 answers
119 views

I'm trying to train a RandomForestClassifier Model. However, when I train it, it gives me all zeroes. And, I really can't seem to understand why. The dataset is HUGE (close to like 75,0000 rows), so, ...
hacks_and_nimbus's user avatar
0 votes
0 answers
16 views

(https://i.sstatic.net/TMOXGHAJ.png) I need to set Age as the Target column but it is not allowing me to select the attribute. How does Knime decide what features to offer as the target column? I ...
Shishir Khanal's user avatar
2 votes
0 answers
62 views

I recently tried a random forest model with golearn. I want to use a saved model (.gob) and only the explanatory variables. As far as I have explored, the only way I found was to prepare a template base.FixedDataGrid ...
os tk's user avatar
  • 21
0 votes
0 answers
17 views

I am training a RandomForestClassifier from sklearn.ensemble with the following code: adata = ad.read_h5ad(f'{data_dir}{ct}_clean_log1p_normalized.h5ad') adata = adata[:, adata.var....
Rushil Patel's user avatar
1 vote
1 answer
72 views

I am working on a data science project and trying to find the optimal parameters. This is what I want to test, but it takes forever, and I still have not seen any output after 1 hour. ...
Ata Berk Çinetçi's user avatar
-1 votes
1 answer
124 views

I'm building a classification model for sleep disorders using Voting Ensemble and I have three base models: Logistic Regression, Random Forest and SVM. Now I want to combine these models using a ...
YRN's user avatar
  • 3
0 votes
0 answers
117 views

I am trying to obtain the SHAP values of a Random Forest model for binary classification, trained in Python. I am using the following code: final_model = RandomForestClassifier(random_state=42, **...
a12456's user avatar
  • 1
0 votes
0 answers
48 views

Friends, I want to run a random forest model using GEE. I selected a study area and generated some sample points randomly within it, assigning attributes "0", "1", "2", &...
尹辛斓's user avatar
1 vote
1 answer
99 views

I am training a random forest classifier in python sklearn, see code below- from sklearn.ensemble import RandomForestClassifier rf = RandomForestClassifier(random_state=42) rf.fit(X = df.drop("...
lsr729's user avatar
  • 854
3 votes
1 answer
79 views

I have a dataset that I used to make a random forest (it is split into testing and training data). I have already made the random forest and generated predictions (code below), but I don't know how to ...
shrimp's user avatar
  • 101
0 votes
0 answers
20 views

I have the following dataset: X1 X2 X3 y 0 0.548814 0.715189 0.602763 0.264556 1 0.544883 0.423655 0.645894 0.774234 2 0.437587 0.891773 0.963663 0.456150 3 ...
quant's user avatar
  • 4,512
0 votes
1 answer
66 views

I have a CSV with my data, and I put it through an RF to predict sediment type based on bathymetry data from 24 sample points (and get the sediment distribution as an image output). The RF is working ...
user30057623's user avatar
0 votes
0 answers
28 views

I am getting a ValueError regarding the input arrays and their dimensions. I am trying to create a Random Forest Regression Model for price prediction using both numerical features and categorical ...
leakie's user avatar
  • 1
0 votes
1 answer
382 views

I'm running a random forest model, and to get some feature importance I'm trying to run a SHAP analysis. The problem is that every time I try to plot the SHAP values, I keep getting this error: ...
Starterkit07's user avatar
1 vote
1 answer
112 views

I'm currently engaged in a machine-learning project where I need to utilize the random forest algorithm. I've installed the randomForest package in R, but I'm facing significant issues when ...
wzj's user avatar
  • 25
1 vote
1 answer
55 views

I have come across a strange thing when plotting a decision tree in sklearn. I just wanted to compare a Random Forest model consisting of one estimator using bootstrapping and one without ...
EleniSop's user avatar
0 votes
1 answer
95 views

I am trying to run a geographically weighted random forest classification using the function SpatialML::rgf(). However, I am encountering the following error: 'Error in x[[jj]][iseq] <- vjj : ...
Rodrigo Nehara's user avatar
0 votes
0 answers
41 views

I want to simplify the decision tree output and hide the values in the "value" field. Below is the code I am using: fig, ax = plt.subplots(figsize=(10, 10)) ...
user29687883's user avatar
-1 votes
1 answer
95 views

I am using sklearn to run a random forest. I am setting the seed for the random forest, as well as splitting the data for cross validation. When I re-run the code consecutive times, it gives me the ...
rnoob's user avatar
  • 1
0 votes
0 answers
15 views

I am creating a machine learning model (random forest) in Spark (Pyspark) with cross-validation and grid search. I have two dataframes: one for training and one for testing, both stored in Parquet. ...
cyber-cavalera's user avatar
0 votes
1 answer
49 views

I am trying to optimize my model with GridSearchCV, using a custom profit function. However, when I run my code, I end up with the following error message: TypeError: profit_scorer() missing 1 ...
user29348541's user avatar
0 votes
0 answers
37 views

I've been using Whitebox Workflow Random Forest Regression fit for my undergraduate thesis, a plugin from QGIS. The plugin creates a model from the input data such as Raster files, number of trees, ...
Mcpaws's user avatar
  • 1
1 vote
0 answers
91 views

I want to estimate the relative importance of variables in explaining a response variable ("dep_var", a numeric variable based on a 4-point Likert scale). I am mostly interested in the ...
Jonathan Raphael's user avatar
2 votes
1 answer
113 views

I am trying to change the functionality of a random forest classifier. While usually features are selected at random for each split, I want one specific feature to be evaluated at each split. I know ...
Niklas Jacobs's user avatar
0 votes
2 answers
55 views

Apologies for bothering you, but I haven't been able to find a definitive answer after searching the site. I'm building a RandomForestClassifier on some clinical data where the target variable (...
Aezhel's user avatar
  • 11
2 votes
0 answers
221 views

So I trained my model and here I can share some relevant parts regarding the issue: import pandas as pd import pickle from sklearn.feature_extraction.text import TfidfVectorizer from sklearn.ensemble ...
idkrlly's user avatar
  • 41
1 vote
1 answer
401 views

I want to plot the SHAP values for my RSF model; here is the code and error: xvars <- c("RIDRETH1", "RXDLIPID", "DRXTKCAL", "DRXTPROT", "DRXTCARB", ...
mtvpr's user avatar
  • 11
2 votes
1 answer
148 views

I am learning how to build a single decision tree and random forests in R. I understand that set.seed() is needed before building a random forest to ensure reproducibility of the results, e.g. if ...
Jabjab4's user avatar
  • 21
1 vote
0 answers
82 views

I'm trying to test whether I've understood the way RandomForestRegressor produces forecast after a model's fitted. I used the California housing example to train a simple model and predict the first ...
user24791428's user avatar
0 votes
0 answers
30 views

I am creating a DataFrame from the raster arrays of large raster images to train a Random Forest model, but the process of creating the DataFrame is taking too long due to the large size of the images....
Budin's user avatar
  • 3
0 votes
1 answer
111 views

I am creating a machine learning model that determines whether a user is a bot or not. I used seaborn to plot a pairplot and realised most of the data is overlapping. Below is the code I wrote for ...
Shane Caesar's user avatar
3 votes
0 answers
89 views

I'm trying to use the text R package to train a ML model with the textTrainRandomForest() function, but I'm encountering an error: Error in `dplyr::bind_cols()`: ! Can't recycle `..1` (size 10) to ...
Elena Voevodina's user avatar
0 votes
1 answer
49 views

I can't find the syntax for calculating AUC for this random forest model. See the code below; please advise. ## 1 library(caret) library(dplyr) library(pROC) library(readxl) library(car) set.seed(1234)...
Gabriel Odok's user avatar
-2 votes
1 answer
118 views

I am working on credit risk analysis. I want to predict the risk of each company developing a debt with a fictional company. I obtained the feature importance from the model, but I want to know if it ...
Heloisa Ramos's user avatar
1 vote
0 answers
48 views

In the ranger package for R, the node impurity is measured with the Gini index for classification trees. I would expect the Gini index to lie between 0 and 1, as $$ Gini = 1-\sum_{i=1}^C p_i^2$$, ...
Lino's user avatar
  • 11
-1 votes
1 answer
35 views

I'm building a restaurant recommender for my city using a Kaggle dataset and RandomForestRegressor. I built the model, and now want the model to recommend a good restaurant when it is given 4 ...
Carl Jacob's user avatar
2 votes
2 answers
131 views

I'm working on a multilabel classification problem using the ClassifierChain approach with RandomForestClassifier as the base estimator. I've encountered an issue where my input matrix X contains np....
BSalvatori's user avatar
-1 votes
2 answers
207 views

I have a dataset with personal characteristics of customers who purchase from a fictional company. Initially, I don't have any target variable, only their characteristics. My goal is to find a pattern,...
Heloisa Ramos's user avatar
-1 votes
1 answer
78 views

I have trained a Random Forest model on the June dataset to predict the status_value of an employee, using a test_size of 0.3 for the split. I am including code snippets, as the code itself works well without any ...
Hardik Trivedi's user avatar
-1 votes
1 answer
148 views

I am competing in a Kaggle competition (https://www.kaggle.com/competitions/playground-series-s4e8) where we have to predict whether a mushroom is poisonous or not based on the data provided. The issue ...
Unusual Catcher's user avatar
-2 votes
1 answer
55 views

I'm trying to solve a problem of recommending a doctor based on a user's symptoms and location using a hybridized collaborative filtering and sentence similarity-based recommender system that follow ...
Sadura Akinrinwa's user avatar
0 votes
0 answers
95 views

I am doing a random forest model on PC orders data, which is mostly in Chinese. I have done the model and accuracy checks. However, I can't seem to generate the image due to a UnicodeEncodeError, ...
Theodore Maximus's user avatar
0 votes
0 answers
190 views

I need to predict different types of exploitation using a RandomForestClassifier. My dataset contains several categorical variables such as gender, citizenship, and CountryOfExploitation. These ...
tswift1998's user avatar
0 votes
1 answer
59 views

I'm working on a problem where I thought RandomForestClassifier from scikit-learn would be a better solution for a large dataset. Only after trying it for this did I find it to be not ...
kiruthikpurpose's user avatar
0 votes
1 answer
95 views

Does anybody know how to solve this problem? I'm trying to build a species distribution model using bioclimatic variables with the sdm package (Naimi, 2016) in R version 4.4.1. ERROR: model <- sdm(Species ~ ., ...
Simon's user avatar
  • 1
