Newest 'random-forest' Questions

0 votes

0 answers

27 views

Class labelling influences the gini importance of features [closed]

Problem: In sklearn Random Forest classifier, the class labelling influences the gini importance of features. I would expect the labelling of classes should not influence the importance values and ...

Micha

1

asked Nov 28 at 13:18

0 votes

0 answers

88 views

Evaluating pre-trained random forest in Fortran

I have a trained random forest regressor from scikit-learn: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html I then want to make use of (but not train ...

Tor

803

asked Oct 23 at 9:59

0 votes

0 answers

55 views

iterative_train_test_split does not return same split

I am trying to build a multilabel classification model using a random forest classifier. However, for some reason iterative_train_test_split does not return the same split even though it should use np....

MisterJAcobz

85

asked Sep 23 at 8:17

0 votes

0 answers

35 views

How to deploy .keras and .joblib models in FRDM MCXN947 (NXP) Microcontroller?

I have created 3 models (a RF, CNN-LSTM and a MLP) to be deployed in the FRDM MCXN947 microcontroller. They have sizes of 2.18 GB, 7.75 MB and 0.186 MB respectively. I have saved the RF model as ....

min_een

5

asked Jul 20 at 18:37

6 votes

2 answers

103 views

Reproduce a particular tree from the random forest using DecisionTreeRegressor

I am trying to replicate a specific decision tree trained by a RandomForestRegressor class, using DecisionTreeRegressor. However, I cannot get the exact results, even with using the exact same ...

NOnaMe

99

asked Jul 10 at 7:57

0 votes

2 answers

119 views

Random Forest Classifier model returns all zeroes

I'm trying to train a RandomForestClassifier Model. However, when I train it, it gives me all zeroes. And, I really can't seem to understand why. The dataset is HUGE (close to like 75,0000 rows), so, ...

hacks_and_nimbus

33

asked Jun 17 at 21:35

0 votes

0 answers

16 views

KNIME: Random Forest Learner

(https://i.sstatic.net/TMOXGHAJ.png) I need to set Age as the Target column but it is not allowing me to select the attribute. How does KNIME decide what features to offer as the target column? I ...

Shishir Khanal

1

asked Jun 7 at 20:06

2 votes

0 answers

62 views

Prediction by trained model of sjwhitworth/golearn

I recently tried a random forest model with golearn. I want to use a saved model (.gob) and only the explanatory variables. As far as I have explored, the only way I found was to prepare a template base.FixedDataGrid ...

os tk

21

asked May 27 at 6:24

0 votes

0 answers

17 views

Getting extremely low importance scores in ensemble.randomForestClassifier

I am training a RandomForestClassifier from sklearn.ensemble with the following code: adata = ad.read_h5ad(f'{data_dir}{ct}_clean_log1p_normalized.h5ad') adata = adata[:, adata.var....

Rushil Patel

13

asked May 20 at 14:23

1 vote

1 answer

72 views

RandomForest Classifier Takes Forever [closed]

I am working on a data science project and trying to find the optimal parameters for it. This is what I want to test, but it takes forever, and I still have not seen any output after 1 hour. ...

Ata Berk Çinetçi

63

asked May 3 at 13:00

-1 votes

1 answer

124 views

Do I need to scale the RF model while creating a voting ensemble model? [closed]

I'm building a classification model for sleep disorders using Voting Ensemble and I have three base models: Logistic Regression, Random Forest and SVM. Now I want to combine these models using a ...

YRN

3

asked May 1 at 13:22

0 votes

0 answers

117 views

SHAP value in TreeExplainer: Additivity check failed in TreeExplainer

I am trying to obtain the SHAP values of a Random Forest model for binary classification, trained in Python. I am using the following code: final_model = RandomForestClassifier(random_state=42, **...

a12456

1

asked Apr 23 at 9:01

0 votes

0 answers

48 views

I used GEE to do random forest classification, but the result is only one color. Why?

Friends, I want to run a random forest model using GEE. I selected a study area and generated some sample points randomly within it, assigning attributes "0", "1", "2", ...

尹辛斓

1

asked Apr 20 at 16:19

1 vote

1 answer

99 views

Why does RandomForestClassifier in scikit-learn predict even on all-NaN input?

I am training a random forest classifier in Python with sklearn; see the code below: from sklearn.ensemble import RandomForestClassifier rf = RandomForestClassifier(random_state=42) rf.fit(X = df.drop("...

lsr729

854

asked Apr 15 at 19:53

3 votes

1 answer

79 views

How do I use a random forest to predict gaps in a dataset? [closed]

I have a dataset that I used to make a random forest (it is split into testing and training data). I have already made the random forest and generated predictions (code below), but I don't know how to ...

shrimp

101

asked Apr 10 at 18:12

0 votes

0 answers

20 views

Get analytical equation of RF regressor model [duplicate]

I have the following dataset:
   X1        X2        X3        y
0  0.548814  0.715189  0.602763  0.264556
1  0.544883  0.423655  0.645894  0.774234
2  0.437587  0.891773  0.963663  0.456150
3  ...

quant

4,512

asked Apr 7 at 9:27

0 votes

1 answer

66 views

Terra predict function failing to predict my random forest model

I have a CSV with my data and I put it through an RF to predict sediment type based on bathymetry data from 24 sample points (and get sediment distribution as an image output). The RF is working ...

user30057623

3

asked Mar 25 at 16:54

0 votes

0 answers

28 views

Issue with Input Array on Random Forest Model with Both Numerical and Categorical Features

I am getting a ValueError regarding the input arrays and their dimensions. I am trying to create a Random Forest Regression Model for price prediction using both numerical features and categorical ...

leakie

1

asked Mar 21 at 15:53

0 votes

1 answer

382 views

Length of features is not equal to the length of SHAP Values

I'm running a random forest model and, to get some feature importance, I'm trying to run a SHAP analysis. The problem is that every time I try to plot the SHAP values, I keep getting this error: ...

Starterkit07

1

asked Mar 17 at 19:16

1 vote

1 answer

112 views

Error Building Random Forest in R: randomForest Function Fails

I'm currently engaged in a machine-learning project where I need to utilize the random forest algorithm. I've installed the randomForest package in R, but I'm facing significant issues when ...

wzj

25

asked Mar 13 at 0:04

1 vote

1 answer

55 views

Plotting one Decision Tree of a Random Forest in sklearn

I have come across a strange thing when plotting a decision tree in sklearn. I just wanted to compare a Random Forest model consisting of one estimator using bootstrapping and one without ...

EleniSop

13

asked Feb 22 at 16:58

0 votes

1 answer

95 views

R: Error in x[[jj]][iseq] <- vjj : replacement has length zero (Library SpatialML::rgf)

I am trying to run a geographically weighted random forest classification using the function SpatialML::rgf(). However, I am encountering the following error: 'Error in x[[jj]][iseq] <- vjj : ...

Rodrigo Nehara

25

asked Feb 21 at 18:02

0 votes

0 answers

41 views

Draw a decision tree while hiding the values of the "value" row

I want to simplify the decision tree output and hide the values in the "value" field. Below is the code I am using: fig, ax = plt.subplots(figsize=(10, 10)) ...

user29687883

1

asked Feb 18 at 2:59

-1 votes

1 answer

95 views

Different Results (With Seed) For sklearn Random Forest

I am using sklearn to run a random forest. I am setting the seed for the random forest, as well as splitting the data for cross validation. When I re-run the code consecutive times, it gives me the ...

rnoob

1

asked Feb 4 at 2:34

0 votes

0 answers

15 views

How to reproduce the results of an ML model in Spark? [duplicate]

I am creating a machine learning model (random forest) in Spark (Pyspark) with cross-validation and grid search. I have two dataframes: one for training and one for testing, both stored in Parquet. ...

cyber-cavalera

1

asked Jan 29 at 12:40

0 votes

1 answer

49 views

GridSearchCV custom profit function results in an error: missing 1 required positional argument: 'y'

I am trying to optimize my model with GridSearchCV, using a custom profit function. However, when I run my code, I end up with the following error message: TypeError: profit_scorer() missing 1 ...

user29348541

3

asked Jan 24 at 12:38

0 votes

0 answers

37 views

Whitebox Workflow Random Forest Regression Fit Hyperparameter Tuning

I've been using Whitebox Workflow Random Forest Regression Fit, a QGIS plugin, for my undergraduate thesis. The plugin creates a model from the input data such as Raster files, number of trees, ...

Mcpaws

1

asked Jan 7 at 16:32

1 vote

0 answers

91 views

partykit: Error when using varimp on cforest for data set including NA values

I want to estimate the relative importance of variables in explaining a response variable ("dep_var", a numeric variable based on a 4-point Likert scale). I am mostly interested in the ...

Jonathan Raphael

11

asked Jan 2 at 13:29

2 votes

1 answer

113 views

Modification of Random Forest to always evaluate some feature(s) at every split

I am trying to change the functionality of a random forest classifier. While usually features are selected at random for each split, I want one specific feature to be evaluated at each split. I know ...

Niklas Jacobs

31

asked Dec 18, 2024 at 11:48

0 votes

2 answers

55 views

Feature Importance with ColumnTransform and OneHotEncoder in RandomForestClassifier

Apologies for bothering you, but I haven't been able to find a definitive answer after searching the site. I'm building a RandomForestClassifier on some clinical data where the target variable (...

Aezhel

11

asked Dec 2, 2024 at 13:09

2 votes

0 answers

221 views

SHAP Additivity Check Fails with Astronomical SHAP Values for RandomForestClassifier

So I trained my model and here I can share some relevant parts regarding the issue: import pandas as pd import pickle from sklearn.feature_extraction.text import TfidfVectorizer from sklearn.ensemble ...

idkrlly

41

asked Nov 24, 2024 at 13:20

1 vote

1 answer

401 views

SHAP values for random survival forest

I want to plot the SHAP values for my RSF model; here is the code and error: xvars <- c("RIDRETH1", "RXDLIPID", "DRXTKCAL", "DRXTPROT", "DRXTCARB", ...

mtvpr

11

asked Nov 22, 2024 at 6:49

2 votes

1 answer

148 views

Is set.seed() needed when building a single decision tree in R?

I am learning how to build a single decision tree and random forests in R. I understand that set.seed() is needed before building a random forest to ensure reproducibility of the results, e.g. if ...

Jabjab4

21

asked Nov 13, 2024 at 10:23

1 vote

0 answers

82 views

How to use Python to replicate Random Forest Regression prediction using decision paths?

I'm trying to test whether I've understood the way RandomForestRegressor produces a forecast after the model is fitted. I used the California housing example to train a simple model and predict the first ...

user24791428

11

asked Nov 13, 2024 at 7:55

0 votes

0 answers

30 views

Dataframe of raster images is taking too long

I am creating a DataFrame from the raster arrays of large raster images to train a Random Forest model, but the process of creating the DataFrame is taking too long due to the large size of the images....

Budin

3

asked Nov 12, 2024 at 12:49

0 votes

1 answer

111 views

How to deal with overlapping data in machine learning

I am creating a machine learning model that determines whether a user is a bot or not. I used seaborn to plot a pairplot and realised most of the data is overlapping. Below is the code I wrote for ...

Shane Caesar

11

asked Nov 5, 2024 at 1:41

3 votes

0 answers

89 views

Problem with textTrainRandomForest() function

I'm trying to use the text R package to train a ML model with the textTrainRandomForest() function, but I'm encountering an error: Error in `dplyr::bind_cols()`: ! Can't recycle `..1` (size 10) to ...

Elena Voevodina

31

asked Oct 25, 2024 at 13:02

0 votes

1 answer

49 views

Calculating AUC for a random forest model

I can't find the syntax for calculating AUC for this random forest model. See the code below; please advise. ## 1 library(caret) library(dplyr) library(pROC) library(readxl) library(car) set.seed(1234)...

Gabriel Odok

1

asked Oct 20, 2024 at 8:32

-2 votes

1 answer

118 views

Is it possible to identify which indicators influence the credit risk for each client company in credit risk analysis?

I am working on credit risk analysis. I want to predict the risk of each company developing a debt with a fictional company. I obtained the feature importance from the model, but I want to know if it ...

Heloisa Ramos

21

asked Oct 2, 2024 at 12:36

1 vote

0 answers

48 views

Why is the split statistic in the ranger package for R greater than 1?

In the ranger package for R, the node impurity is measured with the Gini index for classification trees. I would expect the Gini index to lie between 0 and 1, as $$ Gini = 1-\sum_{i=1}^C p_i^2$$, ...

Lino

11

asked Oct 1, 2024 at 9:41

-1 votes

1 answer

35 views

X has 8 features, but RandomForestRegressor is expecting 2924 features as input

I'm building a restaurant recommender for my city using a Kaggle dataset and RandomForestRegressor. I built the model, and now want the model to recommend a good restaurant when it is given 4 ...

Carl Jacob

1

asked Sep 21, 2024 at 13:43

2 votes

2 answers

131 views

ClassifierChain with Random Forest: Why is np.nan not supported even though Base Estimator handles it?

I'm working on a multilabel classification problem using the ClassifierChain approach with RandomForestClassifier as the base estimator. I've encountered an issue where my input matrix X contains np....

BSalvatori

23

asked Sep 13, 2024 at 8:25

-1 votes

2 answers

207 views

How to use Machine Learning to find the pattern customer profile? [closed]

I have a dataset with personal characteristics of customers who purchase from a fictional company. Initially, I don't have any target variable, only their characteristics. My goal is to find a pattern,...

Heloisa Ramos

21

asked Aug 30, 2024 at 13:39

-1 votes

1 answer

78 views

Random Forest Test split

I have trained a random forest model using the June dataset to predict the status_value of an employee, using a test_size of 0.3. I am including code snippets, as the code itself works well without any ...

Hardik Trivedi

1

asked Aug 22, 2024 at 22:54

-1 votes

1 answer

148 views

Final Predictions accuracy of my ML Binary Classification Model is horrible [closed]

I am competing in a Kaggle competition (https://www.kaggle.com/competitions/playground-series-s4e8) where we have to predict whether a mushroom is poisonous or not based on the data provided. The issue ...

Unusual Catcher

1

asked Aug 12, 2024 at 23:32

-2 votes

1 answer

55 views

Hybridized collaborative filtering and sentence similarity-based system for doctor recommendation based on user input of symptoms and location

I'm trying to solve a problem of recommending a doctor based on a user's symptoms and location using a hybridized collaborative filtering and sentence similarity-based recommender system that follows ...

Sadura Akinrinwa

1

asked Jul 23, 2024 at 17:20

0 votes

0 answers

95 views

How to visualize random forest plot using graphviz, in characters outside of UTF-8 (Chinese)

I am doing a random forest model on PC orders data, which is mostly in Chinese. I have done the model and accuracy checks. However, I can't seem to generate the image due to a UnicodeEncodeError, ...

Theodore Maximus

23

asked Jul 16, 2024 at 2:54

0 votes

0 answers

190 views

How to Encode Non-Ordinal Categorical Variables for RandomForest without Using Label Encoding?

I need to predict different types of exploitation using a RandomForestClassifier. My dataset contains several categorical variables such as gender, citizenship, and CountryOfExploitation. These ...

tswift1998

1

asked Jul 12, 2024 at 21:11

0 votes

1 answer

59 views

How to optimise hyperparameters for RandomForestClassifier in Python for large datasets?

I'm working on a problem where I thought RandomForestClassifier from scikit-learn would be a better solution for a large dataset. Only after trying it did I find it to be not ...

kiruthikpurpose

27

asked Jul 10, 2024 at 15:06

0 votes

1 answer

95 views

x@presence error for Species Distribution modeling [closed]

Does anybody know how to solve this problem? I'm trying to build a species distribution model using bioclimatic variables (sdm package by Naimi, 2016; R version 4.4.1). ERROR: model <- sdm(Species ~ ., ...

Simon

1

asked Jun 22, 2024 at 17:08
