Newest 'tidyverse' Questions - Page 5

0 votes

1 answer

25 views

Ordering Y-axis labels in ggplot2 when using a leveled factor doesn't work

I'm using the following to generate a column plot of my data. But despite converting the variable Recommendations into a leveled factor, the Y-axis labels still are NOT ordered the way they are listed ...

Simon Harmel

1,489

asked Jul 30, 2024 at 18:46

1 vote

2 answers

153 views

How can I efficiently read a large CSV file from Azure Blob Storage into R for analysis?

I have the following function to read a CSV file from Azure: read_csv_from_azure <- function(file_path, container) { # Try to download the file and handle potential errors tryCatch({ # ...

Enes

149

asked Jul 30, 2024 at 12:15

0 votes

2 answers

63 views

Transforming Categorical Column into Binary Columns in R Based on Multiple Conditions

I have a dataframe with two columns in R. One of the columns (column1) has three possible values (A, A and B, B). The rows are patients. I want to transpose column1, so I'd have binary columns (Yes, ...

roybatty

105

asked Jul 30, 2024 at 9:56

0 votes

1 answer

47 views

Apply a function in a data frame group-wise on a subset of rows? [closed]

By using tidyverse, I want to calculate standard deviation of alt_freq column grouping by rsid in a data frame. In each group, I want to consider only those rows which have at least 100 samples. I ...

Ömer A.

677

asked Jul 30, 2024 at 8:00

4 votes

2 answers

3k views

Why does this create an "NAs introduced by coercion" warning

Curious why the following produces an "NAs introduced by coercion" warning # Example dataframe df <- tibble( session = c("a",2), ) df %>% mutate(sessionNum = case_when( ...

Alex Holcombe

2,621

asked Jul 27, 2024 at 19:38
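
The excerpt of the "NAs introduced by coercion" question is cut off, so the following is only a minimal sketch of one common cause, not the asker's actual code. It assumes the hidden part of `case_when()` calls `as.numeric(session)` on the right-hand side; the two branches shown here are hypothetical.

```r
library(dplyr)

# c("a", 2) is coerced to the character vector c("a", "2").
df <- tibble(session = c("a", 2))

# case_when() evaluates every right-hand side on the whole column before
# selecting values, so as.numeric("a") still runs and raises the warning,
# even though that NA is never used in the result.
res <- df %>%
  mutate(sessionNum = case_when(
    session == "a" ~ 1,
    TRUE ~ as.numeric(session)   # hypothetical branch, assumed for illustration
  ))

# One common workaround: wrap the conversion in suppressWarnings().
res_quiet <- df %>%
  mutate(sessionNum = case_when(
    session == "a" ~ 1,
    TRUE ~ suppressWarnings(as.numeric(session))
  ))
```

Both versions return the same values; the second simply avoids the warning.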

0 votes

1 answer

35 views

merge two dataframes of different sizes without key [duplicate]

I have two dataframes with different columns and different row sizes library(tidyverse) tb1 <- tibble(id= 1:10, a= 1:10, b=11:20) tb2 <- tibble(id= 1:5, ...

mr.T

634

asked Jul 27, 2024 at 6:33

4 votes

2 answers

156 views

R: Efficient way to str_replace_all without recursively replacing conflicting substitutions?

Hello, The problem First, let me try to illustrate the problem. Assume I want to apply the following cipher to encode the string, "abc". library(tidyverse) cipher <- tibble( byte = c(...

jatx50

127

asked Jul 21, 2024 at 9:17

0 votes

0 answers

50 views

strip.white doesn't work in R if the data frame is too large

I import, clean, and merge two different data sets that are created from Qualtrics surveys. I use read.csv to load the data, and I have strip.white=TRUE to remove leading and trailing spaces. If I ...

Cam McM

1

asked Jul 18, 2024 at 20:34

0 votes

3 answers

135 views

Using 'slice_max()' in for loop

I'm trying to create new dataframes with the top three values for each column across a dataframe. probUnweighted <- data.frame( Sample1 = c(0.9, 0.2, 0.03, 0.1, 0.5, 0.09), Sample2 = c(0.045, 0.11,...

Chris Stantis

33

asked Jul 18, 2024 at 16:39

1 vote

2 answers

66 views

Remove string from column across group of rows in another column

I would like to remove a string from one column across a group of rows in another column. In the below reprex, I would like to remove the string in snippet from the string in text in any row in the ...

maia-sh

641

asked Jul 16, 2024 at 18:26

1 vote

2 answers

319 views

How to select specific columns across multiple dataframes in R and then bind them into one data.frame?

I am trying to select or subset multiple data frames with different number of columns. They all contain the same columns of interest, so I am trying to make them all contain the same columns so I can ...

Victor Shin

319

asked Jul 16, 2024 at 16:52

1 vote

2 answers

50 views

Recode relationship matrices based on new subgrouping

Problem: I have a survey dataset which includes intra-household relationships. I had to subdivide household into tax-unit, which means I need to redefine the relationship matrices based on the new tax-...

ravinglooper

221

asked Jul 16, 2024 at 13:55

0 votes

1 answer

52 views

Additional arguments to purrr::map don't work as expected

I'm using the purrr::map function to iterate over several columns and tidy the result. For a short example, I provide the following code: library(tidymodels) library(broom) > penguins %>% + ...

user23485480

59

asked Jul 14, 2024 at 20:14

3 votes

2 answers

71 views

Trying to create a grouped barchart in R - producing a stacked one instead

I am trying to create a bar chart that has the number of each species grouped into years. I want each year represented on the x axis with the number of each of the 3 species grouped next to one ...

Anna

61

asked Jul 14, 2024 at 8:52

0 votes

1 answer

51 views

Propensity density score with MatchIt package -- how to bind rows when we have lot of datasets to have a final dataset with matched characteristics

I'm expanding this post -- answered by @edwards (Thanks). I'm working with panel data. We assessed children in 2019, 2020, 2021 and 2022. Therefore, I have four datasets (2019, 2020, 2021, and 2022). ...

Luis

1,574

asked Jul 14, 2024 at 2:36

0 votes

1 answer

50 views

Name nested column list with specific name

This is my code library(tidyverse) # Create an example data frame with football data dat <- tibble( continent = rep(c("Asia", "Europe", "Africa", "Americas...

Laura

759

asked Jul 13, 2024 at 13:29

0 votes

1 answer

73 views

Using for loops, while, tidyverse, or packages to create a dataset with matching characteristics from a previous one (sampling) [closed]

I'm working with panel data. We assessed children in 2019 and 2020. Therefore, I have two datasets (2019 and 2020) and I want to create a third dataset matching the data from the second dataset (2020) ...

Luis

1,574

asked Jul 13, 2024 at 5:57

5 votes

5 answers

150 views

Convert a list into a tibble with nested columns

I would like to convert a list like this into a tibble. lst <- list( "A"=list( "Category"="A", "Team"=c("x"), "City"="...

mindlessgreen

12.3k

asked Jul 12, 2024 at 22:09

0 votes

2 answers

74 views

Loop through variables to filter a tibble in R

This feels like it should be easier than it is but here we go. I have a data frame that looks like this: to.csv = structure(list(geography = c("030223131022122122", "030223131220201023...

Priya Patel

171

asked Jul 12, 2024 at 18:28

1 vote

3 answers

81 views

Apply command for complex functions and calculations on a dataset in R

I'm a reasonably experienced R user who has often struggled to use the apply family. I have very slow-moving iterative code whose performance I'm hoping to improve through the use of this family, but ...

St4096

87

asked Jul 12, 2024 at 16:55

2 votes

2 answers

104 views

How to create subgroups based on group relationship criteria

Context: I have a dataframe of individual people grouped by household, which includes relationship parameters for each individual describing their relationship to every other individual in the ...

ravinglooper

221

asked Jul 12, 2024 at 12:20

0 votes

1 answer

100 views

How to refer to dataset in ggplot using dplyr

I have the following dataset in R crude_data <- structure(list(date = structure(c(19570, 19601, 19631, 19662, 19692, 19723, 19754, 19783, 19814, 19844, 19875, 19905, 19936, 19967, 19997, 20028, ...

Tanga94

867

asked Jul 12, 2024 at 9:57

3 votes

4 answers

117 views

Joining lat/lon data frames by nearest distance

Let's say I have a regular latitude/longitude grid and data at irregular locations, like this: grid = tidyr::crossing(lon = seq(0, 1, 0.25), lat = seq(0, 1, 0.25)) data = tibble::tibble(lon = runif(4),...

sieste

9,177

asked Jul 12, 2024 at 8:54

0 votes

1 answer

52 views

Sample all rows of N groups

I'm trying to find a way to sample N whole groups from a dataframe. For example, if we had the below dataframe: group value 1 a 1 2 a 2 3 a 3 4 b 4 5 b ...

ravinglooper

221

asked Jul 11, 2024 at 10:27
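
For the "Sample all rows of N groups" question, one possible approach is to sample the group labels first and then keep every row belonging to the sampled groups. This is a minimal sketch under assumptions: the data frame below is hypothetical (the original is truncated), and `n_groups` is an illustrative name.

```r
library(dplyr)

# Hypothetical data with three groups of different sizes.
df <- tibble(
  group = c("a", "a", "a", "b", "b", "c"),
  value = 1:6
)

n_groups <- 2  # number of whole groups to draw

set.seed(42)  # only to make the sketch reproducible
sampled <- sample(unique(df$group), n_groups)

# Keep all rows of the sampled groups.
out <- df %>% filter(group %in% sampled)
```

Sampling labels rather than rows keeps each chosen group intact, regardless of how many rows it has.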

1 vote

2 answers

72 views

Getting the latest date for a duplicated item in R [closed]

I have the following dataset (below). I am trying to get the latest SEnd value for each individual tag (see Desired output) where I have the Tag, Owner and the latest SEnd date only. Essentially I am ...

pat

105

asked Jul 10, 2024 at 9:55

0 votes

1 answer

63 views

Function works with plan(sequential) but not plan(multisession)

Here is my code: plan(multisession,workers=detectCores()-2) future_map_dfr(.x= Liste_model[1:2],.f = summaryModel, df = DF_MODEL_TRAIN, df_test = DF_MODEL_TEST, df_global = DF_MODEL_GLOBAL, .id = ...

Rgrvkfer

417

asked Jul 9, 2024 at 13:41

1 vote

2 answers

90 views

parallel/automatic way of unnesting list columns that contains data frames (list columns might be empty)

Please consider the following data frame: df <- structure(list(oID = c(37751L, 30978L, 33498L), peId = c(12L, 13L, 14L), last_Name = c("ABC", ...

deschen

11.6k

asked Jul 8, 2024 at 19:35

1 vote

1 answer

127 views

Little hack for ggplot -- an easy way to add a text with the real means and standard deviation when using lines or bars

I just want to add some text with the real means and SDs to my plots when I'm working with one outcome or multiple outcomes. See the pictures below for reference. Code is below. If any updated package ...

Luis

1,574

asked Jul 6, 2024 at 18:33

0 votes

0 answers

88 views

What is the preferred / recommended rlang metaprogramming syntax to use on both sides of an assignment operator in the `dplyr::mutate()` function?

I have a question about an issue that's similar to this older question about the dplyr::filter() function, except that my example is a bit more complicated because dplyr::mutate() needs to process ...

stachyra

4,613

asked Jul 6, 2024 at 3:49

1 vote

1 answer

30 views

Long to wide format based on variable suffixes in tidyverse in R

I wonder if there is a way for my DATA to be reformatted to my Desired_output below? Specifically, for each unique study, we stick together a pair of pre and postNUMBER together, separately for T and ...

Simon Harmel

1,489

asked Jul 5, 2024 at 21:55

1 vote

2 answers

72 views

Use ifelse for several columns in R

My goal is to create a binary variable (k) that turns 1 if in any column between mpg and wt the values 3.90 and/or 160.0 appears. Code library(tidyverse) mtcars<-mtcars%>% mutate(k=ifelse(mpg:...

Amc

159

asked Jul 5, 2024 at 21:28
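
For the "Use ifelse for several columns in R" question, a range such as `mpg:wt` cannot be passed to `ifelse()` directly. One option, sketched here and not taken from the thread's answers, is `dplyr::if_any()`, which tests a condition across a column selection row by row. The output name `cars_flagged` is an assumption for illustration.

```r
library(dplyr)

# Flag rows where any column from mpg through wt equals 3.90 or 160.0.
# if_any() returns one logical per row; as.integer() turns it into 0/1.
cars_flagged <- mtcars %>%
  mutate(k = as.integer(if_any(mpg:wt, ~ .x %in% c(3.90, 160.0))))
```

In `mtcars`, the first row (Mazda RX4) has `disp` of 160 and `drat` of 3.90, so it gets `k = 1`.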

0 votes

1 answer

34 views

Transforming value into row number

I am conducting some survival analysis and am attempting to turn my wide table into long format for analysis using dplyr. I want to turn the value of 'dead flies' into rows with a binary status for ...

Dan Pritchard

15

asked Jul 5, 2024 at 15:33

0 votes

0 answers

72 views

How to color the legend labels in ggplot [duplicate]

Using the iris dataset, we can make a boxplot and customise the legend when plotting using ggplot like so: ggplot(data = iris, aes(x=Species, y=Sepal.Length, fill=Species))+ geom_boxplot()+ ...

Ginko-Mitten

410

asked Jul 4, 2024 at 19:13

0 votes

1 answer

71 views

Is there a way to pass a string as a variable/column name to my function and use in a call to mutate?

I have a dataframe with a column indicating choices (of a survey) as well as a column indicating the index of the choice made in each row. e.g., df <- tibble( record_id = 1:9, choices = c(rep(...

sometimes_sci

319

asked Jul 4, 2024 at 19:10

4 votes

1 answer

81 views

Is there a way to prevent facet labels from being equal width (after rotation)

When I apply a facet_grid, sometimes the labels are quite wide, sometimes so wide that they don't fit, and I have to rotate them. This isn't a problem unless I want to facet by multiple different ...

Dan

286

asked Jul 3, 2024 at 14:34

0 votes

1 answer

96 views

How to iterate a function for multiple values (Loop function)?

By running the following function, the output would be: library(pmsampsize) pmsampsize(type = "s", csrsquared = 0.5, parameters = 10, rate = 0.065, timepoint = 2, meanfup = 2.07) NB: ...

elisa

105

asked Jul 1, 2024 at 20:21

0 votes

1 answer

72 views

Change raster extent with tidy

I have files from ERA5 that have extent from 0 to 360 (lon) and -90 to 90 (lat) Example: > era5_sr class : SpatRaster dimensions : 721, 1440, 744 (nrow, ncol, nlyr) resolution : 0.25, 0....

herakles_1950

25

asked Jul 1, 2024 at 14:58

1 vote

2 answers

45 views

In R, how to find the proportion of cases which have a value present in another column?

This seemed really simple to me at first, but is unexpectedly giving me trouble. Let's say my dataset looked like this: mock <- tribble(~case_id, ~characteristic, 1, "A...

daltoncito5034

81

asked Jun 28, 2024 at 16:57

1 vote

2 answers

71 views

Divide groups in other groups by date intervals

I'm dealing with dates and I want to group some rows together, but I can't find how. In my data, one row is an individual in a time interval and in a place. Something like this: ind place ...

jrdavalos

87

asked Jun 28, 2024 at 12:10

3 votes

4 answers

81 views

How to obtain all the numeric variables in data frame and use in another function in R

The example below obtains the minimum value among 3 columns using pmin(V1, V2, V3). If we have lots of columns, how do we get the minimum value among all numeric variables, especially using the ...

ynsec37

91

asked Jun 28, 2024 at 3:06
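
For the question about taking the row-wise minimum over all numeric variables, one base R sketch is to select the numeric columns and hand them to `pmin()` through `do.call()`. The data frame and the `row_min` column name below are hypothetical, since the original example is truncated.

```r
# Hypothetical data: one character column and three numeric columns.
df <- data.frame(
  id = c("x", "y", "z"),
  V1 = c(3, 5, 1),
  V2 = c(2, 8, 4),
  V3 = c(9, 0, 6)
)

# Keep only the numeric columns.
num_cols <- df[vapply(df, is.numeric, logical(1))]

# c() turns the data frame into a list of columns plus the na.rm argument,
# so this is equivalent to pmin(V1, V2, V3, na.rm = TRUE).
df$row_min <- do.call(pmin, c(num_cols, na.rm = TRUE))
```

This avoids typing the column names, so it scales to any number of numeric columns.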

1 vote

1 answer

72 views

Conditional filtering of dataframe in R

I wonder how to dplyr::filter() my DATA to catch the rows for IDs whose Language value when 'Type!=5F' and when 'Type==5F' changes from other languages to "English"? For example, ID==1 has ...

Simon Harmel

1,489

asked Jun 26, 2024 at 23:53

2 votes

2 answers

41 views

Wide format data by pasting two sets of variables into one in R

I've tried to wide-format my DATA into my Desired_output using: pivot_wider(DATA, names_from = Year, values_from = c(Type, Language)) without success. Is there a way to achieve my Desired_output? ...

Simon Harmel

1,489

asked Jun 26, 2024 at 21:39

-2 votes

1 answer

260 views

How to utilize Rauh's German Political Sentiment Dictionary

I need to utilize the named sentiment dictionary for my sentiment analysis in RStudio. Unfortunately, I am having problems with that. The dictionary comes within a zip archive and specifically (as I assume) ...

user23820003

1

asked Jun 26, 2024 at 10:26

5 votes

3 answers

103 views

How to extract birth and death year from string in R?

I have the first paragraph of Wikipedia articles from the wikifacts package (only for people). I like to extract birth year and year of death. library(wikifacts) library(tidyverse) politicians <- ...

Marco

2,887

asked Jun 25, 2024 at 10:54

1 vote

1 answer

1k views

"Error in initializePtr(): function 'cholmod_factor_ldetA' not provided by package Matrix" gets displayed while trying to generate mixed effects model

I have already tried the previous solutions displayed by some users. I have tried removing and reinstalling Matrix and lme4 packages. To make matters worse, now R is unable to install lme4 or Matrix ...

Ramanpreet Kaur

11

asked Jun 24, 2024 at 2:59

1 vote

3 answers

192 views

Classification of rows/individuals based on their column output in an incidence matrix

I wrote an R function to classify rows (individuals) based on the columns output in an incidence matrix M5 for the following requirements: M5 <- structure(c(1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1,...

MYaseen208

24.1k

asked Jun 20, 2024 at 22:41

1 vote

1 answer

59 views

pivot_longer from multiple columns into a singular names_to and two values_to

I've spent most of the day on this and am finally calling in some help. There are multiple entries here on related questions, but none that quite get at what I'm trying to do. Below is an example df. x ...

Scott Davidson

37

asked Jun 20, 2024 at 19:59

1 vote

1 answer

108 views

Is there an elegant way to handle changing number of rows within tidyverse?

In the tidyverse there are limitations concerning the number of rows resulting from some data processing. Most prominently, mutate expects the number of rows to equal that of the original data set. For example, if we ...

LulY

1,385

asked Jun 19, 2024 at 14:28

-2 votes

2 answers

195 views

Is there a %$% operator?

In the book R for Data Science, there is an operator %$%, as in the example code below. But when I run that code, I get the error message "there is no such operator". Can anyone help with ...

anderwyang

2,547

asked Jun 19, 2024 at 7:51
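
On the `%$%` question: `%$%` is the exposition pipe from the magrittr package. `library(tidyverse)` does not attach magrittr, so the operator is typically unavailable until magrittr is loaded explicitly. A minimal sketch, using `mtcars` as stand-in data, since the book example is not shown in the excerpt:

```r
library(magrittr)

# %$% exposes the columns of the left-hand side by name to the
# expression on the right, so disp and mpg resolve to mtcars columns.
r <- mtcars %$% cor(disp, mpg)

# Equivalent call without the exposition pipe.
r_base <- cor(mtcars$disp, mtcars$mpg)
```

The operator is mainly useful with functions that take vectors and have no data argument, such as `cor()`.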

0 votes

1 answer

107 views

Order columns based on suffix condition in R

The names of my variables look like this: df <- data.frame(var_NA = 1:10, var = 11:20, var_Level = 21:30, var_Total = 31:40) Except I have lots of variables. The key feature is that for every ...

luchonacho

7,295

asked Jun 18, 2024 at 19:08
