10,285 questions
0
votes
1
answer
110
views
How to create 2 new columns in R (date difference + convert Seasons to minutes)? [duplicate]
I am new to R and trying to create two new variables from my dataset.
My data frame is called netflix and it contains these relevant columns:
date_added and duration
Example values:
date_added: "...
-3
votes
0
answers
46
views
How can I use a named vector with recode? [duplicate]
Suppose I have
df <- data.frame(name=c("Hello", "Hi", "GoodMorning"))
I would like to convert "GoodMorning" into "GoodEvening" (of course this is ...
0
votes
2
answers
120
views
Fill but with a conditional
I have the following dataframe:
df <- tribble(
~nuts_code, ~value,
"AT", 1,
"AT1", NA,
"AT2", NA,
"BG", NA,
"BG1", 10,
"BG2"...
1
vote
4
answers
193
views
How to coalesce using a function of other rows?
I have the following tibble:
eu_df <- structure( list( nuts_code = c( "PT17", "PT17", "PT17", "PT17", "PT17", "PT17", "PT17", ...
3
votes
5
answers
180
views
index/match equivalent in R for merging data frames in a tidy way
I have two data frames in R that share SOME columns and SOME rows in an arbitrary order.
I now want to create a third data frame that has the columns and rows of data frame 1 merged with the ...
3
votes
7
answers
332
views
Combine columns of a list of dataframes with a custom function
Suppose I have a list of dataframes l. All dataframes are guaranteed to have the same shape and contain the same columns.
I would like to combine the columns of those dataframes with a column-specific ...
3
votes
1
answer
110
views
Creating custom dbplyr compatible function in SQL
I'm working with dbplyr and DuckDB to process very large Parquet files using limited system resources. To make my workflow more efficient, I want to create a custom function that can be seamlessly ...
4
votes
4
answers
296
views
How to expand printing of tibble's nested data types (list columns)?
When printing a tibble, how do I print the cells with nested data types (in list columns)? Let's have a look at two examples:
1. call data type
b <- tibble(a = 1, prior = list(call("rnorm"...
3
votes
4
answers
179
views
Two-level unstack in R
Say I have data.frame h like this:
component hyperpar x
1 i sigma2 1.0
2 envN sigma2 1.0
3 envN sigma2_int 0.1
4 envN sigma2_int ...
2
votes
2
answers
118
views
Deducting values across a data frame based on condition
I am trying to deduct values across a data frame of elements.
structure(list(Date = structure(c(20095, 20095, 20095, 20095,
20123, 20123, 20123, 20123), class = "Date"), Type = c("Drip...
3
votes
3
answers
214
views
Mutating detection data into binary
Currently I have a dataframe of bear detections that I want to convert into a binary detection history (14 columns of day1, day2, day3, etc. where:
actual_date_out = the date the camera was deployed, ...
0
votes
1
answer
116
views
In `dplyr` `filter`, how to parse conditions which are stored in string? [duplicate]
In dplyr filter, how to parse conditions which are stored in a string? When there's only one condition, the code below works:
library(dplyr)
conditions_string_1 <- "Species=='versicolor'"
iris ...
2
votes
3
answers
164
views
Calculating Overlapping hours / minutes
I have some date/time data that has overlapping minutes/hours and I want to calculate how much there is. I'm a little stumped on how to do this. I prefer a tidyverse solution because I'm more familiar ...
0
votes
2
answers
84
views
How to keep or recover group names
Sometimes, in case report forms, almost identical entries are entered on a certain day for some patient.
I want to find similar, possibly misspelled comments, per patient and date. Finding the ...
0
votes
1
answer
140
views
Extract/ subset rows from df start based on partial match in one column and ends extraction based on second column value
I want to extract information from a large df of multiple .docx files. Using the officer package I have so far read all .docx from a folder into 1 df. However I need a way to extract specific sections ...
9
votes
6
answers
360
views
Use R to visualize activities across 24 hours as a pie chart 'clock' or bar
For fun, I would like to track the time when a baby is sleeping or feeding over the course of 24 hours and then visualize this as a pie chart or polar chart so that it resembles a 24-hour clock. Every ...
1
vote
0
answers
81
views
Meta-analysis: InfluenceAnalysis()
I am currently learning how to conduct a meta-analysis using this guide: https://bookdown.org/MathiasHarrer/Doing_Meta_Analysis_in_R/
However, I have encountered an issue in Chapter 5 when using ...
3
votes
4
answers
178
views
Create a dataframe column containing a list of numbers whose length is determined by another column
I have a data frame of names, and each name has a count of individuals, e.g.
df = read.table(header=TRUE, text=
'name count
a 10
b 1
c 7
d 3')
> df
name count
1 a 10
2 b ...
4
votes
2
answers
175
views
How to tidy messy data [closed]
I have a messy data set, which generally resembles the output of the following
schools_messy <- tibble::tribble(
~data,
"state:maryland",
"location:bowie||name:bowie state ...
0
votes
1
answer
101
views
Tidyverse pivot_wider specify data which is retained as rows [duplicate]
I have a recurring problem which I have been addressing using a messy approach of multiple passes of pivot_wider and rbinding the results together. This seems inelegant; does anyone have a single ...
0
votes
1
answer
137
views
R Tidyverse Efficient Way to group_by, summarize, mutate, nest Multiple Levels
I am trying to get a large dataset prepped to create an ECharts4R sunburst visualization, which requires a hierarchical data structure. What is an efficient way to group by, summarize, and nest data ...
2
votes
3
answers
118
views
dplyr: find if a column value is substring of any item in a fixed list and mutate value
I started with a simple example which works, but I don't know how to use it within dplyr mutate.
This works, I get "bcd-234":
library(tidyverse)
list = c("abc-123", 'bcd-234', 'cde-345'...
-1
votes
3
answers
186
views
Drop rows with missing values in all columns [duplicate]
It looks like tidyr's drop_na will drop rows if any of the specified columns contain missing values.
Example:
> library(tidyverse)
> df <- data.frame(a=c(1,NA,2,NA), b=c(3,4,NA,NA))
> df
...
1
vote
2
answers
135
views
Why does spec() return NULL after subsetting a tibble? (And how do I avoid that?)
After reading in my data using read_csv() from readr, the command spec() returns "full column specification" for the resulting tibble:
> spec(steps)
cols(
duration = col_double(),
...
2
votes
3
answers
138
views
Difference between if_any(any_of(vars)) and if_any(all_of(vars))
Take the following MWE:
df <- data.frame(a=c(TRUE, TRUE, FALSE), b=c(FALSE, TRUE, FALSE))
myvars <- c("a","b")
The aim is to build a column c which is row-wise TRUE if one ...
3
votes
2
answers
143
views
How can I pass column names through nested functions using `dplyr` in R?
I am trying to write a function that, using tidyverse tools, will load two CSVs, join them on some key column, and write the joined result to a CSV. I am passing in the names of the key columns as ...
10
votes
8
answers
752
views
Is there an R function to calculate row sums using a range/window of column indices?
Is there a way to calculate the sum (or mean, etc.) for a range of values that are defined for that row in another column?
Here's some sample data:
structure(list(start = c("cmi_apr", "...
1
vote
2
answers
108
views
Join two dataframes, group by the first keeping all its columns, summarize the second
I have two data.frames, nests and nest_days, the latter of which contains multiple days for one nest. I want to do a join, and group by nests (i.e. group by the rows of the first data.frame). I want to keep all ...
0
votes
1
answer
69
views
Adding row names within the reframe() function in dplyr
I would like to know how to add a column of row names in combination with the reframe() function in dplyr. Here's toy data: a grouped dataset with three groups crossed with three conditions and a ...
8
votes
4
answers
717
views
How to count number of instances over a threshold value between two dates?
I have two data frames I am working with. The first contains information on a cohort of kids. These same kids have been measured multiple times on different days (uniqueID AD_1 indicates child AD ...
0
votes
2
answers
83
views
how to extract dataframe name from a list using map
I tried to use 'map' to filter a list of datasets. How can I extract the name of the dataframe, which is 'YYY' in my code?
library(dplyr)
library(purrr)
df1 <- data.frame(var1=seq(1:10), var2=seq(1:10),...
1
vote
1
answer
40
views
Joint Histogram and Boxplot Plot with Faceting
I want to visualize the distribution of a variable ('F3A7_1') by plotting a histogram, and right underneath it a boxplot (below the y-axis of the histogram or something). I also want to create this ...
2
votes
1
answer
139
views
Alternatives to deparse(substitute()) to get name of object within a function
In vignette("programming", package = "dplyr") there is the following example code
my_summarise2 <- function(data, expr) {
data %>% summarise(
mean = mean({{ expr }}),
...
2
votes
3
answers
88
views
set_names() returns "character" not "tbl_df, tbl, data.frame"
Unexpected output of set_names()
I'm trying to name a list of tibbles using set_names.
I expect the names to be "tbl_df, tbl, data.frame"
However, the result is "character".
Any ...
0
votes
1
answer
69
views
R distance to first point of SpatVect points
I have a spatial vector (SpatVector) with points arranged in groups. I want to calculate the distance of each point to the first point in the group.
Trans2024_df
Name Distance_to_first
1 ...
2
votes
1
answer
73
views
How can I use group_by and mutate to perform a subtraction calculation with specific groupings? Time 0 minus Time X for all groups [closed]
I have this dataframe:
mydf <- structure(list(Time = c("T0", "T3", "T3", "T0", "T3", "T3"),
Organism = c("BB", "BB...
2
votes
2
answers
350
views
How to resolve "no visible global binding for variable" for column name in dplyr tidy-select pipeline?
The following code runs fine, but it fails a lintr check and raises a note from devtools::check(), which is preventing me from submitting my package to CRAN.
# MINIMAL REPRODUCIBLE EXAMPLE OF ...
1
vote
2
answers
59
views
How to plot geom_bar without showing multiple lines
I am trying to make a bar plot; however, it is printing multiple lines on each bar. I understand that I might have to aggregate, but if I do I need to retain the original columns because I need to fill by ...
0
votes
1
answer
55
views
Using pivot_longer() to expand rows of data and combine a pattern of columns
I have a data frame that looks like this:
ID Email Name Company TripIdentifier Date1 Campsite1 NumberOfAnimals1 Date2 Campsite2 NumberOfAnimals2
1 1 [email protected] ...
2
votes
2
answers
293
views
Handling larger than memory datasets in R
I'm working with 50 Parquet files (each file is of ~800MB with ~380000 rows and ~8 columns). I need to perform a grouped summarisation in R. Something like:
group_by(sample_id, gene1, gene2) %>%
...
1
vote
2
answers
94
views
Create a Grouping Column/Variable from other Columns in R
I'm trying to group data into a grouping variable based on whether or not there is data in specific columns. In other words, if there is data in the same row for V1 & V2 below, then I want to put ...
1
vote
3
answers
86
views
Create multiple tables
I am using survey data and would like to create multiple cross tables with row percentages at once.
Below is some dummy data and the way I would create those tables individually. Where I get confused ...
0
votes
0
answers
31
views
Escaping by inserting a slash before special characters [duplicate]
I have the following list of words/phrases:
vec <- c("Apple (Red)",
"Ginger [Indian]",
"Avocado",
"Cherry: Spring time")
I want to ...
4
votes
5
answers
164
views
Add a column in R based on other columns values [closed]
I'm relatively new to R. I have a table of data consisting of an id, plus 3 values.
library(dplyr)
df <- tibble(id=c(1, 2, 3),val_a = c(13,25,42), val_b = c(25,30,0), val_c = c(7,27,21))
df
# A ...
3
votes
5
answers
190
views
Convert a specific warning type into an error
Consider the following code:
> df <- tibble(gender=c(1,1,0))
> df$male
Warning: Unknown or uninitialised column: `male`.
NULL
How can I convert this specific warning type into an error?
I ...
2
votes
1
answer
126
views
Conditionally overwrite cell text with gt::gt()
With {gt}, you can conditionally style cells based on the values in another column in your dataframe. Is there a way to conditionally overwrite the cell display values based on a different column? In ...
6
votes
6
answers
150
views
The dplyr function mutate does not apply a function stored in a dataframe/data.table object
I have a data.table object which contains 3 columns (spec_fun, param_1, param_2). I wanted to use mutate to create a new column based on the result of the 3 mentioned columns but I got the error:
...
2
votes
2
answers
101
views
cleaner way to match observations using IDs in different columns
I am using mark-recapture data, with each individual having a PIT tag (Tag) and a genetic sample ID given at the first observation (ID), which is typically used as the ID for the individual (Unified....
5
votes
0
answers
91
views
Extending tbl with custom print
I have created a sub-class of a tbl and I want to provide a custom print method which adds some more arguments to further tweak the output. I have read vignette("extending", package = "...
1
vote
1
answer
50
views
parse dynamic case_when based on thresholds values
I'd like to know if there is a way to parse a dynamic case_when expression.
Let's take as an example a vector of 3 thresholds:
thresholds = c(0.4, 0.6, 0.8)
Based on that, I would like to assign ...