-4 votes
3 answers
168 views

I'm searching for duplicate values per column and need a count of them and data from some additional columns. Table sample BillNr Name email 1000 Shakira [email protected] 1001 Shakira [email protected] ...
Kaptah's user avatar
  • 9,887
0 votes
2 answers
103 views

We have a table with an identifier, key/value pairs, and start and end timestamps which indicate the valid period for the values. MASTER_WORK_ORDR_ID START_TS END_TS WORK_ORDR_ID_CTXT ...
L.P.'s user avatar
  • 13
3 votes
3 answers
157 views

I have a SQL table in postgres 14 that looks something like this: f_key data1 data2 fit 1 {'a1', 'a2'} null 3 1 {'b1', 'b2'} {'b3'} 2 2 {'c1', 'c2'} null 3 Note that data1 and data2 are arrays. I need ...
fitek's user avatar
  • 303
1 vote
5 answers
98 views

I use Postgres on my web server in order to record incoming queries into a table calls2, basically writing a single row each time with lots of repeating information, such as a date field ("when&...
Thomas Tempelmann's user avatar
0 votes
0 answers
108 views

I am getting an error using GROUP BY RANGE in GridDB SQL. I am referring to the example mentioned in the docs https://griddb.org/docs-en/manuals/GridDB_SQL_Reference.html#group-by-range name: trend_data1 ts ...
sayana_dutta's user avatar
3 votes
4 answers
223 views

Hopefully the title is reasonably intuitive, edits welcome. Say I have this dataframe: df = pd.DataFrame({'x': ['A', 'B', 'B', 'C', 'C', 'C', 'D', 'D'], 'y': [None, None, 1, 2, 3, 4,...
Hendy's user avatar
  • 10.7k
0 votes
1 answer
48 views

I have the following code: SELECT h3s.h3id, h3s.geog, MIN(ST_DISTANCE(`carto-os`.carto.H3_CENTER(htsp.h3id), `carto-os`.carto.H3_CENTER(h3s.h3id))) OVER (PARTITION BY h3s.h3id) FROM ...
Chris's user avatar
  • 67
1 vote
2 answers
104 views

I have a dataset called TotalPhosphorus, and I want to assign seasons to each observation. However, I need the winter season to include December from the previous year and January–March from the ...
Daniela's user avatar
  • 17
3 votes
4 answers
129 views

I am trying to perform a loop which loops through a list of single or multiple variables then sums a column. I am essentially trying to paste in from a list into the group_by() function so that it ...
KatChristiansen's user avatar
1 vote
1 answer
62 views

I have an in-memory DolphinDB table created as follows: ticker = `AAPL`IBM`IBM`AAPL`AMZN`AAPL`AMZN`IBM`AMZN volume = 106 115 121 90 130 150 145 123 155; t = table(ticker, volume); t; The output of ...
Dongyun Huang's user avatar
2 votes
1 answer
88 views

I am trying to manipulate a CSV using Pandas and I need to get the data into the format of one row per ID. This is an example of what I am trying to accomplish: From: df = pd.DataFrame({ 'ID': [1, 1, ...
sar's user avatar
  • 21
0 votes
1 answer
94 views

I have noticed that although df %>% group_by(firm) %>% arrange(week) %>% mutate(lag_sales = lag(sales)) %>% ungroup() ignores the grouping but calculates the correct lags as the ...
ZayzayR's user avatar
  • 307
2 votes
1 answer
78 views

I have a dataframe in R that looks like this (spaced to ease readability): utterance word syllable label syll_start syll_end 1 1 1 NA 1.1 2 1 1 ...
Jackson's user avatar
  • 75
0 votes
0 answers
27 views

I have a dataset populated from an API call to Splunk. The dataset contains the following: time destip destport transport 2025-09-17 22:03:09 172.16.5.1 53 UDP 2025-09-17 22:03:10 172.16.5.1 53 UDP ...
Jhowel's user avatar
  • 63
3 votes
2 answers
183 views

Here is an example of my data: sound word part syllable pitch_peak sound-1 mary subject 1 3.1 sound-1 mary subject 2 1.9 sound-1 studied verb 1 ...
Jackson's user avatar
  • 75
4 votes
0 answers
138 views

I’m storing IoT readings in a GridDB container and need one row per hour with the true average of the points that actually fall inside each hour (not interpolated values): ts_bucket ...
Badhon Ashfaq's user avatar
0 votes
1 answer
59 views

I'm trying to use Power Query to aggregate some invoicing columns by project number. I'm currently using a group by function which looks at the project number and then aggregates each ...
Stephanie Noyce's user avatar
1 vote
1 answer
126 views

I have a polars dataframe that I want to group by, concatenating the unique values into a single entry. In pandas, I do: def unique_colun_values(x): return('|'.join(set(x))) dd=pd.DataFrame({'...
frank's user avatar
  • 3,816
3 votes
1 answer
99 views

Table schema: CREATE TABLE WeatherReadings ( ts TIMESTAMP, temp DOUBLE ); Sample data: INSERT INTO WeatherReadings (ts, temp) VALUES (TIMESTAMP('2025-08-22T01:05:00Z'), 20.5), (TIMESTAMP('...
Mr Jahangir's user avatar
1 vote
2 answers
100 views

I have a dataframe df made up of n columns which are groups and one, "data". This dataframe is then grouped on the n group columns. df = pd.DataFrame(data={"g0": ["foo", ...
Aristide's user avatar
0 votes
1 answer
79 views

I am trying to use XSLT in my application (OIC) where, based on the input structure, I have to construct an output file which filters the records based on 2 elements. Input structure: <?xml version='1.0' ...
kumarb's user avatar
  • 497
7 votes
3 answers
444 views

I am trying to do a somewhat complicated group and sort operation in pandas. I want to sort the groups by their values in ascending order, using successive values for tiebreaks as needed. I have read ...
Jessica's user avatar
  • 1,813
0 votes
2 answers
122 views

I have a table of customer data. I will be joining it to a location table. Customer ID is distinct but Location ID is not because multiple customers can belong to one location. Each customer is ...
HawaiianShirts's user avatar
0 votes
2 answers
81 views

I'm trying to create a list of contracts that expire by date. I looked on many sites for a solution. I have a measure that calculates the date and I need a calculated table with a summarized ...
Pat N.'s user avatar
  • 47
-2 votes
2 answers
189 views

In the code example below I am grouping a pandas series using the same series but with a modified index. The groups in the end make no sense. There is no warning or error. Could you please help me ...
karpan's user avatar
  • 597
2 votes
2 answers
94 views

We're trying to group up date counts by month and index values are returning as decimals instead of integers when series contain any number of NaTs / na values. Simplified reproducible example: import ...
Chris Dixon's user avatar
  • 1,148
0 votes
2 answers
113 views

In my application I want to find the latest duty of each user from 'StaffDuty' table using hibernate query (i.e. HQL). Below is my query. query = session.createQuery("FROM StaffDuty where deptId....
KJEjava48's user avatar
  • 2,073
0 votes
1 answer
35 views

I'm encountering some issues when trying to perform grouped correlation calculations in DolphinDB. Here's my scenario: I'm using DolphinDB to calculate correlations between multiple columns in a table....
RORO's user avatar
  • 1
0 votes
3 answers
105 views

NOTE: This question has many related questions on Stack Overflow but I was unable to get my answer from any of them. I'm attempting to parallelize Prophet time series model training across multiple ...
Arnab Sinha's user avatar
1 vote
2 answers
106 views

I’m working with time-series data in SQL Server and need to retrieve the last valid value for each day. A valid value is defined as one that is non-null and not zero. The challenge is that data points ...
vishal_gosai's user avatar
1 vote
1 answer
65 views

I am not sure if this is possible with XSLT but I am trying to get the below XML into a format where it is name, title, date (if same date then only get date once), last value of In time (might not be ...
BryanG's user avatar
  • 29
1 vote
1 answer
72 views

What I'm currently doing is this: SELECT time_bucket('60 min', raw_data.timestamp) AS time_60min, COUNT(raw_data.vehicle_class) AS "count", raw_data.vehicle_class AS "...
PhilippR's user avatar
1 vote
0 answers
36 views

How can I use the Object.groupBy function with a variable? For example: const inventory = [ { Phase: "Phase 1", Step: "Step 1", Task: "Task 1", Value: "5" }, ...
ConsMI's user avatar
  • 19
2 votes
1 answer
285 views

I recently had to update the virtual environment for one of my libraries from Python 3.7 to 3.10, which also involved updating Pandas from 1.1.5 to 2.3.0. In the previous virtual environment, this ...
Jan Stuller's user avatar
0 votes
0 answers
69 views

I am working on some proof of concepts for ML and want to try an unusual scaling method. I would like to group my data and then "scale" it and apply a binarize to that data. Basically I ...
Tim Romero's user avatar
0 votes
1 answer
92 views

I have a database of music manuscripts that looks like the below diagram. A 'Source item' belongs to a certain manuscript (source). A source item is then categorized as EITHER a 'Section' of a 'Piece' ...
tapemachine86's user avatar
1 vote
1 answer
134 views

I am trying to come up with a frequency in Pandas that represents the start of a calendar week (configurable by week start). For example, for all dates from 2025-01-06 (Monday) to 2025-01-13 (Sunday), ...
bhub's user avatar
  • 149
2 votes
1 answer
53 views

Suppose I have this: ISresult = h25.groupby(['month','impactedservice']).agg({'resolvetime': ['count','median','mean', 'min', 'max','std']}) The column list looks like this: [('resolvetime', 'count'),...
Mark G's user avatar
  • 97
0 votes
1 answer
35 views

Sharing a common DolphinDB use case and solution for data processing. I have a table with four columns: order_book_id, date, Q, and revenue. I want to group the data by order_book_id and date, and ...
saki's user avatar
  • 319
0 votes
3 answers
120 views

I would like to extract the MIN and MAX from multiple columns (start_1, end_1, start_2, end_2), grouped by "Name". I have data that looks like this: start_1 end_1 start_2 end_2 name 100 ...
soosa's user avatar
  • 165
1 vote
3 answers
127 views

In pandas, I have the following long format dataframe with 1 binary variable « Metric » with 2 modalities (Nb of rooms in residence, squared meters of the residence) : pd.DataFrame({'State': {0: 'New ...
Lucas's user avatar
  • 59
0 votes
1 answer
97 views

BigQuery has a new GROUP BY GROUPING SETS feature [1]. Its syntax is simpler than the traditional GROUP BY plus UNION approach. I wonder if it also performs much better, because grouping sets only scan the source ...
Hui Zheng's user avatar
  • 3,247
-1 votes
1 answer
57 views

With this data: name movie john big daddy bob titanic john avatar I want the output to be: name movie john big daddy, avatar bob titanic I tried this: SELECT name, LIST_AGG(movie) from people.table ...
yesiamamir's user avatar
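
The result this question asks for (one row per name, with that person's movies joined by commas) can be sketched in plain Python. This is only an illustration of the intended grouping, not the SQL the asker needs; the `rows` list and the `aggregate_movies` helper are hypothetical names introduced here, built from the sample data in the excerpt.

```python
from collections import defaultdict

# Sample rows taken from the question excerpt: (name, movie)
rows = [("john", "big daddy"), ("bob", "titanic"), ("john", "avatar")]


def aggregate_movies(rows):
    """Group movies by name and join each group's movies with ', '."""
    grouped = defaultdict(list)
    for name, movie in rows:
        grouped[name].append(movie)  # keeps the original row order per name
    return {name: ", ".join(movies) for name, movies in grouped.items()}


result = aggregate_movies(rows)
for name, movies in result.items():
    print(name, "|", movies)
```

In SQL dialects this kind of aggregation is usually paired with a `GROUP BY name` clause; the exact aggregate function name depends on the database.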
0 votes
2 answers
81 views

Following is my table. I want to sum across all different combinations and put the sum in separate columns, not in the same column. data: Subject Var1 Var2 Var3 Var4 Constant1 Constant2 ONE 1 2 1 1 A ...
user10969476's user avatar
0 votes
0 answers
19 views

I have a table “clean_factor“ where the column “y“ indicates the stock returns and the subsequent columns indicate factor exposures. How do I calculate the daily residual return of each stock with the ...
smile qian's user avatar
1 vote
3 answers
94 views

How can I perform a GROUP BY in SQL when the group_name values are similar but not exactly the same? In my dataset, the group_name values may differ slightly (e.g., "Apple Inc.", "...
Ahamad's user avatar
  • 1
1 vote
3 answers
96 views

I am trying to find the max of transactions count based on type for each user. id user_id type 1 1 A 2 1 B 3 1 C 4 1 A 5 2 B 6 2 C 7 2 C 8 2 C I am expecting the output to be: user_id type count 1 A 2 ...
Mr.Singh's user avatar
  • 2,055
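
The "most frequent type per user" logic described in this question can be sketched in plain Python, assuming the eight sample rows shown in the excerpt. The names `transactions` and `top_type_per_user` are hypothetical, and ties are resolved in favour of the type seen first, which is an assumption the question does not address.

```python
from collections import Counter, defaultdict

# Sample rows from the question excerpt: (id, user_id, type)
transactions = [
    (1, 1, "A"), (2, 1, "B"), (3, 1, "C"), (4, 1, "A"),
    (5, 2, "B"), (6, 2, "C"), (7, 2, "C"), (8, 2, "C"),
]


def top_type_per_user(transactions):
    """Return {user_id: (type, count)} for each user's most frequent type."""
    counts = defaultdict(Counter)
    for _id, user_id, tx_type in transactions:
        counts[user_id][tx_type] += 1
    # most_common(1) yields the single (type, count) pair with the highest count
    return {user_id: c.most_common(1)[0] for user_id, c in counts.items()}


top = top_type_per_user(transactions)
for user_id, (tx_type, count) in top.items():
    print(user_id, tx_type, count)
```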
1 vote
2 answers
69 views

I am trying to figure out how to code a reverse look up in pandas dataframe using groupby and looking for the owner of a max time. ` import pandas as pd df = {'Name': ['Mike', 'Lilly', 'Frank', 'Jane',...
Tim Romero's user avatar
0 votes
1 answer
101 views

I am trying to use the SQL commands COUNT and GROUP BY to show the number of students with each letter grade, but I'm having difficulty in doing so. A new column that I created contains a letter grade ...
Colton's user avatar
  • 1
0 votes
0 answers
42 views

I am calculating the First Time Resolution (FTR) percentage from call logs using the following Python code with pandas and numpy. When I run the code on one CSV file (calls_logs_cleaned_2025-05-02.csv)...
IAIMT2024's user avatar
