Newest 'dataset' Questions

Advice

0 votes

1 reply

83 views

Regression analysis

How should I handle a mass-point in the dependent variable when running OLS regression in R? I’m working with a household expenditure dataset (Living Costs 2019) where the dependent variable is the ...

Jimothan

1

asked Nov 30 at 11:29

1 vote

1 answer

80 views

Dropdown Menu - Adding an extra space in options

I'm currently in a coding class and I'm making an app for a project. We have to insert datasets into our app. The problem I am experiencing is that when I use my dropdown menu for options in my ...

Melina Nichols

21

asked Nov 12 at 1:44

1 vote

1 answer

146 views

Is there a way to write a previously read and updated DataSet back to Excel? [duplicate]

The idea is simple and no matter how much I have searched I couldn't find any similar example. Using the OleDbDataAdapter class, I am able to achieve the first step: Read a specific range in a ...

ptuga

79

asked Oct 30 at 23:47

1 vote

0 answers

108 views

How to lazy load jsonl file

I am trying to build a PyTorch Dataset based on some .jsonl files. The size of each .jsonl file is about 2GB, and I have 50 such files. Therefore, it would not be very practical to load all these ...

xuehao-049

11

asked Oct 14 at 16:02

-1 votes

0 answers

24 views

JavaScript: how to select an element with a random "dataset" attribute [duplicate]

There is a set of elements that look like <a data-v-924a11e2 ... </a>. The '924a11e2' part of 'data-v-924a11e2' is random each time the page updates, and this dataset attribute 'data-v-924a11e2' has no value. ...

Alex Krutoi

1

asked Sep 27 at 18:08

0 votes

0 answers

103 views

Record Selection in Main Dataset from a Sub Dataset in Delphi

The application I'm writing has a main form where all the database fields and DBGrid that contains the entire SQLite dataset (ExpItemQuery) reside. I have designed a database search functionality into ...

Skypilot65

216

asked Sep 23 at 23:42

0 votes

0 answers

21 views

How to name data sets by algorithm function - covariates vs target vs context etc

I have coded up a program that has a scoring target value plus other necessary values associated with that target value, plus the same features are used as dependents in my prediction engine. Up to ...

ciaran haines

389

asked Sep 10 at 8:23

0 votes

1 answer

139 views

How to load dataset from huggingface to google colab?

I am trying to load a training dataset in my Google Colab notebook but keep getting an error. Here is the code snippet which returns the error: from datasets import load_dataset ds = load_dataset("...

AlecArk

1

asked Aug 23 at 14:07

1 vote

1 answer

48 views

How to access survival jasa dataset

The survival package contains a heart dataset which comes in two formats, "jasa" and "jasa1". How can I access the "jasa" format? I tried data("jasa", package="...

robertspierre

5,383

asked Aug 11 at 14:32

3 votes

1 answer

57 views

Merge many .fasta files

I’m currently working with a large dataset and need help merging multiple .fasta files. Although I’m not an expert, I’ve attempted to automate this process using a Python script. However, the merging ...

Andrea S.

31

asked Jul 23 at 18:59

0 votes

1 answer

54 views

Efficient multi-host TPU dataset processing

I want to train an LLM on TPUv4-32 using JAX/Flax. The dataset is stored in a mounted Google storage bucket. The dataset (Red-Pajama-v2) consists of 5000 shards, which are stored in .json.gz files: ~/...

innerproduct

3

asked Jul 10 at 21:35

3 votes

2 answers

106 views

Few questions on a medical dataset

I am currently working on a medical dataset containing whole slide images (~300 images). These images are pretty big (.tif, average of 7k by 7k pixels). I am currently dividing each of these images ...

Shini_ Gami

39

asked Jul 10 at 19:26

0 votes

0 answers

70 views

Hugging Face applying Transformation on nested to datasets without loading into memory

I am trying to apply below transformation for preparing my datasets for fine tuning using unsloth huggingface. It requires the dataset to be in following format. def convert_to_conversation(sample): ...

SoraHeart

428

asked Jul 4 at 11:27

0 votes

1 answer

38 views

Why does TensorDataset divide the data into minibatches?

Why does TensorDataset divide the data into minibatches? For example, when given a 2D array, instead of yielding 2D tensors as batches, it sets the required batches to be minibatches, and its actual ...

J. Doe

305

asked Jul 3 at 17:32

1 vote

1 answer

54 views

Is there a limit to the number of columns that can be accessed in a wix dataset using velo?

I am trying to help my church have a lending library. Currently, we have a Google sheet inserted in the Wix site. I am trying to make a better version. The idea is to import all the books as a .csv ...

Barney Joseph

11

asked Jun 25 at 9:08

1 vote

1 answer

100 views

Map/Reduce script exceeds usage limit in the reduce stage in NetSuite

I hope you are doing well! I have developed a map/reduce script to send dataset results as a CSV file. If there is a large amount of data, the script exceeds the usage limit in the reduce stage. I ...

Maira S

121

asked Jun 13 at 6:01

0 votes

1 answer

40 views

Discrepancy in number of elements outputted by torch Dataset and DataLoader

I have a custom Subset: class TestSubset2(Subset): def __init__(self, dataset, indices, days=False): super().__init__(dataset, indices) self.days = days def __getitem__(self, ...

Samuel Kostousov

11

asked May 21 at 1:41

0 votes

2 answers

142 views

ADF Lookup Activity doesn't work with DataSet parameters

I am trying to use a Lookup activity with a dataset that is parameterized. I thought it would be simple, but I think it doesn't support dataset parameters for an Azure database source. Can ...

SandeepG

73

asked May 11 at 8:06

0 votes

1 answer

31 views

SSRS Report Manager not returning dataset result

How can I return the expected dataset (stored procedure) results from SSRS Report Manager? SSRS Report Manager is returning a different dataset (stored procedure) result than what's expected from ...

Andrew Phommathep

1

asked May 9 at 18:35

0 votes

1 answer

50 views

Azure DF error: Unable to parse expression

I am trying to use a dataset parameter set in the pipeline to make my blob path dynamic for each data flow I've created. However, just testing this first data flow, I keep getting an error saying '...

creed6700

1

asked Apr 16 at 13:44

0 votes

1 answer

69 views

NetSuite Saved Search vs. Dataset: Date Discrepancy

I'm encountering a discrepancy between the Created date displayed in a NetSuite Saved Search and a NetSuite Dataset. The saved search shows one date, while the dataset shows the same record with the ...

Maira S

121

asked Mar 26 at 5:38

1 vote

1 answer

62 views

How to retrieve SPICE storage space occupied per dataset

While trying to retrieve the SPICE storage occupied per dataset in AWS QuickSight, I came across the following: AWS does not provide a direct per-dataset breakdown in the console, but you can use ...

Tam

4,027

asked Mar 19 at 7:55

0 votes

2 answers

50 views

Spark Java Structured Streaming filters

Santosh Shimpi

19

asked Mar 18 at 4:56

1 vote

1 answer

75 views

Between in dplyr with lm function

I am testing for outliers using the iris dataset mod <- lm(Sepal.Width ~ Sepal.Length*Species, data = iris) I use rstudent() to calculate the studentized residuals, and add an indicator whether ...

mariann

35

asked Mar 16 at 18:54

1 vote

1 answer

70 views

boxplot across multiple columns in R

I'm new to R and, having next to no experience with it, I'm struggling with what may be a pretty easy problem. I have a dataset of acceptability judgments provided by a group of 30 participants on ...

Daler Fergani

11

asked Mar 14 at 10:55

0 votes

1 answer

89 views

Tensorflow Keras model with interleave is having issues fitting

I'm working on a CNN project where I make use of TensorFlow and Keras. In addition, I use interleave as the dataset is rather large (too large to load into the RAM with our resources). But there's an ...

William Frid

85

asked Mar 13 at 19:34

0 votes

1 answer

99 views

pytorch code messes up my RAM when using torch.zeros()

I have a function to measure the allocated ram by python in megabytes: def getram(): print(psutil.Process(os.getpid()).memory_info().rss / 1024**2) And also I have: device = "cuda" My ...

Nex

3

asked Mar 12 at 17:21

0 votes

1 answer

46 views

TensorFlow dataset loops endlessly with model.predict

I use this code to set up my dataset for training and prediction: train_dataset = train_dataset.batch(train_batch_sz) train_dataset = train_dataset.repeat().prefetch(5) test_dataset_sim = test_dataset ...

Jonathan Roy

453

asked Mar 10 at 21:38

0 votes

1 answer

116 views

loading a dataset with datasets.load_dataset is hanging

I'm trying to load some data using datasets.load_dataset. It runs correctly on a head node. The issue is happening on a Slurm node. I'm using a conda env with datasets installed. When I run on head ...

ate50eggs

454

asked Mar 10 at 20:01

0 votes

1 answer

89 views

How to import .xlsx in Jupyter Notebook from iPhone Carnets App?

I would like to work with the dataset in Jupyter Notebook from the iPhone Carnets app, but I can’t find the correct path. Can someone help me? I downloaded the dataset from Kaggle. Are free files ...

Samusdr17

1

asked Mar 9 at 3:23

0 votes

0 answers

14 views

How to update a list of items in a dataset in Oracle Analytics

We are just beginning to use Oracle Analytics. We manage large amounts of data and much of what my team does is run queries for lists of items. This can be 10, 100, or 1000s of different item ...

babs75

1

asked Mar 6 at 23:23

0 votes

1 answer

73 views

How to email dataset results in a CSV format using scheduled script?

I am using a NetSuite dataset (N/dataset API in SuiteScript 2.1) to retrieve existing dataset results. The dataset returns results in JSON format, including columns like PO Status, Billing Status, Sub ...

Maira S

121

asked Feb 20 at 4:56

0 votes

1 answer

263 views

The FAISS indexing and the dataset indexing don't match

I'm trying to compute the recall after performing a HNSW search in FAISS. By recall, I mean the following metric: Recall = TP / (TP + FN) Where I consider an image as a True Positive (TP) if it ...

No Yeah

23

asked Feb 17 at 1:52
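
The recall formula quoted above, Recall = TP / (TP + FN), can be sketched for a single query as follows. The asker's exact definition of a true positive is cut off in the excerpt, so this sketch assumes a true positive is a retrieved ID that also appears in the ground-truth set; the function name `recall` and the sample IDs are hypothetical, and no FAISS call is involved.

```python
def recall(retrieved_ids, relevant_ids):
    """Recall = TP / (TP + FN) for one query.

    Assumption: a true positive is a retrieved ID that is also in the
    ground-truth (relevant) set, so TP + FN equals the size of that set.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0  # convention chosen here for an empty ground truth
    true_positives = len(relevant & set(retrieved_ids))
    return true_positives / len(relevant)


# Example: 2 of the 4 relevant items were retrieved.
example = recall([1, 2, 3], [2, 3, 4, 5])
```

When the IDs returned by the index are positions in insertion order rather than the dataset's own keys, they need to be mapped back to dataset keys before being passed to a function like this.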

1 vote

6 answers

113 views

Automating averages and stdevs for large dataset in set intervals of rows

I am currently trying to make a dataset which refers to a series of data in another tab, of which it takes 20 points (vertically) and averages them and gives a stdev then two open rows and then again ...

Jan van oest

11

asked Jan 29 at 10:27

1 vote

0 answers

100 views

How to interpolate a multidimensional xarray DataArray?

I am using the xarray library and I have some doubts/questions. I have this dataset: ds <xarray.Dataset> Size: 2GB Dimensions: (Latitude: 364, Longitude: 246, Lon_u: 247, Lat_v: 364, ...

ChesuCR

9,752

asked Jan 23 at 12:29

0 votes

0 answers

15 views

How to manage a history of data sets feeding into a PBI dashboard?

I have a simple PowerBI dashboard that links to several Excel spreadsheets. I take a cut of data monthly, but want to save and allow the user to select previous iterations using a drop down slicer. ...

FraserS

3

asked Jan 21 at 15:59

0 votes

1 answer

65 views

Trouble with Filtering a .NET DataTable

I have the following code: myDataSet.myTable.DefaultView.RowFilter = "Code <> 'X'" After the line executes, I can see the updated result where no rows contain an X for the Code column,...

Rich

6,585

asked Jan 17 at 15:29

0 votes

1 answer

63 views

About pytorch model input sample size

I'm a newbie in PyTorch and AI. I found a repo (https://github.com/kacperbednarz1997/AMC_nets) where the author compares various models for automatic modulation classification (AMC). I get all datasets used ...

AlexDtd

21

asked Jan 16 at 15:14

0 votes

1 answer

47 views

override pytorch Dataset efficiently

I want to inherit from the torch.utils.data.Dataset class to load my custom image dataset, let's say for a classification task. Here is the example from the official PyTorch website at this link: import os ...

Sepehr Amini Afshar

3

asked Jan 16 at 2:38

0 votes

0 answers

28 views

Improving performance of DataSet.ReadXml by means of DataTable.BeginLoadData: DataSet.EnforceConstraints is left false

In Loading a DataSet from XML, I've found the following statement: If you call ReadXml to load a very large file, you may encounter slow performance. To ensure best performance for ReadXml, on a ...

exaiwitmx

119

asked Jan 14 at 17:02

0 votes

0 answers

59 views

Unexpected Feature ID in Yahoo! Webscope ydata-frontpage-todaymodule-clicks-v1_0 Dataset

I'm working with the Yahoo! Webscope dataset ydata-frontpage-todaymodule-clicks-v1_0 (specifically, the click logs for the first ten days in May 2009). The dataset description states that each user ...

amarchin

2,124

asked Jan 11 at 0:13

0 votes

1 answer

76 views

Issue with Merging Time-Series datasets for consistent Time Intervals

I am currently working on a project where I have to first merge two datasets: The first dataset contains weather data in 30 minute intervals. The second dataset contains minute-level data with PV ...

TryingToLearn

1

asked Jan 10 at 15:52

0 votes

0 answers

14 views

How to quantify differences between data sets in SQL Server?

I'm comparing two datasets in SQL Server (tables of the same schema) using row hashing (for example, using CheckSum() or HashBytes()). At this point, I can tell which records are identical and which ...

waza

11

asked Jan 8 at 20:54

0 votes

1 answer

61 views

Dataset uploaded and verified, but cannot use read.csv

I have uploaded a csv I created. This dataset is visible in Kaggle under "My datasets" - I can open and inspect it. I have tried various ways of linking a dataset to my notebook. clicking ...

Heike Vau

1

asked Jan 5 at 1:01

2 votes

0 answers

68 views

How to supply sample weights to Tensorflow dataset in R

I am trying to set up a dataset in R to run a neural network using TensorFlow, but I can't seem to figure out the right code to allow sample weights to be specified. The input array is image_data and ...

D_Taylor

23

asked Jan 3 at 4:57

0 votes

0 answers

13 views

Consistent Color Coding for Organ-Specific Biodistribution Data (column features as function of timepoint)

Hi, I'm trying to implement a consistent color coding for organ-specific biodistribution data across multiple sheets in Tableau. My dataset is structured such that each organ is represented by its own ...

GalB

11

asked Jan 2 at 14:46

0 votes

2 answers

263 views

How to Generate QR Image to crystal report dynamically without add new blob field manually?

Can I add a QR image to a Crystal Report dynamically without adding a new column to the dataset and dragging it to the report? I retrieve the URL from the database, and based on that, I display the ...

ahmed salah abd elaziz elbarba

37

asked Dec 27, 2024 at 20:31

0 votes

0 answers

109 views

RuntimeError to obtain ImageNet dataset from torchvision.datasets: ILSVRC2012_devkit_t12.tar.gz is not present in the root directory or is corrupted

Given: import os from torchvision.datasets import ImageNet from torch.utils.data import DataLoader train_dataset = ImageNet( root=os.path.expanduser("~/.cache"), split='train', ...

farid

1,631

asked Dec 27, 2024 at 12:05

0 votes

1 answer

53 views

PyTorch DeiT model keeps predicting one class no matter what

We are trying to fine-tune a custom model on an imported DeiT distilled patch16 384 pretrained model. Output: Cost at epoch 0 is 4.611058227040551 Cost at epoch 1 is 0.9889081553979353 test set ...

shamilemir

1

asked Dec 19, 2024 at 4:52

1 vote

1 answer

87 views

How to join two datasets based on the id, date and approximate time?

I need to merge datasets A and B based on the date, id and nearest time. The times in the datasets do not match. The time in B is always 0 to 10 minutes greater than the time in A. I tried left_join ...

Mee

321

asked Dec 17, 2024 at 10:53
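
The matching rule described above (same id, time in B between 0 and 10 minutes after the time in A) can be sketched as a forward nearest-match join. The question itself uses R and `left_join`; this Python version only illustrates the logic. The function name `join_nearest_forward`, the `(id, timestamp, value)` row layout, and the sample rows are all assumptions, and the date is taken to be part of the timestamp.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def join_nearest_forward(rows_a, rows_b, tolerance=timedelta(minutes=10)):
    """Left-join A to B on id and the first B time in [t, t + tolerance].

    Rows are (id, timestamp, value) tuples; this layout is an assumption.
    Unmatched A rows get None as the joined value.
    """
    # Group B by id with timestamps sorted, so each lookup is a binary search.
    by_id = {}
    for id_, t, val in sorted(rows_b, key=lambda r: (r[0], r[1])):
        times, vals = by_id.setdefault(id_, ([], []))
        times.append(t)
        vals.append(val)

    joined = []
    for id_, t, val in rows_a:
        times, vals = by_id.get(id_, ([], []))
        k = bisect_left(times, t)  # first B time at or after the A time
        ok = k < len(times) and times[k] - t <= tolerance
        joined.append((id_, t, val, vals[k] if ok else None))
    return joined


# Hypothetical sample rows.
a = [(1, datetime(2024, 1, 1, 10, 0), "a1"),
     (2, datetime(2024, 1, 1, 10, 0), "a2")]
b = [(1, datetime(2024, 1, 1, 10, 4), "b1"),
     (1, datetime(2024, 1, 1, 9, 59), "b0"),
     (2, datetime(2024, 1, 1, 10, 30), "b2")]
result = join_nearest_forward(a, b)
```

Here the first A row matches "b1" (4 minutes later), while the second gets None because the only candidate is 30 minutes later. A match that crosses midnight would pair rows with different calendar dates, so a strict same-date rule would need an extra check.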
