11,325 questions
Advice
0
votes
1
replies
83
views
Regression analysis
How should I handle a mass-point in the dependent variable when running OLS regression in R?
I’m working with a household expenditure dataset (Living Costs 2019) where the dependent variable is the ...
1
vote
1
answer
80
views
Dropdown Menu - Adding an extra space in options
I'm currently in a Coding class and I'm making an app for a project. We have to insert Datasets into our app. The problem I am experiencing is that when I use my dropdown menu for options in my ...
1
vote
1
answer
146
views
Is there a way to write a previously read and updated DataSet back to Excel? [duplicate]
The idea is simple and no matter how much I have searched I couldn't find any similar example.
Using the OleDbDataAdapter class, I am able to achieve the first step:
Read a specific range in a ...
1
vote
0
answers
108
views
How to lazy load jsonl file
I am trying to build a PyTorch Dataset based on some .jsonl files. The size of each .jsonl file is about 2GB, and I have 50 such files. Therefore, it would not be very practical to load all these ...
-1
votes
0
answers
24
views
javascript, how to select an element with random "dataset" attribute [duplicate]
There is a set of elements that look like <a data-v-924a11e2 ... </a>
The '924a11e2' part of 'data-v-924a11e2' is random each time the page updates, and this dataset attribute 'data-v-924a11e2' has no value. ...
0
votes
0
answers
103
views
Record Selection in Main Dataset from a Sub Dataset in Delphi
The application I'm writing has a main form where all the database fields and DBGrid that contains the entire SQLite dataset (ExpItemQuery) reside. I have designed a database search functionality into ...
0
votes
0
answers
21
views
How to name data sets by algorithm function - covariates vs target vs context etc
I have coded up a program that has a scoring target value plus other necessary values associated with that target value, plus the same features are used as dependents in my prediction engine. Up to ...
0
votes
1
answer
139
views
How to load dataset from huggingface to google colab?
I am trying to load a training dataset in my Google Colab notebook but keep getting an error.
Here is the code snippet which returns the error:
from datasets import load_dataset
ds = load_dataset(&...
1
vote
1
answer
48
views
How to access survival jasa dataset
The survival package contains a heart dataset which comes in two formats, "jasa" and "jasa1".
How can I access the "jasa" format?
I tried
data("jasa", package=&...
3
votes
1
answer
57
views
Merge many .fasta files
I’m currently working with a large dataset and need help merging multiple .fasta files. Although I’m not an expert, I’ve attempted to automate this process using a Python script. However, the merging ...
0
votes
1
answer
54
views
Efficient multi-host TPU dataset processing
I want to train an LLM on TPUv4-32 using JAX/Flax. The dataset is stored in a mounted Google Storage bucket. The dataset (Red-Pajama-v2) consists of 5000 shards, which are stored in .json.gz files: ~/...
3
votes
2
answers
106
views
Few questions on a medical dataset
I am currently working on a medical dataset containing whole slide images (~300 images).
These images are pretty big (.tif, average of 7k by 7k pixels). I am currently dividing each of these images ...
0
votes
0
answers
70
views
Hugging Face applying Transformation on nested to datasets without loading into memory
I am trying to apply the transformation below to prepare my datasets for fine-tuning using unsloth huggingface. It requires the dataset to be in the following format.
def convert_to_conversation(sample):
...
0
votes
1
answer
38
views
Why does TensorDataset divide the data into minibatches?
Why does TensorDataset divide the data into minibatches? For example, when putting a 2D array in it, instead of yielding 2D tensors as batches, it sets the required batches to be minibatches, and its actual &...
1
vote
1
answer
54
views
Is there a limit to the number of columns that can be accessed in a wix dataset using velo?
I am trying to help my church have a lending library. Currently, we have a Google sheet inserted in the Wix site. I am trying to make a better version.
The idea is to import all the books as a .csv ...
1
vote
1
answer
100
views
Map reduce script usage limit exceeds in the reduce stage in Netsuite
I hope you are doing well!
I have developed a map/reduce script that sends dataset results as a CSV file. If there is a lot of data, the script exceeds the usage limit in the reduce stage. I ...
0
votes
1
answer
40
views
Discrepancy in number of elements outputted by torch Dataset and DataLoader
I have a custom Subset:
class TestSubset2(Subset):
    def __init__(self, dataset, indices, days=False):
        super().__init__(dataset, indices)
        self.days = days
    def __getitem__(self, ...
0
votes
2
answers
142
views
ADF Lookup Activity doesn't work with DataSet parameters
I am trying to use a Lookup activity with a parameterized dataset. I thought it would be simple, but it doesn't seem to support dataset parameters for an Azure database source.
Can ...
0
votes
1
answer
31
views
SSRS Report Manager not returning dataset result
How can I return the expected dataset (stored procedure) results from SSRS report Manager? SSRS report Manager is returning a different dataset (stored procedure) result than what's expected from ...
0
votes
1
answer
50
views
Azure DF error: Unable to parse expression
I am trying to use a dataset parameter set in the pipeline to make my blob path dynamic for each data flow I've created. However, just testing this first data flow, I keep getting an error saying '...
0
votes
1
answer
69
views
NetSuite Saved Search vs. Dataset: Date Discrepancy
I'm encountering a discrepancy between the Created date displayed in a NetSuite Saved Search and a NetSuite Dataset. The saved search shows one date, while the dataset shows the same record with the ...
1
vote
1
answer
62
views
How to retrieve SPICE storage space occupied per dataset
While trying to retrieve the SPICE storage occupied per dataset in AWS QuickSight,
I came across the following: AWS does not provide a direct per-dataset breakdown in the console, but you can use ...
0
votes
2
answers
50
views
Spark Java Structured Streaming filters
I have the following data below:
+-------+----------+------------+---------+---------------------+-----------+
|id |resource id|resource name|event-desc|event-date |ipaddress1 |
+-------+---...
1
vote
1
answer
75
views
Between in dplyr with lm function
I am testing for outliers using the iris dataset
mod <- lm(Sepal.Width ~ Sepal.Length*Species, data = iris)
I use rstudent() to calculate the studentized residuals, and add an indicator whether ...
1
vote
1
answer
70
views
boxplot across multiple columns in R
I'm new to R and, having next to no experience with it, I'm struggling with what may be a pretty easy problem. I have a dataset of acceptability judgments provided by a group of 30 participants on ...
0
votes
1
answer
89
views
Tensorflow Keras model with interleave is having issues fitting
I'm working on a CNN project where I make use of TensorFlow and Keras. In addition, I use interleave as the dataset is rather large (too large to load into the RAM with our resources). But there's an ...
0
votes
1
answer
99
views
pytorch code messes up my RAM when using torch.zeros()
I have a function to measure the RAM allocated by Python in megabytes:
def getram(): print(psutil.Process(os.getpid()).memory_info().rss / 1024**2)
And also I have:
device = "cuda"
My ...
0
votes
1
answer
46
views
TensorFlow dataset loops endlessly with model.predict
I use this code to set up my dataset for training and prediction:
train_dataset = train_dataset.batch(train_batch_sz)
train_dataset = train_dataset.repeat().prefetch(5)
test_dataset_sim = test_dataset
...
0
votes
1
answer
116
views
loading a dataset with datasets.load_dataset is hanging
I'm trying to load some data using datasets.load_dataset. It runs correctly on a head node. The issue is happening on a Slurm node. I'm using a conda env with datasets installed.
When I run on head ...
0
votes
1
answer
89
views
How to import .xlsx in Jupyter Notebook from iPhone Carnets App?
I would like to work with the dataset in Jupyter Notebook from the iPhone Carnets App, but I can’t find the correct path.
Can someone help me?
I downloaded the dataset from Kaggle. They are free files ...
0
votes
0
answers
14
views
How to update a list of items in a dataset in Oracle Analytics
We are just beginning to use Oracle Analytics. We manage large amounts of data and much of what my team does is run queries for lists of items. This can be 10, 100, or 1000s of different item ...
0
votes
1
answer
73
views
How to email dataset results in a CSV format using scheduled script?
I am using a NetSuite dataset (N/dataset API in SuiteScript 2.1) to retrieve existing dataset results. The dataset returns results in JSON format, including columns like PO Status, Billing Status, Sub ...
0
votes
1
answer
263
views
The FAISS indexing and the dataset indexing don't match
I'm trying to compute the recall after performing a HNSW search in FAISS. By recall, I mean the following metric:
Recall = TP / (TP + FN)
Where I consider an image as a True Positive (TP) if it ...
1
vote
6
answers
113
views
Automating averages and stdevs for large dataset in set intervals of rows
I am currently trying to make a dataset which refers to a series of data in another tab, of which it takes 20 points (vertically) and averages them and gives a stdev then two open rows and then again ...
1
vote
0
answers
100
views
How to interpolate a multidimensional xarray DataArray?
I am using the xarray library and I have some doubts/questions.
I have this dataset:
ds
<xarray.Dataset> Size: 2GB
Dimensions: (Latitude: 364, Longitude: 246, Lon_u: 247, Lat_v: 364,
...
0
votes
0
answers
15
views
How to manage a history of data sets feeding into a PBI dashboard?
I have a simple PowerBI dashboard that links to several Excel spreadsheets. I take a cut of data monthly, but want to save and allow the user to select previous iterations using a drop down slicer. ...
0
votes
1
answer
65
views
Trouble with Filtering a .NET DataTable
I have the following code:
myDataSet.myTable.DefaultView.RowFilter = "Code <> 'X'"
After the line executes, I can see the updated result where no rows contain an X for the Code column,...
0
votes
1
answer
63
views
About pytorch model input sample size
I'm a newbie in PyTorch and AI.
I found a repo (https://github.com/kacperbednarz1997/AMC_nets) where the author compares various models for automatic modulation classification (AMC).
I get all datasets used ...
0
votes
1
answer
47
views
override pytorch Dataset efficiently
I want to subclass the torch.utils.data.Dataset class to load my custom image dataset, let's say for a classification task. Here is the example from the official PyTorch website in this link:
import os
...
0
votes
0
answers
28
views
Improving performance of DataSet.ReadXml by means of DataTable.BeginLoadData: DataSet.EnforceConstraints is left false
In Loading a DataSet from XML, I've found the following statement:
If you call ReadXml to load a very large file, you may encounter slow performance. To ensure best performance for ReadXml, on a ...
0
votes
0
answers
59
views
Unexpected Feature ID in Yahoo! Webscope ydata-frontpage-todaymodule-clicks-v1_0 Dataset
I'm working with the Yahoo! Webscope dataset ydata-frontpage-todaymodule-clicks-v1_0 (specifically, the click logs for the first ten days in May 2009). The dataset description states that each user ...
0
votes
1
answer
76
views
Issue with Merging Time-Series datasets for consistent Time Intervals
I am currently working on a project where I have to first merge two datasets:
The first dataset contains weather data in 30 minute intervals. The second dataset contains minute-level data with PV ...
0
votes
0
answers
14
views
How to quantify differences between data sets in SQL Server?
I'm comparing two datasets in SQL Server (tables of the same schema) using row hashing (for example, using CheckSum() or HashBytes()).
At this point, I can tell which records are identical and which ...
0
votes
1
answer
61
views
Dataset uploaded and verified, but cannot use read.csv
I have uploaded a csv I created. This dataset is visible in Kaggle under "My datasets" - I can open and inspect it.
I have tried various ways of linking a dataset to my notebook.
clicking ...
2
votes
0
answers
68
views
How to supply sample weights to Tensorflow dataset in R
I am trying to set up a dataset in R to run a neural network using TensorFlow, but I can't seem to figure out the right code to allow sample weights to be specified.
The input array is image_data and ...
0
votes
0
answers
13
views
Consistent Color Coding for Organ-Specific Biodistribution Data (column features as function of timepoint)
Hi,
I'm trying to implement a consistent color coding for organ-specific biodistribution data across multiple sheets in Tableau. My dataset is structured such that each organ is represented by its own ...
0
votes
2
answers
263
views
How to add a generated QR image to a Crystal Report dynamically without manually adding a new blob field?
Can I add a QR image to a Crystal Report dynamically without adding a new column to the dataset and dragging it to the report? I retrieve the URL from the database, and based on that, I display the ...
0
votes
0
answers
109
views
RuntimeError to obtain ImageNet dataset from torchvision.datasets: ILSVRC2012_devkit_t12.tar.gz is not present in the root directory or is corrupted
Given:
import os
from torchvision.datasets import ImageNet
from torch.utils.data import DataLoader
train_dataset = ImageNet(
    root=os.path.expanduser("~/.cache"),
    split='train',
    ...
0
votes
1
answer
53
views
PyTorch DeiT model keeps predicting one class no matter what
We are trying to fine-tune a custom model on an imported DeiT distilled patch16 384 pretrained model.
Output:
Cost at epoch 0 is 4.611058227040551
Cost at epoch 1 is 0.9889081553979353
test set ...
1
vote
1
answer
87
views
How to join two datasets based on the id, date and approximate time?
I need to merge datasets A and B based on the date, id and nearest time.
The times in the two datasets do not match exactly. The time in B is always 0 to 10 minutes greater than the time in A.
I tried left_join ...