2,648 questions
1
vote
0
answers
115
views
Unable to load an hdf5 model file in TensorFlow / Keras
I was given an HDF5 model file that was built with TensorFlow / Keras. The training data is no longer available.
Note: all Python code snippets shown below are run against Python 3.9.23 inside a ...
0
votes
1
answer
85
views
Error when trying to access data in h5 file using h5py
I am trying to read an h5 file using Python:
with h5py.File(filename, 'r') as file:
    print("file.keys() = ", file.keys())
    a_group_key = list(file.keys())[0]
    data = list(file[...
0
votes
1
answer
49
views
How to read LZO compressed pandas HDF5 files?
Here is the situation: I have data saved into pandas HDF5 files. Some data is compressed using lzo and some using blosc:zstd. Under RHEL-7, I was able to read both types of files. Then, I was ...
0
votes
1
answer
253
views
OSError: Can't synchronously read data (can't open directory) when trying to index retrieve values from a hdf5 file
I generated an HDF5 file with the code below:
# Create forecast training data file
import h5py
import hdf5plugin
#mmap_array = np.memmap('video_prediction_224.dat', dtype='uint8', mode='w+', shape=(...
0
votes
0
answers
73
views
HDF5 Write Performance Degrades Over Time When Converting from LMDB (~3.7M entries)
I’m experiencing significant slow-downs when converting data from LMDB to HDF5 format. While the conversion starts off quickly, performance degrades substantially partway through the process.
...
1
vote
0
answers
27
views
Using HDFql to filter out groups that have specific attribute values
I am trying to use the HDFql C++ library to select groups that have attributes with specific values. There are an arbitrary number of groups in the root group, and each group has the same attribute ...
0
votes
2
answers
67
views
Write pandas df to reusable hdf5 or use other data format?
I have data that can have different sized arrays per row like:
data = {
    'a': [np.array([1.,2.]), np.array([6.,7.,.6]), np.array([np.nan])],
    'b': np.array([99., 66., 88.])
}
df = pd.DataFrame(...
2
votes
1
answer
77
views
Creating HDF5 attributes with 2D Dataspace in Rust
I work with HDF5 files which contain datasets aggregated from smaller HDF5 files. Using the HDF Group tool nagg I can deaggregate the larger files into their original component files. However, nagg ...
0
votes
0
answers
59
views
Issues with hyperslab in fortran HDF5
I'm trying to write two 3×3 matrices (array1 and array2) into a 3D HDF5 dataset with shape (3,3,2), where each matrix occupies a different slice along the third dimension. However, the second slice ...
2
votes
0
answers
164
views
HDF5 and Standard-Compliant Fortran
I am writing Fortran code to save data to the HDF5 file format. In all the examples provided with HDF5 that I looked at, I couldn't help but notice the use of the C_LOC(x) intrinsic function with ...
0
votes
0
answers
30
views
How to setup a conda environment on a Mac laptop to compile and run a hybrid C++/Python code with MPI support
I am trying to compile a CFD code on a Mac. The code is parallelized using MPI, and is written with a combination of C++ and Fortran. In addition, it runs a Python interpreter to handle IO tasks via ...
0
votes
1
answer
65
views
Dynamically update h5py VDS from data written to in parallel -- multiple writer multiple reader case
I am trying to simulate the creation of multiple h5 files in parallel while trying to read out the results live via a dynamically updated H5 VDS in swmr mode. All h5 files are processing different ...
-4
votes
1
answer
133
views
How to save simulation parameters in HDF5 file with h5py?
I'm using h5py to store the outputs of a large number of simulations. These simulations are parametrized, so I also need to store which parameters were used for which simulation output.
At ...
0
votes
1
answer
133
views
Writing complex Pandas DataFrame to HDF5 using h5py
I have a Pandas DataFrame with mixed scalar and array-like data of different raw types (int, float, str). The DataFrame's types look like this:
'col1', dtype('float64')
'col2', dtype('O') <-- array,...
4
votes
0
answers
73
views
How can I load an HDF5 file, including columns of strings, into R?
I have a bunch of pandas dataframes I'd like to save in one file, and each dataset gets appended to multiple times. I decided that an HDF5 file would be a good way to store them, and this is code ...
2
votes
1
answer
118
views
Cannot write a variable length string attribute to HDF5 file
Given the following function, I am trying to write a variable length string to an open HDF5 file. However, the H5Awrite(...) line causes an access violation inside the HDF5 stack, which I can't see. ...
0
votes
0
answers
76
views
Writing uint8_t to HDF5 (in binary form)
Is there a way to write the binary representation of a uint8_t to HDF5? For example, if the value is 4, the resulting value in HDF5 should be 00000100. This binary value is located in a compound ...
0
votes
2
answers
163
views
Replace table in HDF5 file with a modified table
I have an existing HDF5 file with multiple tables. I want to modify this HDF5 file: in one of the tables I want to drop some rows entirely, and modify values in the remaining rows.
I tried the ...
1
vote
1
answer
101
views
mpicc: command not found while installing hdf5 library
I am trying to install hdf5-1.14.5 with zlib-1.3.1, but I can't configure it.
I have CC=mpicc and FC=mpif90 with the gcc wrapper. Both compilers can compile and run simple MPI programs.
mpicc -version
gcc (...
2
votes
1
answer
177
views
How to efficiently read HDF5 file using NetCDF-java
What is an efficient way to use the NetCDF-java API to read an HDF5 giving a raster variable consisting of a structure in the following form?
Structure {
    float depth;
    float uncertainty;
} values(...
0
votes
1
answer
93
views
Accessing HDF5's shape is much slower (45 times) when the hdf5 file is larger
I am seeing significantly slower reads when accessing data recursively in a very large HDF5 file with lots of datasets inside.
There are two HDF5 files, "small.hdf5" and "large.hdf5".
...
2
votes
0
answers
185
views
How to convert NASA GPM data from HDF5 to GeoTIFF using python or gdal?
I'm trying to convert an HDF5 file to GeoTIFF, but I am having trouble. It's specifically an HDF5 file from this source; I have had success using gdal_translate on other HDF5 files. An example file ...
0
votes
0
answers
102
views
Does h5py+MPI ("parallel HDF5") offer simultaneous writes, or just parallelized sequential writes?
I implemented a data ingestion tool in Python using h5py+MPI to ingest data from a large number of plaintext files into an equal number of HDF5 datasets. However, I'm finding that using multiple ...
0
votes
1
answer
119
views
Unable to open .h5 files during runtime of C program
I am writing a simulation in C, and want to output the data during runtime to .h5 files using the HDF5 library. The following code is used to output the files:
void output_data(int t, int n_output) {
...
0
votes
0
answers
109
views
how can I recover (or understand the h5debug output of) my hdf5 file?
I have an HDF5 file that is so large I have to use my home fileserver to write the data (4.04TB, according to macOS's Finder). It is a collection of logits that takes several hours to calculate, and ...
0
votes
2
answers
103
views
Concatenating netcdf and hdf5 files
I have two data files in two formats, one is netcdf, the other is hdf5. I wish to combine them and export the combined data as hdf5. My current solution includes converting each to a python dictionary ...
0
votes
0
answers
232
views
Extract flow direction/flow arrow/velocity vector from HEC-RAS
When it comes to flood simulation, knowing the direction of flow is very important. However, HEC-RAS's RasMapper only give the absolute value of velocity. It can only show the arrow on its board (as ...
3
votes
0
answers
360
views
Problems loading a model saved in .h5 after upgrading Tensorflow from 2.15.0 to 2.17.0
I created an LSTM model with the tensorflow library version 2.15.0.
I now have to use the model created in a new project, but in this project I have to use all the updated libraries and also the last ...
0
votes
1
answer
287
views
how to set a proper chunk size in hdf5
According to this answer, a proper chunk size is important for optimizing I/O performance.
I have 3000 JPG images whose sizes vary from 180 kB to 220 kB. I am going to save them as bytes.
I know 2 methods ...
0
votes
1
answer
176
views
data loss when using xarray to hdf5 format
I am trying to create an hdf5 file for storing some generated data. The saving part is fine (I think) but when it comes to retrieving, some of the data input appears to be lost. I will provide the ...
0
votes
1
answer
169
views
CMake successfully finds HL component for HDF5, but doesn't list a library for it
I am getting the following error when running make in my build directory
/usr/bin/ld: cannot find -lhdf5_hl
collect2: error: ld returned 1 exit status
error: command 'x86_64-linux-gnu-g++' failed with ...
0
votes
1
answer
75
views
How to create numpy records array with numerical entries without dtype name
I am trying to create a numpy records array to match data that I am reading from an HDF5 file. The HDF5 dataset (dataset) has a dtype of np.dtype(('u1', (3,))). The dtype of dataset[0] is ...
0
votes
0
answers
67
views
how to append large data set to a netcdf file - python
I have a 10,000 x 10,000 data set that I would like to store in a dataset format. I was looking into HDF5, but I found myself looking into Xarray for data storage. I do not have a beefy computer and, ...
0
votes
1
answer
56
views
HDF5 simple dataset only writing half of the elements
This is a small piece of a larger data conversion utility. It converts from an old format to memory, then writes memory to an HDF5 based output format. Below is a function, WriteFloatDataset, ...
0
votes
1
answer
1k
views
Can't open or create hdf5 files with h5py
I'm trying to work with HDF5 files for ML, and I keep getting this error whenever I create or open an HDF5 file. I'm using a miniconda environment in VS Code on Windows.
import h5py
hdf5_filepath = '...
0
votes
1
answer
60
views
does HDF5 resource need to be freed/closed before I use it again
I am writing an .h5 file in C++ and using some API calls as below. The HDF5 docs state that the identifier returned from these functions should be released to prevent leaks.
In the code I have, there ...
0
votes
0
answers
136
views
Homebrew libraries & CMake on MacOS Sonoma: ld does not find symbols despite correct specified architecture
Problem
I am trying to port a C++ project from Linux to macOS Sonoma on the arm64 architecture.
Configuring is done via CMake with the following toolchain:
GCC 14 from Homebrew
HDF5 (installed via ...
0
votes
1
answer
280
views
dataloader very slow with HDF5 data
I've got a very large dataset in HDF5 format, which I cannot load in memory all at once. I'm using a custom dataset from Torch.
Here's the code:
import time
from utils import get_vocab_and_skipgrams
...
0
votes
1
answer
416
views
HDF Error when reading a NetCDF file as part of tests
My code saves and analyzes data in NetCDF4 format. I have no problem whatsoever with the analysis.
However, when I run unit tests in tox I get a ton of HDF and OS errors, e.g.:
https://github.com/...
1
vote
1
answer
83
views
Continue counting after 8-bit counter is full in dict
Let me start by saying that this is my first Python program, so I probably have some misconceptions about how parts of the code work.
Data is extracted from an 8-bit memory and saved in an HDF5 file, and I ...
0
votes
0
answers
808
views
python - can't open directory /usr/local/hdf5/lib/plugin
I have a MacBook Pro M3 and I need to open an HDF5 dataset in Python. Here is my code:
import h5py
import hdf5plugin
file_path = './blink.hdf5'
with h5py.File(file_path, 'r+') as ...
0
votes
1
answer
122
views
HDF5/h5py performance reading datasets
I have millions of images saved in a HDF5 file that I want to read in random order as fast as possible. I compared read time in sequential (alphabetical) vs random order. I found random order time is ...
1
vote
1
answer
215
views
Writing a pandas dataframe to hdf creates an index even though I specified index=False. Why?
I have a dataframe in pandas I need to write to hdf using the df.to_hdf call. I have ensured that I don't have a column named 'Index' and I have put index=False in the call, but the index column ...
1
vote
1
answer
122
views
Loading MATLAB data in Python with h5py causes permuted dimensions
I'm using the following code to load a MATLAB file into Python
import h5py
import numpy as np
filepath = 'file.mat'
arrays = {}
f = h5py.File(filepath)
for k, v in f.items():
    arrays[k] = np.array(...
1
vote
0
answers
90
views
How to add JHDF 5 aarch64 support?
I'm currently exploring a benchmark from TPC - TPC-AI
This benchmark uses Parallel Data Generation Framework (PDGF) in java for generating the input datasets. It looks like one of the dependencies is ...
0
votes
1
answer
146
views
How to merge multiple H5 to one H5 file with Python and h5py? [closed]
I am new to Python coding. I want to merge data from 2 H5 files to a main H5 file. My goal is to add all objects in the SRRXX/SRR630/* groups in each source file (file names in list h5_files) to the ...
0
votes
0
answers
73
views
Is there an algorithm for thoroughly shuffling big datasets in-place with limited memory?
I have a large HDF5 file containing a single table. The table itself contains collections of ordered data that themselves are quite large but are stored contiguously in blocks. I need to shuffle the ...
0
votes
1
answer
243
views
Creating HDF5 virtual dataset for dynamic data using h5py
I have an HDF5 file which contains three 1D arrays in different datasets. This file is created using h5py in Python and the 1D arrays are continually being appended to (i.e. growing). For simplicity, let’...
0
votes
0
answers
37
views
Trying to edit an .hdf5 file but when recording video it doesn't seem to want to copy
This is a project done in Google Colab. I'm given a .hdf5 file with robot trajectory demonstrations. The assignment is to film new trajectories using architectures we've learned about in class. I've ...
-1
votes
1
answer
292
views
Is it possible to convert DM4 files to HDF5 files using Python?
I am currently looking for a way to extract data such as graphs and images from DM4 files and save them in HDF5 files with a hierarchical structure. In my search, I tried using Hyperspy, but it didn't ...