Combining HDF5 files

This is actually one of the use cases of HDF5. If you just want to be able to access all the datasets from a single file, and don't care how they're actually stored on disk, you can use external links. From the HDF5 website: "External links allow a group to include objects in another HDF5 file …"
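A minimal sketch of creating such an external link with h5py (the file and dataset names here are illustrative, not from the original answer):

```python
import h5py
import numpy as np

# A "child" file that holds the actual data.
with h5py.File("child.h5", "w") as f:
    f.create_dataset("data", data=np.arange(10))

# A "master" file that merely links to the child's dataset.
with h5py.File("master.h5", "w") as f:
    f["linked_data"] = h5py.ExternalLink("child.h5", "/data")

# Reading through the master file resolves the link transparently.
with h5py.File("master.h5", "r") as f:
    print(f["linked_data"][:])  # [0 1 2 3 4 5 6 7 8 9]
```

The master file stays small: it stores only the target filename and object path, and HDF5 opens the child file on demand when the link is accessed.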

Incremental writes to HDF5 with h5py

Per the FAQ, you can expand a dataset created with `maxshape` using `dset.resize`. For example:

```python
import os
import h5py
import numpy as np

path = "/tmp/out.h5"
if os.path.exists(path):
    os.remove(path)

with h5py.File(path, "a") as f:
    # maxshape=(None,) makes the first axis unlimited; resizable
    # datasets must be chunked.
    dset = f.create_dataset("voltage284", (10**5,), maxshape=(None,),
                            dtype="i8", chunks=(10**4,))
    dset[:] = np.random.random(dset.shape)
    print(dset.shape)  # (100000,)
    for i in range(3):
        dset.resize(dset.shape[0] + 10**4, axis=0)
        dset[-10**4:] = np.random.random(10**4)
        print(dset.shape)
        # (110000,)
        # (120000,)
        # …
```

How to store dictionary in HDF5 dataset

I found two ways to do this:

I) Transform the datetime object to a string and use it as the dataset name:

```python
h = h5py.File("myfile.hdf5")
for k, v in d.items():
    h.create_dataset(k.strftime("%Y-%m-%dT%H:%M:%SZ"),
                     data=np.array(v, dtype=np.int8))
```

The data can then be accessed by querying the key strings (dataset names). For example:

```python
for ds in h.keys():
    if "2012-04" in ds:
        print(h[ds][()])  # h[ds].value in h5py < 3.0
```

II) Transform the datetime object …

Read HDF5 file into numpy array

The easiest thing is to read the whole dataset with an empty-tuple index (this replaces the old `.value` attribute, which was deprecated and removed in h5py 3.0):

```python
>>> hf = h5py.File("/path/to/file", "r")
>>> data = hf.get("dataset_name")[()]  # `data` is now an ndarray
```

You can also slice the dataset, which produces an actual ndarray with the requested data:

```python
>>> hf["dataset_name"][:10]  # produces an ndarray as well
```

But keep in mind that …

How to list all datasets in h5py file?

You have to use the `keys` method. This gives you the names of the top-level datasets and groups as unicode strings (in Python 3, `keys()` returns a view, so wrap it in `list()` if you need a list). For example:

```python
dataset_names = list(hf.keys())
```

Another, GUI-based method is to use HDFView: https://support.hdfgroup.org/products/java/release/download.html
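Note that `keys()` only lists the top level of the file, not datasets nested inside groups. A short sketch using h5py's `visititems` to list all datasets recursively (the file and dataset names here are illustrative):

```python
import h5py
import numpy as np

# Build a small example file with a nested dataset.
with h5py.File("example.h5", "w") as f:
    f.create_dataset("top", data=np.zeros(3))
    f.create_group("group1").create_dataset("nested", data=np.ones(3))

# visititems walks the whole hierarchy, calling the function with the
# full path and object for every group and dataset it encounters.
with h5py.File("example.h5", "r") as f:
    names = []
    f.visititems(lambda name, obj: names.append(name)
                 if isinstance(obj, h5py.Dataset) else None)
    print(names)  # ['group1/nested', 'top']
```

The callback must return `None` to keep the traversal going; returning anything else stops it early, which can be used to search for a single object.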

Experience with using h5py to do analytical work on big data in Python?

We use Python in conjunction with h5py, numpy/scipy and boost::python to do data analysis. Our typical datasets have sizes of up to a few hundred GB.

HDF5 advantages:

- data can be inspected conveniently using the h5view application, h5py/ipython and the h5* command-line tools
- APIs are available for different platforms and languages
- structure data using groups …

How to overwrite array inside h5 file using h5py

You want to assign values to the existing dataset, not create a new one:

```python
f1 = h5py.File(file_name, "r+")  # open the file in read/write mode
data = f1["meas/frame1/data"]    # get the existing dataset
data[...] = X1                   # assign new values in place
f1.close()                       # close the file
```

To confirm the changes were properly made and saved:

```python
f1 = h5py.File(file_name, "r")
np.allclose(f1["meas/frame1/data"][()], X1)  # True
```