Do not make the same mistakes we did! Here is a list of common pitfalls we have encountered when working with EMD files in Python.
- Note that the first dimension is saved in the dataset called `dim1`; there is no `dim0`.
- The dimensions of a dataset in `h5py` are in ascending order 1, 2, ..., n or x, y, ..., n. When working with images as NumPy arrays, however, the usual order is y, x, corresponding to rows and columns on the screen. To exchange such arrays with an EMD file, one has to swap the x and y directions, for example with `np.transpose()`.
- To save strings correctly with `h5py`, the use of fixed-width byte strings is encouraged. Saving Python string objects can lead to encoding errors or worse. Just pass your string through `np.string_('example')` (an alias of `np.bytes_` that was removed in NumPy 2.0, so use `np.bytes_` there). To convert back, decode it as UTF-8, as in `b'example'.decode('utf-8')`.
- Remember that you can create significant memory leaks in Python if you are not careful about assigning variables. There is a difference between `h5py_dset = data` and `h5py_dset = data[:]`, which becomes important when you repeatedly read in `data` in a loop: plain assignment only binds another name to the same object, whereas slicing an `h5py` dataset with `[:]` reads its contents into a new in-memory NumPy array.
- There have been reports of performance issues in `h5py` related to `numpy` indexing. Feel free to do a quick internet search.
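
The first three pitfalls and the slicing behaviour can be illustrated with a minimal sketch. It assumes `h5py` and NumPy are installed, and it uses a simplified, hypothetical layout (a group called `demo` with a `data` dataset and a `label` attribute) rather than a complete EMD group. The file name is arbitrary, and the file lives only in memory.

```python
import h5py
import numpy as np

# Hypothetical image: 3 rows (y) by 4 columns (x), in NumPy's usual (y, x) order.
image_yx = np.arange(12, dtype=np.float32).reshape(3, 4)

# In-memory HDF5 file (core driver without a backing store), so nothing hits the disk.
with h5py.File("pitfalls_demo.h5", "w", driver="core", backing_store=False) as f:
    grp = f.create_group("demo")  # hypothetical group name, not a full EMD group

    # Swap to (x, y) order before storing, as described in the list above.
    dset = grp.create_dataset("data", data=np.transpose(image_yx))

    # Dimension datasets start at dim1; there is no dim0.
    grp.create_dataset("dim1", data=np.arange(image_yx.shape[1]))  # x axis
    grp.create_dataset("dim2", data=np.arange(image_yx.shape[0]))  # y axis

    # Store text as a fixed-width byte string instead of a Python str.
    # np.bytes_ is the current name of what older NumPy also exposed as np.string_.
    grp.attrs["label"] = np.bytes_("example".encode("utf-8"))

    # `dset` is only a handle into the file; `dset[:]` reads everything
    # into a new NumPy array that stays valid after the file is closed.
    stored_xy = dset[:]

    # Decode the byte string back to str (guarded in case h5py already returns str).
    raw_label = grp.attrs["label"]
    label = raw_label.decode("utf-8") if isinstance(raw_label, bytes) else raw_label

# Swap back to (y, x) order for display or further NumPy processing.
restored_yx = np.transpose(stored_xy)
```

Reading with `dset[:]` inside a loop allocates a fresh array on every pass, so it is worth keeping only the arrays that are still needed.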