“HDF” stands for Hierarchical Data Format.
Reading and viewing HDF5
HDF5 files can be opened with a variety of tools. The HDF Group provides a free viewer called HDFView. In Python, HDF5 files can be traversed and read using the h5py package.
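As a minimal sketch of traversing and reading a file with h5py, the snippet below first writes a small throwaway file so that it is self-contained, then walks it. The file name, the group name "measurements", the dataset name "temperature" and the helper list_hdf5_contents are made up for illustration; they are not part of any real dataset or of h5py itself.

```python
import os
import tempfile

import h5py
import numpy as np


def list_hdf5_contents(path):
    """Return (name, kind) pairs for every group and dataset below the root."""
    found = []

    def record(name, obj):
        # visititems calls this once per object; returning None keeps walking.
        kind = "dataset" if isinstance(obj, h5py.Dataset) else "group"
        found.append((name, kind))

    with h5py.File(path, "r") as f:
        f.visititems(record)
    return found


# Build a tiny example file (hypothetical names, illustration only).
tmp_dir = tempfile.mkdtemp()
example_path = os.path.join(tmp_dir, "example.h5")
with h5py.File(example_path, "w") as f:
    # Intermediate groups in the path are created automatically.
    f.create_dataset("measurements/temperature", data=np.arange(5.0))

# Traverse the file and print what it contains.
contents = list_hdf5_contents(example_path)
for name, kind in contents:
    print(f"{kind}: {name}")

# Read one dataset into a NumPy array by slicing it.
with h5py.File(example_path, "r") as f:
    temperature = f["measurements/temperature"][:]
```

Slicing with [:] copies the dataset into memory, which is fine for small data; for large datasets it is usually better to slice only the part that is needed.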
History
HDF was originally introduced in 1988 by the National Center for Supercomputing Applications (NCSA) at the University of Illinois, USA. The task force responsible for developing the format was spun off into a non-profit corporation, the HDF Group, in 2006. The HDF Group currently maintains and develops the format specification from the University of Illinois Research Park in Champaign, Illinois. For more information about the HDF Group and the history of HDF, see this page.
Several APIs are available for reading and writing HDF files. These include interfaces for lower-level programming languages such as Fortran, C and C++, as well as for high-level languages like Python, MATLAB and IDL. Binaries, source code, and documentation for the current version of HDF are all available on the HDF Group website.
HDF5 file components and structure
Each HDF5 file contains three different component types: groups, datasets and attributes. These are organized into a tree hierarchy, like nested folders and files. The highest-level group is called the “root group.” Groups can contain other groups as well as two types of members: attributes and datasets. The primary difference between attributes and datasets is the amount of information they are meant to store. Generally, attributes should only be used to store small pieces of information, such as a single number or string. Datasets may contain any amount of information and are optimized for larger amounts of data. Note that attributes may also belong directly to datasets.
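The hierarchy described above can be sketched with h5py as follows. This is only an illustration under assumptions: the names "site_a", "sensor_1" and "readings" and the attribute values are hypothetical, and the file is kept in memory (h5py's "core" driver with backing_store=False) so nothing is written to disk.

```python
import h5py
import numpy as np

# In-memory file: the name is required by h5py but no file is created on disk.
with h5py.File("structure_demo.h5", "w", driver="core", backing_store=False) as f:
    # f itself acts as the root group "/".
    site = f.create_group("site_a")           # group inside the root group
    sensor = site.create_group("sensor_1")    # nested group
    readings = sensor.create_dataset("readings", data=np.zeros((100, 3)))

    # Attributes hold small pieces of metadata and can be attached
    # to groups and to datasets alike.
    site.attrs["description"] = "hypothetical field site"
    readings.attrs["units"] = "meters"

    # Inspect the structure while the file is still open.
    root_members = list(f.keys())
    readings_path = readings.name
    readings_shape = readings.shape
    units = readings.attrs["units"]
    description = f["site_a"].attrs["description"]

print(root_members, readings_path, readings_shape, units, description)
```

Members are addressed with slash-separated paths, much like files in nested folders, so the dataset above could also be reached as f["site_a/sensor_1/readings"] while the file is open.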
Throughout this site we’ll diagram HDF5 files as shown below, with a unique color for each of the HDF5 file components, and with lines connecting parent groups and datasets to their contained child groups, datasets and attributes.
