nbodykit.io.hdf

Functions

find_datasets(info, attrs, name, obj) Recursively add a ColumnInfo named tuple to the info dict if obj is a Dataset

Classes

ColumnInfo(size, dtype, dset)

HDFFile(path[, root, exclude]) A file object to handle the reading of columns of data from a h5py HDF5 file.

class nbodykit.io.hdf.ColumnInfo(size, dtype, dset)

Attributes

dset Alias for field number 2
dtype Alias for field number 1
size Alias for field number 0

Methods

count(…)
index(value[, start, …]) Raises ValueError if the value is not present.
__getnewargs__()

Return self as a plain tuple. Used by copy and pickle.

static __new__(_cls, size, dtype, dset)

Create new instance of ColumnInfo(size, dtype, dset)

__repr__()

Return a nicely formatted representation string

dset

Alias for field number 2

dtype

Alias for field number 1

size

Alias for field number 0
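
Because ColumnInfo is a named tuple, each field can be read either by name or by position. The following is a minimal sketch, assuming a locally defined stand-in built with collections.namedtuple instead of the class imported from nbodykit; the dtype string and the None dataset handle are placeholder values chosen for illustration.

```python
from collections import namedtuple

# Stand-in with the same field order as nbodykit.io.hdf.ColumnInfo
# (defined locally so the sketch does not require nbodykit)
ColumnInfo = namedtuple("ColumnInfo", ["size", "dtype", "dset"])

# Placeholder values: in practice dset would refer to an HDF5 dataset
info = ColumnInfo(size=100, dtype="f8", dset=None)

# Fields can be read by name or by position
assert info.size == info[0] == 100
assert info.dtype == info[1] == "f8"
assert info.dset is info[2]

# The tuple methods listed above are available as well
position = info.index("f8")   # position of the first field equal to "f8"
n_matches = info.count(100)   # number of fields equal to 100
```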

class nbodykit.io.hdf.HDFFile(path, root='/', exclude=[])

A file object to handle the reading of columns of data from a h5py HDF5 file.

See http://docs.h5py.org for documentation on h5py.

Parameters:
  • path (str) – the file path to load
  • root (str, optional) – the start path in the HDF file, loading all data below this path
  • exclude (list of str, optional) – list of path names to exclude; these can be absolute paths, or paths relative to root
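
The two accepted forms of exclude entries can be illustrated with a small path helper. This is only a sketch: resolve_excluded is a hypothetical function, not part of nbodykit, and it assumes that relative entries are joined onto root to form an absolute HDF5 path. The group and dataset names are made up.

```python
import posixpath

def resolve_excluded(root, exclude):
    """Hypothetical helper: express every exclude entry as an absolute HDF5 path."""
    resolved = []
    for name in exclude:
        if not name.startswith("/"):
            # Relative entries are interpreted relative to root (assumption)
            name = posixpath.join(root, name)
        resolved.append(posixpath.normpath(name))
    return resolved

# One relative entry and one absolute entry under a made-up root group
paths = resolve_excluded("/Data", ["Velocity", "/Data/Mass"])
```

Under these assumptions both entries end up as absolute paths below /Data, so either spelling selects a dataset in the same way.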

Attributes

columns A list of the names of the columns in the file.
dtype A numpy.dtype object holding the data types of each column in the file.
ncol The number of data columns in the file.
shape The shape of the file, which defaults to (size, )
size The size of the file, i.e., number of rows

Methods

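asarray() Return a view of the file, where the fields of the structured array are stacked in columns of a single numpy array
get_dask(column[, blocksize]) Return the specified column as a dask array, which delays the explicit reading of the data until dask.compute() is called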
keys() Aliased function to return columns
read(columns, start, stop[, step]) Read the specified column(s) over the given range

read(columns, start, stop, step=1)

Read the specified column(s) over the given range

start and stop should lie between 0 and size, where size is the total number of rows in the file.

Parameters:
  • columns (str, list of str) – the name of the column(s) to return
  • start (int) – the row integer to start reading at
  • stop (int) – the row integer to stop reading at
  • step (int, optional) – the step size to use when reading; default is 1
Returns: structured array holding the requested columns over the specified range of rows
Return type: numpy.array
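
How start, stop and step select rows can be sketched with plain Python lists. The example assumes that the range follows Python slice semantics (stop is exclusive), which this page does not state explicitly. read_rows is a hypothetical stand-in that returns a dict of lists, not the structured array that HDFFile.read returns, and the column names are made up.

```python
def read_rows(table, columns, start, stop, step=1):
    """Hypothetical stand-in for a column reader over in-memory lists."""
    # Accept a single column name or a list of names
    if isinstance(columns, str):
        columns = [columns]
    # Assumption: rows are selected like a Python slice, with stop excluded
    return {name: table[name][start:stop:step] for name in columns}

# Six rows of made-up data in two columns
table = {
    "Position": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "Mass": [10, 11, 12, 13, 14, 15],
}

# Every second row, starting at row 1 and stopping before row 6
subset = read_rows(table, ["Position", "Mass"], start=1, stop=6, step=2)
single = read_rows(table, "Mass", start=0, stop=3)
```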

nbodykit.io.hdf.find_datasets(info, attrs, name, obj)

Recursively add a ColumnInfo named tuple to the info dict if obj is a Dataset.

When obj holds a structured array with named fields, a ColumnInfo tuple is added for each of the named fields.
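
The collection logic described above can be sketched without an HDF5 file. Everything here is a simplified stand-in: FakeDataset and collect are hypothetical names, groups are modelled as nested dicts, the attrs argument of find_datasets is left out, and the "group/dataset/field" key format is an assumption rather than nbodykit's documented naming.

```python
from collections import namedtuple

# Local stand-in with the same fields as nbodykit.io.hdf.ColumnInfo
ColumnInfo = namedtuple("ColumnInfo", ["size", "dtype", "dset"])

class FakeDataset:
    """Hypothetical stand-in for an HDF5 dataset."""
    def __init__(self, size, dtype):
        self.size = size
        # A str for plain data, or a dict of field name -> type for structured data
        self.dtype = dtype

def collect(info, name, obj):
    """Add ColumnInfo entries to info if obj is a dataset; otherwise recurse."""
    if isinstance(obj, FakeDataset):
        if isinstance(obj.dtype, dict):
            # Structured data: one entry per named field
            for field, field_dtype in obj.dtype.items():
                info[name + "/" + field] = ColumnInfo(obj.size, field_dtype, obj)
        else:
            info[name] = ColumnInfo(obj.size, obj.dtype, obj)
    else:
        # obj is a group, modelled here as a dict of children
        for child_name, child in obj.items():
            child_path = (name + "/" + child_name) if name else child_name
            collect(info, child_path, child)

# A made-up hierarchy: one plain dataset and one structured dataset in a group
tree = {
    "Mass": FakeDataset(4, "f8"),
    "Particles": {"Data": FakeDataset(4, {"Position": "f8", "Velocity": "f8"})},
}
info = {}
collect(info, "", tree)
```

After the call, info holds one entry for the plain dataset and one entry for each named field of the structured dataset.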