nbodykit.io.bigfile

Classes

Automatic
BigFile(path[, exclude, header, dataset]) A file object to handle the reading of columns of data from a bigfile file.
class nbodykit.io.bigfile.BigFile(path, exclude=None, header=<class 'nbodykit.io.bigfile.Automatic'>, dataset='./')[source]

A file object to handle the reading of columns of data from a bigfile file.

bigfile is a reproducible, massively parallel IO library for large, hierarchical datasets, and it is the default format of the FastPM and the MP-Gadget simulations.

See also: https://github.com/rainwoodman/bigfile

Parameters:
  • path (str) – the name of the directory holding the bigfile data
  • exclude (list of str, optional) – the data sets to exclude from loading within bigfile; default is the header
  • header (str, optional) – the path to the header; default is to use a column ‘Header’. The path is relative to the file, not to the dataset.
  • dataset (str) – load a specific dataset from the bigfile; default is to start from the root.

Attributes

columns A list of the names of the columns in the file.
dtype A numpy.dtype object holding the data types of each column in the file.
ncol The number of data columns in the file.
shape The shape of the file, which defaults to (size, )
size The size of the file, i.e., number of rows
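The attributes above mirror the layout of a numpy structured array. A minimal sketch of how they relate, using a plain numpy array as a hypothetical stand-in for a bigfile dataset (the column names "Position" and "Velocity" are illustrative, not part of the API):

```python
import numpy as np

# Hypothetical stand-in for a bigfile dataset: a structured array whose
# fields play the role of the file's columns.
data = np.zeros(1024, dtype=[('Position', ('f4', 3)), ('Velocity', ('f4', 3))])

# The attributes listed above map onto this structure as follows:
columns = list(data.dtype.names)   # names of the columns
dtype = data.dtype                 # data types of each column
ncol = len(columns)                # number of data columns
size = len(data)                   # number of rows
shape = (size,)                    # shape of the file, (size,)
```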

Methods

asarray() Return a view of the file, where the fields of the file’s structured dtype are stacked in columns of a single numpy array.
get_dask(column[, blocksize]) Return the specified column as a dask array, which delays the explicit reading of data until dask.compute() is called.
keys() Aliased function to return columns.
read(columns, start, stop[, step]) Read the specified column(s) over the given range, as a dictionary.
read(columns, start, stop, step=1)[source]

Read the specified column(s) over the given range, as a dictionary

‘start’ and ‘stop’ should be between 0 and size, where size is the total number of rows (particles) in the file.
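A minimal sketch of these read-range semantics, using a numpy structured array in place of an actual bigfile dataset (illustration only, not the BigFile implementation; the column names are hypothetical):

```python
import numpy as np

def read(data, columns, start, stop, step=1):
    """Sketch of read(): slice rows [start:stop:step] and return the
    requested columns as a dictionary of arrays."""
    sl = slice(start, stop, step)
    return {col: data[col][sl] for col in columns}

# Hypothetical dataset with 100 rows and two columns.
data = np.zeros(100, dtype=[('Position', ('f4', 3)), ('Mass', 'f4')])
data['Mass'] = np.arange(100)

# Read the 'Mass' column for rows 10, 12, 14, 16, 18.
out = read(data, ['Mass'], 10, 20, step=2)
```

The returned dictionary maps each requested column name to its sliced array, so `out['Mass']` here holds five values.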