nbodykit.io.binary

Functions

getsize(filename, header_size, rowsize) The default method to determine the size of the binary file

Classes

BinaryFile(path, dtype[, offsets, …]) A file object to handle the reading of columns of data from a binary file.

class nbodykit.io.binary.BinaryFile(path, dtype, offsets=None, header_size=0, size=None)

A file object to handle the reading of columns of data from a binary file.

Warning

This assumes the data is stored in column-major format, i.e., all values of one column are stored contiguously before the next column begins.

Parameters:

path (str) – the name of the binary file to load
dtype (numpy.dtype or list of tuples) – the dtypes of the columns to load; this should be either a numpy.dtype or be able to be converted to one via a numpy.dtype() call
offsets (dict, optional) – a dictionary specifying the byte offset of each column in the binary file; if not supplied, the offsets are inferred from the dtype size of each column, assuming a fixed header size and contiguous storage
header_size (int, optional) – the size of the header in bytes
size (int, optional) – the number of objects in the binary file; if not provided, the value is inferred from the dtype and the total size of the file in bytes
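
As a rough illustration of how byte offsets could be inferred when `offsets` is not supplied, the sketch below assumes contiguous column-major storage after a fixed-size header. The helper `infer_offsets` and the example layout are hypothetical and are not part of nbodykit; item sizes are passed in directly rather than taken from a `numpy.dtype`.

```python
def infer_offsets(columns, size, header_size=0):
    """Return {name: byte offset} for contiguous column-major storage.

    columns: list of (name, itemsize_in_bytes) pairs, in file order
    size: number of rows stored for every column
    header_size: number of header bytes preceding the first column
    """
    offsets = {}
    position = header_size  # the first column starts right after the header
    for name, itemsize in columns:
        offsets[name] = position
        position += itemsize * size  # skip over this column's whole block
    return offsets


# Hypothetical layout: a 12-byte Position (3 x float32) and an 8-byte Mass.
offsets = infer_offsets([("Position", 12), ("Mass", 8)], size=100, header_size=16)
print(offsets)
```

A dictionary of this shape is what the `offsets` parameter expects when the file layout is not contiguous and the offsets must be given explicitly.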

Attributes

`columns`	A list of the names of the columns in the file.
`dtype`	A `numpy.dtype` object holding the data types of each column in the file.
`ncol`	The number of data columns in the file.
`shape`	The shape of the file, which defaults to `(size, )`
`size`	The size of the file, i.e., number of rows

Methods

`asarray`()	Return a view of the file, where the fields of the
`get_dask`(column[, blocksize])	Return the specified column as a dask array, which
`keys`()	Aliased function to return `columns`
`read`(columns, start, stop[, step])	Read the specified column(s) over the given range

read(columns, start, stop, step=1)

Read the specified column(s) over the given range

`start` and `stop` should be between 0 and `size`, which is the total number of rows (particles) in the binary file.

Parameters:

columns (str, list of str) – the name of the column(s) to return
start (int) – the row integer to start reading at
stop (int) – the row integer to stop reading at
step (int, optional) – the step size to use when reading; default is 1
Returns:	structured array holding the requested columns over the specified range of rows
Return type:	numpy.array
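
To make the range semantics concrete, here is a minimal sketch of reading rows `start:stop:step` of a single column from a column-major file, using only the standard library. The function `read_column`, its `offset` and `fmt` arguments, and the demo file layout are assumptions for illustration; this is not how `BinaryFile.read` is implemented, and it returns a plain list rather than a structured array.

```python
import os
import struct
import tempfile


def read_column(path, offset, fmt, start, stop, step=1):
    """Read rows start:stop:step of one contiguously stored column.

    offset: byte offset at which the column block begins
    fmt: struct format of a single value, e.g. "<d" for little-endian float64
    """
    itemsize = struct.calcsize(fmt)
    with open(path, "rb") as f:
        f.seek(offset + start * itemsize)  # jump to the first requested row
        raw = f.read((stop - start) * itemsize)
    values = [v[0] for v in struct.iter_unpack(fmt, raw)]
    return values[::step]  # apply the step after reading the full range


# Demo file: a 4-byte header, then 6 float64 values, then 6 int32 values.
size = 6
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\x00" * 4)
    f.write(struct.pack("<6d", 0.0, 1.0, 2.0, 3.0, 4.0, 5.0))
    f.write(struct.pack("<6i", 10, 11, 12, 13, 14, 15))
    path = f.name

x_part = read_column(path, 4, "<d", 1, 5, step=2)              # rows 1 and 3
id_part = read_column(path, 4 + 8 * size, "<i", 1, 5, step=2)  # rows 1 and 3
os.remove(path)
print(x_part, id_part)
```

The second call shows why per-column offsets matter: the integer column begins only after the header and the entire float64 block.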

nbodykit.io.binary.getsize(filename, header_size, rowsize)

The default method to determine the size of the binary file

The “size” is defined as the number of rows, where each row has a size of rowsize bytes.

Notes

This assumes the input file is not compressed.
This function does not depend on the layout of the binary file, i.e., on whether or not the data is actually stored in rows.

Raises:	ValueError – if the function determines a fractional number of rows

Parameters:

filename (str) – the name of the binary file
header_size (int) – the size of the header in bytes, which will be skipped when determining the number of rows
rowsize (int) – the size of the data in each row in bytes
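
The row count described above can be sketched as follows, assuming an uncompressed file whose data section is exactly a whole number of rows. The name `count_rows` is hypothetical and chosen to avoid implying that this is nbodykit's own `getsize` implementation.

```python
import os
import tempfile


def count_rows(filename, header_size, rowsize):
    """Return the number of rows implied by the file size on disk."""
    nbytes = os.path.getsize(filename) - header_size  # skip the header
    nrows, remainder = divmod(nbytes, rowsize)
    if remainder != 0:
        raise ValueError(
            "fractional number of rows: %d data bytes is not a multiple of %d"
            % (nbytes, rowsize)
        )
    return nrows


# Demo: a 16-byte header followed by 10 rows of 20 bytes each (216 bytes).
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\x00" * (16 + 10 * 20))
    path = f.name

nrows = count_rows(path, header_size=16, rowsize=20)

try:
    count_rows(path, header_size=16, rowsize=7)  # 200 is not a multiple of 7
    raised = False
except ValueError:
    raised = True

os.remove(path)
print(nrows, raised)
```

Because only the total byte count is used, the same arithmetic applies whether the file is laid out row by row or column by column.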