nbodykit.io.tpm module

class nbodykit.io.tpm.TPMBinaryFile(path, precision='f4')[source]

Bases: nbodykit.io.binary.BinaryFile

Read snapshot binary files from Martin White’s TPM simulations.

These files are stored column-wise in binary format, beginning with a 28-byte header.

The columns are:

  • Position : ‘f4’, ‘f8’ precision
    the position data
  • Velocity : ‘f4’, ‘f8’ precision
    the velocity data
  • ID : ‘u8’ precision
    integers specifying the particle ID

Parameters:
  • path (str) – the path to the binary file to load
  • precision ({'f4', 'f8'}, optional) – the string dtype specifying the precision


White M., 2002, ApJS, 579, 16


Attributes:
  • columns – A list of the names of the columns in the file.
  • dtype – A numpy.dtype object holding the data types of each column in the file.
  • ncol – The number of data columns in the file.
  • shape – The shape of the file, which defaults to (size, )
  • size – The size of the file, i.e., number of rows


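Methods:
  • asarray() – Return a view of the file, where the fields of the structured array are stacked in columns of a single numpy array
  • get_dask(column[, blocksize]) – Return the specified column as a dask array, which delays the explicit reading of the data until dask.compute() is called
  • keys() – Aliased function to return columns
  • read(columns, start, stop[, step]) – Read the specified column(s) over the given range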

Indexing the file object with square brackets, e.g. ff[:3], provides numpy-like array indexing.

It supports:

  1. integer and slice indexing, similar to arrays
  2. string indexing using column names in keys()
  3. array-like indexing using integer lists or boolean arrays


If a single column is being returned, a numpy array holding the data is returned, rather than a structured array with only a single field.
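
The three indexing modes can be pictured with an ordinary numpy structured array, which behaves similarly. This is only a stand-in with the TPM column names, not an nbodykit file object, and Position and Velocity are assumed here to be 3-vectors.

```python
import numpy as np

# Stand-in for a file object: a plain structured array with the TPM column names.
data = np.zeros(4, dtype=[("Position", "f4", 3), ("Velocity", "f4", 3), ("ID", "u8")])
data["ID"] = np.arange(4)

first = data[0]                    # 1. integer index: a single row
head = data[:2]                    # 1. slice: the first two rows
ids = data["ID"]                   # 2. one column name: a plain array
pos_id = data[["Position", "ID"]]  # 2. several column names: a structured array
even = data[data["ID"] % 2 == 0]   # 3. boolean mask
picked = data[[0, 3]]              # 3. integer list
```

Note how selecting the single column ID gives an array with no named fields, while selecting two columns keeps the structured dtype.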


asarray()

Return a view of the file, where the fields of the structured array are stacked in columns of a single numpy array.


Start with a file object with three named columns, ra, dec, and z

>>> ff.dtype
dtype([('ra', '<f4'), ('dec', '<f4'), ('z', '<f4')])
>>> ff.shape
(1000,)
>>> ff.columns
['ra', 'dec', 'z']
>>> ff[:3]
array([(235.63442993164062, 59.39099884033203, 0.6225500106811523),
       (140.36181640625, -1.162310004234314, 0.5026500225067139),
       (129.96627807617188, 45.970130920410156, 0.4990200102329254)],
      dtype=(numpy.record, [('ra', '<f4'), ('dec', '<f4'), ('z', '<f4')]))

Select a subset of columns and switch the ordering and convert output to a single numpy array

>>> x = ff[['dec', 'ra']].asarray()
>>> x.dtype
dtype('float32')
>>> x.shape
(1000, 2)
>>> x.columns
['dec', 'ra']
>>> x[:3]
array([[  59.39099884,  235.63442993],
       [  -1.16231   ,  140.36181641],
       [  45.97013092,  129.96627808]], dtype=float32)

Now, select only the first column (dec)

>>> dec = x[:,0]
>>> dec[:3]
array([ 59.39099884,  -1.16231   ,  45.97013092], dtype=float32)

Returns: a file object that will return a numpy array with the columns representing the fields
Return type: FileType

columns

A list of the names of the columns in the file.

This defaults to the named fields in the file’s dtype attribute, but can differ from them if a view of the file has been returned with asarray().


dtype

A numpy.dtype object holding the data types of each column in the file.

get_dask(column, blocksize=100000)

Return the specified column as a dask array, which delays the explicit reading of the data until dask.compute() is called

The dask array is chunked into blocks of size blocksize

Parameters:
  • column (str) – the name of the column to return
  • blocksize (int, optional) – the size of the chunks in the dask array

Returns: a dask array holding the column; it records the operations needed to read the data but defers evaluation until the user requests it
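
To picture the chunking, the sketch below computes the row ranges that blocks of size blocksize would cover. It is an illustration under the assumption that the final block is simply shorter when size is not a multiple of blocksize; the helper chunk_bounds is hypothetical and is not part of nbodykit.

```python
def chunk_bounds(size, blocksize=100000):
    """Return (start, stop) row ranges covering [0, size) in blocks of at most blocksize rows."""
    return [(start, min(start + blocksize, size)) for start in range(0, size, blocksize)]


bounds = chunk_bounds(250000)
```

Each (start, stop) pair corresponds to one deferred read of the column over that range of rows.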



keys()

Aliased function to return columns

logger = <logging.Logger object>

ncol

The number of data columns in the file.

read(columns, start, stop, step=1)

Read the specified column(s) over the given range

‘start’ and ‘stop’ should be between 0 and size, which is the total size of the binary file (in particles)

Parameters:
  • columns (str, list of str) – the name of the column(s) to return
  • start (int) – the row integer to start reading at
  • stop (int) – the row integer to stop reading at
  • step (int, optional) – the step size to use when reading; default is 1

Returns: a structured array holding the requested columns over the specified range of rows
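
The start, stop, and step arguments follow the usual slice semantics. The sketch below mimics this with a numpy structured array standing in for the file contents; read_rows is a hypothetical helper and not the nbodykit method, and the in-memory table is an assumption made only for illustration.

```python
import numpy as np

# Stand-in for the file contents: 10 rows with Position and ID columns.
table = np.zeros(10, dtype=[("Position", "f4", 3), ("ID", "u8")])
table["ID"] = np.arange(10)


def read_rows(columns, start, stop, step=1):
    """Hypothetical stand-in for read(): select columns, then rows start:stop:step."""
    if isinstance(columns, str):
        columns = [columns]  # a single name is treated as a one-column selection
    return table[columns][start:stop:step]


rows = read_rows("ID", 2, 9, step=3)
```

The result is a structured array even when a single column is requested, and the number of rows equals len(range(start, stop, step)).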



shape

The shape of the file, which defaults to (size, )

Multiple dimensions can be introduced into the shape if a view of the file has been returned with asarray()


size

The size of the file, i.e., number of rows