nbodykit.source.catalog

class nbodykit.source.catalog.CSVCatalog(*args, **kwargs)

A CatalogSource that uses CSVFile to read data from disk.

Multiple files can be read at once by supplying a list of file names or a glob asterisk pattern as the path argument. See Reading Multiple Data Files at Once for examples.
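As a rough illustration of what "a list of file names or a glob asterisk pattern" can resolve to, the sketch below expands a `path` argument with the standard-library `glob` module. The helper `expand_paths` and the `part_*.csv` file names are hypothetical and only show the idea; this is not nbodykit's own implementation.

```python
import glob
import os
import tempfile


def expand_paths(path):
    """Hypothetical helper: turn a list of names or a glob pattern into a file list."""
    if isinstance(path, (list, tuple)):
        # an explicit list of file names is used as given
        return list(path)
    # a string is treated as a glob pattern; sort for a reproducible order
    return sorted(glob.glob(path))


# create three small example files in a temporary directory
tmpdir = tempfile.mkdtemp()
for i in range(3):
    with open(os.path.join(tmpdir, "part_%d.csv" % i), "w") as f:
        f.write("%d %d %d\n" % (i, i, i))

matched = expand_paths(os.path.join(tmpdir, "part_*.csv"))
names = [os.path.basename(p) for p in matched]
print(names)
```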

Parameters:


path (str) – the name of the file to load
names (list of str) – the names of the columns of the csv file; this should give the names of all the columns in the file; pass usecols to select a subset of columns
blocksize (int, optional) – the file will be partitioned into blocks of bytes roughly of this size
dtype (dict, str, optional) – if specified as a string, all columns are assumed to have this dtype; otherwise, each column can have its own dtype entry in the dict; if not specified, the data types will be inferred from the file
usecols (list, optional) – a pandas.read_csv keyword; a subset of names to store, ignoring all other columns
delim_whitespace (bool, optional) – a pandas.read_csv keyword; if the CSV file is space-separated, set this to True
**config – additional keyword arguments that will be passed to pandas.read_csv(); see the documentation of that function for a full list of possible options
comm (MPI Communicator, optional) – the MPI communicator instance; default (None) sets to the current communicator
use_cache (bool, optional) – whether to cache data read from disk; default is False
attrs (dict, optional) – dictionary of meta-data to store in attrs
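Since several of these parameters are `pandas.read_csv` keywords, their effect can be previewed with pandas directly. The sketch below is only an illustration on an in-memory, space-separated table with made-up column names `x`, `y`, `z`; it uses `sep=r"\s+"`, which pandas documents as equivalent to `delim_whitespace=True`, and does not go through CSVCatalog itself.

```python
import io

import pandas as pd

# a tiny space-separated "file" with three columns and no header row
text = "1.0 2.0 3.0\n4.0 5.0 6.0\n"

df = pd.read_csv(
    io.StringIO(text),
    names=["x", "y", "z"],   # names of ALL columns in the file
    usecols=["x", "z"],      # keep only a subset of `names`
    sep=r"\s+",              # whitespace-separated values
    dtype="f8",              # a single string applies to every column
)
print(df)
```

Only the `x` and `z` columns are kept, and both are read as 64-bit floats.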

Examples

Please see the documentation for examples.

Attributes

`Index`	The attribute giving the global index rank of each particle in the list.
`attrs`	A dictionary storing relevant meta-data about the CatalogSource.
`columns`	All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.
`csize`	The total, collective size of the CatalogSource, i.e., summed across all ranks.
`hardcolumns`	The union of the columns in the file and any transformed columns.
`size`	The number of objects in the CatalogSource on the local rank.
`use_cache`	If set to `True`, use the built-in caching features of `dask` to cache data in memory.

Methods

`Selection`()	A boolean column that selects a subset slice of the CatalogSource.
`Value`()	When interpolating a CatalogSource on to a mesh, the value of this array is used as the Value that each particle contributes to a given mesh cell.
`Weight`()	The column giving the weight to use for each particle on the mesh.
`compute`(*args, **kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Return a column from the underlying file source.
`gslice`(start, stop[, end, redistribute])	Execute a global slice of a CatalogSource.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`save`(output, columns[, datasets, header])	Save the CatalogSource to a `bigfile.BigFile`.
`sort`(keys[, reverse, usecols])	Return a CatalogSource, sorted globally across all MPI ranks in ascending order by the input keys.
`to_mesh`([Nmesh, BoxSize, dtype, interlaced, …])	Convert the CatalogSource to a MeshSource, using the specified parameters.
`view`([type])	Return a “view” of the CatalogSource object, with the returned type set by `type`.

class nbodykit.source.catalog.BinaryCatalog(*args, **kwargs)

A CatalogSource that uses BinaryFile to read data from disk.

Multiple files can be read at once by supplying a list of file names or a glob asterisk pattern as the path argument. See Reading Multiple Data Files at Once for examples.

Parameters:


path (str) – the name of the binary file to load
dtype (numpy.dtype or list of tuples) – the dtypes of the columns to load; this should be either a numpy.dtype or be able to be converted to one via a numpy.dtype() call
offsets (dict, optional) – a dictionary specifying the byte offsets of each column in the binary file; if not supplied, the offsets are inferred from the dtype size of each column, assuming a fixed header size and contiguous storage
header_size (int, optional) – the size of the header in bytes
size (int, optional) – the number of objects in the binary file; if not provided, the value is inferred from the dtype and the total size of the file in bytes
comm (MPI Communicator, optional) – the MPI communicator instance; default (None) sets to the current communicator
use_cache (bool, optional) – whether to cache data read from disk; default is False
attrs (dict, optional) – dictionary of meta-data to store in attrs
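To make the inference of `size` and `offsets` concrete, here is a minimal NumPy sketch. It assumes a layout in which a fixed-size header is followed by each column stored contiguously, one column after another; the column names, the 16-byte header and the helper `read_column` are illustrative assumptions, not nbodykit's implementation.

```python
import numpy as np

# assumed layout: [header][all Position values][all Velocity values]
dtype = np.dtype([("Position", ("f8", 3)), ("Velocity", ("f8", 3))])
header_size = 16

rng = np.random.default_rng(0)
pos = rng.random((4, 3))
vel = rng.random((4, 3))
raw = b"\0" * header_size + pos.tobytes() + vel.tobytes()

# number of objects inferred from the total byte size and the dtype
inferred_size = (len(raw) - header_size) // dtype.itemsize

# byte offset of each column inferred from the per-column dtype sizes
offsets = {}
cursor = header_size
for name in dtype.names:
    offsets[name] = cursor
    cursor += dtype[name].itemsize * inferred_size


def read_column(name):
    """Hypothetical reader: slice one column out of the raw bytes."""
    sub = dtype[name]
    count = inferred_size * int(np.prod(sub.shape, dtype=int))
    flat = np.frombuffer(raw, dtype=sub.base, count=count, offset=offsets[name])
    return flat.reshape((inferred_size,) + sub.shape)


print(inferred_size, offsets)
```

Passing an explicit `offsets` dictionary would replace the inferred values when the file does not follow this contiguous layout.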

Examples

Please see the documentation for examples.

Attributes

`Index`	The attribute giving the global index rank of each particle in the list.
`attrs`	A dictionary storing relevant meta-data about the CatalogSource.
`columns`	All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.
`csize`	The total, collective size of the CatalogSource, i.e., summed across all ranks.
`hardcolumns`	The union of the columns in the file and any transformed columns.
`size`	The number of objects in the CatalogSource on the local rank.
`use_cache`	If set to `True`, use the built-in caching features of `dask` to cache data in memory.

Methods

`Selection`()	A boolean column that selects a subset slice of the CatalogSource.
`Value`()	When interpolating a CatalogSource on to a mesh, the value of this array is used as the Value that each particle contributes to a given mesh cell.
`Weight`()	The column giving the weight to use for each particle on the mesh.
`compute`(*args, **kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Return a column from the underlying file source.
`gslice`(start, stop[, end, redistribute])	Execute a global slice of a CatalogSource.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`save`(output, columns[, datasets, header])	Save the CatalogSource to a `bigfile.BigFile`.
`sort`(keys[, reverse, usecols])	Return a CatalogSource, sorted globally across all MPI ranks in ascending order by the input keys.
`to_mesh`([Nmesh, BoxSize, dtype, interlaced, …])	Convert the CatalogSource to a MeshSource, using the specified parameters.
`view`([type])	Return a “view” of the CatalogSource object, with the returned type set by `type`.

class nbodykit.source.catalog.BigFileCatalog(*args, **kwargs)

A CatalogSource that uses BigFile to read data from disk.

Multiple files can be read at once by supplying a list of file names or a glob asterisk pattern as the path argument. See Reading Multiple Data Files at Once for examples.

Parameters:


path (str) – the name of the directory holding the bigfile data
exclude (list of str, optional) – the data sets to exclude from loading within bigfile; default is the header
header (str, optional) – the path to the header; default is to use a column ‘Header’. It is relative to the file, not the dataset.
dataset (str) – load a specific dataset from the bigfile; default is to start from the root.
comm (MPI Communicator, optional) – the MPI communicator instance; default (None) sets to the current communicator
use_cache (bool, optional) – whether to cache data read from disk; default is False
attrs (dict, optional) – dictionary of meta-data to store in attrs

Examples

Please see the documentation for examples.

Attributes

`Index`	The attribute giving the global index rank of each particle in the list.
`attrs`	A dictionary storing relevant meta-data about the CatalogSource.
`columns`	All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.
`csize`	The total, collective size of the CatalogSource, i.e., summed across all ranks.
`hardcolumns`	The union of the columns in the file and any transformed columns.
`size`	The number of objects in the CatalogSource on the local rank.
`use_cache`	If set to `True`, use the built-in caching features of `dask` to cache data in memory.

Methods

`Selection`()	A boolean column that selects a subset slice of the CatalogSource.
`Value`()	When interpolating a CatalogSource on to a mesh, the value of this array is used as the Value that each particle contributes to a given mesh cell.
`Weight`()	The column giving the weight to use for each particle on the mesh.
`compute`(*args, **kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Return a column from the underlying file source.
`gslice`(start, stop[, end, redistribute])	Execute a global slice of a CatalogSource.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`save`(output, columns[, datasets, header])	Save the CatalogSource to a `bigfile.BigFile`.
`sort`(keys[, reverse, usecols])	Return a CatalogSource, sorted globally across all MPI ranks in ascending order by the input keys.
`to_mesh`([Nmesh, BoxSize, dtype, interlaced, …])	Convert the CatalogSource to a MeshSource, using the specified parameters.
`view`([type])	Return a “view” of the CatalogSource object, with the returned type set by `type`.

class nbodykit.source.catalog.HDFCatalog(*args, **kwargs)

A CatalogSource that uses HDFFile to read data from disk.

Multiple files can be read at once by supplying a list of file names or a glob asterisk pattern as the path argument. See Reading Multiple Data Files at Once for examples.

Parameters:


path (str) – the file path to load
root (str, optional) – the start path in the HDF file, loading all data below this path
exclude (list of str, optional) – list of path names to exclude; these can be absolute paths, or paths relative to root
comm (MPI Communicator, optional) – the MPI communicator instance; default (None) sets to the current communicator
use_cache (bool, optional) – whether to cache data read from disk; default is False
attrs (dict, optional) – dictionary of meta-data to store in attrs

Examples

Please see the documentation for examples.

Attributes

`Index`	The attribute giving the global index rank of each particle in the list.
`attrs`	A dictionary storing relevant meta-data about the CatalogSource.
`columns`	All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.
`csize`	The total, collective size of the CatalogSource, i.e., summed across all ranks.
`hardcolumns`	The union of the columns in the file and any transformed columns.
`size`	The number of objects in the CatalogSource on the local rank.
`use_cache`	If set to `True`, use the built-in caching features of `dask` to cache data in memory.

Methods

`Selection`()	A boolean column that selects a subset slice of the CatalogSource.
`Value`()	When interpolating a CatalogSource on to a mesh, the value of this array is used as the Value that each particle contributes to a given mesh cell.
`Weight`()	The column giving the weight to use for each particle on the mesh.
`compute`(*args, **kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Return a column from the underlying file source.
`gslice`(start, stop[, end, redistribute])	Execute a global slice of a CatalogSource.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`save`(output, columns[, datasets, header])	Save the CatalogSource to a `bigfile.BigFile`.
`sort`(keys[, reverse, usecols])	Return a CatalogSource, sorted globally across all MPI ranks in ascending order by the input keys.
`to_mesh`([Nmesh, BoxSize, dtype, interlaced, …])	Convert the CatalogSource to a MeshSource, using the specified parameters.
`view`([type])	Return a “view” of the CatalogSource object, with the returned type set by `type`.

class nbodykit.source.catalog.TPMBinaryCatalog(*args, **kwargs)

A CatalogSource that uses TPMBinaryFile to read data from disk.

Multiple files can be read at once by supplying a list of file names or a glob asterisk pattern as the path argument. See Reading Multiple Data Files at Once for examples.

Parameters:


path (str) – the path to the binary file to load
precision ({'f4', 'f8'}, optional) – the string dtype specifying the precision
comm (MPI Communicator, optional) – the MPI communicator instance; default (None) sets to the current communicator
use_cache (bool, optional) – whether to cache data read from disk; default is False
attrs (dict, optional) – dictionary of meta-data to store in attrs

Attributes

`Index`	The attribute giving the global index rank of each particle in the list.
`attrs`	A dictionary storing relevant meta-data about the CatalogSource.
`columns`	All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.
`csize`	The total, collective size of the CatalogSource, i.e., summed across all ranks.
`hardcolumns`	The union of the columns in the file and any transformed columns.
`size`	The number of objects in the CatalogSource on the local rank.
`use_cache`	If set to `True`, use the built-in caching features of `dask` to cache data in memory.

Methods

`Selection`()	A boolean column that selects a subset slice of the CatalogSource.
`Value`()	When interpolating a CatalogSource on to a mesh, the value of this array is used as the Value that each particle contributes to a given mesh cell.
`Weight`()	The column giving the weight to use for each particle on the mesh.
`compute`(*args, **kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Return a column from the underlying file source.
`gslice`(start, stop[, end, redistribute])	Execute a global slice of a CatalogSource.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`save`(output, columns[, datasets, header])	Save the CatalogSource to a `bigfile.BigFile`.
`sort`(keys[, reverse, usecols])	Return a CatalogSource, sorted globally across all MPI ranks in ascending order by the input keys.
`to_mesh`([Nmesh, BoxSize, dtype, interlaced, …])	Convert the CatalogSource to a MeshSource, using the specified parameters.
`view`([type])	Return a “view” of the CatalogSource object, with the returned type set by `type`.

class nbodykit.source.catalog.FITSCatalog(*args, **kwargs)

A CatalogSource that uses FITSFile to read data from disk.

Multiple files can be read at once by supplying a list of file names or a glob asterisk pattern as the path argument. See Reading Multiple Data Files at Once for examples.

Parameters:


path (str) – the file path to load
ext (number or string, optional) – the extension to read, given either as a zero-based extension number or as an extension name; if not given, data is read from the first HDU that has data
comm (MPI Communicator, optional) – the MPI communicator instance; default (None) sets to the current communicator
use_cache (bool, optional) – whether to cache data read from disk; default is False
attrs (dict, optional) – dictionary of meta-data to store in attrs

Examples

Please see the documentation for examples.

Attributes

`Index`	The attribute giving the global index rank of each particle in the list.
`attrs`	A dictionary storing relevant meta-data about the CatalogSource.
`columns`	All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.
`csize`	The total, collective size of the CatalogSource, i.e., summed across all ranks.
`hardcolumns`	The union of the columns in the file and any transformed columns.
`size`	The number of objects in the CatalogSource on the local rank.
`use_cache`	If set to `True`, use the built-in caching features of `dask` to cache data in memory.

Methods

`Selection`()	A boolean column that selects a subset slice of the CatalogSource.
`Value`()	When interpolating a CatalogSource on to a mesh, the value of this array is used as the Value that each particle contributes to a given mesh cell.
`Weight`()	The column giving the weight to use for each particle on the mesh.
`compute`(*args, **kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Return a column from the underlying file source.
`gslice`(start, stop[, end, redistribute])	Execute a global slice of a CatalogSource.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`save`(output, columns[, datasets, header])	Save the CatalogSource to a `bigfile.BigFile`.
`sort`(keys[, reverse, usecols])	Return a CatalogSource, sorted globally across all MPI ranks in ascending order by the input keys.
`to_mesh`([Nmesh, BoxSize, dtype, interlaced, …])	Convert the CatalogSource to a MeshSource, using the specified parameters.
`view`([type])	Return a “view” of the CatalogSource object, with the returned type set by `type`.

class nbodykit.source.catalog.Gadget1Catalog(*args, **kwargs)

A CatalogSource that uses Gadget1File to read data from disk.

Multiple files can be read at once by supplying a list of file names or a glob asterisk pattern as the path argument. See Reading Multiple Data Files at Once for examples.

Parameters:


path (str) – the path to the binary file to load
columndefs (list) – a list of triplets (columnname, element_dtype, particle_types)
ptype (int) – type of particle of interest.
hdtype (list, dtype) – dtype of the header; must define Massarr and Npart
comm (MPI Communicator, optional) – the MPI communicator instance; default (None) sets to the current communicator
use_cache (bool, optional) – whether to cache data read from disk; default is False
attrs (dict, optional) – dictionary of meta-data to store in attrs

Attributes

`Index`	The attribute giving the global index rank of each particle in the list.
`attrs`	A dictionary storing relevant meta-data about the CatalogSource.
`columns`	All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.
`csize`	The total, collective size of the CatalogSource, i.e., summed across all ranks.
`hardcolumns`	The union of the columns in the file and any transformed columns.
`size`	The number of objects in the CatalogSource on the local rank.
`use_cache`	If set to `True`, use the built-in caching features of `dask` to cache data in memory.

Methods

`Selection`()	A boolean column that selects a subset slice of the CatalogSource.
`Value`()	When interpolating a CatalogSource on to a mesh, the value of this array is used as the Value that each particle contributes to a given mesh cell.
`Weight`()	The column giving the weight to use for each particle on the mesh.
`compute`(*args, **kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Return a column from the underlying file source.
`gslice`(start, stop[, end, redistribute])	Execute a global slice of a CatalogSource.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`save`(output, columns[, datasets, header])	Save the CatalogSource to a `bigfile.BigFile`.
`sort`(keys[, reverse, usecols])	Return a CatalogSource, sorted globally across all MPI ranks in ascending order by the input keys.
`to_mesh`([Nmesh, BoxSize, dtype, interlaced, …])	Convert the CatalogSource to a MeshSource, using the specified parameters.
`view`([type])	Return a “view” of the CatalogSource object, with the returned type set by `type`.

class nbodykit.source.catalog.ArrayCatalog(data, comm=None, use_cache=False, **kwargs)

A CatalogSource initialized from a dictionary or structured ndarray.

Parameters:


data (dict or numpy.ndarray) – a dictionary or structured ndarray; items are interpreted as the columns of the catalog; the length of any item is used as the size of the catalog.
comm (MPI Communicator, optional) – the MPI communicator instance; default (None) sets to the current communicator
use_cache (bool, optional) – whether to cache data read from disk; default is False
**kwargs – additional keywords to store as meta-data in attrs
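The two accepted forms of `data` carry the same information, as the NumPy sketch below tries to show: each field of a structured array, or each item of a dictionary, is one column, and the common length of the columns gives the catalog size. The column names and the helper `catalog_size` are hypothetical; the consistency check is an assumption about sensible behaviour, not a statement of what ArrayCatalog does internally.

```python
import numpy as np

# form 1: a structured ndarray, one field per column
data = np.zeros(5, dtype=[("Position", ("f8", 3)), ("Mass", "f8")])
data["Mass"] = np.arange(5.0)
columns_from_array = {name: data[name] for name in data.dtype.names}

# form 2: a plain dictionary of equally long arrays
columns_from_dict = {"Position": np.zeros((5, 3)), "Mass": np.arange(5.0)}


def catalog_size(columns):
    """Hypothetical helper: the length shared by all columns."""
    lengths = {len(col) for col in columns.values()}
    if len(lengths) != 1:
        raise ValueError("all columns must have the same length")
    return lengths.pop()


print(catalog_size(columns_from_array), catalog_size(columns_from_dict))
```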

Attributes

`Index`	The attribute giving the global index rank of each particle in the list.
`attrs`	A dictionary storing relevant meta-data about the CatalogSource.
`columns`	All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.
`csize`	The total, collective size of the CatalogSource, i.e., summed across all ranks.
`hardcolumns`	The union of the columns in the file and any transformed columns.
`size`	The number of objects in the CatalogSource on the local rank.
`use_cache`	If set to `True`, use the built-in caching features of `dask` to cache data in memory.

Methods

`Selection`()	A boolean column that selects a subset slice of the CatalogSource.
`Value`()	When interpolating a CatalogSource on to a mesh, the value of this array is used as the Value that each particle contributes to a given mesh cell.
`Weight`()	The column giving the weight to use for each particle on the mesh.
`compute`(*args, **kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Return a column from the underlying data array/dict.
`gslice`(start, stop[, end, redistribute])	Execute a global slice of a CatalogSource.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`save`(output, columns[, datasets, header])	Save the CatalogSource to a `bigfile.BigFile`.
`sort`(keys[, reverse, usecols])	Return a CatalogSource, sorted globally across all MPI ranks in ascending order by the input keys.
`to_mesh`([Nmesh, BoxSize, dtype, interlaced, …])	Convert the CatalogSource to a MeshSource, using the specified parameters.
`view`([type])	Return a “view” of the CatalogSource object, with the returned type set by `type`.

get_hardcolumn(col)

Return a column from the underlying data array/dict.

Columns are returned as dask arrays.

hardcolumns: The union of the columns in the file and any transformed columns.

class nbodykit.source.catalog.LogNormalCatalog(Plin, nbar, BoxSize, Nmesh, bias=2.0, seed=None, cosmo=None, redshift=None, unitary_amplitude=False, inverted_phase=False, comm=None, use_cache=False)

A CatalogSource containing biased particles that have been Poisson-sampled from a log-normal density field.

Parameters:


Plin (callable) – callable specifying the linear power spectrum
nbar (float) – the number density of the particles in the box, assumed constant across the box; this is used when Poisson sampling the density field
BoxSize (float, 3-vector of floats) – the size of the box to generate the grid on
Nmesh (int) – the mesh size to use when generating the density and displacement fields, which are Poisson-sampled to particles
bias (float, optional) – the desired bias of the particles; applied while applying a log-normal transformation to the density field
seed (int, optional) – the global random seed; if set to None, the seed will be set randomly
cosmo (nbodykit.cosmology.core.Cosmology, optional) – this must be supplied if Plin does not carry a cosmo attribute
redshift (float, optional) – this must be supplied if Plin does not carry a redshift attribute
comm (MPI Communicator, optional) – the MPI communicator instance; default (None) sets to the current communicator
use_cache (bool, optional) – whether to cache data read from disk; default is False
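The idea of Poisson-sampling particles from a log-normal density field can be sketched with NumPy alone. The example below is a heavily simplified stand-in, not the LogNormalCatalog algorithm: it assumes a white-noise Gaussian field in place of one drawn from `Plin`, ignores `bias`, velocities and displacements, and places particles uniformly inside their cells. The values of `Nmesh`, `BoxSize` and `nbar` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
Nmesh = 16
BoxSize = 100.0
nbar = 1e-2
cell_size = BoxSize / Nmesh
cell_volume = cell_size ** 3

# ASSUMPTION: white-noise Gaussian field instead of one generated from Plin
delta_g = rng.normal(0.0, 0.5, size=(Nmesh, Nmesh, Nmesh))

# log-normal transform: 1 + delta = exp(delta_g - sigma^2 / 2), mean close to 1
density = np.exp(delta_g - 0.5 * delta_g.var())

# Poisson-sample the number of particles in each cell
counts = rng.poisson(nbar * cell_volume * density)

# repeat each cell index by its count, then scatter uniformly within the cell
cell_index = np.indices(counts.shape).reshape(3, -1).T
particle_cells = np.repeat(cell_index, counts.ravel(), axis=0)
positions = (particle_cells + rng.random(particle_cells.shape)) * cell_size

print(len(positions), nbar * BoxSize ** 3)
```

The number of sampled particles fluctuates around `nbar * BoxSize**3`, which is why `nbar` is described above as the quantity used when Poisson sampling the density field.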

References

Cole and Jones, 1991
Agrawal et al., 2017

Attributes

`Index`	The attribute giving the global index rank of each particle in the list.
`attrs`	A dictionary storing relevant meta-data about the CatalogSource.
`columns`	All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.
`csize`	The total, collective size of the CatalogSource, i.e., summed across all ranks.
`hardcolumns`	A list of the hard-coded columns in the CatalogSource.
`size`	The number of objects in the CatalogSource on the local rank.
`use_cache`	If set to `True`, use the built-in caching features of `dask` to cache data in memory.

Methods

`Position`()	Position assumed to be in Mpc/h
`Selection`()	A boolean column that selects a subset slice of the CatalogSource.
`Value`()	When interpolating a CatalogSource on to a mesh, the value of this array is used as the Value that each particle contributes to a given mesh cell.
`Velocity`()	Velocity in km/s
`VelocityOffset`()	The corresponding RSD offset, in Mpc/h
`Weight`()	The column giving the weight to use for each particle on the mesh.
`compute`(*args, **kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Construct and return a hard-coded column.
`gslice`(start, stop[, end, redistribute])	Execute a global slice of a CatalogSource.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`save`(output, columns[, datasets, header])	Save the CatalogSource to a `bigfile.BigFile`.
`sort`(keys[, reverse, usecols])	Return a CatalogSource, sorted globally across all MPI ranks in ascending order by the input keys.
`to_mesh`([Nmesh, BoxSize, dtype, interlaced, …])	Convert the CatalogSource to a MeshSource, using the specified parameters.
`view`([type])	Return a “view” of the CatalogSource object, with the returned type set by `type`.

Position()[source]¶: Position assumed to be in Mpc/h

Velocity()[source]¶: Velocity in km/s

VelocityOffset()[source]¶: The corresponding RSD offset, in Mpc/h

class nbodykit.source.catalog.UniformCatalog(nbar, BoxSize, seed=None, comm=None, use_cache=False)[source]¶

A CatalogSource that has uniformly-distributed Position and Velocity columns.

The random numbers generated do not depend on the number of available ranks.

Parameters:

nbar (float) – the desired number density of particles in the box
BoxSize (float, 3-vector) – the size of the box
seed (int, optional) – the random seed
comm – the MPI communicator
use_cache (bool, optional) – whether to cache data on disk

Attributes

`Index`	The attribute giving the global index rank of each particle in the list.
`attrs`	A dictionary storing relevant meta-data about the CatalogSource.
`columns`	All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.
`csize`	The total, collective size of the CatalogSource, i.e., summed across all ranks.
`hardcolumns`	A list of the hard-coded columns in the CatalogSource.
`rng`	A `MPIRandomState` that behaves as `numpy.random.RandomState` but generates random numbers in a manner independent of the number of ranks.
`size`	The number of objects in the CatalogSource on the local rank.
`use_cache`	If set to `True`, use the built-in caching features of `dask` to cache data in memory.

Methods

`Position`()	The position of particles, uniformly distributed in `BoxSize`
`Selection`()	A boolean column that selects a subset slice of the CatalogSource.
`Value`()	When interpolating a CatalogSource on to a mesh, the value of this array is used as the Value that each particle contributes to a given mesh cell.
`Velocity`()	The velocity of particles, uniformly distributed in `0.01 x BoxSize`
`Weight`()	The column giving the weight to use for each particle on the mesh.
`compute`(*args, **kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Construct and return a hard-coded column.
`gslice`(start, stop[, end, redistribute])	Execute a global slice of a CatalogSource.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`save`(output, columns[, datasets, header])	Save the CatalogSource to a `bigfile.BigFile`.
`sort`(keys[, reverse, usecols])	Return a CatalogSource, sorted globally across all MPI ranks in ascending order by the input keys.
`to_mesh`([Nmesh, BoxSize, dtype, interlaced, …])	Convert the CatalogSource to a MeshSource, using the specified parameters.
`view`([type])	Return a “view” of the CatalogSource object, with the returned type set by `type`.

Position()[source]¶: The position of particles, uniformly distributed in BoxSize

Velocity()[source]¶: The velocity of particles, uniformly distributed in 0.01 x BoxSize

class nbodykit.source.catalog.RandomCatalog(csize, seed=None, comm=None, use_cache=False)[source]¶

A CatalogSource that can have columns added via a collective random number generator.

The random number generator stored as rng behaves as numpy.random.RandomState but generates random numbers only on the local rank in a manner independent of the number of ranks.

Parameters:

csize (int) – the desired collective size of the Source
seed (int, optional) – the global seed for the random number generator
comm (MPI communicator) – the MPI communicator; set automatically if None

Attributes

`Index`	The attribute giving the global index rank of each particle in the list.
`attrs`	A dictionary storing relevant meta-data about the CatalogSource.
`columns`	All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.
`csize`	The total, collective size of the CatalogSource, i.e., summed across all ranks.
`hardcolumns`	A list of the hard-coded columns in the CatalogSource.
`rng`	A `MPIRandomState` that behaves as `numpy.random.RandomState` but generates random numbers in a manner independent of the number of ranks.
`size`	The number of objects in the CatalogSource on the local rank.
`use_cache`	If set to `True`, use the built-in caching features of `dask` to cache data in memory.

Methods

`Selection`()	A boolean column that selects a subset slice of the CatalogSource.
`Value`()	When interpolating a CatalogSource on to a mesh, the value of this array is used as the Value that each particle contributes to a given mesh cell.
`Weight`()	The column giving the weight to use for each particle on the mesh.
`compute`(*args, **kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Construct and return a hard-coded column.
`gslice`(start, stop[, end, redistribute])	Execute a global slice of a CatalogSource.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`save`(output, columns[, datasets, header])	Save the CatalogSource to a `bigfile.BigFile`.
`sort`(keys[, reverse, usecols])	Return a CatalogSource, sorted globally across all MPI ranks in ascending order by the input keys.
`to_mesh`([Nmesh, BoxSize, dtype, interlaced, …])	Convert the CatalogSource to a MeshSource, using the specified parameters.
`view`([type])	Return a “view” of the CatalogSource object, with the returned type set by `type`.

rng¶: A MPIRandomState that behaves as numpy.random.RandomState but generates random numbers in a manner independent of the number of ranks.
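
One way to obtain random numbers that do not depend on the number of ranks is to tie the random stream to the global particle index rather than to the rank. The sketch below is a minimal illustration of that idea using only the standard library; it is not the MPIRandomState implementation, and the chunk size, the seeding formula, and all function names are assumptions.

```python
import random

CHUNK = 4  # fixed chunk length, independent of the number of ranks (hypothetical)

def chunk_values(seed, chunk_id):
    """Draw one chunk of uniforms from a generator seeded by (seed, chunk_id)."""
    rng = random.Random(seed * 1_000_003 + chunk_id)
    return [rng.random() for _ in range(CHUNK)]

def local_uniform(seed, csize, rank, nranks):
    """Return the slice of the global sequence owned by `rank`."""
    # Contiguous split of the global index range [0, csize) across ranks.
    start = rank * csize // nranks
    stop = (rank + 1) * csize // nranks
    out = []
    # Visit only the chunks that overlap [start, stop) and trim their edges.
    for chunk_id in range(start // CHUNK, (stop + CHUNK - 1) // CHUNK):
        values = chunk_values(seed, chunk_id)
        lo = max(start, chunk_id * CHUNK) - chunk_id * CHUNK
        hi = min(stop, (chunk_id + 1) * CHUNK) - chunk_id * CHUNK
        out.extend(values[lo:hi])
    return out

def gather(seed, csize, nranks):
    """Concatenate what every rank would generate locally."""
    return [x for r in range(nranks) for x in local_uniform(seed, csize, r, nranks)]
```

Because every chunk is seeded from the global seed and its own chunk number, gather(42, 10, 1) and gather(42, 10, 4) return the same ten values, even though each rank only generates the part it owns.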

class nbodykit.source.catalog.FKPCatalog(data, randoms, BoxSize=None, BoxPad=0.02, use_cache=True)[source]¶

An interface for simultaneous modeling of a data CatalogSource and a randoms CatalogSource, in the spirit of Feldman, Kaiser, and Peacock, 1994.

The main functionality of this class is to:

provide a uniform interface to accessing columns from the data CatalogSource and randoms CatalogSource, using column names prefixed with “data/” or “randoms/”
compute the shared BoxSize of the source, by finding the maximum Cartesian extent of the randoms
provide an interface to a mesh object, which knows how to paint the FKP density field from the data and randoms
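
The box computation in the second item can be sketched as follows. This is a standard-library illustration, not the nbodykit code path: treating BoxPad as a fractional buffer on the extent, rounding the result up, and the function name box_from_extent are all assumptions.

```python
import math

def box_from_extent(positions, box_pad=0.02):
    """Return (BoxSize, BoxCenter) enclosing `positions`, padded by `box_pad`.

    `positions` is a sequence of (x, y, z) tuples; `box_pad` is treated as a
    fractional buffer added to the extent along each axis (an assumption).
    """
    mins = [min(p[i] for p in positions) for i in range(3)]
    maxs = [max(p[i] for p in positions) for i in range(3)]
    # Pad the extent, then round up so the box never clips a particle.
    size = [math.ceil((hi - lo) * (1.0 + box_pad)) for lo, hi in zip(mins, maxs)]
    center = [0.5 * (lo + hi) for lo, hi in zip(mins, maxs)]
    return size, center

# Three toy "randoms" spanning 100 x 50 x 200 in Cartesian coordinates.
randoms = [(10.0, -5.0, 100.0), (110.0, 45.0, 300.0), (60.0, 20.0, 150.0)]
box_size, box_center = box_from_extent(randoms)
```

Using the randoms rather than the data to set the box reflects the description above: the randoms trace the full survey footprint, so their extent bounds the data as well.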

Parameters:

data (CatalogSource) – the CatalogSource of particles representing the data catalog
randoms (CatalogSource) – the CatalogSource of particles representing the randoms catalog
BoxSize (float, 3-vector, optional) – the size of the Cartesian box to use for the unified data and randoms; if not provided, the maximum Cartesian extent of the randoms defines the box
BoxPad (float, 3-vector, optional) – optionally apply this additional buffer to the extent of the Cartesian box
use_cache (bool, optional) – if True, use the built-in dask cache system to cache data, providing significant speed-ups; requires cachey

References

Feldman, Kaiser, and Peacock, 1994

Attributes

`attrs`	A dictionary storing relevant meta-data about the CatalogSource.
`columns`	Columns for individual species can be accessed using a `species/` prefix and the column name, i.e., `data/Position`.
`hardcolumns`	Hardcolumn of the form `species/name`
`species`	List of species names
`use_cache`	If set to `True`, use the built-in caching features of `dask` to cache data in memory.

Methods

`compute`(*args, **kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Construct and return a hard-coded column.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`save`(output, columns[, datasets, header])	Save the CatalogSource to a `bigfile.BigFile`.
`to_mesh`([Nmesh, BoxSize, dtype, interlaced, …])	Convert the FKPCatalog to a mesh, which knows how to “paint” the FKP density field.
`view`([type])	Return a “view” of the CatalogSource object, with the returned type set by `type`.

to_mesh(Nmesh=None, BoxSize=None, dtype='f4', interlaced=False, compensated=False, window='cic', fkp_weight='FKPWeight', comp_weight='Weight', nbar='NZ', selection='Selection', position='Position')[source]¶

Convert the FKPCatalog to a mesh, which knows how to “paint” the FKP density field.

Additional keywords to the to_mesh() function include the FKP weight column, completeness weight column, and the column specifying the number density as a function of redshift.

Parameters:

Nmesh (int, 3-vector, optional) – the number of cells per box side; if not specified in attrs, this must be provided
BoxSize (float, 3-vector, optional) – the size of the box; if not provided, the default value in attrs is used
dtype (str, dtype, optional) – the data type of the mesh when painting
interlaced (bool, optional) – whether to use interlacing to reduce aliasing when painting the particles on the mesh
compensated (bool, optional) – whether to apply a Fourier-space transfer function to account for the effects of the gridding + aliasing
window (str, optional) – the string name of the window to use when interpolating the particles to the mesh; see pmesh.window.methods for choices
fkp_weight (str, optional) – the name of the column in the source specifying the FKP weight; this weight is applied to the FKP density field: n_data - alpha*n_randoms
comp_weight (str, optional) – the name of the column in the source specifying the completeness weight; this weight is applied to the individual fields, either n_data or n_randoms
selection (str, optional) – the name of the column used to select a subset of the source when painting
nbar (str, optional) – the name of the column specifying the number density as a function of redshift
position (str, optional) – the name of the column that specifies the position data of the objects in the catalog
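
To make the roles of the two weight columns concrete, the sketch below evaluates the weighted field n_data - alpha*n_randoms for two toy mesh cells. It is plain Python rather than the nbodykit painter, and defining alpha as the ratio of summed completeness weights is an assumption of this example.

```python
def alpha_from_weights(data_comp_weights, randoms_comp_weights):
    """Ratio of summed completeness weights (assumed definition of alpha)."""
    return sum(data_comp_weights) / sum(randoms_comp_weights)

def fkp_cell_value(n_data, n_randoms, alpha, w_fkp=1.0):
    """FKP-weighted density in one cell: w_fkp * (n_data - alpha * n_randoms)."""
    return w_fkp * (n_data - alpha * n_randoms)

# Toy catalogs: 4 data objects and 40 randoms, all with unit completeness weight.
alpha = alpha_from_weights([1.0] * 4, [1.0] * 40)

# Completeness-weighted counts of data and randoms falling in each of two cells.
cells = [(3.0, 20.0), (1.0, 20.0)]
field = [fkp_cell_value(nd, nr, alpha) for nd, nr in cells]
```

With this choice of alpha, the field sums to zero over the mesh: the randoms, rescaled to the data, act as the mean density that is subtracted.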

class nbodykit.source.catalog.HaloCatalog(source, cosmo, redshift, mdef='vir', mass='Mass', position='Position', velocity='Velocity')[source]¶

A wrapper CatalogSource of halo objects to interface nicely with halotools.sim_manager.UserSuppliedHaloCatalog.

Parameters:

source (CatalogSource) – the source holding the particles to be interpreted as halos
cosmo (Cosmology) – the cosmology instance
redshift (float) – the redshift of the halo catalog
mdef (str, optional) – string specifying mass definition, used for computing default halo radii and concentration; should be ‘vir’ or ‘XXXc’ or ‘XXXm’ where ‘XXX’ is an int specifying the overdensity
mass (str, optional) – the column name specifying the mass of each halo
position (str, optional) – the column name specifying the position of each halo
velocity (str, optional) – the column name specifying the velocity of each halo
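
The accepted forms of mdef can be illustrated with a small parser. This is a hypothetical helper, not part of nbodykit or halotools; reading the trailing ‘c’ and ‘m’ as "relative to the critical density" and "relative to the mean matter density" follows the usual convention and is an assumption here.

```python
import re

def parse_mdef(mdef):
    """Split a mass-definition string into (overdensity, reference).

    Returns (None, 'vir') for the virial definition, or (XXX, 'c') / (XXX, 'm')
    for an integer overdensity XXX followed by its reference-density letter.
    """
    if mdef == 'vir':
        return None, 'vir'
    match = re.fullmatch(r'(\d+)([cm])', mdef)
    if match is None:
        raise ValueError("mdef should be 'vir', 'XXXc' or 'XXXm', got %r" % mdef)
    return int(match.group(1)), match.group(2)
```

For example, parse_mdef('200c') yields the overdensity 200 with reference 'c', while a malformed string such as '200x' raises a ValueError.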

Attributes

`Index`	The attribute giving the global index rank of each particle in the list.
`attrs`	A dictionary storing relevant meta-data about the CatalogSource.
`columns`	All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.
`csize`	The total, collective size of the CatalogSource, i.e., summed across all ranks.
`hardcolumns`	A list of the hard-coded columns in the CatalogSource.
`size`	The number of objects in the CatalogSource on the local rank.
`use_cache`	If set to `True`, use the built-in caching features of `dask` to cache data in memory.

Methods

`Concentration`()	The halo concentration, computed using `nbodykit.transform.HaloConcentration()`.
`Mass`()	The halo mass column, assumed to be in units of \(M_\odot/h\).
`Position`()	The halo position column, assumed to be in units of \(\mathrm{Mpc}/h\).
`Radius`()	The halo radius, computed using `nbodykit.transform.HaloRadius()`.
`Selection`()	A boolean column that selects a subset slice of the CatalogSource.
`Value`()	When interpolating a CatalogSource on to a mesh, the value of this array is used as the Value that each particle contributes to a given mesh cell.
`Velocity`()	The halo velocity column, assumed to be in units of km/s.
`VelocityOffset`()	The redshift-space distance offset due to the velocity in units of distance.
`Weight`()	The column giving the weight to use for each particle on the mesh.
`compute`(*args, **kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Construct and return a hard-coded column.
`gslice`(start, stop[, end, redistribute])	Execute a global slice of a CatalogSource.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`save`(output, columns[, datasets, header])	Save the CatalogSource to a `bigfile.BigFile`.
`sort`(keys[, reverse, usecols])	Return a CatalogSource, sorted globally across all MPI ranks in ascending order by the input keys.
`to_halotools`([BoxSize, selection])	Return the CatalogSource as a `halotools.sim_manager.UserSuppliedHaloCatalog`.
`to_mesh`([Nmesh, BoxSize, dtype, interlaced, …])	Convert the CatalogSource to a MeshSource, using the specified parameters.
`view`([type])	Return a “view” of the CatalogSource object, with the returned type set by `type`.

Concentration()[source]¶

The halo concentration, computed using nbodykit.transform.HaloConcentration().

This uses the analytic formulas for concentration from Dutton and Maccio 2014.

Mass()[source]¶: The halo mass column, assumed to be in units of \(M_\odot/h\).

Position()[source]¶: The halo position column, assumed to be in units of \(\mathrm{Mpc}/h\).

Radius()[source]¶

The halo radius, computed using nbodykit.transform.HaloRadius().

Assumed units of \(\mathrm{Mpc}/h\).

Velocity()[source]¶: The halo velocity column, assumed to be in units of km/s.

VelocityOffset()[source]¶

The redshift-space distance offset due to the velocity in units of distance. The assumed units are \(\mathrm{Mpc}/h\).

This multiplies Velocity by \(1 / (a 100 E(z)) = 1 / (a H(z)/h)\).
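
A worked example of this conversion, written with the standard library only. The flat ΛCDM form of E(z), the default omega_m value, and the function names are assumptions of the sketch; nbodykit takes E(z) from the supplied cosmology instead.

```python
import math

def efunc(z, omega_m):
    """E(z) = H(z) / H0 for a flat LCDM cosmology (an assumption of this sketch)."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))

def velocity_offset(velocity_kms, z, omega_m=0.3):
    """Convert velocity components in km/s to redshift-space offsets in Mpc/h."""
    a = 1.0 / (1.0 + z)
    # 100 * E(z) is H(z) / h in km/s/Mpc, so the factor is 1 / (a H(z) / h).
    rsd_factor = 1.0 / (a * 100.0 * efunc(z, omega_m))
    return [v * rsd_factor for v in velocity_kms]

# At z = 0 both a and E(z) are 1, so the factor reduces to 1/100.
offset_z0 = velocity_offset([300.0, -150.0, 0.0], z=0.0)
```

At z = 0 a velocity of 300 km/s therefore maps to an offset of about 3 Mpc/h; at higher redshift the factor changes through both a and E(z).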

to_halotools(BoxSize=None, selection='Selection')[source]¶

Return the CatalogSource as a halotools.sim_manager.UserSuppliedHaloCatalog.

The Halotools catalog only holds the local data, although halos are labeled via the halo_id column using the global index.

Parameters:

BoxSize (float, array_like, optional) – the size of the box; note that anisotropic boxes are currently not supported by halotools
selection (str, optional) – the name of the column to slice the data on before converting to a halotools catalog

Returns:	cat – the Halotools halo catalog, storing the local halo data
Return type:	`halotools.sim_manager.UserSuppliedHaloCatalog`

class nbodykit.source.catalog.HODCatalog(halos, logMmin=13.031, sigma_logM=0.38, alpha=0.76, logM0=13.27, logM1=14.08, seed=None, use_cache=False, comm=None)[source]¶

A CatalogSource that uses the HOD prescription of Zheng et al. 2007 to populate an input halo catalog with galaxies.

The mock population is done using halotools. See the documentation for halotools.empirical_models.Zheng07Cens and halotools.empirical_models.Zheng07Sats for further details regarding the HOD.

The columns generated in this catalog are:

Position: the galaxy position
Velocity: the galaxy velocity
VelocityOffset: the RSD velocity offset, in units of distance
conc_NFWmodel: the concentration of the halo
gal_type: the galaxy type, 0 for centrals and 1 for satellites
halo_id: the global ID of the halo this galaxy belongs to, between 0 and csize
halo_local_id: the local ID of the halo this galaxy belongs to, between 0 and size
halo_mvir: the halo mass
halo_nfw_conc: alias of conc_NFWmodel
halo_num_centrals: the number of centrals that this halo hosts, either 0 or 1
halo_num_satellites: the number of satellites that this halo hosts
halo_rvir: the halo radius
halo_upid: equal to -1; should be ignored by the user
halo_vx, halo_vy, halo_vz: the three components of the halo velocity
halo_x, halo_y, halo_z: the three components of the halo position
host_centric_distance: the distance from this galaxy to the center of the halo
vx, vy, vz: the three components of the galaxy velocity, equal to Velocity
x,y,z: the three components of the galaxy position, equal to Position

For further details, please see the documentation.

Note

Default HOD values are from Reid et al. 2014

Parameters:

halos (UserSuppliedHaloCatalog) – the halotools table holding the halo data; this object must have the following attributes: cosmology, Lbox, redshift
logMmin (float, optional) – Minimum mass required for a halo to host a central galaxy
sigma_logM (float, optional) – Rate of transition from <Ncen>=0 –> <Ncen>=1
alpha (float, optional) – Power law slope of the relation between halo mass and <Nsat>
logM0 (float, optional) – Low-mass cutoff in <Nsat>
logM1 (float, optional) – Characteristic halo mass where <Nsat> begins to assume a power law form
seed (int, optional) – the random seed to generate deterministic mocks

References

Zheng et al. (2007), arXiv:astro-ph/0703457

Attributes

`Index`	The attribute giving the global index rank of each particle in the list.
`attrs`	A dictionary storing relevant meta-data about the CatalogSource.
`columns`	All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.
`csize`	The total, collective size of the CatalogSource, i.e., summed across all ranks.
`hardcolumns`	The union of the columns in the file and any transformed columns.
`size`	The number of objects in the CatalogSource on the local rank.
`use_cache`	If set to `True`, use the built-in caching features of `dask` to cache data in memory.

Methods

`Position`()	Galaxy positions, in units of Mpc/h
`Selection`()	A boolean column that selects a subset slice of the CatalogSource.
`Value`()	When interpolating a CatalogSource on to a mesh, the value of this array is used as the Value that each particle contributes to a given mesh cell.
`Velocity`()	Galaxy velocity, in units of km/s
`VelocityOffset`()	The RSD velocity offset, in units of Mpc/h
`Weight`()	The column giving the weight to use for each particle on the mesh.
`compute`(*args, **kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Return a column from the underlying data array/dict.
`gslice`(start, stop[, end, redistribute])	Execute a global slice of a CatalogSource.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`repopulate`([seed])	Update the HOD parameters and then re-populate the mock catalog
`save`(output, columns[, datasets, header])	Save the CatalogSource to a `bigfile.BigFile`.
`sort`(keys[, reverse, usecols])	Return a CatalogSource, sorted globally across all MPI ranks in ascending order by the input keys.
`to_mesh`([Nmesh, BoxSize, dtype, interlaced, …])	Convert the CatalogSource to a MeshSource, using the specified parameters.
`view`([type])	Return a “view” of the CatalogSource object, with the returned type set by `type`.

__makemodel__()[source]¶

Return the Zheng 07 HOD model.

This model evaluates Eqs. 2 and 5 of Zheng et al. 2007
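
For orientation, the mean occupation functions can be sketched in plain Python using the default parameter values from the HODCatalog signature. The functional forms below (an error-function step for centrals, and a power law above M0 multiplied by that step for satellites) are the commonly quoted ones and are an assumption here; the halotools classes are the authoritative implementation and may differ in detail.

```python
import math

def mean_ncen(log_m, logMmin=13.031, sigma_logM=0.38):
    """Mean number of centrals in a halo of mass 10**log_m (erf step)."""
    return 0.5 * (1.0 + math.erf((log_m - logMmin) / sigma_logM))

def mean_nsat(log_m, logMmin=13.031, sigma_logM=0.38,
              alpha=0.76, logM0=13.27, logM1=14.08):
    """Mean number of satellites: central step times a power law above M0."""
    mass, m0, m1 = 10.0 ** log_m, 10.0 ** logM0, 10.0 ** logM1
    if mass <= m0:
        # Halos at or below the cutoff mass M0 host no satellites.
        return 0.0
    return mean_ncen(log_m, logMmin, sigma_logM) * ((mass - m0) / m1) ** alpha
```

In this form a halo with log-mass equal to logMmin hosts a central half of the time, and sigma_logM sets how quickly the central occupation rises from 0 to 1 around that mass.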

class nbodykit.source.catalog.MultipleSpeciesCatalog(names, *species, **kwargs)[source]¶

A CatalogSource interface for handling multiple species of particles.

This CatalogSource stores a copy of the original CatalogSource objects for each species, providing access to the columns via the format species/column, where “species” is one of the species names provided.

Parameters:

names (list of str) – list of strings specifying the names of the various species; data columns are prefixed with “species/” where “species” is in names
*species (two or more CatalogSource objects) – catalogs to be combined into a single catalog, which give the data for different species of particles; as many catalogs as names must be provided
use_cache (bool, optional) – whether to cache data when reading; default is True

Examples

Initialization:

>>> from nbodykit.lab import UniformCatalog, MultipleSpeciesCatalog
>>> data = UniformCatalog(nbar=3e-5, BoxSize=512., seed=42)
>>> randoms = UniformCatalog(nbar=3e-5, BoxSize=512., seed=84)
>>> cat = MultipleSpeciesCatalog(['data', 'randoms'], data, randoms)

Accessing the Catalogs for individual species:

>>> data = cat["data"] # a copy of the original "data" object

Accessing individual columns:

>>> data_pos = cat["data/Position"]

Setting new columns:

>>> cat["data"]["new_column"] = 1.0
>>> assert "data/new_column" in cat

Attributes

`attrs`	A dictionary storing relevant meta-data about the CatalogSource.
`columns`	Columns for individual species can be accessed using a `species/` prefix and the column name, i.e., `data/Position`.
`hardcolumns`	Hardcolumn of the form `species/name`
`species`	List of species names
`use_cache`	If set to `True`, use the built-in caching features of `dask` to cache data in memory.

Methods

`compute`(*args, **kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Construct and return a hard-coded column.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`save`(output, columns[, datasets, header])	Save the CatalogSource to a `bigfile.BigFile`.
`to_mesh`([Nmesh, BoxSize, dtype, interlaced, …])	Convert the catalog to a mesh, which knows how to “paint” the combined density field, summed over all particle species.
`view`([type])	Return a “view” of the CatalogSource object, with the returned type set by `type`.

__delitem__(col)[source]¶: Delete a column of the form species/column

__getitem__(key)[source]¶

This provides access to the underlying data in two ways:

The CatalogSource object for a species can be accessed if key is a species name.
Individual columns for a species can be accessed using the format: species/column.
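
The two access modes can be mimicked with a toy container in which each species is a plain dict of columns. This is only an illustration of the key-resolution rule; the class name SpeciesContainer and its internals are hypothetical and unrelated to the actual MultipleSpeciesCatalog implementation.

```python
class SpeciesContainer:
    """Toy stand-in for a multi-species catalog (not the nbodykit class)."""

    def __init__(self, names, *species):
        if len(names) != len(species):
            raise ValueError("need as many catalogs as names")
        # Each "catalog" is a plain dict of column name -> values here.
        self._species = dict(zip(names, species))

    def __getitem__(self, key):
        # A bare species name returns the whole per-species catalog.
        if key in self._species:
            return self._species[key]
        # Otherwise expect 'species/column' and split on the first slash.
        name, sep, column = key.partition('/')
        if not sep or name not in self._species:
            raise KeyError(key)
        return self._species[name][column]

    def __contains__(self, key):
        try:
            self[key]
        except KeyError:
            return False
        return True

cat = SpeciesContainer(['data', 'randoms'],
                       {'Position': [(0.0, 1.0, 2.0)]},
                       {'Position': [(3.0, 4.0, 5.0)]})
cat['data']['new_column'] = [1.0]
```

After the last line, 'data/new_column' in cat is true while 'randoms/new_column' in cat is false, since a column added to one species is not shared with the others.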

__setitem__(col, value)[source]¶: Add columns to any of the species catalogs.

Note

New column names should be prefixed by ‘species/’ where ‘species’ is a name in the species attribute.

columns¶: Columns for individual species can be accessed using a species/ prefix and the column name, i.e., data/Position.

hardcolumns¶: Hardcolumn of the form species/name

species¶: List of species names

to_mesh(Nmesh=None, BoxSize=None, dtype='f4', interlaced=False, compensated=False, window='cic', weight='Weight', selection='Selection', value='Value', position='Position')[source]¶

Convert the catalog to a mesh, which knows how to “paint” the combined density field, summed over all particle species.

Parameters:

Nmesh (int, 3-vector, optional) – the number of cells per box side; can be inferred from attrs if the value is the same for all species
BoxSize (float, 3-vector, optional) – the size of the box; can be inferred from attrs if the value is the same for all species
dtype (str, dtype, optional) – the data type of the mesh when painting
interlaced (bool, optional) – whether to use interlacing to reduce aliasing when painting the particles on the mesh
compensated (bool, optional) – whether to apply a Fourier-space transfer function to account for the effects of the gridding + aliasing
window (str, optional) – the string name of the window to use when interpolating the particles to the mesh; see pmesh.window.methods for choices
weight (str, optional) – the name of the column specifying the weight for each particle
selection (str, optional) – the name of the column that specifies which (if any) slice of the CatalogSource to take
value (str, optional) – the name of the column specifying the field value for each particle
position (str, optional) – the name of the column that specifies the position data of the objects in the catalog