nbodykit.source.catalog.species module

class nbodykit.source.catalog.species.MultipleSpeciesCatalog(names, *species, **kwargs)[source]

Bases: nbodykit.base.catalog.CatalogSourceBase

A CatalogSource interface for handling multiple species of particles.

Parameters:
  • names (list of str) – list of strings specifying the names of the various species; data columns are prefixed with “species/” where “species” is in names
  • *species (two or more CatalogSource objects) – catalogs to be combined into a single catalog, which give the data for different species of particles; as many catalogs as names must be provided
  • use_cache (bool, optional) – whether to cache data when reading; default is True

Examples

>>> source1 = UniformCatalog(nbar=3e-5, BoxSize=512., seed=42)
>>> source2 = UniformCatalog(nbar=3e-5, BoxSize=512., seed=84)
>>> cat = MultipleSpeciesCatalog(['data', 'randoms'], source1, source2)

Attributes

attrs A dictionary storing relevant meta-data about the CatalogSource.
columns All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.
hardcolumns A list of the hard-coded columns in the CatalogSource.
use_cache If set to True, use the built-in caching features of dask to cache data in memory.

Methods

compute(*args, **kwargs) Our version of dask.compute() that computes multiple delayed dask collections at once.
get_hardcolumn(col) Construct and return a hard-coded column.
make_column(array) Utility function to convert a numpy array to a dask.array.Array.
read(columns) Return the requested columns as dask arrays.
save(output, columns[, datasets, header]) Save the CatalogSource to a bigfile.BigFile.
to_mesh([Nmesh, BoxSize, dtype, interlaced, …]) Convert the catalog to a mesh, which knows how to “paint” the combined density field, summed over all particle species.
__delitem__(col)

Delete a column; cannot delete a “hard-coded” column

__getitem__(key)[source]

This modifies the behavior of CatalogSourceBase.__getitem__() such that if key is a species name, a CatalogCopy is returned that holds the data for that species only.
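
For example, with the cat built in the Examples section above (a sketch; the species and column names are those from that example), indexing by a species name returns a catalog for that species, while a prefixed column name returns the corresponding dask array:

>>> data = cat['data']           # a catalog holding only the 'data' species
>>> pos = cat['data/Position']   # dask array for the Position column of 'data'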

__setitem__(col, value)[source]

Add columns to any of the species catalogs.

Note

New column names should be prefixed by ‘species/’ where ‘species’ is a name in the species attribute.
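
For example, a new column can be added to a single species by prefixing the column name (a sketch, assuming the cat and source1 from the Examples section above and that numpy is imported):

>>> import numpy
>>> cat['data/Mass'] = numpy.ones(source1.size)   # adds a Mass column to the 'data' species only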

attrs

A dictionary storing relevant meta-data about the CatalogSource.

columns

All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.

compute(*args, **kwargs)

Our version of dask.compute() that computes multiple delayed dask collections at once.

This should be called on the return value of read() to convert any dask arrays to numpy arrays.

If use_cache is True, this internally caches data, using dask’s built-in cache features.

Parameters: args (object) – Any number of objects. If the object is a dask collection, it’s computed and the result is returned. Otherwise it’s passed through unchanged.

Notes

The dask default optimizer induces too many (unnecessary) IO calls, so we turn this feature off by default. Eventually we will likely want our own optimizer.
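
For example, a single column can be evaluated to a numpy array (a sketch, using the cat from the Examples section above):

>>> pos = cat.compute(cat['data/Position'])   # numpy array of positions for the 'data' species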

get_hardcolumn(col)

Construct and return a hard-coded column.

These are usually produced by calling member functions marked by the @column decorator.

Subclasses may override this method and the hardcolumns attribute to bypass the decorator logic.

hardcolumns

A list of the hard-coded columns in the CatalogSource.

These columns are usually member functions marked by @column decorator. Subclasses may override this method and use get_hardcolumn() to bypass the decorator logic.

logger = <logging.Logger object>

make_column(array)

Utility function to convert a numpy array to a dask.array.Array.
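
For example (a sketch, assuming numpy is imported; the array is illustrative):

>>> col = cat.make_column(numpy.arange(10))   # a dask.array.Array wrapping the numpy array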

read(columns)

Return the requested columns as dask arrays.

Parameters: columns (list of str) – the names of the requested columns
Returns: the list of column data, in the form of dask arrays
Return type: list of dask.array.Array
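
For example, several columns can be read at once and then evaluated together with compute() (a sketch, using the cat from the Examples section above):

>>> pos_d, pos_r = cat.read(['data/Position', 'randoms/Position'])   # dask arrays
>>> pos_d, pos_r = cat.compute(pos_d, pos_r)                         # numpy arrays
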
save(output, columns, datasets=None, header='Header')

Save the CatalogSource to a bigfile.BigFile.

Only the selected columns are saved; attrs are saved in the header data set, and the attrs of each column are stored with its data set.

Parameters:
  • output (str) – the name of the file to write to
  • columns (list of str) – the names of the columns to save in the file
  • datasets (list of str, optional) – names for the data set where each column is stored; defaults to the name of the column
  • header (str, optional) – the name of the data set holding the header information, where attrs is stored
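
For example (a sketch; the output file name is illustrative and the column names follow the Examples section above):

>>> cat.save('species-catalog.bigfile', ['data/Position', 'randoms/Position'])
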
to_mesh(Nmesh=None, BoxSize=None, dtype='f4', interlaced=False, compensated=False, window='cic', weight='Weight', selection='Selection', value='Value', position='Position')[source]

Convert the catalog to a mesh, which knows how to “paint” the combined density field, summed over all particle species.

Parameters:
  • Nmesh (int, 3-vector, optional) – the number of cells per box side; can be inferred from attrs if the value is the same for all species
  • BoxSize (float, 3-vector, optional) – the size of the box; can be inferred from attrs if the value is the same for all species
  • dtype (str, dtype, optional) – the data type of the mesh when painting
  • interlaced (bool, optional) – whether to use interlacing to reduce aliasing when painting the particles on the mesh
  • compensated (bool, optional) – whether to apply a Fourier-space transfer function to account for the effects of the gridding + aliasing
  • window (str, optional) – the string name of the window to use when interpolating the particles to the mesh
  • weight (str, optional) – the name of the column specifying the weight for each particle
  • value (str, optional) – the name of the column specifying the field value for each particle
  • selection (str, optional) – the name of the column that specifies which (if any) slice of the CatalogSource to take
  • position (str, optional) – the name of the column that specifies the position data of the objects in the catalog
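
For example, since both species in the Examples section above share the same BoxSize in their attrs, only Nmesh needs to be supplied (a sketch; the Nmesh value and the use of compensated are illustrative):

>>> mesh = cat.to_mesh(Nmesh=128, compensated=True)   # combined density field of 'data' + 'randoms'
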
use_cache

If set to True, use the built-in caching features of dask to cache data in memory.

nbodykit.source.catalog.species.OnDemandColumn(source, col)[source]

Return a column from the source on-demand.

nbodykit.source.catalog.species.check_species_metadata(name, attrs, species)[source]

Check to see if there is a single value for name in the meta-data of all the species.