nbodykit.source.catalog.lognormal module

class nbodykit.source.catalog.lognormal.LogNormalCatalog(Plin, nbar, BoxSize, Nmesh, bias=2.0, seed=None, cosmo=None, redshift=None, comm=None, use_cache=False)[source]

Bases: nbodykit.base.catalog.CatalogSource

A CatalogSource containing biased particles that have been Poisson-sampled from a log-normal density field.

Parameters:
  • Plin (callable) – callable specifying the linear power spectrum
  • nbar (float) – the number density of the particles in the box, assumed constant across the box; this is used when Poisson sampling the density field
  • BoxSize (float, 3-vector of floats) – the size of the box to generate the grid on
  • Nmesh (int) – the mesh size to use when generating the density and displacement fields, which are Poisson-sampled to particles
  • bias (float, optional) – the desired bias of the particles; this is applied when performing the log-normal transformation of the density field
  • seed (int, optional) – the global random seed; if set to None, the seed will be set randomly
  • cosmo (nbodykit.cosmology.core.Cosmology, optional) – this must be supplied if Plin does not carry a cosmo attribute
  • redshift (float, optional) – this must be supplied if Plin does not carry a redshift attribute
  • comm (MPI Communicator, optional) – the MPI communicator instance; if None (the default), the current communicator is used
  • use_cache (bool, optional) – whether to cache data read from disk; default is False
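The log-normal transformation and Poisson-sampling steps described above can be sketched in plain numpy. This is a simplified illustration of the technique, not nbodykit's actual implementation; in particular, the mean-normalization convention used here is an assumption:

```python
import numpy as np

def lognormal_poisson_sample(delta_G, nbar, BoxSize, bias=2.0, seed=None):
    """Sketch of the log-normal transform + Poisson sampling step.

    delta_G : Gaussian overdensity field on a cubic (Nmesh,)*3 mesh
    nbar    : mean number density of particles in the box
    """
    rng = np.random.default_rng(seed)

    # log-normal transform: exponentiate the biased Gaussian field,
    # then normalize so the overdensity has zero mean
    field = np.exp(bias * delta_G)
    delta = field / field.mean() - 1.0

    # expected counts per cell: nbar * (1 + delta) * cell volume
    Nmesh = delta_G.shape[0]
    cell_volume = (BoxSize / Nmesh) ** 3
    lam = nbar * (1.0 + delta) * cell_volume

    # Poisson-sample integer particle counts per cell
    return rng.poisson(lam)
```

The total number of sampled particles fluctuates around nbar times the box volume, since the normalized overdensity averages to zero.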

References

  • Cole and Jones, 1991
  • Agrawal et al., 2017

Attributes

attrs A dictionary storing relevant meta-data about the CatalogSource.
columns All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.
csize The total, collective size of the CatalogSource, i.e., summed across all ranks.
hardcolumns A list of the hard-coded columns in the CatalogSource.
size
use_cache If set to True, use the built-in caching features of dask to cache data in memory.

Methods

Position() Position assumed to be in Mpc/h
Selection() A boolean column that selects a subset slice of the CatalogSource.
Value() When interpolating a CatalogSource on to a mesh, the value of this array is used as the Value that each particle contributes to a given mesh cell.
Velocity() Velocity in km/s
VelocityOffset() The corresponding RSD offset, in Mpc/h
Weight() The column giving the weight to use for each particle on the mesh.
compute(*args, **kwargs) Our version of dask.compute() that computes multiple delayed dask collections at once.
copy() Return a copy of the CatalogSource object
get_hardcolumn(col) Construct and return a hard-coded column.
make_column(array) Utility function to convert a numpy array to a dask.array.Array.
read(columns) Return the requested columns as dask arrays.
save(output, columns[, datasets, header]) Save the CatalogSource to a bigfile.BigFile.
to_mesh([Nmesh, BoxSize, dtype, interlaced, …]) Convert the CatalogSource to a MeshSource, using the specified parameters.
update_csize() Set the collective size, csize.
Position()[source]

Position assumed to be in Mpc/h

Selection()

A boolean column that selects a subset slice of the CatalogSource.

By default, this column is set to True for all particles.

Value()

When interpolating a CatalogSource on to a mesh, the value of this array is used as the Value that each particle contributes to a given mesh cell.

The mesh field is a weighted average of Value, with the weights given by Weight.

By default, this array is set to unity for all particles.

Velocity()[source]

Velocity in km/s

VelocityOffset()[source]

The corresponding RSD offset, in Mpc/h
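The conversion from Velocity (km/s) to an RSD offset (Mpc/h) can be illustrated with the standard factor 1/(a H(z)), where H(z) = 100 E(z) km/s per Mpc/h so that the little-h cancels; treating this as nbodykit's convention is an assumption of this sketch:

```python
import numpy as np

def velocity_offset(velocity_kms, redshift, Ez):
    """Convert a peculiar velocity in km/s to an RSD displacement in Mpc/h.

    offset = v / (a * H(z)), with a = 1/(1+z) and
    H(z) = 100 * E(z) km/s per Mpc/h.
    """
    return np.asarray(velocity_kms) * (1.0 + redshift) / (100.0 * Ez)

# at z = 0 (E(z) = 1), 100 km/s corresponds to a 1 Mpc/h offset
print(velocity_offset(100.0, 0.0, 1.0))
```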

Weight()

The column giving the weight to use for each particle on the mesh.

The mesh field is a weighted average of Value, with the weights given by Weight.

By default, this array is set to unity for all particles.

__delitem__(col)

Delete a column; cannot delete a “hard-coded” column

__getitem__(sel)

The following types of indexing are supported:

  1. strings specifying a column in the CatalogSource; returns a dask array holding the column data
  2. boolean arrays specifying a slice of the CatalogSource; returns a CatalogCopy holding only the relevant slice
  3. slice object specifying which particles to select
  4. list of strings specifying column names; returns a CatalogCopy holding only the selected columns
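The boolean-array and slice cases mirror ordinary numpy/dask indexing. A minimal sketch, using a bare dask array as a stand-in for a catalog column:

```python
import numpy as np
import dask.array as da

# stand-in for a column of a 4-particle catalog
pos = da.from_array(np.arange(12.0).reshape(4, 3), chunks=(2, 3))

mask = np.array([True, False, True, False])
print(pos[mask].compute())   # boolean selection: rows 0 and 2
print(pos[:2].compute())     # slice selection: first two rows
```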
__len__()

The local size of the CatalogSource on a given rank.

__setitem__(col, value)

Add columns to the CatalogSource, overriding any existing columns with the name col.

attrs

A dictionary storing relevant meta-data about the CatalogSource.

columns

All columns in the CatalogSource, including those hard-coded into the class’s definition and override columns provided by the user.

compute(*args, **kwargs)

Our version of dask.compute() that computes multiple delayed dask collections at once.

This should be called on the return value of read() to convert any dask arrays to numpy arrays.

If use_cache is True, this internally caches data, using dask’s built-in cache features.

Parameters:args (object) – Any number of objects. If the object is a dask collection, it’s computed and the result is returned. Otherwise it’s passed through unchanged.

Notes

The default dask optimizer induces too many (unnecessary) IO calls, so we turn this feature off by default. Eventually, we will probably want our own optimizer.
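Evaluating several delayed collections in one pass is what plain dask.compute() does; a minimal sketch of the pattern (analogous to calling catalog.compute(col1, col2) on columns returned by read()):

```python
import dask
import dask.array as da

x = da.ones(6, chunks=3)
y = da.arange(6, chunks=3)

# compute multiple dask collections at once, sharing one evaluation pass
x_np, y_np = dask.compute(x, y)
print(x_np.sum(), y_np.sum())   # 6.0 15
```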

copy()

Return a copy of the CatalogSource object

Returns:the new CatalogSource object holding the copied data columns
Return type:CatalogCopy
csize

The total, collective size of the CatalogSource, i.e., summed across all ranks.

It is the sum of size across all available ranks.

get_hardcolumn(col)

Construct and return a hard-coded column.

These are usually produced by calling member functions marked by the @column decorator.

Subclasses may override this method and the hardcolumns attribute to bypass the decorator logic.

hardcolumns

A list of the hard-coded columns in the CatalogSource.

These columns are usually member functions marked by the @column decorator. Subclasses may override this method and use get_hardcolumn() to bypass the decorator logic.

logger = <logging.Logger object>
make_column(array)

Utility function to convert a numpy array to a dask.array.Array.
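The conversion roughly amounts to wrapping the array with dask.array.from_array; the chunk size used here is an assumption for illustration, not nbodykit's internal choice:

```python
import numpy as np
import dask.array as da

arr = np.random.random(size=(1000, 3))

# similar in spirit to CatalogSource.make_column(arr)
col = da.from_array(arr, chunks=100000)
print(type(col), col.shape)
```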

read(columns)

Return the requested columns as dask arrays.

Parameters:columns (list of str) – the names of the requested columns
Returns:the list of column data, in the form of dask arrays
Return type:list of dask.array.Array
save(output, columns, datasets=None, header='Header')

Save the CatalogSource to a bigfile.BigFile.

Only the selected columns are saved; attrs are saved in the header. The attrs of each column are stored in its dataset.

Parameters:
  • output (str) – the name of the file to write to
  • columns (list of str) – the names of the columns to save in the file
  • datasets (list of str, optional) – names for the data set where each column is stored; defaults to the name of the column
  • header (str, optional) – the name of the data set holding the header information, where attrs is stored
size
to_mesh(Nmesh=None, BoxSize=None, dtype='f4', interlaced=False, compensated=False, window='cic', weight='Weight', value='Value', selection='Selection', position='Position')

Convert the CatalogSource to a MeshSource, using the specified parameters.

Parameters:
  • Nmesh (int, optional) – the number of cells per side on the mesh; must be provided if not stored in attrs
  • BoxSize (scalar, 3-vector, optional) – the size of the box; must be provided if not stored in attrs
  • dtype (string, optional) – the data type of the mesh array
  • interlaced (bool, optional) – use the interlacing technique of Sefusatti et al. 2015 to reduce the effects of aliasing on Fourier space quantities computed from the mesh
  • compensated (bool, optional) – whether to correct for the window introduced by the grid interpolation scheme
  • window (str, optional) – the string specifying which window interpolation scheme to use; see pmesh.window.methods
  • weight (str, optional) – the name of the column specifying the weight for each particle
  • value (str, optional) – the name of the column specifying the field value for each particle
  • selection (str, optional) – the name of the column that specifies which (if any) slice of the CatalogSource to take
  • position (str, optional) – the name of the column that specifies the position data of the objects in the catalog
Returns:

mesh – a mesh object that provides an interface for gridding particle data onto a specified mesh

Return type:

CatalogMesh
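The 'cic' window named above splits each particle's weight linearly between its nearest grid points. A one-dimensional sketch of that assignment (an illustration of the scheme, not pmesh's implementation):

```python
import numpy as np

def cic_assign_1d(positions, weights, Nmesh, BoxSize):
    """Cloud-in-cell (CIC) assignment in 1D on a periodic box."""
    mesh = np.zeros(Nmesh)
    x = np.asarray(positions) / BoxSize * Nmesh   # positions in cell units
    i = np.floor(x).astype(int)
    frac = x - i
    # split each weight linearly between the two nearest cells
    np.add.at(mesh, i % Nmesh, np.asarray(weights) * (1.0 - frac))
    np.add.at(mesh, (i + 1) % Nmesh, np.asarray(weights) * frac)
    return mesh

mesh = cic_assign_1d([2.5], [1.0], Nmesh=8, BoxSize=8.0)
print(mesh[2], mesh[3])   # 0.5 0.5
```

A particle midway between two grid points contributes half its weight to each, which is why CIC produces a smoother field than nearest-grid-point assignment.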

update_csize()

Set the collective size, csize.

This function should be called in __init__() of a subclass, after size has been set to a valid value (not NotImplemented).

use_cache

If set to True, use the built-in caching features of dask to cache data in memory.