nbodykit.algorithms¶

class nbodykit.algorithms.ConvolvedFFTPower(first, poles, second=None, Nmesh=None, kmin=0.0, kmax=None, dk=None, use_fkp_weights=None, P0_FKP=None)[source]¶

Algorithm to compute power spectrum multipoles using FFTs for a data survey with non-trivial geometry.

Due to the geometry, the estimator computes the true power spectrum convolved with the window function (FFT of the geometry).

This estimator implemented in this class is described in detail in Hand et al. 2017 (arxiv:1704.02357). It uses the spherical harmonic addition theorem such that only \(2\ell+1\) FFTs are required to compute each multipole. This differs from the implementation in Bianchi et al. and Scoccimarro et al., which requires \((\ell+1)(\ell+2)/2\) FFTs.

Results are computed when the object is inititalized, and the result is stored in the poles attribute. Important meta-data computed during algorithm execution is stored in the attrs dict. See the documenation of run().

Note

A full tutorial on the class is available in the documentation here.

Note

Cross correlations are only supported when the FKP weight column differs between the two mesh objects, i.e., the underlying data and randoms must be the same. This allows users to compute the cross power spectrum of the same density field, weighted differently.

Parameters

first (FKPCatalog, FKPCatalogMesh) – the first source to paint the data/randoms; FKPCatalog is automatically converted to a FKPCatalogMesh, using default painting parameters
poles (list of int) – a list of integer multipole numbers ell to compute
second (FKPCatalog, FKPCatalogMesh, optional) – the second source to paint the data/randoms; cross correlations are only supported when the weight column differs between the two mesh objects, i.e., the underlying data and randoms must be the same!
kmin (float, optional) – the edge of the first wavenumber bin; default is 0
kmax (float, optional) – the limit of the last wavenumber bin; default is None, no limit.
dk (float, optional) – the spacing in wavenumber to use; if not provided; the fundamental mode of the box is used

References

Hand, Nick et al. An optimal FFT-based anisotropic power spectrum estimator, 2017
Bianchi, Davide et al., Measuring line-of-sight-dependent Fourier-space clustering using FFTs, MNRAS, 2015
Scoccimarro, Roman, Fast estimators for redshift-space clustering, Phys. Review D, 2015

Methods

`load`(output[, comm, format])	Load a saved ConvolvedFFTPower result, which has been saved to disk with `ConvolvedFFTPower.save()`.
`normalization`(name, alpha)	Compute the power spectrum normalization, using either the `data` or `randoms` source.
`run`()	Compute the power spectrum multipoles.
`save`(output)	Save the ConvolvedFFTPower result to disk.
`shotnoise`(alpha)	Compute the power spectrum shot noise, using either the `data` or `randoms` source.
`to_pkmu`(mu_edges, max_ell)	Invert the measured multipoles \(P_\ell(k)\) into power spectrum wedges, \(P(k,\mu)\).

__setstate_pre000305__(state)[source]¶: compatible version of setstate for files generated before 0.3.5

classmethod load(output, comm=None, format='current')[source]¶

Load a saved ConvolvedFFTPower result, which has been saved to disk with ConvolvedFFTPower.save().

The current MPI communicator is automatically used if the comm keyword is None

format can be ‘current’, or ‘pre000305’ for files generated before 0.3.5.

normalization(name, alpha)[source]¶

Compute the power spectrum normalization, using either the data or randoms source.

The normalization is given by:

\[A = \int d^3x \bar{n}'_1(x) \bar{n}'_2(x) w_{\mathrm{fkp},1} w_{\mathrm{fkp},2}.\]

The mean densities are assumed to be the same, so this can be converted to a summation over objects in the source, as

\[A = \sum w_{\mathrm{comp},1} \bar{n}_2 w_{\mathrm{fkp},1} w_{\mathrm{fkp},2}.\]

References

see Eqs. 13,14 of Beutler et al. 2014, “The clustering of galaxies in the SDSS-III Baryon Oscillation Spectroscopic Survey: testing gravity with redshift space distortions using the power spectrum multipoles”

run()[source]¶

Compute the power spectrum multipoles. This function does not return anything, but adds several attributes (see below).

edges¶

the edges of the wavenumber bins

Type: array_like

poles¶

a BinnedStatistic object that behaves similar to a structured array, with fancy slicing and re-indexing; it holds the measured multipole results, as well as the number of modes (modes) and average wavenumbers values in each bin (k)

Type: BinnedStatistic

attrs¶

dictionary holding input parameters and several important quantites computed during execution:

data.N, randoms.N :
the unweighted number of data and randoms objects
data.W, randoms.W :
the weighted number of data and randoms objects, using the column specified as the completeness weights
alpha :
the ratio of data.W to randoms.W
data.norm, randoms.norm :
the normalization of the power spectrum, computed from either the “data” or “randoms” catalog (they should be similar). See equations 13 and 14 of arxiv:1312.4611.
data.shotnoise, randoms.shotnoise :
the shot noise values for the “data” and “random” catalogs; See equation 15 of arxiv:1312.4611.
shotnoise :
the total shot noise for the power spectrum, equal to data.shotnoise + randoms.shotnoise; this should be subtracted from the monopole.
BoxSize :
the size of the Cartesian box used to grid the data and randoms objects on a Cartesian mesh.

For further details on the meta-data, see the documentation.

Type: dict

save(output)[source]¶

Save the ConvolvedFFTPower result to disk.

The format is currently json.

Parameters: output (str) – the name of the file to dump the JSON results to

shotnoise(alpha)[source]¶

Compute the power spectrum shot noise, using either the data or randoms source.

This computes:

\[S = \sum (w_\mathrm{comp} w_\mathrm{fkp})^2\]

References

see Eq. 15 of Beutler et al. 2014, “The clustering of galaxies in the SDSS-III Baryon Oscillation Spectroscopic Survey: testing gravity with redshift space distortions using the power spectrum multipoles”

to_pkmu(mu_edges, max_ell)[source]¶

Invert the measured multipoles \(P_\ell(k)\) into power spectrum wedges, \(P(k,\mu)\).

Parameters

mu_edges (array_like) – the edges of the \(\mu\) bins
max_ell (int) – the maximum multipole to use when computing the wedges; all even multipoles with \(ell\) less than or equal to this number are included

Returns

pkmu – a data set holding the \(P(k,\mu)\) wedges

Return type

BinnedStatistic

class nbodykit.algorithms.CylindricalGroups(source, rankby, rperp, rpar, flat_sky_los=None, periodic=False, BoxSize=None)[source]¶

Compute groups of objects using a cylindrical grouping method. We identify all satellites within a given cylindrical volume around a central object.

Results are computed when the object is inititalized, and the result is stored in the groups attribute; see the documenation of run().

Input parameters are stored in the attrs attribute dictionary.

Parameters

source (subclass of CatalogSource) – the input source of particles providing the ‘Position’ column; the grouping algorithm is run on this catalog
rperp (float) – the radius of the cylinder in the sky plane (i.e., perpendicular to the line-of-sight)
rpar (float) – the radius along the line-of-sight direction; this is 1/2 the height of the cylinder
rankby (str, list, None) – a single or list of column names to rank order the input source by before computing the cylindrical groups, such that objects ranked first are marked as CGM centrals; if None is supplied, no sorting will be done
flat_sky_los (bool, optional) – a unit vector of length 3 providing the line-of-sight direction, assuming a fixed line-of-sight across the box, e.g., [0,0,1] to use the z-axis. If None, the observer at (0,0,0) is used to compute the line-of-sight for each pair
periodic (bool, optional) – whether to use periodic boundary conditions
BoxSize (float, 3-vector, optional) – the size of the box of the input data; must be provided as a keyword or in source.attrs if periodic=True

References

Okumura, Teppei, et al. “Reconstruction of halo power spectrum from redshift-space galaxy distribution: cylinder-grouping method and halo exclusion effect”, arXiv:1611.04165, 2016.

Methods

run()

Compute the cylindrical groups, saving the results to the groups attribute

run()[source]¶

Compute the cylindrical groups, saving the results to the groups attribute

groups¶

a catalog holding the result of the grouping. The length of the catalog is equal to the length of the input size, i.e., the length is equal to the size attribute. The relevant fields are:

cgm_type :
a flag specifying the type for each object, with 0 specifying CGM central and 1 denoting CGM satellite
cgm_haloid :
The index of the CGM object this object belongs to; an integer between 0 and the total number of CGM halos
num_cgm_sats :
The number of satellites in the CGM halo

Type: ArrayCatalog

class nbodykit.algorithms.FFTCorr(first, mode, Nmesh=None, BoxSize=None, second=None, los=[0, 0, 1], Nmu=5, dr=None, rmin=0.0, rmax=None, poles=[])[source]¶

Algorithm to compute the 1d or 2d correlation and/or multipoles in a periodic box, using a Fast Fourier Transform (FFT).

This computes the power spectrum as the square of the Fourier modes of the density field, which are computed via a FFT. Then it is transformed back to obtain the correlation function.

Results are computed when the object is inititalized. See the documenation of run() for the attributes storing the results.

Note

This is very similar to FFTPower.

Parameters

first (CatalogSource, MeshSource) – the source for the first field; if a CatalogSource is provided, it is automatically converted to MeshSource using the default painting parameters (via to_mesh())
mode ({'1d', '2d'}) – compute either 1d or 2d power spectra
Nmesh (int, optional) – the number of cells per side in the particle mesh used to paint the source
BoxSize (int, 3-vector, optional) – the size of the box
second (CatalogSource, MeshSource, optional) – the second source for cross-correlations
los (array_like , optional) – the direction to use as the line-of-sight; must be a unit vector
Nmu (int, optional) – the number of mu bins to use from \(\mu=[0,1]\); if mode = 1d, then Nmu is set to 1
dr (float, optional) – the linear spacing of r bins to use; if not provided, the fundamental mode of the box is used; if dr=0, the bins are tight, such that each bin has a unique r value.
rmin (float, optional) – the lower edge of the first r bin to use
rmax (float, optional) – the upper limit of the last r bin to use
poles (list of int, optional) – a list of multipole numbers ell to compute \(\xi_\ell(r)\) from \(\xi(r,\mu)\)

Methods

`load`(output[, comm])	Load a saved result.
`run`()	Compute the correlation function in a periodic box, using FFTs.
`save`(output)	Save the result to disk.

classmethod load(output, comm=None)¶: Load a saved result. The result has been saved to disk with save().

run()[source]¶

Compute the correlation function in a periodic box, using FFTs.

Returns

corr (BinnedStatistic) – a BinnedStatistic object that holds the measured \(\xi(r)\) or \(\xi(r,\mu)\). It stores the following variables:
- r :
  the mean value for each r bin
- mumode=2d only
  the mean value for each mu bin
- corr :
  real array storing the correlation function
- modes :
  the number of modes averaged together in each bin
poles (BinnedStatistic or None) – a BinnedStatistic object to hold the multipole results \(\xi_\ell(r)\); if no multipoles were requested by the user, this is None. It stores the following variables:
- r :
  the mean value for each r bin
- power_L :
  complex array storing the real and imaginary components for the \(\ell=L\) multipole
- modes :
  the number of modes averaged together in each bin
corr.attrs, poles.attrs (dict) – dictionary of meta-data; in addition to storing the input parameters, it includes the following fields computed during the algorithm execution:
- shotnoisefloat
  the power Poisson shot noise, equal to \(V/N\), where \(V\) is the volume of the box and N is the total number of objects; if a cross-correlation is computed, this will be equal to zero
- N1int
  the total number of objects in the first source
- N2int
  the total number of objects in the second source

save(output)¶: Save the result to disk. The format is currently JSON.

class nbodykit.algorithms.FFTPower(first, mode, Nmesh=None, BoxSize=None, second=None, los=[0, 0, 1], Nmu=5, dk=None, kmin=0.0, kmax=None, poles=[])[source]¶

Algorithm to compute the 1d or 2d power spectrum and/or multipoles in a periodic box, using a Fast Fourier Transform (FFT).

This computes the power spectrum as the square of the Fourier modes of the density field, which are computed via a FFT.

Results are computed when the object is inititalized. See the documenation of run() for the attributes storing the results.

Note

A full tutorial on the class is available in the documentation here.

Parameters

first (CatalogSource, MeshSource) – the source for the first field; if a CatalogSource is provided, it is automatically converted to MeshSource using the default painting parameters (via to_mesh())
mode ({'1d', '2d'}) – compute either 1d or 2d power spectra
Nmesh (int, optional) – the number of cells per side in the particle mesh used to paint the source
BoxSize (int, 3-vector, optional) – the size of the box
second (CatalogSource, MeshSource, optional) – the second source for cross-correlations
los (array_like , optional) – the direction to use as the line-of-sight; must be a unit vector
Nmu (int, optional) – the number of mu bins to use from \(\mu=[0,1]\); if mode = 1d, then Nmu is set to 1
dk (float, optional) – the linear spacing of k bins to use; if not provided, the fundamental mode of the box is used; if dk=0 is set, use fine bins such that the modes contributing to the bin has identical modulus.
kmin (float, optional) – the lower edge of the first k bin to use
kmin – the upper limit of the last k bin to use (not exact)
poles (list of int, optional) – a list of multipole numbers ell to compute \(P_\ell(k)\) from \(P(k,\mu)\)

Methods

`load`(output[, comm])	Load a saved result.
`run`()	Compute the power spectrum in a periodic box, using FFTs.
`save`(output)	Save the result to disk.

classmethod load(output, comm=None)¶: Load a saved result. The result has been saved to disk with save().

run()[source]¶

Compute the power spectrum in a periodic box, using FFTs.

Returns

power (BinnedStatistic) – a BinnedStatistic object that holds the measured \(P(k)\) or \(P(k,\mu)\). It stores the following variables:
- k :
  the mean value for each k bin
- mumode=2d only
  the mean value for each mu bin
- power :
  complex array storing the real and imaginary components of the power
- modes :
  the number of Fourier modes averaged together in each bin
poles (BinnedStatistic or None) – a BinnedStatistic object to hold the multipole results \(P_\ell(k)\); if no multipoles were requested by the user, this is None. It stores the following variables:
- k :
  the mean value for each k bin
- power_L :
  complex array storing the real and imaginary components for the \(\ell=L\) multipole
- modes :
  the number of Fourier modes averaged together in each bin
power.attrs, poles.attrs (dict) – dictionary of meta-data; in addition to storing the input parameters, it includes the following fields computed during the algorithm execution:
- shotnoisefloat
  the power Poisson shot noise, equal to \(V/N\), where \(V\) is the volume of the box and N is the total number of objects; if a cross-correlation is computed, this will be equal to zero
- N1int
  the total number of objects in the first source
- N2int
  the total number of objects in the second source

save(output)¶: Save the result to disk. The format is currently JSON.

class nbodykit.algorithms.FFTRecon(data, ran, Nmesh, bias=1.0, f=0.0, los=[0, 0, 1], R=20, position='Position', revert_rsd_random=False, scheme='LGS', BoxSize=None)[source]¶

FFT based Lagrangian reconstruction algorithm in a periodic box.

References

Eisenstein et al, 2007 http://adsabs.harvard.edu/abs/2007ApJ…664..675E Section 3, paragraph starting with ‘Restoring in full the …’

We follow a cleaner description in Schmitfull et al 2015,

http://adsabs.harvard.edu/cgi-bin/bib_query?arXiv:1508.06972

Table I, and text below. Schemes are LGS, LF2 and LRR.

A slight difference against the paper is that Redshift distortion and bias are corrected in the linear order. The Random shifting followed Martin White’s suggestion to exclude the RSD by default. (with default revert_rsd_random=False.)

Parameters

data (CatalogSource,) – the data catalog, e.g. halos. data.attrs[‘BoxSize’] is used if argument BoxSize is not given.
ran (CatalogSource) – the random catalog, e.g. from a UniformCatalog object.
Nmesh (int) – The size of the FFT Mesh. Rule of thumb is that the size of a mesh cell shall be 2 ~ 4 times smaller than the smoothing length, R.
revert_rsd_random (boolean) – Revert the rsd for randoms as well as data. There are two conventions. either reverting rsd displacement in data displacement only(False) or in both data and randoms (True). Default is False.
R (float) – The radius of smoothing. 10 to 20 Mpc/h is usually cool.
bias (float) – The bias of the data catalog.
f (float) – The growth rate; if non-zero, correct for RSD
los (list) – The direction of the line of sight for RSD. Usually (default) [0, 0, 1].
position (string) – column to use for picking up the Position of the objects.
BoxSize (float or array_like) – the size of the periodic box, default is to infer from the data.
scheme (string) – The reconstruction scheme. LGS is the standard reconstruction (Lagrangian growth shift). LF2 is the F2 Lagrangian reconstruction. LRR is the random-random Lagrangian reconstruction.

Attributes

actions: A list of actions to apply to the density field when interpolating to the mesh.
attrs: A dictionary storing relevant meta-data about the CatalogSource.

Methods

`apply`(func[, kind, mode])	Return a view of the mesh, with `actions` updated to apply the specified function, either in Fourier space or configuration space, based on `mode`
`compute`([mode, Nmesh])	Compute / Fetch the mesh object into memory as a RealField or ComplexField object.
`preview`([axes, Nmesh, root])	Gather the mesh into as a numpy array, with (reduced) resolution.
`save`(output[, dataset, mode])	Save the mesh as a `BigFileMesh` on disk, either in real or complex space.
`to_complex_field`([out])	Convert the mesh source to the Fourier-space field, returning a `pmesh.pm.ComplexField` object.
`to_field`([mode, out])	Return the mesh as a `pmesh` Field object, either in Fourier space or configuration space, based on `mode`.
`to_real_field`()	Convert the mesh source to the configuration-space field, returning a `pmesh.pm.RealField` object.
`view`()	Return a "view" of the MeshSource, in the spirit of numpy's ndarray view.

paint
run
work_with

__finalize__(other)¶

Finalize the creation of a MeshSource object by copying over attributes from a second MeshSource.

Parameters: other (MeshSource) – the second MeshSource to copy over attributes from

__len__()¶: Length of a mesh source is zero

property actions¶

A list of actions to apply to the density field when interpolating to the mesh.

This stores tuples of (mode, func, kind); see apply() for more details.

apply(func, kind='wavenumber', mode='complex')¶

Return a view of the mesh, with actions updated to apply the specified function, either in Fourier space or configuration space, based on mode

Parameters

func (callable or a MeshFilter object) – func(x, y) where x is a list of r (k) values that broadcasts into a full array, when mode is ‘real’ (‘complex’); the value of x depends on kind. y is the value of the mesh field on the corresponding locations.
kind (string, optional) –
if a MeshFilter object is given as func, this is ignored. The kind of value in x.
- When mode is ‘complex’:
  - ’wavenumber’ means wavenumber from [- 2 pi / L * N / 2, 2 pi / L * N / 2).
  - ’circular’ means circular frequency from [- pi, pi).
  - ’index’ means [0, Nmesh )
- When mode is ‘real’:
  - ’relative’ means distance from [-0.5 Boxsize, 0.5 BoxSize).
  - ’index’ means [0, Nmesh )
mode ('complex' or 'real', optional) – if a MeshFilter object is given as func, this is ignored. whether to apply the function to the mesh in configuration space or Fourier space

Returns

a view of the mesh object with the actions attribute updated to include the new action

Return type

MeshSource

property attrs¶: A dictionary storing relevant meta-data about the CatalogSource.

compute(mode='real', Nmesh=None)¶: Compute / Fetch the mesh object into memory as a RealField or ComplexField object.

preview(axes=None, Nmesh=None, root=0)¶

Gather the mesh into as a numpy array, with (reduced) resolution. The result is broadcast to all ranks, so this uses \(\mathrm{Nmesh}^3\) per rank.

Parameters

Nmesh (int, array_like) – The desired Nmesh of the result. Be aware this function allocates memory to hold a full Nmesh on each rank.
axes (int, array_like) – The axes to project the preview onto., e.g. (0, 1)
root (int, optional) – the rank number to treat as root when gathering to a single rank

Returns

out – An numpy array holding the real density field.

Return type

array_like

save(output, dataset='Field', mode='real')¶

Save the mesh as a BigFileMesh on disk, either in real or complex space.

Parameters

output (str) – name of the bigfile file
dataset (str, optional) – name of the bigfile data set where the field is stored
mode (str, optional) – real or complex; the form of the field to store

to_complex_field(out=None)¶

Convert the mesh source to the Fourier-space field, returning a pmesh.pm.ComplexField object.

Not implemented in the base class, unless object is a view.

to_field(mode='real', out=None)¶

Return the mesh as a pmesh Field object, either in Fourier space or configuration space, based on mode.

This will call to_real_field() or to_complex_field() based on mode.

Parameters: mode ('real' or 'complex') – the return type of the field
Returns: either a RealField of ComplexField, storing the value of the field on the mesh
Return type: RealField, ComplexField

to_real_field()[source]¶

Convert the mesh source to the configuration-space field, returning a pmesh.pm.RealField object.

Not implemented in the base class, unless object is a view.

view()¶

Return a “view” of the MeshSource, in the spirit of numpy’s ndarray view.

This returns a new MeshSource whose memory is owned by self.

Note that for CatalogMesh objects, this is overidden by the CatalogSource.view function.

class nbodykit.algorithms.FKPCatalog(data, randoms, BoxSize=None, BoxPad=0.02, P0=None, nbar='NZ')[source]¶

An interface for simultaneous modeling of a data CatalogSource and a randoms CatalogSource, in the spirit of Feldman, Kaiser, and Peacock, 1994.

This main functionality of this class is:

provide a uniform interface to accessing columns from the data CatalogSource and randoms CatalogSource, using column names prefixed with “data/” or “randoms/”
compute the shared BoxSize of the source, by finding the maximum Cartesian extent of the randoms
provide an interface to a mesh object, which knows how to paint the FKP density field from the data and randoms

Parameters

data (CatalogSource) – the CatalogSource of particles representing the data catalog
randoms (CatalogSource, or None) – the CatalogSource of particles representing the randoms catalog if None is given an empty catalog is used.
BoxSize (float, 3-vector, optional) – the size of the Cartesian box to use for the unified data and randoms; if not provided, the maximum Cartesian extent of the randoms defines the box
BoxPad (float, 3-vector, optional) – optionally apply this additional buffer to the extent of the Cartesian box
nbar (str, optional) – the name of the column specifying the number density as a function of redshift. default is NZ.
P0 (float or None) – if not None, a column named FKPWeight is added to data and random based on nbar.

References

Feldman, Kaiser, and Peacock, 1994

Attributes

attrs: A dictionary storing relevant meta-data about the CatalogSource.
columns: Columns for individual species can be accessed using a species/ prefix and the column name, i.e., data/Position.
hardcolumns: Hardcolumn of the form species/name
species: List of species names

Methods

`compute`(args, *kwargs)	Our version of `dask.compute()` that computes multiple delayed dask collections at once.
`copy`()	Return a shallow copy of the object, where each column is a reference of the corresponding column in `self`.
`get_hardcolumn`(col)	Construct and return a hard-coded column.
`make_column`(array)	Utility function to convert an array-like object to a `dask.array.Array`.
`read`(columns)	Return the requested columns as dask arrays.
`save`(output[, columns, dataset, datasets, ...])	Save the CatalogSource to a `bigfile.BigFile`.
`to_mesh`([Nmesh, BoxSize, BoxCenter, dtype, ...])	Convert the FKPCatalog to a mesh, which knows how to "paint" the FKP density field.
`to_subvolumes`([domain, position, columns])	Domain Decompose a catalog, sending items to the ranks according to the supplied domain object.
`view`([type])	Return a "view" of the CatalogSource object, with the returned type set by `type`.

create_instance

__delitem__(col)¶: Delete a column of the form species/column

__finalize__(other)¶

Finalize the creation of a CatalogSource object by copying over any additional attributes from a second CatalogSource.

The idea here is to only copy over attributes that are similar to meta-data, so we do not copy some of the core attributes of the CatalogSource object.

Parameters: other – the second object to copy over attributes from; it needs to be a subclass of CatalogSourcBase for attributes to be copied
Returns: return self, with the added attributes
Return type: CatalogSource

__getitem__(key)¶

This provides access to the underlying data in two ways:

The CatalogSource object for a species can be accessed if key is a species name.
Individual columns for a species can be accessed using the format: species/column.

__setitem__(col, value)¶: Add columns to any of the species catalogs.

Note

New column names should be prefixed by ‘species/’ where ‘species’ is a name in the species attribute.

property attrs¶: A dictionary storing relevant meta-data about the CatalogSource.

property columns¶: Columns for individual species can be accessed using a species/ prefix and the column name, i.e., data/Position.

compute(*args, **kwargs)¶

Our version of dask.compute() that computes multiple delayed dask collections at once.

This should be called on the return value of read() to converts any dask arrays to numpy arrays.

. note::: If the base attribute is set, compute() will called using base instead of self.

Parameters: args (object) – Any number of objects. If the object is a dask collection, it’s computed and the result is returned. Otherwise it’s passed through unchanged.

copy()¶

Return a shallow copy of the object, where each column is a reference of the corresponding column in self.

Note

No copy of data is made.

Note

This is different from view in that the attributes dictionary of the copy no longer related to self.

Returns: a new CatalogSource that holds all of the data columns of self
Return type: CatalogSource

get_hardcolumn(col)¶

Construct and return a hard-coded column.

These are usually produced by calling member functions marked by the @column decorator.

Subclasses may override this method and the hardcolumns attribute to bypass the decorator logic.

Note

If the base attribute is set, get_hardcolumn() will called using base instead of self.

property hardcolumns¶: Hardcolumn of the form species/name

static make_column(array)¶

Utility function to convert an array-like object to a dask.array.Array.

Note

The dask array chunk size is controlled via the dask_chunk_size global option. See set_options.

Parameters: array (array_like) – an array-like object; can be a dask array, numpy array, ColumnAccessor, or other non-scalar array-like object
Returns: a dask array initialized from array
Return type: dask.array.Array

read(columns)¶

Return the requested columns as dask arrays.

Parameters: columns (list of str) – the names of the requested columns
Returns: the list of column data, in the form of dask arrays
Return type: list of dask.array.Array

save(output, columns=None, dataset=None, datasets=None, header='Header', compute=True)¶

Save the CatalogSource to a bigfile.BigFile.

Only the selected columns are saved and attrs are saved in header. The attrs of columns are stored in the datasets.

Parameters

output (str) – the name of the file to write to
columns (list of str) – the names of the columns to save in the file, or None to use all columns
dataset (str, optional) – dataset to store the columns under.
datasets (list of str, optional) – names for the data set where each column is stored; defaults to the name of the column (deprecated)
header (str, optional, or None) – the name of the data set holding the header information, where attrs is stored if header is None, do not save the header.
compute (boolean, default True) – if True, wait till the store operations finish if False, return a dictionary with column name and a future object for the store. use dask.compute() to wait for the store operations on the result.

property species¶: List of species names

to_mesh(Nmesh=None, BoxSize=None, BoxCenter=None, dtype='c16', interlaced=False, compensated=False, resampler='cic', fkp_weight='FKPWeight', comp_weight='Weight', selection='Selection', position='Position', bbox_from_species=None, window=None, nbar=None)[source]¶

Convert the FKPCatalog to a mesh, which knows how to “paint” the FKP density field.

Additional keywords to the to_mesh() function include the FKP weight column, completeness weight column, and the column specifying the number density as a function of redshift.

Parameters

Nmesh (int, 3-vector, optional) – the number of cells per box side; if not specified in attrs, this must be provided
dtype (str, dtype, optional) – the data type of the mesh when painting. dtype=’f8’ or ‘f4’ assumes Hermitian symmetry of the input field (delta(x) = delta^{*}(-x)), and stores it as an N x N x N/2+1 real array. This speeds evaluation of even multipoles but yields incorrect odd multipoles in the presence of the wide-angle effect. dtype=’c16’ or ‘c8’ stores the field as an N x N x N complex array to correctly recover the odd multipoles.
interlaced (bool, optional) – whether to use interlacing to reduce aliasing when painting the particles on the mesh
compensated (bool, optional) – whether to apply a Fourier-space transfer function to account for the effects of the gridding + aliasing
resampler (str, optional) – the string name of the resampler to use when interpolating the particles to the mesh; see pmesh.window.methods for choices
fkp_weight (str, optional) – the name of the column in the source specifying the FKP weight; this weight is applied to the FKP density field: n_data - alpha*n_randoms
comp_weight (str, optional) – the name of the column in the source specifying the completeness weight; this weight is applied to the individual fields, either n_data or n_random
selection (str, optional) – the name of the column used to select a subset of the source when painting
position (str, optional) – the name of the column that specifies the position data of the objects in the catalog
bbox_from_species (str, optional) – if given, use the species to infer a bbox. if not give, will try random, then data (if random is empty)
window (deprecated.) – use resampler=
nbar (deprecated.) – deprecated. set nbar in the call to FKPCatalog()

to_subvolumes(domain=None, position='Position', columns=None)¶

Domain Decompose a catalog, sending items to the ranks according to the supplied domain object. Using the position column as the Position.

This will read in the full position array and all of the requested columns.

Parameters

domain (pmesh.domain.GridND object, or None) – The domain to distribute the catalog. If None, try to evenly divide spatially. An easiest way to find a domain object is to use pm.domain, where pm is a pmesh.pm.ParticleMesh object.
position (string_like) – column to use to compute the position.
columns (list of string_like) – columns to include in the new catalog, if not supplied, all catalogs will be exchanged.

Returns

A decomposed catalog source, where each rank only contains objects belongs to the rank as claimed by the domain object.

self.attrs are carried over as a shallow copy to the returned object.

Return type

CatalogSource

view(type=None)¶

Return a “view” of the CatalogSource object, with the returned type set by type.

This initializes a new empty class of type type and attaches attributes to it via the __finalize__() mechanism.

Parameters: type (Python type) – the desired class type of the returned object.

nbodykit.algorithms.FKPPower¶: alias of nbodykit.algorithms.convpower.fkp.ConvolvedFFTPower

nbodykit.algorithms.FKPWeightFromNbar(P0, nbar)[source]¶

Create FKPWeight from nbar, the number density of objects per redshift.

Parameters

P0 (float) – the FKP normalization, when P0 == 0, returns 1.0, ignoring size / shape of nbar.
nbar (array_like) – the number density of objects per redshift

Returns

FKPWeight (the FKPWeight, can be assigned to a catalog as a column)
to be consumed ConvolvedFFTPower

class nbodykit.algorithms.FOF(source, linking_length, nmin, absolute=False, periodic=True, domain_factor=1)[source]¶

A friends-of-friends halo finder that computes the label for each particle, denoting which halo it belongs to.

Friends-of-friends was first used by Davis et al 1985 to define halos in hierachical structure formation of cosmological simulations. The algorithm is also known as DBSCAN in computer science. The subroutine here implements a parallel version of the FOF.

The underlying local FOF algorithm is from kdcount.cluster, which is an adaptation of the implementation in Volker Springel’s Gadget and Martin White’s PM.

Results are computed when the object is inititalized. See the documenation of run() for the attributes storing the results.

For returning a CatalogSource of the FOF halos, see find_features() and for computing a halo catalog with added analytic information for a specific redshift and cosmology, see to_halos().

Parameters

source (CatalogSource) – the source to run the FOF algorithm on; must support ‘Position’
linking_length (float) – the linking length, either in absolute units, or relative to the mean particle separation
nmin (int) – halo with fewer particles are ignored
absolute (bool, optional) – If True, the linking length is in absolute units, otherwise it is relative to the mean particle separation; default is False

Methods

`find_features`([peakcolumn])	Based on the particle labels, identify the groups, and return the center-of-mass `CMPosition`, `CMVelocity`, and Length of each feature.
`run`()	Run the FOF algorithm.
`to_halos`(particle_mass, cosmo, redshift[, ...])	Return a `HaloCatalog`, holding the center-of-mass position and velocity of each FOF halo, as well as the properly scaled mass, for a given cosmology and redshift.

find_features(peakcolumn=None)[source]¶

Based on the particle labels, identify the groups, and return the center-of-mass CMPosition, CMVelocity, and Length of each feature.

If a peakcolumn is given, the PeakPosition and PeakVelocity is also calculated for the particle at the peak value of the column.

Data is scattered evenly across all ranks.

Returns: a source holding the (‘CMPosition’, ‘CMVelocity’, ‘Length’) of each feature; optionaly, PeakPosition, PeakVelocity are also included if peakcolumn is not None
Return type: ArrayCatalog

run()[source]¶

Run the FOF algorithm. This function returns nothing, but does attach several attributes to the class instance:

attr:labels
max_labels

Note

The labels array is scattered evenly across all ranks.

labels¶

an array the label that specifies which FOF halo each particle belongs to

Type: array_like, length: size

max_label¶

the maximum label across all ranks; this represents the total number of FOF halos found

Type: int

to_halos(particle_mass, cosmo, redshift, mdef='vir', posdef='cm', peakcolumn='Density')[source]¶

Return a HaloCatalog, holding the center-of-mass position and velocity of each FOF halo, as well as the properly scaled mass, for a given cosmology and redshift.

The returned catalog also has default analytic prescriptions for halo radius and concentration.

The data is scattered evenly across all ranks.

Parameters

particle_mass (float) – the particle mass, which is used to convert the number of particles in each halo to a total mass
cosmo (nbodykit.cosmology.core.Cosmology) – the cosmology of the catalog
redshift (float) – the redshift of the catalog
mdef (str, optional) – string specifying mass definition, used for computing default halo radii and concentration; should be ‘vir’ or ‘XXXc’ or ‘XXXm’ where ‘XXX’ is an int specifying the overdensity
posdef (str, optional) – position, can be cm (center of mass) or peak (particle with maximum value on a column)
peakcolumn (str , optional) – when posdef is ‘peak’, this is the column in source for identifying particles at the peak for the position and velocity.

Returns

a HaloCatalog at the specified cosmology and redshift

Return type

HaloCatalog

class nbodykit.algorithms.FiberCollisions(ra, dec, collision_radius=0.017222222222222226, seed=None, degrees=True, comm=None)[source]¶

Run an angular FOF algorithm to determine fiber collision groups from an input catalog, and then assign fibers such that the maximum amount of object receive a fiber.

This amounts to determining the following population of objects:

population 1:
the maximal “clean” sample of objects in which each object is not angularly collided with any other object in this subsample
population 2:
the potentially-collided objects; these objects are those that are fiber collided + those that have been “resolved” due to multiple coverage in tile overlap regions

Results are computed when the object is inititalized. See the documenation of run() for the attributes storing the results.

Parameters

ra (array_like) – the right ascension coordinate column
dec (array_like) – the declination coordinate column
collision_radius (float, optional) – the size of the angular collision radius (in degrees); default is 62 arcseconds
seed (int, optional) – the random seed to use when determining which objects get fibers
degrees (bool, optional) – set to True if the units of ra and dec are degrees

References

Guo et al. 2010

Methods

run()

Run the fiber assignment algorithm.

run()[source]¶

Run the fiber assignment algorithm. This attaches the following attribute to the object:

labels

Note

The labels attribute has a 1-to-1 correspondence with the rows in the input source.

labels¶

a CatalogSource that has the following columns:

Label :
the group labels for each object in the input CatalogSource; label == 0 objects are not in a group
Collided :
a flag array specifying which objects are collided, i.e., do not receive a fiber
NeighborID :
for those objects that are collided, this gives the (global) index of the nearest neighbor on the sky (0-indexed) in the input catalog source, else it is set to -1

Type: ArrayCatalog; size: size

class nbodykit.algorithms.KDDensity(source, margin=1.0)[source]¶

Estimate a proxy density based on the distance to the nearest neighbor. The result is proportional to the density but the scale is unspecified.

Results are computed when the object is inititalized. See the documenation of run() for the attributes storing the results.

Parameters

source (CatalogSource) – the input source of particles to compute the proxy density on; must specify the ‘Position’ column
margin (float, optional) – Padding region per parallel domain; relative to the mean seperation

Methods

run()

Compute the density proxy.

run()[source]¶

Compute the density proxy. This attaches the following attribute:

density

density¶

a unit-less, proxy density value for each object on the local rank. This is computed as the inverse cube of the distance to the closest, nearest neighbor

Type: array_like, length: size

class nbodykit.algorithms.ProjectedFFTPower(first, Nmesh=None, BoxSize=None, second=None, axes=(0, 1), dk=None, kmin=0.0)[source]¶

The power spectrum of a field in a periodic box, projected over certain axes.

This is not really always physically meaningful, but convenient for making sense of Lyman-Alpha forest or lensing maps.

This is usually called the 1d power spectrum or 2d power spectrum.

Results are computed when the object is inititalized. See the documenation of run() for the attributes storing the results.

Parameters

first (CatalogSource, MeshSource) – the source for the first field; if a CatalogSource is provided, it is automatically converted to MeshSource using the default painting parameters (via to_mesh())
Nmesh (int, optional) – the number of cells per side in the particle mesh used to paint the source
BoxSize (int, 3-vector, optional) – the size of the box
second (CatalogSource, MeshSource, optional) – the second source for cross-correlations
axes (tuple) – axes to measure the power on. The axes not in the list will be averaged out. For example: - (0, 1) : project to x,y and measure power - (0) : project to x and measure power.
dk (float, optional) – the linear spacing of k bins to use; if not provided, the fundamental mode of the box is used
kmin (float, optional) – the lower edge of the first k bin to use

Methods

`load`(output[, comm])	Load a saved result.
`run`()	Run the algorithm.
`save`(output)	Save the result to disk.

classmethod load(output, comm=None)¶: Load a saved result. The result has been saved to disk with save().

run()[source]¶

Run the algorithm. This attaches the following attributes to the class:

edges
power

edges¶

the edges of the wavenumber bins

Type: array_like

power¶

a BinnedStatistic object that holds the projected power. It stores the following variables:

k :
the mean value for each k bin
power :
complex array holding the real and imaginary components of the projected power
modes :
the number of Fourier modes averaged together in each bin

Type: BinnedStatistic

save(output)¶: Save the result to disk. The format is currently JSON.

class nbodykit.algorithms.RedshiftHistogram(source, fsky, cosmo, bins=None, redshift='Redshift', weight=None)[source]¶

Compute the mean number density as a function of redshift \(n(z)\) from an input CatalogSource of particles.

Results are computed when the object is inititalized. See the documenation of run() for the attributes storing the results.

Note

The units of the number density are \((\mathrm{Mpc}/h)^{-3}\)

Parameters

source (CatalogSource) – the source of particles holding the redshift column to histogram
fsky (float) – the sky area fraction, which is used in the volume calculation when normalizing \(n(z)\)
cosmo (nbodykit.cosmology.core.Cosmology) – the cosmological parameters, which are used to compute the volume from redshift shells when normalizing \(n(z)\)
bins (int or sequence of scalars, optional) – If bins is an int, it defines the number of equal-width bins in the given range. If bins is a sequence, it defines the bin edges, including the rightmost edge, allowing for non-uniform bin widths. If not provided, Scott’s rule is used to estimate the optimal bin width from the input data (default)
redshift (str, optional) – the name of the column specifying the redshift data
weight (str, optional) – the name of the column specifying weights to use when histogramming the data

Methods

`interpolate`(z[, ext])	Interpoalte dndz as a function of redshift.
`load`(output[, comm])	Load a saved RedshiftHistogram result.
`run`()	Run the algorithm, which computes the histogram.
`save`(output)	Save the RedshiftHistogram result to disk.

interpolate(z, ext='zeros')[source]¶

Interpoalte dndz as a function of redshift.

The interpolation acts as a band pass filter, removing small scale fluctuations in the estimator.

Parameters

z (array_like) – redshift
ext ('extrapolate', 'zeros', 'raise', 'const') – how to deal with values out of bound.

Returns

Return type

n(z)

classmethod load(output, comm=None)[source]¶

Load a saved RedshiftHistogram result.

The result has been saved to disk with RedshiftHistogram.save().

run()[source]¶

Run the algorithm, which computes the histogram. This function does not return anything, but adds the following attributes to the class:

bin_edges
bin_centers
dV
nbar

Note

All ranks store the same result attributes.

bin_edges¶

the edges of the redshift bins

Type: array_like

bin_centers¶

the center values of each redshift bin

Type: array_like

dV¶

the volume of each redshift shell in units of \((\mathrm{Mpc}/h)^3\)

Type: array_like

nbar¶

the values of the redshift histogram, normalized to number density (in units of \((\mathrm{Mpc}/h)^{-3}\))

Type: array_like

save(output)[source]¶

Save the RedshiftHistogram result to disk.

The format is JSON.

class nbodykit.algorithms.SimulationBox2PCF(mode, data1, edges, Nmu=None, pimax=None, data2=None, randoms1=None, randoms2=None, R1R2=None, periodic=True, BoxSize=None, los='z', weight='Weight', position='Position', show_progress=False, **config)[source]¶

Compute the two-point correlation function for data in a simulation box as a function of \(r\), \((r,\mu)\), \((r_p, \pi)\), or \(\theta\) using pair counting.

This uses analytic randoms when using periodic conditions, unless a randoms catalog is specified. The “natural” estimator (DD/RR-1) is used in the former case, and the Landy-Szalay estimator (DD/RR - 2DR/RR + 1) in the latter case.

Note

When using analytic randoms, the expected counts are assumed to be unweighted.

Parameters

mode ('1d', '2d', 'projected', 'angular') – the type of two-point correlation function to compute; see the Notes below
data1 (CatalogSource) – the data catalog;
edges (array_like) – the separation bin edges along the first coordinate dimension; depending on mode, the options are \(r\), \(r_p\), or \(\theta\). Expected units for distances are \(\mathrm{Mpc}/h\) and degrees for angles. Length of nbins+1
Nmu (int, optional) – the number of \(\mu\) bins, ranging from 0 to 1; requred if mode='2d'
pimax (float, optional) – The maximum separation along the line-of-sight when mode='projected'. Distances along the \(\pi\) direction are binned with unit depth. For instance, if pimax=40, then 40 bins will be created along the \(\pi\) direction.
data2 (CatalogSource, optional) – the second data catalog to cross-correlate;
randoms1 (CatalogSource, optional) – the catalog specifying the un-clustered, random distribution for data1; if not provided, analytic randoms will be used
randoms2 (CatalogSource, optional) – the catalog specifying the un-clustered, random distribution for data2; if not provided, analytic randoms will be used
R1R2 (SimulationBoxPairCount, optional) – if provided, random pairs R1R2 are not recalculated in the Landy-Szalay estimator
periodic (bool, optional) – whether to use periodic boundary conditions
BoxSize (float, 3-vector, optional) – the size of the box; if ‘BoxSize’ is not provided in the source ‘attrs’, it must be provided here
los ('x', 'y', 'z'; int, optional) – the axis of the simulation box to treat as the line-of-sight direction; this can be provided as string identifying one of ‘x’, ‘y’, ‘z’ or the equivalent integer number of the axis
weight (str, optional) – the name of the column in the source specifying the particle weights
position (str, optional) – the name of the column in the source specifying the particle positions
show_progress (bool, optional) – if True, perform the pair counting calculation in 10 iterations, logging the progress after each iteration; this is useful for understanding the scaling of the code
**config (key/value pairs) – additional keywords to pass to the Corrfunc function

Notes

This class can compute correlation functions using several different coordinate choices, based on the value of the input argument mode. The choices are:

mode='1d' : compute pairs as a function of the 3D separation \(r\)
mode='2d' : compute pairs as a function of the 3D separation \(r\) and the cosine of the angle to the line-of-sight, \(\mu\)
mode='projected' : compute pairs as a function of distance perpendicular and parallel to the line-of-sight, \(r_p\) and \(\pi\)
mode='angular' : compute pairs as a function of angle on the sky, \(\theta\)

If mode='projected', the projected correlation function \(w_p(r_p)\) is also computed, using the input \(\pi_\mathrm{max}\) value.

Methods

`load`(output[, comm])	Load a result has been saved to disk with `save()`.
`run`()	Run the two-point correlation function algorithm.
`save`(output)	Save result as a JSON file with name `output`

classmethod load(output, comm=None)¶: Load a result has been saved to disk with save().

run()[source]¶

Run the two-point correlation function algorithm. This attaches the following attributes:

SimulationBox2PCF.D1D2
SimulationBox2PCF.D1R2
SimulationBox2PCF.D2R1
SimulationBox2PCF.R1R2
SimulationBox2PCF.corr
SimulationBox2PCF.wp (if mode='projected')

D1D2¶

the data1 - data2 pair counts

Type: BinnedStatistic

D1R2¶

the data1 - randoms2 pair counts

Type: BinnedStatistic

D2R1¶

the data2 - randoms1 pair counts

Type: BinnedStatistic

R1R2¶

the randoms1 - randoms2 pair counts

Type: BinnedStatistic

corr¶

the correlation function values, stored as the corr variable, computed from the pair counts

Type: BinnedStatistic

wp¶

the projected correlation function, \(w_p(r_p)\), computed if mode='projected'; correlation is stored as the corr variable

Type: BinnedStatistic

Notes

The D1D2, D1R2, D2R1, and R1R2 attributes are identical to the pairs attribute of SimulationBoxPairCount.

save(output)¶: Save result as a JSON file with name output

class nbodykit.algorithms.SimulationBox3PCF(source, poles, edges, BoxSize=None, periodic=True, weight='Weight', position='Position')[source]¶

Compute the multipoles of the isotropic, three-point correlation function in configuration space for data in a simulation box.

This uses the algorithm of Slepian and Eisenstein, 2015 which scales as \(\mathcal{O}(N^2)\), where \(N\) is the number of objects.

Results are computed when the object is inititalized. See the documenation of run() for the attributes storing the results.

Note

The algorithm expects the positions of objects in a simulation box to be the Cartesian x, y, and z vectors. For survey data, in the form of right ascension, declination, and redshift, see SurveyData3PCF.

Parameters

source (CatalogSource) – the input source of particles providing the ‘Position’ column
poles (list of int) – the list of multipole numbers to compute
edges (array_like) – the edges of the bins of separation to use; length of nbins+1
BoxSize (float, 3-vector, optional) – the size of the box; if periodic boundary conditions used, and ‘BoxSize’ not provided in the source attrs, it must be provided here
periodic (bool, optional) – whether to use periodic boundary conditions when computing separations between objects
weight (str, optional) – the name of the column in the source specifying the particle weights

References

Slepian and Eisenstein, MNRAS 454, 4142-4158 (2015)

Methods

`load`(filename[, comm])	Load a result from `filename` that has been saved to disk with `save()`.
`run`([pedantic])	Compute the three-point CF multipoles.
`save`(output)	Save the `poles` result to a JSON file with name `output`.

classmethod load(filename, comm=None)¶: Load a result from filename that has been saved to disk with save().

run(pedantic=False)[source]¶

Compute the three-point CF multipoles. This attaches the following the attributes to the class:

poles

poles¶

a BinnedStatistic object to hold the multipole results; the binned statistics stores the multipoles as variables corr_0, corr_1, etc for \(\ell=0,1,\) etc. The coordinates of the binned statistic are r1 and r2, which give the separations between the three objects in CF.

Type: BinnedStatistic

save(output)¶: Save the poles result to a JSON file with name output.

class nbodykit.algorithms.SimulationBoxPairCount(mode, first, edges, BoxSize=None, periodic=True, second=None, los='z', Nmu=None, pimax=None, weight='Weight', position='Position', show_progress=False, **config)[source]¶

Count (weighted) pairs of objects in a simulation box as a function of \(r\), \((r,\mu)\), \((r_p, \pi)\), or \(\theta\) using the Corrfunc package.

See the Notes below for the allowed coordinate dimensions.

The default weighting scheme uses the product of the weights for each object in a pair.

Results are computed when the object is inititalized. See the documenation of run() for the attributes storing the results.

Note

The algorithm expects the positions of particles in a simulation box to be the Cartesian x, y, and z vectors. To compute pair counts on survey data, using right ascension, declination, and redshift, see SurveyDataPairCount.

Parameters

mode ('1d', '2d', 'projected', 'angular') – compute pair counts as a function of the specified coordinate basis; see the Notes section below for specifics
first (CatalogSource) – the first source of particles, providing the position column
edges (array_like) – the separation bin edges along the first coordinate dimension; depending on mode, the options are \(r\), \(r_p\), or \(\theta\). Expected units for distances are \(\mathrm{Mpc}/h\) and degrees for angles. Length of nbins+1
BoxSize (float, 3-vector, optional) – the size of the box; if ‘BoxSize’ is not provided in the source ‘attrs’, it must be provided here
periodic (bool, optional) – whether to use periodic boundary conditions
second (CatalogSource, optional) – the second source of particles to cross-correlate
los ({'x', 'y', 'z'}, int, optional) – the axis of the simulation box to treat as the line-of-sight direction; this can be provided as string identifying one of ‘x’, ‘y’, ‘z’ or the equivalent integer number of the axis
Nmu (int, optional) – the number of \(\mu\) bins, ranging from 0 to 1; requred if mode='2d'
pimax (float, optional) – The maximum separation along the line-of-sight when mode='projected'. Distances along the \(\pi\) direction are binned with unit depth. For instance, if pimax=40, then 40 bins will be created along the \(\pi\) direction.
weight (str, optional) – the name of the column in the source specifying the particle weights
position (str, optional) – name of the column of the position of particles
show_progress (bool, optional) – if True, perform the pair counting calculation in 10 iterations, logging the progress after each iteration; this is useful for understanding the scaling of the code
**config (key/value pairs) – additional keywords to pass to the Corrfunc function

Notes

This class can compute pair counts using several different coordinate choices, based on the value of the input argument mode. The choices are:

mode='1d' : compute pairs as a function of the 3D separation \(r\)
mode='2d' : compute pairs as a function of the 3D separation \(r\) and the cosine of the angle to the line-of-sight, \(\mu\)
mode='projected' : compute pairs as a function of distance perpendicular and parallel to the line-of-sight, \(r_p\) and \(\pi\)
mode='angular' : compute pairs as a function of angle on the sky, \(\theta\)

For angular pair counts, the observer is placed at the center of the box when converting Cartesian coordinates to angular coordinates on the unit sphere.

Methods

`load`(output[, comm])	Load a result has been saved to disk with `save()`.
`run`()	Calculate the pair counts in a simulation box.
`save`(output)	Save result as a JSON file with name `output`

classmethod load(output, comm=None)¶: Load a result has been saved to disk with save().

run()[source]¶

Calculate the pair counts in a simulation box. This adds the following attributes to the class:

SimulationBoxPairCount.pairs

pairs¶

a BinnedStatistic object holding the pair count results. The coordinate grid will be (r,), (r,mu), (rp, pi), or (theta,) when mode is ‘1d’, ‘2d’, ‘projected’, ‘angular’, respectively.

The BinnedStatistic stores the following variables:

r, rp, or theta : the mean separation value in the bin
npairs: the number of pairs in the bin
wnpairs: the average weight value in the bin; each pair contributes the product of the individual weight values

Type: BinnedStatistic

save(output)¶: Save result as a JSON file with name output

class nbodykit.algorithms.SurveyData2PCF(mode, data1, randoms1, edges, cosmo=None, Nmu=None, pimax=None, data2=None, randoms2=None, R1R2=None, ra='RA', dec='DEC', redshift='Redshift', weight='Weight', show_progress=False, **config)[source]¶

Compute the two-point correlation function for observational survey data as a function of \(r\), \((r,\mu)\), \((r_p, \pi)\), or \(\theta\) using pair counting.

The Landy-Szalay estimator (DD/RR - 2 DD/RR + 1) is used to transform pair counts in to the correlation function.

Parameters

mode ('1d', '2d', 'projected', 'angular') – the type of two-point correlation function to compute; see the Notes below
data1 (CatalogSource) – the data catalog; must have ra, dec, redshift, columns
randoms1 (CatalogSource) – the catalog specifying the un-clustered, random distribution for data1
edges (array_like) – the separation bin edges along the first coordinate dimension; depending on mode, the options are \(r\), \(r_p\), or \(\theta\). Expected units for distances are \(\mathrm{Mpc}/h\) and degrees for angles. Length of nbins+1
cosmo (Cosmology, optional) – the cosmology instance used to convert redshift into comoving distance; this is required for all cases except mode='angular'
Nmu (int, optional) – the number of \(\mu\) bins, ranging from 0 to 1; requred if mode='2d'
pimax (float, optional) – The maximum separation along the line-of-sight when mode='projected'. Distances along the \(\pi\) direction are binned with unit depth. For instance, if pimax=40, then 40 bins will be created along the \(\pi\) direction.
data2 (CatalogSource, optional) – the second data catalog to cross-correlate;
randoms2 (CatalogSource, optional) – the catalog specifying the un-clustered, random distribution for data2; if not specified and data2 is provied, then randoms1 will be used for both.
R1R2 (SurveyDataPairCount, optional) – if provided, random pairs R1R2 are not recalculated in the Landy-Szalay estimator
ra (str, optional) – the name of the column in the source specifying the right ascension coordinates in units of degrees; default is ‘RA’
dec (str, optional) – the name of the column in the source specifying the declination coordinates; default is ‘DEC’
redshift (str, optional) – the name of the column in the source specifying the redshift coordinates; default is ‘Redshift’
weight (str, optional) – the name of the column in the source specifying the object weights
show_progress (bool, optional) – if True, perform the pair counting calculation in 10 iterations, logging the progress after each iteration; this is useful for understanding the scaling of the code
**config (key/value pairs) – additional keywords to pass to the Corrfunc function

Notes

This class can compute correlation functions using several different coordinate choices, based on the value of the input argument mode. The choices are:

mode='1d' : compute pairs as a function of the 3D separation \(r\)
mode='2d' : compute pairs as a function of the 3D separation \(r\) and the cosine of the angle to the line-of-sight, \(\mu\)
mode='projected' : compute pairs as a function of distance perpendicular and parallel to the line-of-sight, \(r_p\) and \(\pi\)
mode='angular' : compute pairs as a function of angle on the sky, \(\theta\)

If mode='projected', the projected correlation function \(w_p(r_p)\) is also computed, using the input \(\pi_\mathrm{max}\) value.

Methods

`load`(output[, comm])	Load a result has been saved to disk with `save()`.
`run`()	Run the two-point correlation function algorithm.
`save`(output)	Save result as a JSON file with name `output`

classmethod load(output, comm=None)¶: Load a result has been saved to disk with save().

run()[source]¶

Run the two-point correlation function algorithm. This attaches the following attributes:

SurveyData2PCF.D1D2
SurveyData2PCF.D1R2
SurveyData2PCF.D2R1
SurveyData2PCF.R1R2
SurveyData2PCF.corr
SurveyData2PCF.wp (if mode='projected')

D1D2¶

the data1 - data2 pair counts

Type: BinnedStatistic

D1R2¶

the data1 - randoms2 pair counts

Type: BinnedStatistic

D2R1¶

the data2 - randoms1 pair counts

Type: BinnedStatistic

R1R2¶

the randoms1 - randoms2 pair counts

Type: BinnedStatistic

corr¶

the correlation function values, stored as the corr variable, computed from the pair counts

Type: BinnedStatistic

wp¶

the projected correlation function, \(w_p(r_p)\), computed if mode='projected'; correlation is stored as the corr variable

Type: BinnedStatistic

Notes

The D1D2, D1R2, D2R1, and R1R2 attributes are identical to the pairs attribute of SurveyDataPairCount.

save(output)¶: Save result as a JSON file with name output

class nbodykit.algorithms.SurveyData3PCF(source, poles, edges, cosmo, domain_factor=4, ra='RA', dec='DEC', redshift='Redshift', weight='Weight')[source]¶

Compute the multipoles of the isotropic, three-point correlation function in configuration space for observational survey data.

This uses the algorithm of Slepian and Eisenstein, 2015 which scales as \(\mathcal{O}(N^2)\), where \(N\) is the number of objects.

Results are computed when the object is inititalized. See the documenation of run() for the attributes storing the results.

Note

The algorithm expects the positions of objects from a survey catalog be the sky coordinates, right ascension and declination, and redshift. For simulation box data in Cartesian coordinates, see SimulationBox3PCF.

Warning

The right ascension and declination columns should be specified in degrees.

Parameters

source (CatalogSource) – the input source of particles providing the ‘Position’ column
poles (list of int) – the list of multipole numbers to compute
edges (array_like) – the edges of the bins of separation to use; length of nbins+1
cosmo (Cosmology) – the cosmology instance used to convert redshifts into comoving distances
ra (str, optional) – the name of the column in the source specifying the right ascension coordinates in units of degrees; default is ‘RA’
dec (str, optional) – the name of the column in the source specifying the declination coordinates; default is ‘DEC’
redshift (str, optional) – the name of the column in the source specifying the redshift coordinates; default is ‘Redshift’
weight (str, optional) – the name of the column in the source specifying the object weights
domain_factor (int, optional) – the integer value by which to oversubscribe the domain decomposition mesh before balancing loads; this number can affect the distribution of loads on the ranks – an optimal value will lead to balanced loads

References

Slepian and Eisenstein, MNRAS 454, 4142-4158 (2015)

Methods

`load`(filename[, comm])	Load a result from `filename` that has been saved to disk with `save()`.
`run`()	Compute the three-point CF multipoles.
`save`(output)	Save the `poles` result to a JSON file with name `output`.

classmethod load(filename, comm=None)¶: Load a result from filename that has been saved to disk with save().

run()[source]¶

Compute the three-point CF multipoles. This attaches the following the attributes to the class:

poles

poles¶

Type: BinnedStatistic

save(output)¶: Save the poles result to a JSON file with name output.

class nbodykit.algorithms.SurveyDataPairCount(mode, first, edges, cosmo=None, second=None, Nmu=None, pimax=None, ra='RA', dec='DEC', redshift='Redshift', weight='Weight', show_progress=False, domain_factor=4, **config)[source]¶

Count (weighted) pairs of objects from a survey data catalog as a function of \(r\), \((r,\mu)\), \((r_p, \pi)\), or \(\theta\) using the Corrfunc package.

See the Notes below for the allowed coordinate dimensions.

The default weighting scheme uses the product of the weights for each object in a pair.

Results are computed when the class is inititalized. See the documenation of run() for the attributes storing the results.

Note

The algorithm expects the positions of particles from a survey catalog be the sky coordinates, right ascension and declination, and redshift. To compute pair counts in a simulation box using Cartesian coordinates, see SimulationBoxPairCount.

Warning

The right ascension and declination columns should be specified in degrees.

Parameters

mode ('1d', '2d', 'projected', 'angular') – compute pair counts as a function of the specified coordinate basis; see the Notes section below for specifics
first (CatalogSource) – the first source of particles, providing the ‘Position’ column
edges (array_like) – the separation bin edges along the first coordinate dimension; depending on mode, the options are \(r\), \(r_p\), or \(\theta\). Expected units for distances are \(\mathrm{Mpc}/h\) and degrees for angles. Length of nbins+1
cosmo (Cosmology, optional) – the cosmology instance used to convert redshift into comoving distance; this is required for all cases except mode='angular'
second (CatalogSource, optional) – the second source of particles to cross-correlate
Nmu (int, optional) – the number of \(\mu\) bins, ranging from 0 to 1; requred if mode='2d'
pimax (float, optional) – The maximum separation along the line-of-sight when mode='projected'. Distances along the \(\pi\) direction are binned with unit depth. For instance, if pimax=40, then 40 bins will be created along the \(\pi\) direction.
ra (str, optional) – the name of the column in the source specifying the right ascension coordinates in units of degrees; default is ‘RA’
dec (str, optional) – the name of the column in the source specifying the declination coordinates; default is ‘DEC’
redshift (str, optional) – the name of the column in the source specifying the redshift coordinates; default is ‘Redshift’
weight (str, optional) – the name of the column in the source specifying the object weights
show_progress (bool, optional) – if True, perform the pair counting calculation in 10 iterations, logging the progress after each iteration; this is useful for understanding the scaling of the code
domain_factor (int, optional) – the integer value by which to oversubscribe the domain decomposition mesh before balancing loads; this number can affect the distribution of loads on the ranks – an optimal value will lead to balanced loads
**config (key/value pairs) – additional keywords to pass to the Corrfunc function

Notes

This class can compute pair counts using several different coordinate choices, based on the value of the input argument mode. The choices are:

mode='1d' : compute pairs as a function of the 3D separation \(r\)
mode='2d' : compute pairs as a function of the 3D separation \(r\) and the cosine of the angle to the line-of-sight, \(\mu\)
mode='projected' : compute pairs as a function of distance perpendicular and parallel to the line-of-sight, \(r_p\) and \(\pi\)
mode='angular' : compute pairs as a function of angle on the sky, \(\theta\)

Methods

`load`(output[, comm])	Load a result has been saved to disk with `save()`.
`run`()	Calculate the pair counts of a survey data catalog.
`save`(output)	Save result as a JSON file with name `output`

classmethod load(output, comm=None)¶: Load a result has been saved to disk with save().

run()[source]¶

Calculate the pair counts of a survey data catalog. This adds the following attribute:

SurveyDataPairCount.pairs

self.pairs.attrs[‘total_wnpairs’]: The total of wnpairs.

pairs¶

The BinnedStatistic stores the following variables:

r, rp, or theta : the mean separation value in the bin
npairs: the number of pairs in the bin
wnpairs: the weighted npairs in the bin; each pair contributes the product of the individual weight values

Type: BinnedStatistic

save(output)¶: Save result as a JSON file with name output