nbodykit.algorithms¶
- class nbodykit.algorithms.ConvolvedFFTPower(first, poles, second=None, Nmesh=None, kmin=0.0, kmax=None, dk=None, use_fkp_weights=None, P0_FKP=None)[source]¶
Algorithm to compute power spectrum multipoles using FFTs for a data survey with non-trivial geometry.
Due to the geometry, the estimator computes the true power spectrum convolved with the window function (FFT of the geometry).
This estimator implemented in this class is described in detail in Hand et al. 2017 (arxiv:1704.02357). It uses the spherical harmonic addition theorem such that only \(2\ell+1\) FFTs are required to compute each multipole. This differs from the implementation in Bianchi et al. and Scoccimarro et al., which requires \((\ell+1)(\ell+2)/2\) FFTs.
Results are computed when the object is inititalized, and the result is stored in the
poles
attribute. Important meta-data computed during algorithm execution is stored in theattrs
dict. See the documenation ofrun()
.Note
A full tutorial on the class is available in the documentation here.
Note
Cross correlations are only supported when the FKP weight column differs between the two mesh objects, i.e., the underlying
data
andrandoms
must be the same. This allows users to compute the cross power spectrum of the same density field, weighted differently.- Parameters
first (FKPCatalog, FKPCatalogMesh) – the first source to paint the data/randoms; FKPCatalog is automatically converted to a FKPCatalogMesh, using default painting parameters
poles (list of int) – a list of integer multipole numbers
ell
to computesecond (FKPCatalog, FKPCatalogMesh, optional) – the second source to paint the data/randoms; cross correlations are only supported when the weight column differs between the two mesh objects, i.e., the underlying
data
andrandoms
must be the same!kmin (float, optional) – the edge of the first wavenumber bin; default is 0
kmax (float, optional) – the limit of the last wavenumber bin; default is None, no limit.
dk (float, optional) – the spacing in wavenumber to use; if not provided; the fundamental mode of the box is used
References
Hand, Nick et al. An optimal FFT-based anisotropic power spectrum estimator, 2017
Bianchi, Davide et al., Measuring line-of-sight-dependent Fourier-space clustering using FFTs, MNRAS, 2015
Scoccimarro, Roman, Fast estimators for redshift-space clustering, Phys. Review D, 2015
Methods
load
(output[, comm, format])Load a saved ConvolvedFFTPower result, which has been saved to disk with
ConvolvedFFTPower.save()
.normalization
(name, alpha)Compute the power spectrum normalization, using either the
data
orrandoms
source.run
()Compute the power spectrum multipoles.
save
(output)Save the ConvolvedFFTPower result to disk.
shotnoise
(alpha)Compute the power spectrum shot noise, using either the
data
orrandoms
source.to_pkmu
(mu_edges, max_ell)Invert the measured multipoles \(P_\ell(k)\) into power spectrum wedges, \(P(k,\mu)\).
- __setstate_pre000305__(state)[source]¶
compatible version of setstate for files generated before 0.3.5
- classmethod load(output, comm=None, format='current')[source]¶
Load a saved ConvolvedFFTPower result, which has been saved to disk with
ConvolvedFFTPower.save()
.The current MPI communicator is automatically used if the
comm
keyword isNone
format can be ‘current’, or ‘pre000305’ for files generated before 0.3.5.
- normalization(name, alpha)[source]¶
Compute the power spectrum normalization, using either the
data
orrandoms
source.The normalization is given by:
\[A = \int d^3x \bar{n}'_1(x) \bar{n}'_2(x) w_{\mathrm{fkp},1} w_{\mathrm{fkp},2}.\]The mean densities are assumed to be the same, so this can be converted to a summation over objects in the source, as
\[A = \sum w_{\mathrm{comp},1} \bar{n}_2 w_{\mathrm{fkp},1} w_{\mathrm{fkp},2}.\]References
see Eqs. 13,14 of Beutler et al. 2014, “The clustering of galaxies in the SDSS-III Baryon Oscillation Spectroscopic Survey: testing gravity with redshift space distortions using the power spectrum multipoles”
- run()[source]¶
Compute the power spectrum multipoles. This function does not return anything, but adds several attributes (see below).
- edges¶
the edges of the wavenumber bins
- Type
array_like
- poles¶
a BinnedStatistic object that behaves similar to a structured array, with fancy slicing and re-indexing; it holds the measured multipole results, as well as the number of modes (
modes
) and average wavenumbers values in each bin (k
)- Type
- attrs¶
dictionary holding input parameters and several important quantites computed during execution:
- data.N, randoms.N :
the unweighted number of data and randoms objects
- data.W, randoms.W :
the weighted number of data and randoms objects, using the column specified as the completeness weights
- alpha :
the ratio of
data.W
torandoms.W
- data.norm, randoms.norm :
the normalization of the power spectrum, computed from either the “data” or “randoms” catalog (they should be similar). See equations 13 and 14 of arxiv:1312.4611.
- data.shotnoise, randoms.shotnoise :
the shot noise values for the “data” and “random” catalogs; See equation 15 of arxiv:1312.4611.
- shotnoise :
the total shot noise for the power spectrum, equal to
data.shotnoise
+randoms.shotnoise
; this should be subtracted from the monopole.
- BoxSize :
the size of the Cartesian box used to grid the data and randoms objects on a Cartesian mesh.
For further details on the meta-data, see the documentation.
- Type
- save(output)[source]¶
Save the ConvolvedFFTPower result to disk.
The format is currently json.
- Parameters
output (str) – the name of the file to dump the JSON results to
- shotnoise(alpha)[source]¶
Compute the power spectrum shot noise, using either the
data
orrandoms
source.This computes:
\[S = \sum (w_\mathrm{comp} w_\mathrm{fkp})^2\]References
see Eq. 15 of Beutler et al. 2014, “The clustering of galaxies in the SDSS-III Baryon Oscillation Spectroscopic Survey: testing gravity with redshift space distortions using the power spectrum multipoles”
- to_pkmu(mu_edges, max_ell)[source]¶
Invert the measured multipoles \(P_\ell(k)\) into power spectrum wedges, \(P(k,\mu)\).
- Parameters
mu_edges (array_like) – the edges of the \(\mu\) bins
max_ell (int) – the maximum multipole to use when computing the wedges; all even multipoles with \(ell\) less than or equal to this number are included
- Returns
pkmu – a data set holding the \(P(k,\mu)\) wedges
- Return type
- class nbodykit.algorithms.CylindricalGroups(source, rankby, rperp, rpar, flat_sky_los=None, periodic=False, BoxSize=None)[source]¶
Compute groups of objects using a cylindrical grouping method. We identify all satellites within a given cylindrical volume around a central object.
Results are computed when the object is inititalized, and the result is stored in the
groups
attribute; see the documenation ofrun()
.Input parameters are stored in the
attrs
attribute dictionary.- Parameters
source (subclass of
CatalogSource
) – the input source of particles providing the ‘Position’ column; the grouping algorithm is run on this catalogrperp (float) – the radius of the cylinder in the sky plane (i.e., perpendicular to the line-of-sight)
rpar (float) – the radius along the line-of-sight direction; this is 1/2 the height of the cylinder
rankby (str, list,
None
) – a single or list of column names to rank order the input source by before computing the cylindrical groups, such that objects ranked first are marked as CGM centrals; ifNone
is supplied, no sorting will be doneflat_sky_los (bool, optional) – a unit vector of length 3 providing the line-of-sight direction, assuming a fixed line-of-sight across the box, e.g., [0,0,1] to use the z-axis. If
None
, the observer at (0,0,0) is used to compute the line-of-sight for each pairperiodic (bool, optional) – whether to use periodic boundary conditions
BoxSize (float, 3-vector, optional) – the size of the box of the input data; must be provided as a keyword or in
source.attrs
ifperiodic=True
References
Okumura, Teppei, et al. “Reconstruction of halo power spectrum from redshift-space galaxy distribution: cylinder-grouping method and halo exclusion effect”, arXiv:1611.04165, 2016.
Methods
run
()Compute the cylindrical groups, saving the results to the
groups
attribute- run()[source]¶
Compute the cylindrical groups, saving the results to the
groups
attribute- groups¶
a catalog holding the result of the grouping. The length of the catalog is equal to the length of the input size, i.e., the length is equal to the
size
attribute. The relevant fields are:- cgm_type :
a flag specifying the type for each object, with 0 specifying CGM central and 1 denoting CGM satellite
- cgm_haloid :
The index of the CGM object this object belongs to; an integer between 0 and the total number of CGM halos
- num_cgm_sats :
The number of satellites in the CGM halo
- Type
- class nbodykit.algorithms.FFTCorr(first, mode, Nmesh=None, BoxSize=None, second=None, los=[0, 0, 1], Nmu=5, dr=None, rmin=0.0, rmax=None, poles=[])[source]¶
Algorithm to compute the 1d or 2d correlation and/or multipoles in a periodic box, using a Fast Fourier Transform (FFT).
This computes the power spectrum as the square of the Fourier modes of the density field, which are computed via a FFT. Then it is transformed back to obtain the correlation function.
Results are computed when the object is inititalized. See the documenation of
run()
for the attributes storing the results.Note
This is very similar to
FFTPower
.- Parameters
first (CatalogSource, MeshSource) – the source for the first field; if a CatalogSource is provided, it is automatically converted to MeshSource using the default painting parameters (via
to_mesh()
)mode ({'1d', '2d'}) – compute either 1d or 2d power spectra
Nmesh (int, optional) – the number of cells per side in the particle mesh used to paint the source
BoxSize (int, 3-vector, optional) – the size of the box
second (CatalogSource, MeshSource, optional) – the second source for cross-correlations
los (array_like , optional) – the direction to use as the line-of-sight; must be a unit vector
Nmu (int, optional) – the number of mu bins to use from \(\mu=[0,1]\); if mode = 1d, then
Nmu
is set to 1dr (float, optional) – the linear spacing of
r
bins to use; if not provided, the fundamental mode of the box is used; if dr=0, the bins are tight, such that each bin has a unique r value.rmin (float, optional) – the lower edge of the first
r
bin to usermax (float, optional) – the upper limit of the last
r
bin to usepoles (list of int, optional) – a list of multipole numbers
ell
to compute \(\xi_\ell(r)\) from \(\xi(r,\mu)\)
Methods
load
(output[, comm])Load a saved result.
run
()Compute the correlation function in a periodic box, using FFTs.
save
(output)Save the result to disk.
- classmethod load(output, comm=None)¶
Load a saved result. The result has been saved to disk with
save()
.
- run()[source]¶
Compute the correlation function in a periodic box, using FFTs.
- Returns
corr (
BinnedStatistic
) – a BinnedStatistic object that holds the measured \(\xi(r)\) or \(\xi(r,\mu)\). It stores the following variables:- r :
the mean value for each
r
bin
- mu
mode=2d
only the mean value for each
mu
bin
- mu
- corr :
real array storing the correlation function
- modes :
the number of modes averaged together in each bin
poles (
BinnedStatistic
orNone
) – a BinnedStatistic object to hold the multipole results \(\xi_\ell(r)\); if no multipoles were requested by the user, this isNone
. It stores the following variables:- r :
the mean value for each
r
bin
- power_L :
complex array storing the real and imaginary components for the \(\ell=L\) multipole
- modes :
the number of modes averaged together in each bin
corr.attrs, poles.attrs (dict) – dictionary of meta-data; in addition to storing the input parameters, it includes the following fields computed during the algorithm execution:
- shotnoisefloat
the power Poisson shot noise, equal to \(V/N\), where \(V\) is the volume of the box and N is the total number of objects; if a cross-correlation is computed, this will be equal to zero
- N1int
the total number of objects in the first source
- N2int
the total number of objects in the second source
- save(output)¶
Save the result to disk. The format is currently JSON.
- class nbodykit.algorithms.FFTPower(first, mode, Nmesh=None, BoxSize=None, second=None, los=[0, 0, 1], Nmu=5, dk=None, kmin=0.0, kmax=None, poles=[])[source]¶
Algorithm to compute the 1d or 2d power spectrum and/or multipoles in a periodic box, using a Fast Fourier Transform (FFT).
This computes the power spectrum as the square of the Fourier modes of the density field, which are computed via a FFT.
Results are computed when the object is inititalized. See the documenation of
run()
for the attributes storing the results.Note
A full tutorial on the class is available in the documentation here.
- Parameters
first (CatalogSource, MeshSource) – the source for the first field; if a CatalogSource is provided, it is automatically converted to MeshSource using the default painting parameters (via
to_mesh()
)mode ({'1d', '2d'}) – compute either 1d or 2d power spectra
Nmesh (int, optional) – the number of cells per side in the particle mesh used to paint the source
BoxSize (int, 3-vector, optional) – the size of the box
second (CatalogSource, MeshSource, optional) – the second source for cross-correlations
los (array_like , optional) – the direction to use as the line-of-sight; must be a unit vector
Nmu (int, optional) – the number of mu bins to use from \(\mu=[0,1]\); if mode = 1d, then
Nmu
is set to 1dk (float, optional) – the linear spacing of
k
bins to use; if not provided, the fundamental mode of the box is used; if dk=0 is set, use fine bins such that the modes contributing to the bin has identical modulus.kmin (float, optional) – the lower edge of the first
k
bin to usekmin – the upper limit of the last
k
bin to use (not exact)poles (list of int, optional) – a list of multipole numbers
ell
to compute \(P_\ell(k)\) from \(P(k,\mu)\)
Methods
load
(output[, comm])Load a saved result.
run
()Compute the power spectrum in a periodic box, using FFTs.
save
(output)Save the result to disk.
- classmethod load(output, comm=None)¶
Load a saved result. The result has been saved to disk with
save()
.
- run()[source]¶
Compute the power spectrum in a periodic box, using FFTs.
- Returns
power (
BinnedStatistic
) – a BinnedStatistic object that holds the measured \(P(k)\) or \(P(k,\mu)\). It stores the following variables:- k :
the mean value for each
k
bin
- mu
mode=2d
only the mean value for each
mu
bin
- mu
- power :
complex array storing the real and imaginary components of the power
- modes :
the number of Fourier modes averaged together in each bin
poles (
BinnedStatistic
orNone
) – a BinnedStatistic object to hold the multipole results \(P_\ell(k)\); if no multipoles were requested by the user, this isNone
. It stores the following variables:- k :
the mean value for each
k
bin
- power_L :
complex array storing the real and imaginary components for the \(\ell=L\) multipole
- modes :
the number of Fourier modes averaged together in each bin
power.attrs, poles.attrs (dict) – dictionary of meta-data; in addition to storing the input parameters, it includes the following fields computed during the algorithm execution:
- shotnoisefloat
the power Poisson shot noise, equal to \(V/N\), where \(V\) is the volume of the box and N is the total number of objects; if a cross-correlation is computed, this will be equal to zero
- N1int
the total number of objects in the first source
- N2int
the total number of objects in the second source
- save(output)¶
Save the result to disk. The format is currently JSON.
- class nbodykit.algorithms.FFTRecon(data, ran, Nmesh, bias=1.0, f=0.0, los=[0, 0, 1], R=20, position='Position', revert_rsd_random=False, scheme='LGS', BoxSize=None)[source]¶
FFT based Lagrangian reconstruction algorithm in a periodic box.
References
Eisenstein et al, 2007 http://adsabs.harvard.edu/abs/2007ApJ…664..675E Section 3, paragraph starting with ‘Restoring in full the …’
We follow a cleaner description in Schmitfull et al 2015,
Table I, and text below. Schemes are LGS, LF2 and LRR.
A slight difference against the paper is that Redshift distortion and bias are corrected in the linear order. The Random shifting followed Martin White’s suggestion to exclude the RSD by default. (with default revert_rsd_random=False.)
- Parameters
data (CatalogSource,) – the data catalog, e.g. halos. data.attrs[‘BoxSize’] is used if argument BoxSize is not given.
ran (CatalogSource) – the random catalog, e.g. from a UniformCatalog object.
Nmesh (int) – The size of the FFT Mesh. Rule of thumb is that the size of a mesh cell shall be 2 ~ 4 times smaller than the smoothing length, R.
revert_rsd_random (boolean) – Revert the rsd for randoms as well as data. There are two conventions. either reverting rsd displacement in data displacement only(False) or in both data and randoms (True). Default is False.
R (float) – The radius of smoothing. 10 to 20 Mpc/h is usually cool.
bias (float) – The bias of the data catalog.
f (float) – The growth rate; if non-zero, correct for RSD
los (list) – The direction of the line of sight for RSD. Usually (default) [0, 0, 1].
position (string) – column to use for picking up the Position of the objects.
BoxSize (float or array_like) – the size of the periodic box, default is to infer from the data.
scheme (string) – The reconstruction scheme. LGS is the standard reconstruction (Lagrangian growth shift). LF2 is the F2 Lagrangian reconstruction. LRR is the random-random Lagrangian reconstruction.
- Attributes
Methods
apply
(func[, kind, mode])Return a view of the mesh, with
actions
updated to apply the specified function, either in Fourier space or configuration space, based onmode
compute
([mode, Nmesh])Compute / Fetch the mesh object into memory as a RealField or ComplexField object.
preview
([axes, Nmesh, root])Gather the mesh into as a numpy array, with (reduced) resolution.
save
(output[, dataset, mode])Save the mesh as a
BigFileMesh
on disk, either in real or complex space.to_complex_field
([out])Convert the mesh source to the Fourier-space field, returning a
pmesh.pm.ComplexField
object.to_field
([mode, out])Return the mesh as a
pmesh
Field object, either in Fourier space or configuration space, based onmode
.Convert the mesh source to the configuration-space field, returning a
pmesh.pm.RealField
object.view
()Return a "view" of the MeshSource, in the spirit of numpy's ndarray view.
paint
run
work_with
- __finalize__(other)¶
Finalize the creation of a MeshSource object by copying over attributes from a second MeshSource.
- Parameters
other (MeshSource) – the second MeshSource to copy over attributes from
- __len__()¶
Length of a mesh source is zero
- property actions¶
A list of actions to apply to the density field when interpolating to the mesh.
This stores tuples of
(mode, func, kind)
; seeapply()
for more details.
- apply(func, kind='wavenumber', mode='complex')¶
Return a view of the mesh, with
actions
updated to apply the specified function, either in Fourier space or configuration space, based onmode
- Parameters
func (callable or a
MeshFilter
object) – func(x, y) where x is a list ofr
(k
) values that broadcasts into a full array, whenmode
is ‘real’ (‘complex’); the value of x depends onkind
.y
is the value of the mesh field on the corresponding locations.kind (string, optional) –
if a MeshFilter object is given as func, this is ignored. The kind of value in x.
When
mode
is ‘complex’:’wavenumber’ means wavenumber from [- 2 pi / L * N / 2, 2 pi / L * N / 2).
’circular’ means circular frequency from [- pi, pi).
’index’ means [0, Nmesh )
When
mode
is ‘real’:’relative’ means distance from [-0.5 Boxsize, 0.5 BoxSize).
’index’ means [0, Nmesh )
mode ('complex' or 'real', optional) – if a MeshFilter object is given as func, this is ignored. whether to apply the function to the mesh in configuration space or Fourier space
- Returns
a view of the mesh object with the
actions
attribute updated to include the new action- Return type
- property attrs¶
A dictionary storing relevant meta-data about the CatalogSource.
- compute(mode='real', Nmesh=None)¶
Compute / Fetch the mesh object into memory as a RealField or ComplexField object.
- preview(axes=None, Nmesh=None, root=0)¶
Gather the mesh into as a numpy array, with (reduced) resolution. The result is broadcast to all ranks, so this uses \(\mathrm{Nmesh}^3\) per rank.
- Parameters
- Returns
out – An numpy array holding the real density field.
- Return type
array_like
- save(output, dataset='Field', mode='real')¶
Save the mesh as a
BigFileMesh
on disk, either in real or complex space.
- to_complex_field(out=None)¶
Convert the mesh source to the Fourier-space field, returning a
pmesh.pm.ComplexField
object.Not implemented in the base class, unless object is a view.
- to_field(mode='real', out=None)¶
Return the mesh as a
pmesh
Field object, either in Fourier space or configuration space, based onmode
.This will call
to_real_field()
orto_complex_field()
based onmode
.- Parameters
mode ('real' or 'complex') – the return type of the field
- Returns
either a RealField of ComplexField, storing the value of the field on the mesh
- Return type
RealField
,ComplexField
- to_real_field()[source]¶
Convert the mesh source to the configuration-space field, returning a
pmesh.pm.RealField
object.Not implemented in the base class, unless object is a view.
- view()¶
Return a “view” of the MeshSource, in the spirit of numpy’s ndarray view.
This returns a new MeshSource whose memory is owned by
self
.Note that for CatalogMesh objects, this is overidden by the
CatalogSource.view
function.
- class nbodykit.algorithms.FKPCatalog(data, randoms, BoxSize=None, BoxPad=0.02, P0=None, nbar='NZ')[source]¶
An interface for simultaneous modeling of a
data
CatalogSource and arandoms
CatalogSource, in the spirit of Feldman, Kaiser, and Peacock, 1994.This main functionality of this class is:
provide a uniform interface to accessing columns from the
data
CatalogSource andrandoms
CatalogSource, using column names prefixed with “data/” or “randoms/”compute the shared
BoxSize
of the source, by finding the maximum Cartesian extent of therandoms
provide an interface to a mesh object, which knows how to paint the FKP density field from the
data
andrandoms
- Parameters
data (CatalogSource) – the CatalogSource of particles representing the data catalog
randoms (CatalogSource, or None) – the CatalogSource of particles representing the randoms catalog if None is given an empty catalog is used.
BoxSize (float, 3-vector, optional) – the size of the Cartesian box to use for the unified data and randoms; if not provided, the maximum Cartesian extent of the randoms defines the box
BoxPad (float, 3-vector, optional) – optionally apply this additional buffer to the extent of the Cartesian box
nbar (str, optional) – the name of the column specifying the number density as a function of redshift. default is NZ.
P0 (float or None) – if not None, a column named FKPWeight is added to data and random based on nbar.
References
- Attributes
attrs
A dictionary storing relevant meta-data about the CatalogSource.
columns
Columns for individual species can be accessed using a
species/
prefix and the column name, i.e.,data/Position
.hardcolumns
Hardcolumn of the form
species/name
species
List of species names
Methods
compute
(*args, **kwargs)Our version of
dask.compute()
that computes multiple delayed dask collections at once.copy
()Return a shallow copy of the object, where each column is a reference of the corresponding column in
self
.get_hardcolumn
(col)Construct and return a hard-coded column.
make_column
(array)Utility function to convert an array-like object to a
dask.array.Array
.read
(columns)Return the requested columns as dask arrays.
save
(output[, columns, dataset, datasets, ...])Save the CatalogSource to a
bigfile.BigFile
.to_mesh
([Nmesh, BoxSize, BoxCenter, dtype, ...])Convert the FKPCatalog to a mesh, which knows how to "paint" the FKP density field.
to_subvolumes
([domain, position, columns])Domain Decompose a catalog, sending items to the ranks according to the supplied domain object.
view
([type])Return a "view" of the CatalogSource object, with the returned type set by
type
.create_instance
- __delitem__(col)¶
Delete a column of the form
species/column
- __finalize__(other)¶
Finalize the creation of a CatalogSource object by copying over any additional attributes from a second CatalogSource.
The idea here is to only copy over attributes that are similar to meta-data, so we do not copy some of the core attributes of the
CatalogSource
object.- Parameters
other – the second object to copy over attributes from; it needs to be a subclass of CatalogSourcBase for attributes to be copied
- Returns
return
self
, with the added attributes- Return type
- __getitem__(key)¶
This provides access to the underlying data in two ways:
The CatalogSource object for a species can be accessed if
key
is a species name.Individual columns for a species can be accessed using the format:
species/column
.
- __setitem__(col, value)¶
Add columns to any of the species catalogs.
Note
New column names should be prefixed by ‘species/’ where ‘species’ is a name in the
species
attribute.
- property attrs¶
A dictionary storing relevant meta-data about the CatalogSource.
- property columns¶
Columns for individual species can be accessed using a
species/
prefix and the column name, i.e.,data/Position
.
- compute(*args, **kwargs)¶
Our version of
dask.compute()
that computes multiple delayed dask collections at once.This should be called on the return value of
read()
to converts any dask arrays to numpy arrays.- . note::
If the
base
attribute is set,compute()
will called usingbase
instead ofself
.
- Parameters
args (object) – Any number of objects. If the object is a dask collection, it’s computed and the result is returned. Otherwise it’s passed through unchanged.
- copy()¶
Return a shallow copy of the object, where each column is a reference of the corresponding column in
self
.Note
No copy of data is made.
Note
This is different from view in that the attributes dictionary of the copy no longer related to
self
.- Returns
a new CatalogSource that holds all of the data columns of
self
- Return type
- get_hardcolumn(col)¶
Construct and return a hard-coded column.
These are usually produced by calling member functions marked by the
@column
decorator.Subclasses may override this method and the hardcolumns attribute to bypass the decorator logic.
Note
If the
base
attribute is set,get_hardcolumn()
will called usingbase
instead ofself
.
- property hardcolumns¶
Hardcolumn of the form
species/name
- static make_column(array)¶
Utility function to convert an array-like object to a
dask.array.Array
.Note
The dask array chunk size is controlled via the
dask_chunk_size
global option. Seeset_options
.- Parameters
array (array_like) – an array-like object; can be a dask array, numpy array, ColumnAccessor, or other non-scalar array-like object
- Returns
a dask array initialized from
array
- Return type
- read(columns)¶
Return the requested columns as dask arrays.
- Parameters
columns (list of str) – the names of the requested columns
- Returns
the list of column data, in the form of dask arrays
- Return type
list of
dask.array.Array
- save(output, columns=None, dataset=None, datasets=None, header='Header', compute=True)¶
Save the CatalogSource to a
bigfile.BigFile
.Only the selected columns are saved and
attrs
are saved inheader
. The attrs of columns are stored in the datasets.- Parameters
output (str) – the name of the file to write to
columns (list of str) – the names of the columns to save in the file, or None to use all columns
dataset (str, optional) – dataset to store the columns under.
datasets (list of str, optional) – names for the data set where each column is stored; defaults to the name of the column (deprecated)
header (str, optional, or None) – the name of the data set holding the header information, where
attrs
is stored if header is None, do not save the header.compute (boolean, default True) – if True, wait till the store operations finish if False, return a dictionary with column name and a future object for the store. use dask.compute() to wait for the store operations on the result.
- property species¶
List of species names
- to_mesh(Nmesh=None, BoxSize=None, BoxCenter=None, dtype='c16', interlaced=False, compensated=False, resampler='cic', fkp_weight='FKPWeight', comp_weight='Weight', selection='Selection', position='Position', bbox_from_species=None, window=None, nbar=None)[source]¶
Convert the FKPCatalog to a mesh, which knows how to “paint” the FKP density field.
Additional keywords to the
to_mesh()
function include the FKP weight column, completeness weight column, and the column specifying the number density as a function of redshift.- Parameters
Nmesh (int, 3-vector, optional) – the number of cells per box side; if not specified in attrs, this must be provided
dtype (str, dtype, optional) – the data type of the mesh when painting. dtype=’f8’ or ‘f4’ assumes Hermitian symmetry of the input field (delta(x) = delta^{*}(-x)), and stores it as an N x N x N/2+1 real array. This speeds evaluation of even multipoles but yields incorrect odd multipoles in the presence of the wide-angle effect. dtype=’c16’ or ‘c8’ stores the field as an N x N x N complex array to correctly recover the odd multipoles.
interlaced (bool, optional) – whether to use interlacing to reduce aliasing when painting the particles on the mesh
compensated (bool, optional) – whether to apply a Fourier-space transfer function to account for the effects of the gridding + aliasing
resampler (str, optional) – the string name of the resampler to use when interpolating the particles to the mesh; see
pmesh.window.methods
for choicesfkp_weight (str, optional) – the name of the column in the source specifying the FKP weight; this weight is applied to the FKP density field:
n_data - alpha*n_randoms
comp_weight (str, optional) – the name of the column in the source specifying the completeness weight; this weight is applied to the individual fields, either
n_data
orn_random
selection (str, optional) – the name of the column used to select a subset of the source when painting
position (str, optional) – the name of the column that specifies the position data of the objects in the catalog
bbox_from_species (str, optional) – if given, use the species to infer a bbox. if not give, will try random, then data (if random is empty)
window (deprecated.) – use resampler=
nbar (deprecated.) – deprecated. set nbar in the call to FKPCatalog()
- to_subvolumes(domain=None, position='Position', columns=None)¶
Domain Decompose a catalog, sending items to the ranks according to the supplied domain object. Using the position column as the Position.
This will read in the full position array and all of the requested columns.
- Parameters
domain (
pmesh.domain.GridND
object, or None) – The domain to distribute the catalog. If None, try to evenly divide spatially. An easiest way to find a domain object is to use pm.domain, where pm is apmesh.pm.ParticleMesh
object.position (string_like) – column to use to compute the position.
columns (list of string_like) – columns to include in the new catalog, if not supplied, all catalogs will be exchanged.
- Returns
A decomposed catalog source, where each rank only contains objects belongs to the rank as claimed by the domain object.
self.attrs are carried over as a shallow copy to the returned object.
- Return type
- view(type=None)¶
Return a “view” of the CatalogSource object, with the returned type set by
type
.This initializes a new empty class of type
type
and attaches attributes to it via the__finalize__()
mechanism.- Parameters
type (Python type) – the desired class type of the returned object.
- nbodykit.algorithms.FKPPower¶
alias of
nbodykit.algorithms.convpower.fkp.ConvolvedFFTPower
- nbodykit.algorithms.FKPWeightFromNbar(P0, nbar)[source]¶
Create FKPWeight from nbar, the number density of objects per redshift.
- Parameters
P0 (float) – the FKP normalization, when P0 == 0, returns 1.0, ignoring size / shape of nbar.
nbar (array_like) – the number density of objects per redshift
- Returns
FKPWeight (the FKPWeight, can be assigned to a catalog as a column)
to be consumed
ConvolvedFFTPower
- class nbodykit.algorithms.FOF(source, linking_length, nmin, absolute=False, periodic=True, domain_factor=1)[source]¶
A friends-of-friends halo finder that computes the label for each particle, denoting which halo it belongs to.
Friends-of-friends was first used by Davis et al 1985 to define halos in hierachical structure formation of cosmological simulations. The algorithm is also known as DBSCAN in computer science. The subroutine here implements a parallel version of the FOF.
The underlying local FOF algorithm is from
kdcount.cluster
, which is an adaptation of the implementation in Volker Springel’s Gadget and Martin White’s PM.Results are computed when the object is inititalized. See the documenation of
run()
for the attributes storing the results.For returning a CatalogSource of the FOF halos, see
find_features()
and for computing a halo catalog with added analytic information for a specific redshift and cosmology, seeto_halos()
.- Parameters
source (CatalogSource) – the source to run the FOF algorithm on; must support ‘Position’
linking_length (float) – the linking length, either in absolute units, or relative to the mean particle separation
nmin (int) – halo with fewer particles are ignored
absolute (bool, optional) – If True, the linking length is in absolute units, otherwise it is relative to the mean particle separation; default is False
Methods
find_features
([peakcolumn])Based on the particle labels, identify the groups, and return the center-of-mass
CMPosition
,CMVelocity
, and Length of each feature.run
()Run the FOF algorithm.
to_halos
(particle_mass, cosmo, redshift[, ...])Return a
HaloCatalog
, holding the center-of-mass position and velocity of each FOF halo, as well as the properly scaled mass, for a given cosmology and redshift.- find_features(peakcolumn=None)[source]¶
Based on the particle labels, identify the groups, and return the center-of-mass
CMPosition
,CMVelocity
, and Length of each feature.If a
peakcolumn
is given, thePeakPosition
andPeakVelocity
is also calculated for the particle at the peak value of the column.Data is scattered evenly across all ranks.
- Returns
a source holding the (‘CMPosition’, ‘CMVelocity’, ‘Length’) of each feature; optionaly,
PeakPosition
,PeakVelocity
are also included ifpeakcolumn
is not None- Return type
- run()[source]¶
Run the FOF algorithm. This function returns nothing, but does attach several attributes to the class instance:
attr:labels
max_labels
Note
The
labels
array is scattered evenly across all ranks.- labels¶
an array the label that specifies which FOF halo each particle belongs to
- Type
array_like, length:
size
- to_halos(particle_mass, cosmo, redshift, mdef='vir', posdef='cm', peakcolumn='Density')[source]¶
Return a
HaloCatalog
, holding the center-of-mass position and velocity of each FOF halo, as well as the properly scaled mass, for a given cosmology and redshift.The returned catalog also has default analytic prescriptions for halo radius and concentration.
The data is scattered evenly across all ranks.
- Parameters
particle_mass (float) – the particle mass, which is used to convert the number of particles in each halo to a total mass
cosmo (
nbodykit.cosmology.core.Cosmology
) – the cosmology of the catalogredshift (float) – the redshift of the catalog
mdef (str, optional) – string specifying mass definition, used for computing default halo radii and concentration; should be ‘vir’ or ‘XXXc’ or ‘XXXm’ where ‘XXX’ is an int specifying the overdensity
posdef (str, optional) – position, can be cm (center of mass) or peak (particle with maximum value on a column)
peakcolumn (str , optional) – when posdef is ‘peak’, this is the column in source for identifying particles at the peak for the position and velocity.
- Returns
a HaloCatalog at the specified cosmology and redshift
- Return type
- class nbodykit.algorithms.FiberCollisions(ra, dec, collision_radius=0.017222222222222226, seed=None, degrees=True, comm=None)[source]¶
Run an angular FOF algorithm to determine fiber collision groups from an input catalog, and then assign fibers such that the maximum amount of object receive a fiber.
This amounts to determining the following population of objects:
- population 1:
the maximal “clean” sample of objects in which each object is not angularly collided with any other object in this subsample
- population 2:
the potentially-collided objects; these objects are those that are fiber collided + those that have been “resolved” due to multiple coverage in tile overlap regions
Results are computed when the object is inititalized. See the documenation of
run()
for the attributes storing the results.- Parameters
ra (array_like) – the right ascension coordinate column
dec (array_like) – the declination coordinate column
collision_radius (float, optional) – the size of the angular collision radius (in degrees); default is 62 arcseconds
seed (int, optional) – the random seed to use when determining which objects get fibers
degrees (bool, optional) – set to True if the units of
ra
anddec
are degrees
References
Methods
run
()Run the fiber assignment algorithm.
- run()[source]¶
Run the fiber assignment algorithm. This attaches the following attribute to the object:
Note
The
labels
attribute has a 1-to-1 correspondence with the rows in the input source.- labels¶
a CatalogSource that has the following columns:
- Label :
the group labels for each object in the input CatalogSource; label == 0 objects are not in a group
- Collided :
a flag array specifying which objects are collided, i.e., do not receive a fiber
- NeighborID :
for those objects that are collided, this gives the (global) index of the nearest neighbor on the sky (0-indexed) in the input catalog
source
, else it is set to -1
- Type
ArrayCatalog
; size:size
- class nbodykit.algorithms.KDDensity(source, margin=1.0)[source]¶
Estimate a proxy density based on the distance to the nearest neighbor. The result is proportional to the density but the scale is unspecified.
Results are computed when the object is inititalized. See the documenation of
run()
for the attributes storing the results.- Parameters
source (CatalogSource) – the input source of particles to compute the proxy density on; must specify the ‘Position’ column
margin (float, optional) – Padding region per parallel domain; relative to the mean seperation
Methods
run
()Compute the density proxy.
- class nbodykit.algorithms.ProjectedFFTPower(first, Nmesh=None, BoxSize=None, second=None, axes=(0, 1), dk=None, kmin=0.0)[source]¶
The power spectrum of a field in a periodic box, projected over certain axes.
This is not really always physically meaningful, but convenient for making sense of Lyman-Alpha forest or lensing maps.
This is usually called the 1d power spectrum or 2d power spectrum.
Results are computed when the object is inititalized. See the documenation of
run()
for the attributes storing the results.- Parameters
first (CatalogSource, MeshSource) – the source for the first field; if a CatalogSource is provided, it is automatically converted to MeshSource using the default painting parameters (via
to_mesh()
)Nmesh (int, optional) – the number of cells per side in the particle mesh used to paint the source
BoxSize (int, 3-vector, optional) – the size of the box
second (CatalogSource, MeshSource, optional) – the second source for cross-correlations
axes (tuple) – axes to measure the power on. The axes not in the list will be averaged out. For example: - (0, 1) : project to x,y and measure power - (0) : project to x and measure power.
dk (float, optional) – the linear spacing of
k
bins to use; if not provided, the fundamental mode of the box is usedkmin (float, optional) – the lower edge of the first
k
bin to use
Methods
load
(output[, comm])Load a saved result.
run
()Run the algorithm.
save
(output)Save the result to disk.
- classmethod load(output, comm=None)¶
Load a saved result. The result has been saved to disk with
save()
.
- run()[source]¶
Run the algorithm. This attaches the following attributes to the class:
- edges¶
the edges of the wavenumber bins
- Type
array_like
- power¶
a BinnedStatistic object that holds the projected power. It stores the following variables:
- k :
the mean value for each
k
bin
- power :
complex array holding the real and imaginary components of the projected power
- modes :
the number of Fourier modes averaged together in each bin
- Type
- save(output)¶
Save the result to disk. The format is currently JSON.
- class nbodykit.algorithms.RedshiftHistogram(source, fsky, cosmo, bins=None, redshift='Redshift', weight=None)[source]¶
Compute the mean number density as a function of redshift \(n(z)\) from an input CatalogSource of particles.
Results are computed when the object is inititalized. See the documenation of
run()
for the attributes storing the results.Note
The units of the number density are \((\mathrm{Mpc}/h)^{-3}\)
- Parameters
source (CatalogSource) – the source of particles holding the redshift column to histogram
fsky (float) – the sky area fraction, which is used in the volume calculation when normalizing \(n(z)\)
cosmo (
nbodykit.cosmology.core.Cosmology
) – the cosmological parameters, which are used to compute the volume from redshift shells when normalizing \(n(z)\)bins (int or sequence of scalars, optional) – If bins is an int, it defines the number of equal-width bins in the given range. If bins is a sequence, it defines the bin edges, including the rightmost edge, allowing for non-uniform bin widths. If not provided, Scott’s rule is used to estimate the optimal bin width from the input data (default)
redshift (str, optional) – the name of the column specifying the redshift data
weight (str, optional) – the name of the column specifying weights to use when histogramming the data
Methods
interpolate
(z[, ext])Interpoalte dndz as a function of redshift.
load
(output[, comm])Load a saved RedshiftHistogram result.
run
()Run the algorithm, which computes the histogram.
save
(output)Save the RedshiftHistogram result to disk.
- interpolate(z, ext='zeros')[source]¶
Interpoalte dndz as a function of redshift.
The interpolation acts as a band pass filter, removing small scale fluctuations in the estimator.
- Parameters
z (array_like) – redshift
ext ('extrapolate', 'zeros', 'raise', 'const') – how to deal with values out of bound.
- Returns
n
- Return type
n(z)
- classmethod load(output, comm=None)[source]¶
Load a saved RedshiftHistogram result.
The result has been saved to disk with
RedshiftHistogram.save()
.
- run()[source]¶
Run the algorithm, which computes the histogram. This function does not return anything, but adds the following attributes to the class:
Note
All ranks store the same result attributes.
- bin_edges¶
the edges of the redshift bins
- Type
array_like
- bin_centers¶
the center values of each redshift bin
- Type
array_like
- dV¶
the volume of each redshift shell in units of \((\mathrm{Mpc}/h)^3\)
- Type
array_like
- nbar¶
the values of the redshift histogram, normalized to number density (in units of \((\mathrm{Mpc}/h)^{-3}\))
- Type
array_like
- class nbodykit.algorithms.SimulationBox2PCF(mode, data1, edges, Nmu=None, pimax=None, data2=None, randoms1=None, randoms2=None, R1R2=None, periodic=True, BoxSize=None, los='z', weight='Weight', position='Position', show_progress=False, **config)[source]¶
Compute the two-point correlation function for data in a simulation box as a function of \(r\), \((r,\mu)\), \((r_p, \pi)\), or \(\theta\) using pair counting.
This uses analytic randoms when using periodic conditions, unless a randoms catalog is specified. The “natural” estimator (DD/RR-1) is used in the former case, and the Landy-Szalay estimator (DD/RR - 2DR/RR + 1) in the latter case.
Note
When using analytic randoms, the expected counts are assumed to be unweighted.
- Parameters
mode ('1d', '2d', 'projected', 'angular') – the type of two-point correlation function to compute; see the Notes below
data1 (CatalogSource) – the data catalog;
edges (array_like) – the separation bin edges along the first coordinate dimension; depending on
mode
, the options are \(r\), \(r_p\), or \(\theta\). Expected units for distances are \(\mathrm{Mpc}/h\) and degrees for angles. Length of nbins+1Nmu (int, optional) – the number of \(\mu\) bins, ranging from 0 to 1; requred if
mode='2d'
pimax (float, optional) – The maximum separation along the line-of-sight when
mode='projected'
. Distances along the \(\pi\) direction are binned with unit depth. For instance, ifpimax=40
, then 40 bins will be created along the \(\pi\) direction.data2 (CatalogSource, optional) – the second data catalog to cross-correlate;
randoms1 (CatalogSource, optional) – the catalog specifying the un-clustered, random distribution for
data1
; if not provided, analytic randoms will be usedrandoms2 (CatalogSource, optional) – the catalog specifying the un-clustered, random distribution for
data2
; if not provided, analytic randoms will be usedR1R2 (SimulationBoxPairCount, optional) – if provided, random pairs R1R2 are not recalculated in the Landy-Szalay estimator
periodic (bool, optional) – whether to use periodic boundary conditions
BoxSize (float, 3-vector, optional) – the size of the box; if ‘BoxSize’ is not provided in the source ‘attrs’, it must be provided here
los ('x', 'y', 'z'; int, optional) – the axis of the simulation box to treat as the line-of-sight direction; this can be provided as string identifying one of ‘x’, ‘y’, ‘z’ or the equivalent integer number of the axis
weight (str, optional) – the name of the column in the source specifying the particle weights
position (str, optional) – the name of the column in the source specifying the particle positions
show_progress (bool, optional) – if
True
, perform the pair counting calculation in 10 iterations, logging the progress after each iteration; this is useful for understanding the scaling of the code**config (key/value pairs) – additional keywords to pass to the
Corrfunc
function
Notes
This class can compute correlation functions using several different coordinate choices, based on the value of the input argument
mode
. The choices are:mode='1d'
: compute pairs as a function of the 3D separation \(r\)mode='2d'
: compute pairs as a function of the 3D separation \(r\) and the cosine of the angle to the line-of-sight, \(\mu\)mode='projected'
: compute pairs as a function of distance perpendicular and parallel to the line-of-sight, \(r_p\) and \(\pi\)mode='angular'
: compute pairs as a function of angle on the sky, \(\theta\)
If
mode='projected'
, the projected correlation function \(w_p(r_p)\) is also computed, using the input \(\pi_\mathrm{max}\) value.Methods
load
(output[, comm])Load a result has been saved to disk with
save()
.run
()Run the two-point correlation function algorithm.
save
(output)Save result as a JSON file with name
output
- run()[source]¶
Run the two-point correlation function algorithm. This attaches the following attributes:
SimulationBox2PCF.wp
(ifmode='projected'
)
- D1D2¶
the data1 - data2 pair counts
- Type
- D1R2¶
the data1 - randoms2 pair counts
- Type
- D2R1¶
the data2 - randoms1 pair counts
- Type
- R1R2¶
the randoms1 - randoms2 pair counts
- Type
- corr¶
the correlation function values, stored as the
corr
variable, computed from the pair counts- Type
- wp¶
the projected correlation function, \(w_p(r_p)\), computed if
mode='projected'
; correlation is stored as thecorr
variable- Type
Notes
The
D1D2
,D1R2
,D2R1
, andR1R2
attributes are identical to thepairs
attribute ofSimulationBoxPairCount
.
- save(output)¶
Save result as a JSON file with name
output
- class nbodykit.algorithms.SimulationBox3PCF(source, poles, edges, BoxSize=None, periodic=True, weight='Weight', position='Position')[source]¶
Compute the multipoles of the isotropic, three-point correlation function in configuration space for data in a simulation box.
This uses the algorithm of Slepian and Eisenstein, 2015 which scales as \(\mathcal{O}(N^2)\), where \(N\) is the number of objects.
Results are computed when the object is inititalized. See the documenation of
run()
for the attributes storing the results.Note
The algorithm expects the positions of objects in a simulation box to be the Cartesian
x
,y
, andz
vectors. For survey data, in the form of right ascension, declination, and redshift, seeSurveyData3PCF
.- Parameters
source (CatalogSource) – the input source of particles providing the ‘Position’ column
poles (list of int) – the list of multipole numbers to compute
edges (array_like) – the edges of the bins of separation to use; length of nbins+1
BoxSize (float, 3-vector, optional) – the size of the box; if periodic boundary conditions used, and ‘BoxSize’ not provided in the source
attrs
, it must be provided hereperiodic (bool, optional) – whether to use periodic boundary conditions when computing separations between objects
weight (str, optional) – the name of the column in the source specifying the particle weights
References
Slepian and Eisenstein, MNRAS 454, 4142-4158 (2015)
Methods
load
(filename[, comm])Load a result from
filename
that has been saved to disk withsave()
.run
([pedantic])Compute the three-point CF multipoles.
save
(output)Save the
poles
result to a JSON file with nameoutput
.- classmethod load(filename, comm=None)¶
Load a result from
filename
that has been saved to disk withsave()
.
- run(pedantic=False)[source]¶
Compute the three-point CF multipoles. This attaches the following the attributes to the class:
- poles¶
a BinnedStatistic object to hold the multipole results; the binned statistics stores the multipoles as variables
corr_0
,corr_1
, etc for \(\ell=0,1,\) etc. The coordinates of the binned statistic arer1
andr2
, which give the separations between the three objects in CF.- Type
- class nbodykit.algorithms.SimulationBoxPairCount(mode, first, edges, BoxSize=None, periodic=True, second=None, los='z', Nmu=None, pimax=None, weight='Weight', position='Position', show_progress=False, **config)[source]¶
Count (weighted) pairs of objects in a simulation box as a function of \(r\), \((r,\mu)\), \((r_p, \pi)\), or \(\theta\) using the
Corrfunc
package.See the Notes below for the allowed coordinate dimensions.
The default weighting scheme uses the product of the weights for each object in a pair.
Results are computed when the object is inititalized. See the documenation of
run()
for the attributes storing the results.Note
The algorithm expects the positions of particles in a simulation box to be the Cartesian
x
,y
, andz
vectors. To compute pair counts on survey data, using right ascension, declination, and redshift, seeSurveyDataPairCount
.- Parameters
mode ('1d', '2d', 'projected', 'angular') – compute pair counts as a function of the specified coordinate basis; see the Notes section below for specifics
first (CatalogSource) – the first source of particles, providing the position column
edges (array_like) – the separation bin edges along the first coordinate dimension; depending on
mode
, the options are \(r\), \(r_p\), or \(\theta\). Expected units for distances are \(\mathrm{Mpc}/h\) and degrees for angles. Length of nbins+1BoxSize (float, 3-vector, optional) – the size of the box; if ‘BoxSize’ is not provided in the source ‘attrs’, it must be provided here
periodic (bool, optional) – whether to use periodic boundary conditions
second (CatalogSource, optional) – the second source of particles to cross-correlate
los ({'x', 'y', 'z'}, int, optional) – the axis of the simulation box to treat as the line-of-sight direction; this can be provided as string identifying one of ‘x’, ‘y’, ‘z’ or the equivalent integer number of the axis
Nmu (int, optional) – the number of \(\mu\) bins, ranging from 0 to 1; requred if
mode='2d'
pimax (float, optional) – The maximum separation along the line-of-sight when
mode='projected'
. Distances along the \(\pi\) direction are binned with unit depth. For instance, ifpimax=40
, then 40 bins will be created along the \(\pi\) direction.weight (str, optional) – the name of the column in the source specifying the particle weights
position (str, optional) – name of the column of the position of particles
show_progress (bool, optional) – if
True
, perform the pair counting calculation in 10 iterations, logging the progress after each iteration; this is useful for understanding the scaling of the code**config (key/value pairs) – additional keywords to pass to the
Corrfunc
function
Notes
This class can compute pair counts using several different coordinate choices, based on the value of the input argument
mode
. The choices are:mode='1d'
: compute pairs as a function of the 3D separation \(r\)mode='2d'
: compute pairs as a function of the 3D separation \(r\) and the cosine of the angle to the line-of-sight, \(\mu\)mode='projected'
: compute pairs as a function of distance perpendicular and parallel to the line-of-sight, \(r_p\) and \(\pi\)mode='angular'
: compute pairs as a function of angle on the sky, \(\theta\)
For angular pair counts, the observer is placed at the center of the box when converting Cartesian coordinates to angular coordinates on the unit sphere.
Methods
load
(output[, comm])Load a result has been saved to disk with
save()
.run
()Calculate the pair counts in a simulation box.
save
(output)Save result as a JSON file with name
output
- run()[source]¶
Calculate the pair counts in a simulation box. This adds the following attributes to the class:
- pairs¶
a BinnedStatistic object holding the pair count results. The coordinate grid will be
(r,)
,(r,mu)
,(rp, pi)
, or(theta,)
whenmode
is ‘1d’, ‘2d’, ‘projected’, ‘angular’, respectively.The BinnedStatistic stores the following variables:
r
,rp
, ortheta
: the mean separation value in the binnpairs
: the number of pairs in the binwnpairs
: the average weight value in the bin; each pair contributes the product of the individual weight values
- Type
- save(output)¶
Save result as a JSON file with name
output
- class nbodykit.algorithms.SurveyData2PCF(mode, data1, randoms1, edges, cosmo=None, Nmu=None, pimax=None, data2=None, randoms2=None, R1R2=None, ra='RA', dec='DEC', redshift='Redshift', weight='Weight', show_progress=False, **config)[source]¶
Compute the two-point correlation function for observational survey data as a function of \(r\), \((r,\mu)\), \((r_p, \pi)\), or \(\theta\) using pair counting.
The Landy-Szalay estimator (DD/RR - 2 DD/RR + 1) is used to transform pair counts in to the correlation function.
- Parameters
mode ('1d', '2d', 'projected', 'angular') – the type of two-point correlation function to compute; see the Notes below
data1 (CatalogSource) – the data catalog; must have ra, dec, redshift, columns
randoms1 (CatalogSource) – the catalog specifying the un-clustered, random distribution for
data1
edges (array_like) – the separation bin edges along the first coordinate dimension; depending on
mode
, the options are \(r\), \(r_p\), or \(\theta\). Expected units for distances are \(\mathrm{Mpc}/h\) and degrees for angles. Length of nbins+1cosmo (
Cosmology
, optional) – the cosmology instance used to convert redshift into comoving distance; this is required for all cases exceptmode='angular'
Nmu (int, optional) – the number of \(\mu\) bins, ranging from 0 to 1; requred if
mode='2d'
pimax (float, optional) – The maximum separation along the line-of-sight when
mode='projected'
. Distances along the \(\pi\) direction are binned with unit depth. For instance, ifpimax=40
, then 40 bins will be created along the \(\pi\) direction.data2 (CatalogSource, optional) – the second data catalog to cross-correlate;
randoms2 (CatalogSource, optional) – the catalog specifying the un-clustered, random distribution for
data2
; if not specified anddata2
is provied, thenrandoms1
will be used for both.R1R2 (SurveyDataPairCount, optional) – if provided, random pairs R1R2 are not recalculated in the Landy-Szalay estimator
ra (str, optional) – the name of the column in the source specifying the right ascension coordinates in units of degrees; default is ‘RA’
dec (str, optional) – the name of the column in the source specifying the declination coordinates; default is ‘DEC’
redshift (str, optional) – the name of the column in the source specifying the redshift coordinates; default is ‘Redshift’
weight (str, optional) – the name of the column in the source specifying the object weights
show_progress (bool, optional) – if
True
, perform the pair counting calculation in 10 iterations, logging the progress after each iteration; this is useful for understanding the scaling of the code**config (key/value pairs) – additional keywords to pass to the
Corrfunc
function
Notes
This class can compute correlation functions using several different coordinate choices, based on the value of the input argument
mode
. The choices are:mode='1d'
: compute pairs as a function of the 3D separation \(r\)mode='2d'
: compute pairs as a function of the 3D separation \(r\) and the cosine of the angle to the line-of-sight, \(\mu\)mode='projected'
: compute pairs as a function of distance perpendicular and parallel to the line-of-sight, \(r_p\) and \(\pi\)mode='angular'
: compute pairs as a function of angle on the sky, \(\theta\)
If
mode='projected'
, the projected correlation function \(w_p(r_p)\) is also computed, using the input \(\pi_\mathrm{max}\) value.Methods
load
(output[, comm])Load a result has been saved to disk with
save()
.run
()Run the two-point correlation function algorithm.
save
(output)Save result as a JSON file with name
output
- run()[source]¶
Run the two-point correlation function algorithm. This attaches the following attributes:
SurveyData2PCF.wp
(ifmode='projected'
)
- D1D2¶
the data1 - data2 pair counts
- Type
- D1R2¶
the data1 - randoms2 pair counts
- Type
- D2R1¶
the data2 - randoms1 pair counts
- Type
- R1R2¶
the randoms1 - randoms2 pair counts
- Type
- corr¶
the correlation function values, stored as the
corr
variable, computed from the pair counts- Type
- wp¶
the projected correlation function, \(w_p(r_p)\), computed if
mode='projected'
; correlation is stored as thecorr
variable- Type
Notes
The
D1D2
,D1R2
,D2R1
, andR1R2
attributes are identical to thepairs
attribute ofSurveyDataPairCount
.
- save(output)¶
Save result as a JSON file with name
output
- class nbodykit.algorithms.SurveyData3PCF(source, poles, edges, cosmo, domain_factor=4, ra='RA', dec='DEC', redshift='Redshift', weight='Weight')[source]¶
Compute the multipoles of the isotropic, three-point correlation function in configuration space for observational survey data.
This uses the algorithm of Slepian and Eisenstein, 2015 which scales as \(\mathcal{O}(N^2)\), where \(N\) is the number of objects.
Results are computed when the object is inititalized. See the documenation of
run()
for the attributes storing the results.Note
The algorithm expects the positions of objects from a survey catalog be the sky coordinates, right ascension and declination, and redshift. For simulation box data in Cartesian coordinates, see
SimulationBox3PCF
.Warning
The right ascension and declination columns should be specified in degrees.
- Parameters
source (CatalogSource) – the input source of particles providing the ‘Position’ column
poles (list of int) – the list of multipole numbers to compute
edges (array_like) – the edges of the bins of separation to use; length of nbins+1
cosmo (
Cosmology
) – the cosmology instance used to convert redshifts into comoving distancesra (str, optional) – the name of the column in the source specifying the right ascension coordinates in units of degrees; default is ‘RA’
dec (str, optional) – the name of the column in the source specifying the declination coordinates; default is ‘DEC’
redshift (str, optional) – the name of the column in the source specifying the redshift coordinates; default is ‘Redshift’
weight (str, optional) – the name of the column in the source specifying the object weights
domain_factor (int, optional) – the integer value by which to oversubscribe the domain decomposition mesh before balancing loads; this number can affect the distribution of loads on the ranks – an optimal value will lead to balanced loads
References
Slepian and Eisenstein, MNRAS 454, 4142-4158 (2015)
Methods
load
(filename[, comm])Load a result from
filename
that has been saved to disk withsave()
.run
()Compute the three-point CF multipoles.
save
(output)Save the
poles
result to a JSON file with nameoutput
.- classmethod load(filename, comm=None)¶
Load a result from
filename
that has been saved to disk withsave()
.
- run()[source]¶
Compute the three-point CF multipoles. This attaches the following the attributes to the class:
- poles¶
a BinnedStatistic object to hold the multipole results; the binned statistics stores the multipoles as variables
corr_0
,corr_1
, etc for \(\ell=0,1,\) etc. The coordinates of the binned statistic arer1
andr2
, which give the separations between the three objects in CF.- Type
- class nbodykit.algorithms.SurveyDataPairCount(mode, first, edges, cosmo=None, second=None, Nmu=None, pimax=None, ra='RA', dec='DEC', redshift='Redshift', weight='Weight', show_progress=False, domain_factor=4, **config)[source]¶
Count (weighted) pairs of objects from a survey data catalog as a function of \(r\), \((r,\mu)\), \((r_p, \pi)\), or \(\theta\) using the
Corrfunc
package.See the Notes below for the allowed coordinate dimensions.
The default weighting scheme uses the product of the weights for each object in a pair.
Results are computed when the class is inititalized. See the documenation of
run()
for the attributes storing the results.Note
The algorithm expects the positions of particles from a survey catalog be the sky coordinates, right ascension and declination, and redshift. To compute pair counts in a simulation box using Cartesian coordinates, see
SimulationBoxPairCount
.Warning
The right ascension and declination columns should be specified in degrees.
- Parameters
mode ('1d', '2d', 'projected', 'angular') – compute pair counts as a function of the specified coordinate basis; see the Notes section below for specifics
first (CatalogSource) – the first source of particles, providing the ‘Position’ column
edges (array_like) – the separation bin edges along the first coordinate dimension; depending on
mode
, the options are \(r\), \(r_p\), or \(\theta\). Expected units for distances are \(\mathrm{Mpc}/h\) and degrees for angles. Length of nbins+1cosmo (
Cosmology
, optional) – the cosmology instance used to convert redshift into comoving distance; this is required for all cases exceptmode='angular'
second (CatalogSource, optional) – the second source of particles to cross-correlate
Nmu (int, optional) – the number of \(\mu\) bins, ranging from 0 to 1; requred if
mode='2d'
pimax (float, optional) – The maximum separation along the line-of-sight when
mode='projected'
. Distances along the \(\pi\) direction are binned with unit depth. For instance, ifpimax=40
, then 40 bins will be created along the \(\pi\) direction.ra (str, optional) – the name of the column in the source specifying the right ascension coordinates in units of degrees; default is ‘RA’
dec (str, optional) – the name of the column in the source specifying the declination coordinates; default is ‘DEC’
redshift (str, optional) – the name of the column in the source specifying the redshift coordinates; default is ‘Redshift’
weight (str, optional) – the name of the column in the source specifying the object weights
show_progress (bool, optional) – if
True
, perform the pair counting calculation in 10 iterations, logging the progress after each iteration; this is useful for understanding the scaling of the codedomain_factor (int, optional) – the integer value by which to oversubscribe the domain decomposition mesh before balancing loads; this number can affect the distribution of loads on the ranks – an optimal value will lead to balanced loads
**config (key/value pairs) – additional keywords to pass to the
Corrfunc
function
Notes
This class can compute pair counts using several different coordinate choices, based on the value of the input argument
mode
. The choices are:mode='1d'
: compute pairs as a function of the 3D separation \(r\)mode='2d'
: compute pairs as a function of the 3D separation \(r\) and the cosine of the angle to the line-of-sight, \(\mu\)mode='projected'
: compute pairs as a function of distance perpendicular and parallel to the line-of-sight, \(r_p\) and \(\pi\)mode='angular'
: compute pairs as a function of angle on the sky, \(\theta\)
Methods
load
(output[, comm])Load a result has been saved to disk with
save()
.run
()Calculate the pair counts of a survey data catalog.
save
(output)Save result as a JSON file with name
output
- run()[source]¶
Calculate the pair counts of a survey data catalog. This adds the following attribute:
self.pairs.attrs[‘total_wnpairs’]: The total of wnpairs.
- pairs¶
a BinnedStatistic object holding the pair count results. The coordinate grid will be
(r,)
,(r,mu)
,(rp, pi)
, or(theta,)
whenmode
is ‘1d’, ‘2d’, ‘projected’, ‘angular’, respectively.The BinnedStatistic stores the following variables:
r
,rp
, ortheta
: the mean separation value in the binnpairs
: the number of pairs in the binwnpairs
: the weighted npairs in the bin; each pair contributes the product of the individual weight values
- Type
- save(output)¶
Save result as a JSON file with name
output