# nbodykit.algorithms.paircount_tpcf.tpcf

Classes

• BasePairCount2PCF(mode, data1, edges[, Nmu, ...]) – Base class for two-point correlation function algorithms that use pair counting.

• SimulationBox2PCF(mode, data1, edges[, Nmu, ...]) – Compute the two-point correlation function for data in a simulation box as a function of $$r$$, $$(r,\mu)$$, $$(r_p, \pi)$$, or $$\theta$$ using pair counting.

• SurveyData2PCF(mode, data1, randoms1, edges) – Compute the two-point correlation function for observational survey data as a function of $$r$$, $$(r,\mu)$$, $$(r_p, \pi)$$, or $$\theta$$ using pair counting.

class nbodykit.algorithms.paircount_tpcf.tpcf.BasePairCount2PCF(mode, data1, edges, Nmu=None, pimax=None, randoms1=None, randoms2=None, data2=None, R1R2=None, **kws)[source]

Base class for two-point correlation function algorithms that use pair counting. The API largely follows that of SimulationBoxPairCount and SurveyDataPairCount.

Parameters
• mode ('1d', '2d', 'projected', 'angular') – the type of two-point correlation function to compute

• data1 (CatalogSource) – the data catalog; must have a ‘Position’ column

• edges (array_like) – the bin edges along the first binning dimension

• Nmu (int, optional) – when mode is ‘2d’, the number of mu bins, ranging from 0 to 1

• pimax (float, optional) – when mode is ‘projected’, the maximum separation along the line-of-sight

• randoms1 (CatalogSource, optional) – the catalog specifying the un-clustered, random distribution for data1; if not provided, analytic randoms will be used

• randoms2 (CatalogSource, optional) – the catalog specifying the un-clustered, random distribution for data2; if not provided, analytic randoms will be used

• data2 (CatalogSource, optional) – the second data catalog to cross-correlate; must have a ‘Position’ column

• R1R2 (SimulationBoxPairCount, SurveyDataPairCount, optional) – if provided, random pairs R1R2 are not recalculated in the Landy-Szalay estimator

• **kws – additional keyword arguments passed to the appropriate pair counting class

Methods

• load(output[, comm]) – Load a result that has been saved to disk with save().

• run() – Run the two-point correlation function algorithm.

• save(output) – Save result as a JSON file with name output.

load(output[, comm])

Load a result that has been saved to disk with save().

run()[source]

Run the two-point correlation function algorithm.

There are two cases here:

1. If no randoms were provided, and the data is in a simulation box with periodic boundary conditions, the natural estimator $$DD/RR - 1$$ is used.

2. If randoms were provided, the Landy-Szalay estimator is used: $$(D_1 D_2 - D_1 R_2 - D_2 R_1 + R_1 R_2) / R_1 R_2$$

Raises
• ValueError : – if periodic boundary conditions were not requested, and randoms1 is None

• ValueError : – if periodic boundary conditions were not requested, and data2 is not None, but randoms2 is None
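The two estimators described for run() can be sketched with plain NumPy arrays. This is only an illustration, not the library's implementation: it assumes the pair counts have already been normalized by the total number of pairs in each catalog combination, and the function names natural_estimator and landy_szalay are hypothetical.

```python
import numpy as np

def natural_estimator(DD, RR):
    """Natural estimator DD/RR - 1 (assumes normalized pair counts)."""
    DD = np.asarray(DD, dtype=float)
    RR = np.asarray(RR, dtype=float)
    return DD / RR - 1.0

def landy_szalay(D1D2, D1R2, D2R1, R1R2):
    """Landy-Szalay estimator (D1D2 - D1R2 - D2R1 + R1R2) / R1R2.

    Assumes all four inputs are normalized pair counts on the same bins.
    """
    D1D2, D1R2, D2R1, R1R2 = (
        np.asarray(a, dtype=float) for a in (D1D2, D1R2, D2R1, R1R2)
    )
    return (D1D2 - D1R2 - D2R1 + R1R2) / R1R2

# Toy normalized counts in three separation bins (made-up numbers).
DD = np.array([1.5, 1.2, 1.0])
DR = np.array([1.1, 1.0, 1.0])
RR = np.array([1.0, 1.0, 1.0])

xi_natural = natural_estimator(DD, RR)
# Auto-correlation: data2 is data1, so D1R2 and D2R1 are the same counts.
xi_ls = landy_szalay(DD, DR, DR, RR)
```

For an auto-correlation the Landy-Szalay expression reduces to DD/RR - 2 DR/RR + 1, which is why the same DR array is passed twice above.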

save(output)[source]

Save result as a JSON file with name output

class nbodykit.algorithms.paircount_tpcf.tpcf.SimulationBox2PCF(mode, data1, edges, Nmu=None, pimax=None, data2=None, randoms1=None, randoms2=None, R1R2=None, periodic=True, BoxSize=None, los='z', weight='Weight', position='Position', show_progress=False, **config)[source]

Compute the two-point correlation function for data in a simulation box as a function of $$r$$, $$(r,\mu)$$, $$(r_p, \pi)$$, or $$\theta$$ using pair counting.

Analytic randoms are used with periodic boundary conditions, unless a randoms catalog is specified. The “natural” estimator (DD/RR - 1) is used in the former case, and the Landy-Szalay estimator (DD/RR - 2 DR/RR + 1) in the latter case.

Note

When using analytic randoms, the expected counts are assumed to be unweighted.
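To make the note concrete, here is a minimal sketch of what an unweighted analytic random pair count might look like for mode='1d' in a periodic box: the expected count in each bin is the number of pairs times the fraction of the box volume occupied by the spherical shell. The function analytic_rr_1d and its arguments are hypothetical, and the pair-counting convention (all n1 * n2 ordered pairs, no self-pair removal) is an assumption rather than a statement about the library's internals.

```python
import numpy as np

def analytic_rr_1d(edges, n1, n2, box_volume):
    """Expected unweighted pair counts per r bin for uniform points.

    Assumes a periodic box and counts all n1 * n2 ordered pairs.
    """
    edges = np.asarray(edges, dtype=float)
    # Volume of each spherical shell between consecutive bin edges.
    shell_volume = 4.0 / 3.0 * np.pi * np.diff(edges**3)
    return n1 * n2 * shell_volume / box_volume

edges = np.array([0.0, 1.0, 2.0])  # bin edges in Mpc/h
rr = analytic_rr_1d(edges, n1=100, n2=100, box_volume=1000.0)
```

The expression only holds while the largest separation stays well below the box size, so that each shell fits inside the periodic volume.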

Parameters
• mode ('1d', '2d', 'projected', 'angular') – the type of two-point correlation function to compute; see the Notes below

• data1 (CatalogSource) – the data catalog

• edges (array_like) – the separation bin edges along the first coordinate dimension; depending on mode, the options are $$r$$, $$r_p$$, or $$\theta$$. Expected units are $$\mathrm{Mpc}/h$$ for distances and degrees for angles. The array has length nbins+1.

• Nmu (int, optional) – the number of $$\mu$$ bins, ranging from 0 to 1; required if mode='2d'

• pimax (float, optional) – The maximum separation along the line-of-sight when mode='projected'. Distances along the $$\pi$$ direction are binned with unit depth. For instance, if pimax=40, then 40 bins will be created along the $$\pi$$ direction.

• data2 (CatalogSource, optional) – the second data catalog to cross-correlate

• randoms1 (CatalogSource, optional) – the catalog specifying the un-clustered, random distribution for data1; if not provided, analytic randoms will be used

• randoms2 (CatalogSource, optional) – the catalog specifying the un-clustered, random distribution for data2; if not provided, analytic randoms will be used

• R1R2 (SimulationBoxPairCount, optional) – if provided, random pairs R1R2 are not recalculated in the Landy-Szalay estimator

• periodic (bool, optional) – whether to use periodic boundary conditions

• BoxSize (float, 3-vector, optional) – the size of the box; if ‘BoxSize’ is not provided in the source ‘attrs’, it must be provided here

• los ('x', 'y', 'z'; int, optional) – the axis of the simulation box to treat as the line-of-sight direction; this can be provided as string identifying one of ‘x’, ‘y’, ‘z’ or the equivalent integer number of the axis

• weight (str, optional) – the name of the column in the source specifying the particle weights

• position (str, optional) – the name of the column in the source specifying the particle positions

• show_progress (bool, optional) – if True, perform the pair counting calculation in 10 iterations, logging the progress after each iteration; this is useful for understanding the scaling of the code

• **config (key/value pairs) – additional keywords to pass to the Corrfunc function

Notes

This class can compute correlation functions using several different coordinate choices, based on the value of the input argument mode. The choices are:

• mode='1d' : compute pairs as a function of the 3D separation $$r$$

• mode='2d' : compute pairs as a function of the 3D separation $$r$$ and the cosine of the angle to the line-of-sight, $$\mu$$

• mode='projected' : compute pairs as a function of distance perpendicular and parallel to the line-of-sight, $$r_p$$ and $$\pi$$

• mode='angular' : compute pairs as a function of angle on the sky, $$\theta$$
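The coordinate choices listed above can be illustrated for a single pair of points. The sketch below assumes a fixed line-of-sight axis (as with the los argument) and applies no periodic wrapping; the helper names pair_coordinates and angular_separation are hypothetical and are not part of nbodykit.

```python
import numpy as np

def pair_coordinates(p1, p2, los=2):
    """Return (r, mu, rp, pi_sep) for one pair, given a line-of-sight axis index."""
    d = np.asarray(p2, dtype=float) - np.asarray(p1, dtype=float)
    r = np.linalg.norm(d)                     # 3D separation ('1d' and '2d')
    pi_sep = abs(d[los])                      # separation parallel to the line-of-sight
    rp = np.sqrt(max(r**2 - pi_sep**2, 0.0))  # separation perpendicular to it
    mu = pi_sep / r if r > 0 else 0.0         # cosine of the angle to the line-of-sight
    return r, mu, rp, pi_sep

def angular_separation(u1, u2):
    """Angle on the sky, in degrees, between two direction vectors ('angular')."""
    u1 = np.asarray(u1, dtype=float) / np.linalg.norm(u1)
    u2 = np.asarray(u2, dtype=float) / np.linalg.norm(u2)
    cos_theta = np.clip(np.dot(u1, u2), -1.0, 1.0)
    return np.degrees(np.arccos(cos_theta))

r, mu, rp, pi_sep = pair_coordinates([0.0, 0.0, 0.0], [3.0, 0.0, 4.0], los=2)
theta = angular_separation([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Taking the absolute value of the line-of-sight component keeps $$\mu$$ in the range 0 to 1, matching the range stated for the Nmu bins.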

If mode='projected', the projected correlation function $$w_p(r_p)$$ is also computed, using the input $$\pi_\mathrm{max}$$ value.
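A common definition of the projected correlation function is $$w_p(r_p) = 2 \int_0^{\pi_\mathrm{max}} \xi(r_p, \pi) \, d\pi$$. With unit-depth $$\pi$$ bins, the integral becomes a sum over the $$\pi$$ axis. The sketch below assumes this definition and a hypothetical helper projected_wp; it is not taken from the library's source.

```python
import numpy as np

def projected_wp(xi_rp_pi, dpi=1.0):
    """Sum xi(rp, pi) over the pi axis: wp = 2 * sum(xi) * dpi.

    xi_rp_pi has shape (n_rp_bins, n_pi_bins); dpi is the pi bin width.
    """
    xi = np.asarray(xi_rp_pi, dtype=float)
    return 2.0 * np.sum(xi, axis=-1) * dpi

# Toy input: 3 rp bins, pimax=40 with unit-depth bins gives 40 pi bins.
xi = np.full((3, 40), 0.05)
wp = projected_wp(xi)
```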

Methods

• load(output[, comm]) – Load a result that has been saved to disk with save().

• run() – Run the two-point correlation function algorithm.

• save(output) – Save result as a JSON file with name output.

load(output[, comm])

Load a result that has been saved to disk with save().

run()[source]

Run the two-point correlation function algorithm. This attaches the following attributes:

D1D2

the data1 - data2 pair counts

Type

BinnedStatistic

D1R2

the data1 - randoms2 pair counts

Type

BinnedStatistic

D2R1

the data2 - randoms1 pair counts

Type

BinnedStatistic

R1R2

the randoms1 - randoms2 pair counts

Type

BinnedStatistic

corr

the correlation function values, stored as the corr variable, computed from the pair counts

Type

BinnedStatistic

wp

the projected correlation function, $$w_p(r_p)$$, computed if mode='projected'; correlation is stored as the corr variable

Type

BinnedStatistic

Notes

The D1D2, D1R2, D2R1, and R1R2 attributes are identical to the pairs attribute of SimulationBoxPairCount.

save(output)

Save result as a JSON file with name output

class nbodykit.algorithms.paircount_tpcf.tpcf.SurveyData2PCF(mode, data1, randoms1, edges, cosmo=None, Nmu=None, pimax=None, data2=None, randoms2=None, R1R2=None, ra='RA', dec='DEC', redshift='Redshift', weight='Weight', show_progress=False, **config)[source]

Compute the two-point correlation function for observational survey data as a function of $$r$$, $$(r,\mu)$$, $$(r_p, \pi)$$, or $$\theta$$ using pair counting.

The Landy-Szalay estimator (DD/RR - 2 DR/RR + 1) is used to transform pair counts into the correlation function.

Parameters
• mode ('1d', '2d', 'projected', 'angular') – the type of two-point correlation function to compute; see the Notes below

• data1 (CatalogSource) – the data catalog; must have ra, dec, and redshift columns

• randoms1 (CatalogSource) – the catalog specifying the un-clustered, random distribution for data1

• edges (array_like) – the separation bin edges along the first coordinate dimension; depending on mode, the options are $$r$$, $$r_p$$, or $$\theta$$. Expected units are $$\mathrm{Mpc}/h$$ for distances and degrees for angles. The array has length nbins+1.

• cosmo (Cosmology, optional) – the cosmology instance used to convert redshift into comoving distance; this is required for all cases except mode='angular'

• Nmu (int, optional) – the number of $$\mu$$ bins, ranging from 0 to 1; required if mode='2d'

• pimax (float, optional) – The maximum separation along the line-of-sight when mode='projected'. Distances along the $$\pi$$ direction are binned with unit depth. For instance, if pimax=40, then 40 bins will be created along the $$\pi$$ direction.

• data2 (CatalogSource, optional) – the second data catalog to cross-correlate

• randoms2 (CatalogSource, optional) – the catalog specifying the un-clustered, random distribution for data2; if not specified and data2 is provided, then randoms1 will be used for both

• R1R2 (SurveyDataPairCount, optional) – if provided, random pairs R1R2 are not recalculated in the Landy-Szalay estimator

• ra (str, optional) – the name of the column in the source specifying the right ascension coordinates in units of degrees; default is ‘RA’

• dec (str, optional) – the name of the column in the source specifying the declination coordinates; default is ‘DEC’

• redshift (str, optional) – the name of the column in the source specifying the redshift coordinates; default is ‘Redshift’

• weight (str, optional) – the name of the column in the source specifying the object weights

• show_progress (bool, optional) – if True, perform the pair counting calculation in 10 iterations, logging the progress after each iteration; this is useful for understanding the scaling of the code

• **config (key/value pairs) – additional keywords to pass to the Corrfunc function
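The cosmo parameter exists because sky coordinates plus redshift have to be turned into 3D comoving positions before separations in $$\mathrm{Mpc}/h$$ can be measured. The sketch below illustrates that conversion under loud assumptions: toy_comoving_distance is a hypothetical low-redshift stand-in ($$d \approx cz/H_0$$) for a real cosmology object, sky_to_cartesian is a hypothetical helper, and both angles are assumed to be in degrees.

```python
import numpy as np

C_OVER_H100 = 2997.92458  # c / (100 km/s/Mpc), i.e. the Hubble distance in Mpc/h

def toy_comoving_distance(z):
    """Low-redshift approximation d = c z / H0; NOT a real cosmology."""
    return C_OVER_H100 * np.asarray(z, dtype=float)

def sky_to_cartesian(ra, dec, redshift, comoving_distance=toy_comoving_distance):
    """Convert (ra, dec) in degrees and redshift to Cartesian positions in Mpc/h."""
    ra = np.radians(np.asarray(ra, dtype=float))
    dec = np.radians(np.asarray(dec, dtype=float))
    d = comoving_distance(redshift)
    x = d * np.cos(dec) * np.cos(ra)
    y = d * np.cos(dec) * np.sin(ra)
    z = d * np.sin(dec)
    return np.stack([x, y, z], axis=-1)

# Two objects at the same redshift, 90 degrees apart on the sky.
pos = sky_to_cartesian([0.0, 90.0], [0.0, 0.0], [0.1, 0.1])
```

For mode='angular' only the directions matter, which is consistent with cosmo being optional in that case.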

Notes

This class can compute correlation functions using several different coordinate choices, based on the value of the input argument mode. The choices are:

• mode='1d' : compute pairs as a function of the 3D separation $$r$$

• mode='2d' : compute pairs as a function of the 3D separation $$r$$ and the cosine of the angle to the line-of-sight, $$\mu$$

• mode='projected' : compute pairs as a function of distance perpendicular and parallel to the line-of-sight, $$r_p$$ and $$\pi$$

• mode='angular' : compute pairs as a function of angle on the sky, $$\theta$$

If mode='projected', the projected correlation function $$w_p(r_p)$$ is also computed, using the input $$\pi_\mathrm{max}$$ value.

Methods

• load(output[, comm]) – Load a result that has been saved to disk with save().

• run() – Run the two-point correlation function algorithm.

• save(output) – Save result as a JSON file with name output.

load(output[, comm])

Load a result that has been saved to disk with save().

run()[source]

Run the two-point correlation function algorithm. This attaches the following attributes:

D1D2

the data1 - data2 pair counts

Type

BinnedStatistic

D1R2

the data1 - randoms2 pair counts

Type

BinnedStatistic

D2R1

the data2 - randoms1 pair counts

Type

BinnedStatistic

R1R2

the randoms1 - randoms2 pair counts

Type

BinnedStatistic

corr

the correlation function values, stored as the corr variable, computed from the pair counts

Type

BinnedStatistic

wp

the projected correlation function, $$w_p(r_p)$$, computed if mode='projected'; correlation is stored as the corr variable

Type

BinnedStatistic

Notes

The D1D2, D1R2, D2R1, and R1R2 attributes are identical to the pairs attribute of SurveyDataPairCount.

save(output)

Save result as a JSON file with name output