nbodykit.algorithms.paircount_tpcf.estimators

Functions

LandySzalayEstimator(pair_counter, data1, ...)

Compute the correlation function from data/randoms using the Landy-Szalay estimator.

NaturalEstimator(D1D2)

Internal function to compute the correlation function using analytic randoms and the so-called "natural" correlation function estimator, \(DD/RR - 1\).

Classes

AnalyticUniformRandoms(mode, dims, edges, ...)

Internal class to compute analytic pair counts for uniformly distributed randoms.

WedgeBinnedStatistic(dims, edges, data[, ...])

A BinnedStatistic of wedges that can be converted to multipoles.

class nbodykit.algorithms.paircount_tpcf.estimators.AnalyticUniformRandoms(mode, dims, edges, BoxSize)[source]

Internal class to compute analytic pair counts for uniformly distributed randoms.

This can compute the expected randoms counts for several coordinate choices, based on mode. The relevant volume/area calculations for the various coordinate modes are:

  • mode='1d': volume of sphere

  • mode='2d': volume of spherical sector

  • mode='projected': volume of cylinder

  • mode='angular': area of spherical cap

Methods

__call__(NR1[, NR2])

Evaluate the expected randoms pair counts and the total_weight, returning an object that mimics the result of paircount.

get_filling_factor()

This gives the ratio of the volume (or area) occupied by each bin to the global volume (or area).

__call__(NR1, NR2=None)[source]

Evaluate the expected randoms pair counts and the total_weight, returning an object that mimics the result of paircount.

get_filling_factor()[source]

This gives the ratio of the volume (or area) occupied by each bin to the global volume (or area).

It is different based on the value of mode.
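
As an illustration, a minimal numpy sketch of the mode='1d' case: each bin's filling factor is the volume of the spherical shell between its edges divided by the box volume, which then gives the expected analytic RR counts. The function name `filling_factor_1d` is illustrative only, not the library API.

```python
import numpy as np

def filling_factor_1d(edges, BoxSize):
    """Fraction of the periodic box volume occupied by each
    spherical shell bin (mode='1d'): V_shell / V_box."""
    edges = np.asarray(edges, dtype=float)
    # accept a scalar or per-side BoxSize
    V_box = np.prod(np.atleast_1d(BoxSize) * np.ones(3))
    V_shell = 4.0 / 3.0 * np.pi * (edges[1:]**3 - edges[:-1]**3)
    return V_shell / V_box

# expected analytic RR counts for N uniform randoms scale as
# RR ~ N * (N - 1) * filling_factor per bin
ff = filling_factor_1d([0.0, 10.0, 20.0], BoxSize=100.0)
```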

nbodykit.algorithms.paircount_tpcf.estimators.LandySzalayEstimator(pair_counter, data1, data2, randoms1, randoms2, R1R2=None, logger=None, **kwargs)[source]

Compute the correlation function from data/randoms using the Landy-Szalay estimator.

Parameters
Returns

D1D2, D1R2, D2R1, R1R2, CF – the various terms of the LS estimator + the correlation function result

Return type

BinnedStatistic

References

http://adsabs.harvard.edu/abs/1993ApJ...412...64L
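
A minimal numpy sketch of the estimator itself, operating on the four pair-count terms named above, each assumed already normalized by its total number of pairs. The function name `landy_szalay` is illustrative only; the library function additionally runs the pair counter and returns BinnedStatistic objects.

```python
import numpy as np

def landy_szalay(D1D2, D1R2, D2R1, R1R2):
    """Landy-Szalay correlation function estimate,
    xi = (D1D2 - D1R2 - D2R1 + R1R2) / R1R2,
    from pair counts normalized by their total pair counts."""
    D1D2, D1R2, D2R1, R1R2 = map(np.asarray, (D1D2, D1R2, D2R1, R1R2))
    xi = np.full(R1R2.shape, np.nan, dtype=float)
    ok = R1R2 > 0  # bins with no random pairs are undefined
    xi[ok] = (D1D2[ok] - D1R2[ok] - D2R1[ok] + R1R2[ok]) / R1R2[ok]
    return xi
```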

nbodykit.algorithms.paircount_tpcf.estimators.NaturalEstimator(D1D2)[source]

Internal function to compute the correlation function using analytic randoms and the so-called "natural" correlation function estimator, \(DD/RR - 1\).
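
A sketch of the estimator formula, assuming normalized DD counts and analytic RR counts (e.g. from the filling factor of uniform randoms). The function name `natural_estimator` is illustrative only.

```python
import numpy as np

def natural_estimator(DD, RR):
    """The "natural" estimator xi = DD/RR - 1, where RR is
    computed analytically for uniformly distributed randoms."""
    DD = np.asarray(DD, dtype=float)
    RR = np.asarray(RR, dtype=float)
    return DD / RR - 1.0
```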

class nbodykit.algorithms.paircount_tpcf.estimators.WedgeBinnedStatistic(dims, edges, data, fields_to_sum=[], coords=None, **kwargs)[source]

A BinnedStatistic of wedges that can be converted to multiples.

Attributes
shape

The shape of the coordinate grid

variables

Alias to return the names of the variables stored in data

Methods

average(dim, **kwargs)

Compute the average of each variable over the specified dimension.

copy([cls])

Returns a copy of the BinnedStatistic, optionally change the type to cls.

from_json(filename[, key, dims, edges])

Initialize a BinnedStatistic from a JSON file.

from_plaintext(dims, filename, **kwargs)

Initialize a BinnedStatistic from a plaintext file

reindex(dim, spacing[, weights, force, ...])

Reindex the dimension dim by averaging over multiple coordinate bins, optionally weighting by weights.

rename_variable(old_name, new_name)

Rename a variable in data from old_name to new_name.

sel([method])

Return a new BinnedStatistic indexed by coordinate values along the specified dimension(s).

squeeze([dim])

Squeeze the BinnedStatistic along the specified dimension, which removes that dimension from the BinnedStatistic.

take(*masks, **indices)

Take a subset of a BinnedStatistic from given list of indices.

to_json(filename)

Write a BinnedStatistic to a JSON file.

to_poles(poles)

Invert the measured wedges \(\xi(r,\mu)\) into correlation multipoles, \(\xi_\ell(r)\).

from_state

classmethod __construct_direct__(data, mask, **kwargs)

Shortcut around __init__ for internal use to construct and return a new class instance. The returned object should be identical to that returned by __init__.

Notes

  • Useful for returning new instances with sliced data/mask

  • The keyword arguments required to create a full, unbroken instance are dims, coords, edges, and attrs

Parameters

data

__copy_attrs__()

Return a copy of all necessary attributes associated with the BinnedStatistic. This dictionary, plus data and mask, is all that is required to reconstruct a new class instance.

__finalize__(data, mask, indices)

Finalize and return a new instance from a slice of the current object (returns a copy)

__getitem__(key)

Index- or string-based indexing

Notes

  • If a single string is passed, the key is interpreted as a variable or coordinate, and the corresponding array is returned

  • If a list of strings is passed, then a new BinnedStatistic holding only the variable names in key is returned

  • Integer-based indexing or slices similar to numpy indexing will slice data, returning a new BinnedStatistic holding the newly sliced data and coordinate grid

  • Scalar indexes (i.e., integers) used to index a certain dimension will “squeeze” that dimension, removing it from the coordinate grid

__setitem__(key, data)

Add a new variable with the name key to the class using data

__slice_edges__(indices)

Internal function to slice the edges attribute with the specified indices, which specify the included coordinate bins

average(dim, **kwargs)

Compute the average of each variable over the specified dimension.

Parameters
  • dim (str) – The name of the dimension to average over

  • **kwargs – Additional keywords to pass to BinnedStatistic.reindex(). See the documentation for BinnedStatistic.reindex() for valid keywords.

Returns

averaged – A new BinnedStatistic, with data averaged along one dimension, which reduces the number of dimensions by one

Return type

BinnedStatistic

copy(cls=None)

Returns a copy of the BinnedStatistic, optionally changing the type to cls. cls must be a subclass of BinnedStatistic.

classmethod from_json(filename, key='data', dims=None, edges=None, **kwargs)

Initialize a BinnedStatistic from a JSON file.

The JSON file should contain a dictionary, where the data to load is stored as the key entry, with an edges entry specifying the bin edges, and optionally, an attrs entry giving a dict of meta-data

Note

This uses nbodykit.utils.JSONDecoder to load the JSON file

Parameters
  • filename (str) – the name of the file to load

  • key (str, optional) – the name of the key in the JSON file holding the data to load

  • dims (list, optional) – list of names specifying the dimensions, i.e., ['k'] or ['k', 'mu']; must be supplied if not given in the JSON file

Returns

dset – the BinnedStatistic holding the data from file

Return type

BinnedStatistic
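
A minimal stdlib-json illustration of the file layout described above: the data under the key entry ("data" by default), an "edges" entry, and an optional "attrs" dict. The inner layout of each entry here (plain lists keyed by name) is an assumption for illustration; nbodykit's own files are written via nbodykit.utils.JSONEncoder, which additionally encodes numpy arrays.

```python
import json, os, tempfile

# Hypothetical minimal document with the three entries named in
# the docstring: "data", "edges", and "attrs".
doc = {
    "data": {"k": [0.05, 0.15], "power": [120.0, 80.0]},
    "edges": {"k": [0.0, 0.1, 0.2]},
    "attrs": {"BoxSize": 1000.0},
}

path = os.path.join(tempfile.mkdtemp(), "pk.json")
with open(path, "w") as ff:
    json.dump(doc, ff)

# round-trip: this is the dictionary from_json() would parse
with open(path) as ff:
    loaded = json.load(ff)
```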

classmethod from_plaintext(dims, filename, **kwargs)

Initialize a BinnedStatistic from a plaintext file

Note

Deprecated since nbodykit 0.2.x: storage of BinnedStatistic objects as plaintext ASCII files is no longer supported; see BinnedStatistic.from_json()

Parameters
  • dims (list) – list of names specifying the dimensions, i.e., ['k'] or ['k', 'mu']

  • filename (str) – the name of the file to load

Returns

dset – the BinnedStatistic holding the data from file

Return type

BinnedStatistic

reindex(dim, spacing, weights=None, force=True, return_spacing=False, fields_to_sum=[])

Reindex the dimension dim by averaging over multiple coordinate bins, optionally weighting by weights.

Returns a new BinnedStatistic holding the re-binned data.

Notes

  • We can only re-bin to an integral factor of the current dimension size, in order to avoid inaccuracies when re-binning to overlapping bins

  • Variables specified in fields_to_sum will be summed when re-indexing, instead of averaging

Parameters
  • dim (str) – The name of the dimension to average over

  • spacing (float) – The desired spacing for the re-binned data. If force = True, the spacing used will be the closest value to this value, such that the new bins are N times larger, where N is an integer

  • weights (array_like or str, optional (None)) – An array to weight the data by before re-binning, or if a string is provided, the name of a data column to use as weights

  • force (bool, optional) – If True, force the spacing to be a value such that the new bins are N times larger, where N is an integer; otherwise, raise an exception. Default is True

  • return_spacing (bool, optional) – If True, return the new spacing as the second return value. Default is False.

  • fields_to_sum (list) – the name of fields that will be summed when reindexing, instead of averaging

Returns

  • rebinned (BinnedStatistic) – A new BinnedStatistic instance, which holds the rebinned coordinate grid and data variables

  • spacing (float, optional) – If return_spacing is True, the new coordinate spacing will be returned
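
The core of the re-binning step can be sketched in plain numpy: average consecutive groups of N adjacent bins, optionally weighting, where the new bin count must divide the old one exactly (the integral-factor restriction noted above). The function name `rebin_average` is illustrative only.

```python
import numpy as np

def rebin_average(values, factor, weights=None):
    """Average consecutive groups of `factor` bins along the first
    axis; the bin count must be an integral multiple of `factor`."""
    values = np.asarray(values, dtype=float)
    n = values.shape[0]
    if n % factor != 0:
        raise ValueError("can only re-bin by an integral factor")
    v = values.reshape(n // factor, factor, *values.shape[1:])
    if weights is None:
        return v.mean(axis=1)
    w = np.asarray(weights, dtype=float).reshape(v.shape)
    return (v * w).sum(axis=1) / w.sum(axis=1)
```

Fields named in fields_to_sum would use `.sum(axis=1)` instead of the mean in the same reshape.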

rename_variable(old_name, new_name)

Rename a variable in data from old_name to new_name.

Note that this procedure is performed in-place (does not return a new BinnedStatistic)

Parameters
  • old_name (str) – the name of the old variable to rename

  • new_name (str) – the desired new variable name

Raises

ValueError – If old_name is not present in variables

sel(method=None, **indexers)

Return a new BinnedStatistic indexed by coordinate values along the specified dimension(s).

Notes

Scalar values used to index a specific dimension will result in that dimension being squeezed. To keep a dimension of unit length, use a list to index (see examples below).

Parameters
  • method ({None, 'nearest'}) – The method to use for inexact matches; if set to None, require an exact coordinate match, otherwise match the nearest coordinate

  • **indexers – the pairs of dimension name and coordinate value used to index the BinnedStatistic

Returns

sliced – a new BinnedStatistic holding the sliced data and coordinate grid

Return type

BinnedStatistic

Examples

>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu.sel(k=0.4)
<BinnedStatistic: dims: (mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu.sel(k=[0.4])
<BinnedStatistic: dims: (k: 1, mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu.sel(k=slice(0.1, 0.4), mu=0.5)
<BinnedStatistic: dims: (k: 30), variables: ('mu', 'k', 'power')>
property shape

The shape of the coordinate grid

squeeze(dim=None)

Squeeze the BinnedStatistic along the specified dimension, which removes that dimension from the BinnedStatistic.

The behavior is similar to that of numpy.squeeze().

Parameters

dim (str, optional) – The name of the dimension to squeeze. If no dimension is provided, then the one dimension with unit length will be squeezed

Returns

squeezed – a new BinnedStatistic instance, squeezed along one dimension

Return type

BinnedStatistic

Raises

ValueError – If the specified dimension does not have length one, or no dimension is specified and multiple dimensions have length one

Examples

>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 1), variables: ('mu', 'k', 'power')>
>>> pkmu.squeeze() # squeeze the mu dimension
<BinnedStatistic: dims: (k: 200), variables: ('mu', 'k', 'power')>
take(*masks, **indices)

Take a subset of a BinnedStatistic from a given list of indices. This is more powerful but more verbose than sel. The result is never squeezed, even if only a single item is selected along a dimension.

Parameters
  • masks (array_like (boolean)) – a list of masks that are of the same shape as the data.

  • indices (dict (string : array_like)) – mapping from axes (by name, dim) to items to select (list/array_like). Each item is a valid selector for numpy’s fancy indexing.

Return type

new BinnedStatistic, where only items selected by all axes are kept.

Examples

>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 5), variables: ('mu', 'k', 'power')>

# similar to pkmu.sel(k > 0.4), select the bin centers
>>> pkmu.take(k=pkmu.coords['k'] > 0.4)
<BinnedStatistic: dims: (mu: 5), variables: ('mu', 'k', 'power')>

# also similar to pkmu.sel(k > 0.4), select the bin averages
>>> pkmu.take(pkmu['k'] > 0.4)
<BinnedStatistic: dims: (k: 30), variables: ('mu', 'k', 'power')>

# impossible with sel.
>>> pkmu.take(pkmu['modes'] > 0)

to_json(filename)

Write a BinnedStatistic to a JSON file.

Note

This uses nbodykit.utils.JSONEncoder to write the JSON file

Parameters

filename (str) – the name of the file to write

to_poles(poles)[source]

Invert the measured wedges \(\xi(r,\mu)\) into correlation multipoles, \(\xi_\ell(r)\).

To select a mu_range, use

poles = self.sel(mu=slice(*mu_range), method='nearest').to_poles(poles)
Parameters

poles (array_like) – the list of multipoles to compute

Returns

xi_ell – a data set holding the \(\xi_\ell(r)\) multipoles

Return type

BinnedStatistic
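
The inversion can be sketched as a midpoint-rule Legendre projection over the mu wedges, \(\xi_\ell(r) = (2\ell+1)\sum_j \xi(r,\mu_j)\,\mathcal{L}_\ell(\mu_j)\,\Delta\mu_j\), here assuming the wedges cover \(\mu \in [0, 1]\) (symmetry in \(\mu\)). The function name `wedges_to_poles` is illustrative only, and this midpoint quadrature is a simplification of the library's binned integration.

```python
import numpy as np

def wedges_to_poles(xi, mu_edges, ells):
    """Project xi(r, mu), sampled in mu wedges, onto Legendre
    multipoles via a midpoint rule over the wedge centers."""
    xi = np.asarray(xi, dtype=float)              # shape (Nr, Nmu)
    mu_edges = np.asarray(mu_edges, dtype=float)  # shape (Nmu + 1,)
    mu_mid = 0.5 * (mu_edges[1:] + mu_edges[:-1])
    dmu = np.diff(mu_edges)
    poles = {}
    for ell in ells:
        # Legendre polynomial L_ell evaluated at the wedge centers
        L = np.polynomial.legendre.Legendre.basis(ell)(mu_mid)
        poles[ell] = (2 * ell + 1) * np.sum(xi * L * dmu, axis=-1)
    return poles
```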

property variables

Alias to return the names of the variables stored in data