nbodykit.algorithms.paircount_tpcf.estimators¶
Functions

LandySzalayEstimator – Compute the correlation function from data/randoms using the Landy-Szalay estimator.

NaturalEstimator – Internal function for computing the correlation function using analytic randoms and the so-called "natural" correlation function estimator, \(DD/RR - 1\).

Classes

AnalyticUniformRandoms – Internal class to compute analytic pair counts for uniformly distributed randoms.

WedgeBinnedStatistic – A BinnedStatistic of wedges that can be converted to multipoles.
- class nbodykit.algorithms.paircount_tpcf.estimators.AnalyticUniformRandoms(mode, dims, edges, BoxSize)[source]¶
Internal class to compute analytic pair counts for uniformly distributed randoms.
This can compute the expected randoms counts for several coordinate choices, based on mode. The relevant volume/area calculations for the various coordinate modes are:

mode='1d': volume of sphere
mode='2d': volume of spherical sector
mode='projected': volume of cylinder
mode='angular': area of spherical cap
Methods
__call__(NR1[, NR2]) – Evaluate the expected randoms pair counts and the total_weight, returned as an object that looks like the result of paircount.
This gives the ratio of the volume (or area) occupied by each bin to the global volume (or area).
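The mode='1d' case can be sketched directly: for NR uniformly distributed randoms in a periodic box, the expected pair count in each separation bin is the total number of pairs times the ratio of the bin's shell volume to the box volume. A minimal sketch of that arithmetic (the function name and signature below are illustrative, not the nbodykit API):

```python
import numpy as np

def analytic_RR_1d(edges, NR, BoxSize):
    # For NR uniformly distributed randoms in a periodic box of side
    # BoxSize, the expected pair count in each separation bin is the
    # total number of ordered pairs, NR*(NR - 1), times the ratio of
    # the bin's spherical-shell volume to the box volume.
    V_box = BoxSize**3
    V_shell = 4.0 / 3.0 * np.pi * np.diff(np.asarray(edges)**3)
    return NR * (NR - 1) * V_shell / V_box

# e.g. bins of width 10 out to r = 50, in a box of side 1000
RR = analytic_RR_1d(np.arange(0., 60., 10.), NR=10000, BoxSize=1000.)
```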
- nbodykit.algorithms.paircount_tpcf.estimators.LandySzalayEstimator(pair_counter, data1, data2, randoms1, randoms2, R1R2=None, logger=None, **kwargs)[source]¶
Compute the correlation function from data/randoms using the Landy-Szalay estimator.
- Parameters
pair_counter (SimulationBoxPairCount, SurveyDataPairCount) – the pair counting algorithm class
data1 (CatalogSource) – the first data source
data2 (CatalogSource, None) – the second data catalog to cross-correlate; can be None for auto-correlations
randoms1 (CatalogSource) – the randoms catalog corresponding to data1
randoms2 (CatalogSource, None) – the second randoms catalog; can be None for auto-correlations
R1R2 (SimulationBoxPairCount, SurveyDataPairCount, optional) – if provided, random pairs R1R2 are not recalculated
**kwargs – the parameters passed to the pair_counter class to count pairs
- Returns
D1D2, D1R2, D2R1, R1R2, CF – the various terms of the LS estimator + the correlation function result
- Return type
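The Landy-Szalay combination of the four pair counts can be sketched as follows. The helper below is illustrative only (the real estimator operates on nbodykit pair-count result objects); it assumes raw counts and the auto-correlation normalizations ND(ND-1), ND·NR, and NR(NR-1):

```python
import numpy as np

def landy_szalay(D1D2, D1R2, D2R1, R1R2, ND, NR):
    # Normalize each raw pair count by its total number of pairs
    # (auto-correlation normalizations: ND*(ND-1) for data-data,
    # ND*NR for data-random, NR*(NR-1) for random-random) ...
    dd = D1D2 / (ND * (ND - 1.0))
    dr = D1R2 / (ND * NR)
    rd = D2R1 / (ND * NR)
    rr = R1R2 / (NR * (NR - 1.0))
    # ... then combine into the Landy-Szalay estimator
    return (dd - dr - rd + rr) / rr
```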
- nbodykit.algorithms.paircount_tpcf.estimators.NaturalEstimator(D1D2)[source]¶
Internal function for computing the correlation function using analytic randoms and the so-called "natural" correlation function estimator, \(DD/RR - 1\).
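Once the analytic RR counts are in hand, the natural estimator reduces to a simple ratio. This sketch (illustrative names, not the nbodykit API) shows the arithmetic:

```python
import numpy as np

def natural_estimator(DD, RR):
    # The "natural" estimator: the ratio of measured data-data counts
    # to the analytic random-random expectation, minus one.
    return np.asarray(DD, dtype=float) / np.asarray(RR, dtype=float) - 1.0
```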
- class nbodykit.algorithms.paircount_tpcf.estimators.WedgeBinnedStatistic(dims, edges, data, fields_to_sum=[], coords=None, **kwargs)[source]¶
A BinnedStatistic of wedges that can be converted to multipoles.
Methods
average(dim, **kwargs) – Compute the average of each variable over the specified dimension.
copy([cls]) – Returns a copy of the BinnedStatistic, optionally changing the type to cls.
from_json(filename[, key, dims, edges]) – Initialize a BinnedStatistic from a JSON file.
from_plaintext(dims, filename, **kwargs) – Initialize a BinnedStatistic from a plaintext file.
reindex(dim, spacing[, weights, force, ...]) – Reindex the dimension dim by averaging over multiple coordinate bins, optionally weighting by weights.
rename_variable(old_name, new_name) – Rename a variable in data from old_name to new_name.
sel([method]) – Return a new BinnedStatistic indexed by coordinate values along the specified dimension(s).
squeeze([dim]) – Squeeze the BinnedStatistic along the specified dimension, which removes that dimension from the BinnedStatistic.
take(*masks, **indices) – Take a subset of a BinnedStatistic from a given list of indices.
to_json(filename) – Write a BinnedStatistic to a JSON file.
to_poles(poles) – Invert the measured wedges \(\xi(r,\mu)\) into correlation multipoles, \(\xi_\ell(r)\).
from_state
- classmethod __construct_direct__(data, mask, **kwargs)¶
Shortcut around __init__ for internal use to construct and return a new class instance. The returned object should be identical to that returned by __init__.
Notes
Useful for returning new instances with sliced data/mask
The keyword arguments required to create a full, unbroken instance are dims, coords, edges, and attrs
- Parameters
data –
- __copy_attrs__()¶
Return a copy of all necessary attributes associated with the BinnedStatistic. This dictionary + data and mask are all that’s required to reconstruct a new class
- __finalize__(data, mask, indices)¶
Finalize and return a new instance from a slice of the current object (returns a copy)
- __getitem__(key)¶
Index- or string-based indexing
Notes
If a single string is passed, the key is interpreted as a variable or coordinate, and the corresponding array is returned
If a list of strings is passed, then a new BinnedStatistic holding only the variable names in key is returned
Integer-based indexing or slices similar to numpy indexing will slice data, returning a new BinnedStatistic holding the newly sliced data and coordinate grid
Scalar indexes (i.e., integers) used to index a certain dimension will “squeeze” that dimension, removing it from the coordinate grid
- __setitem__(key, data)¶
Add a new variable with the name key to the class using data
- __slice_edges__(indices)¶
Internal function to slice the edges attribute with the specified indices, which specify the included coordinate bins
- average(dim, **kwargs)¶
Compute the average of each variable over the specified dimension.
- Parameters
dim (str) – The name of the dimension to average over
**kwargs – Additional keywords to pass to BinnedStatistic.reindex(). See the documentation for BinnedStatistic.reindex() for valid keywords.
- Returns
averaged – A new BinnedStatistic, with data averaged along one dimension, which reduces the number of dimensions by one
- Return type
- copy(cls=None)¶
Returns a copy of the BinnedStatistic, optionally changing the type to cls. cls must be a subclass of BinnedStatistic.
- classmethod from_json(filename, key='data', dims=None, edges=None, **kwargs)¶
Initialize a BinnedStatistic from a JSON file.
The JSON file should contain a dictionary, where the data to load is stored as the key entry, with an edges entry specifying the bin edges, and optionally, an attrs entry giving a dict of meta-data.

Note

This uses nbodykit.utils.JSONDecoder to load the JSON file.

- Parameters
- Returns
dset – the BinnedStatistic holding the data from file
- Return type
- classmethod from_plaintext(dims, filename, **kwargs)¶
Initialize a BinnedStatistic from a plaintext file
Note
Deprecated in nbodykit 0.2.x: storage of BinnedStatistic objects as plaintext ASCII files is no longer supported; see BinnedStatistic.from_json()
- Parameters
dims (list) – list of names specifying the dimensions, i.e., ['k'] or ['k', 'mu']
filename (str) – the name of the file to load
- Returns
dset – the BinnedStatistic holding the data from file
- Return type
- reindex(dim, spacing, weights=None, force=True, return_spacing=False, fields_to_sum=[])¶
Reindex the dimension dim by averaging over multiple coordinate bins, optionally weighting by weights.

Returns a new BinnedStatistic holding the re-binned data.
Notes
We can only re-bin to an integral factor of the current dimension size, in order to avoid inaccuracies when re-binning to overlapping bins
Variables specified in fields_to_sum will be summed when re-indexing, instead of averaging
- Parameters
dim (str) – The name of the dimension to average over
spacing (float) – The desired spacing for the re-binned data. If force = True, the spacing used will be the closest value to this value, such that the new bins are N times larger, where N is an integer
weights (array_like or str, optional (None)) – An array to weight the data by before re-binning, or if a string is provided, the name of a data column to use as weights
force (bool, optional) – If True, force the spacing to be a value such that the new bins are N times larger, where N is an integer; otherwise, raise an exception. Default is True
return_spacing (bool, optional) – If True, return the new spacing as the second return value. Default is False.
fields_to_sum (list) – the name of fields that will be summed when reindexing, instead of averaging
- Returns
rebinned (BinnedStatistic) – A new BinnedStatistic instance, which holds the rebinned coordinate grid and data variables
spacing (float, optional) – If return_spacing is True, the new coordinate spacing will be returned
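The integral-factor restriction above can be illustrated with a standalone sketch of the re-binning arithmetic. The helper below is hypothetical and operates on a plain 1D array rather than a BinnedStatistic:

```python
import numpy as np

def rebin_average(values, factor, weights=None):
    # Re-bin a 1D array by an integer factor, averaging groups of
    # `factor` consecutive bins; the factor must divide the current
    # size exactly, mirroring the integral-factor restriction above.
    values = np.asarray(values, dtype=float)
    if len(values) % factor != 0:
        raise ValueError("can only re-bin by an integral factor of the size")
    v = values.reshape(-1, factor)
    if weights is None:
        return v.mean(axis=1)
    # optionally weight each bin before averaging
    w = np.asarray(weights, dtype=float).reshape(-1, factor)
    return (v * w).sum(axis=1) / w.sum(axis=1)
```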
- rename_variable(old_name, new_name)¶
Rename a variable in data from old_name to new_name.

Note that this procedure is performed in-place (it does not return a new BinnedStatistic)
- sel(method=None, **indexers)¶
Return a new BinnedStatistic indexed by coordinate values along the specified dimension(s).
Notes
Scalar values used to index a specific dimension will result in that dimension being squeezed. To keep a dimension of unit length, use a list to index (see examples below).
- Parameters
method ({None, 'nearest'}) – The method to use for inexact matches; if set to None, require an exact coordinate match, otherwise match the nearest coordinate
**indexers – the pairs of dimension name and coordinate value used to index the BinnedStatistic
- Returns
sliced – a new BinnedStatistic holding the sliced data and coordinate grid
- Return type
Examples
>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu.sel(k=0.4)
<BinnedStatistic: dims: (mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu.sel(k=[0.4])
<BinnedStatistic: dims: (k: 1, mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu.sel(k=slice(0.1, 0.4), mu=0.5)
<BinnedStatistic: dims: (k: 30), variables: ('mu', 'k', 'power')>
- property shape¶
The shape of the coordinate grid
- squeeze(dim=None)¶
Squeeze the BinnedStatistic along the specified dimension, which removes that dimension from the BinnedStatistic.
The behavior is similar to that of numpy.squeeze().
- Parameters
dim (str, optional) – The name of the dimension to squeeze. If no dimension is provided, then the one dimension with unit length will be squeezed
- Returns
squeezed – a new BinnedStatistic instance, squeezed along one dimension
- Return type
- Raises
ValueError – If the specified dimension does not have length one, or no dimension is specified and multiple dimensions have length one
Examples
>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 1), variables: ('mu', 'k', 'power')>
>>> pkmu.squeeze()  # squeeze the mu dimension
<BinnedStatistic: dims: (k: 200), variables: ('mu', 'k', 'power')>
- take(*masks, **indices)¶
Take a subset of a BinnedStatistic from a given list of indices. This is more powerful but more verbose than sel. Also, the result is never squeezed, even if only a single item along the direction is used.
- Parameters
masks (array_like (boolean)) – a list of masks that are of the same shape as the data.
indices (dict (string : array_like)) – mapping from axes (by name, dim) to items to select (list/array_like). Each item is a valid selector for numpy’s fancy indexing.
- Return type
new BinnedStatistic, where only items selected by all axes are kept.
Examples
>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 5), variables: ('mu', 'k', 'power')>

# similar to pkmu.sel(k > 0.4), select the bin centers
>>> pkmu.take(k=pkmu.coords['k'] > 0.4)
<BinnedStatistic: dims: (mu: 5), variables: ('mu', 'k', 'power')>

# also similar to pkmu.sel(k > 0.4), select the bin averages
>>> pkmu.take(pkmu['k'] > 0.4)
<BinnedStatistic: dims: (k: 30), variables: ('mu', 'k', 'power')>

# impossible with sel.
>>> pkmu.take(pkmu['modes'] > 0)
- to_json(filename)¶
Write a BinnedStatistic to a JSON file.
Note

This uses nbodykit.utils.JSONEncoder to write the JSON file.

- Parameters
filename (str) – the name of the file to write
- to_poles(poles)[source]¶
Invert the measured wedges \(\xi(r,\mu)\) into correlation multipoles, \(\xi_\ell(r)\).

To select a mu range, use poles = self.sel(mu=slice(*mu_range), method='nearest').to_poles(poles)
- Parameters
poles (array_like) – the list of multipoles to compute
- Returns
xi_ell – a data set holding the \(\xi_\ell(r)\) multipoles
- Return type
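The wedge-to-multipole inversion can be sketched as a weighted sum over mu bins using Legendre polynomials. This hypothetical helper works on a plain array rather than a WedgeBinnedStatistic, and assumes even multipoles with mu restricted to [0, 1]:

```python
import numpy as np

def wedges_to_poles(xi_rmu, mu_edges, ells):
    # xi_ell(r) = (2*ell + 1) * sum_j xi(r, mu_j) * P_ell(mu_j) * dmu_j,
    # a midpoint-rule version of the Legendre projection over mu in [0, 1]
    # (valid for even ell, assuming symmetry in mu).
    mu_edges = np.asarray(mu_edges)
    mu_mid = 0.5 * (mu_edges[1:] + mu_edges[:-1])
    dmu = np.diff(mu_edges)
    poles = []
    for ell in ells:
        # coefficient vector selecting the single Legendre polynomial P_ell
        coeffs = np.zeros(ell + 1)
        coeffs[ell] = 1.0
        P_ell = np.polynomial.legendre.legval(mu_mid, coeffs)
        poles.append((2 * ell + 1) * np.sum(xi_rmu * (P_ell * dmu), axis=-1))
    return np.array(poles)
```

For a constant \(\xi(r,\mu)\), only the monopole survives, which makes a quick sanity check easy.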
- property variables¶
Alias to return the names of the variables stored in data