Functions
| bin_ndarray(ndarray, new_shape[, weights, …]) | Bins an ndarray in all axes based on the target shape, by summing or averaging. |
Classes
| BinnedStatistic(dims, edges, data[, …]) | Lightweight class to hold statistics binned at fixed coordinates. |
nbodykit.binned_statistic.BinnedStatistic(dims, edges, data, fields_to_sum=[], **kwargs)
Lightweight class to hold statistics binned at fixed coordinates.
For example, this class could hold a grid of (r, mu) or (k, mu) bins for a correlation function or power spectrum measurement.
It is modeled after the syntax of xarray.Dataset and is designed to hold correlation function or power spectrum results (in 1D or 2D).
| Parameters: |
|
|---|
Examples
The following example shows how to read a power spectrum measurement from a JSON file, as output by nbodykit, assuming the JSON file holds a dictionary with a 'power' entry holding the relevant data:
>>> filename = 'test_data.json'
>>> pk = BinnedStatistic.from_json(filename, 'power', dims=['k'])
In older versions of nbodykit, results were written to plaintext ASCII files. Although now deprecated, files of this type can be read using:
>>> filename = 'test_data.dat'
>>> dset = BinnedStatistic.from_plaintext(['k'], filename)
Data variables can be accessed in a dict-like fashion:
>>> power = pkmu['power'] # returns power data variable
Array-like indexing of a BinnedStatistic returns a new BinnedStatistic
holding the sliced data:
>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu[:,0] # select first mu column
<BinnedStatistic: dims: (k: 200), variables: ('mu', 'k', 'power')>
Additional data variables can be added to the BinnedStatistic via:
>>> modes = numpy.ones((200, 5))
>>> pkmu['modes'] = modes
Coordinate-based indexing is possible through sel():
>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu.sel(k=slice(0.1, 0.4), mu=0.5)
<BinnedStatistic: dims: (k: 30), variables: ('mu', 'k', 'power')>
squeeze() removes the specified dimension (which must have length one), so that the resulting instance has one fewer dimension:
>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 1), variables: ('mu', 'k', 'power')>
>>> pkmu.squeeze(dim='mu') # can also just call pkmu.squeeze()
<BinnedStatistic: dims: (k: 200), variables: ('mu', 'k', 'power')>
average() returns a new BinnedStatistic holding the data averaged over one dimension.
reindex() re-bins the coordinate arrays along the specified dimension.
Attributes
| shape | The shape of the coordinate grid |
|---|---|
| variables | Alias to return the names of the variables stored in data |
Methods
| average(dim, **kwargs) | Compute the average of each variable over the specified dimension. |
|---|---|
| copy() | Returns a copy of the BinnedStatistic. |
| from_json(filename[, key, dims, edges]) | Initialize a BinnedStatistic from a JSON file. |
| from_plaintext(dims, filename, **kwargs) | Initialize a BinnedStatistic from a plaintext file. |
| reindex(dim, spacing[, weights, force, …]) | Reindex the dimension dim by averaging over multiple coordinate bins, optionally weighting by weights. |
| rename_variable(old_name, new_name) | Rename a variable in data from old_name to new_name. |
| sel([method]) | Return a new BinnedStatistic indexed by coordinate values along the specified dimension(s). |
| squeeze([dim]) | Squeeze the BinnedStatistic along the specified dimension, which removes that dimension from the BinnedStatistic. |
| to_json(filename) | Write a BinnedStatistic to a JSON file. |
__construct_direct__(data, mask, **kwargs)
Shortcut around __init__ for internal use to construct and return a new class instance. The returned object should be identical to that returned by __init__.
Notes
| Parameters: | data – |
|---|
__copy_attrs__()
Return a copy of all necessary attributes associated with the BinnedStatistic. This dictionary, together with data and mask, is all that is required to reconstruct a new instance.
__finalize__(data, mask, indices)
Finalize and return a new instance from a slice of the current object (returns a copy).
__getitem__(key)
Index- or string-based indexing.
Notes
__slice_edges__(indices)
Internal function to slice the edges attribute with the specified indices, which specify the included coordinate bins.
average(dim, **kwargs)
Compute the average of each variable over the specified dimension.
| Parameters: |
|
|---|---|
| Returns: | averaged – A new BinnedStatistic, with data averaged along one dimension, which reduces the number of dimensions by one |
| Return type: |
from_json(filename, key='data', dims=None, edges=None, **kwargs)
Initialize a BinnedStatistic from a JSON file.
The JSON file should contain a dictionary, where the data to load is stored as the key entry, with an edges entry specifying bin edges, and optionally, an attrs entry giving a dict of meta-data.
Note
This uses nbodykit.utils.JSONDecoder to load the JSON file.
| Parameters: | |
|---|---|
| Returns: | dset – the BinnedStatistic holding the data from file |
| Return type: |
from_plaintext(dims, filename, **kwargs)
Initialize a BinnedStatistic from a plaintext file.
Note
Deprecated in nbodykit 0.2.x.
Storage of BinnedStatistic objects as plaintext ASCII files is no longer supported; see BinnedStatistic.from_json().
| Parameters: |
|
|---|---|
| Returns: | dset – the BinnedStatistic holding the data from file |
| Return type: |
reindex(dim, spacing, weights=None, force=True, return_spacing=False, fields_to_sum=[])
Reindex the dimension dim by averaging over multiple coordinate bins, optionally weighting by weights.
Returns a new BinnedStatistic holding the re-binned data.
Notes
| Parameters: |
|
|---|---|
| Returns: |
|
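As an illustration of what "averaging over multiple coordinate bins, optionally weighting by weights" can mean, the following NumPy-only sketch merges neighbouring bins of a 1D measurement. It does not call reindex() and is not taken from nbodykit's implementation; the array names (`edges`, `power`, `modes`) and the merge factor are assumptions chosen for the example. The summed `modes` column shows one plausible reading of fields_to_sum, namely that such fields are added rather than averaged.

```python
import numpy

# Hypothetical 1D measurement: 8 bins of width 0.1 (values are illustrative).
edges = numpy.linspace(0.0, 0.8, 9)
power = numpy.arange(8, dtype=float)
modes = numpy.array([1.0, 3.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0])

factor = 2  # merge every two neighbouring bins, i.e. new spacing 0.2

# Keep every `factor`-th edge so the new bins span the same range.
new_edges = edges[::factor]

# Group neighbouring bins into rows of length `factor`.
grouped_power = power.reshape(-1, factor)
grouped_modes = modes.reshape(-1, factor)

# Weighted average of `power` within each merged bin, weighting by `modes`.
rebinned_power = (grouped_power * grouped_modes).sum(axis=1) / grouped_modes.sum(axis=1)

# A count-like field is summed instead of averaged.
rebinned_modes = grouped_modes.sum(axis=1)

print(new_edges)
print(rebinned_power)
print(rebinned_modes)
```

This simple reshape only works when the number of bins is an exact multiple of the merge factor; how reindex() treats other spacings (see the force parameter) is not covered by the sketch.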
rename_variable(old_name, new_name)
Rename a variable in data from old_name to new_name.
Note that this procedure is performed in-place (it does not return a new BinnedStatistic).
| Parameters: | |
|---|---|
| Raises: |
|
sel(method=None, **indexers)
Return a new BinnedStatistic indexed by coordinate values along the specified dimension(s).
Notes
Scalar values used to index a specific dimension will result in that dimension being squeezed. To keep a dimension of unit length, use a list to index (see examples below).
| Parameters: |
|
|---|---|
| Returns: | sliced – a new BinnedStatistic holding the sliced data and coordinate grid |
| Return type: |
Examples
>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu.sel(k=0.4)
<BinnedStatistic: dims: (mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu.sel(k=[0.4])
<BinnedStatistic: dims: (k: 1, mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu.sel(k=slice(0.1, 0.4), mu=0.5)
<BinnedStatistic: dims: (k: 30), variables: ('mu', 'k', 'power')>
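The scalar-versus-list rule described in the Notes mirrors NumPy's own integer indexing, which may help as a mental model. The sketch below uses a plain array as a stand-in for a (k, mu) grid; it illustrates the analogy only and does not call sel(), which selects by coordinate value rather than by integer position.

```python
import numpy

# Plain array standing in for a (k: 200, mu: 5) grid; the contents are irrelevant.
grid = numpy.zeros((200, 5))

# A scalar index removes the indexed axis ...
scalar_indexed = grid[3]

# ... while a one-element list keeps it as an axis of length one.
list_indexed = grid[[3]]

print(scalar_indexed.shape)
print(list_indexed.shape)
```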
shape
The shape of the coordinate grid.
squeeze(dim=None)
Squeeze the BinnedStatistic along the specified dimension, which removes that dimension from the BinnedStatistic.
The behavior is similar to that of numpy.squeeze().
| Parameters: | dim (str, optional) – The name of the dimension to squeeze. If no dimension is provided, then the one dimension with unit length will be squeezed |
|---|---|
| Returns: | squeezed – a new BinnedStatistic instance, squeezed along one dimension |
| Return type: | BinnedStatistic |
| Raises: | ValueError – If the specified dimension does not have length one, or no dimension is specified and multiple dimensions have length one |
Examples
>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 1), variables: ('mu', 'k', 'power')>
>>> pkmu.squeeze() # squeeze the mu dimension
<BinnedStatistic: dims: (k: 200), variables: ('mu', 'k', 'power')>
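Since the behavior is described as similar to numpy.squeeze(), a short NumPy example may make the rules concrete. This is a sketch on plain arrays, not on a BinnedStatistic: it shows a length-one axis being removed and the ValueError that NumPy raises when the requested axis is longer than one.

```python
import numpy

# Stand-in for a (k: 200, mu: 1) grid.
grid = numpy.zeros((200, 1))

# Removing the length-one axis leaves a 1D array.
squeezed = numpy.squeeze(grid, axis=1)
print(squeezed.shape)

# Asking NumPy to squeeze an axis whose length is not one raises ValueError.
try:
    numpy.squeeze(numpy.zeros((200, 5)), axis=1)
    raised = False
except ValueError:
    raised = True
print(raised)
```

One difference is worth keeping in mind: numpy.squeeze() called without an axis removes every length-one axis, whereas the Raises entry above states that BinnedStatistic.squeeze() raises ValueError when no dimension is given and more than one has length one.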
to_json(filename)
Write a BinnedStatistic to a JSON file.
Note
This uses nbodykit.utils.JSONEncoder to write the JSON file.
| Parameters: | filename (str) – the name of the file to write |
|---|
variables
Alias to return the names of the variables stored in data.
nbodykit.binned_statistic.bin_ndarray(ndarray, new_shape, weights=None, operation=<function mean>)
Bins an ndarray in all axes based on the target shape, by summing or averaging.
| Parameters: |
|
|---|
Notes
Examples
>>> m = numpy.arange(0,100,1).reshape((10,10))
>>> n = bin_ndarray(m, new_shape=(5,5), operation=numpy.sum)
>>> print(n)
[[ 22  30  38  46  54]
 [102 110 118 126 134]
 [182 190 198 206 214]
 [262 270 278 286 294]
 [342 350 358 366 374]]
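For readers curious how this kind of binning can be done with plain NumPy, a common approach is to reshape each axis into a (new length, factor) pair and reduce over the factor axes. The helper below, `bin_by_reshape`, is a hypothetical name introduced for this sketch and is not claimed to be how bin_ndarray is implemented; it also ignores the weights argument. It reproduces the numbers in the example above.

```python
import numpy

def bin_by_reshape(arr, new_shape, operation=numpy.sum):
    """Hypothetical helper: bin `arr` down to `new_shape` by reshaping each
    axis into (new_length, factor) and reducing over the factor axes."""
    if arr.ndim != len(new_shape):
        raise ValueError("new_shape must have one entry per axis of arr")
    pairs = []
    for old, new in zip(arr.shape, new_shape):
        if old % new != 0:
            raise ValueError("each new length must evenly divide the old one")
        pairs.extend([new, old // new])
    reshaped = arr.reshape(pairs)
    # After the reshape, the factor axes sit at the odd positions 1, 3, 5, ...
    factor_axes = tuple(range(1, 2 * arr.ndim, 2))
    return operation(reshaped, axis=factor_axes)

m = numpy.arange(0, 100, 1).reshape((10, 10))
n = bin_by_reshape(m, (5, 5), operation=numpy.sum)
print(n)
```

Passing `operation=numpy.mean` instead averages each 2x2 block rather than summing it.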