nbodykit.binned_statistic.BinnedStatistic(dims, edges, data, fields_to_sum=[], **kwargs)
Bases: object
Lightweight class to hold statistics binned at fixed coordinates.
For example, this class could hold a grid of (r, mu) or (k, mu) bins for a correlation function or power spectrum measurement.
It is modeled after the syntax of xarray.Dataset, and is designed to hold correlation function or power spectrum results (in 1D or 2D).
Parameters: |
|
---|
Examples
The following example shows how to read a power spectrum measurement from a JSON file, as output by nbodykit, assuming the JSON file holds a dictionary with a ‘power’ entry holding the relevant data:
>>> filename = 'test_data.json'
>>> pk = BinnedStatistic.from_json(filename, key='power', dims=['k'])
In older versions of nbodykit, results were written to plaintext ASCII files. Although now deprecated, files of this type can be read using:
>>> filename = 'test_data.dat'
>>> dset = BinnedStatistic.from_plaintext(['k'], filename)
Data variables can be accessed in a dict-like fashion:
>>> power = pkmu['power'] # returns power data variable
Array-like indexing of a BinnedStatistic returns a new BinnedStatistic holding the sliced data:
>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu[:,0] # select first mu column
<BinnedStatistic: dims: (k: 200), variables: ('mu', 'k', 'power')>
Additional data variables can be added to the BinnedStatistic via:
>>> modes = numpy.ones((200, 5))
>>> pkmu['modes'] = modes
Coordinate-based indexing is possible through sel():
>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu.sel(k=slice(0.1, 0.4), mu=0.5)
<BinnedStatistic: dims: (k: 30), variables: ('mu', 'k', 'power')>
squeeze() will explicitly squeeze the specified dimension (of length one) such that the resulting instance has one less dimension:
>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 1), variables: ('mu', 'k', 'power')>
>>> pkmu.squeeze(dim='mu') # can also just call pkmu.squeeze()
<BinnedStatistic: dims: (k: 200), variables: ('mu', 'k', 'power')>
average() returns a new BinnedStatistic holding the data averaged over one dimension.
reindex() will re-bin the coordinate arrays along the specified dimension.
Attributes
shape | The shape of the coordinate grid |
variables | Alias to return the names of the variables stored in data |
Methods
average(dim, **kwargs) | Compute the average of each variable over the specified dimension. |
copy() | Returns a copy of the BinnedStatistic. |
from_json(filename[, key, dims, edges]) | Initialize a BinnedStatistic from a JSON file. |
from_plaintext(dims, filename, **kwargs) | Initialize a BinnedStatistic from a plaintext file. |
reindex(dim, spacing[, weights, force, …]) | Reindex the dimension dim by averaging over multiple coordinate bins, optionally weighting by weights. |
rename_variable(old_name, new_name) | Rename a variable in data from old_name to new_name. |
sel([method]) | Return a new BinnedStatistic indexed by coordinate values along the specified dimension(s). |
squeeze([dim]) | Squeeze the BinnedStatistic along the specified dimension, which removes that dimension from the BinnedStatistic. |
to_json(filename) | Write a BinnedStatistic to a JSON file. |
__construct_direct__(data, mask, **kwargs)
Shortcut around __init__ for internal use to construct and return a new class instance. The returned object should be identical to that returned by __init__.
Notes
Parameters: | data – |
---|
__copy_attrs__()
Return a copy of all necessary attributes associated with the BinnedStatistic. This dictionary, together with data and mask, is all that’s required to reconstruct a new class instance.
__finalize__(data, mask, indices)
Finalize and return a new instance from a slice of the current object (returns a copy).
__getitem__(key)
Index- or string-based indexing.
Notes
__slice_edges__(indices)
Internal function to slice the edges attribute with the specified indices, which specify the included coordinate bins.
average(dim, **kwargs)
Compute the average of each variable over the specified dimension.
Parameters: |
|
---|---|
Returns: | averaged – A new BinnedStatistic, with data averaged along one dimension, which reduces the number of dimensions by one |
Return type: |
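As a rough illustration of what averaging over one dimension means for a (k, mu) grid, the NumPy sketch below collapses the mu axis, first unweighted and then weighted. It is not the library's implementation: the array names, the sample values, and the choice of a per-bin mode count as weights are assumptions made for this example.

```python
import numpy

# Hypothetical (k, mu) grid with 4 k-bins and 2 mu-bins.
power = numpy.array([[1.0, 3.0],
                     [2.0, 6.0],
                     [4.0, 4.0],
                     [0.0, 8.0]])

# Hypothetical per-bin weights (for instance, the number of modes per bin).
modes = numpy.array([[1.0, 3.0],
                     [1.0, 1.0],
                     [2.0, 2.0],
                     [3.0, 1.0]])

# Unweighted average over the mu axis: the result has one less dimension.
mean_over_mu = power.mean(axis=1)

# Weighted average over the mu axis, weighting each bin by its mode count.
weighted_over_mu = numpy.average(power, axis=1, weights=modes)
```

Both results have shape (4,), matching the statement that averaging reduces the number of dimensions by one.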
from_json(filename, key='data', dims=None, edges=None, **kwargs)
Initialize a BinnedStatistic from a JSON file.
The JSON file should contain a dictionary, where the data to load is stored as the key entry, with an edges entry specifying bin edges, and optionally, an attrs entry giving a dict of meta-data.
Note
This uses nbodykit.utils.JSONDecoder to load the JSON file.
Parameters: | |
---|---|
Returns: | dset – the BinnedStatistic holding the data from file |
Return type: |
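The dictionary layout described for from_json can be pictured with the standard json module alone. The sketch below is only an illustration of that layout: the key name 'power', the field names, the sample values, and the storage of arrays as plain lists are all assumptions. Files written by nbodykit go through nbodykit.utils.JSONEncoder, whose on-disk encoding of arrays may differ.

```python
import json
import os
import tempfile

# Hypothetical payload: the data to load lives under the chosen key ('power'),
# 'edges' gives the bin edges (assumed here: one list per dimension),
# and 'attrs' is an optional dict of meta-data.
payload = {
    "power": {"k": [0.05, 0.15, 0.25], "power": [1200.0, 800.0, 450.0]},
    "edges": [[0.0, 0.1, 0.2, 0.3]],
    "attrs": {"volume": 1.0e9},
}

# Round-trip the dictionary through a temporary file.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "test_data.json")
    with open(path, "w") as f:
        json.dump(payload, f)
    with open(path) as f:
        loaded = json.load(f)

# A dimension with N bins is described by N + 1 edges.
n_bins = len(loaded["power"]["k"])
n_edges = len(loaded["edges"][0])
```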
from_plaintext(dims, filename, **kwargs)
Initialize a BinnedStatistic from a plaintext file.
Note
Deprecated in nbodykit 0.2.x
Storage of BinnedStatistic objects as plaintext ASCII files is no longer supported; see BinnedStatistic.from_json().
Parameters: |
|
---|---|
Returns: | dset – the BinnedStatistic holding the data from file |
Return type: |
reindex(dim, spacing, weights=None, force=True, return_spacing=False, fields_to_sum=[])
Reindex the dimension dim by averaging over multiple coordinate bins, optionally weighting by weights.
Returns a new BinnedStatistic holding the re-binned data.
Notes
Parameters: |
|
---|---|
Returns: |
|
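The idea of re-binning by averaging over multiple coordinate bins, optionally with weights, can be sketched in NumPy as follows. This is a simplified illustration rather than the library's implementation: the helper name rebin_axis0 and its integer factor argument are hypothetical, whereas reindex() itself takes a target spacing along the named dimension.

```python
import numpy

def rebin_axis0(values, weights, factor):
    """Average groups of `factor` adjacent bins along axis 0, weighted by `weights`."""
    n = values.shape[0]
    if n % factor != 0:
        raise ValueError("number of bins must be divisible by factor")
    # Split axis 0 into (number of new bins, bins merged per new bin).
    new_shape = (n // factor, factor) + values.shape[1:]
    v = values.reshape(new_shape)
    w = weights.reshape(new_shape)
    # Weighted mean within each group of merged bins.
    return (v * w).sum(axis=1) / w.sum(axis=1)

# Hypothetical 1D measurement with 4 bins and per-bin weights.
power = numpy.array([1.0, 3.0, 10.0, 20.0])
modes = numpy.array([1.0, 3.0, 1.0, 1.0])

# Merge pairs of adjacent bins: 4 bins become 2.
coarse = rebin_axis0(power, modes, factor=2)
```

With equal weights the same helper reduces to a plain average of adjacent bins.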
rename_variable(old_name, new_name)
Rename a variable in data from old_name to new_name.
Note that this procedure is performed in-place (does not return a new BinnedStatistic).
Parameters: | |
---|---|
Raises: |
|
sel(method=None, **indexers)
Return a new BinnedStatistic indexed by coordinate values along the specified dimension(s).
Notes
Scalar values used to index a specific dimension will result in that dimension being squeezed. To keep a dimension of unit length, use a list to index (see examples below).
Parameters: |
|
---|---|
Returns: | sliced – a new BinnedStatistic holding the sliced data and coordinate grid |
Return type: |
Examples
>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu.sel(k=0.4)
<BinnedStatistic: dims: (mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu.sel(k=[0.4])
<BinnedStatistic: dims: (k: 1, mu: 5), variables: ('mu', 'k', 'power')>
>>> pkmu.sel(k=slice(0.1, 0.4), mu=0.5)
<BinnedStatistic: dims: (k: 30), variables: ('mu', 'k', 'power')>
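The scalar-versus-list rule noted for sel() resembles how plain NumPy indexing treats integer positions. The sketch below shows only that analogy on an ordinary array standing in for a (k: 200, mu: 5) grid. It uses integer positions rather than coordinate values, and it does not reproduce how sel() maps coordinates to bins.

```python
import numpy

# Stand-in for a (k: 200, mu: 5) grid.
grid = numpy.zeros((200, 5))

# A scalar index drops the indexed axis.
scalar_indexed = grid[3]

# A one-element list keeps the axis, with length one.
list_indexed = grid[[3]]
```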
shape
The shape of the coordinate grid
squeeze(dim=None)
Squeeze the BinnedStatistic along the specified dimension, which removes that dimension from the BinnedStatistic.
The behavior is similar to that of numpy.squeeze().
Parameters: | dim (str, optional) – The name of the dimension to squeeze. If no dimension is provided, then the one dimension with unit length will be squeezed |
---|---|
Returns: | squeezed – a new BinnedStatistic instance, squeezed along one dimension |
Return type: | BinnedStatistic |
Raises: | ValueError – If the specified dimension does not have length one, or no dimension is specified and multiple dimensions have length one |
Examples
>>> pkmu
<BinnedStatistic: dims: (k: 200, mu: 1), variables: ('mu', 'k', 'power')>
>>> pkmu.squeeze() # squeeze the mu dimension
<BinnedStatistic: dims: (k: 200), variables: ('mu', 'k', 'power')>
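For comparison, the sketch below shows the numpy.squeeze() behavior that this method is modeled on, using an ordinary array as a stand-in for a (k: 200, mu: 1) grid. One difference is worth keeping in mind: numpy.squeeze() called without an axis removes every length-one axis, while the method documented here raises ValueError when no dimension is given and several have length one.

```python
import numpy

# Stand-in for a (k: 200, mu: 1) grid.
grid = numpy.zeros((200, 1))

# Squeezing the length-one axis removes it.
squeezed = numpy.squeeze(grid, axis=1)

# Squeezing an axis whose length is not one raises ValueError.
try:
    numpy.squeeze(grid, axis=0)
    raised = False
except ValueError:
    raised = True
```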
to_json(filename)
Write a BinnedStatistic to a JSON file.
Note
This uses nbodykit.utils.JSONEncoder to write the JSON file.
Parameters: | filename (str) – the name of the file to write |
---|
variables
Alias to return the names of the variables stored in data
nbodykit.binned_statistic.bin_ndarray(ndarray, new_shape, weights=None, operation=<function mean>)
Bins an ndarray in all axes based on the target shape, by summing or averaging.
Parameters: |
|
---|
Notes
Examples
>>> m = numpy.arange(0,100,1).reshape((10,10))
>>> n = bin_ndarray(m, new_shape=(5,5), operation=numpy.sum)
>>> print(n)
[[ 22  30  38  46  54]
 [102 110 118 126 134]
 [182 190 198 206 214]
 [262 270 278 286 294]
 [342 350 358 366 374]]
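One common way to obtain this kind of result is to reshape each axis into (number of bins, bin width) and reduce over the bin-width axes. The sketch below does this for the 2D, unweighted case. The helper name bin_by_reshape is hypothetical, the weights argument is left out, and the actual bin_ndarray may be implemented differently.

```python
import numpy

def bin_by_reshape(array, new_shape, operation=numpy.mean):
    """Bin a 2D array to `new_shape` by applying `operation` over each block."""
    rows, cols = array.shape
    new_rows, new_cols = new_shape
    if rows % new_rows or cols % new_cols:
        raise ValueError("each new dimension must evenly divide the old one")
    # Split each axis into (number of bins, bin width).
    blocks = array.reshape(new_rows, rows // new_rows, new_cols, cols // new_cols)
    # Reduce over the two bin-width axes.
    return operation(blocks, axis=(1, 3))

m = numpy.arange(0, 100, 1).reshape((10, 10))
n = bin_by_reshape(m, new_shape=(5, 5), operation=numpy.sum)
```

For the 10 x 10 input above, each output element is the sum of a 2 x 2 block, so the top-left entry is 0 + 1 + 10 + 11 = 22.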