nbodykit.utils¶
Functions
|
Padding an array in the front with items before this rank. |
|
Gather the input data array from all ranks to the specified root. |
|
Scatter the input data array across all ranks, assuming data is initially only on root (and None on other ranks). |
|
|
|
Redirect stdout and stderr to null for every rank but root. |
|
This is a decorator which can be used to mark functions as deprecated. |
|
Return the global minimum/maximum of a numpy/dask array along the first axis. |
|
Test if the input array is a structured array by testing for dtype.names |
Split s into three integers, a, b, c, such that a * b * c == s and a <= b <= c |
|
|
Utility function to return a string representing the elapsed time, as computed from the input start and end times |
Classes
|
Distributed Array Object |
|
|
|
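A subclass of json.JSONDecoder that can also handle numpy arrays, complex values, and astropy.units.Quantity objects. |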
|
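A subclass of json.JSONEncoder that can also handle numpy arrays, complex values, and astropy.units.Quantity objects. |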
|
Helper object for the topology of a distributed array |
- class nbodykit.utils.DistributedArray(local, comm)[source]¶
Distributed Array Object
A distributed array is striped across ranks along its first dimension.
- comm¶
the communicator
- Type
mpi4py.MPI.Comm
- local¶
the local data
- Type
array_like
Methods
bincount([weights, local, shared_edges]) Assign count numbers from sorted local data.
cempty(cshape, dtype, comm) Create an empty array collectively.
concat(*args, **kwargs) Append several distributed arrays into one.
sort([orderby]) Sort array globally by key orderby.
Assign unique labels to sorted local.
- bincount(weights=None, local=False, shared_edges=True)[source]¶
Assign count numbers from sorted local data.
Warning
The local data must be globally sorted and of integer type (as required by numpy.bincount).
- Parameters
weights (array-like) – If given, sum the weights instead of counting the number of objects.
local (boolean) – If True, only count the local array.
shared_edges (boolean) – If True, counts at edges that are shared between ranks are kept on both ranks. If False, counts at shared edges are kept only on the rank on the left.
- Returns
N – distributed counts array. If items of the same value span other chunks of the array, they are added to N as well.
- Return type
Examples
If the local array is [(0, 0), (0, 1)], with one tuple per rank, then the counts array is [(3,), (3, 1)].
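The example above can be reproduced with a serial emulation in which a list of lists stands in for the per-rank chunks. This is only a sketch of the documented semantics, not nbodykit's implementation: `emulate_bincount` is a hypothetical helper, it uses no MPI, and it assumes the values are globally sorted non-negative integers with no gaps between the smallest and largest value.

```python
from collections import Counter


def emulate_bincount(chunks, shared_edges=True):
    """Emulate the documented counts for globally sorted per-rank chunks."""
    # Global count of each value, summed over every rank.
    total = Counter(value for chunk in chunks for value in chunk)
    counts = []
    already_counted = set()
    for chunk in chunks:
        values = sorted(set(chunk))
        if not shared_edges:
            # Keep a shared value only on the left-most rank that holds it.
            values = [v for v in values if v not in already_counted]
        already_counted.update(values)
        counts.append([total[v] for v in values])
    return counts


chunks = [[0, 0], [0, 1]]  # rank 0 holds (0, 0), rank 1 holds (0, 1)
print(emulate_bincount(chunks))                      # [[3], [3, 1]]
print(emulate_bincount(chunks, shared_edges=False))  # [[3], [1]]
```

The value 0 straddles both ranks, so with shared_edges=True its full count of 3 is reported on each of them.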
- classmethod concat(*args, **kwargs)[source]¶
Append several distributed arrays into one.
- Parameters
localsize (None) –
- sort(orderby=None)[source]¶
Sort array globally by key orderby.
Due to a limitation of mpsort, self[orderby] must be of dtype u8 (unsigned 64-bit integer).
- nbodykit.utils.FrontPadArray(array, front, comm)[source]¶
Padding an array in the front with items before this rank.
- nbodykit.utils.GatherArray(data, comm, root=0)[source]¶
Gather the input data array from all ranks to the specified root.
This uses Gatherv, which avoids mpi4py pickling and also avoids the 2 GB mpi4py limit for bytes by using a custom datatype.
- Parameters
data (array_like) – the data on each rank to gather
comm (MPI communicator) – the MPI communicator
root (int, or Ellipsis) – the rank number to gather the data to. If root is Ellipsis, broadcast the result to all ranks.
- Returns
recvbuffer – the gathered data on root, and None otherwise
- Return type
array_like, None
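A Gatherv-style gather collects chunks of different lengths by giving each rank a count and a displacement in the receive buffer. The following is a minimal serial sketch of that layout, assuming a list of lists in place of the per-rank data; `emulate_gatherv` is a hypothetical helper and does not call mpi4py or nbodykit.

```python
from itertools import accumulate


def emulate_gatherv(chunks):
    """Concatenate variable-length per-rank chunks the way Gatherv lays them out."""
    counts = [len(chunk) for chunk in chunks]
    # Displacement of each rank's chunk inside the receive buffer.
    displs = [0] + list(accumulate(counts))[:-1]
    recvbuffer = [None] * sum(counts)
    for chunk, start in zip(chunks, displs):
        recvbuffer[start:start + len(chunk)] = chunk
    return recvbuffer, counts, displs


data_on_ranks = [[1, 2, 3], [], [4], [5, 6]]
gathered, counts, displs = emulate_gatherv(data_on_ranks)
print(gathered)  # [1, 2, 3, 4, 5, 6]
print(counts)    # [3, 0, 1, 2]
print(displs)    # [0, 3, 3, 4]
```

Empty ranks contribute a count of zero and therefore leave the buffer untouched.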
- class nbodykit.utils.JSONDecoder(*args, **kwargs)[source]¶
A subclass of json.JSONDecoder that can also handle numpy arrays, complex values, and astropy.units.Quantity objects.
Methods
decode(s[, _w]) Return the Python representation of s (a str instance containing a JSON document).
raw_decode(s[, idx]) Decode a JSON document from s (a str beginning with a JSON document) and return a 2-tuple of the Python representation and the index in s where the document ended.
hook
- class nbodykit.utils.JSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]¶
A subclass of json.JSONEncoder that can also handle numpy arrays, complex values, and astropy.units.Quantity objects.
Methods
default(obj) Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
encode(o) Return a JSON string representation of a Python data structure.
iterencode(o[, _one_shot]) Encode the given object and yield each string representation as available.
- __init__(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]¶
Constructor for JSONEncoder, with sensible defaults.
If skipkeys is false, then it is a TypeError to attempt encoding of keys that are not str, int, float or None. If skipkeys is True, such items are simply skipped.
If ensure_ascii is true, the output is guaranteed to be str objects with all incoming non-ASCII characters escaped. If ensure_ascii is false, the output can contain non-ASCII characters.
If check_circular is true, then lists, dicts, and custom encoded objects will be checked for circular references during encoding to prevent an infinite recursion (which would cause an OverflowError). Otherwise, no such check takes place.
If allow_nan is true, then NaN, Infinity, and -Infinity will be encoded as such. This behavior is not JSON specification compliant, but is consistent with most JavaScript based encoders and decoders. Otherwise, it will be a ValueError to encode such floats.
If sort_keys is true, then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis.
If indent is a non-negative integer, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0 will only insert newlines. None is the most compact representation.
If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent is None and (',', ': ') otherwise. To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace.
If specified, default is a function that gets called for objects that can't otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError.
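The constructor options above are inherited from the standard library, so their effect can be illustrated with json.JSONEncoder directly. This short sketch uses only the base class, not nbodykit's subclass.

```python
import json

data = {"b": float("nan"), "a": [1, 2]}

# sort_keys orders the dictionary keys; (',', ':') removes all whitespace.
compact = json.JSONEncoder(sort_keys=True, separators=(",", ":")).encode(data)
print(compact)  # {"a":[1,2],"b":NaN}

# indent=2 pretty-prints array elements and object members.
pretty = json.JSONEncoder(sort_keys=True, indent=2).encode({"a": [1, 2]})
print(pretty)

# allow_nan=False makes NaN, Infinity, and -Infinity an error.
try:
    json.JSONEncoder(allow_nan=False).encode(data)
    nan_rejected = False
except ValueError:
    nan_rejected = True
print(nan_rejected)  # True
```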
- default(obj)[source]¶
Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
For example, to support arbitrary iterators, you could implement default like this:
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return JSONEncoder.default(self, o)
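As an illustration of how an encoder and decoder pair can round-trip a type that JSON lacks, here is a minimal sketch for complex values built on the standard library. The class `ComplexEncoder`, the function `complex_hook`, and the `__complex__` tag are assumptions made for this example; the source does not state which representation nbodykit writes.

```python
import json


class ComplexEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, complex):
            # Hypothetical tagged form: store the real and imaginary parts.
            return {"__complex__": [o.real, o.imag]}
        # Let the base class raise TypeError for unsupported objects.
        return super().default(o)


def complex_hook(obj):
    """object_hook that rebuilds complex values from the tagged form."""
    if "__complex__" in obj:
        real, imag = obj["__complex__"]
        return complex(real, imag)
    return obj


text = json.dumps({"z": 1 + 2j}, cls=ComplexEncoder)
print(text)      # {"z": {"__complex__": [1.0, 2.0]}}
restored = json.loads(text, object_hook=complex_hook)
print(restored)  # {'z': (1+2j)}
```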
- class nbodykit.utils.LinearTopology(local, comm)[source]¶
Helper object for the topology of a distributed array
Methods
heads
()The first items on each rank.
next
()The item after the local data.
prev
()The item before the local data.
tails
()The last items on each rank.
- heads()[source]¶
The first items on each rank.
- Returns
heads – a list of first items, EmptyRank is used for empty ranks
- Return type
list
- next()[source]¶
The item after the local data.
This method fetches the first item after the local data. If the rank after the current rank is empty, the first item after that rank is used.
If no item is after local data, EmptyRank is returned.
- Returns
next – Item after local data, or EmptyRank if all ranks after this rank are empty.
- Return type
scalar
- prev()[source]¶
The item before the local data.
This method fetches the last item before the local data. If the rank before the current rank is empty, the last item before that rank is used.
If no item is before this rank, EmptyRank is returned.
- Returns
prev – Item before local data, or EmptyRank if all ranks before this rank are empty.
- Return type
scalar
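The behaviour of heads, tails, prev, and next described above can be emulated serially with a list of per-rank chunks. This is a sketch of the documented semantics only: the function names and the `EMPTY` sentinel (standing in for EmptyRank) are hypothetical, and no MPI communication takes place.

```python
EMPTY = object()  # stand-in for the EmptyRank sentinel


def heads(chunks):
    """First item on each rank, EMPTY for empty ranks."""
    return [chunk[0] if chunk else EMPTY for chunk in chunks]


def tails(chunks):
    """Last item on each rank, EMPTY for empty ranks."""
    return [chunk[-1] if chunk else EMPTY for chunk in chunks]


def prev_item(chunks, rank):
    """Last item before `rank`, skipping over empty ranks."""
    for tail in reversed(tails(chunks)[:rank]):
        if tail is not EMPTY:
            return tail
    return EMPTY


def next_item(chunks, rank):
    """First item after `rank`, skipping over empty ranks."""
    for head in heads(chunks)[rank + 1:]:
        if head is not EMPTY:
            return head
    return EMPTY


chunks = [[1, 2], [], [3, 4], []]
print(prev_item(chunks, 2))           # 2
print(next_item(chunks, 0))           # 3
print(prev_item(chunks, 0) is EMPTY)  # True
print(next_item(chunks, 2) is EMPTY)  # True
```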
- nbodykit.utils.ScatterArray(data, comm, root=0, counts=None)[source]¶
Scatter the input data array across all ranks, assuming data is initially only on root (and None on other ranks).
This uses Scatterv, which avoids mpi4py pickling and also avoids the 2 GB mpi4py limit for bytes by using a custom datatype.
- Parameters
data (array_like or None) – on root, this gives the data to split and scatter
comm (MPI communicator) – the MPI communicator
root (int) – the rank number that initially has the data
counts (list of int) – list of the lengths of data to send to each rank
- Returns
recvbuffer – the chunk of data that each rank gets
- Return type
array_like
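The role of counts can be sketched serially: the data on root is cut into consecutive chunks whose lengths are given by counts. The helpers `even_counts` and `emulate_scatterv` are hypothetical, and the near-even split used when counts is None is an assumption of this sketch rather than something the source states.

```python
def even_counts(length, size):
    """Assumed default: split as evenly as possible, extras to the low ranks."""
    base, extra = divmod(length, size)
    return [base + 1 if rank < extra else base for rank in range(size)]


def emulate_scatterv(data, size, counts=None):
    """Return the chunk each of `size` ranks would receive."""
    if counts is None:
        counts = even_counts(len(data), size)
    if len(counts) != size or sum(counts) != len(data):
        raise ValueError("counts must have one entry per rank and sum to len(data)")
    chunks, start = [], 0
    for count in counts:
        chunks.append(data[start:start + count])
        start += count
    return chunks


data = list(range(10))
print(emulate_scatterv(data, 4))             # [[0, 1, 2], [3, 4, 5], [6, 7], [8, 9]]
print(emulate_scatterv(data, 3, [5, 0, 5]))  # [[0, 1, 2, 3, 4], [], [5, 6, 7, 8, 9]]
```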
- nbodykit.utils.captured_output(comm, root=0)[source]¶
Redirect stdout and stderr to null for every rank but root.
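One way to obtain this kind of behaviour is a context manager that sends both streams to the null device on non-root ranks. The sketch below is not nbodykit's implementation: `silence_unless_root` is a hypothetical name, and it takes the rank as a plain integer instead of reading it from an MPI communicator.

```python
import contextlib
import io
import os


@contextlib.contextmanager
def silence_unless_root(rank, root=0):
    """Discard stdout and stderr inside the block unless rank == root."""
    if rank == root:
        yield
        return
    with open(os.devnull, "w") as devnull:
        with contextlib.redirect_stdout(devnull), contextlib.redirect_stderr(devnull):
            yield


# Collect what survives into a buffer so the effect is visible.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    with silence_unless_root(rank=1):
        print("hidden")  # discarded: rank 1 is not root
    with silence_unless_root(rank=0):
        print("shown")   # kept: rank 0 is root

captured = buffer.getvalue()
print(repr(captured))  # 'shown\n'
```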
- nbodykit.utils.deprecate(name, alternative, alt_name=None)[source]¶
This is a decorator which can be used to mark functions as deprecated. It will result in a warning being emitted when the function is used.
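A deprecation wrapper of this shape can be sketched with the standard warnings module. The sketch assumes that alternative is the replacement callable and that the returned wrapper forwards to it after warning; the source does not confirm this, and `deprecate_sketch`, the warning category, and the message text are all hypothetical.

```python
import functools
import warnings


def deprecate_sketch(name, alternative, alt_name=None):
    """Return a wrapper around `alternative` that warns that `name` is deprecated."""
    alt_name = alt_name or alternative.__name__

    @functools.wraps(alternative)
    def wrapper(*args, **kwargs):
        warnings.warn(
            "%s is deprecated; use %s instead" % (name, alt_name),
            FutureWarning,
            stacklevel=2,
        )
        return alternative(*args, **kwargs)

    return wrapper


def new_sum(a, b):
    return a + b


old_sum = deprecate_sketch("old_sum", new_sum)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_sum(1, 2)

print(result)             # 3
print(caught[0].message)  # old_sum is deprecated; use new_sum instead
```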
- nbodykit.utils.get_data_bounds(data, comm, selection=None)[source]¶
Return the global minimum/maximum of a numpy/dask array along the first axis.
This is computed in chunks to avoid memory errors on large data.
- Parameters
data (numpy.ndarray or dask.array.Array) – the data to find the bounds of
comm – the MPI communicator
- Returns
the min/max of data
- Return type
min, max
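The idea of computing bounds in chunks and then reducing across ranks can be sketched without numpy, dask, or MPI. The helpers `chunked_bounds` and `global_bounds` are hypothetical, the chunk size is arbitrary, and the sketch handles only one-dimensional lists of scalars, whereas the function above works along the first axis of an array.

```python
def chunked_bounds(values, chunksize=4):
    """Running min/max over fixed-size chunks of a local list."""
    lo, hi = float("inf"), float("-inf")
    for start in range(0, len(values), chunksize):
        chunk = values[start:start + chunksize]
        lo = min(lo, min(chunk))
        hi = max(hi, max(chunk))
    return lo, hi


def global_bounds(values_per_rank, chunksize=4):
    """Combine the local bounds of every rank, as a min/max reduction would."""
    local = [chunked_bounds(values, chunksize) for values in values_per_rank]
    return min(lo for lo, _ in local), max(hi for _, hi in local)


ranks = [[3.0, -1.5, 8.0, 2.0, 7.5], [], [0.5, 9.25]]
bounds = global_bounds(ranks)
print(bounds)  # (-1.5, 9.25)
```

An empty rank yields (inf, -inf), which is neutral under the final reduction.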
- nbodykit.utils.is_structured_array(arr)[source]¶
Test if the input array is a structured array by testing for dtype.names
- nbodykit.utils.split_size_3d(s)[source]¶
Split s into three integers, a, b, c, such that a * b * c == s and a <= b <= c
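One way to produce such a triple is to start near the cube root of s, step down to the nearest divisor, and then repeat with the square root of what remains. This is a sketch under that assumption, not necessarily the strategy nbodykit uses; `split_size_3d_sketch` is a hypothetical name and expects a positive integer.

```python
import math


def split_size_3d_sketch(s):
    """Return (a, b, c) with a * b * c == s and a <= b <= c."""
    # Largest divisor of s not exceeding the rounded cube root.
    a = max(int(round(s ** (1.0 / 3.0))), 1)
    while s % a != 0:
        a -= 1
    rest = s // a
    # Largest divisor of the remainder not exceeding its square root.
    b = max(math.isqrt(rest), 1)
    while rest % b != 0:
        b -= 1
    c = rest // b
    return tuple(sorted((a, b, c)))


print(split_size_3d_sketch(64))  # (4, 4, 4)
print(split_size_3d_sketch(12))  # (2, 2, 3)
print(split_size_3d_sketch(7))   # (1, 1, 7)
```

The final sort guarantees the ordering even when rounding the cube root picks a first factor larger than the second.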