nbodykit.utils

Functions

FrontPadArray(array, front, comm) Pad the front of an array with items from before this rank.
GatherArray(data, comm[, root]) Gather the input data array from all ranks to the specified root.
ScatterArray(data, comm[, root, counts]) Scatter the input data array across all ranks, assuming data is initially only on root (and None on other ranks).
attrs_to_dict(obj, prefix)
captured_output(comm[, root]) Re-direct stdout and stderr to null for every rank but root
deprecate(name, alternative[, alt_name]) This is a decorator which can be used to mark functions as deprecated.
get_data_bounds(data, comm[, selection]) Return the global minimum/maximum of a numpy/dask array along the first axis.
is_structured_array(arr) Test if the input array is a structured array by testing for dtype.names
split_size_3d(s) Split s into three integers, a, b, c, such that a * b * c == s and a <= b <= c
timer(start, end) Utility function to return a string representing the elapsed time, as computed from the input start and end times

Classes

DistributedArray(local, comm) Distributed Array Object
EmptyRankType
JSONDecoder(*args, **kwargs) A subclass of json.JSONDecoder that can also handle numpy arrays, complex values, and astropy.units.Quantity objects.
JSONEncoder(*[, skipkeys, ensure_ascii, …]) A subclass of json.JSONEncoder that can also handle numpy arrays, complex values, and astropy.units.Quantity objects.
LinearTopology(local, comm) Helper object for the topology of a distributed array
class nbodykit.utils.DistributedArray(local, comm)[source]

Distributed Array Object

A distributed array is striped across ranks along its first dimension.

comm

the communicator

Type:mpi4py.MPI.Comm
local

the local data

Type:array_like

Methods

bincount([weights, local, shared_edges]) Assign count numbers from sorted local data.
cempty(cshape, dtype, comm) Create an empty array collectively
concat(*args, **kwargs) Append several distributed arrays into one.
sort([orderby]) Sort array globally by key orderby.
unique_labels() Assign unique labels to sorted local.
bincount(weights=None, local=False, shared_edges=True)[source]

Assign count numbers from sorted local data.

Warning

The local data must be globally sorted and of integer type (see numpy.bincount).

Parameters:
  • weights (array-like) – if given, count the weight instead of the number of objects.
  • local (boolean) – if local is True, only count the local array.
  • shared_edges (boolean) – if True, counts at edges that are shared between ranks are kept on both ranks. If False, counts at shared edges are kept only on the rank on the left.
Returns:

N – distributed counts array. If items of the same value span other chunks of the array, they are added to N as well.

Return type:

DistributedArray

Examples

If the local arrays on two ranks are (0, 0) and (0, 1), then the counts arrays are (3,) and (3, 1).
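
The shared-edge behaviour can be mimicked without MPI. The sketch below is a pure-Python illustration, not nbodykit's implementation. It reads the example above as two ranks holding (0, 0) and (0, 1), models each rank as an inner list, and assumes shared_edges=True. The helper name `simulate_shared_bincount` is hypothetical, and unlike numpy.bincount it reports only values that actually occur on a rank.

```python
from collections import Counter


def simulate_shared_bincount(chunks):
    """Hypothetical helper, not part of nbodykit.

    `chunks` holds one globally sorted list of integers per simulated rank.
    Every rank reports the global count of each distinct value it holds,
    so a value that spans two ranks gets the same total on both.
    """
    total = Counter()
    for chunk in chunks:
        total.update(chunk)

    result = []
    for chunk in chunks:
        distinct = sorted(set(chunk))
        result.append([total[value] for value in distinct])
    return result


# Two simulated ranks; the value 0 spans both of them.
counts = simulate_shared_bincount([[0, 0], [0, 1]])
print(counts)  # [[3], [3, 1]]
```

The value 0 occurs three times in total, so both ranks report 3 for it, while the value 1 lives only on the second rank.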

classmethod cempty(cshape, dtype, comm)[source]

Create an empty array collectively

classmethod concat(*args, **kwargs)[source]

Append several distributed arrays into one.

Parameters:localsize (None) –
sort(orderby=None)[source]

Sort array globally by key orderby.

Due to a limitation of mpsort, self[orderby] must be u8.

unique_labels()[source]

Assign unique labels to sorted local.

Warning

The local data must be globally sorted and of a simple type (see numpy.unique).

Returns:label – the new labels, starting from 0
Return type:DistributedArray
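
As a rough illustration of what labelling globally sorted data means, the sketch below walks over simulated ranks in order and starts a new label whenever the value changes. It is a serial stand-in, not nbodykit's MPI code, and the helper name `simulate_unique_labels` is hypothetical.

```python
def simulate_unique_labels(chunks):
    """Hypothetical helper, not part of nbodykit.

    `chunks` holds one globally sorted list per simulated rank.
    Labels start from 0 and carry over across rank boundaries.
    """
    labels = []
    current = -1
    previous = object()  # sentinel that compares unequal to any data value
    for chunk in chunks:
        local = []
        for item in chunk:
            if item != previous:
                current += 1
                previous = item
            local.append(current)
        labels.append(local)
    return labels


# The value 7 spans the first and third rank and keeps the same label.
print(simulate_unique_labels([[3, 3, 7], [], [7, 9]]))  # [[0, 0, 1], [], [1, 2]]
```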
nbodykit.utils.FrontPadArray(array, front, comm)[source]

Pad the front of an array with items from before this rank.

nbodykit.utils.GatherArray(data, comm, root=0)[source]

Gather the input data array from all ranks to the specified root.

This uses Gatherv, which avoids mpi4py pickling. A custom datatype is used to avoid the 2 GB mpi4py limit on bytes.

Parameters:
  • data (array_like) – the data on each rank to gather
  • comm (MPI communicator) – the MPI communicator
  • root (int, or Ellipsis) – the rank number to gather the data to. If root is Ellipsis, broadcast the result to all ranks.
Returns:

recvbuffer – the gathered data on root, and None otherwise

Return type:

array_like, None

class nbodykit.utils.JSONDecoder(*args, **kwargs)[source]

A subclass of json.JSONDecoder that can also handle numpy arrays, complex values, and astropy.units.Quantity objects.

Methods

decode(s[, _w]) Return the Python representation of s (a str instance containing a JSON document).
raw_decode(s[, idx]) Decode a JSON document from s (a str beginning with a JSON document) and return a 2-tuple of the Python representation and the index in s where the document ended.
hook  
decode(s, _w=<built-in method match of _sre.SRE_Pattern object>)[source]

Return the Python representation of s (a str instance containing a JSON document).

raw_decode(s, idx=0)[source]

Decode a JSON document from s (a str beginning with a JSON document) and return a 2-tuple of the Python representation and the index in s where the document ended.

This can be used to decode a JSON document from a string that may have extraneous data at the end.
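
A small example of the trailing-data case, using the standard-library json.JSONDecoder that this class subclasses. The input string is made up for illustration.

```python
import json

decoder = json.JSONDecoder()
text = '{"a": 1} trailing text'

# raw_decode stops at the end of the first JSON document and reports where.
obj, end = decoder.raw_decode(text)
remainder = text[end:]

print(obj)        # {'a': 1}
print(end)        # 8
print(remainder)  # ' trailing text'
```

Calling decode on the same string would raise json.JSONDecodeError because of the extra data.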

class nbodykit.utils.JSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

A subclass of json.JSONEncoder that can also handle numpy arrays, complex values, and astropy.units.Quantity objects.

Methods

default(obj) Implement this method in a subclass such that it returns a serializable object for obj, or calls the base implementation (to raise a TypeError).
encode(o) Return a JSON string representation of a Python data structure.
iterencode(o[, _one_shot]) Encode the given object and yield each string representation as available.
__init__(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

Constructor for JSONEncoder, with sensible defaults.

If skipkeys is false, then it is a TypeError to attempt encoding of keys that are not str, int, float or None. If skipkeys is True, such items are simply skipped.

If ensure_ascii is true, the output is guaranteed to be str objects with all incoming non-ASCII characters escaped. If ensure_ascii is false, the output can contain non-ASCII characters.

If check_circular is true, then lists, dicts, and custom encoded objects will be checked for circular references during encoding to prevent an infinite recursion (which would cause an OverflowError). Otherwise, no such check takes place.

If allow_nan is true, then NaN, Infinity, and -Infinity will be encoded as such. This behavior is not JSON specification compliant, but is consistent with most JavaScript based encoders and decoders. Otherwise, it will be a ValueError to encode such floats.

If sort_keys is true, then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis.

If indent is a non-negative integer, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0 will only insert newlines. None is the most compact representation.

If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent is None and (',', ': ') otherwise. To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace.

If specified, default is a function that gets called for objects that can’t otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError.
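
To show how a subclass can extend the set of serializable types, here is a minimal sketch that round-trips complex values with the standard-library json module. The class `ComplexEncoder`, the function `complex_hook`, and the `__complex__` key are all hypothetical; the representation that nbodykit actually writes is not described in this section.

```python
import json


class ComplexEncoder(json.JSONEncoder):
    """Hypothetical encoder that stores complex numbers as a tagged dict."""

    def default(self, obj):
        if isinstance(obj, complex):
            return {"__complex__": [obj.real, obj.imag]}
        # Let the base class raise TypeError for anything else.
        return super().default(obj)


def complex_hook(dct):
    """Hypothetical object_hook that reverses ComplexEncoder's tagging."""
    if "__complex__" in dct:
        real, imag = dct["__complex__"]
        return complex(real, imag)
    return dct


encoded = json.dumps({"z": 1 + 2j}, cls=ComplexEncoder)
decoded = json.loads(encoded, object_hook=complex_hook)

print(encoded)  # {"z": {"__complex__": [1.0, 2.0]}}
print(decoded)  # {'z': (1+2j)}
```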

default(obj)[source]

Implement this method in a subclass such that it returns a serializable object for obj, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
encode(o)[source]

Return a JSON string representation of a Python data structure.

>>> from json.encoder import JSONEncoder
>>> JSONEncoder().encode({"foo": ["bar", "baz"]})
'{"foo": ["bar", "baz"]}'
iterencode(o, _one_shot=False)[source]

Encode the given object and yield each string representation as available.

For example:

for chunk in JSONEncoder().iterencode(bigobject):
    mysocket.write(chunk)
class nbodykit.utils.LinearTopology(local, comm)[source]

Helper object for the topology of a distributed array

Methods

heads() The first items on each rank.
next() The item after the local data.
prev() The item before the local data.
tails() The last items on each rank.
heads()[source]

The first items on each rank.

Returns:heads – a list of first items, EmptyRank is used for empty ranks
Return type:list
next()[source]

The item after the local data.

This method fetches the first item after the local data. If the rank after the current rank is empty, the item after that rank is used.

If no item is after local data, EmptyRank is returned.

Returns:next – Item after local data, or EmptyRank if all ranks after this rank are empty.
Return type:scalar
prev()[source]

The item before the local data.

This method fetches the last item before the local data. If the rank before is empty, the rank before that is used.

If no item is before this rank, EmptyRank is returned.

Returns:prev – Item before local data, or EmptyRank if all ranks before this rank are empty.
Return type:scalar
tails()[source]

The last items on each rank.

Returns:tails – a list of last items, EmptyRank is used for empty ranks
Return type:list
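
The four methods above can be pictured with a serial stand-in in which each simulated rank is a list. This is only a sketch of the described semantics, not nbodykit's MPI implementation. `EMPTY` is a hypothetical placeholder for the EmptyRank sentinel, and the function names are illustrative.

```python
EMPTY = object()  # hypothetical stand-in for nbodykit's EmptyRank sentinel


def heads(chunks):
    """First item on each simulated rank, EMPTY for empty ranks."""
    return [chunk[0] if len(chunk) else EMPTY for chunk in chunks]


def tails(chunks):
    """Last item on each simulated rank, EMPTY for empty ranks."""
    return [chunk[-1] if len(chunk) else EMPTY for chunk in chunks]


def prev_item(chunks, rank):
    """Last item before `rank`, skipping over empty ranks."""
    for tail in reversed(tails(chunks)[:rank]):
        if tail is not EMPTY:
            return tail
    return EMPTY


def next_item(chunks, rank):
    """First item after `rank`, skipping over empty ranks."""
    for head in heads(chunks)[rank + 1:]:
        if head is not EMPTY:
            return head
    return EMPTY


chunks = [[1, 2], [], [5, 8]]  # the middle rank is empty
print(prev_item(chunks, 2))           # 2, found by skipping the empty rank 1
print(next_item(chunks, 0))           # 5
print(prev_item(chunks, 0) is EMPTY)  # True, nothing precedes rank 0
```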
nbodykit.utils.ScatterArray(data, comm, root=0, counts=None)[source]

Scatter the input data array across all ranks, assuming data is initially only on root (and None on other ranks).

This uses Scatterv, which avoids mpi4py pickling. A custom datatype is used to avoid the 2 GB mpi4py limit on bytes.

Parameters:
  • data (array_like or None) – on root, this gives the data to split and scatter
  • comm (MPI communicator) – the MPI communicator
  • root (int) – the rank number that initially has the data
  • counts (list of int) – list of the lengths of data to send to each rank
Returns:

recvbuffer – the chunk of data that each rank gets

Return type:

array_like
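
To make the role of counts concrete, the sketch below splits a sequence into consecutive chunks, one per rank, in the way a Scatterv-style call distributes data. It runs serially without MPI. The helper `split_by_counts` is hypothetical, and the check that the counts sum to the data length is an assumption about sensible behaviour rather than something stated above.

```python
def split_by_counts(data, counts):
    """Hypothetical helper: chunk i receives counts[i] consecutive items."""
    if sum(counts) != len(data):
        # Assumed requirement: every item must be sent to exactly one rank.
        raise ValueError("counts must sum to the length of data")
    chunks = []
    offset = 0
    for n in counts:
        chunks.append(data[offset:offset + n])
        offset += n
    return chunks


# Seven items spread over three simulated ranks.
parts = split_by_counts(list(range(7)), [3, 2, 2])
print(parts)  # [[0, 1, 2], [3, 4], [5, 6]]
```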

nbodykit.utils.captured_output(comm, root=0)[source]

Re-direct stdout and stderr to null for every rank but root

nbodykit.utils.deprecate(name, alternative, alt_name=None)[source]

This is a decorator which can be used to mark functions as deprecated. It will result in a warning being emitted when the function is used.
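
A generic sketch of this pattern, written with the standard warnings module, is shown below. It mirrors the (name, alternative, alt_name) signature, but `deprecated_alias` is a hypothetical stand-in. The warning category and message text are assumptions, not taken from nbodykit.

```python
import functools
import warnings


def deprecated_alias(name, alternative, alt_name=None):
    """Hypothetical helper: return a wrapper that warns, then calls `alternative`."""
    alt_name = alt_name or alternative.__name__

    @functools.wraps(alternative)
    def wrapper(*args, **kwargs):
        warnings.warn(
            "%s is deprecated; use %s instead" % (name, alt_name),
            FutureWarning,  # assumed category
            stacklevel=2,
        )
        return alternative(*args, **kwargs)

    return wrapper


def new_sum(a, b):
    return a + b


old_sum = deprecated_alias("old_sum", new_sum)

# Record the warning instead of printing it, to show that one was emitted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_sum(1, 2)

print(result)       # 3
print(len(caught))  # 1
```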

nbodykit.utils.get_data_bounds(data, comm, selection=None)[source]

Return the global minimum/maximum of a numpy/dask array along the first axis.

This is computed in chunks to avoid memory errors on large data.

Parameters:
Returns:

the min/max of data

Return type:

min, max
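
The idea of computing bounds in chunks and then combining them across ranks can be sketched serially as follows. This is an illustration under stated assumptions, not the library's code: plain 1-D lists stand in for numpy/dask arrays, the per-rank lists stand in for MPI ranks, and `chunked_bounds` and its `chunk_size` parameter are hypothetical.

```python
def chunked_bounds(values, chunk_size=4):
    """Hypothetical helper: min/max of `values`, scanning one chunk at a time."""
    lo, hi = float("inf"), float("-inf")
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        lo = min(lo, min(chunk))
        hi = max(hi, max(chunk))
    return lo, hi


# Three simulated ranks; the last one holds no data.
per_rank = [[5.0, 1.5, 9.0, 2.0, 7.5], [3.0, -4.0], []]
local = [chunked_bounds(values) for values in per_rank]

# Combine the per-rank results. An empty rank contributes (inf, -inf),
# which leaves the global result unchanged.
global_min = min(lo for lo, _ in local)
global_max = max(hi for _, hi in local)

print(global_min, global_max)  # -4.0 9.0
```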

nbodykit.utils.is_structured_array(arr)[source]

Test if the input array is a structured array by testing for dtype.names

nbodykit.utils.split_size_3d(s)[source]

Split s into three integers, a, b, c, such that a * b * c == s and a <= b <= c

Parameters:s (int) – integer to split
Returns:a, b, c – integers such that a * b * c == s and a <= b <= c
Return type:int
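
One way to produce such a triple is sketched below: take the largest divisor not exceeding the cube root, then the largest divisor of the remainder not exceeding its square root, and sort the result. This is an illustrative approach and may return a different (equally valid) triple than nbodykit does. The name `split_into_three` is hypothetical.

```python
def split_into_three(s):
    """Hypothetical helper: return (a, b, c) with a * b * c == s and a <= b <= c."""
    if s < 1:
        raise ValueError("s must be a positive integer")

    # Largest divisor a of s with a**3 <= s.
    a = 1
    d = 1
    while d * d * d <= s:
        if s % d == 0:
            a = d
        d += 1

    # Largest divisor b of the remainder with b**2 <= remainder.
    rest = s // a
    b = 1
    d = 1
    while d * d <= rest:
        if rest % d == 0:
            b = d
        d += 1

    c = rest // b
    # Sorting guarantees the ordering even when b turns out smaller than a.
    return tuple(sorted((a, b, c)))


print(split_into_three(12))   # (2, 2, 3)
print(split_into_three(404))  # (1, 4, 101)
```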
nbodykit.utils.timer(start, end)[source]

Utility function to return a string representing the elapsed time, as computed from the input start and end times

Parameters:
  • start (int) – the start time in seconds
  • end (int) – the end time in seconds
Returns:

the elapsed time as a string, using the format hours:minutes:seconds

Return type:

str
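
The conversion to hours:minutes:seconds amounts to two divmod steps, as in the sketch below. The helper name `format_elapsed` is hypothetical, and the zero-padded, whole-second formatting is an assumption. The exact padding and precision used by timer are not specified above.

```python
def format_elapsed(start, end):
    """Hypothetical helper: format end - start (in seconds) as HH:MM:SS."""
    hours, rem = divmod(end - start, 3600)
    minutes, seconds = divmod(rem, 60)
    return "{:02d}:{:02d}:{:02d}".format(int(hours), int(minutes), int(seconds))


print(format_elapsed(0, 3725))  # 01:02:05
```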