nd.utils package
This module provides several helper functions.
nd.utils.apply(ds, fn, signature=None, njobs=1)
Apply a function that operates on a defined subset of dimensions to a Dataset.
- Parameters
ds (xr.Dataset or xr.DataArray) – The dataset to which to apply the function.
fn (function) – The function to apply to the Dataset.
signature (str, optional) – The signature of the function in dimension names, e.g. ‘(time,var)->(time)’. If ‘var’ is included, the Dataset variables will be converted into a new dimension and the result will be a DataArray.
njobs (int, optional) – The number of jobs to run in parallel.
- Returns
The output dataset with changed dimensions according to the function signature.
- Return type
xr.Dataset or xr.DataArray
nd.utils.array_chunks(array, n, axis=0, return_indices=False)
Chunk an array along the given axis.
- Parameters
array (numpy.array) – The array to be chunked.
n (int) – The chunksize.
axis (int, optional) – The axis along which to split the array into chunks (default: 0).
return_indices (bool, optional) – If True, yield the index that extracts each chunk from the array rather than the chunk itself (default: False).
- Yields
iterable – Consecutive slices of array of size n.
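As a rough illustration of the yielded slices, here is a minimal numpy-only sketch; array_chunks_sketch is a hypothetical stand-in, not the library function, and omits the return_indices option:

```python
import numpy as np

def array_chunks_sketch(array, n, axis=0):
    """Yield consecutive slices of size n along the given axis."""
    for start in range(0, array.shape[axis], n):
        # Build an index tuple that slices [start:start+n] on `axis`
        # and takes everything on all other axes.
        idx = [slice(None)] * array.ndim
        idx[axis] = slice(start, start + n)
        yield array[tuple(idx)]

a = np.arange(12).reshape(3, 4)
cols = list(array_chunks_sketch(a, 2, axis=1))  # two chunks of shape (3, 2)
```

Concatenating the chunks back along the same axis reproduces the original array.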
nd.utils.block_merge(array_list, blocks)
Reassemble a list of arrays as generated by block_split().
- Parameters
array_list (list of numpy.array) – A list of numpy.array, e.g. as generated by block_split().
blocks (array_like) – The number of blocks per axis to be merged.
- Returns
The merged array, with len(blocks) dimensions.
- Return type
numpy.array
nd.utils.block_split(array, blocks)
Split an ndarray into subarrays according to blocks.
- Parameters
array (numpy.ndarray) – The array to be split.
blocks (array_like) – The desired number of blocks per axis.
- Returns
A list of blocks, in row-major order.
- Return type
list
Examples
>>> block_split(np.arange(16).reshape((4, 4)), (2, 2))
[array([[0, 1],
        [4, 5]]),
 array([[2, 3],
        [6, 7]]),
 array([[ 8,  9],
        [12, 13]]),
 array([[10, 11],
        [14, 15]])]
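The split/merge round trip can be sketched with plain numpy. block_split_sketch and block_merge_sketch below are hypothetical reimplementations for illustration, assuming each axis divides evenly into the requested number of blocks:

```python
import numpy as np

def block_split_sketch(array, blocks):
    """Split `array` into blocks along each axis, in row-major order."""
    result = [array]
    for axis, n in enumerate(blocks):
        # Split every block produced so far n ways along the current axis.
        result = [sub for a in result for sub in np.split(a, n, axis=axis)]
    return result

def block_merge_sketch(array_list, blocks):
    """Inverse of block_split_sketch: reassemble the blocks into one array."""
    arrays = array_list
    for axis in reversed(range(len(blocks))):
        # Concatenate groups of blocks, starting with the last split axis.
        n = blocks[axis]
        arrays = [np.concatenate(arrays[i:i + n], axis=axis)
                  for i in range(0, len(arrays), n)]
    return arrays[0]

a = np.arange(16).reshape(4, 4)
parts = block_split_sketch(a, (2, 2))
restored = block_merge_sketch(parts, (2, 2))
```

Merging the blocks in the same row-major order recovers the original array, which is the relationship the two library functions are documented to have.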
nd.utils.chunks(l, n)
Yield successive n-sized chunks from l.
https://stackoverflow.com/a/312464
- Parameters
l (iterable) – The list or list-like object to be split into chunks.
n (int) – The size of the chunks to be generated.
- Yields
iterable – Consecutive slices of l of size n.
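The linked Stack Overflow recipe amounts to a few lines; chunks_sketch is an illustrative copy of that pattern (note the last chunk may be shorter than n if len(l) is not a multiple of n):

```python
def chunks_sketch(l, n):
    """Yield successive n-sized chunks from l; the last may be shorter."""
    for i in range(0, len(l), n):
        yield l[i:i + n]

parts = list(chunks_sketch([1, 2, 3, 4, 5], 2))  # [[1, 2], [3, 4], [5]]
```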
nd.utils.expand_variables(da, dim='variable')
This is the inverse of xarray.Dataset.to_array().
- Parameters
da (xarray.DataArray) – A DataArray that contains the variable names as a dimension.
dim (str) – The dimension name (default: ‘variable’).
- Returns
A Dataset in which the variable dimension of da has been expanded into separate variables.
- Return type
xarray.Dataset
nd.utils.get_vars_for_dims(ds, dims, invert=False)
Return a list of all variables in ds which have dimensions dims.
- Parameters
ds (xarray.Dataset) – The dataset whose variables to filter.
dims (list of str) – The dimensions that each variable must contain.
invert (bool, optional) – Whether to return the variables that do not contain the given dimensions (default: False).
- Returns
A list of all variable names that have dimensions dims.
- Return type
list of str
nd.utils.is_complex(ds)
Check if a dataset contains any complex variables.
- Parameters
ds (xarray.Dataset or xarray.DataArray) – The dataset to check.
- Returns
True if ds contains any complex variables, False otherwise.
- Return type
bool
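The underlying check boils down to inspecting dtypes. Here is a minimal sketch on plain numpy arrays; is_complex_sketch is hypothetical and stands in for iterating over a Dataset's variables:

```python
import numpy as np

def is_complex_sketch(arrays):
    """Return True if any of the given arrays has a complex dtype."""
    return any(np.iscomplexobj(a) for a in arrays)

has_complex = is_complex_sketch([np.zeros(3), np.zeros(3, dtype=complex)])
all_real = is_complex_sketch([np.zeros(3), np.ones(3)])
```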
nd.utils.parallel(fn, dim=None, chunks=None, chunksize=None, merge=True, buffer=0, compute=True)
Parallelize a function that takes an xarray dataset as its first argument.
TODO: also accept numpy arrays.
- Parameters
fn (function) – Must take an xarray.Dataset as first argument.
dim (str, optional) – The dimension along which to split the dataset for parallel execution. If not passed, defaults to ‘y’.
chunks (int, optional) – The number of chunks to execute in parallel. If not passed, use the number of available CPUs.
chunksize (int, optional) – … to be implemented
buffer (int, optional) – (default: 0)
compute (bool, optional) – If True, return the computed result. Otherwise, return the dask computation object (default: True).
- Returns
A parallelized function that may be called with exactly the same arguments as fn.
- Return type
function
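To illustrate the split–apply–merge idea without xarray, here is a hedged numpy-only sketch using the standard library's concurrent.futures. parallel_sketch is hypothetical: it ignores buffer and compute, and splits along a plain array axis rather than a named dimension:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_sketch(fn, axis=0, chunks=4):
    """Return a function that applies fn to chunks of an array in
    parallel and concatenates the results along `axis`."""
    def wrapper(array, *args, **kwargs):
        # Split into roughly equal chunks, run fn on each in a thread
        # pool, then stitch the results back together.
        parts = np.array_split(array, chunks, axis=axis)
        with ThreadPoolExecutor(max_workers=chunks) as pool:
            results = list(pool.map(lambda p: fn(p, *args, **kwargs), parts))
        return np.concatenate(results, axis=axis)
    return wrapper

double = parallel_sketch(lambda a: a * 2, axis=0, chunks=2)
```

As with the library function, the wrapper is called with exactly the same arguments as fn itself.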
nd.utils.select(objects, fn, unlist=True, first=False)
Return a subset of objects that match the given criteria.
- Parameters
objects (list of obj) – The collection of objects to filter.
fn (lambda expression) – Filter objects by whether fn(obj) returns True.
first (bool, optional) – If True, return first entry only (default: False).
unlist (bool, optional) – If True and the result has length 1 and objects is a list, return the object directly, rather than the list (default: True).
- Returns
A list of all items in objects that match the specified criteria.
- Return type
list
Examples
>>> select([{'a': 1, 'b': 2}, {'a': 2, 'b': 2}, {'a': 1, 'b': 1}], lambda o: o['a'] == 1)
[{'a': 1, 'b': 2}, {'a': 1, 'b': 1}]
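The documented semantics can be sketched in a few lines. select_sketch is a hypothetical reimplementation, and the behaviour of first=True with no matches (returning None) is an assumption not stated above:

```python
def select_sketch(objects, fn, unlist=True, first=False):
    """Filter objects by fn, mirroring the unlist/first behaviour above."""
    matches = [obj for obj in objects if fn(obj)]
    if first:
        # Assumption: return None when nothing matches.
        return matches[0] if matches else None
    if unlist and len(matches) == 1 and isinstance(objects, list):
        # A single match from a list input is returned unwrapped.
        return matches[0]
    return matches

objs = [{'a': 1, 'b': 2}, {'a': 2, 'b': 2}, {'a': 1, 'b': 1}]
result = select_sketch(objs, lambda o: o['a'] == 1)
```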