nd.utils package

This module provides several helper functions.

nd.utils.apply(ds, fn, signature=None, njobs=1)[source]

Apply to a Dataset a function that operates on a defined subset of its dimensions.

Parameters
  • ds (xr.Dataset or xr.DataArray) – The dataset to which to apply the function.

  • fn (function) – The function to apply to the Dataset.

  • signature (str, optional) – The signature of the function in dimension names, e.g. ‘(time,var)->(time)’. If ‘var’ is included, the Dataset variables will be converted into a new dimension and the result will be a DataArray.

  • njobs (int, optional) – The number of jobs to run in parallel.

Returns

The output dataset with changed dimensions according to the function signature.

Return type

xr.Dataset

nd.utils.array_chunks(array, n, axis=0, return_indices=False)[source]

Chunk an array along the given axis.

Parameters
  • array (numpy.array) – The array to be chunked.

  • n (int) – The chunksize.

  • axis (int, optional) – The axis along which to split the array into chunks (default: 0).

  • return_indices (bool, optional) – If True, yield the index that selects each chunk rather than the chunk itself (default: False).

Yields

iterable – Consecutive slices of array of size n.
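The documented behaviour can be sketched in a few lines of numpy. This is a hypothetical reimplementation for illustration (the name array_chunks_sketch is not part of nd.utils), not the actual library code:

```python
import numpy as np

def array_chunks_sketch(array, n, axis=0, return_indices=False):
    """Yield consecutive chunks of size n along the given axis.

    A pure-numpy sketch of the documented behaviour; the last chunk
    may be shorter when the axis length is not divisible by n.
    """
    for start in range(0, array.shape[axis], n):
        # Build an index tuple that slices [start, start + n) on `axis`
        # and takes everything on the other axes.
        index = tuple(
            slice(start, start + n) if ax == axis else slice(None)
            for ax in range(array.ndim)
        )
        yield index if return_indices else array[index]
```

For example, a (5, 2) array chunked with n=2 along axis 0 yields chunks of 2, 2, and 1 rows.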

nd.utils.block_merge(array_list, blocks)[source]

Reassemble a list of arrays as generated by block_split.

Parameters
  • array_list (list of numpy.array) – A list of numpy.array, e.g. as generated by block_split().

  • blocks (array_like) – The number of blocks per axis to be merged.

Returns

A numpy array with dimension len(blocks).

Return type

numpy.array
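A hedged sketch of how such a merge could work, assuming blocks were produced axis by axis as in block_split() (the name block_merge_sketch is hypothetical, not the actual nd.utils code):

```python
import numpy as np

def block_merge_sketch(array_list, blocks):
    """Reassemble a flat list of blocks into a single array.

    Sketch: undo the splits in reverse axis order by concatenating
    groups of blocks[axis] consecutive parts along each axis.
    """
    parts = list(array_list)
    for axis in range(len(blocks) - 1, -1, -1):
        n = blocks[axis]
        parts = [
            np.concatenate(parts[i:i + n], axis=axis)
            for i in range(0, len(parts), n)
        ]
    return parts[0]
```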

nd.utils.block_split(array, blocks)[source]

Split an ndarray into subarrays according to blocks.

Parameters
  • array (numpy.ndarray) – The array to be split.

  • blocks (array_like) – The desired number of blocks per axis.

Returns

A list of blocks, in column-major order.

Return type

list

Examples

>>> block_split(np.arange(16).reshape((4, 4)), (2, 2))
[array([[ 0,  1],
        [ 4,  5]]),
 array([[ 2,  3],
        [ 6,  7]]),
 array([[ 8,  9],
        [12, 13]]),
 array([[10, 11],
        [14, 15]])]
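
The example output can be reproduced with a short sketch built on numpy.array_split, assuming the split proceeds axis by axis (the name block_split_sketch is hypothetical):

```python
import numpy as np

def block_split_sketch(array, blocks):
    """Split `array` into blocks[axis] pieces along each axis.

    Sketch: splitting along axis 0 first, then axis 1, and so on
    reproduces the block order shown in the example above.
    """
    parts = [array]
    for axis, n in enumerate(blocks):
        parts = [piece for part in parts
                 for piece in np.array_split(part, n, axis=axis)]
    return parts
```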
nd.utils.chunks(l, n)[source]

Yield successive n-sized chunks from l.

https://stackoverflow.com/a/312464

Parameters
  • l (iterable) – The list or list-like object to be split into chunks.

  • n (int) – The size of the chunks to be generated.

Yields

iterable – Consecutive slices of l of size n.
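This is the classic recipe from the linked Stack Overflow answer; a sketch with a hypothetical name:

```python
def chunks_sketch(l, n):
    """Yield successive n-sized chunks from l.

    The final chunk is shorter when len(l) is not divisible by n.
    """
    for i in range(0, len(l), n):
        yield l[i:i + n]
```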

nd.utils.dict_product(d)[source]

Like itertools.product, but works with dictionaries.
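A plausible sketch of such a function, combining itertools.product with the dictionary's keys (the name dict_product_sketch is hypothetical, not the library's code):

```python
import itertools

def dict_product_sketch(d):
    """Yield one dict per element of the Cartesian product of the
    value sequences in d, keeping the original keys."""
    keys = list(d)
    for values in itertools.product(*d.values()):
        yield dict(zip(keys, values))
```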

nd.utils.expand_variables(da, dim='variable')[source]

This is the inverse of xarray.Dataset.to_array().

Parameters
  • da (xarray.DataArray) – A DataArray that contains the variable names as dimension.

  • dim (str) – The dimension name (default: ‘variable’).

Returns

A dataset with the variable dimension in da exploded to variables.

Return type

xarray.Dataset

nd.utils.get_dims(ds)[source]

Return the dimensions of dataset ds in order.

nd.utils.get_shape(ds)[source]
nd.utils.get_vars_for_dims(ds, dims, invert=False)[source]

Return a list of all variables in ds which have dimensions dims.

Parameters
  • ds (xarray.Dataset) –

  • dims (list of str) – The dimensions that each variable must contain.

  • invert (bool, optional) – Whether to return the variables that do not contain the given dimensions (default: False).

Returns

A list of all variable names that have dimensions dims.

Return type

list of str

nd.utils.is_complex(ds)[source]

Check if a dataset contains any complex variables.

Parameters

ds (xarray.Dataset or xarray.DataArray) –

Returns

True if ds contains any complex variables, False otherwise.

Return type

bool

nd.utils.parallel(fn, dim=None, chunks=None, chunksize=None, merge=True, buffer=0)[source]

Parallelize a function that takes an xarray dataset as first argument.

TODO: make accept numpy arrays as well.

Parameters
  • fn (function) – Must take an xarray.Dataset as first argument.

  • dim (str, optional) – The dimension along which to split the dataset for parallel execution. If not passed, try ‘y’ as default dimension.

  • chunks (int, optional) – The number of chunks to execute in parallel. If not passed, use the number of available CPUs.

  • chunksize (int, optional) – … to be implemented

  • buffer (int, optional) – (default: 0)

Returns

A parallelized function that may be called with exactly the same arguments as fn.

Return type

function

nd.utils.select(objects, fn, unlist=True, first=False)[source]

Return the subset of objects that match the given criterion.

Parameters
  • objects (list of obj) – The collection of objects to filter.

  • fn (lambda expression) – Filter objects by whether fn(obj) returns True.

  • first (bool, optional) – If True, return first entry only (default: False).

  • unlist (bool, optional) – If True and the result has length 1 and objects is a list, return the object directly, rather than the list (default: True).

Returns

A list of all items in objects that match the specified criteria.

Return type

list

Examples

>>> select([{'a': 1, 'b': 2}, {'a': 2, 'b': 2}, {'a': 1, 'b': 1}],
            lambda o: o['a'] == 1)
[{'a': 1, 'b': 2}, {'a': 1, 'b': 1}]
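
A sketch of the documented behaviour, including the unlist and first options (select_sketch is a hypothetical name; the actual nd.utils implementation may differ in edge cases):

```python
def select_sketch(objects, fn, unlist=True, first=False):
    """Filter `objects` by the predicate fn."""
    matches = [obj for obj in objects if fn(obj)]
    if first:
        # Return only the first match (None if nothing matched).
        return matches[0] if matches else None
    if unlist and len(matches) == 1:
        # A single match is returned directly rather than as a list.
        return matches[0]
    return matches
```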
nd.utils.str2date(string, fmt=None, tz=False)[source]
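
str2date carries no description here. From its signature alone, a hedged guess at the behaviour: fmt is presumably a strptime format string, and tz presumably controls timezone attachment. Everything below (the name, the fallback format list, UTC as the attached timezone) is an assumption for illustration, not the library's actual logic:

```python
from datetime import datetime, timezone

def str2date_sketch(string, fmt=None, tz=False):
    """Parse a date string into a datetime (hypothetical sketch)."""
    if fmt is not None:
        date = datetime.strptime(string, fmt)
    else:
        # Try a few common formats when none is given (an assumption).
        for guess in ('%Y-%m-%d', '%Y%m%d', '%Y-%m-%dT%H:%M:%S'):
            try:
                date = datetime.strptime(string, guess)
                break
            except ValueError:
                continue
        else:
            raise ValueError('Could not parse date: %r' % string)
    if tz:
        # Assumed behaviour: attach UTC when tz is requested.
        date = date.replace(tzinfo=timezone.utc)
    return date
```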
nd.utils.xr_merge(ds_list, dim, buffer=0)[source]

Reverse xr_split().

Parameters
  • ds_list (list of xarray.Dataset) –

  • dim (str) – The dimension along which to concatenate.

Returns

The merged dataset.

Return type

xarray.Dataset

nd.utils.xr_split(ds, dim, chunks, buffer=0)[source]

Split an xarray Dataset into chunks.

Parameters
  • ds (xarray.Dataset) – The original dataset

  • dim (str) – The dimension along which to split.

  • chunks (int) – The number of chunks to generate.

Yields

xarray.Dataset – An individual chunk.