nd.utils package
This module provides several helper functions.
nd.utils.apply(ds, fn, signature=None, njobs=1)
Apply a function that operates on a defined subset of dimensions to a Dataset.
- Parameters
ds (xr.Dataset or xr.DataArray) – The dataset to which to apply the function.
fn (function) – The function to apply to the Dataset.
signature (str, optional) – The signature of the function in dimension names, e.g. ‘(time,var)->(time)’. If ‘var’ is included, the Dataset variables will be converted into a new dimension and the result will be a DataArray.
njobs (int, optional) – The number of jobs to run in parallel.
- Returns
The output dataset with changed dimensions according to the function signature.
- Return type
xr.Dataset or xr.DataArray
nd.utils.array_chunks(array, n, axis=0, return_indices=False)
Chunk an array along the given axis.
- Parameters
array (numpy.array) – The array to be chunked.
n (int) – The chunksize.
axis (int, optional) – The axis along which to split the array into chunks (default: 0).
return_indices (bool, optional) – If True, yield the index that extracts each chunk from the array rather than the chunk itself (default: False).
- Yields
iterable – Consecutive slices of array of size n.
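As a rough illustration of the yielded slices, here is a minimal numpy-only sketch; array_chunks_sketch is a hypothetical stand-in, not the library function, and omits the return_indices option:

```python
import numpy as np

def array_chunks_sketch(array, n, axis=0):
    """Yield consecutive slices of size n along the given axis."""
    for start in range(0, array.shape[axis], n):
        # Build an index tuple that slices [start:start+n] on `axis`
        # and takes everything on all other axes.
        idx = [slice(None)] * array.ndim
        idx[axis] = slice(start, start + n)
        yield array[tuple(idx)]

a = np.arange(12).reshape(3, 4)
cols = list(array_chunks_sketch(a, 2, axis=1))  # two chunks of shape (3, 2)
```

Concatenating the chunks back along the same axis reproduces the original array.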
nd.utils.block_merge(array_list, blocks)
Reassemble a list of arrays as generated by block_split().
- Parameters
array_list (list of numpy.array) – A list of numpy.array, e.g. as generated by block_split().
blocks (array_like) – The number of blocks per axis to be merged.
- Returns
The merged array, with len(blocks) dimensions.
- Return type
numpy.array
nd.utils.block_split(array, blocks)
Split an ndarray into subarrays according to blocks.
- Parameters
array (numpy.ndarray) – The array to be split.
blocks (array_like) – The desired number of blocks per axis.
- Returns
A list of blocks, in row-major order.
- Return type
list
Examples
>>> block_split(np.arange(16).reshape((4, 4)), (2, 2))
[array([[0, 1],
        [4, 5]]),
 array([[2, 3],
        [6, 7]]),
 array([[ 8,  9],
        [12, 13]]),
 array([[10, 11],
        [14, 15]])]
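The split/merge round trip can be sketched with plain numpy. block_split_sketch and block_merge_sketch below are hypothetical reimplementations for illustration, assuming each axis divides evenly into the requested number of blocks:

```python
import numpy as np

def block_split_sketch(array, blocks):
    """Split `array` into blocks along each axis, in row-major order."""
    result = [array]
    for axis, n in enumerate(blocks):
        # Split every block produced so far n ways along the current axis.
        result = [sub for a in result for sub in np.split(a, n, axis=axis)]
    return result

def block_merge_sketch(array_list, blocks):
    """Inverse of block_split_sketch: reassemble the blocks into one array."""
    arrays = array_list
    for axis in reversed(range(len(blocks))):
        # Concatenate groups of blocks, starting with the last split axis.
        n = blocks[axis]
        arrays = [np.concatenate(arrays[i:i + n], axis=axis)
                  for i in range(0, len(arrays), n)]
    return arrays[0]

a = np.arange(16).reshape(4, 4)
parts = block_split_sketch(a, (2, 2))
restored = block_merge_sketch(parts, (2, 2))
```

Merging the blocks in the same row-major order recovers the original array, which is the relationship the two library functions are documented to have.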
nd.utils.chunks(l, n)
Yield successive n-sized chunks from l.
https://stackoverflow.com/a/312464
- Parameters
l (iterable) – The list or list-like object to be split into chunks.
n (int) – The size of the chunks to be generated.
- Yields
iterable – Consecutive slices of l of size n.
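The linked Stack Overflow recipe amounts to a few lines; chunks_sketch is an illustrative copy of that pattern (note the last chunk may be shorter than n if len(l) is not a multiple of n):

```python
def chunks_sketch(l, n):
    """Yield successive n-sized chunks from l; the last may be shorter."""
    for i in range(0, len(l), n):
        yield l[i:i + n]

parts = list(chunks_sketch([1, 2, 3, 4, 5], 2))  # [[1, 2], [3, 4], [5]]
```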
nd.utils.expand_variables(da, dim='variable')
This is the inverse of xarray.Dataset.to_array().
- Parameters
da (xarray.DataArray) – A DataArray that contains the variable names as a dimension.
dim (str) – The dimension name (default: ‘variable’).
- Returns
A Dataset in which the variable dimension of da has been expanded into separate variables.
- Return type
xarray.Dataset
nd.utils.get_vars_for_dims(ds, dims, invert=False)
Return a list of all variables in ds which have dimensions dims.
- Parameters
ds (xarray.Dataset) – The dataset whose variables to filter.
dims (list of str) – The dimensions that each variable must contain.
invert (bool, optional) – Whether to return the variables that do not contain the given dimensions (default: False).
- Returns
A list of all variable names that have dimensions dims.
- Return type
list of str
nd.utils.is_complex(ds)
Check if a dataset contains any complex variables.
- Parameters
ds (xarray.Dataset or xarray.DataArray) – The dataset to check.
- Returns
True if ds contains any complex variables, False otherwise.
- Return type
bool
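The underlying check boils down to inspecting dtypes. Here is a minimal sketch on plain numpy arrays; is_complex_sketch is hypothetical and stands in for iterating over a Dataset's variables:

```python
import numpy as np

def is_complex_sketch(arrays):
    """Return True if any of the given arrays has a complex dtype."""
    return any(np.iscomplexobj(a) for a in arrays)

has_complex = is_complex_sketch([np.zeros(3), np.zeros(3, dtype=complex)])
all_real = is_complex_sketch([np.zeros(3), np.ones(3)])
```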
nd.utils.parallel(fn, dim=None, chunks=None, chunksize=None, merge=True, buffer=0, compute=True)
Parallelize a function that takes an xarray dataset as its first argument.
TODO: also accept numpy arrays.
- Parameters
fn (function) – Must take an xarray.Dataset as first argument.
dim (str, optional) – The dimension along which to split the dataset for parallel execution. If not passed, defaults to ‘y’.
chunks (int, optional) – The number of chunks to execute in parallel. If not passed, use the number of available CPUs.
chunksize (int, optional) – … to be implemented
buffer (int, optional) – (default: 0)
compute (bool, optional) – If True, return the computed result. Otherwise, return the dask computation object (default: True).
- Returns
A parallelized function that may be called with exactly the same arguments as fn.
- Return type
function
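To illustrate the split–apply–merge idea without xarray, here is a hedged numpy-only sketch using the standard library's concurrent.futures. parallel_sketch is hypothetical: it ignores buffer and compute, and splits along a plain array axis rather than a named dimension:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_sketch(fn, axis=0, chunks=4):
    """Return a function that applies fn to chunks of an array in
    parallel and concatenates the results along `axis`."""
    def wrapper(array, *args, **kwargs):
        # Split into roughly equal chunks, run fn on each in a thread
        # pool, then stitch the results back together.
        parts = np.array_split(array, chunks, axis=axis)
        with ThreadPoolExecutor(max_workers=chunks) as pool:
            results = list(pool.map(lambda p: fn(p, *args, **kwargs), parts))
        return np.concatenate(results, axis=axis)
    return wrapper

double = parallel_sketch(lambda a: a * 2, axis=0, chunks=2)
```

As with the library function, the wrapper is called with exactly the same arguments as fn itself.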
nd.utils.select(objects, fn, unlist=True, first=False)
Return a subset of objects that match the given criteria.
- Parameters
objects (list of obj) – The collection of objects to filter.
fn (lambda expression) – Filter objects by whether fn(obj) returns True.
first (bool, optional) – If True, return first entry only (default: False).
unlist (bool, optional) – If True and the result has length 1 and objects is a list, return the object directly, rather than the list (default: True).
- Returns
A list of all items in objects that match the specified criteria.
- Return type
list
Examples
>>> select([{'a': 1, 'b': 2}, {'a': 2, 'b': 2}, {'a': 1, 'b': 1}], lambda o: o['a'] == 1)
[{'a': 1, 'b': 2}, {'a': 1, 'b': 1}]
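The documented semantics can be sketched in a few lines. select_sketch is a hypothetical reimplementation, and the behaviour of first=True with no matches (returning None) is an assumption not stated above:

```python
def select_sketch(objects, fn, unlist=True, first=False):
    """Filter objects by fn, mirroring the unlist/first behaviour above."""
    matches = [obj for obj in objects if fn(obj)]
    if first:
        # Assumption: return None when nothing matches.
        return matches[0] if matches else None
    if unlist and len(matches) == 1 and isinstance(objects, list):
        # A single match from a list input is returned unwrapped.
        return matches[0]
    return matches

objs = [{'a': 1, 'b': 2}, {'a': 2, 'b': 2}, {'a': 1, 'b': 1}]
result = select_sketch(objs, lambda o: o['a'] == 1)
```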