hyperspy._signals.lazy module

class hyperspy._signals.lazy.LazySignal(data, **kwds)

Bases: hyperspy.signal.BaseSignal

A lazy signal that defers computation until the result is explicitly saved, on the assumption that holding the full result of the computation in memory is not feasible.

change_dtype(dtype, rechunk=True)

Change the data type.

Parameters
  • dtype (str or dtype) – Typecode or data type to which the array is cast. In addition to all standard numpy dtypes, HyperSpy supports four extra dtypes for RGB images: “rgb8”, “rgba8”, “rgb16” and “rgba16”. Conversions to and from an rgbx dtype are more constrained than most other dtype conversions. To change to an rgbx dtype, the signal dimension must be 1, its size must be 3 (4) for rgb (rgba) dtypes, the dtype must be uint8 (uint16) for rgbx8 (rgbx16), and the navigation dimension must be at least 2. After conversion the signal dimension becomes 2. Images of dtype rgbx8 (rgbx16) can only be changed to uint8 (uint16), after which the signal dimension becomes 1.

  • rechunk (bool) – Only has an effect when operating on a lazy signal. If True (default), the data may be automatically rechunked before performing this operation.
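
The rgbx constraints are easier to see on a plain array. The following is a minimal sketch, not HyperSpy's implementation: it assumes an rgb8 pixel can be modelled as a NumPy structured dtype with three uint8 fields (the field names R, G and B are chosen here for illustration) and shows why a trailing axis of size 3 disappears on conversion and reappears when converting back to uint8.

```python
import numpy as np

# Assumed stand-in for an "rgb8" pixel: three packed uint8 channels.
rgb8 = np.dtype([("R", np.uint8), ("G", np.uint8), ("B", np.uint8)])

# Shape (4, 5, 3): a 4x5 grid of positions, each holding 3 uint8 values.
data = np.zeros((4, 5, 3), dtype=np.uint8)
data[..., 0] = 255  # pure red everywhere

# Reinterpret each run of 3 bytes as one rgb8 pixel; the size-3 axis
# collapses to size 1 and is then dropped.
packed = np.ascontiguousarray(data).view(rgb8)[..., 0]
print(packed.shape, packed.dtype.names)

# Going back only makes sense for uint8, which restores the size-3 axis.
restored = packed.view(np.uint8).reshape(packed.shape + (3,))
print(np.array_equal(restored, data))
```

The same byte-reinterpretation argument explains why rgbx16 data must start from uint16 and why rgba variants need a trailing axis of size 4.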

Examples

>>> s = hs.signals.Signal1D([1,2,3,4,5])
>>> s.data
array([1, 2, 3, 4, 5])
>>> s.change_dtype('float')
>>> s.data
array([ 1.,  2.,  3.,  4.,  5.])

close_file()

Closes the associated data file if any.

Currently it only supports closing the file associated with a dask array created from an h5py DataSet (default HyperSpy hdf5 reader).

compute(progressbar=True, close_file=False)

Attempt to store the full signal in memory.

Parameters
  • close_file (bool) – If True, attempt to close the file associated with the dask array data, if any. Note that closing the file makes all other associated lazy signals inoperative.

decomposition(normalize_poissonian_noise=False, algorithm='svd', output_dimension=None, signal_mask=None, navigation_mask=None, get=<function get>, num_chunks=None, reproject=True, bounds=False, **kwargs)

Perform Incremental (Batch) decomposition on the data, keeping n significant components.

Parameters
  • normalize_poissonian_noise (bool) – If True, scale the SI to normalize Poissonian noise

  • algorithm (str) – One of (‘svd’, ‘PCA’, ‘ORPCA’, ‘ONMF’). By default ‘svd’, lazy SVD decomposition from dask.

  • output_dimension (int) – the number of significant components to keep. If None, keep all (only valid for SVD)

  • get (dask scheduler) – the dask scheduler to use for computations; default dask.threaded.get

  • num_chunks (int) – the number of dask chunks to pass to the decomposition model. More chunks require more memory, but should run faster. Will be increased to contain at least output_dimension signals.

  • navigation_mask ({BaseSignal, numpy array, dask array}) – The navigation locations marked as True are not used in the decomposition.

  • signal_mask ({BaseSignal, numpy array, dask array}) – The signal locations marked as True are not used in the decomposition.

  • reproject (bool) – Reproject data on the learnt components (factors) after learning.

  • **kwargs – passed to the partial_fit/fit functions.

Notes

Various algorithm parameters and their default values:
ONMF:

lambda1=1, kappa=1, robust=False, store_r=False, batch_size=None

ORPCA:

fast=True, lambda1=None, lambda2=None, method=None, learning_rate=None, init=None, training_samples=None, momentum=None

PCA:

batch_size=None, copy=True, white=False
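
To make the mask and output_dimension semantics concrete, here is a minimal NumPy sketch. It is not the incremental dask-based implementation described above: it assumes the data can be unfolded into a (navigation, signal) matrix in memory, uses a plain SVD, and all variable names are illustrative. Locations marked True in either mask are excluded before the decomposition.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((6, 5, 20))            # (nav_y, nav_x | signal channels)

navigation_mask = np.zeros((6, 5), dtype=bool)
navigation_mask[0, :] = True             # True = do not use these positions
signal_mask = np.zeros(20, dtype=bool)
signal_mask[:2] = True                   # True = do not use these channels

# Unfold to (positions, channels) and drop the masked rows and columns.
matrix = data.reshape(-1, data.shape[-1])
unfolded = matrix[~navigation_mask.ravel()][:, ~signal_mask]

# Plain SVD, keeping only the first `output_dimension` components.
output_dimension = 3
u, s, vt = np.linalg.svd(unfolded, full_matrices=False)
loadings = u[:, :output_dimension] * s[:output_dimension]
factors = vt[:output_dimension]

print(unfolded.shape, loadings.shape, factors.shape)
```

Multiplying `loadings @ factors` gives the rank-limited approximation of the unmasked data; the masked positions and channels simply never enter the model.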

diff(axis, order=1, out=None, rechunk=True)

Returns a signal with the n-th order discrete difference along given axis.

Parameters
  • axis (int, str or axis) – The axis can be passed directly, or specified using the index of the axis in axes_manager or the axis name.

  • order (int) – the order of the derivative

  • out (Signal or None) – If None, a new Signal is created with the result of the operation and returned (default). If a Signal is passed, it is used to receive the output of the operation, and nothing is returned.

  • rechunk (bool) – Only has an effect when operating on a lazy signal. If True (default), the data may be automatically rechunked before performing this operation.
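
As an illustration of what the order parameter does to the data, the sketch below uses numpy.diff directly on an array. It assumes the method follows the same n-th order difference semantics; it does not touch axes calibration or lazy evaluation.

```python
import numpy as np

# Squares of consecutive integers along the last axis.
data = np.arange(24, dtype=float).reshape(2, 3, 4) ** 2

first = np.diff(data, n=1, axis=-1)   # one element shorter along the axis
second = np.diff(data, n=2, axis=-1)  # two elements shorter along the axis

print(first.shape, second.shape)
# The second difference of consecutive squares is constant.
print(np.unique(second))
```

Each increment of order removes one more element along the chosen axis, so an axis of length N ends up with length N - order.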

See also

max(), min(), sum(), mean(), std(), var(), indexmax(), valuemax(), amax()

Examples

>>> import numpy as np
>>> s = hs.signals.BaseSignal(np.random.random((64, 64, 1024)))
>>> s.data.shape
(64, 64, 1024)
>>> s.diff(-1).data.shape
(64, 64, 1023)

get_histogram(bins='freedman', out=None, rechunk=True, **kwargs)

Return a histogram of the signal data.

More sophisticated algorithms for determining bins can be used. Apart from the bins argument, which also accepts a string specifying how the bins are computed, the parameters are the same as for numpy.histogram().

Parameters
  • bins (int or list or str, optional) – If bins is a string, then it must be one of:

    ‘knuth’ : use Knuth’s rule to determine bins

    ‘scotts’ : use Scott’s rule to determine bins

    ‘freedman’ : use the Freedman-Diaconis rule to determine bins

    ‘blocks’ : use Bayesian blocks for dynamic bin widths

  • range_bins (tuple or None, optional) – the minimum and maximum range for the histogram. If not specified, it will be (x.min(), x.max())

  • out (Signal or None) – If None, a new Signal is created with the result of the operation and returned (default). If a Signal is passed, it is used to receive the output of the operation, and nothing is returned.

  • rechunk (bool) – Only has an effect when operating on a lazy signal. If True (default), the data may be automatically rechunked before performing this operation.

  • **kwargs – other keyword arguments (weights and density) are described in numpy.histogram().

Returns

hist_spec

Return type

A 1D spectrum instance containing the histogram.

See also

print_summary_statistics(), astroML.density_estimation.histogram(), numpy.histogram()

Notes

The lazy version of the algorithm does not support the ‘knuth’ and ‘blocks’ values of the bins argument. The estimators for the number of bins are taken from AstroML. Read their documentation for more information.
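
For reference, the Freedman-Diaconis rule sets the bin width to 2 * IQR / n^(1/3), where IQR is the interquartile range and n the number of samples. The sketch below applies that formula with plain NumPy; it is an illustration of the rule only, and the exact rounding used by HyperSpy or AstroML may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=1000)

# Freedman-Diaconis bin width: 2 * IQR / n ** (1/3).
q75, q25 = np.percentile(data, [75, 25])
width = 2.0 * (q75 - q25) / data.size ** (1.0 / 3.0)

# Number of equal-width bins needed to span the data range.
n_bins = int(np.ceil((data.max() - data.min()) / width))

counts, edges = np.histogram(data, bins=n_bins)
print(n_bins, counts.sum())
```

Because the rule only needs two percentiles and the sample count, it is cheap to evaluate compared with the iterative ‘knuth’ and ‘blocks’ estimators.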

Examples

>>> s = hs.signals.Signal1D(np.random.normal(size=(10, 100)))
>>> # Plot the data histogram
>>> s.get_histogram().plot()
>>> # Plot the histogram of the signal at the current coordinates
>>> s.get_current_signal().get_histogram().plot()

integrate_simpson(axis, out=None)

Returns a signal with the result of calculating the integral of the signal along an axis using Simpson’s rule.

Parameters
  • axis (int, str or axis) – The axis can be passed directly, or specified using the index of the axis in axes_manager or the axis name.

  • out (Signal or None) – If None, a new Signal is created with the result of the operation and returned (default). If a Signal is passed, it is used to receive the output of the operation, and nothing is returned.
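
Simpson's rule approximates the integral by fitting parabolas through consecutive triples of samples. The sketch below is a hand-written composite Simpson's rule along the last axis, assuming uniformly spaced samples and an odd number of points; the helper name `simpson_last_axis` is hypothetical and this is not necessarily how HyperSpy computes the result.

```python
import numpy as np

def simpson_last_axis(y, dx):
    """Composite Simpson's rule along the last axis (odd sample count)."""
    if y.shape[-1] % 2 == 0:
        raise ValueError("this sketch needs an odd number of samples")
    # Weights 1, 4, 2, 4, ..., 2, 4, 1, scaled by dx / 3.
    weights = np.ones(y.shape[-1])
    weights[1:-1:2] = 4.0
    weights[2:-1:2] = 2.0
    return (y * weights).sum(axis=-1) * dx / 3.0

x = np.linspace(0.0, 1.0, 101)
y = np.stack([x ** 2, x ** 3])  # two "spectra" with 101 channels each

result = simpson_last_axis(y, dx=x[1] - x[0])
print(result.shape)
print(result)  # close to the exact integrals 1/3 and 1/4
```

The integrated axis is removed from the result, which matches the shape change from (64, 64, 1024) to (64, 64) for a three-dimensional array integrated along its last axis.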

Returns

s

Return type

Signal

See also

max(), min(), sum(), mean(), std(), var(), indexmax(), valuemax(), amax()

Examples

>>> import numpy as np
>>> s = hs.signals.BaseSignal(np.random.random((64, 64, 1024)))
>>> s.data.shape
(64, 64, 1024)
>>> s.integrate_simpson(-1).data.shape
(64, 64)

rebin(new_shape=None, scale=None, crop=False, out=None, rechunk=True)

Rebin array.

Rebin the signal into a smaller or larger shape, based on linear interpolation. Specify either new_shape or scale.

Parameters
  • new_shape (a list of floats or integer, default None) – For each dimension specify the new_shape. This will then be converted into a scale.

  • scale (a list of floats or integer, default None) – For each dimension specify the new:old pixel ratio, e.g. a ratio of 1 is no binning and a ratio of 2 means that each pixel in the new spectrum is twice the size of the pixels in the old spectrum. The length of the list should match the dimension of the numpy array. Note: only one of scale or new_shape should be specified; otherwise the function will not run.

  • crop (bool, default True) –

    When binning by a non-integer number of pixels it is likely that the final row in each dimension contains less than the full quota to fill one pixel.

    For example, a 5*5 array binned by 2.1 will produce two rows containing 2.1 pixels and one row containing only 0.8 pixels worth. Setting crop=True or crop=False determines whether this ‘black’ line is cropped from the final binned array.

    Please note that if crop=False is used, the final row in each dimension may appear black, if a fractional number of pixels are left over. It can be removed but has been left to preserve total counts before and after binning.

  • out (Signal or None) – If None, a new Signal is created with the result of the operation and returned (default). If a Signal is passed, it is used to receive the output of the operation, and nothing is returned.
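
The count-preserving behaviour is easiest to see for integer scales. The sketch below covers only that special case: the helper `rebin_integer` is hypothetical, works on a plain NumPy array, and does not handle non-integer scales, cropping or axes calibration.

```python
import numpy as np

def rebin_integer(data, scale):
    """Sum blocks of scale[i] pixels along each axis (integer scales only)."""
    if any(n % s for n, s in zip(data.shape, scale)):
        raise ValueError("each axis length must be divisible by its scale")
    # Split every axis into (new_length, scale) and sum over the scale part.
    split_shape = []
    for n, s in zip(data.shape, scale):
        split_shape.extend([n // s, s])
    summed_axes = tuple(range(1, 2 * data.ndim, 2))
    return data.reshape(split_shape).sum(axis=summed_axes)

data = np.ones((4, 4, 10))
data[1, 2, 9] = 5

binned = rebin_integer(data, scale=(2, 2, 5))
print(binned.shape, data.sum(), binned.sum())
```

Every input pixel contributes to exactly one output pixel, so the total is unchanged while the shape shrinks by the scale along each axis.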

Returns

s

Return type

Signal subclass

Examples

>>> spectrum = hs.signals.EDSTEMSpectrum(np.ones([4, 4, 10]))
>>> spectrum.data[1, 2, 9] = 5
>>> print(spectrum)
<EDSTEMSpectrum, title: , dimensions: (4, 4|10)>
>>> print('Sum = ', spectrum.data.sum())
Sum =  164.0
>>> scale = [2, 2, 5]
>>> test = spectrum.rebin(scale=scale)
>>> print(test)
<EDSTEMSpectrum, title: , dimensions: (2, 2|2)>
>>> print('Sum = ', test.data.sum())
Sum =  164.0

valuemax(axis, out=None, rechunk=True)

Returns a signal with the axis coordinate values at which the maximum occurs along an axis.

Parameters
  • axis (int, str or axis) – The axis can be passed directly, or specified using the index of the axis in axes_manager or the axis name.

  • out (Signal or None) – If None, a new Signal is created with the result of the operation and returned (default). If a Signal is passed, it is used to receive the output of the operation, and nothing is returned.

  • rechunk (bool) – Only has an effect when operating on a lazy signal. If True (default), the data may be automatically rechunked before performing this operation.
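
The difference between the index of the maximum and the coordinate value of the maximum can be sketched with plain NumPy. The example assumes a linearly calibrated axis, where coordinate = offset + scale * index; the offset and scale values are hypothetical.

```python
import numpy as np

offset, scale = 100.0, 0.5  # hypothetical linear axis calibration

data = np.zeros((2, 3, 8))
data[..., 6] = 1.0          # maximum at index 6 almost everywhere
data[0, 0, 2] = 2.0         # except here, where index 2 wins

index = data.argmax(axis=-1)      # position of the maximum, in pixels
value = offset + scale * index    # same position, in axis units

print(index[0, 0], value[0, 0])
print(index[1, 2], value[1, 2])
```

The result has the shape of the input without the reduced axis, here (2, 3).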

Returns

s

Return type

Signal

See also

max(), min(), sum(), mean(), std(), var(), indexmax(), amax()

Examples

>>> import numpy as np
>>> s = hs.signals.BaseSignal(np.random.random((64, 64, 1024)))
>>> s.data.shape
(64, 64, 1024)
>>> s.valuemax(-1).data.shape
(64, 64)

valuemin(axis, out=None, rechunk=True)

Returns a signal with the axis coordinate values at which the minimum occurs along an axis.

Parameters
  • axis (int, str or axis) – The axis can be passed directly, or specified using the index of the axis in axes_manager or the axis name.

  • out (Signal or None) – If None, a new Signal is created with the result of the operation and returned (default). If a Signal is passed, it is used to receive the output of the operation, and nothing is returned.

  • rechunk (bool) – Only has an effect when operating on a lazy signal. If True (default), the data may be automatically rechunked before performing this operation.

Returns

s

Return type

Signal

See also

max(), min(), sum(), mean(), std(), var(), indexmax(), amax()

hyperspy._signals.lazy.lazyerror = NotImplementedError('This method is not available in lazy signals')

hyperspy._signals.lazy.to_array(thing, chunks=None)

Accepts a BaseSignal, dask array or numpy array and always returns either a numpy array or a dask array.

Parameters
  • thing ({BaseSignal, dask.array.Array, numpy.ndarray}) – the thing to be converted

  • chunks ({None, tuple of tuples}) – If None, the returned value is a numpy array. Otherwise returns dask array with the chunks as specified.

Returns

res

Return type

{numpy.ndarray, dask.array.Array}