hyperspy._signals.lazy module
- class hyperspy._signals.lazy.LazySignal(data, **kwds)
Bases: hyperspy.signal.BaseSignal
A lazy Signal instance that delays computation until it is explicitly saved (on the assumption that storing the full result of the computation in memory is not feasible).
Create a Signal from a numpy array.
- Parameters
data (numpy.ndarray) – The signal data. It can be an array of any dimensions.
axes (dict, optional) – Dictionary to define the axes (see the documentation of the AxesManager class for more details).
attributes (dict, optional) – A dictionary whose items are stored as attributes.
metadata (dict, optional) – A dictionary containing a set of parameters that will be stored in the metadata attribute. Some parameters might be mandatory in some cases.
original_metadata (dict, optional) – A dictionary containing a set of parameters that will be stored in the original_metadata attribute. It typically contains all the parameters that have been imported from the original data file.
- change_dtype(dtype, rechunk=True)
Change the data type of a Signal.
- Parameters
dtype (str or numpy.dtype) – Typecode string or data-type to which the Signal’s data array is cast. In addition to all the standard numpy data type objects (dtype), HyperSpy supports four extra dtypes for RGB images: 'rgb8', 'rgba8', 'rgb16', and 'rgba16'. Changing from and to any rgb(a) dtype is more constrained than most other dtype conversions. To change to an rgb(a) dtype, the signal_dimension must be 1, and its size should be 3 (for rgb) or 4 (for rgba) dtypes. The original dtype should be uint8 or uint16 if converting to rgb(a)8 or rgb(a)16 respectively, and the navigation_dimension should be at least 2. After conversion, the signal_dimension becomes 2. The dtype of images with original dtype rgb(a)8 or rgb(a)16 can only be changed to uint8 or uint16, and the signal_dimension becomes 1.
rechunk (bool) – Only has effect when operating on a lazy signal. If True (default), the data may be automatically rechunked before performing this operation.
Examples
>>> import hyperspy.api as hs
>>> s = hs.signals.Signal1D([1, 2, 3, 4, 5])
>>> s.data
array([1, 2, 3, 4, 5])
>>> s.change_dtype('float')
>>> s.data
array([1., 2., 3., 4., 5.])
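The rgb(a) constraints described above are easier to see on a plain array. The following is a NumPy-only sketch, not HyperSpy's implementation: it assumes an 'rgb8' pixel can be modelled as a structured dtype with three uint8 fields (the field names R, G and B are chosen here purely for illustration). It shows why a trailing axis of size 3 is absorbed into the dtype, leaving a two-dimensional image, and why converting back restores that axis.

```python
import numpy as np

# Hypothetical structured dtype standing in for 'rgb8': three uint8 channels.
rgb8 = np.dtype([("R", np.uint8), ("G", np.uint8), ("B", np.uint8)])

# uint8 data with two leading axes and a trailing axis of size 3 (one value per channel).
data = np.arange(4 * 5 * 3, dtype=np.uint8).reshape(4, 5, 3)

# Reinterpret each group of 3 bytes as one RGB pixel; the trailing axis collapses to size 1.
rgb = np.ascontiguousarray(data).view(rgb8)[..., 0]

# Converting back exposes the channels as a trailing axis of size 3 again.
restored = rgb.view(np.uint8).reshape(rgb.shape + (3,))

print(rgb.shape, restored.shape)
```

Because the view only works when every pixel occupies exactly 3 (or 4) bytes of the right width, the sketch also suggests why the source dtype and the size of the signal axis are restricted.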
- close_file()
Closes the associated data file, if any.
Currently it only supports closing the file associated with a dask array created from an h5py Dataset (default HyperSpy hdf5 reader).
- compute(progressbar=True, close_file=False)
Attempt to store the full signal in memory.
- Parameters
close_file (bool) – If True, attempt to close the file associated with the dask array data, if any. Note that closing the file will make all other associated lazy signals inoperative.
- decomposition(normalize_poissonian_noise=False, algorithm='svd', output_dimension=None, signal_mask=None, navigation_mask=None, get=<function get>, num_chunks=None, reproject=True, bounds=False, **kwargs)
Perform Incremental (Batch) decomposition on the data, keeping n significant components.
- Parameters
normalize_poissonian_noise (bool) – If True, scale the SI to normalize Poissonian noise.
algorithm (str) – One of ('svd', 'PCA', 'ORPCA', 'ONMF'). By default 'svd', the lazy SVD decomposition from dask.
output_dimension (int) – The number of significant components to keep. If None, keep all (only valid for SVD).
get (dask scheduler) – The dask scheduler to use for computations; default dask.threaded.get.
num_chunks (int) – The number of dask chunks to pass to the decomposition model. More chunks require more memory, but should run faster. Will be increased to contain at least output_dimension signals.
navigation_mask ({BaseSignal, numpy array, dask array}) – The navigation locations marked as True are not used in the decomposition.
signal_mask ({BaseSignal, numpy array, dask array}) – The signal locations marked as True are not used in the decomposition.
reproject (bool) – Reproject data on the learnt components (factors) after learning.
**kwargs – Passed to the partial_fit/fit functions.
Notes
- Various algorithm parameters and their default values:
- ONMF:
lambda1=1, kappa=1, robust=False, store_r=False, batch_size=None
- ORPCA:
fast=True, lambda1=None, lambda2=None, method=None, learning_rate=None, init=None, training_samples=None, momentum=None
- PCA:
batch_size=None, copy=True, white=False
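To make the roles of output_dimension, navigation_mask and reproject more concrete, here is a minimal NumPy sketch of the underlying idea. It is an illustration under stated assumptions, not the lazy implementation: the data is already unfolded to (navigation, signal), a single in-memory SVD replaces the incremental, chunked algorithms, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unfolded dataset: 50 navigation positions x 20 signal channels,
# built from exactly 2 underlying components so that 2 components suffice.
loadings_true = rng.random((50, 2))
factors_true = rng.random((2, 20))
data = loadings_true @ factors_true

# True marks navigation positions that are NOT used to learn the components.
navigation_mask = np.zeros(50, dtype=bool)
navigation_mask[:5] = True

output_dimension = 2

# Learn the factors from the unmasked positions only.
_, singular_values, vt = np.linalg.svd(data[~navigation_mask], full_matrices=False)
factors = vt[:output_dimension]            # shape (2, 20), orthonormal rows

# Reproject: compute loadings for every position, including the masked ones.
loadings = data @ factors.T                # shape (50, 2)
reconstruction = loadings @ factors

print(np.allclose(reconstruction, data))
```

The masked positions do not influence the learnt factors, yet they still receive loadings in the reprojection step, which is the behaviour the reproject parameter describes.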
- diff(axis, order=1, out=None, rechunk=True)
Returns a signal with the n-th order discrete difference along the given axis, i.e. it calculates the difference between consecutive values along that axis: out[n] = a[n+1] - a[n]. See numpy.diff() for more details.
- Parameters
axis (int, str, or DataAxis) – The axis can be passed directly, or specified using the index of the axis in the Signal’s axes_manager or the axis name.
order (int) – The order of the discrete difference.
out (BaseSignal (or subclasses) or None) – If None, a new Signal is created with the result of the operation and returned (default). If a Signal is passed, it is used to receive the output of the operation, and nothing is returned.
rechunk (bool) – Only has effect when operating on a lazy signal. If True (default), the data may be automatically rechunked before performing this operation.
- Returns
s – Note that the size of the data on the given axis decreases by the given order, i.e. if axis is "x", order is 2 and the x dimension of the signal is N, the x dimension of the result is N - 2.
- Return type
BaseSignal (or subclasses) or None
See also
derivative(), integrate1D(), integrate_simpson()
Examples
>>> import numpy as np
>>> from hyperspy.signals import BaseSignal
>>> s = BaseSignal(np.random.random((64, 64, 1024)))
>>> s.data.shape
(64, 64, 1024)
>>> s.diff(-1).data.shape
(64, 64, 1023)
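Since the method defers to numpy.diff() for the details, the effect of order on the axis size can be checked on a small plain array. This sketch uses NumPy only and is meant as an illustration of the arithmetic, not of the lazy code path.

```python
import numpy as np

a = np.array([[1.0, 4.0, 9.0, 16.0, 25.0],
              [0.0, 1.0, 3.0, 6.0, 10.0]])

# First order: out[n] = a[n+1] - a[n] along the last axis (5 samples become 4).
first = np.diff(a, n=1, axis=-1)

# Second order: the difference of the first differences (5 samples become 3).
second = np.diff(a, n=2, axis=-1)

print(first.shape, second.shape)
```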
- get_histogram(bins='freedman', out=None, rechunk=True, **kwargs)
Return a histogram of the signal data.
More sophisticated algorithms for determining bins can be used. Aside from the bins argument accepting a string that specifies how the bins are computed, the parameters are the same as for numpy.histogram().
- Parameters
bins (int, list, or str, optional) – If bins is a string, then it must be one of:
'knuth': use Knuth’s rule to determine bins
'scotts': use Scott’s rule to determine bins
'freedman': use the Freedman-Diaconis rule to determine bins
'blocks': use Bayesian blocks for dynamic bin widths
range_bins (tuple or None, optional) – The minimum and maximum range for the histogram. If range_bins is None, (x.min(), x.max()) will be used.
out (BaseSignal (or subclasses) or None) – If None, a new Signal is created with the result of the operation and returned (default). If a Signal is passed, it is used to receive the output of the operation, and nothing is returned.
rechunk (bool) – Only has effect when operating on a lazy signal. If True (default), the data may be automatically rechunked before performing this operation.
**kwargs – Other keyword arguments (weights and density) are described in numpy.histogram().
- Returns
hist_spec – A 1D spectrum instance containing the histogram.
- Return type
See also
print_summary_statistics(), astroML.density_estimation.histogram(), numpy.histogram()
Notes
The lazy version of the algorithm does not support the 'knuth' and 'blocks' bins arguments.
The estimators for bins are taken from the AstroML project. Read the documentation of astroML.density_estimation.histogram() for more info.
Examples
>>> import numpy as np
>>> import hyperspy.api as hs
>>> s = hs.signals.Signal1D(np.random.normal(size=(10, 100)))
>>> # Plot the data histogram
>>> s.get_histogram().plot()
>>> # Plot the histogram of the signal at the current coordinates
>>> s.get_current_signal().get_histogram().plot()
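For readers unfamiliar with the default 'freedman' option, the sketch below computes a bin width with the textbook Freedman-Diaconis rule, width = 2 * IQR / n^(1/3), and feeds the resulting bin count to numpy.histogram(). It is a NumPy-only illustration; the estimator HyperSpy actually uses comes from AstroML and may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)

# Freedman-Diaconis rule: bin width = 2 * IQR / n^(1/3).
q75, q25 = np.percentile(x, [75, 25])
width = 2.0 * (q75 - q25) / x.size ** (1.0 / 3.0)

# Number of equal-width bins needed to cover the full data range.
n_bins = int(np.ceil((x.max() - x.min()) / width))

counts, edges = np.histogram(x, bins=n_bins)

print(n_bins, counts.sum())
```

The rule depends only on the interquartile range and the sample size, so unlike 'knuth' and 'blocks' it needs no iterative fitting of the data.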
- integrate_simpson(axis, out=None)
Calculate the integral of a Signal along an axis using Simpson’s rule.
- Parameters
axis (int, str, or DataAxis) – The axis can be passed directly, or specified using the index of the axis in the Signal’s axes_manager or the axis name.
out (BaseSignal (or subclasses) or None) – If None, a new Signal is created with the result of the operation and returned (default). If a Signal is passed, it is used to receive the output of the operation, and nothing is returned.
- Returns
s – A new Signal containing the integral of the provided Signal along the specified axis.
- Return type
BaseSignal (or subclasses)
See also
diff(), derivative(), integrate1D()
Examples
>>> import numpy as np
>>> from hyperspy.signals import BaseSignal
>>> s = BaseSignal(np.random.random((64, 64, 1024)))
>>> s.data.shape
(64, 64, 1024)
>>> s.integrate_simpson(-1).data.shape
(64, 64)
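As a reminder of what Simpson's rule computes, here is a small self-contained sketch of the composite rule along the last axis. It assumes a uniformly spaced axis and an odd number of samples; the helper name simpson_last_axis is hypothetical, and HyperSpy's own method may treat other cases differently.

```python
import numpy as np


def simpson_last_axis(y, dx):
    """Composite Simpson's rule along the last axis of y.

    Assumes equally spaced samples (spacing dx) and an odd sample count.
    """
    n = y.shape[-1]
    if n < 3 or n % 2 == 0:
        raise ValueError("this sketch needs an odd number of samples (>= 3)")
    return dx / 3.0 * (
        y[..., 0]
        + 4.0 * y[..., 1:-1:2].sum(axis=-1)   # odd-indexed interior samples
        + 2.0 * y[..., 2:-1:2].sum(axis=-1)   # even-indexed interior samples
        + y[..., -1]
    )


x = np.linspace(0.0, 2.0, 5)              # 5 samples with spacing 0.5
y = np.broadcast_to(x ** 2, (3, 4, 5))    # the same curve at every (3, 4) position
integral = simpson_last_axis(y, dx=x[1] - x[0])

print(integral.shape)
```

The integrated axis disappears from the result, matching the shape change in the example above, and for a quadratic such as x**2 the rule reproduces the exact value 8/3.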
- rebin(new_shape=None, scale=None, crop=False, out=None, rechunk=True)
Rebin the signal into a smaller or larger shape, based on linear interpolation. Specify either new_shape or scale.
- Parameters
new_shape (list (of floats or integers) or None) – For each dimension, specify the new shape. This will internally be converted into a scale parameter.
scale (list (of floats or integers) or None) – For each dimension, specify the new:old pixel ratio, e.g. a ratio of 1 is no binning and a ratio of 2 means that each pixel in the new spectrum is twice the size of the pixels in the old spectrum. The length of the list should match the dimension of the Signal’s underlying data array. Note: only one of scale or new_shape should be specified, otherwise the function will not run.
crop (bool) – Whether or not to crop the resulting rebinned data (the signature above defaults to False). When binning by a non-integer number of pixels it is likely that the final row in each dimension will contain fewer than the full quota to fill one pixel, e.g. a 5*5 array binned by 2.1 will produce two rows containing 2.1 pixels and one row containing only 0.8 pixels. Selecting crop=True or crop=False determines whether or not this “black” line is cropped from the final binned array.
Please note that if crop=False is used, the final row in each dimension may appear black if a fractional number of pixels is left over. It can be removed but has been left to preserve total counts before and after binning.
out (BaseSignal (or subclasses) or None) – If None, a new Signal is created with the result of the operation and returned (default). If a Signal is passed, it is used to receive the output of the operation, and nothing is returned.
- Returns
s – The resulting cropped signal.
- Return type
BaseSignal (or subclass)
Examples
>>> spectrum = hs.signals.EDSTEMSpectrum(np.ones([4, 4, 10]))
>>> spectrum.data[1, 2, 9] = 5
>>> print(spectrum)
<EDSTEMSpectrum, title: , dimensions: (4, 4|10)>
>>> print('Sum =', sum(sum(sum(spectrum.data))))
Sum = 164.0
>>> scale = [2, 2, 5]
>>> test = spectrum.rebin(scale=scale)
>>> print(test)
<EDSTEMSpectrum, title: , dimensions: (2, 2|2)>
>>> print('Sum =', sum(sum(sum(test.data))))
Sum = 164.0
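The count-preserving behaviour in the example can be reproduced for integer scales with a reshape and a sum. This is a NumPy-only sketch of that special case; it does not cover the linear interpolation that the method uses for non-integer scales.

```python
import numpy as np

data = np.ones((4, 4, 10))
data[1, 2, 9] = 5
scale = (2, 2, 5)

# Split every axis into (new_size, scale) and sum over the "scale" parts,
# so each output pixel is the total of one 2 x 2 x 5 block of input pixels.
new_shape = tuple(n // s for n, s in zip(data.shape, scale))
blocks = data.reshape(
    new_shape[0], scale[0],
    new_shape[1], scale[1],
    new_shape[2], scale[2],
)
rebinned = blocks.sum(axis=(1, 3, 5))

print(rebinned.shape, rebinned.sum())
```

Summing rather than averaging is what keeps the total equal before and after binning.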
- valuemax(axis, out=None, rechunk=True)
Returns a signal with the coordinate values of the maximum along an axis.
- Parameters
axis (int, str, or DataAxis) – The axis can be passed directly, or specified using the index of the axis in the Signal’s axes_manager or the axis name.
out (BaseSignal (or subclasses) or None) – If None, a new Signal is created with the result of the operation and returned (default). If a Signal is passed, it is used to receive the output of the operation, and nothing is returned.
rechunk (bool) – Only has effect when operating on a lazy signal. If True (default), the data may be automatically rechunked before performing this operation.
- Returns
s – A new Signal containing the calibrated coordinate values of the maximum along the specified axis.
- Return type
BaseSignal (or subclasses)
See also
max(), min(), sum(), mean(), std(), var(), indexmax(), indexmin(), valuemin()
Examples
>>> import numpy as np
>>> from hyperspy.signals import BaseSignal
>>> s = BaseSignal(np.random.random((64, 64, 1024)))
>>> s.data.shape
(64, 64, 1024)
>>> s.valuemax(-1).data.shape
(64, 64)
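The difference between the index of an extremum and its calibrated coordinate value can be sketched with plain NumPy. The example assumes a linearly calibrated axis, value = offset + scale * index; the offset and scale numbers are made up for illustration, and the same lookup applies to the minimum.

```python
import numpy as np

# Hypothetical linear axis calibration: value = offset + scale * index.
offset, scale = 100.0, 0.5
data = np.array([[0.0, 3.0, 1.0, 2.0],
                 [5.0, 4.0, 9.0, 9.5]])
axis_values = offset + scale * np.arange(data.shape[-1])

index_of_max = data.argmax(axis=-1)                 # array positions of the maxima
value_of_max = axis_values[index_of_max]            # calibrated coordinates of the maxima
value_of_min = axis_values[data.argmin(axis=-1)]    # same lookup for the minima

print(index_of_max, value_of_max, value_of_min)
```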
- valuemin(axis, out=None, rechunk=True)
Returns a signal with the coordinate values of the minimum along an axis.
- Parameters
axis (int, str, or DataAxis) – The axis can be passed directly, or specified using the index of the axis in the Signal’s axes_manager or the axis name.
out (BaseSignal (or subclasses) or None) – If None, a new Signal is created with the result of the operation and returned (default). If a Signal is passed, it is used to receive the output of the operation, and nothing is returned.
rechunk (bool) – Only has effect when operating on a lazy signal. If True (default), the data may be automatically rechunked before performing this operation.
- Returns
s – A new Signal containing the calibrated coordinate values of the minimum along the specified axis.
- Return type
BaseSignal (or subclasses)
See also
max(), min(), sum(), mean(), std(), var(), indexmax(), indexmin(), valuemax()
- hyperspy._signals.lazy.to_array(thing, chunks=None)
Accepts BaseSignal, dask or numpy arrays and always produces either a numpy or a dask array.
- Parameters
thing ({BaseSignal, dask.array.Array, numpy.ndarray}) – The object to be converted.
chunks ({None, tuple of tuples}) – If None, the returned value is a numpy array. Otherwise, a dask array with the specified chunks is returned.
- Returns
res
- Return type
{numpy.ndarray, dask.array.Array}