Blind Source Separation#
In some cases it is possible to obtain a more physically interpretable set of components using a process called Blind Source Separation (BSS). Whether this works depends largely on the particular application. For more information about blind source separation, see [Hyvarinen2000]; for an example application to EELS analysis, see [Pena2010].
Warning
The BSS algorithms operate on the result of a previous decomposition analysis. It is therefore necessary to perform a decomposition before calling blind_source_separation(), otherwise it will raise an error.
You must provide either an integer number_of_components argument or a list of components as the comp_list argument. BSS is then performed on the chosen number or list of components from the previous decomposition.
To perform blind source separation on the result of a previous decomposition, run the blind_source_separation() method, for example:
>>> import numpy as np
>>> import hyperspy.api as hs
>>> s = hs.signals.Signal1D(np.random.randn(10, 10, 200))
>>> s.decomposition(output_dimension=3)
Decomposition info:
normalize_poissonian_noise=False
algorithm=SVD
output_dimension=3
centre=None
>>> s.blind_source_separation(number_of_components=3)
Blind source separation info:
number_of_components=3
algorithm=sklearn_fastica
diff_order=1
reverse_component_criterion=factors
whiten_method=PCA
scikit-learn estimator:
FastICA(tol=1e-10, whiten=False)
>>> # Perform BSS only on the first and third components
>>> s.blind_source_separation(comp_list=[0, 2])
Blind source separation info:
number_of_components=2
algorithm=sklearn_fastica
diff_order=1
reverse_component_criterion=factors
whiten_method=PCA
scikit-learn estimator:
FastICA(tol=1e-10, whiten=False)
Available algorithms#
HyperSpy implements a number of BSS algorithms, selected via the algorithm argument. The table below lists the algorithms that are currently available and links to the appropriate documentation for more information on each one.
Algorithm | Method
---|---
“sklearn_fastica” (default) |
“orthomax” |
“FastICA” |
“JADE” |
“CuBICA” |
“TDSEP” |
custom object | An object implementing fit() and transform() methods
Note
Except for orthomax, all of the BSS algorithms listed above rely on external packages being available on your system: sklearn_fastica requires scikit-learn, while FastICA, JADE, CuBICA and TDSEP require the Modular toolkit for Data Processing (MDP).
Orthomax#
Orthomax rotations are a statistical technique used to clarify and highlight the relationship among factors by adjusting the coordinates of PCA results. The most common approach is known as “varimax”, which is intended to maximize the variance of the squared loadings within each component while preserving orthogonality. The results of an orthomax rotation following PCA are often “simpler” to interpret than those of PCA alone, since each component has a more discrete contribution to the data.
>>> s = hs.signals.Signal1D(np.random.randn(10, 10, 200))
>>> s.decomposition(output_dimension=3, print_info=False)
>>> s.blind_source_separation(number_of_components=3, algorithm="orthomax")
Blind source separation info:
number_of_components=3
algorithm=orthomax
diff_order=1
reverse_component_criterion=factors
whiten_method=PCA
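To make the idea of a varimax rotation more concrete, the following is a minimal NumPy sketch of the textbook SVD-based varimax iteration. It is an illustration only and is not taken from HyperSpy's own orthomax implementation, which may differ in detail. The names `varimax`, `varimax_criterion`, `simple` and `loadings` are hypothetical and chosen for this example.

```python
import numpy as np


def varimax_criterion(L):
    """Sum over components of the variance of the squared loadings."""
    return np.sum(np.var(L**2, axis=0))


def varimax(A, gamma=1.0, max_iter=100, tol=1e-8):
    """Textbook varimax rotation of a (features x components) matrix A.

    Returns the rotated loadings and the orthogonal rotation matrix R.
    gamma=1.0 corresponds to varimax within the orthomax family.
    """
    p, k = A.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        L = A @ R
        # Gradient-like term of the orthomax criterion
        G = A.T @ (L**3 - (gamma / p) * L * (L**2).sum(axis=0))
        u, s, vt = np.linalg.svd(G)
        R = u @ vt  # closest orthogonal matrix to G
        d_new = s.sum()
        if d != 0.0 and d_new / d < 1 + tol:
            break
        d = d_new
    return A @ R, R


# Loadings with a "simple" structure, deliberately rotated by 30 degrees
simple = np.array(
    [[1.0, 0.0], [0.9, 0.0], [0.8, 0.1], [0.0, 1.0], [0.1, 0.9], [0.0, 0.8]]
)
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
loadings = simple @ mix

rotated, R = varimax(loadings)
print("criterion before:", varimax_criterion(loadings))
print("criterion after: ", varimax_criterion(rotated))
```

The rotation matrix stays orthogonal at every step because it is rebuilt from the singular vectors, which is why the rotated components preserve orthogonality.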
Independent component analysis (ICA)#
One of the most common approaches for blind source separation is Independent Component Analysis (ICA). This separates a signal into subcomponents by assuming that the subcomponents (a) are non-Gaussian and (b) are statistically independent of each other.
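The two assumptions can be illustrated with a small, self-contained NumPy sketch of a symmetric fixed-point ICA iteration with a tanh nonlinearity. This is a simplified teaching example under stated assumptions (two synthetic non-Gaussian sources, a hypothetical mixing matrix `A`), not the scikit-learn or MDP code that HyperSpy calls.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
t = np.linspace(0, 8, n)

# Two independent, non-Gaussian sources (synthetic, for illustration)
S = np.vstack([np.sign(np.sin(3 * t)), rng.uniform(-1, 1, n)])
A = np.array([[1.0, 0.5], [0.4, 1.0]])  # hypothetical mixing matrix
X = A @ S  # observed mixtures

# Centre and whiten the mixtures
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(X @ X.T / n)
Z = (E / np.sqrt(d)) @ E.T @ X


def sym_decorrelate(W):
    """Return (W W^T)^(-1/2) W so that the rows of W are orthonormal."""
    s, u = np.linalg.eigh(W @ W.T)
    return (u / np.sqrt(s)) @ u.T @ W


W = sym_decorrelate(rng.standard_normal((2, 2)))
for _ in range(500):
    Y = W @ Z
    g = np.tanh(Y)
    g_prime = 1.0 - g**2
    # Fixed-point update: E[z g(w^T z)] - E[g'(w^T z)] w, for every row w
    W_new = sym_decorrelate(g @ Z.T / n - np.diag(g_prime.mean(axis=1)) @ W)
    converged = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0)) < 1e-8
    W = W_new
    if converged:
        break

recovered = W @ Z
# Absolute correlation between each true source and each recovered component
corr = np.abs(np.corrcoef(np.vstack([S, recovered]))[:2, 2:])
print(np.round(corr, 3))
```

ICA recovers sources only up to sign, scale and ordering, so the comparison uses absolute correlations rather than a direct difference.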
Custom BSS algorithms#
As with decomposition, HyperSpy supports passing a custom BSS algorithm, provided it follows the form of a scikit-learn estimator. Any object that implements fit() and transform() methods is acceptable, including sklearn.pipeline.Pipeline and sklearn.model_selection.GridSearchCV.
You can access the fitted estimator by passing return_info=True.
>>> # Passing a custom BSS algorithm
>>> from sklearn.preprocessing import MinMaxScaler
>>> from sklearn.pipeline import Pipeline
>>> from sklearn.decomposition import FastICA
>>> pipe = Pipeline([("scaler", MinMaxScaler()), ("ica", FastICA())])
>>> out = s.blind_source_separation(number_of_components=3, algorithm=pipe, return_info=True, print_info=False)
>>> out
Pipeline(steps=[('scaler', MinMaxScaler()), ('ica', FastICA())])
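For readers unfamiliar with the estimator interface, the sketch below shows the general shape of an object with fit() and transform() methods, written in plain NumPy. The class `WhiteningEstimator` and the helper `has_estimator_interface` are hypothetical names for this illustration. The class only whitens its input and is not a real BSS method, and whether HyperSpy expects further attributes on a custom estimator is not covered here, so treat it as a starting point rather than a drop-in algorithm.

```python
import numpy as np


class WhiteningEstimator:
    """Hypothetical estimator that centres and whitens data."""

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)
        cov = np.cov(X - self.mean_, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)
        # Rows of components_ map centred data to unit-variance coordinates
        self.components_ = (eigvecs / np.sqrt(eigvals)).T
        return self

    def transform(self, X):
        return (np.asarray(X, dtype=float) - self.mean_) @ self.components_.T


def has_estimator_interface(obj):
    """Check for callable fit() and transform() methods (duck typing)."""
    return all(callable(getattr(obj, name, None)) for name in ("fit", "transform"))


rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 0.5]])
X = rng.standard_normal((500, 3)) @ mixing

est = WhiteningEstimator().fit(X)
Y = est.transform(X)
print(has_estimator_interface(est))
```

Following the scikit-learn convention, fit() returns the estimator itself and stores learned quantities in attributes with a trailing underscore.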