hyperspy.learn.ornmf module

class hyperspy.learn.ornmf.ORNMF(rank, store_error=False, lambda1=1.0, kappa=1.0, method='PGD', subspace_learning_rate=1.0, subspace_momentum=0.5, random_state=None)

Bases: object

Performs Online Robust NMF with missing or corrupted data.

The ORNMF code is based on a transcription of the MATLAB code for the online proximal gradient descent (PGD) algorithm, obtained from the authors of [Zhao2016]. It has been extended with an L2-normalization cost function that handles sparse corruptions and/or outliers slightly faster (see the ORPCA implementation for details). A further modification allows for a changing subspace W, where X ~= WH^T + E in the ORNMF framework.
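To make the model X ~= WH^T + E concrete, the following minimal sketch builds a tiny X from non-negative factors W, non-negative loadings H, and a sparse outlier term E. It uses plain Python lists rather than the ORNMF class. The shapes (W as n_features x rank, H as n_samples x rank), the helper name matmul_transpose, and all values are assumptions chosen for illustration.

```python
def matmul_transpose(W, H):
    """Return W @ H^T for list-of-list matrices W (f x r) and H (n x r)."""
    return [[sum(w * h for w, h in zip(w_row, h_row)) for h_row in H]
            for w_row in W]


# Illustrative values only: 3 features, 4 samples, rank 2.
W = [[1.0, 0.0],
     [0.0, 2.0],
     [1.0, 1.0]]
H = [[1.0, 0.0],
     [0.0, 1.0],
     [2.0, 1.0],
     [1.0, 3.0]]

# Sparse error term: a single corrupted entry (feature 1, sample 2).
E = [[0.0] * 4 for _ in range(3)]
E[1][2] = 5.0

# Observed data = low-rank non-negative part + sparse corruption.
low_rank = matmul_transpose(W, H)
X = [[lr + e for lr, e in zip(lr_row, e_row)]
     for lr_row, e_row in zip(low_rank, E)]
```

Only the entry touched by E differs from the low-rank part; recovering W, H and E from X alone is the task the class addresses.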

Read more in the User Guide.

References

Zhao2016

Zhao, Renbo, and Vincent YF Tan. “Online nonnegative matrix factorization with outliers.” Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on. IEEE, 2016.

Creates an Online Robust NMF instance that can learn a representation.

Parameters
  • rank (int) – The rank of the representation (number of components/factors)

  • store_error (bool, default False) – If True, stores the sparse error matrix.

  • lambda1 (float) – Nuclear norm regularization parameter.

  • kappa (float) – Step-size for projection solver.

  • method ({'PGD', 'RobustPGD', 'MomentumSGD'}, default 'PGD') –

    • ‘PGD’ - Proximal gradient descent

    • ‘RobustPGD’ - Robust proximal gradient descent

    • ‘MomentumSGD’ - Stochastic gradient descent with momentum

  • subspace_learning_rate (float) – Learning rate for the ‘MomentumSGD’ method. Should be a float > 0.0.

  • subspace_momentum (float) – Momentum parameter for the ‘MomentumSGD’ method. Should be a float between 0 and 1.

  • random_state (None or int or RandomState instance, default None) – Used to initialize the subspace on the first iteration.
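The documented constraints on these options can be collected in a small check. The helper below, check_ornmf_options, is hypothetical and not part of HyperSpy. It assumes that the bounds on subspace_momentum are inclusive and that the two subspace options only matter for the 'MomentumSGD' method, as the descriptions above suggest.

```python
VALID_METHODS = ("PGD", "RobustPGD", "MomentumSGD")


def check_ornmf_options(method="PGD", subspace_learning_rate=1.0,
                        subspace_momentum=0.5):
    """Hypothetical helper: validate options against the documented ranges."""
    if method not in VALID_METHODS:
        raise ValueError(
            f"method must be one of {VALID_METHODS}, got {method!r}"
        )
    if method == "MomentumSGD":
        # Documented: learning rate should be a float > 0.0.
        if not subspace_learning_rate > 0.0:
            raise ValueError("subspace_learning_rate must be > 0.0")
        # Documented: momentum should lie between 0 and 1
        # (inclusive bounds are an assumption).
        if not 0.0 <= subspace_momentum <= 1.0:
            raise ValueError("subspace_momentum must be between 0 and 1")
    return {
        "method": method,
        "subspace_learning_rate": subspace_learning_rate,
        "subspace_momentum": subspace_momentum,
    }


options = check_ornmf_options(method="MomentumSGD", subspace_momentum=0.9)
```

The returned dictionary uses the same keyword names as the constructor, so it could be unpacked into a call alongside rank.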

finish()

Return the learnt factors and loadings.

fit(X, batch_size=None)

Learn NMF components from the data.

Parameters
  • X ({numpy.ndarray, iterator}) – [n_samples x n_features] matrix of observations or an iterator that yields samples, each with n_features elements.

  • batch_size ({None, int}) – If not None, learn the data in batches, each of batch_size samples or less.
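As a rough illustration of what "batches, each of batch_size samples or less" means, the sketch below splits an array-like or iterator of samples into consecutive batches. The generator iter_batches is a hypothetical name, and how fit() partitions the data internally is not stated here, so this is an assumption about the general idea rather than the library's code.

```python
def iter_batches(samples, batch_size=None):
    """Yield lists of at most batch_size samples from any iterable.

    With batch_size=None, everything is yielded as a single batch.
    """
    if batch_size is None:
        yield list(samples)
        return
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")

    batch = []
    for sample in samples:
        batch.append(sample)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # The final batch may hold fewer than batch_size samples.
        yield batch


# Seven samples in batches of three: sizes 3, 3 and 1.
batch_sizes = [len(b) for b in iter_batches(range(7), batch_size=3)]
```

Because the generator consumes its input one sample at a time, it works equally for an in-memory matrix (iterated row by row) and for a stream of samples.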

project(X, return_error=False)

Project the data onto the learnt components.

Parameters
  • X ({numpy.ndarray, iterator}) – [n_samples x n_features] matrix of observations or an iterator that yields samples, each with n_features elements.

  • return_error (bool) – If True, returns the sparse error matrix as well. Otherwise, only the weights (loadings) are returned.

hyperspy.learn.ornmf._mrdivide(B, A)

Solves xA = B for x, as per MATLAB's B / A (mrdivide).
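In MATLAB, B / A (mrdivide) yields the x that satisfies x A = B. The sketch below shows that relationship for an invertible 2 x 2 matrix A using plain Python lists. The function right_divide_2x2 is a hypothetical illustration and not the library's implementation, which is not shown here.

```python
def right_divide_2x2(B, A):
    """Return x such that x @ A == B, for an invertible 2 x 2 matrix A."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("A is singular")
    # Explicit inverse of a 2 x 2 matrix.
    A_inv = [[d / det, -b / det],
             [-c / det, a / det]]
    # x = B @ A^{-1}
    return [[sum(row[k] * A_inv[k][j] for k in range(2)) for j in range(2)]
            for row in B]


A = [[1.0, 2.0],
     [3.0, 4.0]]
B = [[4.0, 6.0]]  # equals [1, 1] @ A

x = right_divide_2x2(B, A)

# Multiply back to confirm that x @ A reproduces B.
check = [[sum(row[k] * A[k][j] for k in range(2)) for j in range(2)]
         for row in x]
```

For larger or non-square systems a linear solver or least-squares routine would be the usual choice instead of an explicit inverse.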

hyperspy.learn.ornmf._thresh(X, lambda1, vmax)

Soft-thresholding with clipping.
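A scalar sketch of soft-thresholding followed by clipping is given below. It assumes the clipping is symmetric, limiting the result to the range [-vmax, vmax]. The argument names mirror the signature above, but soft_threshold_clip itself is a hypothetical illustration rather than the library function.

```python
import math


def soft_threshold_clip(x, lambda1, vmax):
    """Shrink |x| by lambda1 (floored at 0), cap at vmax, keep the sign of x."""
    shrunk = max(abs(x) - lambda1, 0.0)   # soft-thresholding step
    capped = min(shrunk, vmax)            # clipping step (assumed symmetric)
    return math.copysign(capped, x)


values = [-10.0, -3.0, 0.5, 3.0, 10.0]
result = [soft_threshold_clip(v, lambda1=1.0, vmax=5.0) for v in values]
```

Entries whose magnitude is below lambda1 collapse to zero, which is what makes the estimated error term sparse, while the cap keeps large outliers bounded.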

hyperspy.learn.ornmf.ornmf(X, rank, store_error=False, project=False, batch_size=None, lambda1=1.0, kappa=1.0, method='PGD', subspace_learning_rate=1.0, subspace_momentum=0.5, random_state=None)

Perform online, robust NMF on the data X.

This is a wrapper function for the ORNMF class.

Parameters
  • X (numpy array) – The [n_samples, n_features] input data.

  • rank (int) – The rank of the representation (number of components/factors)

  • store_error (bool, default False) – If True, stores the sparse error matrix.

  • project (bool, default False) – If True, project the data X onto the learnt model.

  • batch_size ({None, int}, default None) – If not None, learn the data in batches, each of batch_size samples or less.

  • lambda1 (float) – Nuclear norm regularization parameter.

  • kappa (float) – Step-size for projection solver.

  • method ({'PGD', 'RobustPGD', 'MomentumSGD'}, default 'PGD') –

    • ‘PGD’ - Proximal gradient descent

    • ‘RobustPGD’ - Robust proximal gradient descent

    • ‘MomentumSGD’ - Stochastic gradient descent with momentum

  • subspace_learning_rate (float) – Learning rate for the ‘MomentumSGD’ method. Should be a float > 0.0.

  • subspace_momentum (float) – Momentum parameter for the ‘MomentumSGD’ method. Should be a float between 0 and 1.

  • random_state (None or int or RandomState instance, default None) – Used to initialize the subspace on the first iteration.

Returns

  • Xhat (numpy array) – The [n_features x n_samples] non-negative matrix. Only returned if store_error is True.

  • Ehat (numpy array) – The [n_features x n_samples] sparse error matrix. Only returned if store_error is True.

  • W (numpy array, shape [n_features, rank]) – The non-negative factors matrix.

  • H (numpy array, shape [rank, n_samples]) – The non-negative loadings matrix.