Gaussian Mixture Models
class sparseklearn.gmm.GaussianMixture(n_components=3, covariance_type='spherical', tol=0.001, reg_covar=1e-06, max_iter=100, n_init=1, init_params='kmpp', means_init=None, covariances_init=None, weights_init=None, predict_training_data=False, **kwargs)
Sparsified Gaussian mixture model.
Fit a Gaussian mixture model to sparsified data. Diagonal and spherical covariances are supported.
- Parameters
- n_components : int, defaults to 3.
The number of components (clusters) to fit.
- covariance_type : {'spherical', 'diag'}, defaults to 'spherical'.
The form of the covariance matrix.
- tol : float, defaults to 1e-3.
The convergence threshold. EM iterations stop when the average gain in the lower bound falls below this threshold.
- reg_covar : float, defaults to 1e-6.
Non-negative regularization added to the diagonal of the covariance to ensure it is positive.
- max_iter : int, defaults to 100.
The maximum number of EM iterations to perform.
- n_init : int, defaults to 1.
The number of initializations to perform. The best results are kept.
- init_params : {'kmpp', 'random'}, defaults to 'kmpp'.
The method used to initialize the weights, the means and the precisions. If 'kmpp', the initial means are chosen using the k-means++ algorithm. If 'random', initial means are chosen at random from the input data.
- means_init : nd.array, shape (n_components, P), optional.
The user-provided initial means. Defaults to None, in which case the means are initialized using the init_params method. P is the number of features in the full-dimensional space.
- predict_training_data : bool, defaults to False.
Whether to predict labels for the training data.
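The 'kmpp' option of init_params refers to k-means++ seeding. The following is a minimal sketch of that seeding rule on dense data, using only the Python standard library. The function name kmeanspp_init and its seed argument are hypothetical, and the sketch ignores sparsification, so it should not be read as this library's implementation.

```python
import random


def kmeanspp_init(X, n_components, seed=0):
    """Pick n_components rows of X as initial means via k-means++ seeding.

    Hypothetical helper for illustration; X is a list of equal-length rows.
    """
    rng = random.Random(seed)
    # First center: drawn uniformly at random from the data.
    centers = [list(rng.choice(X))]
    # Squared distance from every point to its nearest chosen center.
    d2 = [sum((a - b) ** 2 for a, b in zip(x, centers[0])) for x in X]
    while len(centers) < n_components:
        if sum(d2) == 0:
            # Every point coincides with a center; fall back to a uniform draw.
            idx = rng.randrange(len(X))
        else:
            # Next center: drawn with probability proportional to d2.
            idx = rng.choices(range(len(X)), weights=d2, k=1)[0]
        centers.append(list(X[idx]))
        # Update the nearest-center distances with the new center.
        d2 = [min(old, sum((a - b) ** 2 for a, b in zip(x, centers[-1])))
              for old, x in zip(d2, X)]
    return centers
```

Points far from every existing center are more likely to be chosen next, which tends to spread the initial means across the data. By contrast, the 'random' option as described above draws the initial means at random from the input data.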
- Attributes
- weights_ : nd.array, shape (n_components,)
The weight of each mixture component.
- means_ : nd.array, shape (n_components, P)
The mean of each mixture component.
- covariances_ : nd.array
The covariance of each mixture component. The shape depends on covariance_type: (n_components,) if spherical and (n_components, P) if diag.
- converged_ : bool
True if fit() converged, False otherwise.
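The two covariances_ shapes can be brought to a common per-feature form. The sketch below assumes that a spherical entry is one variance shared by all P features of a component, and that a diag row holds one variance per feature. The helper name expand_variances is hypothetical and is not part of the library.

```python
def expand_variances(covariances, covariance_type, P):
    """Return per-feature variances as a list of n_components rows of length P.

    Hypothetical helper for illustration only.
    """
    if covariance_type == "spherical":
        # Shape (n_components,): repeat each component's variance P times.
        return [[float(v)] * P for v in covariances]
    if covariance_type == "diag":
        # Shape (n_components, P): already one variance per feature.
        rows = [[float(v) for v in row] for row in covariances]
        if any(len(row) != P for row in rows):
            raise ValueError("each 'diag' row must have P entries")
        return rows
    raise ValueError("covariance_type must be 'spherical' or 'diag'")
```

For example, expand_variances([1.0, 2.0], "spherical", 3) gives two rows of three identical variances, which matches the (n_components, P) layout of the diag case.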
Methods
- apply_HD(self, X)
Apply the preconditioning transform to X.
- apply_mask(self, X, mask)
Apply the mask to X.
- fit(self[, X, HDX, RHDX, y])
Estimate model parameters using EM algorithm.
- fit_sparsifier(self[, X, HDX, RHDX])
Fit the sparsifier to specified data.
- invert_HD(self, HDX)
Apply the inverse of HD to HDX.
- invert_mask_bool(self)
Compute the mask inverse.
- pairwise_distances(self[, Y])
Computes the pairwise distance between each sparsified sample, or between each sparsified sample and each full sample in Y if Y is given.
- pairwise_mahalanobis_distances(self, means, …)
Computes the Mahalanobis distance between each compressed sample and each full mean (each row of means).
- predict(self, X)
Predict the labels for the data samples in X using the trained model.
- weighted_means(self, W)
Computes weighted full means of sparsified samples.
- weighted_means_and_variances(self, W)
Computes weighted full means and variances of sparsified samples.
fit(self, X=None, HDX=None, RHDX=None, y=None)
Estimate model parameters using the EM algorithm.
Fits the model n_init times and keeps the parameters with which the model has the largest likelihood. Each trial performs at most max_iter iterations of EM until convergence; otherwise a ConvergenceWarning is raised. At least one of X, HDX, or RHDX must be passed.
- Parameters
- X : nd.array, shape (N, P), optional
Dense, raw data. Defaults to None.
- HDX : nd.array, shape (N, P), optional
Dense, transformed data. Defaults to None.
- RHDX : nd.array, shape (N, Q), optional
Subsampled, transformed data. Defaults to None.
- y : nd.array, shape (N,), optional
True labels.
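The restart logic described for fit (n_init trials, at most max_iter iterations each, stopping once the gain drops below tol) can be sketched as a small driver. This is an illustration under stated assumptions, not the library's code. The callables init_fn and em_step_fn are hypothetical placeholders, and a plain UserWarning stands in for ConvergenceWarning.

```python
import warnings


def fit_with_restarts(init_fn, em_step_fn, n_init=1, max_iter=100, tol=1e-3):
    """Run EM from n_init starts and keep the run with the highest lower bound.

    init_fn(trial) -> params and em_step_fn(params) -> (params, lower_bound)
    are hypothetical callables supplied by the caller.
    """
    best_params, best_bound, best_converged = None, float("-inf"), False
    for trial in range(n_init):
        params = init_fn(trial)
        bound = prev_bound = float("-inf")
        converged = False
        for _ in range(max_iter):
            params, bound = em_step_fn(params)
            # Converged once the change in the lower bound is below tol.
            if abs(bound - prev_bound) < tol:
                converged = True
                break
            prev_bound = bound
        if not converged:
            # Stand-in for ConvergenceWarning (an assumption of this sketch).
            warnings.warn(f"trial {trial} did not converge in {max_iter} iterations")
        # Keep the trial with the largest final lower bound.
        if bound > best_bound:
            best_params, best_bound, best_converged = params, bound, converged
    return best_params, best_bound, best_converged
```

Returning the convergence flag of the winning trial mirrors the role of the converged_ attribute. Whether the library records convergence per trial or only for the best run is not stated in this page.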
predict(self, X)
Predict the labels for the data samples in X using the trained model.
- Parameters
- X : nd.array, shape (n_samples, Q)
Array of Q-dimensional data points. Each row corresponds to a single data point. X is assumed to be preconditioned and subsampled.
- Returns
- labels : array, shape (n_samples,)
Component labels.
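One way to picture label prediction on subsampled data is to score each sample against every component using only the features that sample kept. The sketch below makes several assumptions that this page does not confirm. It passes the kept feature indices explicitly as a mask argument. It uses per-feature (diagonal) variances. It assigns each sample to the component with the highest weighted log-density. The function predict_labels is hypothetical and is not the library's predict.

```python
import math


def predict_labels(RHDX, mask, weights, means, variances):
    """Assign each subsampled row to its most likely component.

    RHDX[i] holds the Q kept values of sample i and mask[i] the Q feature
    indices they came from; means and variances have n_components rows of
    length P. All names are hypothetical, for illustration only.
    """
    labels = []
    for x, idx in zip(RHDX, mask):
        scores = []
        for w, mu, var in zip(weights, means, variances):
            # Log of (weight * diagonal Gaussian density) on the kept features.
            log_p = math.log(w)
            for value, j in zip(x, idx):
                log_p -= 0.5 * (math.log(2 * math.pi * var[j])
                                + (value - mu[j]) ** 2 / var[j])
            scores.append(log_p)
        # Label is the index of the highest-scoring component.
        labels.append(scores.index(max(scores)))
    return labels
```

Here each sample is compared with the full P-dimensional means only on its own Q kept coordinates, in the spirit of the pairwise_mahalanobis_distances summary listed under Methods.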