K-Means

class sparseklearn.kmeans.KMeans(n_components=8, init='kmpp', tol=0.0001, n_init=10, n_passes=1, max_iter=300, means_init_array=None, **kwargs)

Sparsified K-Means clustering.

Parameters

n_components : int, default: 8

The number of clusters.

init : {ndarray, ‘kmpp’, ‘random’}, default: ‘kmpp’

Initialization method:

ndarray: shape (n_components, P). Initial cluster centers; these must already be transformed.

‘kmpp’: picks initial cluster centers from the data with probability proportional to the distance of each datapoint to the current initial means. More expensive than ‘random’, but typically converges better. The centers are drawn from HDX if the sparsifier has access to it; otherwise they come from RHDX.

‘random’: picks initial cluster centers uniformly at random from the datapoints. The centers are drawn from HDX if the sparsifier has access to it; otherwise they come from RHDX.

n_init : int, default: 10

Number of times to run k-means, each with a new initialization. The best result is kept.

max_iter : int, default: 300

Maximum number of iterations for each run.

tol : float, default: 1e-4

Relative tolerance with regard to inertia, used to declare convergence.
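
The two string options for `init` can be illustrated with a small NumPy sketch on a dense array. This is not sparseklearn's code: the function names are hypothetical, no sparsification is involved, and `seed_kmpp` follows the classic k-means++ rule, which weights by squared distance to the nearest chosen center. The description above says "distance", and the exact weighting used by the library is not confirmed here.

```python
import numpy as np

def sq_dists(X, centers):
    # Squared Euclidean distance from every row of X to every center, shape (N, K).
    diff = X[:, None, :] - centers[None, :, :]
    return np.einsum("nkp,nkp->nk", diff, diff)

def seed_random(X, k, rng):
    # 'random'-style seeding: k distinct datapoints chosen uniformly at random.
    idx = rng.choice(X.shape[0], size=k, replace=False)
    return X[idx].copy()

def seed_kmpp(X, k, rng):
    # Classic k-means++ seeding (assumed variant): the first center is uniform,
    # each later center is drawn with probability proportional to the squared
    # distance from a datapoint to its nearest already-chosen center.
    n = X.shape[0]
    centers = [X[rng.integers(n)]]
    for _ in range(1, k):
        d2 = sq_dists(X, np.asarray(centers)).min(axis=1)
        total = d2.sum()
        # Fall back to uniform sampling if every point coincides with a center.
        probs = d2 / total if total > 0 else np.full(n, 1.0 / n)
        centers.append(X[rng.choice(n, p=probs)])
    return np.asarray(centers)

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
print(seed_random(X_demo, 4, rng).shape)
print(seed_kmpp(X_demo, 4, rng).shape)
```

The extra cost of the k-means++ style comes from recomputing distances to the chosen centers after every draw, whereas uniform seeding needs a single draw.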

Attributes

cluster_centers_ : nd.array, shape (n_components, P): Coordinates of cluster centers.
labels_ : np.array, shape (N,): Label of each point.
inertia_ : float: Sum of squared distances of samples to their cluster center.
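
As a rough illustration of how the labels and the inertia relate to the cluster centers, and of how `tol`, `max_iter`, and `n_init` might interact, here is a plain dense k-means in NumPy. It is a sketch under assumptions, not the library's implementation: the helper names are hypothetical, the stopping rule (relative decrease in inertia at most `tol`) is one common reading of "relative tolerance", "best" is taken to mean lowest inertia, and uniform random seeding is used for brevity.

```python
import numpy as np

def assign(X, centers):
    # labels: index of the nearest center for each sample, shape (N,).
    # inertia: sum of squared distances of samples to their assigned center.
    diff = X[:, None, :] - centers[None, :, :]
    d2 = np.einsum("nkp,nkp->nk", diff, diff)
    labels = d2.argmin(axis=1)
    inertia = float(d2[np.arange(X.shape[0]), labels].sum())
    return labels, inertia

def converged(prev_inertia, inertia, tol=1e-4):
    # Assumed rule: stop once the relative decrease in inertia is at most tol.
    if not np.isfinite(prev_inertia):
        return False
    return (prev_inertia - inertia) <= tol * prev_inertia

def kmeans_once(X, k, rng, max_iter=300, tol=1e-4):
    # One run: uniform random seeding followed by Lloyd iterations.
    centers = X[rng.choice(X.shape[0], size=k, replace=False)].astype(float)
    prev = np.inf
    for _ in range(max_iter):
        labels, inertia = assign(X, centers)
        if converged(prev, inertia, tol):
            break
        prev = inertia
        for j in range(k):
            members = X[labels == j]
            if len(members):  # an empty cluster keeps its previous center
                centers[j] = members.mean(axis=0)
    # Recompute so labels and inertia match the returned centers.
    labels, inertia = assign(X, centers)
    return centers, labels, inertia

def kmeans_best_of(X, k, n_init=10, seed=0, **kwargs):
    # Repeat with fresh initializations and keep the lowest-inertia run.
    rng = np.random.default_rng(seed)
    runs = [kmeans_once(X, k, rng, **kwargs) for _ in range(n_init)]
    return min(runs, key=lambda run: run[2])

X_demo = np.array([[0.0], [1.0], [10.0], [11.0]])
centers, labels, inertia = kmeans_best_of(X_demo, k=2, n_init=3)
print(centers.ravel(), labels, inertia)
```

Comparing runs by inertia is only meaningful for a fixed number of clusters, since inertia generally shrinks as the number of clusters grows.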

Methods

`apply_HD`(self, X)	Apply the preconditioning transform to X.
`apply_mask`(self, X, mask)	Apply the mask to X.
`fit`(self[, X, HDX, RHDX])	Compute k-means clustering and assign labels to datapoints.
`fit_sparsifier`(self[, X, HDX, RHDX])	Fit the sparsifier to specified data.
`invert_HD`(self, HDX)	Apply the inverse of HD to HDX.
`invert_mask_bool`(self)	Compute the mask inverse.
`pairwise_distances`(self[, Y])	Computes the pairwise distance between each sparsified sample, or between each sparsified sample and each full sample in Y if Y is given.
`pairwise_mahalanobis_distances`(self, means, …)	Computes the Mahalanobis distance between each compressed sample and each full mean (each row of means).
`weighted_means`(self, W)	Computes weighted full means of sparsified samples.
`weighted_means_and_variances`(self, W)	Computes weighted full means and variances of sparsified samples.

fit(self, X=None, HDX=None, RHDX=None)

Compute k-means clustering and assign labels to datapoints. At least one of X, HDX, or RHDX must be provided.

Parameters

X : nd.array, shape (N, P), optional: defaults to None. Dense, raw data.
HDX : nd.array, shape (N, P), optional: defaults to None. Dense, transformed data.
RHDX : nd.array, shape (N, Q), optional: defaults to None. Subsampled, transformed data.
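
To make the three array shapes concrete, the following NumPy sketch builds stand-ins for X, HDX, and RHDX. Everything about the transform is an assumption made for illustration: this page does not say which preconditioner or mask the sparsifier uses, so random sign flips followed by a random orthogonal matrix stand in for the preconditioning transform, and a per-sample random choice of Q out of P features stands in for the mask. The sketch does not call sparseklearn.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, Q = 6, 8, 3  # Q < P features are kept per sample

X = rng.normal(size=(N, P))  # dense, raw data, shape (N, P)

# Hypothetical preconditioner: diagonal of random signs (D) followed by an
# orthogonal matrix (H). The library's actual transform is not specified here.
D = rng.choice([-1.0, 1.0], size=P)
H, _ = np.linalg.qr(rng.normal(size=(P, P)))
HDX = (X * D) @ H.T  # dense, transformed data, shape (N, P)

# Hypothetical mask: for each sample, Q distinct feature indices out of P.
mask = np.stack([np.sort(rng.choice(P, size=Q, replace=False)) for _ in range(N)])
RHDX = np.take_along_axis(HDX, mask, axis=1)  # subsampled data, shape (N, Q)

# An orthogonal preconditioner can be undone exactly on the dense data.
X_back = (HDX @ H) * D

print(X.shape, HDX.shape, RHDX.shape)
print(np.allclose(X_back, X))
```

In this sketch the dense transform preserves row norms and can be inverted, while the subsampling step discards P - Q entries per sample and cannot be undone exactly.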
