pyemma.coordinates.cluster_mini_batch_kmeans
pyemma.coordinates.cluster_mini_batch_kmeans(data=None, k=100, max_iter=10, batch_size=0.2, metric='euclidean', init_strategy='kmeans++', n_jobs=None, chunksize=None, skip=0, clustercenters=None, **kwargs)
k-means clustering with mini-batch strategy
Mini-batch k-means approximates k-means by updating the cluster centers from a randomly selected subset of the data points in each iteration. It is usually much faster than full k-means, but it will likely deliver a less optimal result.
- Returns
kmeans_mini – Object for mini-batch kmeans clustering. It holds discrete trajectories and cluster center information.
- Return type
a MiniBatchKmeansClustering clustering object
See also
kmeans: for full k-means clustering
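The mini-batch strategy described above can be illustrated with a small pure-Python sketch. This is not PyEMMA's implementation: it uses one common formulation in which each center keeps a running mean of the points assigned to it. The helper names (`squared_distance`, `nearest_center`, `mini_batch_kmeans`), the uniform random initialization (instead of `kmeans++`), and the reading of `batch_size` as a fraction of the data are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda i: squared_distance(point, centers[i]))


def mini_batch_kmeans(points, k, max_iter=10, batch_size=0.2, seed=0):
    """Toy mini-batch k-means (hypothetical helper, not the PyEMMA code).

    `batch_size` is assumed to be the fraction of the data drawn per iteration.
    """
    rng = random.Random(seed)
    # Simplification: uniform random initial centers, not kmeans++.
    centers = [list(p) for p in rng.sample(points, k)]
    counts = [0] * k
    n_batch = max(1, int(batch_size * len(points)))
    for _ in range(max_iter):
        # Only a random subset of the data is looked at in each iteration.
        batch = rng.sample(points, n_batch)
        # Assign the whole batch before updating, so labels use fixed centers.
        labels = [nearest_center(p, centers) for p in batch]
        for p, j in zip(batch, labels):
            counts[j] += 1
            eta = 1.0 / counts[j]  # step size shrinks as a center sees more points
            centers[j] = [(1.0 - eta) * c + eta * x for c, x in zip(centers[j], p)]
    return centers


# Toy data: two well-separated 2D blobs.
rng = random.Random(42)
data = [(rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1)) for _ in range(200)]
data += [(rng.gauss(5.0, 0.1), rng.gauss(5.0, 0.1)) for _ in range(200)]
centers = mini_batch_kmeans(data, k=2, max_iter=10, batch_size=0.2)
```

Each iteration touches only `batch_size * len(points)` points, which is where the speed-up over full k-means comes from; the price is that the centers are estimated from noisy subsets. With PyEMMA itself, the corresponding call would follow the signature documented above, for example `cluster_mini_batch_kmeans(data, k=2, max_iter=10, batch_size=0.2)`.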
class pyemma.coordinates.clustering.kmeans.MiniBatchKmeansClustering(*args, **kwargs)
Mini-batch k-means clustering
Methods
assign([X, stride]): Assigns the given trajectory or list of trajectories to cluster centers by using the discretization defined by this clustering method (usually a Voronoi tessellation).
describe(): Get a descriptive string representation of this class.
dimension(): Output dimension of clustering algorithm (always 1).
estimate(X, **kwargs): Estimates the model given the data X.
fit(X[, y]): Estimates parameters - for compatibility with sklearn.
fit_predict(X[, y]): Performs clustering on X and returns cluster labels.
fit_transform(X[, y]): Fit to data, then transform it.
get_model_params([deep]): Get parameters for this model.
get_output([dimensions, stride, skip, chunk]): Maps all input data of this transformer and returns it as an array or list of arrays.
get_params([deep]): Get parameters for this estimator.
iterator([stride, lag, chunk, …]): Creates an iterator to stream over the (transformed) data.
load(file_name[, model_name]): Loads a previously saved PyEMMA object from disk.
n_chunks(chunksize[, stride, skip]): How many chunks an iterator of this source will output, starting (eg.
n_frames_total([stride, skip]): Returns total number of frames.
number_of_trajectories([stride]): Returns the number of trajectories.
output_type(): By default transformers return single precision floats.
sample_indexes_by_cluster(clusters, nsample): Samples trajectory/time indexes according to the given sequence of states.
save(file_name[, model_name, overwrite, …]): Saves the current state of this object to given file and name.
save_dtrajs([trajfiles, prefix, output_dir, …]): Saves calculated discrete trajectories.
set_model_params(clustercenters)
set_params(**params): Set the parameters of this estimator.
trajectory_length(itraj[, stride, skip]): Returns the length of trajectory of the requested index.
trajectory_lengths([stride, skip]): Returns the length of each trajectory.
transform(X): Maps the input data through the transformer to correspondingly shaped output data array/list.
update_model_params(**params): Update given model parameters if they are set to specific values.
write_to_csv([filename, extension, …]): Write all data to csv with numpy.savetxt.
write_to_hdf5(filename[, group, …]): Writes all data of this Iterable to a given HDF5 file.
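The discretization that assign() is described as performing, mapping each frame to a cluster center via a Voronoi tessellation, amounts to a nearest-center lookup. The sketch below illustrates that idea in plain Python for the Euclidean metric; `assign_to_centers` is a hypothetical helper, not part of the PyEMMA API, and the centers and trajectory are made-up toy values.

```python
def assign_to_centers(traj, centers):
    """Map each frame to the index of its nearest center (Euclidean metric)."""
    dtraj = []
    for frame in traj:
        # Squared distance from this frame to every center.
        dists = [sum((x - c) ** 2 for x, c in zip(frame, center)) for center in centers]
        # The Voronoi cell a frame falls into is that of the closest center.
        dtraj.append(dists.index(min(dists)))
    return dtraj


centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
traj = [(0.1, -0.2), (4.8, 5.1), (0.3, 4.0), (2.0, 2.0)]
dtraj = assign_to_centers(traj, centers)
print(dtraj)  # [0, 1, 2, 0]
```

Every frame is reduced to a single integer state index, which is consistent with dimension() reporting an output dimension of 1.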
Attributes
References