kmeans_mt

Category: MATLAB programming
Development tool: MATLAB
File size: 8 KB
Downloads: 3
Upload date: 2014-09-05 13:10:42
Uploader: tariq
Description: Efficient Kmeans using Multiple Threads

File list:
kmeans_mt\kmeans_mt.cpp (2255, 2014-01-13)
kmeans_mt\kmeans_unix.h (5726, 2014-05-27)
kmeans_mt\kmeans_win.h (5799, 2014-05-27)
kmeans_mt\newdelete.cpp (756, 2014-01-03)
kmeans_mt\newdelete.h (487, 2014-01-03)
license.txt (1500, 2014-09-04)

Efficient Kmeans using Multiple Threads
-------------------------------
by Haw-Shiuan Chang
-------------------------------
Contact samkendi@hotmail.com if you have any questions about this code.
-------------------------------
Last updated date: 2014/9/4
-------------------------------
About:

This code implements the basic kmeans algorithm using Euclidean distance. Its computation speed is optimized using C/C++ and multiple threads. When the number of samples and the number of feature dimensions are large, this code is significantly faster than the one in the Matlab toolbox and than other efficient implementations such as litekmeans (http://www.cad.zju.edu.cn/home/dengcai/Data/code/litekmeans.m).

For example, for data with 17 dimensions and 154401 samples, the following are the times taken by different codes to generate the same result after 100 iterations on a PC with a 3.4GHz Intel i7 CPU:

Matlab toolbox: 10.32 sec
litekmeans: 7.50 sec
This code: 2.92 sec

Using or modifying our source code is permitted for research purposes, but any form of commercial usage is not allowed.

This code was originally written for building visual words in the following publication:

Haw-Shiuan Chang and Yu-Chiang Frank Wang, "Simple-to-Complex Discriminative Clustering for Hierarchical Image Segmentation", ACCV 2014

If you use this code and your research is related to ours, please consider citing our paper.
-------------------------------
Compilation:

You first need to compile this code. Execute the following command after mex has been set up:

mex kmeans_mt.cpp newdelete.cpp;
-------------------------------
Usage:

Execute the following command to run this code:

[IDX,final_k_centers]=kmeans_mt(X,init_k_centers,max_iter,1);

where "X" is the data, "init_k_centers" is the location of the k initial centers, "max_iter" is the maximum number of iterations of the EM algorithm, and the final input argument indicates whether the program shows, at each iteration, the sum of the distances between every sample and its closest center.

The first output ("IDX") is the cluster index of every sample, and the second output ("final_k_centers") is the location of the k final centers.

If you can use the Matlab statistics toolbox, the command outputs the same result as:

opts = statset('Display','iter');
[IDX,final_k_centers]=kmeans(X,[],'Options',opts,'start',init_k_centers);

The detailed formats of "X", "init_k_centers", "IDX" and "final_k_centers" are the same as the ones in the Matlab toolbox. You can find them at http://www.mathworks.com/help/stats/kmeans.html
-------------------------------
Note:

The initialization of kmeans is not the speed bottleneck of the algorithm, so you can use Matlab code to perform this step according to your own needs. The following is a simple example that directly uses samples as initial centers if K clusters are needed:

sample_num=size(X,1);
init_k_centers=X(round(1:sample_num/K:sample_num),:);
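
To illustrate the idea described under "About", the following is a minimal C++ sketch of basic kmeans in which the assignment step is split across several threads. It is not the code in kmeans_mt.cpp, whose internals are not shown here. The names Matrix, assign_range and kmeans_sketch are hypothetical, labels are 0-based (unlike Matlab), and the sketch makes two assumptions of its own: it stops early once no label changes, and an empty cluster keeps its previous center.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <thread>
#include <vector>

// Hypothetical row-major matrix: `rows` samples (or centers), `cols` dimensions.
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;
    const double* row(std::size_t i) const { return &data[i * cols]; }
    double* row(std::size_t i) { return &data[i * cols]; }
};

// Assign samples [begin, end) to their nearest center (squared Euclidean distance).
// Each thread writes a disjoint slice of idx, so no locking is needed.
static void assign_range(const Matrix& X, const Matrix& C,
                         std::size_t begin, std::size_t end, std::vector<int>& idx) {
    for (std::size_t i = begin; i < end; ++i) {
        double best = std::numeric_limits<double>::infinity();
        int best_k = 0;
        for (std::size_t k = 0; k < C.rows; ++k) {
            double dist = 0.0;
            for (std::size_t j = 0; j < X.cols; ++j) {
                const double diff = X.row(i)[j] - C.row(k)[j];
                dist += diff * diff;
            }
            if (dist < best) { best = dist; best_k = static_cast<int>(k); }
        }
        idx[i] = best_k;
    }
}

// Runs at most max_iter iterations; C holds the initial centers on entry and
// the final centers on return. Returns the 0-based cluster index of every sample.
std::vector<int> kmeans_sketch(const Matrix& X, Matrix& C, int max_iter, unsigned num_threads) {
    assert(X.cols == C.cols && C.rows > 0 && num_threads > 0);
    const std::size_t n = X.rows, d = X.cols, k = C.rows;
    std::vector<int> idx(n, -1);
    for (int it = 0; it < max_iter; ++it) {
        const std::vector<int> prev = idx;

        // Assignment step: one contiguous chunk of samples per thread.
        const std::size_t chunk = (n + num_threads - 1) / num_threads;
        std::vector<std::thread> workers;
        for (unsigned t = 0; t < num_threads; ++t) {
            const std::size_t begin = std::min(n, t * chunk);
            const std::size_t end = std::min(n, begin + chunk);
            if (begin < end)
                workers.emplace_back(assign_range, std::cref(X), std::cref(C),
                                     begin, end, std::ref(idx));
        }
        for (auto& w : workers) w.join();
        if (idx == prev) break;  // assumption: stop when no label changes

        // Update step (single-threaded here): each center becomes the mean of its samples.
        std::vector<double> sum(k * d, 0.0);
        std::vector<std::size_t> count(k, 0);
        for (std::size_t i = 0; i < n; ++i) {
            ++count[idx[i]];
            for (std::size_t j = 0; j < d; ++j) sum[idx[i] * d + j] += X.row(i)[j];
        }
        for (std::size_t c = 0; c < k; ++c) {
            if (count[c] == 0) continue;  // assumption: empty cluster keeps its old center
            for (std::size_t j = 0; j < d; ++j) C.row(c)[j] = sum[c * d + j] / count[c];
        }
    }
    return idx;
}
```

The assignment step is the natural place to use threads because every sample's nearest center can be computed independently. On most toolchains a program using std::thread must be linked with thread support (for example -pthread with GCC).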
