
Category: Big Data
Development tool: C++
File size: 43 KB
Downloads: 0
Upload date: 2018-11-09 14:50:59
Uploader: Mignvk
Description: CURE (Clustering Using Representatives) is an efficient clustering algorithm for large databases, based on partitioning.

File list:
alloc.c (3169, 2002-01-21)
alloc.h (1890, 2002-01-21)
cluster-no-noise.c (27460, 2002-01-21)
cluster-no-noise.h (627, 2002-01-21)
cluster-orig.c (18421, 2002-01-21)
cluster-sim.c (19675, 2002-01-21)
cluster.c (24935, 2002-01-21)
cluster.h (610, 2002-01-21)
error.c (447, 2002-01-21)
error.h (1585, 2002-01-21)
ex1.dat (3141, 1999-01-16)
genplots-noise.c (1912, 2002-01-21)
genplots.c (1874, 2002-01-21)
graph.c (10763, 2002-01-21)
kadd-blank.c (709, 2002-01-21)
Makefile (4220, 2002-01-21)
pca.c (10929, 2002-01-21)
testm.c (909, 2002-01-21)
tmp.c (1861, 2002-01-21)

0. I have modified an existing hierarchical clustering code and followed the CURE paper as closely as possible. Please let me know (han@cs.umn.edu) if you find any bugs in the code.

1. The first part of the input file (ex1.dat) is shown below. Each row corresponds to the (x,y) coordinates of a data point.

    40.678 63.***31
    41.4301 ***.8066
    41.3468 63.8069
    42.3465 63.7236
    41.93 62.9738
    40.7637 62.8905

2. You compile and run the code as follows:

    % cluster -k 2 -a 0.1 -r 10 ex1.dat

   Note that the -k option sets the number of clusters, -a sets the alpha parameter of CURE, and -r sets the number of representative points per cluster. The above run will generate 2 clusters with alpha 0.1 and 10 representative points per cluster.

3. This will generate an ex1.dat-partition file like the following (note that this is not output from a real run):

    0 0 1 1 1 0

   The first 2 data points and the last data point belong to cluster 0, and the remaining 3 data points belong to cluster 1.
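To make the role of the alpha parameter more concrete, here is a minimal sketch of the shrinking step as the CURE paper describes it: each representative point of a cluster is moved a fraction alpha of the way toward the cluster centroid. This is an illustration only, not code taken from cluster.c. The names point2d, centroid, and shrink_representatives are hypothetical, and it assumes that -a follows the paper's convention (alpha = 0 leaves the representatives in place, alpha = 1 collapses them onto the centroid).

```c
#include <stddef.h>

/* A 2-D point, matching the (x, y) rows of ex1.dat.
 * Hypothetical type, not taken from the archive's sources. */
typedef struct {
    double x;
    double y;
} point2d;

/* Arithmetic mean of n points; the caller must pass n > 0. */
point2d centroid(const point2d *pts, size_t n)
{
    point2d c = {0.0, 0.0};
    for (size_t i = 0; i < n; i++) {
        c.x += pts[i].x;
        c.y += pts[i].y;
    }
    c.x /= (double)n;
    c.y /= (double)n;
    return c;
}

/* Move each of the r representatives a fraction alpha of the way
 * toward the centroid c, in place:
 *     rep = rep + alpha * (c - rep)
 * Assumes the paper's convention for alpha (see the text above). */
void shrink_representatives(point2d *reps, size_t r, point2d c, double alpha)
{
    for (size_t i = 0; i < r; i++) {
        reps[i].x += alpha * (c.x - reps[i].x);
        reps[i].y += alpha * (c.y - reps[i].y);
    }
}
```

Under this assumption, the -a 0.1 setting in the run above would pull every representative 10% of the distance toward its cluster's centroid. How the representatives are chosen (the -r option) is not covered by this sketch.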
