jssxrance

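Category: Data Structures
Development tool: MultiPlatform
File size: 1194KB
Downloads: 1
Upload date: 2018-01-07 17:55:09
Uploader: permqe
Description: An algorithm for mining frequent closed sequences; one of the better-known early sequence mining algorithms.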

File list:
www\finished.html (564, 2005-12-07)
www\index.html (2574, 2005-10-26)
www\running.html (591, 2005-11-03)
src\SeqTree\ClosedSeqTree.h (4218, 2005-10-26)
src\SeqTree\ClosedTree.h (2753, 2005-10-26)
src\Global.h (2269, 2005-10-26)
src\SeqTree\MaxSeqTree.h (4837, 2005-10-26)
src\MemMap.h (622, 2005-10-26)
src\ProjDB.h (3272, 2005-10-26)
src\SeqTree\SeqTree.h (3275, 2005-10-26)
src\1LPrefixSpan.cpp (11776, 2005-10-26)
src\SeqTree\ClosedSeqTree.cpp (34597, 2005-10-26)
src\SeqTree\ClosedTree.cpp (15484, 2005-10-26)
src\Global.cpp (3144, 2005-10-26)
src\LinuxMemMap.cpp (2569, 2005-10-26)
src\SeqTree\MaxSeqTree.cpp (24640, 2005-10-26)
src\NTMemMap.cpp (3347, 2005-10-26)
src\ProjDB.cpp (28127, 2005-10-26)
src\SeqTree\SeqTree.cpp (20610, 2005-10-26)
lbin\clospan.exe (32768, 2005-10-26)
lbin\clospan_debug.exe (147581, 2005-10-26)
lbin\seq_data_generator.exe (98304, 2005-11-03)
lbin\stlport_vc6_stldebug46.dll (831552, 2005-10-26)
lbin\stlport_vc646.dll (827392, 2005-10-26)
lbin\clospan (449330, 2005-10-26)
Makefile (474, 2005-10-26)
lbin\seq_data_generator (2775166, 2005-10-28)
msvc_proj\clospan.dsp (5370, 2005-10-26)
msvc_proj\clospan.dsw (508, 2005-10-26)
src\SeqTree (0, 2017-12-01)
lbin (0, 2017-12-01)
msvc_proj (0, 2017-12-01)
src (0, 2017-12-01)
www (0, 2017-12-01)

CloSpan: Mining Closed Sequential Patterns

Author: Xifeng Yan, University of Illinois at Urbana-Champaign

The program is built upon the PrefixSpan source code:
"PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth"
Jian Pei, Jiawei Han, Behzad Mortazavi-Asl, Helen Pinto, Qiming Chen, Umeshwar Dayal, Mei-Chun Hsu (ICDE'2001)

Contact: xyan@cs.uiuc.edu

Reference: X. Yan, J. Han, R. Afshar, "CloSpan: Mining Closed Sequential Patterns in Large Databases", Proc. 2003 SIAM Int. Conf. Data Mining (SDM'03), 166 - 177, 2003.

NOTE: For compiling under VC, please install stlport first.

How-To:

  CloSpan filename min_sup num_of_labels

Parameters:

  (1) filename, your binary data file
  (2) min_sup, the minimum frequency of patterns, given as a fraction of the sequences in the dataset
  (3) num_of_labels, the number of distinct item labels

Example:

  CloSpan D10N1B.data 0.1 1000

This mines the closed frequent sequences from "D10N1B.data", each of which must appear in at least 10% of the sequences in the dataset. 1000 means there are 1000 different symbols in this dataset.

Input Format:

1. The input is a set of sequences; each sequence has the following format:

  <(item_11, item_12, ..., item_1n)(item_21, item_22, ..., item_2m)...>
   ------------------------------- --------------------------------
            transaction 1                   transaction 2         ......

Example:

  <(ab)(c)(d)>
  <(e)(acfh)>
  ...

The input is stored in a binary file. A 4-byte integer "-1" terminates each transaction in a sequence, and a 4-byte integer "-2" terminates each sequence in the dataset. Each item is encoded as a 4-byte integer. For example,

  <(ab)(c)(d)><(e)(acfh)>

is stored as

  ab-1c-1d-1-2e-1acfh-1-2

where each symbol is a 4-byte integer and all of them are concatenated together.

Output:

Program status during execution and the final results (such as timing) are printed to stdout (console). The discovered patterns are stored in a plain-text file named "ClosedPatterns". The first column in the output file shows the discovered patterns. The second column is the number of times that a pattern appears in the dataset.
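The binary input layout described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the CloSpan distribution: the function names `encode_sequences` and `decode_sequences` are hypothetical, the mapping of items to integers (a=0, b=1, ...) is chosen only for the example, and little-endian byte order is an assumption, since the README does not state which byte order the program expects.

```python
import struct

TRANSACTION_END = -1  # 4-byte marker written after each transaction
SEQUENCE_END = -2     # 4-byte marker written after each sequence


def encode_sequences(sequences, byte_order="<"):
    """Flatten sequences of transactions into a stream of 4-byte integers.

    `sequences` is a list of sequences; each sequence is a list of
    transactions; each transaction is a list of integer item ids.
    `byte_order` is a struct prefix; "<" (little-endian) is an assumption.
    """
    values = []
    for sequence in sequences:
        for transaction in sequence:
            values.extend(transaction)
            values.append(TRANSACTION_END)
        values.append(SEQUENCE_END)
    return struct.pack(f"{byte_order}{len(values)}i", *values)


def decode_sequences(data, byte_order="<"):
    """Inverse of encode_sequences: rebuild the nested lists from bytes."""
    count = len(data) // 4
    values = struct.unpack(f"{byte_order}{count}i", data[: count * 4])
    sequences, sequence, transaction = [], [], []
    for value in values:
        if value == TRANSACTION_END:
            sequence.append(transaction)
            transaction = []
        elif value == SEQUENCE_END:
            sequences.append(sequence)
            sequence = []
        else:
            transaction.append(value)
    return sequences


# <(ab)(c)(d)><(e)(acfh)> with the hypothetical mapping a=0, b=1, c=2, ...
database = [[[0, 1], [2], [3]], [[4], [0, 2, 5, 7]]]
blob = encode_sequences(database)
```

Writing `blob` to a file opened in binary mode would give a file in the layout the README describes; whether the item ids must start at 0 or 1, or stay below num_of_labels, is not stated in the README and should be checked against the source.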
