WuManber

所属分类:数值算法/人工智能
开发工具:Visual C++
文件大小:5KB
下载次数:27
上传日期:2010-10-17 17:47:27
上 传 者wy108
说明:  Wumanber算法的实现,多模式匹配的最经典的算法有AC算法和Wu-Manber算法,Wu-Manber算法是在BM算法的基础上借助哈希来实现的。
(Wumanber algorithm, multi-pattern matching algorithm of the most classical AC algorithm and the Wu-Manber algorithm, Wu-Manber algorithm is based on the BM algorithm is implemented using hash.)

文件列表:
WuManber.cpp (6522, 2009-12-21)
Main.cpp (833, 2009-12-21)
WuManber.h (2122, 2009-12-21)

http://www.oneunified.net/blog/2008/03/23/ C++ Implementation of Wu Manber's Multi-Pattern Search Algorithm I'm finding that this algorithm is useful in a number of situations. Out in the real world, it is found in Network Intrusion Scanners, grep engines, as well as text processing. For me, I'm working on an implementation for a RADIUS server for authorizing and logging calls for a voice over ip service. Some vendor specific attributes don't have a unique numeric identifier, the identifier is simply a string. Being able to do a fast match would be nice in this situation. Currently I'm using C++'s unordered_map to facilitate identification of the string. The Wu Manber algorithm may be a suitable alternative. On another front, I subscribe to a real time news service. Passing this news service through a Multi-Pattern Search Algorithm would make for a processor efficient method of filtering for news items in which I'm interested. Wu-Manber stand on the shoulders of their Multi Pattern Search predecessors: Aho and Corasick (with a linear time scanner based upon an automata approach), Commentz-Walter (who combined Aho-Corasick with the Boyer-Moore string search algorithm), and Baeza-Yates (with a slightly different combination of Aho-Corasick and Boyer-Moore-Horspool). For those who are interested, here is a Visual Studio implmentation of Wu Manber's algorithm. It should be readily transportable to other platforms with little or no change, other the the _tmain construct. The implementation relies on C++ Standard Template Library for various containers. A table driven character comparison approach is used to make it easy to choose between case sensitive and case insensitive matches. WuManber.h: header file WuManber.cpp: class implementation Main.cpp: test enclosure The next step would be turn it into a template to make it useful on various types of alphabets. It all needs to be redone a bit in order to accept a 'functor' or a 'delegate' so that when a match is found, an appropriate routine can be invoked. There is another paper out there called "An Improved Wu-Manber Multiple Patterns Matching Algorithm" that purports to be faster. The charts in the paper say it is a bit faster. I'm wondering if it is 'faster enough' to be of value implementing.

近期下载者

相关文件


收藏者