aspseek

Category: Search engine
Development tool: Visual C++
File size: 1130KB
Downloads: 204
Upload date: 2006-04-11 16:05:52
Uploader: chrismas
Description: ASPSeek is an Internet search engine written in C++ using the STL library. It consists of an indexing robot, a search daemon, and a search front end (CGI or Apache module). It can index several million URLs and search them for given phrases and words, with wildcard and Boolean search support. Search results can be restricted to a given time period, site, or Web space (a set of sites), and sorted by relevance (using some rather clever techniques) or by date. ASPSeek works with many languages and encodings, including multi-byte languages such as Chinese. It is optimized for multiple sites (multithreaded indexing, asynchronous DNS lookups, grouping of results by site, Web spaces, and so on), but it can also be used to search a single site. Other features include stopword and ispell support, charset and language guessing, HTML templates for search results, excerpts, and query-word highlighting. Detailed documentation is included.

File list:
aspseek\aspseek\etc\stopwords\catalan (909, 2002-07-01)
aspseek\aspseek\etc\stopwords\czech (1069, 2002-03-10)
aspseek\aspseek\etc\stopwords\danish (678, 2002-03-10)
aspseek\aspseek\etc\stopwords\dutch (378, 2002-05-14)
aspseek\aspseek\etc\stopwords\english (266, 2002-03-10)
aspseek\aspseek\etc\stopwords\french (895, 2002-03-10)
aspseek\aspseek\etc\stopwords\german (1062, 2002-07-01)
aspseek\aspseek\etc\stopwords\hungarian (303, 2002-07-01)
aspseek\aspseek\etc\stopwords\italian (1363, 2002-07-01)
aspseek\aspseek\etc\stopwords\italian-small (568, 2002-05-14)
aspseek\aspseek\etc\stopwords\norwegian (708, 2002-03-10)
aspseek\aspseek\etc\stopwords\polish (819, 2002-03-10)
aspseek\aspseek\etc\stopwords\portuguese (1028, 2002-03-10)
aspseek\aspseek\etc\stopwords\russian (574, 2002-03-10)
aspseek\aspseek\etc\stopwords\slovak (1083, 2002-03-10)
aspseek\aspseek\etc\stopwords\spanish (1562, 2002-03-10)
aspseek\aspseek\etc\stopwords\turkish (851, 2002-07-01)
aspseek\aspseek\etc\stopwords\ukrainian (290, 2002-03-10)
aspseek\aspseek\etc\stopwords\czech.iso88592 (1185, 2002-03-10)
aspseek\aspseek\etc\stopwords (0, 2006-02-28)
aspseek\aspseek\etc\tables\big5.txt (262415, 2001-05-16)
aspseek\aspseek\etc\tables\chinese.txt (822484, 2001-05-16)
aspseek\aspseek\etc\tables\cp852.txt (1758, 2001-05-16)
aspseek\aspseek\etc\tables\gb2312.txt (144553, 2001-05-16)
aspseek\aspseek\etc\tables\iso8859-1.txt (1627, 2001-05-16)
aspseek\aspseek\etc\tables\iso8859-10.txt (2307, 2001-05-16)
aspseek\aspseek\etc\tables\iso8859-13.txt (1706, 2001-05-16)
aspseek\aspseek\etc\tables\iso8859-14.txt (2297, 2001-05-16)
aspseek\aspseek\etc\tables\iso8859-15.txt (1692, 2001-05-16)
aspseek\aspseek\etc\tables\iso8859-2.txt (2068, 2001-05-16)
aspseek\aspseek\etc\tables\iso8859-3.txt (1896, 2001-05-16)
aspseek\aspseek\etc\tables\iso8859-4.txt (2092, 2001-05-16)
aspseek\aspseek\etc\tables\iso8859-5.txt (2003, 2001-05-16)
aspseek\aspseek\etc\tables\iso8859-6.txt (1596, 2001-05-16)
aspseek\aspseek\etc\tables\iso8859-7.txt (1659, 2001-05-16)
aspseek\aspseek\etc\tables\iso8859-8.txt (1153, 2001-05-16)
aspseek\aspseek\etc\tables\iso8859-9.txt (1662, 2001-05-16)
aspseek\aspseek\etc\tables\keybcs2.txt (1163, 2001-05-16)
... ...

ASPseek v.1.2
http://www.aspseek.org/
Advanced Internet search engine
Copyright (C) 2000, 2001, 2002 by SWsoft

ASPseek is a full-featured medium-to-large scale Internet search engine.
It consists of an indexing robot, a search daemon and a search front-ends
(CGI or Apache module). These programs are written in C++ using STL library.
ASPseek uses mix of SQL database and binary files for data storage.

ASPseek features
----------------
To learn about ASPseek features, please read aspseek(7) man page.
Here is just a brief list:

* Ability to index and search through several millions of documents
* HTTP, HTTP proxy, FTP (via proxy) protocols
* HTTP basic authorization
* HTTPS protocol
* text/html and text/plain documents
* Other document types support via external converters
* Architecture optimized for multiple sites
* Multithreaded
* Async DNS resolver
* Stopwords
* Unicode support to deal with many character sets (including CJK) at once
* Charset guesser (optional)
* Language guesser
* Robot exclusion standard (robots.txt) support
* Settings to control network bandwidth usage and Web servers load
* Real-time asynchronous indexing
* Very good relevancy of results
* Sorting results by relevance or by date
* Smart results cache
* Advanced search capabilities
* Ispell support
* Excerpts
* Grouping results by site
* Clones (mirrored documents) detection
* Spaces and subsets
* Query words highlighting in results
* Cached compressed local copy of every indexed document
* HTML templates for easy-to-customize search results

How to use it
-------------
Please start with reading INSTALL file there you can find detailed
instructions about installation, run-time configuration and usage of ASPseek.

Contribution
------------
As ASPseek is free software project, any type of contribution is very
welcome. Please send patches, bug reports, ideas etc. to developers.

You can also subscribe to ASPseek users mailing list to ask questions,
share ideas, and communicate in general with other users and developers.
To subscribe, please send mail to majordomo@lists.asplinux.ru with the line
"subscribe aseek-users" in message body.

Disclaimer (see COPYING for details)
------------------------------------
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
