larbin-2.6.3

Category: Search engine
Development tool: Visual C++
File size: 164KB
Downloads: 45
Upload date: 2009-12-21 10:28:42
Uploader: zfnh2002
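Description: larbin is an open-source web crawler (web spider) developed independently by Sébastien Ailleret, a young French developer. Its purpose is to follow the URLs found on pages in order to extend the crawl, ultimately providing a broad data source for search engines. larbin is only a crawler: it fetches web pages, and parsing them is left to the user. Storing results in a database and building an index are likewise not provided. larbin was designed to be simple yet highly configurable; a simple larbin crawler can fetch 5 million pages per day, roughly 58 pages per second on average. With larbin you can easily collect all the links of a single website, or even mirror a site. You can also use it to build URL lists, for example by retrieving the URLs from all fetched pages and then collecting the links to XML files or to MP3 files. A customized larbin can also serve as an information source for a search engine.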
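The description says larbin follows the URLs found on pages to extend the crawl and can collect all links of a single site. The sketch below illustrates that general idea only, as a breadth-first traversal that stays on the seed's host. It is not taken from larbin's source: `LinkGraph`, `hostOf`, and `crawlSite` are hypothetical names, and an in-memory map stands in for real page fetching.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the web: maps each URL to the links found on that page.
using LinkGraph = std::map<std::string, std::vector<std::string>>;

// Returns the host part of an absolute URL ("http://a.example/x" -> "a.example").
// Simplified on purpose: ignores ports, user info, and relative URLs.
std::string hostOf(const std::string& url) {
    const std::string sep = "://";
    std::size_t start = url.find(sep);
    start = (start == std::string::npos) ? 0 : start + sep.size();
    std::size_t end = url.find('/', start);
    return url.substr(start, end == std::string::npos ? std::string::npos : end - start);
}

// Breadth-first crawl from `seed`, staying on the seed's host.
// Returns every same-host URL reached, in the order visited.
std::vector<std::string> crawlSite(const LinkGraph& web, const std::string& seed) {
    const std::string host = hostOf(seed);
    std::vector<std::string> visited;
    std::set<std::string> seen{seed};   // URLs already queued, to avoid revisits
    std::queue<std::string> frontier;   // URLs waiting to be "fetched"
    frontier.push(seed);

    while (!frontier.empty()) {
        std::string url = frontier.front();
        frontier.pop();
        visited.push_back(url);

        auto page = web.find(url);
        if (page == web.end()) continue;  // no page recorded for this URL
        for (const std::string& link : page->second) {
            // Skip off-site links and URLs that were already queued.
            if (hostOf(link) == host && seen.insert(link).second) {
                frontier.push(link);
            }
        }
    }
    return visited;
}
```

Calling `crawlSite(web, seed)` on a small hand-built `LinkGraph` returns the same-host URLs in visit order. A real crawler would additionally need fetching, robots.txt handling, rate limiting, and URL normalization, none of which are shown here.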

File list:
larbin-2.6.3\adns\adns.h (33067, 2002-01-02)
larbin-2.6.3\adns\check.c (5395, 2002-04-15)
larbin-2.6.3\adns\config.h.in (2371, 2002-01-02)
larbin-2.6.3\adns\configure (61125, 2002-01-02)
larbin-2.6.3\adns\dlist.h (1966, 2002-01-02)
larbin-2.6.3\adns\event.c (20523, 2002-01-02)
larbin-2.6.3\adns\general.c (10109, 2002-01-02)
larbin-2.6.3\adns\install-sh (5585, 2002-01-02)
larbin-2.6.3\adns\internal.h (26031, 2002-01-02)
larbin-2.6.3\adns\Makefile (368, 2002-04-15)
larbin-2.6.3\adns\parse.c (7339, 2002-01-02)
larbin-2.6.3\adns\poll.c (3542, 2002-01-02)
larbin-2.6.3\adns\query.c (14310, 2002-01-02)
larbin-2.6.3\adns\reply.c (11826, 2002-01-02)
larbin-2.6.3\adns\setup.c (16944, 2002-01-02)
larbin-2.6.3\adns\transmit.c (7056, 2002-01-02)
larbin-2.6.3\adns\tvarith.h (1396, 2002-01-02)
larbin-2.6.3\adns\types.c (27241, 2002-01-02)
larbin-2.6.3\configure (1054, 2002-05-19)
larbin-2.6.3\COPYING (18007, 2002-01-02)
larbin-2.6.3\CREDITS (1178, 2002-03-05)
larbin-2.6.3\doc\custom-eng.html (10312, 2003-07-09)
larbin-2.6.3\doc\download.html (8696, 2003-07-10)
larbin-2.6.3\doc\index-eng.html (3220, 2002-01-23)
larbin-2.6.3\doc\index.html (3702, 2002-01-23)
larbin-2.6.3\doc\l-en.jpg (1249, 2002-01-02)
larbin-2.6.3\doc\l-fr.jpg (503, 2002-01-02)
larbin-2.6.3\doc\use-eng.html (3149, 2002-03-04)
larbin-2.6.3\larbin.conf (1669, 2003-07-09)
larbin-2.6.3\Makefile (469, 2002-04-15)
larbin-2.6.3\options.h (3223, 2003-07-09)
larbin-2.6.3\src\fetch\checker.cc (1655, 2002-01-07)
larbin-2.6.3\src\fetch\checker.h (409, 2002-01-02)
larbin-2.6.3\src\fetch\defaultspecbuf.cc (304, 2002-01-02)
larbin-2.6.3\src\fetch\dynamicspecbuf.cc (1341, 2002-01-31)
larbin-2.6.3\src\fetch\dynamicspecbuf.h (434, 2002-01-02)
larbin-2.6.3\src\fetch\fetchOpen.cc (1577, 2002-01-02)
larbin-2.6.3\src\fetch\fetchOpen.h (339, 2002-01-02)
larbin-2.6.3\src\fetch\fetchPipe.cc (5079, 2002-04-04)
... ...

******************************************************************
Larbin web crawler by Sebastien Ailleret
******************************************************************

******************
Table of contents :
- Compiling
- Configuring
- Running
- Supported platforms
- Contact

***********
Compiling :
have a look at options.h to choose the options you need
./configure
gmake

*************
Configuring :
see larbin.conf. Please be sure to specify your email address
There is also some documentation in the doc directory in html format

*********
Running :
Be sure you did the configuration
./larbin
to see how it works, http://localhost:8081/

*********************
Supported platforms :
Larbin has mainly been developed under Linux
I've tested larbin with success on Linux and FreeBSD.
It probably won't compile right out of the box on any other platform,
but I'll work on it for future versions.
Please report success or failure on any platform.

*********
Contact :
mailto:sebastien@ailleret.com
http://www.ailleret.com/