yioop-v0.80

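Category: Web development
Development tools: PHP-PERL
File size: 735KB
Downloads: 6
Upload date: 2011-12-08 11:30:37
Uploader: qw833
Description: Yioop search engine source code. Yioop is a search engine written in PHP. It can be used for general-purpose Web search, or to provide URL search and indexed search over many document types, including HTML, PDF, DOC, PPT, RTF, RSS, XML, SVG, PNG, JPG, BMP, GIF, and sitemaps.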
(This version supports starting, stopping, and viewing log files of the queue server and fetchers from a Web interface. One can now inject new URLs into an active crawl via a Web interface. This version of Yioop! supports re-crawling of pages after a fixed number of days. Also, the file extensions that are crawled, the number of bytes downloaded per page, and how Yioop! weighs different page components can now all be controlled through a Web interface rather than just the config.php file. Improvements have also been made to how HTML Processor extracts text to index.)

File list:
yioop-v0.80 (0, 2011-12-07)
yioop-v0.80\INSTALL (2884, 2011-12-07)
yioop-v0.80\LICENSE (36068, 2011-12-07)
yioop-v0.80\bin (0, 2011-12-07)
yioop-v0.80\bin\arc_tool.php (15906, 2011-12-07)
yioop-v0.80\bin\fetcher.php (66793, 2011-12-07)
yioop-v0.80\bin\query_tool.php (5190, 2011-12-07)
yioop-v0.80\bin\queue_server.php (66737, 2011-12-07)
yioop-v0.80\bot.php (2209, 2011-12-07)
yioop-v0.80\configs (0, 2011-12-07)
yioop-v0.80\configs\config.php (12372, 2011-12-07)
yioop-v0.80\configs\createdb.php (11713, 2011-12-07)
yioop-v0.80\configs\default_crawl.ini (4702, 2011-12-07)
yioop-v0.80\controllers (0, 2011-12-07)
yioop-v0.80\controllers\admin_controller.php (94133, 2011-12-07)
yioop-v0.80\controllers\archive_controller.php (3514, 2011-12-07)
yioop-v0.80\controllers\controller.php (9495, 2011-12-07)
yioop-v0.80\controllers\fetch_controller.php (8889, 2011-12-07)
yioop-v0.80\controllers\machine_controller.php (5265, 2011-12-07)
yioop-v0.80\controllers\search_controller.php (28500, 2011-12-07)
yioop-v0.80\controllers\settings_controller.php (5039, 2011-12-07)
yioop-v0.80\css (0, 2011-12-07)
yioop-v0.80\css\search.css (12301, 2011-12-07)
yioop-v0.80\data (0, 2011-12-07)
yioop-v0.80\data\default.db (94208, 2011-12-07)
yioop-v0.80\examples (0, 2011-12-07)
yioop-v0.80\examples\Archive1317414322.zip (11672, 2011-12-07)
yioop-v0.80\examples\IndexData1317414322.zip (122487, 2011-12-07)
yioop-v0.80\examples\search_api.php (7943, 2011-12-07)
yioop-v0.80\favicon.ico (1150, 2011-12-07)
yioop-v0.80\index.php (4526, 2011-12-07)
yioop-v0.80\lib (0, 2011-12-07)
yioop-v0.80\lib\archive_bundle_iterators (0, 2011-12-07)
yioop-v0.80\lib\archive_bundle_iterators\arc_archive_bundle_iterator.php (7307, 2011-12-07)
yioop-v0.80\lib\archive_bundle_iterators\archive_bundle_iterator.php (2566, 2011-12-07)
yioop-v0.80\lib\archive_bundle_iterators\mediawiki_bundle_iterator.php (20642, 2011-12-07)
yioop-v0.80\lib\archive_bundle_iterators\odp_rdf_bundle_iterator.php (16187, 2011-12-07)
yioop-v0.80\lib\archive_bundle_iterators\web_archive_bundle_iterator.php (7052, 2011-12-07)
... ...

SeekQuarry/Yioop --
Open Source Pure PHP Search Engine, Crawler, and Indexer

Copyright (C) 2009 - 2012 Chris Pollett chris@pollett.org

http://www.seekquarry.com/

LICENSE:

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.

Summary
-------
The Yioop search engine consists of five main scripts:

fetcher.php - used to download batches of urls provided by the
    queue_server.
queue_server.php - maintains a queue of urls that are going to be
    scheduled to be seen. It also keeps track of what has been seen
    and robots.txt info. Its last responsibility is to create the
    index_archive that is used by the search front end.
arc_tool.php - an auxiliary script used to view the contents of a
    web or index archive from the command line.
query_tool.php - an auxiliary script used to query an index from
    the command line.
index.php - a search engine web page. It is also used to handle
    message passing between the fetchers (multiple machines can act
    as fetchers) and the queue_server.

Download
--------
You can download the SeekQuarry search engine from
http://www.seekquarry.com/

Requirements
------------
The Yioop search engine requires Apache and PHP. It was developed
under Apache 2.2, PHP 5.3, and the sqlite3 built into PHP.

Credits
-------
The source code is mainly due to Chris Pollett. Other contributors
include: Priya Gangaraju, Nakul Natu, Vijaya Pamidi, Vijeth Patil,
Tarun Pepira.

Several people helped with localization: My wife, Mary Pollett,
Jonathan Ben-David, Sujata Dongre, Youn Kim, Chao-Hsin Shih and
Sugi Widjaja.

Thanks to Ravi Dhillon for finding and helping with the fixes for
Issue 15 and Commit 632e46.

Installation
------------
Please see the INSTALL file.

Documentation and Support
-------------------------
Please check out seekquarry.com
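
The split of work between queue_server.php and fetcher.php described in the Summary can be pictured with a small sketch. This is not Yioop's code: it is written in Python rather than PHP, and the names QueueServer, fetch_batch, crawl, and fake_web are hypothetical, chosen only for illustration. It assumes an in-memory "web" (a dict from URL to outgoing links) in place of real downloads, and it leaves out robots.txt handling, index creation, and the message passing through index.php.

```python
from collections import deque


class QueueServer:
    """Toy stand-in for the queue_server role: holds URLs waiting to be
    crawled and remembers which URLs have already been seen."""

    def __init__(self, seed_urls):
        self.queue = deque()
        self.seen = set()
        self.add_urls(seed_urls)

    def add_urls(self, urls):
        # Schedule only URLs that have not been seen before.
        for url in urls:
            if url not in self.seen:
                self.seen.add(url)
                self.queue.append(url)

    def next_batch(self, size):
        # Hand out up to `size` URLs for one fetcher to download.
        batch = []
        while self.queue and len(batch) < size:
            batch.append(self.queue.popleft())
        return batch


def fetch_batch(batch, fake_web):
    """Toy stand-in for the fetcher role: look each URL up in an
    in-memory dict and return every link found on those pages."""
    found = []
    for url in batch:
        found.extend(fake_web.get(url, []))
    return found


def crawl(seed_urls, fake_web, batch_size=2):
    """Alternate between the two roles until the queue is empty and
    return the URLs in the order they were handed out."""
    server = QueueServer(seed_urls)
    order = []
    while True:
        batch = server.next_batch(batch_size)
        if not batch:
            break
        order.extend(batch)
        server.add_urls(fetch_batch(batch, fake_web))
    return order
```

Calling crawl(["a"], fake_web) with a small dict of pages visits each reachable URL once, because the seen set filters out links that were already scheduled. In a real deployment several fetchers on different machines would request batches concurrently, which this single-process sketch does not attempt to model.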
