Java Web Crawler

Category: Data Mining / Data Warehousing
Development tool: Java
File size: 14255 KB
Downloads: 5
Upload date: 2017-09-21 10:02:44
Uploader: 孤独的老张
Description: A Java crawler framework (kernel) that requires no configuration and is easy to extend through secondary development. It provides a compact API, so a capable crawler can be implemented with only a small amount of code.
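
To give a feel for the kind of loop a crawler kernel like this wraps, here is a minimal sketch of a round-based breadth-first crawl. It is not the framework's actual API: the class MiniCrawler, the method crawl, and the in-memory linkGraph (which stands in for fetching a page and extracting its links) are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is NOT the API of the packaged framework.
class MiniCrawler {

    /**
     * Visits pages breadth-first, one "round" per level of links.
     * linkGraph is an assumed stand-in for real HTTP fetching and link extraction:
     * it maps each URL to the URLs found on that page.
     */
    static List<String> crawl(Map<String, List<String>> linkGraph, String seed, int rounds) {
        Set<String> seen = new HashSet<>();        // URLs already scheduled, to avoid revisits
        List<String> visited = new ArrayList<>();  // URLs in the order they were visited
        List<String> frontier = new ArrayList<>(); // URLs to visit in the current round
        frontier.add(seed);
        seen.add(seed);

        for (int round = 0; round < rounds && !frontier.isEmpty(); round++) {
            List<String> next = new ArrayList<>();
            for (String url : frontier) {
                visited.add(url);
                // Schedule every newly discovered link for the next round.
                for (String link : linkGraph.getOrDefault(url, Collections.emptyList())) {
                    if (seen.add(link)) {
                        next.add(link);
                    }
                }
            }
            frontier = next;
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, List<String>> linkGraph = new HashMap<>();
        linkGraph.put("a", Arrays.asList("b", "c"));
        linkGraph.put("b", Arrays.asList("c", "d"));
        linkGraph.put("c", Arrays.asList("a"));

        System.out.println(crawl(linkGraph, "a", 2)); // prints [a, b, c]
    }
}
```

A real crawler would replace the map lookup with an HTTP request and an HTML link extractor, and would typically persist the seen set and the frontier so that a crawl can be resumed.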

File list:
基于 Java 的开源网络爬虫框架.htm (279578, 2017-09-21)
WebCollector (0, 2017-06-03)
WebCollector\LICENSE.txt (35141, 2017-06-03)
WebCollector\NewsCrawler.java (2333, 2017-06-03)
WebCollector\WebCollector-JRuby (0, 2017-06-03)
WebCollector\WebCollector-JRuby\lib (0, 2017-06-03)
WebCollector\WebCollector-JRuby\lib\webcollector.rb (790, 2017-06-03)
WebCollector\WebCollector-JRuby\webcollector-0.1.0.gem (7213056, 2017-06-03)
WebCollector\WebCollector-JRuby\webcollector.gemspec (409, 2017-06-03)
WebCollector\WebCollector (0, 2017-06-03)
WebCollector\WebCollector\CODE_COVERAGE.md (300, 2017-06-03)
WebCollector\WebCollector\WebCollector.iml (6334, 2017-06-03)
WebCollector\WebCollector\pom.xml (11549, 2017-06-03)
WebCollector\WebCollector\src (0, 2017-06-03)
WebCollector\WebCollector\src\main (0, 2017-06-03)
WebCollector\WebCollector\src\main\java (0, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn (0, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu (0, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut (0, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic (0, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\contentextractor (0, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\contentextractor\ContentExtractor.java (17531, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\contentextractor\News.java (2152, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\webcollector (0, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\webcollector\crawldb (0, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\webcollector\crawldb\DBManager.java (2439, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\webcollector\crawldb\Generator.java (1186, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\webcollector\crawldb\Injector.java (969, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\webcollector\crawldb\SegmentWriter.java (1316, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\webcollector\crawler (0, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\webcollector\crawler\AutoParseCrawler.java (4739, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\webcollector\crawler\Crawler.java (11384, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\webcollector\example (0, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\webcollector\example\DemoBingCrawler.java (6434, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\webcollector\example\DemoDepthCrawler.java (3029, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\webcollector\example\DemoHashSetNextFilter.java (3948, 2017-06-03)
WebCollector\WebCollector\src\main\java\cn\edu\hfut\dmic\webcollector\example\DemoMetaCrawler.java (5610, 2017-06-03)
... ...