arana
Category: Data collection / web crawlers
Development tool: Ruby
File size: 14 KB
Upload date: 2011-06-08 22:33:21
Uploader: sh-1993
Description: Web crawler that looks for news pages on El País
File list:
Gemfile (77, 2011-06-09)
TODO.txt (181, 2011-06-09)
db (0, 2011-06-09)
db\arana.sqlite3 (24576, 2011-06-09)
lib (0, 2011-06-09)
lib\arana.rb (831, 2011-06-09)
lib\arana (0, 2011-06-09)
lib\arana\crawler.rb (653, 2011-06-09)
lib\arana\link.rb (911, 2011-06-09)
lib\arana\word.rb (271, 2011-06-09)
# Arana!!
![Arana!!](http://dl.dropbox.com/u/10232797/108-arana_1_350.jpg)
Web crawler that looks for news pages on El País.
Database schema:

    CREATE TABLE links (
      id INTEGER NOT NULL,
      title VARCHAR(1000),
      target VARCHAR(1000),
      PRIMARY KEY(id)
    );

    CREATE TABLE words (
      id INTEGER NOT NULL,
      term VARCHAR(255),
      count INTEGER,
      PRIMARY KEY(id)
    );
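
The schema suggests that each crawled link is stored with its title and target, and that terms are tallied in `words`. The project's own `Link` and `Word` classes are not shown here, so the following is only a minimal pure-Ruby sketch of how rows for `words` might be derived from link titles. The `LinkRow` and `WordRow` structs, the `count_terms` helper, the sample titles, and the example.com URLs are all hypothetical; the tokenization rule (lowercased runs of letters) is an assumption, not the author's method.

```ruby
# Hypothetical in-memory stand-ins for the `links` and `words` tables.
LinkRow = Struct.new(:id, :title, :target)
WordRow = Struct.new(:id, :term, :count)

# Build one WordRow per distinct term found in the link titles.
# Assumption: a "term" is a lowercased run of letters.
def count_terms(links)
  tally = Hash.new(0)
  links.each do |link|
    link.title.to_s.downcase.scan(/[[:alpha:]]+/) { |term| tally[term] += 1 }
  end
  # Assign sequential ids in first-seen order, mirroring an INTEGER primary key.
  tally.each_with_index.map { |(term, count), i| WordRow.new(i + 1, term, count) }
end

# Made-up sample data for illustration only.
links = [
  LinkRow.new(1, "Crisis en Europa", "http://example.com/a"),
  LinkRow.new(2, "La crisis sigue", "http://example.com/b"),
]

count_terms(links).each { |w| puts "#{w.id} #{w.term} #{w.count}" }
```

In a real run, the resulting rows would presumably be written to `db/arana.sqlite3` rather than printed; how the project does that is not documented here.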