warta-scrap

Category: Data collection / Crawlers
Development tool: Python
File size: 391KB
Downloads: 0
Upload date: 2018-10-12 02:45:36
Uploader: sh-1993
Description: Indonesia Index News Crawler, including 10 online media

File list:
antara (0, 2018-10-12)
antara\antara (0, 2018-10-12)
antara\antara\__init__.py (0, 2018-10-12)
antara\antara\items.py (204, 2018-10-12)
antara\antara\middlewares.py (1904, 2018-10-12)
antara\antara\pipelines.py (286, 2018-10-12)
antara\antara\settings.py (3128, 2018-10-12)
antara\antara\spiders (0, 2018-10-12)
antara\antara\spiders\__init__.py (161, 2018-10-12)
antara\antara\spiders\antara_spider.py (1008, 2018-10-12)
antara\sampleResult.json (2897, 2018-10-12)
antara\scrapy.cfg (256, 2018-10-12)
detik (0, 2018-10-12)
detik\detik (0, 2018-10-12)
detik\detik\__init__.py (0, 2018-10-12)
detik\detik\items.py (202, 2018-10-12)
detik\detik\middlewares.py (1903, 2018-10-12)
detik\detik\pipelines.py (285, 2018-10-12)
detik\detik\settings.py (3118, 2018-10-12)
detik\detik\spiders (0, 2018-10-12)
detik\detik\spiders\__init__.py (161, 2018-10-12)
detik\detik\spiders\detik_spider.py (2012, 2018-10-12)
detik\sampleResult.json (5503, 2018-10-12)
detik\scrapy.cfg (254, 2018-10-12)
kompas (0, 2018-10-12)
kompas\kompas (0, 2018-10-12)
kompas\kompas\__init__.py (0, 2018-10-12)
kompas\kompas\items.py (204, 2018-10-12)
kompas\kompas\middlewares.py (1904, 2018-10-12)
kompas\kompas\pipelines.py (286, 2018-10-12)
kompas\kompas\settings.py (3128, 2018-10-12)
kompas\kompas\spiders (0, 2018-10-12)
kompas\kompas\spiders\__init__.py (161, 2018-10-12)
kompas\kompas\spiders\kompas_spider.py (1214, 2018-10-12)
kompas\sampleResult.json (5728, 2018-10-12)
kompas\scrapy.cfg (256, 2018-10-12)
liputan6 (0, 2018-10-12)
... ...

# warta-scrap

Indonesia Index News Crawler, including 10 online media

### Online Media List:

- Detik.com http://news.detik.com/indeks
- Republika.co.id http://www.republika.co.id/indeks
- Viva.co.id http://www.viva.co.id/indeks
- Kompas.com http://indeks.kompas.com/
- Antaranews.com http://www.antaranews.com/terkini
- Tempo.co https://www.tempo.co/indeks
- Okezone.com http://index.okezone.com/
- Liputan6.com http://www.liputan6.com/indeks
- Merdeka.com https://www.merdeka.com/berita-hari-ini/
- Tirto.id https://tirto.id/indeks

### Installation :

Open Terminal, and clone this repo:

> git clone https://github.com/harryandriyan/warta-scrap

Go to project folder

> cd warta-scrap

Setup virtualenv

> virtualenv venv

Activate virtualenv

> . venv/bin/activate

Install requirements

> pip install -r requirements.txt

### How to use

Open the specific project, example

> cd republika

Run crawl command, example

> scrapy crawl republika -o sampleResult.json -t json
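
The crawl command above writes the scraped items to `sampleResult.json`. Scrapy's JSON feed export produces a single JSON array of item objects, so the result can be inspected with the standard library alone. The following is a minimal sketch, not part of the repository: the helper name `summarize_items` is hypothetical, and the path `republika/sampleResult.json` is only an assumption based on the example command. It makes no assumption about which fields the spiders emit and simply reports whatever keys it finds.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_items(path):
    """Return (item_count, Counter of field names) for a Scrapy JSON feed file.

    Hypothetical helper for illustration; it is not part of warta-scrap.
    """
    # A JSON feed written by `scrapy crawl ... -o file.json` is one JSON array.
    items = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(items, list):
        raise ValueError(
            f"expected a JSON array of items, got {type(items).__name__}"
        )

    field_counts = Counter()
    for item in items:
        # Count how many items carry each field; field names are not assumed.
        if isinstance(item, dict):
            field_counts.update(item.keys())
    return len(items), field_counts


if __name__ == "__main__":
    # Assumed location: the output of the example crawl command above.
    path = Path("republika") / "sampleResult.json"
    if path.exists():
        count, fields = summarize_items(path)
        print(f"{count} items")
        for name, n in sorted(fields.items()):
            print(f"  {name}: present in {n} items")
    else:
        print(f"{path} not found; run the crawl command first")
```

If `json.loads` raises an error, one possible cause is that the output file already existed before the crawl: older Scrapy releases append to an existing `-o` target, which can leave a file that is no longer valid JSON. Deleting the file and re-running the crawl is a reasonable first check.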
