Crawlers_Google_News_Twitter

Category: Data collection / crawlers
Development tool: HTML
File size: 300KB
Downloads: 0
Upload date: 2019-11-02 14:46:14
Uploader: sh-1993
Description: Web scraping, web crawling, Google News, Twitter
(Web Scraping, Web Crawling, Google_News, Twitter)

File list:
api (0, 2019-11-02)
api\api_caller.py (1079, 2019-11-02)
common (0, 2019-11-02)
common\file_writer.py (529, 2019-11-02)
config.json (756, 2019-11-02)
config (0, 2019-11-02)
config\config.py (2948, 2019-11-02)
connection (0, 2019-11-02)
connection\mongo_connection.py (361, 2019-11-02)
crawlers (0, 2019-11-02)
crawlers\google_news_crawlers (0, 2019-11-02)
crawlers\google_news_crawlers\__pycache__ (0, 2019-11-02)
crawlers\google_news_crawlers\__pycache__\__init__.cpython-36.pyc (160, 2019-11-02)
crawlers\google_news_crawlers\google_news_crawlers.py (2538, 2019-11-02)
crawlers\twitter_crawlers (0, 2019-11-02)
crawlers\twitter_crawlers\__pycache__ (0, 2019-11-02)
crawlers\twitter_crawlers\__pycache__\__init__.cpython-36.pyc (148, 2019-11-02)
crawlers\twitter_crawlers\__pycache__\twitter.cpython-36.pyc (1253, 2019-11-02)
crawlers\twitter_crawlers\__pycache__\twitter_credentials.cpython-36.pyc (422, 2019-11-02)
crawlers\twitter_crawlers\twitter.py (1517, 2019-11-02)
crawlers\twitter_crawlers\twitter_credentials.py (304, 2019-11-02)
data (0, 2019-11-02)
data\google_news_data (0, 2019-11-02)
data\google_news_data\google_news_{}.html (1408979, 2019-11-02)
main.py (1403, 2019-11-02)
models (0, 2019-11-02)
models\__pycache__ (0, 2019-11-02)
models\__pycache__\__init__.cpython-36.pyc (129, 2019-11-02)
models\__pycache__\google_news.cpython-36.pyc (2512, 2019-11-02)
models\__pycache__\twitter.cpython-36.pyc (1180, 2019-11-02)
models\google_news.py (2581, 2019-11-02)
models\twitter.py (1012, 2019-11-02)
requirements.txt (4571, 2019-11-02)
routes (0, 2019-11-02)
routes\__init__.py (94, 2019-11-02)
routes\__pycache__ (0, 2019-11-02)
routes\__pycache__\__init__.cpython-36.pyc (249, 2019-11-02)
routes\__pycache__\google_news_data.cpython-36.pyc (805, 2019-11-02)
... ...

# Crawlers_Google_News_Twitter

This project is developed with Python 3.6 to collect and process external data from different sources, such as the social network Twitter and other sites like Google News. The collected data are integrated into a NoSQL database, MongoDB, to be consumed by another application.

### Installing

To ensure that this project works successfully, first install a virtualenv to prevent any conflict with other system libraries. Then install the libraries listed in the text file "requirements.txt".

## Deployment

## Project in detail

data-collection
├── common
├── api
├── config
├── constant
├── crawlers
│   ├── twitter_crawlers
│   │   └── __init__.py
│   ├── google_news_crawlers
│   │   ├── google_news_crawlers.py
│   │   └── __init__.py
│   └── wikipedia_crawlers
│       └── __init__.py
├── tests
├── config.json
├── requirements.txt
├── main.py
├── README.md
└── .gitignore

## Getting Started

Enjoy !!!
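As a rough sketch of the parsing step a crawler like `crawlers/google_news_crawlers/google_news_crawlers.py` might perform on a saved snapshot such as `data/google_news_data/google_news_{}.html`, the example below uses only the Python standard library. The `HeadlineParser` class, the `class="title"` markup, and the sample HTML are illustrative assumptions, not the project's actual code or the real Google News markup:

```python
from html.parser import HTMLParser

class HeadlineParser(HTMLParser):
    """Collect (title, url) pairs from anchor tags marked class="title".

    This markup convention is an assumption for illustration; a real
    Google News page uses different, frequently changing selectors.
    """

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._in_title = False
        self._href = ""

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) tuples.
        if tag == "a" and ("class", "title") in attrs:
            self._in_title = True
            self._href = dict(attrs).get("href", "")

    def handle_data(self, data):
        # Record the link text seen inside a matching anchor.
        if self._in_title:
            self.headlines.append({"title": data.strip(), "url": self._href})
            self._in_title = False

# Stand-in for the contents of a saved google_news_*.html snapshot.
sample = '<div><a class="title" href="https://news.example/1">First story</a></div>'

parser = HeadlineParser()
parser.feed(sample)
print(parser.headlines)
# [{'title': 'First story', 'url': 'https://news.example/1'}]
```

Each parsed dictionary could then be inserted into MongoDB (e.g. via `pymongo`'s `insert_one`) through the project's `connection/mongo_connection.py`, keeping fetching, parsing, and storage as separate steps.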
