Crawlers_Google_News_Twitter

Category: Data collection / crawlers
Development tool: HTML
File size: 300KB
Downloads: 0
Upload date: 2019-11-02 14:46:14
Uploader: sh-1993
Description: Web scraping, web crawling, Google News, Twitter
(Web Scraping, Web Crawling, Google_News, Twitter)

File list:
api (0, 2019-11-02)
api\api_caller.py (1079, 2019-11-02)
common (0, 2019-11-02)
common\file_writer.py (529, 2019-11-02)
config.json (756, 2019-11-02)
config (0, 2019-11-02)
config\config.py (2948, 2019-11-02)
connection (0, 2019-11-02)
connection\mongo_connection.py (361, 2019-11-02)
crawlers (0, 2019-11-02)
crawlers\google_news_crawlers (0, 2019-11-02)
crawlers\google_news_crawlers\__pycache__ (0, 2019-11-02)
crawlers\google_news_crawlers\__pycache__\__init__.cpython-36.pyc (160, 2019-11-02)
crawlers\google_news_crawlers\google_news_crawlers.py (2538, 2019-11-02)
crawlers\twitter_crawlers (0, 2019-11-02)
crawlers\twitter_crawlers\__pycache__ (0, 2019-11-02)
crawlers\twitter_crawlers\__pycache__\__init__.cpython-36.pyc (148, 2019-11-02)
crawlers\twitter_crawlers\__pycache__\twitter.cpython-36.pyc (1253, 2019-11-02)
crawlers\twitter_crawlers\__pycache__\twitter_credentials.cpython-36.pyc (422, 2019-11-02)
crawlers\twitter_crawlers\twitter.py (1517, 2019-11-02)
crawlers\twitter_crawlers\twitter_credentials.py (304, 2019-11-02)
data (0, 2019-11-02)
data\google_news_data (0, 2019-11-02)
data\google_news_data\google_news_{}.html (1408979, 2019-11-02)
main.py (1403, 2019-11-02)
models (0, 2019-11-02)
models\__pycache__ (0, 2019-11-02)
models\__pycache__\__init__.cpython-36.pyc (129, 2019-11-02)
models\__pycache__\google_news.cpython-36.pyc (2512, 2019-11-02)
models\__pycache__\twitter.cpython-36.pyc (1180, 2019-11-02)
models\google_news.py (2581, 2019-11-02)
models\twitter.py (1012, 2019-11-02)
requirements.txt (4571, 2019-11-02)
routes (0, 2019-11-02)
routes\__init__.py (94, 2019-11-02)
routes\__pycache__ (0, 2019-11-02)
routes\__pycache__\__init__.cpython-36.pyc (249, 2019-11-02)
routes\__pycache__\google_news_data.cpython-36.pyc (805, 2019-11-02)
... ...

# Crawlers_Google_News_Twitter

This project is developed with Python 3.6 to collect and process external data from different sources, such as the social network Twitter and other sites like Google News. The collected data are integrated into a NoSQL database, MongoDB, to be consumed by another application.

### Installing

To ensure that this project works successfully, first install a virtualenv to prevent any conflict with other system libraries. Then install the libraries listed in the text file "requirements.txt".

## Deployment

## Project in detail

data-collection
├── common
├── api
├── config
├── constant
├── crawlers
│   ├── twitter_crawlers
│   │   └── __init__.py
│   ├── google_news_crawlers
│   │   ├── google_news_crawlers.py
│   │   └── __init__.py
│   └── wikipedia_crawlers
│       └── __init__.py
├── tests
├── config.json
├── requirements.txt
├── main.py
├── README.md
└── .gitignore

## Getting Started

Enjoy !!!
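As a rough sketch of the parsing step a crawler like `crawlers/google_news_crawlers/google_news_crawlers.py` might perform on a saved snapshot such as `data/google_news_data/google_news_{}.html`, the example below uses only the Python standard library. The `HeadlineParser` class, the `class="title"` markup, and the sample HTML are illustrative assumptions, not the project's actual code or the real Google News markup:

```python
from html.parser import HTMLParser

class HeadlineParser(HTMLParser):
    """Collect (title, url) pairs from anchor tags marked class="title".

    This markup convention is an assumption for illustration; a real
    Google News page uses different, frequently changing selectors.
    """

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._in_title = False
        self._href = ""

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) tuples.
        if tag == "a" and ("class", "title") in attrs:
            self._in_title = True
            self._href = dict(attrs).get("href", "")

    def handle_data(self, data):
        # Record the link text seen inside a matching anchor.
        if self._in_title:
            self.headlines.append({"title": data.strip(), "url": self._href})
            self._in_title = False

# Stand-in for the contents of a saved google_news_*.html snapshot.
sample = '<div><a class="title" href="https://news.example/1">First story</a></div>'

parser = HeadlineParser()
parser.feed(sample)
print(parser.headlines)
# [{'title': 'First story', 'url': 'https://news.example/1'}]
```

Each parsed dictionary could then be inserted into MongoDB (e.g. via `pymongo`'s `insert_one`) through the project's `connection/mongo_connection.py`, keeping fetching, parsing, and storage as separate steps.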
