news-feeder

Category: collect
Development tool: Python
Upload date: 2022-12-09 01:19:34
Uploader: sh-1993
Description: Feed your news cravings with news feeds and summaries.

File list:
api/ (0, 2022-09-13)
api/.vscode/ (0, 2022-09-13)
api/.vscode/settings.json (99, 2022-09-13)
api/api/ (0, 2022-09-13)
api/api/__init__.py (645, 2022-09-13)
api/api/routes.py (5281, 2022-09-13)
api/email_news_summary.py (4683, 2022-09-13)
api/environment.yml (569, 2022-09-13)
api/feeder/ (0, 2022-09-13)
api/feeder/emailer/ (0, 2022-09-13)
api/feeder/emailer/__init__.py (0, 2022-09-13)
api/feeder/emailer/emailer.py (2323, 2022-09-13)
api/feeder/extractor/ (0, 2022-09-13)
api/feeder/extractor/__init__.py (0, 2022-09-13)
api/feeder/extractor/rss_extractor.py (1932, 2022-09-13)
api/feeder/extractor/source_extractor.py (904, 2022-09-13)
api/feeder/extractor/twtr_extractor.py (78, 2022-09-13)
api/feeder/formatter/ (0, 2022-09-13)
api/feeder/formatter/__init__.py (0, 2022-09-13)
api/feeder/formatter/article_formatter.py (9308, 2022-09-13)
api/feeder/formatter/debug_parser.py (1836, 2022-09-13)
api/feeder/formatter/keyword_extractor.py (9363, 2022-09-13)
api/feeder/formatter/summarizer.py (4031, 2022-09-13)
api/feeder/models/ (0, 2022-09-13)
api/feeder/models/__init__.py (0, 2022-09-13)
api/feeder/models/article.py (1820, 2022-09-13)
api/feeder/models/pipeline_event.py (567, 2022-09-13)
api/feeder/models/source.py (2579, 2022-09-13)
api/feeder/models/topic.py (1691, 2022-09-13)
api/feeder/reader/ (0, 2022-09-13)
api/feeder/reader/__init__.py (0, 2022-09-13)
api/feeder/reader/reader.py (1743, 2022-09-13)
api/feeder/reader/topic_mapper.py (6110, 2022-09-13)
api/feeder/util/ (0, 2022-09-13)
api/feeder/util/api.py (2749, 2022-09-13)
api/feeder/util/content_fixer.py (5046, 2022-09-13)
... ...

Heads up: this repo is currently a work in progress.

Imagine an evening newscast personalized to your interests... This application is an experiment in news aggregation and summarization. The goal is to replicate the experience of a newscaster or news team gathering the most important news items related to a subject (e.g. world news or NASA) and condensing them into a summary of summaries.

The API has an extractor which pulls raw data from RSS feeds and extracts the content using BeautifulSoup. Articles are then sanitized and keywords are extracted using `nltk` and `gensim`. However, this method is limited to Python 2.7 and does not yield the kind of results we could get from modern NLP techniques.

This branch has two goals: to implement a modern keyword extraction and topic mapping pipeline using `huggingface` and `tensorflow`, and to migrate away from AWS to reduce compute costs. To that end, `content_fixer.py` and `topic_mapper.py` demonstrate the new pipeline using tensorflow. This pipeline is being tested on a Raspberry Pi 4, with a refactor to isolate incompatibility issues with the current environment. The extraction pipeline is being migrated to a Raspberry Pi 3B.

Down the line, the idea is to expose the summarization service to a cloud-hosted API for consumption by the frontend, containing costs by keeping the heavy lifting on the on-prem RPis.

## Build and Run

### API

1. Install and run with conda from the `environment.yml` file (`conda env create -n news-feeder -f environment.yml`).

Note: For the API you may have to run `pip install flask-cors --upgrade` for `flask_cors` to work correctly ([see this issue](https://github.com/corydolphin/flask-cors/issues/194)).

### Vue Frontend

1. Install node modules with `yarn`.
2. Run the application with `yarn serve`.

### React Frontend

1. Install node modules with `yarn`.
2. Run the application with `yarn start`.
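
As a rough illustration of the extraction step described in the overview (pull items from an RSS feed, then reduce their HTML to plain text), here is a minimal sketch. It is not the code in `rss_extractor.py`. The project uses BeautifulSoup, whereas this sketch swaps in the standard library's `xml.etree.ElementTree` and `html.parser` so that it runs without extra dependencies. The sample feed and the names `SAMPLE_RSS`, `strip_html`, and `extract_items` are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample feed; a real extractor would fetch this over the network.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item>
      <title>Rover lands safely</title>
      <link>https://example.com/rover</link>
      <description>&lt;p&gt;The rover &lt;b&gt;landed&lt;/b&gt; on schedule.&lt;/p&gt;</description>
    </item>
  </channel>
</rss>"""


class _TextCollector(HTMLParser):
    """Collects text nodes and ignores tags (stdlib stand-in for BeautifulSoup)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(markup):
    """Return the visible text of an HTML fragment with whitespace collapsed."""
    collector = _TextCollector()
    collector.feed(markup)
    collector.close()
    return " ".join("".join(collector.parts).split())


def extract_items(rss_text):
    """Parse an RSS 2.0 document into a list of plain dicts."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "summary": strip_html(item.findtext("description") or ""),
        })
    return items


if __name__ == "__main__":
    for entry in extract_items(SAMPLE_RSS):
        print(entry["title"], "|", entry["summary"])
```

Returning plain dicts keeps the sketch independent of the project's own `Article` and `Source` models, whose fields are not shown here. A downstream keyword or summarization stage would consume the `summary` text.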
