hacker-news-digest
所属分类:特征抽取
开发工具:Python
文件大小:1361KB
下载次数:0
上传日期:2023-06-18 17:29:15
上 传 者:
sh-1993
说明: 让ChatGPT为您总结黑客新闻
(Let ChatGPT Summarize Hacker News for You)
文件列表:
LICENSE (7651, 2023-10-16)
Makefile (2088, 2023-10-16)
content-from-web-pages-using-Machine-Learning.ipynb (43857, 2023-10-16)
config.py (2722, 2023-10-16)
config (0, 2023-10-16)
config\blueware.ini (8455, 2023-10-16)
config\newrelic.ini (8767, 2023-10-16)
config\nginx.conf.erb (1852, 2023-10-16)
db (0, 2023-10-16)
db\__init__.py (332, 2023-10-16)
db\engine.py (1001, 2023-10-16)
db\image.py (1322, 2023-10-16)
db\summary.py (3676, 2023-10-16)
db\translation.py (1738, 2023-10-16)
hacker_news (0, 2023-10-16)
hacker_news\__init__.py (0, 2023-10-16)
hacker_news\algolia_api.py (3141, 2023-10-16)
hacker_news\llm (0, 2023-10-16)
hacker_news\llm\__init__.py (0, 2023-10-16)
hacker_news\llm\google_t5.py (1455, 2023-10-16)
hacker_news\llm\llama.py (948, 2023-10-16)
hacker_news\news.py (12616, 2023-10-16)
hacker_news\parser.py (4229, 2023-10-16)
output (0, 2023-10-16)
output\image (0, 2023-10-16)
page_content_extractor (0, 2023-10-16)
page_content_extractor\__init__.py (1850, 2023-10-16)
page_content_extractor\embeddable.py (5545, 2023-10-16)
page_content_extractor\exceptions.py (38, 2023-10-16)
page_content_extractor\html.py (17491, 2023-10-16)
page_content_extractor\http.py (2384, 2023-10-16)
page_content_extractor\imgsz.py (13406, 2023-10-16)
... ...
[Let ChatGPT Summarize Hacker News for You](https://hackernews.betacat.io/)
==================
[![Github Pages](https://github.com/polyrabbit/hacker-news-digest/actions/workflows/static.yml/badge.svg)](https://github.com/polyrabbit/hacker-news-digest/actions/workflows/static.yml)
[![license](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://github.com/polyrabbit/hacker-news-digest/blob/master/LICENSE)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](https://github.com/polyrabbit/hacker-news-digest/pulls)
[![Hacker News](https://camo.githubusercontent.com/73322cbcbf1c517bb5d3d8d4e724f81091fc767ccc278b44f1ee1a1179e9ad38/68747470733a2f2f736869656c***732e696f2f6261***67652f4861636b65722532304e6577732d6630363532663f6c6f676f3d79253230636f6d62696e61746f72267374796c653d666c61742d737175617265266c6f676f436f6c6f723d7768697465)](https://hackernews.betacat.io/)
> [–‰](https://blog.betacat.io/post/2023/06/summarize-hacker-news-by-chatgpt/)
[Hacker News Summary](https://hackernews.betacat.io/) leverages AI technology to extract summaries
and illustrations from [Hacker News](https://news.ycombinator.com/)
articles, providing a seamless news scanning experience.
Summaries are primarily generated by
ChatGPT [gpt-3.5-turbo](https://platform.openai.com/docs/models/gpt-3-5) model, and fallback to
local [GoogleT5](https://huggingface.co/t5-large) model when ChatGPT is not available.
## Features
* Clear and easily understandable summaries generated by our advanced AI assistant
* Relevant illustrations make articles easily scannable and visually engaging
* Common video sites, PDFs, and GitHub gists are seamlessly embedded
* Flexibility to sort articles based on their points, comment count, or publication time
* Filter the topN articles based on their points.
* RSS feeds fully supported ([#14](https://github.com/polyrabbit/hacker-news-digest/issues/14), [#19](https://github.com/polyrabbit/hacker-news-digest/issues/19))
* Local translation (Chinese)
## Talk is cheap, show me the screenshot!
![hn-summary](https://github.com/polyrabbit/hacker-news-digest/assets/2657334/cc08f770-5154-4c7e-8ba8-13c89f394b1f)
Emoji explained:
* ¤: point - upvotes received from the Hacker News community
* ‘¤: user - Hacker News user who submitted this post
* : submission time - a human-readable time indicating when the post was submitted
* ’: comment count - comments posted by the community, click to visit this comment page
* ”—: source of the news - where the news originated
* “°: summary model - which model is used to generate the summary, options
are `OpenAI`, `GoogleT5` and `Prefix`
## How it works
[Hacker News Summary](https://hackernews.betacat.io/) is a static site hosted on GitHub Pages. It
performs the following periodic actions:
1. Parsing the Hacker News page to obtain a list of news articles
2. Extracting the main content from each news article using
a [score algorithm](%5Btutorial%5D%20How-to-extract-main-content-from-web-pages-using-Machine-Learning.ipynb)
3. Finding the most suitable illustration for each article and making a local copy
4. Generating summaries of the article's content using OpenAI API or invoking a local model as a
fallback when the API is unavailable
5. Rendering a template that incorporates the illustrations and summaries, and deploying it to
GitHub Pages
## Localization
Translation is also performed by ChatGPT, with a single extra step in the prompt. Currently supported languages:
* [–è‘](https://hackernews.betacat.io/zh.html)
## TODO
- [ ] A better way to scrap websites (maybe PhantomJS & Selenium)
- [ ] Also summarize comments ([see discussions on Hacker News](https://news.ycombinator.com/item?id=36260140))
- [ ] Switch to [Hacker News API](https://github.com/HackerNews/API)
- [ ] A more beautiful home page (maybe in HTML9)
- [ ] Discover an alternative local models for generating summaries
- [X] Sort articles by points/comments/time
- [X] Filter topN articles by points
- [X] RSS
- [X] Deploy on github pages
- [X] Have a good sleep !important
近期下载者:
相关文件:
收藏者: