News-Extraction

Category: Front-end development
Development tool: Python
File size: 6446KB
Downloads: 0
Upload date: 2022-08-14 18:38:58
Uploader: sh-1993
Description: This is my first repository where I have tried news extraction using Python. From the news webpage, I have extracted the headline, image URL, source URL, author, and summary of each news article on that web page. After extracting the data, I have created a front end to display it using HTML with the help of Bootstrap.

File list:
Extraction.py (2165, 2022-06-15)
__pycache__ (0, 2022-06-15)
__pycache__\Extraction.cpython-310.pyc (1324, 2022-06-15)
__pycache__\Extraction.cpython-38.pyc (1335, 2022-06-15)
__pycache__\Extraction.cpython-39.pyc (1277, 2022-06-15)
screenshots (0, 2022-06-15)
screenshots\Screenshot (4).png (3166638, 2022-06-15)
screenshots\Screenshot (5).png (2546489, 2022-06-15)
screenshots\Screenshot (6).png (521952, 2022-06-15)
screenshots\Screenshot (7).png (308390, 2022-06-15)
static (0, 2022-06-15)
static\css (0, 2022-06-15)
static\css\main.css (1828, 2022-06-15)
static\images (0, 2022-06-15)
static\images\bg-photo.jpg (78320, 2022-06-15)
templates (0, 2022-06-15)
templates\homepage.html (1952, 2022-06-15)
test.py (296, 2022-06-15)
words.txt (22227, 2022-06-15)

# News-Extraction

- This is my first repository, where I have tried news extraction using Python.
- From the news webpage, I have extracted the headline, keyword, image URL, source URL, author, and summary of each news article on that web page.
- After extracting the data, I have created a front end to display it using HTML and CSS.

# Objective

- Extraction is the process of retrieving the required data from a source.
- Here the source is a news website, Inshorts (https://www.inshorts.com/en/read).
- The extracted data is ultimately in the form of a string. After extracting the data, I have tried to parse it.
- Parsing converts one string of data into other, structured forms of data.

# Technologies used

- I have written the source code in Python with the help of some libraries and built-in functions.
- The libraries I've used are 'requests', 'bs4 (BeautifulSoup4)' and 'flask':
  - 'requests' to request the data from the news webpage URL
  - 'bs4 (BeautifulSoup4)' to parse the data into the required form
  - 'flask' to build the web application
- The source code 'test.py' can run in any IDE that supports Python 3 and has the above libraries installed.
- You will also need a web browser to view the website.

# Code Output

- After running the code, it starts a web server hosting a webpage that shows the code output.
- The webpage is written in HTML and CSS.

![Screenshot 1](https://github.com/sehajdeep1814/News-Extraction/blob/master/screenshots/Screenshot%20(4).png)
![Screenshot 2](https://github.com/sehajdeep1814/News-Extraction/blob/master/screenshots/Screenshot%20(5).png)

# References

- To learn web scraping, I had access to a course on the "My Captain" platform (https://app.mycaptain.in/).

# Authors

- Sehajdeep Singh - https://github.com/sehajdeep1814/
- Qasim Shaikh - https://github.com/shaikhmq20/
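The parsing step described under Objective, turning one string of HTML into structured per-article data, can be sketched roughly as follows. This is a minimal illustration and not the repository's code. The repository uses 'bs4', while this sketch swaps in Python's built-in `html.parser` so it runs without extra installs. The markup in `SAMPLE_HTML`, the class names (`news-card`, `headline`, `author`, `source`) and the names `NewsCardParser` and `extract_articles` are all hypothetical and are not taken from the Inshorts page.

```python
from html.parser import HTMLParser

# Hypothetical markup: these tag and class names are made up for illustration
# and are NOT taken from the real Inshorts page.
SAMPLE_HTML = """
<div class="news-card">
  <span class="headline">Sample headline one</span>
  <span class="author">Reporter A</span>
  <a class="source" href="https://example.com/story-1">read more</a>
</div>
<div class="news-card">
  <span class="headline">Sample headline two</span>
  <span class="author">Reporter B</span>
  <a class="source" href="https://example.com/story-2">read more</a>
</div>
"""


class NewsCardParser(HTMLParser):
    """Collect one dict per news card from the hypothetical markup above."""

    def __init__(self):
        super().__init__()
        self.articles = []   # one dict per card, in page order
        self._field = None   # text field currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css_class = attrs.get("class")
        if tag == "div" and css_class == "news-card":
            self.articles.append({})         # start a new article
        elif not self.articles:
            return                           # ignore tags outside a card
        elif tag == "span" and css_class in ("headline", "author"):
            self._field = css_class          # the next text belongs to this field
        elif tag == "a" and css_class == "source":
            self.articles[-1]["source_url"] = attrs.get("href")

    def handle_data(self, data):
        # Only keep text that sits inside a headline or author span.
        if self._field and self.articles:
            self.articles[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def extract_articles(html):
    """Parse an HTML string and return a list of article dicts."""
    parser = NewsCardParser()
    parser.feed(html)
    parser.close()
    return parser.articles


articles = extract_articles(SAMPLE_HTML)
for article in articles:
    print(article["headline"], "|", article["author"], "|", article["source_url"])
```

In a setup like the one described above, the HTML string would come from a 'requests' response body rather than a hard-coded sample, and the resulting list of dicts would be passed to a 'flask' template for display.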
