BeautifulWebScraper

Category: Data collection / web crawlers
Development tool: Others
File size: 0KB
Downloads: 0
Upload date: 2023-10-16 10:57:10
Uploader: sh-1993
Description: Python-based web scraper that uses the Beautiful Soup package to scrape recent news regarding a keyword.

File list:
LICENSE (35149, 2023-11-09)
code (5335, 2023-11-09)

# BeautifulWebScraper

A Python-based web scraper, written in Jupyter Notebooks, that uses the Beautiful Soup package to scrape recent information about a keyword. In the example case, the keyword is 'medicine'. The data was scraped from Reuters.com, a world-renowned, trusted news source. Reuters was chosen for its reputation and unbiased dissemination of information.

The project began as a way to efficiently scrape a news source so that my colleagues at the fire station and I would have up-to-date articles from which to learn about and discuss improvements to new medicines. Medicine is always evolving, so utilizing modern tools to understand modern changes in medicine lets us discuss them with our physicians, who write the protocols and general orders we follow in patient care.

I want to improve the scraping and am looking at packages like asyncio and multithreading to scrape concurrently. I am also considering adding an API, such as a REST API, to the web scraper.

I also used ChatGPT as a debugger and a source of suggestions, since it can rapidly analyze code and suggest improvements instantaneously.

Sifting through opinion pieces was not the aim of this project. That may be covered in future projects on opinions about certain topics (athletes, newly implemented laws, etc.) on social media sites.

For the legality of web scraping, given the heavy use of bots, especially for scraping sneaker sites, see this article: https://mccarthylg.com/a-comprehensive-legal-guide-to-web-scraping-in-the-us/

![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54) ![Jupyter Notebook](https://img.shields.io/badge/jupyter-%23FA0F00.svg?style=for-the-badge&logo=jupyter&logoColor=white) ![ChatGPT](https://img.shields.io/badge/chatGPT-74aa9c?style=for-the-badge&logo=openai&logoColor=white)
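
The keyword-based scraping described in this README might look roughly like the sketch below. It is not the notebook's actual code: the function name `extract_matching_links`, the `SAMPLE_HTML` snippet, and the `https://example.com` base URL are all hypothetical, and the sketch simply assumes that headlines appear as the text of `<a>` tags. The real markup on a news site will differ and would need its own selectors.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_matching_links(html, keyword, base_url=""):
    """Return (title, url) pairs for links whose text mentions the keyword.

    Hypothetical helper: assumes headlines are the visible text of <a> tags.
    """
    soup = BeautifulSoup(html, "html.parser")
    needle = keyword.lower()
    seen = set()
    results = []
    for anchor in soup.find_all("a", href=True):
        # Collapse nested tags and surrounding whitespace into one title string.
        title = anchor.get_text(" ", strip=True)
        if needle not in title.lower():
            continue
        # Resolve relative links against the page URL.
        url = urljoin(base_url, anchor["href"])
        if url in seen:
            continue  # skip the same article linked more than once
        seen.add(url)
        results.append((title, url))
    return results


# Made-up markup standing in for a downloaded news page.
SAMPLE_HTML = """
<ul>
  <li><a href="/health/new-medicine-trial">New medicine trial shows promise</a></li>
  <li><a href="/sports/final-score">Final score from last night</a></li>
  <li><a href="/health/new-medicine-trial">New medicine trial shows promise</a></li>
</ul>
"""

if __name__ == "__main__":
    for title, url in extract_matching_links(
        SAMPLE_HTML, "medicine", "https://example.com"
    ):
        print(title, url)
```

Keeping the parsing in a function that takes an HTML string makes it easy to test without network access, and the download step (sequential today, possibly concurrent later) can be changed independently.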
