Media-Bias-NLP-Clustering

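Category: Feature extraction
Development tool: Others
File size: 883644KB
Downloads: 0
Upload date: 2019-02-09 05:42:29
Uploader: sh-1993
Description: Revealing the Omitted - An Exploration of Media Bias in the news coverage of Obamacare. Employs Selenium and Beautiful...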
(Revealing the Omitted - An Exploration of Media Bias in the news coverage of Obamacare. Employs Selenium and BeautifulSoup to scrape over 160k articles across over 8k publishers on Obamacare. Uses TF-IDF and LDA to perform topic modeling which revealed what’s theoretically omitted in a given article and systematically underrepresented at a)

File list:
.ipynb_checkpoints (0, 2017-03-16)
.ipynb_checkpoints\Project Overview - Technical IO via Pickles-checkpoint.ipynb (6975, 2017-03-16)
.ipynb_checkpoints\Project Overview-checkpoint.ipynb (2262, 2017-03-16)
Code (0, 2017-03-16)
Code\.ipynb_checkpoints (0, 2017-03-16)
Code\.ipynb_checkpoints\Part 1b - Sourcing Articles - From NY times - Using NYTIMes API + Newspaper Article-checkpoint.ipynb (29826, 2017-03-16)
Code\.ipynb_checkpoints\Part 2 -- Google News Article URLS -> Scrape Text-checkpoint.ipynb (201358, 2017-03-16)
Code\.ipynb_checkpoints\Part 2.5 - Convert big pickles to smaller files-checkpoint.ipynb (7465, 2017-03-16)
Code\Part 1a - Sourcing Articles - Google Search News Articles.ipynb (22090, 2017-03-16)
Code\Part 1b - Sourcing Articles - From NY times - Using NYTIMes API + Newspaper Article.ipynb (29826, 2017-03-16)
Code\Part 2 -- Google News Article URLS -> Scrape Text.ipynb (201358, 2017-03-16)
Code\Part 2.5 - Convert big pickles to smaller files.ipynb (7465, 2017-03-16)
Code\Part 3 - Text Preprocessing & LDA Model Creation.ipynb (668023, 2017-03-16)
Code\project_functions.py (2250, 2017-03-16)
Code\project_functions.pyc (1250, 2017-03-16)
MediaBias_Presentation_2016_May.key (14431784, 2017-03-16)
MediaBias_Presentation_2016_May_w_notes.pdf (9647545, 2017-03-16)
MediaBias_Presentation_2016_May_wo_notes.pdf (12932081, 2017-03-16)
Project Overview - Technical IO via Pickles.ipynb (6975, 2017-03-16)
Project Overview.ipynb (2262, 2017-03-16)
Resources (0, 2017-03-16)
Resources\Data (0, 2017-03-16)
Resources\Data\Derived (0, 2017-03-16)
Resources\Data\Derived\Articles_Overtime (0, 2017-03-16)
Resources\Data\Derived\Articles_Overtime\abc.txt (2339, 2017-03-16)
Resources\Data\Derived\Articles_Overtime\blaze.txt (5636, 2017-03-16)
Resources\Data\Derived\Articles_Overtime\cbs.txt (2262, 2017-03-16)
Resources\Data\Derived\Articles_Overtime\cs_monitor.txt (3121, 2017-03-16)
Resources\Data\Derived\Articles_Overtime\fortune.txt (2752, 2017-03-16)
Resources\Data\Derived\Articles_Overtime\fox_news.txt (4433, 2017-03-16)
Resources\Data\Derived\Articles_Overtime\huff_post.txt (6484, 2017-03-16)
Resources\Data\Derived\Articles_Overtime\la_times.txt (6058, 2017-03-16)
Resources\Data\Derived\Articles_Overtime\msnbc.txt (2930, 2017-03-16)
Resources\Data\Derived\Articles_Overtime\ny_daily.txt (3275, 2017-03-16)
Resources\Data\Derived\Articles_Overtime\ny_mag.txt (4857, 2017-03-16)
Resources\Data\Derived\Articles_Overtime\the_gaurdian.txt (4431, 2017-03-16)
... ...

# Revealing the Omitted - An Exploration of Media Bias in the news coverage of Obamacare

## Project Summary:

Employs Selenium and BeautifulSoup to scrape over 160k articles across over 8k publishers on Obamacare. Uses TF-IDF and LDA to perform topic modeling, which revealed what's theoretically omitted in a given article and systematically underrepresented at a publisher level.

## File Structure Summary:

The project's files are organized into the following structure:

**[Code Folder](https://github.com/Code/)** - In Python - Contains Jupyter (IPython) notebooks that perform all data analysis, along with any custom functions built in separate Python script files. (*Currently in progress*)

**[Resources Folder](https://github.com/Resources/)** - All the raw and derived data used in the project. Also contains original research describing the origin of the raw data.

**[Presentation file](https://github.comMediaBias_Presentation_2016_May.pdf)** - A deck of initial results presented in early 2016.

**[Project Overview file](https://github.comProject%20Overview.ipynb)** - Provides a high-level overview of the insights made throughout the data analysis. (*Currently in progress*)

**README file** - You're reading it. Describes what things do and how they are organized.
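The project summary names TF-IDF as one of the weighting steps before topic modeling. As a rough illustration of that idea only, here is a minimal standard-library sketch. It is not the project's code: the function name `tf_idf`, the smoothed IDF formula, and the toy corpus are all assumptions made for this example, and the notebooks may well rely on a library implementation instead.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (hypothetical helper).

    docs is a list of token lists. Term frequency is the count divided by
    document length; IDF is the smoothed form log((1 + N) / (1 + df)) + 1.
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        weights.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return weights


# Toy corpus invented for illustration; not data from the project.
corpus = [
    "premiums rise under the health law".split(),
    "the health law expands coverage".split(),
    "coverage expands as enrollment grows".split(),
]
scores = tf_idf(corpus)

# Terms sorted from most to least distinctive for the first document.
ranked_first_doc = sorted(scores[0], key=scores[0].get, reverse=True)
```

In this sketch, a term that appears in only one document (such as "premiums") receives a higher weight than a term shared across documents (such as "the"), which is the property that makes TF-IDF useful for separating distinctive vocabulary from common vocabulary before fitting a topic model such as LDA.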

