TR-2014

Category: Graphics/Image
Development tool: JavaScript
File size: 0KB
Downloads: 0
Upload date: 2015-01-17 04:12:33
Uploader: sh-1993
Description: Analyzing curated tweets of opinion-shapers and newsmakers provided by nediyor.com and theplazz.com to understand the dynamics of the news in 2014 in Turkey and in the US.

File list:
LICENSE (1076, 2015-01-16)
analysis/ (0, 2015-01-16)
analysis/Analysis.xlsx (171331, 2015-01-16)
analysis/stats.py (416, 2015-01-16)
data/ (0, 2015-01-16)
data/TR-headlines.csv (1493793, 2015-01-16)
data/TR-tweeps.csv (38809348, 2015-01-16)
data/US-headlines.csv (725009, 2015-01-16)
data/US-tweeps.csv (28614742, 2015-01-16)
data/nediyor.com.zip (4474642, 2015-01-16)
data/theplazz.com.zip (3302463, 2015-01-16)
scrapers/ (0, 2015-01-16)
scrapers/scrape-nediyor.py (1421, 2015-01-16)
scrapers/scrape-theplazz.py (1332, 2015-01-16)
scrapers/scrape-tweets.py (3165, 2015-01-16)
visualization/ (0, 2015-01-16)
visualization/aggregate-daily.py (1274, 2015-01-16)
visualization/container.js (12810, 2015-01-16)

# Commentary Tweets of the *Elites*

Analyzing curated tweets of opinion-shapers and newsmakers provided by the [nediyor.com](http://nediyor.com) and [theplazz.com](http://theplazz.com) news sites to understand the dynamics of the elites' responses to important events in the US and in Turkey.

## Data Collection

For the news that made the headlines, we collected about two years of curated tweet data for the United States (154,684 tweets by 1,442 commentators on 7,376 news items between 01/14/2013 and 01/09/2015) and Turkey (190,180 tweets by 1,306 commentators on 10,044 news items between 01/14/2013 and 01/09/2015).

* Filenames starting with `scrape-`:
  * Selenium (through its Python API) is used to scrape the data from the main pages of the websites.
  * The pages are scrolled down 1000 times to get past the lazy loading feature of the sites.
  * To get individual comments, ~17,000 HTML pages were downloaded from the links scraped from the main pages with `nohup sh -c "cat urls.txt | xargs -n 1 -P 10 wget " &`
* The compressed files for [nediyor(190MB)](https://www.dropbox.com/s/3so72z136xfm9pn/nediyor_news.rar) and [theplazz(107MB)](https://www.dropbox.com/s/di6uatp7emdn5qb/theplazz_news.rar) are on Dropbox.

## Data Analysis

### Daily Commentary Statistics

* `aggregate-daily.py` & `container.js`:
  * Counts of comments on news are aggregated by day and visualized.
  * Time series data is visualized using [Highcharts JS](http://www.highcharts.com/demo/line-time-series).

### Commentator Statistics

* `commentators-stats.py` calculates and visualizes the following statistics:
  * Comment counts by commentator
  * Commentators grouped by profession
  * Monthly commentator performance
  * ...

## Initial Findings

* Daily comment count visualization is [here](http://talhaoz.com/news/)
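
As a rough illustration of the daily aggregation described under Daily Commentary Statistics, the sketch below counts comments per day and converts the result into `[timestamp in milliseconds, count]` pairs, the shape a Highcharts datetime axis typically expects. It is not the code in `aggregate-daily.py`: the column name `date`, the `%m/%d/%Y` date format, and the helper names `daily_counts` and `to_highcharts_series` are all assumptions made for this example, and the sample rows are invented.

```python
import csv
import io
from collections import Counter
from datetime import datetime, timezone


def daily_counts(rows, date_field="date", date_format="%m/%d/%Y"):
    """Count comments per calendar day.

    `date_field` and `date_format` are assumptions; adjust them to the
    actual columns of the tweeps CSV files.
    """
    counts = Counter()
    for row in rows:
        day = datetime.strptime(row[date_field], date_format).date()
        counts[day] += 1
    return counts


def to_highcharts_series(counts):
    """Return [[epoch_milliseconds, count], ...] sorted by day."""
    series = []
    for day in sorted(counts):
        # Use midnight UTC of each day as the x value of the point.
        midnight_utc = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
        series.append([int(midnight_utc.timestamp() * 1000), counts[day]])
    return series


# Invented sample data standing in for a tweeps CSV file.
sample = io.StringIO(
    "commentator,date,text\n"
    "alice,01/14/2013,first\n"
    "bob,01/14/2013,second\n"
    "alice,01/15/2013,third\n"
)
rows = list(csv.DictReader(sample))
counts = daily_counts(rows)
series = to_highcharts_series(counts)
```

To run this against the real data, `sample` would be replaced by an open file handle for one of the CSV files in `data/`, with `date_field` and `date_format` set to match that file's header and date layout.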
