TwitterDataAnalysis_BigDataProgramming

Category: Big Data
Development tool: Jupyter Notebook
File size: 32489 KB
Downloads: 0
Upload date: 2020-10-20 04:28:20
Uploader: sh-1993
Description: Big Data and Programming Project

File list:
March2016_Tweet_Processing_Spark.ipynb (10562325, 2020-10-20)
March2016_Twitter_Processing_Local.ipynb (22393567, 2020-10-20)
TwitterProjectReport.pdf (1083351, 2020-10-20)
Twitter_Presentation (0, 2020-10-20)
Twitter_Presentation\Slide1.jpeg (84716, 2020-10-20)
Twitter_Presentation\Slide10.jpeg (126423, 2020-10-20)
Twitter_Presentation\Slide11.jpeg (82867, 2020-10-20)
Twitter_Presentation\Slide12.jpeg (136434, 2020-10-20)
Twitter_Presentation\Slide13.jpeg (147649, 2020-10-20)
Twitter_Presentation\Slide14.jpeg (129898, 2020-10-20)
Twitter_Presentation\Slide15.jpeg (142247, 2020-10-20)
Twitter_Presentation\Slide16.jpeg (118342, 2020-10-20)
Twitter_Presentation\Slide17.jpeg (152881, 2020-10-20)
Twitter_Presentation\Slide18.jpeg (159460, 2020-10-20)
Twitter_Presentation\Slide19.jpeg (111448, 2020-10-20)
Twitter_Presentation\Slide2.jpeg (101494, 2020-10-20)
Twitter_Presentation\Slide20.jpeg (175558, 2020-10-20)
Twitter_Presentation\Slide21.jpeg (115249, 2020-10-20)
Twitter_Presentation\Slide22.jpeg (137385, 2020-10-20)
Twitter_Presentation\Slide23.jpeg (121875, 2020-10-20)
Twitter_Presentation\Slide24.jpeg (136347, 2020-10-20)
Twitter_Presentation\Slide25.jpeg (117014, 2020-10-20)
Twitter_Presentation\Slide26.jpeg (184244, 2020-10-20)
Twitter_Presentation\Slide27.jpeg (146682, 2020-10-20)
Twitter_Presentation\Slide28.jpeg (91255, 2020-10-20)
Twitter_Presentation\Slide3.jpeg (180495, 2020-10-20)
Twitter_Presentation\Slide4.jpeg (132348, 2020-10-20)
Twitter_Presentation\Slide5.jpeg (153381, 2020-10-20)
Twitter_Presentation\Slide6.jpeg (182361, 2020-10-20)
Twitter_Presentation\Slide7.jpeg (176177, 2020-10-20)
Twitter_Presentation\Slide8.jpeg (109317, 2020-10-20)
Twitter_Presentation\Slide9.jpeg (131383, 2020-10-20)
Twitter_Presentation\final_presentation.pptx (10721412, 2020-10-20)
pip3-install-packages-bash.sh (339, 2020-10-20)

# Twitter-march2016-analysis

Exploratory data analysis of the Twitter stream from March 2016, using the big data tool Apache Spark with Python and applying machine learning and data mining concepts such as LDA, network analysis, and clustering.

# Project is divided into the following sections:

- VISUAL ANALYSIS OF TWEETS
- SIX DEGREES OF SEPARATION
- TOPIC MODELING
- USER CLUSTERING
- MISSPELLED WORDS ANALYSIS

![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide1.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide2.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide3.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide4.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide5.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide6.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide7.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide8.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide9.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide10.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide11.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide12.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide13.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide14.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide15.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide16.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide17.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide18.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide19.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide20.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide21.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide22.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide23.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide24.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide25.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide26.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide27.jpeg?raw=true)
![alt text](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/Twitter_Presentation/Slide28.jpeg?raw=true)

## Instructions to reproduce project

* Please start with **readme_main.pdf** for instructions on running the Spark-specific Jupyter Notebook, which covers the pre-processing and most of the analysis.
* Please go through **readme_local.md** for instructions on running the local-machine Jupyter Notebook, which contains a couple of analyses based on the outputs of the earlier Spark pre-processing.

We had to split the work across two separate notebooks because of a few limitations we faced with Microsoft Azure HDInsight.

Please check the IEEE-format report for project insights: [LINK](https://github.com/SONAMDAWANI/TwitterDataAnalysis_BigDataProgramming/blob/master/TwitterProjectReport.pdf)
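
As a rough illustration of the idea behind the SIX DEGREES OF SEPARATION section, the sketch below counts the fewest hops between two users with a breadth-first search. It is not taken from the project notebooks: the `degrees_of_separation` function, the `mentions` graph, and the user names are all hypothetical, and the real analysis may build and traverse its graph differently (for example, inside Spark).

```python
from collections import deque


def degrees_of_separation(graph, source, target):
    """Return the fewest hops from source to target, or None if unreachable."""
    if source == target:
        return 0
    visited = {source}
    queue = deque([(source, 0)])
    while queue:
        user, hops = queue.popleft()
        for neighbor in graph.get(user, ()):
            if neighbor == target:
                return hops + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None


# Hypothetical undirected graph: an edge means one user mentioned the other.
mentions = {
    "alice": {"bob"},
    "bob": {"alice", "carol"},
    "carol": {"bob", "dave"},
    "dave": {"carol"},
    "erin": set(),
}

print(degrees_of_separation(mentions, "alice", "dave"))  # prints 3
```

Breadth-first search visits users in order of distance from the source, so the first time the target is reached the hop count is minimal. A user with no path to the source, such as `erin` here, yields `None`.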
