News-Articles-Classification

Category: Feature extraction
Development tool: Python
File size: 15190KB
Downloads: 0
Upload date: 2018-08-13 08:04:32
Uploader: sh-1993
Description: This dataset contains around 125k news headlines from 2013 to 2018, obtained from HuffPost. A model trained on this dataset could be used to identify tags for untracked news articles or to identify the category of a news item.

File list:
News_Category_Dataset.json.zip (15551444, 2018-08-13)
news_headlines.py (5384, 2018-08-13)

# ML---News-Category

[Not yet complete. I have implemented this using machine learning algorithms, and the accuracy is saturating at ~3.5. I am currently implementing neural networks to get a better model.]

The goal is to categorize news articles based on their headlines and short descriptions.

Acknowledgements: I took this HuffPost dataset from Kaggle for practice purposes only. If this is against the TOS, please let me know and I will take it down.

This dataset contains around 125k news headlines from 2013 to 2018, obtained from HuffPost. A model trained on this dataset could be used to identify tags for untracked news articles or to identify the category of a news item.

Content: Each news headline has a corresponding category. Categories and corresponding article counts are as follows:

- POLITICS: 32739
- ENTERTAINMENT: 14257
- HEALTHY LIVING: 6694
- QUEER VOICES: 4995
- BUSINESS: 4254
- SPORTS: 4167
- COMEDY: 3971
- PARENTS: 3955
- BLACK VOICES: 3858
- THE WORLDPOST: 36***
- WOMEN: 3490
- CRIME: 2893
- MEDIA: 2815
- WEIRD NEWS: 2670
- GREEN: 2622
- IMPACT: 2602
- WORLDPOST: 2579
- RELIGION: 2556
- STYLE: 2254
- WORLD NEWS: 2177
- TRAVEL: 2145
- TASTE: 2096
- ARTS: 1509
- FIFTY: 1401
- GOOD NEWS: 13***
- SCIENCE: 1381
- ARTS & CULTURE: 1339
- TECH: 1231
- COLLEGE: 1144
- LATINO VOICES: 1129
- EDUCATION: 1004
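
The README does not say which machine learning algorithms were tried, so the following is only a minimal sketch of one possible baseline for the task, not the author's method: a multinomial Naive Bayes classifier with add-one smoothing over the words of the headline and short description. It assumes the dataset is stored as JSON lines with `headline`, `short_description`, and `category` keys (an assumed schema). The names `tokenize`, `load_records`, and `NaiveBayes` are hypothetical helpers, and the sample records are made up for illustration.

```python
import json
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def load_records(lines):
    """Parse JSON-lines records into (text, category) pairs.

    Assumes each line is one JSON object with `headline`,
    `short_description`, and `category` keys (an assumed schema).
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        text = rec.get("headline", "") + " " + rec.get("short_description", "")
        pairs.append((text, rec["category"]))
    return pairs


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, pairs):
        self.class_counts = Counter()            # documents per category
        self.word_counts = defaultdict(Counter)  # word counts per category
        self.vocab = set()
        for text, label in pairs:
            self.class_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        self.total_docs = sum(self.class_counts.values())
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / self.total_docs)
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up sample records, for illustration only (not taken from the dataset).
SAMPLE = [
    '{"category": "POLITICS", "headline": "Senate passes budget bill", "short_description": "Lawmakers vote after long debate"}',
    '{"category": "POLITICS", "headline": "Governor signs election law", "short_description": "The bill changes voting rules"}',
    '{"category": "SPORTS", "headline": "Team wins championship game", "short_description": "Fans celebrate the final score"}',
    '{"category": "SPORTS", "headline": "Star player injured before game", "short_description": "Coach confirms the injury"}',
]

model = NaiveBayes().fit(load_records(SAMPLE))
print(model.predict("Senate debate on election bill"))
print(model.predict("Coach praises team after game"))
```

On the four sample records this prints `POLITICS` and then `SPORTS`. To try it on the real data, the lines of the unzipped `News_Category_Dataset.json` could be passed to `load_records` in place of `SAMPLE`, provided the file matches the assumed schema. Because the category counts above are heavily skewed toward POLITICS, per-category results are likely to be more informative than a single overall accuracy figure.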
