bbc

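Category: Artificial Intelligence / Neural Networks / Deep Learning
Development tool: Python
File size: 2646 KB
Downloads: 1
Upload date: 2018-11-06 13:14:09
Uploader: 三千秋水
Description: BBC news dataset for machine learning, natural language processing, and text classification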

File list:
bbc\business\001.txt (2560, 2017-04-16)
bbc\business\002.txt (2252, 2017-04-16)
bbc\business\003.txt (1552, 2017-04-16)
bbc\business\004.txt (2412, 2017-04-16)
bbc\business\005.txt (1570, 2017-04-16)
bbc\business\006.txt (1187, 2017-04-16)
bbc\business\007.txt (1669, 2017-04-16)
bbc\business\008.txt (1922, 2017-04-16)
bbc\business\009.txt (1494, 2017-04-16)
bbc\business\010.txt (1449, 2017-04-16)
bbc\business\011.txt (1144, 2017-04-16)
bbc\business\012.txt (1847, 2017-04-16)
bbc\business\013.txt (1830, 2017-04-16)
bbc\business\014.txt (2981, 2017-04-16)
bbc\business\015.txt (3808, 2017-04-16)
bbc\business\016.txt (1393, 2017-04-16)
bbc\business\017.txt (1299, 2017-04-16)
bbc\business\018.txt (1002, 2017-04-16)
bbc\business\019.txt (1733, 2017-04-16)
bbc\business\020.txt (3854, 2017-04-16)
bbc\business\021.txt (2046, 2017-04-16)
bbc\business\022.txt (1933, 2017-04-16)
bbc\business\023.txt (1267, 2017-04-16)
bbc\business\024.txt (1954, 2017-04-16)
bbc\business\025.txt (2704, 2017-04-16)
bbc\business\026.txt (1829, 2017-04-16)
bbc\business\027.txt (1620, 2017-04-16)
bbc\business\028.txt (1249, 2017-04-16)
bbc\business\029.txt (2492, 2017-04-16)
bbc\business\030.txt (2487, 2017-04-16)
bbc\business\031.txt (1888, 2017-04-16)
bbc\business\032.txt (1733, 2017-04-16)
bbc\business\033.txt (1348, 2017-04-16)
bbc\business\034.txt (1780, 2017-04-16)
bbc\business\035.txt (1217, 2017-04-16)
bbc\business\036.txt (2130, 2017-04-16)
bbc\business\037.txt (2462, 2017-04-16)
bbc\business\038.txt (930, 2017-04-16)
bbc\business\039.txt (1252, 2017-04-16)
bbc\business\040.txt (1355, 2017-04-16)
... ...

Consists of 2225 documents from the BBC news website corresponding to stories in five topical areas from 2004-2005.

Natural Classes: 5 (business, entertainment, politics, sport, tech)

If you make use of the dataset, please consider citing the publication:
- D. Greene and P. Cunningham. "Practical Solutions to the Problem of Diagonal Dominance in Kernel Document Clustering", Proc. ICML 2006.

All rights, including copyright, in the content of the original articles are owned by the BBC.

Contact Derek Greene for further information. http://mlg.ucd.ie/datasets/bbc.html
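
The file list suggests one folder per class with numbered .txt files inside it, so a loader can take each document's label from its parent folder name. The following is a minimal sketch under that assumption. The function name load_bbc, the extraction path "bbc", and the UTF-8 decoding with replacement of undecodable bytes are illustrative choices, not part of the dataset's documentation.

```python
from pathlib import Path


def load_bbc(root):
    """Read a folder laid out as root/<category>/<name>.txt.

    Returns two parallel lists: document texts and their category labels.
    The layout is assumed from the file list; adjust if your copy differs.
    """
    texts, labels = [], []
    root = Path(root)
    # Each sub-directory is treated as one class (e.g. "business").
    for category_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for txt_file in sorted(category_dir.glob("*.txt")):
            # The file encoding is an assumption; undecodable bytes are replaced.
            texts.append(txt_file.read_text(encoding="utf-8", errors="replace"))
            labels.append(category_dir.name)
    return texts, labels


if __name__ == "__main__":
    # "bbc" is a hypothetical path to the extracted archive.
    texts, labels = load_bbc("bbc")
    print(len(texts), "documents in", len(set(labels)), "classes")
```

The resulting lists can be passed to any text classification pipeline; using the folder name as the label avoids maintaining a separate label file.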
