news-clustering

Category: Clustering algorithms
Development tool: Python
File size: 21 KB
Downloads: 0
Upload date: 2022-05-02 17:25:44
Uploader: sh-1993
Description: News clustering algorithm. Implementation of the "Multilingual Clustering of Streaming News" paper submitted to EMNLP 2018.

File list:
LICENSE (1589, 2022-05-03)
clustering.py (6200, 2022-05-03)
dataset_loader.py (5894, 2022-05-03)
download_data.sh (553, 2022-05-03)
eval.py (14627, 2022-05-03)
eval_lib.py (4429, 2022-05-03)
load_corpora.py (3189, 2022-05-03)
model.py (2886, 2022-05-03)
models (0, 2022-05-03)
models\de (0, 2022-05-03)
models\de\2_1499938269.299021_100.0.model (507, 2022-05-03)
models\de\example_2017-07-13T085725.498310.ii (174, 2022-05-03)
models\en (0, 2022-05-03)
models\en\4_1491902620.876421_10000.0.model (508, 2022-05-03)
models\en\example_2017-04-10T193850.536289.ii (174, 2022-05-03)
models\en\md_3 (459, 2022-05-03)
models\es (0, 2022-05-03)
models\es\2_1492035151.291134_100.0.model (516, 2022-05-03)
models\es\example_2017-04-12T215308.030747.ii (213, 2022-05-03)
run.sh (208, 2022-05-03)
testbench.py (3272, 2022-05-03)
utils.py (3730, 2022-05-03)

# Supersedence Notice

This work has been superseded by https://github.com/Priberam/projected-news-clustering.

# news-clustering

Run `download_data.sh` to download the dataset.

Run `run.sh` to execute and print scores.

Implementation of the paper: Multilingual Clustering of Streaming News, Sebastiao Miranda, Arturs Znotins, Shay B. Cohen, Guntis Barzdins. In EMNLP 2018 (http://aclweb.org/anthology/D18-1483).

The original paper cited above used proprietary software by Priberam. Unfortunately, we are unable to release this software (because of licensing issues and because it is embedded in a larger C++ system), so we provide a re-implementation in Python that we hope will also be clearer to work with and change. Some parts, such as the feature extraction and SVM training code, are proprietary or part of proprietary code, so we provide the dataset with the features already extracted, along with the pre-trained models.
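
The two steps named in the README (download the dataset, then run the evaluation) can be chained from Python. The following is only a minimal sketch, not part of the repository: the helper names `build_commands` and `run_pipeline` are hypothetical, and it assumes that `bash` is available and that both scripts are meant to be launched from the repository root.

```python
import subprocess

# Script names taken from the README, in the order it describes:
# first fetch the dataset, then execute and print scores.
STEPS = ["download_data.sh", "run.sh"]


def build_commands():
    """Return the commands to run, in order, as argument lists.

    Assumption: the scripts can be interpreted by bash.
    """
    return [["bash", name] for name in STEPS]


def run_pipeline(repo_dir=".", dry_run=False):
    """Run each step from inside repo_dir, stopping at the first failure.

    With dry_run=True the commands are only printed, never executed.
    """
    commands = build_commands()
    for cmd in commands:
        if dry_run:
            print("would run:", " ".join(cmd))
            continue
        # check=True raises CalledProcessError on a non-zero exit code,
        # so run.sh is not started if the download step fails.
        subprocess.run(cmd, cwd=repo_dir, check=True)
    return commands
```

Calling `run_pipeline("path/to/news-clustering", dry_run=True)` lists the commands without touching the network; dropping `dry_run` executes them in the given directory.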
