PacSum
Category: Feature extraction
Development tool: Python
File size: 18KB
Downloads: 0
Upload date: 2021-09-06 09:21:36
Uploader: sh-1993
Description: Unsupervised Extractive Summarization based on Position-Augmented Centrality
File list:
code (0, 2021-09-06)
code\bert_model.py (15251, 2021-09-06)
code\data_iterator.py (7122, 2021-09-06)
code\extractor.py (8526, 2021-09-06)
code\gensim_preprocess.py (13185, 2021-09-06)
code\run.py (2904, 2021-09-06)
code\tokenizer.py (10366, 2021-09-06)
code\utils.py (2970, 2021-09-06)
requirements.txt (105, 2021-09-06)
# PacSum
This code accompanies the paper [Sentence Centrality Revisited for Unsupervised Summarization](https://arxiv.org/pdf/1906.03508.pdf) (ACL 2019).
Some code is borrowed from [pytorch_pretrained_bert](https://github.com/huggingface/pytorch-transformers) and [gensim](https://github.com/RaRe-Technologies/gensim).
-------
### Dependencies
Python 3.6, PyTorch >= 1.0, numpy, gensim, pyrouge
-------
### Data used in the paper:
Download https://drive.google.com/open?id=1gNKWkZG4dVr5XrOeQBVicy1fdnpH2d5l
### BERT models fine-tuned using the approach in the paper:
Download https://drive.google.com/file/d/1wbMlLmnbD_0j7Qs8YY8cSCh935WKKdsP/view?usp=sharing
### Tuning the hyperparameters and testing performance using TF-IDF or BERT representations
```
python run.py --rep tfidf --mode tune --tune_data_file path/to/validation/data --test_data_file path/to/test/data
```
```
python run.py --rep bert --mode tune --tune_data_file path/to/validation/data --test_data_file path/to/test/data --bert_model_file path/to/model --bert_config_file path/to/config --bert_vocab_file path/to/vocab
```
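For orientation, here is a minimal sketch of the kind of position-augmented centrality score that the `tune` mode searches hyperparameters for. It is written from the paper's description, not copied from `extractor.py`, so treat every name in it as an assumption: `pacsum_scores`, `select_top_k`, `beta`, `lambda1`, and `lambda2` are hypothetical, and taking the minimum and maximum over off-diagonal pairs only is a choice made for this sketch. The input is any symmetric sentence-similarity matrix, whether it comes from TF-IDF or BERT representations.

```python
import numpy as np


def pacsum_scores(sim, beta=0.0, lambda1=0.0, lambda2=1.0):
    """Score sentences by position-augmented centrality (illustrative sketch).

    sim     : (n, n) symmetric matrix of pairwise sentence similarities.
    beta    : in [0, 1]; controls how many weak edges are pruned.
    lambda1 : weight on edges to preceding sentences (j < i).
    lambda2 : weight on edges to following sentences (j > i).
    """
    sim = np.asarray(sim, dtype=float)
    n = sim.shape[0]
    if n < 2:
        return np.zeros(n)

    # Assumption: min/max are taken over off-diagonal pairs only.
    off_diag = sim[~np.eye(n, dtype=bool)]
    lo, hi = off_diag.min(), off_diag.max()

    # Shift by a threshold between min and max, then zero out weak edges.
    edges = np.maximum(sim - (lo + beta * (hi - lo)), 0.0)
    np.fill_diagonal(edges, 0.0)

    # Row i of the strict lower triangle holds edges to earlier sentences,
    # row i of the strict upper triangle holds edges to later sentences.
    backward = np.tril(edges, k=-1).sum(axis=1)
    forward = np.triu(edges, k=1).sum(axis=1)
    return lambda1 * backward + lambda2 * forward


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring sentences in document order."""
    ranked = np.argsort(-np.asarray(scores), kind="stable")[:k]
    return sorted(int(i) for i in ranked)


if __name__ == "__main__":
    toy_sim = [
        [1.0, 0.8, 0.2],
        [0.8, 1.0, 0.4],
        [0.2, 0.4, 1.0],
    ]
    scores = pacsum_scores(toy_sim, beta=0.0, lambda1=0.5, lambda2=0.5)
    print(select_top_k(scores, k=2))
```

Weighting the forward and backward sums separately is what makes the graph directed: with `lambda2` larger than `lambda1`, a sentence gains more from its similarity to later sentences than to earlier ones, which favours sentences near the start of a document.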