CALM

Category: Automatic programming
Development tool: Python
File size: 27828 KB
Downloads: 0
Upload date: 2021-09-16 07:36:36
Uploader: sh-1993
Description: Source code for the ICLR 2021 paper "Pre-training Text-to-Text Transformers for Concept-centric Common Sense"

File list:
CALM (0, 2021-09-16)
CALM\__init__.py (0, 2021-09-16)
CALM\dataset.py (34130, 2021-09-16)
CALM\dataset_utils (0, 2021-09-16)
CALM\dataset_utils\__init__.py (0, 2021-09-16)
CALM\dataset_utils\concept_deshuffling_data_generation.py (1698, 2021-09-16)
CALM\dataset_utils\generate_discriminative_dataset.py (26800, 2021-09-16)
CALM\dataset_utils\keyword_lm_data_generation.py (1968, 2021-09-16)
CALM\dataset_utils\mix_dataset.py (3509, 2021-09-16)
CALM\datasets (0, 2021-09-16)
CALM\datasets\c2s (0, 2021-09-16)
CALM\datasets\cor (0, 2021-09-16)
CALM\datasets\csqa (0, 2021-09-16)
CALM\datasets\mix (0, 2021-09-16)
CALM\datasets\openbookqa (0, 2021-09-16)
CALM\datasets\option1 (0, 2021-09-16)
CALM\datasets\option2 (0, 2021-09-16)
CALM\datasets\option3 (0, 2021-09-16)
CALM\datasets\piqa (0, 2021-09-16)
CALM\datasets\wiki (0, 2021-09-16)
CALM\datasets\wiki\wiki.train.raw (61045883, 2021-09-16)
CALM\datasets\wiki\wiki.valid.raw (12348595, 2021-09-16)
CALM\finetune.py (4951, 2021-09-16)
CALM\finetune_generator_discriminator.py (4942, 2021-09-16)
CALM\generator (0, 2021-09-16)
CALM\generator\__init__.py (0, 2021-09-16)
CALM\generator\concept (0, 2021-09-16)
CALM\generator\concept\__init__.py (0, 2021-09-16)
CALM\generator\concept\concept_generator.py (3770, 2021-09-16)
... ...

# Pre-training Text-to-Text Transformers for Concept-centric Common Sense

This is the code for the ICLR 2021 paper: [Pre-training Text-to-Text Transformers for Concept-centric Common Sense](https://openreview.net/forum?id=3k20LAiHYL2). Check out our [project website](https://inklab.usc.edu/calm-project) for details!

## Installation

```
conda create -n calm python==3.7
conda activate calm
python setup.py install
cd CALM
```

## Preprocessing for CALM

### Wiki pre-processing

```
cat wiki.doc | tail -n +500000 | head -n 500000 > wiki/wiki.train.raw
cat wiki.doc | tail -n +1000000 | head -n 100000 > wiki/wiki.valid.raw
```

### Generative Objective

```
python dataset_utils/concept_deshuffling_data_generation.py
python dataset_utils/keyword_lm_data_generation.py
```

These scripts create the datasets for the concept-order-recovering (COR) and concept-to-sentence (C2S) objectives.

### Contrastive Objective

```
python dataset_utils/generate_discriminative_dataset.py
```

This script creates the dataset for the generative question answering (QA) objective.
There are three types of contrastive objectives (see Table 4 (b) in the [paper](https://openreview.net/forum?id=3k20LAiHYL2)):

```
Option 1: Multi-choice QA
Option 2: Generative QA
Option 3: True/False
```

For CALM, we use Option 2, which is Generative QA.

### Mix the three datasets

```
python dataset_utils/mix_dataset.py
```

## Pre-training

### Pre-train CALM_mix

First, pre-train on the mixed dataset.

```
python finetune.py \
  --data_dir datasets/mix \
  --output_dir outputs/calm_mix_base \
  --model_name_or_path t5-base \
  --tokenizer_name_or_path t5-base \
  --max_seq_length 256 \
  --learning_rate 5e-4 \
  --num_train_epochs 2 \
  --train_batch_size 8 \
  --graident_accumulation_steps 4 \
  --weight_decay 0.01 \
  --warmup_steps 10000 \
  --adam_epsilon 1e-6 \
  --n_gpu 4 \
  --gpu_nums 4,5,6,7 \
  --model_parallel

python finetune.py \
  --data_dir datasets/mix \
  --output_dir outputs/calm_mix_large_dp \
  --model_name_or_path t5-large \
  --tokenizer_name_or_path t5-large \
  --max_seq_length 256 \
  --learning_rate 5e-4 \
  --num_train_epochs 2 \
  --train_batch_size 8 \
  --graident_accumulation_steps 4 \
  --weight_decay 0.01 \
  --warmup_steps 10000 \
  --adam_epsilon 1e-6
```

### Pre-train CALM

Then, train CALM from the CALM_mix checkpoint.

```
python finetune_generator_discriminator.py \
  --data_dir datasets/option2 \
  --checkpoint_dir outputs/calm_mix \
  --output_dir outputs/calm \
  --max_seq_length 256 \
  --learning_rate 5e-7 \
  --num_train_epochs 3 \
  --train_batch_size 8 \
  --graident_accumulation_steps 32 \
  --fp_16 False \
  --weight_decay 0.01 \
  --warmup_steps 10000 \
  --adam_epsilon 1e-6 \
  --n_gpu 8 \
  --gpu_nums 0,1,2,3,4,5,6,7

python finetune_generator_discriminator.py \
  --data_dir datasets/option2 \
  --checkpoint_dir outputs/calm_mix_base_dp \
  --output_dir outputs/calm_base_dp \
  --max_seq_length 256 \
  --learning_rate 5e-7 \
  --num_train_epochs 3 \
  --train_batch_size 8 \
  --graident_accumulation_steps 32 \
  --fp_16 False \
  --weight_decay 0.01 \
  --warmup_steps 10000 \
  --adam_epsilon 1e-6
```

## Fine-tuning

Use the pre-trained checkpoint to fine-tune on the downstream tasks (CSQA, OBQA, PIQA, aNLI); a sketch of one such run is shown below.
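As a concrete illustration, fine-tuning on CommonsenseQA (the `datasets/csqa` directory in the file list) could reuse `finetune.py`, pointing `--model_name_or_path` at the pre-trained checkpoint. This is a sketch under assumptions: that `finetune.py` accepts a local checkpoint directory and the downstream dataset folders; the hyperparameter values shown are illustrative, not the paper's.

```
# Hypothetical fine-tuning run on CSQA; flags mirror the pre-training
# commands above, and the hyperparameters are illustrative only.
python finetune.py \
  --data_dir datasets/csqa \
  --output_dir outputs/calm_csqa \
  --model_name_or_path outputs/calm \
  --tokenizer_name_or_path t5-base \
  --max_seq_length 256 \
  --learning_rate 1e-4 \
  --num_train_epochs 3 \
  --train_batch_size 8 \
  --graident_accumulation_steps 4 \
  --weight_decay 0.01 \
  --adam_epsilon 1e-6
```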
## Model List

Our released models are listed as follows. You can import these models with [HuggingFace's Transformers](https://github.com/huggingface/transformers).

| Model | CSQA | OBQA | PIQA | aNLI | Description |
|:-------------------------------|:--------:|:--------:|:--------:|:--------:|:--------:|
| [danny911kr/calm-mix-base](https://huggingface.co/danny911kr/calm-mix-base) | 63.02 | 60.40 | 70.07 | 62.79 | Mix-Only |
| [danny911kr/calm-base](https://huggingface.co/danny911kr/calm-base) | 63.32 | 60.90 | 71.01 | 63.20 | |
| [danny911kr/calm-mix-large](https://huggingface.co/danny911kr/calm-mix-large) | 70.26 | 62.50 | 73.70 | 75.99 | Mix-Only |
| [danny911kr/calm-large](https://huggingface.co/danny911kr/calm-large) | 71.31 | 66.00 | 75.11 | 77.12 | |
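Since the released checkpoints are standard T5 models, loading one with HuggingFace's Transformers should follow the usual `from_pretrained` pattern. A minimal sketch; the concept-to-sentence prompt format is an assumption for illustration, not the exact prompt used in the paper:

```
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load a released CALM checkpoint from the HuggingFace hub
tokenizer = T5Tokenizer.from_pretrained("danny911kr/calm-base")
model = T5ForConditionalGeneration.from_pretrained("danny911kr/calm-base")

# Illustrative concept-to-sentence style prompt (the format is an
# assumption, not the exact prompt used during pre-training)
inputs = tokenizer("generate a sentence with: dog frisbee catch",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_length=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```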
