BERT-for-20NewsGroups
Category: Data mining / data warehousing
Development tool: Python
File size: 14770 KB
Upload date: 2021-06-22 02:36:33
Uploader: sh-1993
Description: BERT-for-20NewsGroups, course paper for "2021 Medical Health Data Analysis and Mining": a news classification experiment on the 20NewsGroups dataset based on BERT.
File list:
data (0, 2021-06-22)
data\20news.test.txt (13826858, 2021-06-22)
data\20news.train.txt (22094083, 2021-06-22)
data\label.txt (332, 2021-06-22)
logging (0, 2021-06-22)
logging\checkpoint.Large.txt (130, 2021-06-22)
logging\checkpoint.base.txt (130, 2021-06-22)
logging\train-BertClassifier.Base.log (35080, 2021-06-22)
logging\train-BertClassifier.Large.log (19180, 2021-06-22)
src (0, 2021-06-22)
src\__pycache__ (0, 2021-06-22)
src\__pycache__\config.cpython-37.pyc (2073, 2021-06-22)
src\__pycache__\dataloader.cpython-37.pyc (4973, 2021-06-22)
src\__pycache__\model.cpython-37.pyc (1326, 2021-06-22)
src\__pycache__\utils.cpython-37.pyc (1476, 2021-06-22)
src\config.py (1857, 2021-06-22)
src\main.py (9164, 2021-06-22)
src\model.py (1244, 2021-06-22)
src\news_dataloader.py (3248, 2021-06-22)
src\utils.py (1450, 2021-06-22)
姚昕智-医学健康数据挖掘.pdf (414799, 2021-06-22)
# BERT-for-20NewsGroups
Course paper for "2021 Medical Health Data Analysis and Mining": a news classification experiment on the 20NewsGroups dataset based on BERT.
### Virtual Environment
You can create a virtual environment to run the project in isolation.
```
# Create a virtual environment ($env_name is a name of your choice)
pip3 install virtualenv
pip3 install virtualenvwrapper
virtualenv -p /usr/local/bin/python3.6 $env_name --clear
# Activate the virtual environment
source $env_name/bin/activate
# Deactivate the virtual environment
deactivate
```
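As an alternative to the `virtualenv` commands above, the environment can also be created from Python with the standard-library `venv` module. This is a swapped-in technique, not what the project itself uses. The helper name `create_env` is hypothetical, and `clear=True` is used here only as a rough counterpart of the `--clear` flag.

```python
import venv
from pathlib import Path


def create_env(env_dir: str, with_pip: bool = True) -> Path:
    """Create a virtual environment at env_dir with the stdlib venv module."""
    # clear=True empties env_dir first if it already exists.
    venv.create(env_dir, clear=True, with_pip=with_pip)
    return Path(env_dir)
```

Calling `create_env("my_env")` uses the interpreter that runs the script, so the Python version is the one you invoke it with. Activation still happens in the shell with `source my_env/bin/activate`.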
### Requirements
```
pip3 install -r requirements.txt
```
If torch cannot be installed automatically from requirements.txt, delete the torch entry from that file and get the installation command from the [official PyTorch website](https://pytorch.org/). Note that the installed torch version must match the one listed in requirements.txt.
**OSX**
```
pip3 install torch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2
```
**Linux and Windows**
```
# CUDA 11.0
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
# CUDA 10.2
pip install torch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2
# CUDA 10.1
pip install torch==1.7.1+cu101 torchvision==0.8.2+cu101 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
# CUDA 9.2
pip install torch==1.7.1+cu92 torchvision==0.8.2+cu92 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
# CPU only
pip install torch==1.7.1+cpu torchvision==0.8.2+cpu torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
```
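The note above says the installed torch version must match the one pinned in requirements.txt. A minimal sketch of that check follows. The helper names `pinned_version` and `versions_match` are hypothetical, and the sketch assumes the pin is written in the `name==version` form. Local build tags such as `+cu110` are ignored in the comparison.

```python
import re
from typing import Optional


def pinned_version(requirements_text: str, package: str) -> Optional[str]:
    """Return the version pinned as `package==X.Y.Z`, or None if absent."""
    pattern = re.compile(
        r"^\s*" + re.escape(package) + r"\s*==\s*([^\s;#]+)", re.IGNORECASE
    )
    for line in requirements_text.splitlines():
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


def versions_match(installed: str, pinned: str) -> bool:
    """Compare two versions, ignoring local build tags such as '+cu110'."""
    return installed.split("+")[0] == pinned.split("+")[0]
```

After installation, you could read requirements.txt, call `pinned_version(text, "torch")`, and compare the result against `torch.__version__` with `versions_match`. If the pin is missing, `pinned_version` returns `None`, which should be handled before comparing.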
### Default Run
**Create directories**
Before running, create two folders, **logging** and **models**, in the project folder.
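The two folders can also be created from Python. The sketch below assumes they belong directly under the project root. The helper name `ensure_run_dirs` is hypothetical and not part of the project.

```python
from pathlib import Path
from typing import List, Sequence


def ensure_run_dirs(
    project_root: str, names: Sequence[str] = ("logging", "models")
) -> List[Path]:
    """Create the named folders under project_root if missing; return their paths."""
    paths = []
    for name in names:
        path = Path(project_root) / name
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        paths.append(path)
    return paths
```

Calling `ensure_run_dirs(".")` from the project folder is safe to repeat, because existing folders are left untouched.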
**Model training and evaluation**
```
python3 main.py
```
**Modify hyperparameters**
You can modify the model hyperparameters by editing the config.py file.
```vi config.py```
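The contents of config.py are not shown here, so the following is only an illustration of how a hyperparameter configuration of this kind might be organized. Every field name and default value is an assumption, except that 20NewsGroups has 20 classes. It requires Python 3.7 or later for `dataclasses`.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical hyperparameters; the real config.py may differ."""
    bert_model: str = "bert-base-uncased"  # assumed model name
    num_labels: int = 20                   # 20NewsGroups has 20 classes
    max_seq_len: int = 256                 # assumed value
    batch_size: int = 16                   # assumed value
    learning_rate: float = 2e-5            # assumed value
    epochs: int = 3                        # assumed value


base = TrainConfig()
# Derive a variant for the larger model without mutating the defaults.
large = replace(base, bert_model="bert-large-uncased", batch_size=8)
```

Keeping the defaults in one immutable object and deriving variants with `replace` makes it easy to see how the base and large runs differ.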
### Training Log
Training logs are stored in the logging folder, one for the BERT-base model and one for the BERT-large model.