# News Category Classification with BERT
Identify the type of news based on headlines and short descriptions.
# Dataset
This dataset contains around 200k news headlines from 2012 to 2018, obtained from HuffPost. A model trained on this dataset could be used to suggest category tags for untagged news articles, or to characterize the type of language used in different kinds of news coverage.
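A minimal preprocessing sketch for such a dataset is shown below. The column names (`headline`, `short_description`, `category`) are assumptions about `data/cleaned.csv` based on the standard HuffPost news dataset; the toy rows here stand in for the real CSV.

```python
# Hedged preprocessing sketch; real code would read data/cleaned.csv instead
# of constructing this toy DataFrame.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "headline": ["Stocks rally", "Team wins final", "New phone released", "Election results in"],
    "short_description": ["Markets up", "A close game", "Specs leaked", "Votes counted"],
    "category": ["BUSINESS", "SPORTS", "TECH", "POLITICS"],
})

# Combine headline and short description into one input text per article
df["text"] = df["headline"] + " " + df["short_description"]

# Map category names to integer label ids for the classifier
labels = sorted(df["category"].unique())
label2id = {c: i for i, c in enumerate(labels)}
df["label"] = df["category"].map(label2id)

train_df, test_df = train_test_split(df, test_size=0.25, random_state=42)
```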
# Implementations
- [x] BERT (Fine-Tuning)
- [x] Bi-GRU + CONV
- [x] LSTM + Attention
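To illustrate the second model family, here is a hedged PyTorch sketch of a Bi-GRU + Conv classifier. The layer sizes, kernel width, and global max pooling are illustrative choices, not necessarily the exact architecture in the notebooks.

```python
# Illustrative Bi-GRU + Conv text classifier; hyperparameters are assumptions.
import torch
import torch.nn as nn

class BiGRUConv(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden=128, num_classes=41):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.conv = nn.Conv1d(2 * hidden, 64, kernel_size=3, padding=1)
        self.fc = nn.Linear(64, num_classes)

    def forward(self, x):                    # x: (batch, seq_len) token ids
        h, _ = self.gru(self.embedding(x))   # (batch, seq_len, 2*hidden)
        c = torch.relu(self.conv(h.transpose(1, 2)))  # (batch, 64, seq_len)
        pooled = c.max(dim=2).values         # global max pooling over time
        return self.fc(pooled)               # (batch, num_classes) logits

model = BiGRUConv(vocab_size=1000)
logits = model(torch.randint(0, 1000, (8, 20)))  # batch of 8, 20 tokens each
```

In practice the embedding layer would be initialized from pretrained GloVe vectors rather than learned from scratch, as noted in the TL;DR below.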
# TL;DR
* [glove.840B.300d](http://nlp.stanford.edu/data/glove.840B.300d.zip) (840B tokens, 2.2M vocab, cased, 300d vectors, 2.03 GB download) was used as the embedding layer for the Bi-GRU and LSTM models.
* bert-base-uncased (12-layer, 768-hidden, 12-heads, 110M parameters) pre-trained model was used.
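Using GloVe as the embedding layer amounts to building an embedding matrix indexed by the model's vocabulary. The sketch below shows the idea with toy 4-dimensional vectors standing in for `glove.840B.300d` (the real vectors are 300-dimensional and parsed from the downloaded text file); the special tokens and OOV initialization are assumptions.

```python
# Hedged sketch: build an embedding matrix from pretrained word vectors.
import numpy as np

embed_dim = 4  # 300 for glove.840B.300d
glove = {                        # token -> vector, as parsed from the .txt file
    "news": np.ones(embed_dim),
    "bert": np.full(embed_dim, 2.0),
}
vocab = ["<pad>", "<unk>", "news", "bert", "sports"]

matrix = np.zeros((len(vocab), embed_dim), dtype=np.float32)
for i, token in enumerate(vocab):
    if token in glove:
        matrix[i] = glove[token]          # known token: copy pretrained vector
    elif token != "<pad>":                # pad stays all-zero
        matrix[i] = np.random.normal(scale=0.1, size=embed_dim)  # OOV init
```

This matrix can then be loaded into the model's embedding layer (e.g. `nn.Embedding.from_pretrained` in PyTorch), optionally frozen during training.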
# Results
- `BERT` - test_accuracy: 0.72, test_loss: 0.0015671474330127238
- `Bidirectional GRU + Conv` - test_accuracy: 0.6545
- `LSTM with Attention` - test_accuracy: 0.67144
# Requirements
* Python 3.6
* PyTorch 0.4.1/1.0.0 - for building the model architectures
* pytorch-pretrained-bert - https://github.com/huggingface/pytorch-pretrained-BERT