News-Classifier
所属分类:其他
开发工具:Jupyter Notebook
文件大小:0KB
下载次数:0
上传日期:2024-04-05 17:08:13
上 传 者:
sh-1993
说明: NLP模型用于真假新闻的分类和POS标记
(NLP model to classify fake news from true news and perform POS Tagging)
文件列表:
Fake True News Detection.zip
News Classification Model.ipynb
# News-Classifier
### Project Description:
This project aims to build a supervised learning NLP model to classify fake news from true news using the "News Classifier Dataset."
The dataset can be found at Kaggle.
Link: [https://www.kaggle.com/datasets/saurabhshahane/fake-news-classification](https://www.kaggle.com/datasets/emineyetm/fake-news-detection-datasets)
Embedding used are given under this link: [https://tinyurl.com/NLPLAB3DATA](https://tinyurl.com/NLPLAB3DATA)
### Preprocessing Steps:
Tokenization: The text data will be tokenized into individual words or tokens for further analysis.
Stopword Removal: Commonly occurring words (stopwords) that do not contribute much to the classification will be removed.
Lemmatization : Words will be reduced to their base or root form using lemmatization to reduce feature dimensionality.
### Exploratory Data Analysis (EDA)
EDA provides insights into the dataset's characteristics and distributions. We conduct analyses such as bigram frequency, word frequency mapping, word cloud visualization, and KDE plot for word tokens.
### Vectorization
Two vectorization techniques, Bag of Words and TF-IDF, are employed to prepare the data for classification tasks. The accuracies of these techniques are compared for classification purposes.
### POS Tagging and Fake News Detection
We delve into POS tagging using NLTK and apply various machine learning models for fake news detection. Models include Naive Bayes, LSTM, RNN, and Bidirectional LSTM. Performance metrics such as confusion matrices are presented to evaluate the effectiveness of each model.
### Models:
Logistic Regression Model,
Passive Aggressive Classifier Model
Naive Bayes Model
LSTM
RNN
Bidirectional LSTM
近期下载者:
相关文件:
收藏者: