Fake-News-Detection

Category: GPT/ChatGPT
Development tool: Jupyter Notebook
File size: 5KB
Downloads: 0
Upload date: 2020-09-14 20:06:49
Uploader: sh-1993
Description: Differentiating real news from the fake news generated by GPT-2.

File list:
FNEWS.ipynb (31546, 2020-09-15)

# Fake-News-Detection

## Introduction

Today we are producing more information than ever before, but not all of it is true. Some of it is malicious and harmful, which makes it harder to trust any piece of information we come across. On top of that, bad actors can now use language modelling tools such as OpenAI's GPT-2 to generate fake news. Ever since its initial release, there has been discussion of how it could be misused to generate misleading news articles, to automate the production of abusive or fake content for social media, and to automate the creation of spam and phishing content. How do we figure out what is true and what is fake? Can we do something about it?

## Problem Statement

Differentiate real news from the fake news generated by GPT-2: given a dataset of various texts, predict whether each one is real or fake.

## Dataset

The dataset consists of around 387,000 texts, sourced from various news articles on the web as well as from text generated by OpenAI's GPT-2 language model. The dataset is split into train, val and test sets such that each set has an equal split of the two classes.

## Files

The data can be downloaded from `https://www.aicrowd.com/challenges/ai-for-good-ai-blitz-3/problems/fnews`

## Evaluation Criteria

`F1 Score`

## Preprocessing

The dataset contains an equal number of examples of both classes. Preprocessing of the corpus (tokenizing, building the vocabulary, and padding the sequences to equal length) is done with the Keras preprocessing library. Stopwords are not removed, because removing them can change the meaning of a sentence.

## Models

1. Sequential model with a simple embedding layer.
2. Sequential model with an embedding layer and a Conv1D layer.
3. Sequential model with an embedding layer and a bidirectional layer.
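The three preprocessing steps named above (tokenizing, building the vocabulary, padding) can be illustrated with a small dependency-free sketch. This is not the notebook's code, which uses the Keras preprocessing library; the function names, the `<pad>`/`<oov>` tokens, the whitespace tokenizer, and the choice to pad and truncate at the front are all assumptions made for illustration.

```python
from collections import Counter

PAD, OOV = "<pad>", "<oov>"  # hypothetical reserved tokens


def tokenize(text):
    # Lowercase and split on whitespace; stopwords are deliberately kept.
    return text.lower().split()


def build_vocab(texts, max_words=None):
    # Count every token, then assign indices by descending frequency.
    counts = Counter(tok for text in texts for tok in tokenize(text))
    vocab = {PAD: 0, OOV: 1}  # 0 is reserved for padding, 1 for unknown words
    for word, _ in counts.most_common(max_words):
        vocab[word] = len(vocab)
    return vocab


def encode(text, vocab):
    # Map each token to its index, falling back to the OOV index.
    return [vocab.get(tok, vocab[OOV]) for tok in tokenize(text)]


def pad(seq, max_len):
    # Keep the last max_len tokens, then left-pad with zeros to max_len.
    seq = seq[-max_len:]
    return [0] * (max_len - len(seq)) + seq


train_texts = ["the news is real", "the news is fake news"]
vocab = build_vocab(train_texts)
padded = [pad(encode(t, vocab), max_len=6) for t in train_texts]
```

Every row of `padded` has the same length, which is what an embedding layer expects as input. Words that never appeared in the training texts map to the OOV index at prediction time.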
## Results

- Model 1 achieved a train accuracy of 96% and a val accuracy of 95%.
- Model 2 achieved a train accuracy of 99% and a val accuracy of 97%.
- Model 3 achieved a train accuracy of 99.5% and a val accuracy of 97.3%.

## Test dataset

Model 3 achieved the highest F1 score on the leaderboard, 96.7%.
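For reference, the F1 score used as the evaluation criterion is the harmonic mean of precision and recall. The sketch below computes it in plain Python for binary labels. Treating label `1` as the positive class is an assumption, since the source does not say how the challenge encodes real and fake, and the example labels are made up purely to exercise the function.

```python
def f1_score(y_true, y_pred, positive=1):
    # Count true positives, false positives and false negatives.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # Precision or recall is zero (or undefined), so report 0.0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative labels only, not data from the challenge.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
score = f1_score(y_true, y_pred)
```

Because the classes are balanced in this dataset, F1 and accuracy tend to track each other closely, but F1 still penalizes a model that trades recall for precision or the reverse.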
