fake-news-detection

Category: Natural Language Processing
Development tool: Jupyter Notebook
File size: 0KB
Downloads: 0
Upload date: 2023-11-14 14:02:20
Uploader: sh-1993
Description: Distinguishing between genuine and misleading news is more crucial than ever. Our Fake News Detection project addresses this contemporary challenge using the tools and techniques of NLP. The primary objective is to construct a machine learning model adept at discerning the alignment of a news headline with its corresponding article body.

File list:
LICENSE (1106, 2023-11-29)
Makefile (4264, 2023-11-29)
data/ (0, 2023-11-29)
data/external/ (0, 2023-11-29)
data/interim/ (0, 2023-11-29)
data/processed/ (0, 2023-11-29)
data/raw/ (0, 2023-11-29)
data/raw/test_bodies.csv (2045680, 2023-11-29)
data/raw/test_stances_unlabeled.csv (1940688, 2023-11-29)
data/raw/train_bodies.csv (3752301, 2023-11-29)
data/raw/train_stances.csv (4255300, 2023-11-29)
data/raw/train_stances.random.csv (4201299, 2023-11-29)
docs/ (0, 2023-11-29)
docs/Makefile (5616, 2023-11-29)
docs/commands.rst (363, 2023-11-29)
docs/conf.py (8484, 2023-11-29)
docs/getting-started.rst (256, 2023-11-29)
docs/index.rst (451, 2023-11-29)
docs/make.bat (5122, 2023-11-29)
models/ (0, 2023-11-29)
notebooks/ (0, 2023-11-29)
notebooks/cleaning.ipynb (10364, 2023-11-29)
references/ (0, 2023-11-29)
reports/ (0, 2023-11-29)
reports/figures/ (0, 2023-11-29)
requirements.txt (103, 2023-11-29)
setup.py (781, 2023-11-29)
src/ (0, 2023-11-29)
src/__init__.py (0, 2023-11-29)
... ...

Fake News Detection
==============================

In an age where digital information is ubiquitous, distinguishing between genuine and misleading news is more crucial than ever. Our Fake News Detection project addresses this contemporary challenge using the tools and techniques of Natural Language Processing (NLP). The primary objective is to construct a machine learning model adept at discerning the alignment of a news headline with its corresponding article body. This endeavor is not just an academic exercise but a vital step towards mitigating the spread of misinformation in today's fast-paced digital world.

Project Organization
------------

    ├── LICENSE
    ├── Makefile           <- Makefile with commands like `make data` or `make train`
    ├── README.md          <- The top-level README for developers using this project.
    ├── data
    │   ├── external       <- Data from third party sources.
    │   ├── interim        <- Intermediate data that has been transformed.
    │   ├── processed      <- The final, canonical data sets for modeling.
    │   └── raw            <- The original, immutable data dump.
    │
    ├── docs               <- A default Sphinx project; see sphinx-doc.org for details
    │
    ├── models             <- Trained and serialized models, model predictions, or model summaries
    │
    ├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
    │                         the creator's initials, and a short `-` delimited description, e.g.
    │                         `1.0-jqp-initial-data-exploration`.
    │
    ├── references         <- Data dictionaries, manuals, and all other explanatory materials.
    │
    ├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
    │   └── figures        <- Generated graphics and figures to be used in reporting
    │
    ├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
    │                         generated with `pip freeze > requirements.txt`
    │
    ├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
    ├── src                <- Source code for use in this project.
    │   ├── __init__.py    <- Makes src a Python module
    │   │
    │   ├── data           <- Scripts to download or generate data
    │   │   └── make_dataset.py
    │   │
    │   ├── features       <- Scripts to turn raw data into features for modeling
    │   │   └── build_features.py
    │   │
    │   ├── models         <- Scripts to train models and then use trained models to make
    │   │   │                 predictions
    │   │   ├── predict_model.py
    │   │   └── train_model.py
    │   │
    │   └── visualization  <- Scripts to create exploratory and results oriented visualizations
    │       └── visualize.py
    │
    └── tox.ini            <- tox file with settings for running tox; see tox.readthedocs.io

--------
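The raw data layout (`train_bodies.csv` paired with `train_stances.csv`) matches the FNC-1 stance-detection format, where each stance record references an article body by ID. The join can be sketched as below; the column names (`Headline`, `Body ID`, `Stance`, `articleBody`) are assumed from the FNC-1 convention and should be checked against the actual files, and inline sample data stands in for the real CSVs:

```python
import csv
import io

# Stand-in for data/raw/train_bodies.csv and train_stances.csv
# (assumed FNC-1 column names; adjust to the real files).
bodies_csv = io.StringIO(
    "Body ID,articleBody\n"
    "0,Officials confirmed the report on Tuesday.\n"
)
stances_csv = io.StringIO(
    "Headline,Body ID,Stance\n"
    "Report confirmed by officials,0,agree\n"
)

# Index article bodies by ID so each stance row can look up its text.
bodies = {row["Body ID"]: row["articleBody"] for row in csv.DictReader(bodies_csv)}

# Pair every (headline, body) with its stance label for model training.
pairs = [
    (row["Headline"], bodies[row["Body ID"]], row["Stance"])
    for row in csv.DictReader(stances_csv)
]
print(pairs[0])
```

With the real files, the `io.StringIO` objects would be replaced by `open("data/raw/train_bodies.csv")` and `open("data/raw/train_stances.csv")`.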

Project based on the cookiecutter data science project template. #cookiecutterdatascience

---------

# TODO:

## Models to compare

- LLM: Mistral7B (open source)
- A two-step approach that uses automatic summaries of news articles (generated with BERT) to determine the stance of a headline relative to its associated article body. The approach feeds the summaries to the two classifiers instead of the full article text, which reduces the amount of information to process while retaining the important information.
- Justify the model choices
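The shape of the two-step pipeline (summarize the body, then classify the headline against the summary) can be sketched with a minimal word-overlap summarizer. This is only an illustrative stand-in for the BERT-based summarizer the TODO describes, not the project's implementation:

```python
import re

def extractive_summary(headline: str, body: str, k: int = 2) -> str:
    """Illustrative stand-in for a BERT summarizer: keep the k body
    sentences with the highest word overlap with the headline."""
    def tokenize(s):
        return set(re.findall(r"[a-z']+", s.lower()))

    head = tokenize(headline)
    sentences = re.split(r"(?<=[.!?])\s+", body.strip())
    # Rank sentences by shared vocabulary with the headline.
    ranked = sorted(sentences, key=lambda s: len(tokenize(s) & head), reverse=True)
    kept = set(ranked[:k])
    # Re-emit kept sentences in their original order.
    return " ".join(s for s in sentences if s in kept)

headline = "Officials deny the funding report"
body = ("The city council met on Monday. Officials denied the report "
        "about new funding. The weather was mild. A vote is expected soon.")
summary = extractive_summary(headline, body)
print(summary)
```

The stance classifiers would then receive `(headline, summary)` pairs instead of `(headline, body)`, which is the information-reduction step the TODO motivates.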
