TextClassification-NLP

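Category: Natural Language Processing
Development tool: Jupyter Notebook
File size: 58096 KB
Downloads: 0
Upload date: 2020-04-03 13:20:34
Uploader: sh-1993
Description: Text classification NLP, NLP assignment
(TextClassification-NLP, Tugas NLP)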

File list:
1301160063_Agnes Leady Octaviana-0.1.pdf (389552, 2020-04-03)
1301160063_Agnes Leady Octaviana.pdf (310499, 2020-04-03)
1_Dataset_Creation.ipynb (7742, 2020-04-03)
2_Data_Analysis.ipynb (128609, 2020-04-03)
3_Feature_Engineering (1).ipynb (45533, 2020-04-03)
3_Feature_Engineering.ipynb (43456, 2020-04-03)
4_Training_Classifier.ipynb (44108, 2020-04-03)
Data (0, 2020-04-03)
Data\Data.zip (15155372, 2020-04-03)
Data\News_dataset (1).csv (5146195, 2020-04-03)
Data\News_dataset.csv (5146195, 2020-04-03)
Data\News_dataset.pickle (4964731, 2020-04-03)
Data\X_test.csv (8898210, 2020-04-03)
Data\X_test.pickle (549478, 2020-04-03)
Data\X_train.csv (8898210, 2020-04-03)
Data\X_train.pickle (3254071, 2020-04-03)
Data\bbc (0, 2020-04-03)
Data\bbc\tech (0, 2020-04-03)
Data\bbc\tech\000 (1, 2020-04-03)
Data\df.csv (8898210, 2020-04-03)
Data\df.pickle (8714512, 2020-04-03)
Data\features_test.csv (8898210, 2020-04-03)
Data\features_test.pickle (801762, 2020-04-03)
Data\features_train.csv (8898210, 2020-04-03)
Data\features_train.pickle (4538562, 2020-04-03)
Data\labels_test.csv (8898210, 2020-04-03)
Data\labels_test.pickle (8716, 2020-04-03)
Data\labels_train.csv (8898210, 2020-04-03)
Data\labels_train.pickle (46084, 2020-04-03)
Data\tfidf.csv (8898210, 2020-04-03)
Data\tfidf.pickle (6883758, 2020-04-03)
Data\y_test.csv (8898210, 2020-04-03)
Data\y_test.pickle (8716, 2020-04-03)
Data\y_train.csv (8898210, 2020-04-03)
Data\y_train.pickle (46084, 2020-04-03)

{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Readme" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is a hands-on text classification exercise, from preparing the data through training a classifier.\n", "\n", "The data used is in Data/bbc, which contains news articles in 5 categories that also serve as the labels for this task:\n", "\n", "- business\n", "- entertainment\n", "- politics\n", "- sport\n", "- tech\n", "\n", "Work through the following steps, reading and running the code one stage at a time:\n", "1. Dataset Creation\n", "2. Data Analysis\n", "3. Feature Engineering\n", "4. Training Classifier\n", "\n", "Study the output of each step and make sure you understand it. Then try changing the parameters to see what effect they have.\n", "\n", "Finally, answer the questions in the Latihan (Exercises) section of the file 4. Training Classifier.\n", "\n", "Submit to the classroom on Wednesday, \n", "\n", "\n", "\n", "\n", "References:\n", "1. http://mlg.ucd.ie/datasets/bbc.html\n", "2. https://github.com/miguelfzafra/Latest-News-Classifier" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.***" } }, "nbformat": 4, "nbformat_minor": 2 }
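
The Readme says that Data/bbc holds one folder per category and that the folder names double as labels. As a rough illustration of what the Dataset Creation step might look like, here is a minimal standard-library sketch. It is not the code from 1_Dataset_Creation.ipynb: the helper names (`load_bbc_folder`, `save_rows`, `category_counts`), the column names, and the `errors="replace"` decoding choice are all assumptions made for this example.

```python
import csv
from collections import Counter
from pathlib import Path

# The five category folders named in the Readme; they double as labels.
CATEGORIES = ["business", "entertainment", "politics", "sport", "tech"]


def load_bbc_folder(root):
    """Read every file under root/<category>/ into a list of row dicts.

    Hypothetical helper: the name and the row layout are assumptions.
    """
    rows = []
    for category in CATEGORIES:
        folder = Path(root) / category
        if not folder.is_dir():
            continue  # tolerate a partial copy of the dataset
        for path in sorted(folder.iterdir()):
            if not path.is_file():
                continue
            # Assumption: replace undecodable bytes instead of failing.
            text = path.read_text(encoding="utf-8", errors="replace")
            rows.append(
                {"file_name": path.name, "content": text, "category": category}
            )
    return rows


def save_rows(rows, csv_path):
    """Write the rows to a CSV file with one article per line."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=["file_name", "content", "category"]
        )
        writer.writeheader()
        writer.writerows(rows)


def category_counts(rows):
    """Count articles per label, a first check for class imbalance."""
    return Counter(row["category"] for row in rows)
```

Under these assumptions, `rows = load_bbc_folder("Data/bbc")` followed by `save_rows(rows, "Data/News_dataset.csv")` would produce a single table of articles and labels, and `category_counts(rows)` gives a quick view of how balanced the five classes are before moving on to Data Analysis.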
