movie-review-sentiment-classification

Category: Natural Language Processing
Development tool: Scheme
File size: 0KB
Downloads: 0
Upload date: 2019-04-28 06:36:19
Uploader: sh-1993
Description: Building a sentiment classifier that takes movie review text and outputs a rating from 1 to 10 using Word2Vec, bidirectional LSTM, AWD LSTM and more.

File list:
.vscode/ (0, 2019-04-27)
.vscode/settings.json (84, 2019-04-27)
LICENSE (1060, 2019-04-27)
data/ (0, 2019-04-27)
data/test_set.ss (4736563, 2019-04-27)
data/training_set.ss (35837256, 2019-04-27)
data/validation_set.ss (4468503, 2019-04-27)
images/ (0, 2019-04-27)
images/drop_connect.jpg (65696, 2019-04-27)
language.w2v.model (27533625, 2019-04-27)
language/ (0, 2019-04-27)
language/movie_corpus.txt (10245491, 2019-04-27)
language/selected/ (0, 2019-04-27)
language/selected/0_0.txt (791, 2019-04-27)
language/selected/1000_0.txt (1558, 2019-04-27)
language/selected/1001_0.txt (4778, 2019-04-27)
language/selected/1002_0.txt (1527, 2019-04-27)
language/selected/1003_0.txt (498, 2019-04-27)
language/selected/1004_0.txt (1042, 2019-04-27)
language/selected/1005_0.txt (2151, 2019-04-27)
language/selected/1006_0.txt (2222, 2019-04-27)
language/selected/1007_0.txt (483, 2019-04-27)
language/selected/1008_0.txt (1475, 2019-04-27)
language/selected/1009_0.txt (2062, 2019-04-27)
language/selected/100_0.txt (315, 2019-04-27)
language/selected/1010_0.txt (909, 2019-04-27)
language/selected/1011_0.txt (687, 2019-04-27)
language/selected/1012_0.txt (634, 2019-04-27)
language/selected/1013_0.txt (2301, 2019-04-27)
language/selected/1014_0.txt (819, 2019-04-27)
language/selected/1015_0.txt (4582, 2019-04-27)
language/selected/1016_0.txt (2361, 2019-04-27)
language/selected/1017_0.txt (968, 2019-04-27)
language/selected/1018_0.txt (705, 2019-04-27)
language/selected/1019_0.txt (930, 2019-04-27)
language/selected/101_0.txt (423, 2019-04-27)
language/selected/1020_0.txt (2399, 2019-04-27)
language/selected/1021_0.txt (3447, 2019-04-27)
... ...

# Movie Review Sentiment Classification

> The project is done using Jupyter Notebook with Python 3.7, PyTorch 1.0.1, fastai 1.0.52, gensim, ...

Building a sentiment classifier that takes movie review text and outputs a rating from 1 to 10 using `Word2Vec`, `Bidirectional LSTM`, `AWD LSTM` and more.

## Directory Structure

```
project
├─data
│  ├─test_set.ss               Test dataset
│  ├─training_set.ss           Training dataset
│  └─validation_set.ss         Validation dataset
├─images                       Notebook images
├─language
│  └─movie_corpus.txt          Corpus for training Word2Vec model
├─rating_model_fastai.ipynb    Plan B notebook
├─rating_model.ipynb           Plan A notebook
├─word2vec.ipynb               Word2Vec model training notebook
│  ...
```

## Report

The report, covering the implementation overview, code explanation and result analysis, is embedded in the notebooks for better coherence.

## Plan A: Bidirectional LSTM with word2vec as embedding

**Training Word2Vec model**

The corpus I used is a self-made `movie review + Harry Potter` sentence collection of about 10 MB. File at [language/movie_corpus.txt](https://github.com/dizys/movie-review-sentiment-classification/blob/master/./language/movie_corpus.txt).

The dimension of the Word2Vec model is 100.

Please see [word2vec.ipynb](https://github.com/dizys/movie-review-sentiment-classification/blob/master/./word2vec.ipynb)

**Training Classifier**

Mainly uses PyTorch.

Please see [rating_model.ipynb](https://github.com/dizys/movie-review-sentiment-classification/blob/master/./rating_model.ipynb)

## Plan B: Transfer Learning LSTM using FastAI

Mainly uses FastAI, a high-level library that makes working with PyTorch easier.

Please see [rating_model_fastai.ipynb](https://github.com/dizys/movie-review-sentiment-classification/blob/master/./rating_model_fastai.ipynb)

## Predictions on the Test Set

The Plan B results were chosen because of their better performance.

Please see [senti_output.ss](https://github.com/dizys/movie-review-sentiment-classification/blob/master/./senti_output.ss)

## License

MIT, see the [LICENSE](https://github.com/dizys/movie-review-sentiment-classification/blob/master//LICENSE) file for details.
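
As a rough illustration of the Plan A idea (100-dimensional word embeddings fed into a bidirectional LSTM that predicts a rating from 1 to 10), here is a minimal PyTorch sketch. It is not the code from `rating_model.ipynb`. The class name `BiLSTMRatingClassifier`, the helper `logits_to_ratings`, the vocabulary size, the hidden size and the random input batch are all assumptions made for illustration. Only the embedding dimension of 100 and the ten rating classes come from the description above.

```python
import torch
import torch.nn as nn


class BiLSTMRatingClassifier(nn.Module):
    """Hypothetical sketch: embeddings -> bidirectional LSTM -> 10 rating classes."""

    def __init__(self, vocab_size=5000, embed_dim=100, hidden_dim=64, num_classes=10):
        super().__init__()
        # embed_dim=100 matches the stated Word2Vec dimension. In a real setup the
        # embedding weights could be initialised from the trained Word2Vec vectors;
        # here they are random (assumption for illustration).
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)   # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(embedded)      # h_n: (2, batch, hidden_dim)
        # Concatenate the final hidden states of the forward and backward directions.
        features = torch.cat([h_n[-2], h_n[-1]], dim=1)
        return self.fc(features)               # (batch, num_classes)


def logits_to_ratings(logits):
    """Map class indices 0..9 to ratings 1..10."""
    return logits.argmax(dim=1) + 1


torch.manual_seed(0)
model = BiLSTMRatingClassifier()
batch = torch.randint(1, 5000, (4, 20))  # 4 fake reviews, 20 token ids each
logits = model(batch)
ratings = logits_to_ratings(logits)
print(logits.shape, ratings)
```

Treating the task as 10-way classification is one possible reading of "output rating from 1 to 10". A regression head would be another option.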
