NFTSentimentAnalysis

Category: NFT
Development tool: Jupyter Notebook
File size: 11030 KB
Downloads: 0
Upload date: 2022-06-23 01:41:37
Uploader: sh-1993
Description: Apply natural language processing to understand the sentiment in the latest news articles featuring NFTs. Apply fundamental NLP techniques to better understand the other factors involved with the coin prices, such as common words and phrases and the organizations and entities mentioned in the articles.

File list:
Data (0, 2022-06-23)
Data\BAYC.csv (4048, 2022-06-23)
Data\BAYC_1000.csv (108241, 2022-06-23)
Data\BAYC_2022-06-11.csv (838134, 2022-06-23)
Data\BAYC_2022-06-12.csv (715598, 2022-06-23)
Data\BAYC_2022-06-13.csv (1181048, 2022-06-23)
Data\BAYC_2022-06-14.csv (1181077, 2022-06-23)
Data\BAYC_2022-06-15.csv (1058795, 2022-06-23)
Data\BAYC_2022-06-16.csv (785258, 2022-06-23)
Data\BAYC_2022-06-17.csv (530752, 2022-06-23)
Data\BAYC_2022-06-18.csv (343604, 2022-06-23)
Data\NFT.csv (5551, 2022-06-23)
Data\backup (0, 2022-06-23)
Data\backup\BAYC_2022-06-13.csv (562561, 2022-06-23)
Data\backup\BAYC_2022-06-14.csv (447681, 2022-06-23)
Data\backup\BAYC_2022-06-15.csv (308781, 2022-06-23)
Data\backup\BAYC_2022-06-16.csv (193201, 2022-06-23)
Data\backup\BAYC_2022-06-17.csv (144480, 2022-06-23)
Data\boredapeyachtclub_06152022_06162022.csv (44069, 2022-06-23)
Data\boredapeyachtclub_06162022_06182022.csv (36537, 2022-06-23)
Images (0, 2022-06-23)
Images\BAYC_Sentiment_change.png (29892, 2022-06-23)
Images\BAYC_pics.png (128257, 2022-06-23)
Images\BAYC_price_change.png (52304, 2022-06-23)
Images\Correlation_Price_Sentiment.png (25763, 2022-06-23)
Images\HeatMap.png (23182, 2022-06-23)
Images\RNN_LTSM_ClassificationReport.png (28801, 2022-06-23)
Images\ROC_Curve.png (39612, 2022-06-23)
Images\WordCloud.png (433075, 2022-06-23)
NFTSentimentAnalysis.ipynb (602992, 2022-06-23)
NFT_Sentiment_Analysis_SlideDeck.pptx (7143348, 2022-06-23)

# NFT Sentiment Analysis

![image](https://user-images.githubusercontent.com/99493522/175188661-c31a3581-285c-4bb3-99b8-f3472c0b6ec7.png)

## Background

This project analyzes sentiment for a particular NFT collection and examines its correlation with price action. We attempted to train a machine learning model (RNN LSTM) on Twitter data labeled with the VADER Sentiment Intensity Analyzer. However, due to the nature of the tweet data, and most probably a lack of data cleanup, the model did not identify true positives correctly on the test data. The project generated real-time sentiment analysis from tweets to examine the correlation with the closing price of the Bored Ape Yacht Club NFT collection over a one-week period.

**The following approach was used to generate sentiment analysis for an NFT collection and check its correlation with price action.**

- Get one week of tweets from Twitter using the Tweepy API, based on a query for an NFT collection, i.e., Bored Ape Yacht Club (BAYC)
- Sanitize the data to remove emojis, special characters, mentions, hyperlinks, etc.
- Use the VADER Sentiment Intensity Analyzer to get the sentiment of each tweet
- Classify the sentiment as positive when the compound score is > 0.1, and as non-positive otherwise
- Use this classified data to train the RNN LSTM model to classify future tweets
- The model did not do well, probably because of the type of data (short tweets with spelling mistakes and repeated text), so it was not used further for analysis
- The price of the collection over the same week was then fetched from OpenSea using the API provided by OpenSea
- The sentiment and price for the one-week period were then plotted to identify any correlation between the two

## Installations

**This project uses the tweepy library to get the data from Twitter**

`pip install tweepy`

## Usage Examples

***BAYC (Bored Ape Yacht Club) NFT Sample***
![BAYC.png](Images/BAYC_pics.png)

***Word Cloud to indicate most common words in tweets related to BAYC***
![WordCloud.png](Images/WordCloud.png)

***Classification Report of RNN LSTM model***
![RNN_LTSM_ClassificationReport.png](Images/RNN_LTSM_ClassificationReport.png)

***The ROC curve shows the trade-off between sensitivity (or TPR) and specificity (1 - FPR)***
![ROC_Curve.png](Images/ROC_Curve.png)

***Sentiment change for BAYC over 1 week period***
![BAYC_Sentiment_change.png](Images/BAYC_Sentiment_change.png)

***Price change for BAYC over 1 week period***
![BAYC_price_change.png](Images/BAYC_price_change.png)

***Correlation between Sentiment and Price of BAYC over 1 week period***
![Correlation_Price_Sentiment.png](Images/Correlation_Price_Sentiment.png)

***HeatMap to indicate correlation between Sentiment and Price for a 1 week period***
![HeatMap.png](Images/HeatMap.png)

## Findings

* The correlation we found was not clear. The hypothesis was that sentiment would be positive as prices went up, and vice versa.
* The data was insufficient to determine any correlation.
* There are many different ways to approach this task, but testing and validating the results to see what works and what doesn't takes time, and requires more than two weeks.
* We completed the full process: getting the API keys, reviewing all the options available, pulling the data, running the model, and understanding the pros and cons of different types of tweets.
* Using hashtags (#NFT, #BAYC) can be one of the purest ways to gather people's sentiment, but it takes significant effort to clean up the data. Specialized publications with more polished language could have been used instead, similar to what was used in class, but they could give a distorted view of real-time client sentiment, which in the end is what we were trying to understand.
* This does not mean that the RNN LSTM model was not good, but it was probably not the best approach for this project considering the time and resources available. With further exploration, the model could be a good choice.

## Recommendations

**To continue improving this tool, we could explore other methods, for example:**

* Use n-grams to read sentiment from sequences of words.
* Clean up and classify tweets manually. The challenge is that it could be very time consuming to go through thousands of tweets, and we would need to make sure our personal biases do not affect the results of the classification exercise.
* Take NFT pricing to another level and use machine learning to try to predict future valuations of NFT collections, in combination with sentiment analysis, to reach a more reliable future valuation. Machine learning models could be particularly helpful here, considering that many NFT collections don't have much price history.
* Incorporate additional data sources to make the analysis more accurate and complete, for example Discord and articles, or a curated list of hashtags, NFT publications, and key influencers, to try to present a balanced view of NFT sentiment.
* Use a longer period of time: months or years.
* Once a solid, more accurate model is created, replicate it for other collections beyond BAYC (e.g., Doodles, Mutant Ape Yacht Club, CryptoPunks).

## Contributors

* Chantal Garnett
* Sameer Lakhe
* Emiliano Mendez
* Marcus Policicchio
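
The clean-up, thresholding, and correlation steps listed under Background can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the notebook's actual code: the function names (`sanitize_tweet`, `label_sentiment`, `pearson`) and the regular expressions are hypothetical, and the compound score is passed in as a plain float so the sketch runs without the VADER package (in the project it would presumably come from VADER's `polarity_scores(text)["compound"]`). Only the 0.1 cut-off is taken from the README.

```python
import re
from math import sqrt

# Assumed patterns for the noise the README lists: hyperlinks, mentions,
# emojis and other special characters. The notebook may use different rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_TEXT_RE = re.compile(r"[^A-Za-z0-9\s.,!?']")  # also drops emojis and '#'
SPACE_RE = re.compile(r"\s+")

POSITIVE_THRESHOLD = 0.1  # compound-score cut-off stated in the README


def sanitize_tweet(text: str) -> str:
    """Strip links, mentions and special characters, then collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_TEXT_RE.sub(" ", text)
    return SPACE_RE.sub(" ", text).strip()


def label_sentiment(compound: float) -> str:
    """Label a tweet from its compound score: positive only above the cut-off."""
    return "positive" if compound > POSITIVE_THRESHOLD else "non-positive"


def pearson(xs, ys) -> float:
    """Pearson correlation between two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    raw = "RT @user: Bored Ape #BAYC to the moon 🚀 https://t.co/abc123"
    print(sanitize_tweet(raw))
    print(label_sentiment(0.42))

    # Made-up illustrative values, NOT project data: daily mean compound
    # score and daily closing price for a hypothetical week.
    daily_sentiment = [0.12, 0.08, 0.15, 0.05, 0.10, 0.09, 0.14]
    daily_price = [95.0, 93.5, 96.2, 90.1, 92.0, 91.4, 94.8]
    print(round(pearson(daily_sentiment, daily_price), 3))
```

With only seven daily points, as in the one-week window used here, a coefficient computed this way is very sensitive to single days, which is consistent with the finding that the data was insufficient to determine any correlation.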
