Natural_Language_Processing

Category: Ethereum
Development tool: Jupyter Notebook
File size: 1842KB
Downloads: 0
Upload date: 2022-07-27 04:24:47
Uploader: sh-1993

File list:
Coding Notebooks (0, 2022-07-27)
Coding Notebooks\[1]Crypto_Sentiment.ipynb (1495536, 2022-07-27)
Supplemental (0, 2022-07-27)
Supplemental\BTC_Results_NLP.png (94941, 2022-07-27)
Supplemental\BTC_word_cloud.png (417053, 2022-07-27)
Supplemental\ETH_Results_NLP.png (94082, 2022-07-27)
Supplemental\ETH_Word_Cloud.png (369246, 2022-07-27)
Supplemental\NLP_image.jpeg (75827, 2022-07-27)

A Natural Language Processing Analysis of Bitcoin and Ethereum

An NLP Investigation into the Latest News Articles

Created by Cam Gould for the University of Toronto Fintech BootCamp

### Background Information

There's been a lot of hype in the news lately about cryptocurrency, making it quite challenging to tell where public opinion lies. That is why I set out to take stock, so to speak, of the latest news headlines and content regarding **Bitcoin** and **Ethereum** to get a better feel for the current public sentiment around each coin.

In this assignment, I will apply ***natural language processing*** to understand the sentiment in the latest news articles featuring *Bitcoin* and *Ethereum*. I will also apply *fundamental NLP techniques* to better understand the other factors involved with the coin prices, such as common words, phrases, organizations, and entities mentioned in the articles.

### Project Files

Use the following links to jump right into the analysis notebook or view results:

- This notebook contains the [Natural Language Processing Techniques & Results](https://github.com/CamGould/Natural_Language_Processing/blob/main/Coding%20Notebooks/%5B1%5DCrypto_Sentiment.ipynb)
- This image shows the [Numerical Sentiment Results for Bitcoin](https://github.com/CamGould/Natural_Language_Processing/blob/main/Supplemental/BTC_Results_NLP.png?raw=true)
- This image shows the [Numerical Sentiment Results for Ethereum](https://github.com/CamGould/Natural_Language_Processing/blob/main/Supplemental/ETH_Results_NLP.png?raw=true)
### Project Outline and Instructions

#### Here is the structure of the [NLP Python Notebook](https://github.com/CamGould/Natural_Language_Processing/blob/main/Coding%20Notebooks/%5B1%5DCrypto_Sentiment.ipynb):

1. Sentiment Analysis
    1. Here I use the [newsapi](https://newsapi.org/) to pull the latest news articles for Bitcoin and Ethereum and create a DataFrame of sentiment scores for each coin.
    2. I use this data to derive descriptive statistics to answer the following questions:
        1. Which coin had the *highest mean positive score*?
        2. Which coin had the *highest negative score*?
        3. Which coin had the *highest positive score*?
2. Natural Language Processing
    1. In this section, I use *NLTK* and *Python* to tokenize text, find n-gram counts, and create word clouds for [Bitcoin](https://github.com/CamGould/Natural_Language_Processing/blob/main/Supplemental/BTC_word_cloud.png?raw=true) & [Ethereum](https://github.com/CamGould/Natural_Language_Processing/blob/main/Supplemental/ETH_Word_Cloud.png?raw=true). This involves:
        1. Lowercasing each word
        2. Removing all punctuation
        3. Removing all stop words (a set of commonly used words in any language that are considered unimportant in NLP)
3. Named Entity Recognition
    1. In this section, I build a named entity recognition model for both coins and visualize the tags using spaCy.

### Key Findings and Visuals

Here are the numerical findings for the sentiments on each coin.

Bitcoin - Descriptive Statistics:
![](https://github.com/CamGould/Natural_Language_Processing/blob/main/Supplemental/BTC_Results_NLP.png?raw=true)

Ethereum - Descriptive Statistics:
![](https://github.com/CamGould/Natural_Language_Processing/blob/main/Supplemental/ETH_Results_NLP.png?raw=true)
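The three questions from the outline (highest mean positive score, highest negative score, highest positive score) come down to comparing a few summary statistics per coin. The sketch below shows that comparison using only the standard library rather than the notebook's DataFrame. The score values, the `scores` layout, and the `summarize` and `leader` helpers are all hypothetical and are not taken from the notebook.

```python
from statistics import mean

# Hypothetical per-article sentiment scores (NOT the notebook's real data).
scores = {
    "Bitcoin": [
        {"positive": 0.10, "negative": 0.05},
        {"positive": 0.20, "negative": 0.00},
        {"positive": 0.00, "negative": 0.30},
    ],
    "Ethereum": [
        {"positive": 0.15, "negative": 0.00},
        {"positive": 0.05, "negative": 0.10},
        {"positive": 0.25, "negative": 0.02},
    ],
}


def summarize(rows):
    """Reduce one coin's article scores to the statistics the outline asks about."""
    positive = [row["positive"] for row in rows]
    negative = [row["negative"] for row in rows]
    return {
        "mean_positive": mean(positive),
        "max_positive": max(positive),
        "max_negative": max(negative),
    }


summary = {coin: summarize(rows) for coin, rows in scores.items()}


def leader(metric):
    """Return the coin with the largest value for the given summary metric."""
    return max(summary, key=lambda coin: summary[coin][metric])


for metric in ("mean_positive", "max_negative", "max_positive"):
    print(metric, "->", leader(metric))
```

With a pandas DataFrame of scores, the same figures are available from `DataFrame.describe()`, which reports the mean and max of each numeric column.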

Here are the word clouds generated from each coin's sentiment analysis.

Bitcoin - word cloud visual
![](https://github.com/CamGould/Natural_Language_Processing/blob/main/Supplemental/BTC_word_cloud.png?raw=true)

Ethereum - word cloud visual
![](https://github.com/CamGould/Natural_Language_Processing/blob/main/Supplemental/ETH_Word_Cloud.png?raw=true)
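
The word clouds are built from text that has been lowercased, stripped of punctuation, and filtered for stop words, as listed in the outline. The following is a minimal sketch of those three steps plus n-gram counting, written in plain Python so it runs without any downloads. The notebook itself uses NLTK; the tiny `STOPWORDS` set, the whitespace tokenizer, and the `tokenize` and `ngram_counts` helpers here are illustrative stand-ins, not the notebook's code.

```python
import string
from collections import Counter

# Illustrative stop-word set (assumption); NLTK ships a much larger list.
STOPWORDS = {"a", "an", "and", "as", "for", "in", "is", "of", "on", "the", "to"}


def tokenize(text):
    """Lowercase, strip punctuation, split on whitespace, drop stop words."""
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return [word for word in no_punct.split() if word not in STOPWORDS]


def ngram_counts(tokens, n=2):
    """Count every run of n consecutive tokens."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


tokens = tokenize("Bitcoin prices rise as Ethereum prices rise.")
print(tokens)
print(ngram_counts(tokens, n=2).most_common(1))
```

The resulting token list, or the word frequencies derived from it, is the kind of input a word cloud generator would take.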
