NASA_programming_languages

Category: Transportation / Aviation industry
Development tool: Jupyter Notebook
File size: 2011KB
Downloads: 0
Upload date: 2021-05-11 16:35:31
Uploader: sh-1993
Description: Natural language processing project aimed at predicting the programming language of NASA repositories

File list:
acquire.py (3768, 2021-05-12)
clean.ipynb (92537, 2021-05-12)
dirty.ipynb (1713407, 2021-05-12)
nasa_archived.csv (23781, 2021-05-12)
nasa_repo_scrape.csv (1574116, 2021-05-12)
nlp_final.ipynb (1324185, 2021-05-12)
wrangle.py (5313, 2021-05-12)

# Predicting NASA GitHub Repository Languages

![robonaut](https://www.nasa.gov/sites/default/files/styles/full_width_feature/public/images/631052main_jsc2010e188611_full.jpg)

## Project Description

For this project, I will be scraping data from NASA GitHub repository README files. The goal is to build a model that can predict the primary programming language of a repository, given the text of its README file.

## Goals

1. Predict the programming language of each repository

## Background

GitHub displays the programming languages used on the right side of each repository page, listing every language in the repo along with the percentage of the code written in each. GitHub tracks a wide range of languages, including JavaScript, Python, Java, TypeScript, C#, PHP, C++, C, Shell, and Ruby. For this project, I will predict the most used language for each repository using the README files from NASA's GitHub repositories.

## Deliverables

1. A well-documented Jupyter notebook that contains the analysis
2. One or two Google Slides suitable for a general audience that summarize the findings, including a well-labelled visualization

## Data Dictionary

## Takeaways

## How to Reproduce

1. Clone or fork this repo
2. Download all modules
3. Run the notebook
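
The idea in the Project Description (predicting a repository's main language from its README text) can be illustrated with a small, standard-library-only sketch. It uses a multinomial Naive Bayes classifier over word counts. This is an assumption made for illustration: the project's notebooks may use different features and models. The class name `NaiveBayesLanguageClassifier`, the `tokenize` helper, and the four toy READMEs are all hypothetical and are not taken from the scraped NASA data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesLanguageClassifier:
    """Hypothetical multinomial Naive Bayes over README word counts."""

    def fit(self, readmes, languages):
        self.word_counts = defaultdict(Counter)  # language -> word -> count
        self.doc_counts = Counter(languages)     # language -> number of READMEs
        self.vocabulary = set()
        for text, language in zip(readmes, languages):
            tokens = tokenize(text)
            self.word_counts[language].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, readme):
        tokens = tokenize(readme)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_language, best_score = None, -math.inf
        for language, n_docs in self.doc_counts.items():
            counts = self.word_counts[language]
            total_words = sum(counts.values())
            # Log prior: share of training READMEs labelled with this language.
            score = math.log(n_docs / total_docs)
            # Log likelihood of each token, with add-one (Laplace) smoothing
            # so unseen words do not zero out the probability.
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_language, best_score = language, score
        return best_language


# Toy training data, invented for illustration only.
train_readmes = [
    "Install with pip and import the package in a Jupyter notebook",
    "Requires numpy and pandas; run setup.py to install",
    "Install with npm and run the node server in your browser",
    "A React component library; build with webpack and npm",
]
train_languages = ["Python", "Python", "JavaScript", "JavaScript"]

model = NaiveBayesLanguageClassifier().fit(train_readmes, train_languages)
prediction = model.predict("pip install the package, then open the notebook")
print(prediction)
```

On real data, the labels would be each repository's most used language and the texts would be the cleaned README contents. A held-out test split would be needed to judge whether the model beats a baseline that always predicts the most common language.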
