ds-project-starter: A starter template for data science projects; it contains the file structure and provides some code to guide you

  • M4_571886
  • 43.9KB
  • zip
  • 2022-06-15 09:43
  • ds-project-starter-main
  • preprocessing.py
  • data
  • README.md
  • 2-Exploratory_Data_Analysis.ipynb
  • assets
  • fork_star_repo.png
  • README.md
  • plotting.py
  • 1-Extract_Data.ipynb
  • 3-Modeling_Evaluation.ipynb
  • model.py
  • .gitignore
  • README.md
# Data Science Project Template

This is a starter template for data science projects; it contains the file structure and presents some code to guide you.

*(After you have forked this repo, replace this section to describe your project and objectives. It should be short, at most 2 sentences.)*

## Introduction

GitHub allows us to manage code and collaborate with others. It is useful not only for data scientists but for anyone who programs for personal or work projects. It is a platform that works as your technical portfolio, a great medium for anyone to read your code and reports.

For your project, I encourage you to use multiple notebooks for one report if possible. You can dedicate each notebook to a stage in the data science project pipeline. This is useful when structuring a data science project, as there are usually clear stages, such as EDA and modeling. It also makes your project more digestible, as someone can run one notebook for a specific stage instead of running a single long notebook.

I'd also recommend saving the data/model at each stage as separate files: save them in a folder, then import them back in the notebook used for the next stage.

Try to keep your notebooks clean and informative rather than filled with code. If your code is long or complex, it helps to have markdown titles, subtitles, and descriptions for the bits of code. By doing this, your reader can understand why such steps are being taken and what some cells are doing. So try to keep your notebooks goal-driven and closer to storytelling: focus on the steps taken and why they are implemented.

Keep the functions and classes in scripts that can be imported into your notebook for use. Also, [don't repeat yourself](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself). This does not mean the functions and classes are not important; they are essential to a technical person assessing the quality of your code.
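
As a minimal sketch of the two suggestions above (keeping documented helpers in a script, and saving each stage's output for the next notebook to load), something like the following could live in a script such as `preprocessing.py`. The function names, the CSV format, and the `1-extracted.csv` file name are all assumptions made for illustration; they are not taken from this repo's actual scripts.

```python
import csv
import tempfile
from pathlib import Path


def drop_incomplete_rows(rows):
    """Return only the rows in which every field has a non-empty value.

    Parameters
    ----------
    rows : list of dict
        Records such as those produced by ``csv.DictReader``.

    Returns
    -------
    list of dict
        Rows without missing or blank fields, in their original order.
    """
    return [
        row for row in rows
        if all(v is not None and str(v).strip() for v in row.values())
    ]


def save_stage(rows, path):
    """Write one stage's output (a non-empty list of dicts) to a CSV file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def load_stage(path):
    """Read a previous stage's CSV output back as a list of dicts."""
    with Path(path).open(newline="") as f:
        return list(csv.DictReader(f))


# Example hand-off between two stages (hypothetical data and file name).
raw = [{"id": "1", "price": "9.5"}, {"id": "2", "price": ""}]
clean = drop_incomplete_rows(raw)

with tempfile.TemporaryDirectory() as tmp:
    stage_file = Path(tmp) / "data" / "1-extracted.csv"
    save_stage(clean, stage_file)      # end of the extraction notebook
    reloaded = load_stage(stage_file)  # start of the EDA notebook

print(reloaded)
```
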
So make sure your Python scripts are well written and have detailed documentation.

As it is impossible to create a single template that meets the needs of every project, treat this example as a starting point and change it as your project develops. Feel free to add more files, more functions, and more notebooks.

*(This is where you simplify and summarize everything that is going on inside this project repo. This is where you explain what your intentions were for the project, and you can include the context for why this project is being done. You can detail the structure of your files, the steps taken, and add some visualizations. Provide your readers with some background, as context is highly beneficial to anyone reading. Make sure the descriptions don't get too technical, as the README is the perfect place for non-technical people to see your project.)*

## Getting Started

- [Create a GitHub account](https://github.com/join) or [sign in to GitHub](https://github.com/login).
- [Star this repo](https://docs.github.com/en/github/getting-started-with-github/saving-repositories-with-stars) to make it easy to find this repository again later. You can see all the repositories and topics you have starred by going to [your stars page](https://github.com/stars).
- [Fork this repo](https://docs.github.com/en/github/getting-started-with-github/fork-a-repo) to make a copy of it, and use that copy as a starting point for your own project.

![fork and star](/assets/fork_star_repo.png)

*(Here you want to describe how to run your code and what packages to install.)*

## Results

*(For those readers who do not want to run your code or look at your notebooks, you may want to show them the results and findings here. If it interests them, they might find out more from your notebooks.
Feel free to show images or even a web demo too.)*

## Contributors

- [Jingles](https://github.com/jinglescode)

*(Here you want to add other major contributors, such as your teammates.)*

## Contribute

After you have finished this DS project, if you would like to help future students by "upgrading" this repo so they have a better starter kit, feel free to open a [pull request](https://docs.github.com/articles/using-pull-requests).

*(This part is optional; here you can tell others how they could improve your repo, for example if they built a new model or found some mistakes in the code.)*