genetic-data-analysis-rcc-1

Category: Data mining / data warehousing
Development tool: matlab
File size: 34KB
Upload date: 2016-11-17 16:44:07
Uploader: sh-1993

Description: Research Computing Center introduction to analysis and visualization of genetic data, Part 1

File list:

LICENSE (1108, 2016-11-18)
code (0, 2016-11-18)
code\demo.ggplot.R (735, 2016-11-18)
code\geno_pca.m (2183, 2016-11-18)
code\misc.R (254, 2016-11-18)
code\plotadmix.R (1375, 2016-11-18)
code\plotpca.R (3011, 2016-11-18)
code\read.data.R (2176, 2016-11-18)
code\svdk.m (4128, 2016-11-18)
code\traw2mat.m (1775, 2016-11-18)
code\write_mean_genotypes.m (376, 2016-11-18)
code\write_pc_matrix.m (529, 2016-11-18)
code\write_rot_matrix.m (527, 2016-11-18)
conduct.md (1461, 2016-11-18)
data (0, 2016-11-18)
data\omni_samples.20141118.panel (32474, 2016-11-18)
episodes (0, 2016-11-18)
episodes\01-setup.md (3510, 2016-11-18)
episodes\02-pca.md (9854, 2016-11-18)
episodes\03-pca-project.md (5566, 2016-11-18)
episodes\04-admixture.md (5641, 2016-11-18)
extras (0, 2016-11-18)
extras\1kg.md (4325, 2016-11-18)
results (0, 2016-11-18)

# Analysis of Genetic Data, Part 1

**Research Computing Center, University of Chicago**
November 2, 2016
2:00 pm - 4:00 pm
**Instructor:** Peter Carbonetto
**Helper:** Will Graybeal

Register [here](http://training.uchicago.edu/course_detail.cfm?course_id=1714).

## General Information

In this 2-hour workshop, participants will apply simple approaches to investigate and visualize large-scale genetic data sets, with an emphasis on practical skills that can be applied to genetics research. This is intended to be an informal, hands-on workshop, and no background in genetics is required; anyone with intermediate computing skills (see "Prerequisites") who is curious about human genetics and the "genomics revolution" is encouraged to register. Over the course of the 2 hours, participants will generate insights directly from "raw" genetic data, and they can continue to explore the data independently using the techniques introduced in class.

**Level:** Intermediate

**Prerequisites:** This workshop assumes some experience performing simple tasks in a UNIX-like shell environment, as well as basic familiarity with R. Participants must be able to log in to the RCC compute cluster, although experience using the RCC cluster is not required.

*All participants must bring a laptop with a Mac, Linux, or Windows operating system on which they have administrative privileges.*

**Where:** Kathleen A. Zar Room, John Crerar Library, University of Chicago ([OpenStreetMap](https://www.openstreetmap.org/search?query=john%20crerar%20library#map=18/41.79053/-87.60282)).

**Additional info:** This workshop is an attempt to apply elements of the [Software Carpentry approach](http://software-carpentry.org/lessons) (see also [this article](http://dx.doi.org/10.12688/f1000research.3-62.v2)) to interactive instruction for computing/quantitative sciences. Some of the materials contained within are adapted from a [Stanford workshop](https://github.com/Ancestry/cehg16-workshop) given in March 2016. For a more in-depth exploration of the concepts and techniques introduced, see [John Novembre's](http://jnpopgen.org) [PopGen workshop](https://github.com/NovembreLab/HGDP_PopStruct_Exercise).

Please also take a look at the [Code of Conduct](conduct.md) and the [Software License](LICENSE), which applies to all the scripts and code examples in this repository. All instructional material contained in this repository is made available under the Creative Commons Attribution license ([CC BY 4.0](https://creativecommons.org/licenses/by/4.0)).

## Aims

1. Explore the application of numerical techniques for investigating genetic diversity and population structure from large-scale genotype data.
2. Understand how large genetic data sets are commonly represented in computer files.
3. Use command-line tools to manipulate genetic data, and use R to summarize and visualize the results of a genetic data analysis.
4. Practice using the RCC shell environment (*midway*) for large-scale computation.

## Episodes

| Episode | Concepts |
| --- | --- |
| 1. [Setup](episodes/01-setup.md) | How do I set up my shell environment on *midway* for an analysis of genetic data? |
| 2. [Principal component analysis of genetic data](episodes/02-pca.md) | How do I encode genetic polymorphism data?<br>How do I represent genetic polymorphism data as a matrix?<br>How can I visualize the results of PCA to gain insight into the structure of genetic data? |
| 3. [Making predictions using PCA](episodes/03-pca-project.md) | How do I ensure a consistent encoding of the genotype data?<br>How do I map another genetic data set onto an existing PCA result?<br>What does this mapping tell us (and not tell us) about a sample's ancestral origins? |
| 4. [ADMIXTURE analysis of genetic data](episodes/04-admixture.md) | How do I visualize and interpret the results of running ADMIXTURE on genetic data?<br>How do I use the ADMIXTURE results to make predictions for new samples? |

## Extras

[Preparation of the 1000 Genomes genotype data](extras/1kg.md)
