# Welcome to STAT 555!

Welcome to STAT 555, The Statistical Analysis of Genomics Data. The emphasis in this course will be on understanding statistical testing and estimation in the context of "omics" data so that you can appropriately design and analyze a high-throughput study. Because the measurement technologies are evolving rapidly, important objectives of the course are for you to gain a basic understanding of statistical principles and familiarity with flexible software tools, so that you can continue to assess and use new statistical methodology as it is developed for new types of data.

By the end of the course, you should be able to tailor the analysis of your data to your needs while maintaining statistical validity. You should also come away with enough insight to assess the validity of new statistical methodologies as they are introduced and to understand appropriate statistical analyses for data types not discussed in the class.

The emphasis throughout will be on the discovery of reproducible effects. Typically we have been taught to think of reproducibility in terms of repeating an experiment and obtaining a similar result. With the complexity of "omics" data, we also need to think about the reproducibility of the data capture and of the statistical analysis. Reproducibility will be a theme throughout the course.

Students and visitors to this course come from many backgrounds. Accordingly, we will start with introductory material in genomics and statistics. For the first few weeks, there will also be a parallel set of lectures and exercises to introduce you to our main software package: RStudio. This means that at various points, especially in the first four weeks, each of you will find a mix of material that you already know and material that is new. I hope that everyone will feel free both to ask “stupid” questions and to help correct or enhance the material that I introduce.