Introduces basic principles of probability and distribution theory and statistical inference. Topics include axioms of probability theory, independence, conditional probability, random variables, discrete and continuous distributions, functions of random variables, moment generating functions, central limit theorem, point and interval estimation, maximum likelihood methods, tests of significance, and the Neyman-Pearson theory of testing hypotheses.
May not be used as credit for MA students in
Regression analysis and introduction to linear models. Topics:
Multiple regression, analysis of covariance, least squares means,
logistic regression, and non-linear regression. This course
includes a one-hour computer lab and emphasizes hands-on
applications to datasets from the health sciences.
Advanced presentation of statistical methods for comparing populations and estimating and testing associations between variables. Topics: Point estimation; confidence intervals; hypothesis testing; ANOVA models for 1-, 2-, and k-way classifications; multiple comparisons; chi-square test of homogeneity; Fisher's exact test; McNemar's test; measures of association, including odds ratios and relative risks; Mantel-Haenszel tests of association; standardized rates; repeated measures ANOVA; and simple regression and correlation. This course includes a one-hour computing lab and emphasizes hands-on applications to datasets from the health-related sciences.
Statistical tools for analyzing experiments involving genomic data. Topics: Basic genetics and statistics; linkage analysis and map construction using genetic markers; association studies; quantitative trait loci (QTL) analysis with ANOVA, variance components analysis, and marker regression (including multiple and partial regression); QTL mapping with interval mapping and composite interval mapping; the LOD test; and supervised and unsupervised methods for gene expression microarray data across multiple conditions.
This course provides the background in special topics in
mathematics required to succeed in the biostatistics graduate
programs and is required for students who have not had an advanced
calculus and/or matrix algebra course. The basic mathematical
concepts relevant to statistical studies will be discussed. Topics:
convergence of sequences of sets, numbers, and functions;
convergence of series; uniform convergence; power series;
term-by-term integration and differentiation; matrix algebra; and other
topics as time permits.
Introduces alternate methods for designing and analyzing
comparative studies that may be used when some or all of the
assumptions underlying the usual parametric method are
questionable. Topics: 1-, 2-, and k-sample location problems,
randomized block and repeated measures designs, the independence
problem, rank transformation tests, randomization tests, the
2-sample dispersion problem, and other topics as time
This course provides students with useful methods for analyzing
categorical data. Topics: Cross-classification tables, tests for
independence, log-linear models, Poisson regression, ordinal
logistic regression, and multinomial regression for the logistic
It can be said that there are no new problems in statistics, only new
applications. Since the completion of the Human Genome Project,
there has been a burgeoning field of new applications for statistics
involving high-throughput experiments designed to gather large
amounts of information on biological systems. This course
discusses the wide array of approaches and technologies
used to gather this information and the statistical issues
involved, from initial data processing steps to end-stage research
objectives. Specifically, time permitting, the technologies we will
examine include two-dimensional protein gel electrophoresis,
protein mass spectrometry, and several flavors of microarray
experiments. We will use the text "Bioinformatics and Computational
Biology Solutions Using R and Bioconductor". Much of the work for
the course will involve analyzing data sets from class and from the
text using the R language.
Introduction to fundamental principles and planning
techniques for designing and analyzing statistical
experiments. Recommended for students in applied fields. Topics:
Justification for randomized controlled clinical trials, methods of
randomization, blinding and placebos, ethical issues, parallel
groups design, crossover trials, inclusion of covariates,
determining sample size, sequential designs, interim analyses, and
repeated measures studies.
Introduction to theory and practice of sample surveys
involving collection of statistical data from planned
Introduces factorial experiments, fractional factorial
experiments, confounding, lattice designs, various incomplete block
designs, efficiency of experimentation, and problems of design
Deals with statistical methods for estimation and testing
hypotheses when samples are observed and analyzed
This course presents the topic of data mining from a statistical perspective, with attention directed towards both applied and theoretical considerations. An emphasis will be placed on supervised learning methods. Topics include: linear and logistic regression, discriminant analysis, shrinkage methods, subset selection, dimension reduction techniques, classification and regression trees, ensemble methods, neural networks, and random forests. Model selection and estimation of generalization error will be emphasized. Considerations and issues that arise with high-dimensional (N<<p) applications will be highlighted. Applications will be presented in R to illustrate methods and concepts.
This course presents the topic of data mining from a statistical perspective, with attention directed towards both applied and theoretical considerations. An emphasis will be placed on unsupervised learning methods, especially those designed to discover and model patterns in data. Applications to high-dimensional data (N<<p) and big data (N>>p) will be highlighted. Topics include: market basket analysis, hierarchical and center-based clustering, self-organizing maps, factor analysis, computer vision, eigenfaces, data visualization, and graphical models. Applications will be presented in MATLAB and R to illustrate methods and concepts.
For graduate students who have had an introduction to
probability theory and advanced calculus. Concepts,
properties, basic theory, and applications of stochastic
Introduction to methods for analyzing longitudinal and time
series data. Topics: Random coefficient regression models, growth
curve analysis, hierarchical linear models, general mixed models,
autoregressive and moving average models for time series data, and
the analysis of cross-section time series data.
The Bayesian approach to statistical design and analysis can be
viewed as a philosophical approach or as a procedure-generator. The
use of Bayesian design and analysis is burgeoning. In this
introduction to Bayesian methods, we consider basic examples of
Bayesian thinking and formalism on which more complicated and
comprehensive approaches are built. These include adjusting
estimates using related information, the use of Bayes factors in
hypothesis testing, the relationship between the prior and posterior
distributions, and the key steps in a Bayesian analysis. We
consider the Bayesian approach that requires a data likelihood (the
sampling distribution) and a prior distribution. From these, the
posterior distribution can be computed and used to inform
statistical design and analysis. Applications of this technique are
It can be said that there are no new problems in statistics, only new applications. Since the completion of the Human Genome Project, there has been a burgeoning field of new applications for statistics involving high-throughput experiments designed to gather large amounts of information on biological systems. This course discusses the wide array of approaches and technologies used to gather this information and the statistical issues involved, from initial data processing steps to end-stage research objectives. Specifically, time permitting, the technologies we will examine include two-dimensional protein gel electrophoresis, protein mass spectrometry, several flavors of microarrays, and Xerogel sensor experiments.
We will use the text "Bioinformatics and Computational Biology
Solutions Using R and Bioconductor". Much of the work for the
course will involve analyzing data sets from class and from the
text using the R language.
An advanced course on the use of life tables and the
analysis of failure time data. Topics: Use of Kaplan-Meier survival
curves, use of the log-rank test, Cox proportional hazards model,
evaluating the proportionality assumption, dealing with
non-proportionality, stratified Cox procedure, extension to
time-dependent variables, and comparison with logistic regression
Presents methods for analyzing multiple outcome variables
simultaneously, and for classification and variable reduction.
Topics: Multivariate normal distribution; simple, partial, and
multiple correlation; Hotelling's T-squared; multivariate analysis
of variance and the general linear hypothesis; discriminant
analysis; cluster analysis; principal components analysis; and
This course is intended to provide a basic introduction to principles and methods of epidemiology. The course emphasizes the conceptual aspects of epidemiologic investigation and application of these concepts in public health and related professions. Topics include an overview of the epidemiologic approach to studying disease; the natural history of disease; measures of disease occurrence, association and risk; epidemiologic study designs; disease surveillance; population screening; interpreting epidemiologic associations; causal inference using epidemiologic information; and application of these basic concepts in the context of selected major diseases and risk factors. Please note that this course cannot be used for degrees that require EEH 501 unless pre-approved by the program director, or as a prerequisite for courses that require EEH 501.
Corequisite: Students must enroll in STA 527 LEC and STA 527 REC in the same term.
This course is designed for students concerned with medical data. The material covered includes: the design of clinical trials and epidemiological studies; data collection; summarizing and presenting data; probability; standard error; confidence intervals and significance tests; techniques of data analysis including multifactorial methods and the choice of statistical methods; problems of medical measurement and diagnosis; and vital statistics and calculation of sample size. The design and analysis of medical research studies will be illustrated. MINITAB is used to perform some data analysis. Descriptive statistics, probability distributions, estimation, hypothesis tests, categorical data, regression models, analysis of variance, nonparametric methods, and others will be discussed as time permits.