Ph.D. Defense/Colloquium
April 27 (Thur) 10:30 a.m.
Alavi Commons Room, 6625 Everett Tower

Development of Traditional and Rank-Based Algorithms for Linear Models with Autoregressive Errors and Multivariate Logistic Regression with Spatial Random Effects

Shaofeng Zhang
Department of Statistics
Western Michigan University

Linear models are among the most commonly used statistical methods in many disciplines. One of the model assumptions is that the error terms are independent and identically distributed. This assumption is often violated in practice, and researchers frequently encounter autoregressive errors. Perhaps the most popular technique for handling linear models with autoregressive errors is the autoregressive integrated moving average (ARIMA) model. Another common approach is generalized least squares, such as Cochrane–Orcutt estimation and Prais–Winsten estimation. However, these methods usually perform poorly when fitted to small samples. To address this problem, McKnight et al. (2000) proposed a double bootstrap method. The purpose of this study is to port their algorithm from Fortran to the R computing environment. Furthermore, this study fixes some flaws in the original method and develops a rank-based alternative that is robust to outliers. An R package has been created, and its usage is demonstrated via examples. Monte Carlo studies for different sample sizes (20, 30, 50, and 100) show that both the original and the robust algorithms have the expected properties, even for small sample sizes.

This research also includes an application of variational approximation to fitting multivariate logistic regression with spatial effects in the Bayesian framework. Variational approximation is much faster than Markov chain Monte Carlo (MCMC) without losing accuracy, which makes it an important alternative to MCMC. Spatial models, such as conditional autoregressive (CAR) models, are extremely popular for characterizing spatial dependencies when data are collected over aggregated spatial regions such as counties, census tracts, and zip codes. Modeling multiple spatially correlated health outcomes requires specification of cross-correlations. Statisticians have developed several forms of multivariate conditional autoregressive (MCAR) models for the joint modeling of multiple diseases. More specifically, this research investigates generalized multivariate logistic regression with spatial random effects modeled via MCAR. For Bayesian inference on the parameters, both variational approximation and MCMC algorithms are developed. They are then compared in terms of parameter point estimates, confidence intervals (CI), and the deviance information criterion (DIC). The simulation results demonstrate the speedup and accuracy of the variational approach in estimation and inference.

All statistics graduate students are expected to attend.


Department of Statistics
3304 Everett Tower
Western Michigan University
Kalamazoo MI 49008-5152 USA
(269) 387-1420 | (269) 387-1419 Fax