Some new article about linear mixed models [quoted 2008-05-09 13:16:14]
The heterogeneous linear mixed model
Let $Y_i = (Y_{i1}, \ldots, Y_{in_i})$ be the response vector for the $n_i$ measurements of subject $i$, with $i = 1, \ldots, N$. The linear mixed model [1] for the response vector $Y_i$ is defined as:
$$Y_i = X_i \beta + Z_i u_i + \varepsilon_i \qquad (1)$$

$X_i$ is an $n_i \times p$ design matrix for the $p$-vector of fixed effects $\beta$, and $Z_i$ is an $n_i \times q$ design matrix associated with the $q$-vector of random effects $u_i$, which represents the subject-specific regression coefficients. The errors $\varepsilon_i$ are assumed to be normally distributed with mean zero and covariance matrix $\sigma^2 I_{n_i}$, and to be independent of the vector of random effects $u_i$.

In a homogeneous mixed model [1], $u_i$ is normally distributed with mean $\mu$ and covariance matrix $D$, i.e.

$$u_i \sim N(\mu, D) \qquad (2)$$

In the heterogeneous mixed model [24], $u_i$ is assumed to follow a mixture of $G$ multivariate Gaussians with different means $(\mu_g)_{g=1,G}$ and a common covariance matrix $D$, i.e.

$$u_i \sim \sum_{g=1}^{G} \pi_g \, N(\mu_g, D) \qquad (3)$$

Each component $g$ of the mixture has a probability $\pi_g$, and the $(\pi_g)_{g=1,G}$ satisfy the following conditions:

$$0 \le \pi_g \le 1 \quad \text{and} \quad \sum_{g=1}^{G} \pi_g = 1 \qquad (4)$$
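As an illustration of the mixture assumption (3) under the constraints (4), the following minimal sketch draws random effects by first picking a component with probability $\pi_g$ and then adding Gaussian noise with covariance $D$ around $\mu_g$. The function name `sample_random_effects` and the numerical values are hypothetical and chosen only for the example.

```python
import numpy as np

def sample_random_effects(n_subjects, pi, mu, D, rng):
    """Draw u_i from sum_g pi_g N(mu_g, D); returns (u, labels)."""
    pi = np.asarray(pi, dtype=float)
    mu = np.asarray(mu, dtype=float)  # shape (G, q), one mean per component
    # Conditions (4): probabilities in [0, 1] that sum to one.
    if np.any(pi < 0) or np.any(pi > 1) or not np.isclose(pi.sum(), 1.0):
        raise ValueError("pi must satisfy conditions (4)")
    # Latent component of each subject.
    labels = rng.choice(len(pi), size=n_subjects, p=pi)
    # Common covariance D for all components.
    noise = rng.multivariate_normal(np.zeros(mu.shape[1]), D, size=n_subjects)
    return mu[labels] + noise, labels

# Illustrative values only (not taken from the article).
rng = np.random.default_rng(0)
u, labels = sample_random_effects(
    500, [0.3, 0.7], [[0.0, 0.0], [4.0, 1.0]], np.eye(2), rng
)
```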

In this work, we propose a slightly more general formulation of the model described in (1), in which the effect of some covariates may depend on the mixture component and some of the random effects may have a common mean whatever the mixture component. Thus, the design matrix $X_i$ is split into $X_{1i}$, associated with the vector $\beta$ of fixed effects that are common to all the components, and $X_{2i}$, associated with the vectors $\gamma_g$ of fixed effects that are specific to the components. The design matrix $Z_i$ is also split into $Z_{1i}$, associated with the vector $v_i$ of random effects following a single Gaussian distribution, and $Z_{2i}$, associated with the vector $u_i$ of random effects following a mixture of Gaussian distributions. The model is then written as:

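$$Y_i = X_{1i}\beta + X_{2i}\gamma_g + Z_{1i}v_i + Z_{2i}u_i + \varepsilon_i \qquad (5)$$

where $v_i \sim N(0, D_v)$ and $u_i \sim \sum_{g=1}^{G} \pi_g \, N(\mu_g, D_u)$; given the component $g$, the conditional distribution of the vector $(v_i', u_i')'$ is $N\big((0', \mu_g')', D\big)$ with $D = \begin{pmatrix} D_v & D_{vu} \\ D_{vu}' & D_u \end{pmatrix}$.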

 

2.2 Likelihood
Following previous works [3,4], we define $w_{ig}$ as the unobserved variable indicating whether subject $i$ belongs to component $g$. We have $P(w_{ig} = 1) = \pi_g$. The density of the vector $y_i$ can then be written as:
$$f(y_i) = \sum_{g=1}^{G} \pi_g \, f(y_i \mid w_{ig} = 1) \qquad (6)$$

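Given $w_{ig} = 1$, $y_i$ follows a linear mixed model, and the density $f(y_i \mid w_{ig} = 1)$, denoted by $\phi_{ig}$, is the multivariate Gaussian density with mean $E_{ig}$ and covariance matrix $V_i$ given by:

$$E_{ig} = X_{1i}\beta + X_{2i}\gamma_g + Z_{2i}\mu_g \quad \text{and} \quad V_i = Z_i D Z_i' + \sigma^2 I_{n_i}, \quad \text{with } Z_i = (Z_{1i}, Z_{2i}) \qquad (7)$$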
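The component-specific moments in (7) can be sketched as follows, assuming they take the form $E_{ig} = X_{1i}\beta + X_{2i}\gamma_g + Z_{2i}\mu_g$ and $V_i = Z_i D Z_i' + \sigma^2 I_{n_i}$ with $Z_i = (Z_{1i}, Z_{2i})$. The function name `component_moments` and its argument layout are hypothetical.

```python
import numpy as np

def component_moments(X1, X2, Z1, Z2, beta, gamma_g, mu_g, D, sigma):
    """Mean E_ig and covariance V_i of y_i given component g (assumed form of (7))."""
    # Stack the two random-effect design matrices: Z_i = (Z_1i, Z_2i).
    Z = np.hstack([Z1, Z2])
    # v_i has mean zero, so only Z_2i mu_g contributes to the mean.
    E_ig = X1 @ beta + X2 @ gamma_g + Z2 @ mu_g
    # The covariance does not depend on g because D is common to all components.
    V_i = Z @ D @ Z.T + sigma**2 * np.eye(Z.shape[0])
    return E_ig, V_i
```

Since $V_i$ is the same for every component, it only needs to be computed (and factorized) once per subject.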

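Let $\theta$ be the vector of the $m$ parameters of the model. $\theta$ contains $\psi$, with $\psi = \big(\beta', (\gamma_g')_{g=1,G}, (\mu_g')_{g=1,G}, \mathrm{Vec}(D)', \sigma\big)'$, and $\pi$, the vector of the first $G-1$ component probabilities $(\pi_g)_{g=1,G-1}$. Note that $\pi_G$ is entirely determined by $\pi$, as $\pi_G = 1 - \sum_{g=1}^{G-1} \pi_g$. $\mathrm{Vec}(D)$ represents the vector of the upper triangular elements of $D$. The estimates of $\theta$ are obtained as the vector $\hat{\theta}$ that maximizes the observed log-likelihood:

$$L(Y; \theta) = \sum_{i=1}^{N} \ln\Big( \sum_{g=1}^{G} \pi_g \, \phi_{ig}(y_i) \Big) \qquad (8)$$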
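A minimal sketch of the observed log-likelihood (8), assuming it is the sum over subjects of the log of the $\pi_g$-weighted Gaussian densities $\phi_{ig}$. The helper names `mvn_logpdf` and `observed_loglik` are hypothetical. The inner sum is evaluated on the log scale (log-sum-exp) to avoid underflow when the densities are very small, which is a numerical choice of this sketch rather than something stated in the text.

```python
import numpy as np

def mvn_logpdf(y, mean, cov):
    """Log-density of a multivariate Gaussian, via a Cholesky factorization."""
    n = y.shape[0]
    L = np.linalg.cholesky(cov)          # cov = L L'
    z = np.linalg.solve(L, y - mean)     # whitened residual
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (n * np.log(2.0 * np.pi) + log_det + z @ z)

def observed_loglik(ys, means, covs, pi):
    """ys[i]: y_i; means[i][g]: E_ig; covs[i]: V_i; pi[g]: pi_g."""
    total = 0.0
    for y, E_i, V in zip(ys, means, covs):
        # ln(pi_g) + ln(phi_ig(y_i)) for each component g.
        log_terms = np.array(
            [np.log(p) + mvn_logpdf(y, E, V) for p, E in zip(pi, E_i)]
        )
        # Stable evaluation of ln(sum_g exp(log_terms[g])).
        m = log_terms.max()
        total += m + np.log(np.exp(log_terms - m).sum())
    return total
```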

 

2.3 Estimation procedure
We propose to maximize the observed log-likelihood (8) directly using a modified Marquardt optimization algorithm [9], a Newton-Raphson-like algorithm [10]. The diagonal of the Hessian at iteration $k$, $H^{(k)}$, is inflated to obtain a positive definite matrix as: $H^{*(k)} = \big(H^{*(k)}_{ij}\big)$ with $H^{*}_{ii} = H_{ii} + \lambda\big[(1 - \eta)\,|H_{ii}| + \eta \, \mathrm{tr}(H)\big]$ and $H^{*}_{ij} = H_{ij}$ if $i \ne j$. The initial values for $\lambda$ and $\eta$ are $\lambda = 0.01$ and $\eta = 0.01$. They are reduced when $H^{*}$ is positive definite and increased if not. The estimates $\theta^{(k)}$ are then updated to $\theta^{(k+1)}$ using the current modified Hessian $H^{*(k)}$ and the current gradient $g(\theta^{(k)})$ according to the formula:
$$\theta^{(k+1)} = \theta^{(k)} - \alpha \, \big(H^{*(k)}\big)^{-1} g(\theta^{(k)}) \qquad (9)$$

where, if necessary, α is modified to ensure that the log-likelihood is improved at each iteration.
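One iteration of such a Marquardt-type update might look like the sketch below. Several points are assumptions of this example and not taken from the text: the inflation rule is taken to be $H^{*}_{ii} = H_{ii} + \lambda[(1-\eta)|H_{ii}| + \eta\,\mathrm{tr}(H)]$, $H$ and $g$ are taken to be the Hessian and gradient of the negative log-likelihood (so that a positive definite $H^{*}$ gives a descent direction), and $\lambda$ is simply multiplied by 10 until $H^{*}$ is positive definite. The function names are hypothetical.

```python
import numpy as np

def inflate_hessian(H, lam, eta):
    """Inflate the diagonal of H; off-diagonal elements are left unchanged."""
    H_star = np.array(H, dtype=float)
    d = np.diag(H_star).copy()
    H_star[np.diag_indices_from(H_star)] = d + lam * (
        (1.0 - eta) * np.abs(d) + eta * np.trace(H)
    )
    return H_star

def is_positive_definite(A):
    """A symmetric matrix is positive definite iff its Cholesky factor exists."""
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

def marquardt_step(theta, grad, H, lam=0.01, eta=0.01, alpha=1.0, max_tries=50):
    """Return (theta_new, lam_used) for one update of the form (9)."""
    for _ in range(max_tries):
        H_star = inflate_hessian(H, lam, eta)
        if is_positive_definite(H_star):
            # Solve H* d = g instead of forming the inverse explicitly.
            return theta - alpha * np.linalg.solve(H_star, grad), lam
        lam *= 10.0  # assumed growth factor, chosen for the example
    raise RuntimeError("could not make the inflated Hessian positive definite")
```

In a full optimizer, `alpha` would additionally be reduced (for instance by step halving) whenever the log-likelihood does not improve, and `lam` would be decreased again after a successful step.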

To ensure that the covariance matrix $D$ is positive definite, we maximize the log-likelihood over the non-zero elements of $U$, the Cholesky factor of $D$ (i.e. $U'U = D$) [7]. Furthermore, to deal with the constraints (4) on $\pi$, we use the transformed parameters $(\gamma_g)_{g=1,G-1}$ with:

$$\gamma_g = \ln\Big(\frac{\pi_g}{\pi_G}\Big) \qquad (10)$$
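The two reparametrizations can be sketched as follows, assuming the transformation in (10) is the log-ratio $\gamma_g = \ln(\pi_g / \pi_G)$, whose inverse is $\pi_g = e^{\gamma_g} / (1 + \sum_{l=1}^{G-1} e^{\gamma_l})$, and that $U$ is upper triangular with $U'U = D$. The function names are hypothetical.

```python
import numpy as np

def probs_from_gamma(gamma):
    """Map unconstrained (gamma_g), g = 1..G-1, to (pi_g), g = 1..G."""
    # The last component is the reference, i.e. gamma_G = 0.
    z = np.append(np.asarray(gamma, dtype=float), 0.0)
    z -= z.max()  # shift for numerical stability; the ratios are unchanged
    e = np.exp(z)
    return e / e.sum()

def gamma_from_probs(pi):
    """Inverse map: gamma_g = ln(pi_g / pi_G) for g = 1..G-1."""
    pi = np.asarray(pi, dtype=float)
    return np.log(pi[:-1] / pi[-1])

def cov_from_cholesky(u_vec, q):
    """Rebuild D = U'U from the q(q+1)/2 upper triangular elements of U."""
    U = np.zeros((q, q))
    U[np.triu_indices(q)] = u_vec  # filled row by row
    return U.T @ U
```

Any real vector `gamma` gives probabilities satisfying (4), and any real `u_vec` gives a symmetric positive semi-definite `D`, so the optimizer can work on unconstrained parameters.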

Standard errors of the elements of $D$ and of $(\pi_g)_{g=1,G-1}$ are computed by the Δ-method [11], while standard errors of the other parameters are computed directly using the inverse of the observed Hessian matrix.
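As an illustration of the Δ-method for the component probabilities, the sketch below propagates a covariance matrix of the transformed parameters to the probabilities through the Jacobian of the transformation. It assumes the log-ratio parametrization $\gamma_g = \ln(\pi_g/\pi_G)$, for which $\partial \pi_g / \partial \gamma_l = \pi_g(\mathbb{1}_{g=l} - \pi_l)$. The function name `delta_method_se_pi` and its inputs are hypothetical.

```python
import numpy as np

def delta_method_se_pi(gamma_hat, cov_gamma):
    """Return (pi_hat, se_pi) from gamma_hat and its (G-1)x(G-1) covariance."""
    z = np.append(np.asarray(gamma_hat, dtype=float), 0.0)  # gamma_G = 0
    e = np.exp(z - z.max())
    pi = e / e.sum()
    G = pi.size
    # Jacobian J[g, l] = d pi_g / d gamma_l = pi_g * (1[g == l] - pi_l),
    # for g = 1..G and l = 1..G-1.
    J = np.diag(pi)[:, : G - 1] - np.outer(pi, pi[: G - 1])
    # First-order approximation: Var(pi_hat) ~ J Var(gamma_hat) J'.
    cov_pi = J @ np.asarray(cov_gamma, dtype=float) @ J.T
    return pi, np.sqrt(np.diag(cov_pi))
```

The same pattern applies to the elements of $D$, using the Jacobian of the map from the elements of $U$ to those of $U'U$.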

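Convergence is reached when the following three criteria are satisfied: $\sum_{j=1}^{m} \big(\theta_j^{(k)} - \theta_j^{(k-1)}\big)^2 \le \varepsilon_a$, $|L^{(k)} - L^{(k-1)}| \le \varepsilon_b$ and $g(\theta^{(k)})' \big(H^{(k)}\big)^{-1} g(\theta^{(k)}) \le \varepsilon_d$. The default values are $\varepsilon_a = 10^{-5}$, $\varepsilon_b = 10^{-5}$ and $\varepsilon_d = 10^{-8}$.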
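The three stopping criteria can be checked together as in the following sketch. It assumes the first criterion is the sum of squared parameter changes between two iterations; the function name `has_converged` and its argument names are hypothetical.

```python
import numpy as np

def has_converged(theta_new, theta_old, loglik_new, loglik_old, grad, H,
                  eps_a=1e-5, eps_b=1e-5, eps_d=1e-8):
    """True only if all three criteria hold at the current iteration."""
    # (a) stability of the parameters (assumed form: squared change).
    crit_a = np.sum((theta_new - theta_old) ** 2)
    # (b) stability of the log-likelihood.
    crit_b = abs(loglik_new - loglik_old)
    # (d) gradient scaled by the inverse Hessian: g' H^{-1} g.
    crit_d = grad @ np.linalg.solve(H, grad)
    return bool(crit_a <= eps_a and crit_b <= eps_b and crit_d <= eps_d)
```

The third criterion is small only when the gradient is close to zero relative to the local curvature, so it guards against stopping on a flat but non-stationary region where the first two criteria may already be met.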

As the log-likelihood of a mixture model may have several maxima [8], we use a grid of initial values to find the global maximum. The multimodality of the log-likelihood in mixture models has often been discussed, and some authors have proposed different strategies to choose the set of initial values [12]. However, none of them seems to be optimal in general. In our experience, the results were mainly sensitive to the initial values of $(\pi_g)_{g=1,G-1}$ and $(\mu_g)_{g=1,G}$ and less sensitive to those of the other parameters ($\mathrm{Vec}(U)$, $\beta$ and $\sigma$), for which the estimates from the homogeneous mixed model were good initial values.

A mixture model is estimated with a fixed number of components $G$; otherwise the number of parameters in the model is unknown. To choose the right number of components, one has to estimate models with different values of $G$ and select the best model according to a test or a criterion. Some works favor a bootstrap approach to approximate the asymptotic distribution of the likelihood ratio test between models with different numbers of components [13], but this approach is computationally very demanding, in particular for mixture models with random effects. Criteria such as Akaike's Information Criterion (AIC) [14] or the Bayesian Information Criterion (BIC) [15] are often preferred. We use these selection criteria to select the optimal number of components.
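Selecting $G$ by information criteria can be sketched as below, using the usual definitions $\mathrm{AIC} = -2L + 2m$ and $\mathrm{BIC} = -2L + m \ln N$, where $L$ is the maximized log-likelihood and $m$ the number of parameters. Taking $N$ to be the number of subjects is an assumption of this sketch, and the log-likelihood values in the usage example are made up purely for illustration.

```python
import math

def aic(loglik, n_params):
    """Akaike's Information Criterion (smaller is better)."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_subjects):
    """Bayesian Information Criterion (smaller is better)."""
    return -2.0 * loglik + n_params * math.log(n_subjects)

def select_n_components(fits, n_subjects):
    """fits maps G -> (maximized log-likelihood, number of parameters)."""
    return min(fits, key=lambda G: bic(fits[G][0], fits[G][1], n_subjects))

# Made-up values for illustration only (not results from the article).
fits = {1: (-1050.0, 5), 2: (-1000.0, 8), 3: (-999.0, 11)}
best_G = select_n_components(fits, n_subjects=200)
```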

 

2.4 A posteriori classification
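After parameter estimation, mixture models make it possible to classify subjects according to the $G$ components. The classification is based on the posterior probabilities $(\pi_{ig})_{g=1,G}$ that subject $i$ belongs to each of the $G$ components. Using $\hat{\theta} = (\hat{\psi}', \hat{\pi}')'$, these probabilities are obtained by Bayes' theorem [24] as:
$$\pi_{ig} = P(w_{ig} = 1 \mid y_i, \hat{\theta}) = \frac{\hat{\pi}_g \, \phi_{ig}(y_i; \hat{\psi})}{\sum_{l=1}^{G} \hat{\pi}_l \, \phi_{il}(y_i; \hat{\psi})} \qquad (11)$$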

We then assign each subject $i$ to the component for which the posterior probability $\pi_{ig}$ is highest.
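The posterior classification can be sketched as follows, assuming the posterior probability of component $g$ for subject $i$ is $\pi_g \phi_{ig}(y_i)$ divided by the sum of these products over all components. The function names and the input layout (an $N \times G$ array of log-densities) are hypothetical.

```python
import numpy as np

def posterior_probs(log_phi, pi):
    """log_phi[i, g] = ln phi_ig(y_i); pi[g] = estimated pi_g. Returns an (N, G) array."""
    log_num = np.log(pi) + np.asarray(log_phi, dtype=float)
    # Subtract the row maximum before exponentiating to avoid underflow.
    log_num = log_num - log_num.max(axis=1, keepdims=True)
    num = np.exp(log_num)
    return num / num.sum(axis=1, keepdims=True)

def classify(post):
    """Assign each subject to the component with the highest posterior probability."""
    return post.argmax(axis=1)
```

Each row of the returned array sums to one, and rows with no clearly dominant entry flag subjects whose classification is ambiguous.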
