Hi Everyone,
Like everybody else, I have been using the Akaike criterion for quite a while as a measure of goodness of fit in WinNonlin when comparing two models. I have always been quite cautious in deciding between two models based on the difference between their AIC values, i.e. the bigger the difference, the more confident I usually am in choosing the model with the lower AIC value. However, I have been unable to find a threshold below which you cannot distinguish the two models; strictly speaking, a difference of 1 would be enough to prefer one model over another. I understand that diagnostic plots are also useful for making the decision. BUT can we use this value as a statistical test (and on its own only) powerful enough to say that one model is definitely better than another, even though the difference is small?
What is your experience and common practice?
Thank you for your help,
Pascal
[AIC is just one criterion for choosing models; parameter uncertainty and weighted residual plots are also important. If there is a consistent problem choosing a bigger model, maybe smaller is better. That is, if most subjects produce close AIC values, a smaller model may be more appropriate; other criteria and the rationale for the models are important. If only one or two subjects have similar AIC values, maybe the bigger model is useful for all subjects - db]
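[As a concrete illustration of the kind of comparison Pascal describes, here is a minimal sketch in Python. It assumes one common least-squares form of the criterion, AIC = n*ln(SS/n) + 2p; software packages differ in additive constants, and the fits and numbers below are invented for illustration, not WinNonlin output.

```python
import math

def aic_ls(n_obs, ss_residual, n_params):
    # One common least-squares form of Akaike's criterion.
    # Only *differences* between models fitted to the same data are
    # meaningful; absolute values depend on package conventions.
    return n_obs * math.log(ss_residual / n_obs) + 2 * n_params

# Invented fits of the same 12 concentrations to two candidate models:
n = 12
aic_1cpt = aic_ls(n, ss_residual=4.8, n_params=2)  # 1-compartment fit
aic_2cpt = aic_ls(n, ss_residual=3.1, n_params=4)  # 2-compartment fit

delta = aic_1cpt - aic_2cpt  # positive => the 2-compartment model is preferred
```

Here delta comes out near 1.2, exactly the grey zone Pascal asks about: the lower-is-better rule prefers the bigger model, but a difference of a unit or so carries little conviction on its own.]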
Dear Pascal,
In addition to the answer of David Bourne attached to the message to the PharmPK group:
The model with the lower AIC fits better; that's it. Whether or not this better fit (1) results in an acceptable fit, (2) is really an improvement, and (3) what the statistical power is are other questions.
Ad (1): As stated by David Bourne, parameter uncertainty and weighted residual plots are very important here. If these do not look 'good', the fit should not be accepted.
Ad (2): It is not always meaningful, nor necessary, to look for the best-fitting model. A good model may be good enough. There is no law dictating that one should look for the most complex model; on the contrary, the simplest model that describes the data adequately is to be preferred in most cases.
Ad (3): This is a difficult topic, and it is quite difficult to find literature about it that is understandable for a non-statistician. The alternative F-test is clearer on this point, but it is probably (in my experience) more 'conservative', i.e. it accepts a more complex model less frequently. I would appreciate the opinion of experts on this topic.
Best regards,
Hans Proost
Johannes H. Proost
Dept. of Pharmacokinetics and Drug Delivery
University Centre for Pharmacy
Antonius Deusinglaan 1
9713 AV Groningen, The Netherlands
tel. 31-50 363 3292
fax 31-50 363 3247
Email: j.h.proost.aaa.farm.rug.nl
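[To make Hans Proost's F-test remark concrete: for nested least-squares fits, the extra-sum-of-squares F-test compares the improvement in fit per extra parameter to the residual variability of the bigger model. A sketch with invented sums of squares and degrees of freedom (df = n_obs - n_params):

```python
def extra_ss_f(ss_simple, df_simple, ss_complex, df_complex):
    # Extra-sum-of-squares F statistic; valid only when the simpler
    # model is a special case of the more complex one.
    numerator = (ss_simple - ss_complex) / (df_simple - df_complex)
    denominator = ss_complex / df_complex
    return numerator / denominator

# Invented nested fits: 12 observations, 2 vs 4 parameters.
F = extra_ss_f(ss_simple=4.8, df_simple=10, ss_complex=3.1, df_complex=8)
# F is about 2.19; the F(2, 8) critical value at alpha = 0.05 is about 4.46,
# so this test would keep the simpler model - an example of the F-test being
# more 'conservative' than simply picking the lower AIC.
```
]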
Dear Pascal,
The Akaike criterion makes a statement about the likelihood of a model, based on information theory (in particular, the concept of entropy of information). You cannot use the AIC like a statistical test.
This means that you should avoid the terminology "to reject a model based on the AIC", since the word "reject" belongs to hypothesis testing.
You may read K.P. Burnham and D.R. Anderson, Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, Springer, 2002.
Kind regards,
Hans
--
Dr. Hans Mielke
Bundesinstitut fur Risikobewertung
Fed. Institute for Risk Assessment
Thielallee 88-92, D - 14195 Berlin
Tel. ++49 1888 412 3969 Fax 3970
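[The Burnham and Anderson book that Hans Mielke cites frames such comparisons through Akaike weights, which convert AIC differences into relative likelihoods of the models in the candidate set rather than into accept/reject decisions. A minimal sketch; the AIC values are invented:

```python
import math

def akaike_weights(aic_values):
    # Akaike weights: exp(-delta_i / 2) normalised over the candidate
    # set, where delta_i is each model's AIC minus the smallest AIC.
    best = min(aic_values)
    rel_likelihood = [math.exp(-(a - best) / 2.0) for a in aic_values]
    total = sum(rel_likelihood)
    return [r / total for r in rel_likelihood]

# Two invented models separated by a single AIC unit:
w = akaike_weights([42.0, 43.0])
# w[0] is about 0.62: the 'better' model carries only ~62% of the
# evidential weight, which is far from a decisive result.
```
]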
I have to disagree with a recent comment that you cannot treat the AIC as a statistic. A statistic is any function of the data, and the AIC certainly meets that criterion. The strict information-theoretic approach states that the model with the smallest AIC is the best model. Period. So, then, a model with an AIC of 42.9999999 is better than a model with an AIC of 43.0? Common sense says that there is no difference between these two AIC values. No, the reason the AIC is not used as a statistic is that the sampling distribution of the AIC has not been identified. You need to do some type of bootstrap or jackknife to estimate the standard error of the AIC so that you can assess whether the change in AIC is statistically significant. This is overkill, especially when you have a lot of models to compare, so it's easier to just go with the lower-is-better rule. Burnham and Anderson would probably disagree with me.
Pete Bonate
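[Peter's bootstrap suggestion can be sketched as follows. Everything here is invented for illustration - the data, the two nested models (a constant versus a straight line, each fitted by closed-form least squares), and the least-squares AIC formula - but it shows how case resampling yields a standard error for the AIC difference:

```python
import math
import random

def ss_const(y):
    # Residual sum of squares for the mean-only model y = a.
    a = sum(y) / len(y)
    return sum((yi - a) ** 2 for yi in y)

def ss_line(t, y):
    # Residual sum of squares for the line y = a + b*t,
    # fitted by closed-form least squares.
    n = len(y)
    tbar, ybar = sum(t) / n, sum(y) / n
    b = (sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, y))
         / sum((ti - tbar) ** 2 for ti in t))
    a = ybar - b * tbar
    return sum((yi - (a + b * ti)) ** 2 for ti, yi in zip(t, y))

def aic(n, ss, p):
    # Least-squares AIC: n*ln(SS/n) + 2p (constants vary by package).
    return n * math.log(ss / n) + 2 * p

def delta_aic(t, y):
    # AIC(constant) - AIC(line); positive favours the line.
    n = len(y)
    return aic(n, ss_const(y), 1) - aic(n, ss_line(t, y), 2)

# Invented data: a noisy linear decline.
random.seed(1)
t = list(range(10))
y = [10.0 - 0.5 * ti + random.gauss(0.0, 1.0) for ti in t]

# Case-resampling bootstrap of the AIC difference.
boot = []
for _ in range(500):
    idx = [random.randrange(len(t)) for _ in t]
    if len(set(idx)) < 3:  # degenerate resample: a line fits exactly
        continue
    boot.append(delta_aic([t[i] for i in idx], [y[i] for i in idx]))

mean = sum(boot) / len(boot)
se = math.sqrt(sum((d - mean) ** 2 for d in boot) / (len(boot) - 1))
```

Resampling the (t, y) pairs, refitting both models, and recomputing the difference a few hundred times gives a spread (se) against which the observed delta_aic(t, y) can be judged - the extra work Peter describes as overkill for routine model screening.]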
Hi all,
I completely agree with Peter. A statistic is any function of the data and a model. However, it would be a computationally intensive exercise to derive the actual significance level for the AIC to be used as a statistical criterion. So, as long as you are comparing two structural models and NOT the error models, WinNonlin-derived AIC values should help in selecting one model over the other. The difference required to accept/reject a model is always going to be subjective.
However, I would like to add that traditional diagnostic plots (observed vs. model-predicted values, weighted residuals vs. time, etc.) are recommended by the experts over comparison of a single number like the AIC in WinNonlin or the log-likelihood ratio (LLR) in NONMEM. And do not exclude the physiological basis of the two competing models.
I would like to ask a very simple question. Several model selection tools/criteria are available to us, ranging from a numerical value comparing two models to more formal qualification tools: external/internal validation, posterior predictive check (PPC), etc. Has anybody encountered a case where traditional plots were less informative than any of the other tools mentioned above? Or, in other words, are there examples where the use of these tools reversed a decision made on the basis of the diagnostic plots?
Looking forward to some input.
Pravin
--
Pravin Jadhav
Graduate Student
Department of Pharmaceutics
MCV/Virginia Commonwealth University
DPE1/CDER/OCPB/Food and Drug Administration
Phone: (301) 594-5652
Fax: (301) 480-3212
Dear Peter,
You wrote:
> I have to disagree with a recent comment that you cannot treat the AIC
> as a statistic.
Probably you refer to the comment of Hans Mielke:
> the Akaike Criterion makes a statement on the likelihood of a model,
> based on information theory (in particular, the concept of entropy of
> information). You cannot use the AIC like a statistical test.
> This means that you should avoid the terminology "to reject a model
> based on the AIC", since the word "reject" belongs to hypothesis
> testing.
I fully agree with this comment of Hans Mielke. The meaning of the phrase "You cannot use the AIC like a statistical test" is explained in the next sentence. This does not imply that "you cannot treat the AIC as a statistic". As you explained, one may derive some statistical test from the AIC. I agree. But, as far as I know, this is not common practice, and you do not seem to be in favor of it either.
In short, there is a difference between a statistic and a statistical test.
Best regards,
Hans
Johannes H. Proost
Dept. of Pharmacokinetics and Drug Delivery
University Centre for Pharmacy
Antonius Deusinglaan 1
9713 AV Groningen, The Netherlands
tel. 31-50 363 3292
fax 31-50 363 3247
Email: j.h.proost.at.farm.rug.nl
Hans,
I, on the other hand, have to take exception to the idea that statistics owns the concept of rejection. It is perfectly valid to reject models on many bases, including statistical ones. One may reject a model because it conflicts with our understanding of the biology, or because predictions based on the model are nonsensical, or because the AIC has some value. Rejection of a model is not strictly a statistical issue. I suspect we can all agree that one cannot reject a statistical test/hypothesis based on the AIC, but we frequently reject a model based on the AIC. I agree with Peter: the AIC is a statistic. We cannot reject hypotheses based (solely) on the AIC only because there is no general solution for its sampling distribution. However, in a more restricted case, I wonder whether an exact solution for the sampling distribution of the AIC could be found, using MCMC for example. In any case, having a general solution for the sampling distribution is not a requirement for being a statistic.
Mark
Mark
Mark Sale M.D.
Global Director, Research Modeling and Simulation
GlaxoSmithKline
919-483-1808
Dear Mark,
I do not understand the phrase "on the other hand" in your comment. This is exactly what I wrote in my two earlier messages. Indeed, we agree.
Best regards,
Hans
Johannes H. Proost
Dept. of Pharmacokinetics and Drug Delivery
University Centre for Pharmacy
Antonius Deusinglaan 1
9713 AV Groningen, The Netherlands
tel. 31-50 363 3292
fax 31-50 363 3247
Email: j.h.proost.-a-.farm.rug.nl
Hi all,
I use the Akaike criterion for comparing models in WinNonlin, but I also take into account the precision (CV%) obtained for each parameter estimate. I consider it because when you use your own models (user models) you have to provide the program with initial estimates for each parameter, and these initial estimates influence the calculations; the lower AIC is not in all cases correlated with the lower CV% values. For this reason, I believe that the model with the smallest AIC is perhaps not always the most appropriate or the best model.
I would appreciate the opinion of experts on this topic.
Best regards,
M'Carmen Gomez
Metabolism & Pharmacokinetics Service
Research & Development Department
IPSEN-PHARMA S.A. Laboratories
Beaufour-Ipsen Group
Ctra. Laurea Miro 395
Sant Feliu de Llobregat, Barcelona, Spain
Telf.: 936858100
e-mail: m-carmen.gomez.at.ipsen.com
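[The precision measure M'Carmen refers to is the standard error of each parameter estimate expressed as a percentage of the estimate. A trivial sketch with invented estimates, showing how a lower-AIC fit can still contain an ill-determined parameter:

```python
def cv_percent(estimate, standard_error):
    # Coefficient of variation of a fitted parameter, as a percentage.
    return 100.0 * standard_error / abs(estimate)

# Invented output from a lower-AIC, more complex model:
cv_clearance = cv_percent(estimate=1.25, standard_error=0.10)      # 8%: precise
cv_v_peripheral = cv_percent(estimate=30.0, standard_error=27.0)   # 90%: poorly determined
```

A fit whose extra parameters carry CV% values like the second one gains little credibility from a marginally lower AIC.]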
Mark,
I guess this was addressed to me:
> Hans,
> I, on the other hand, have to take exception to the idea that statistics
> owns the concept of rejection. It is perfectly valid to reject models on
> many bases, including statistical. One may reject a model because it is
> in conflict with understanding of the biology, or because predictions
> based on a model are nonsensical, or because the AIC has some value.
> Rejection of a model is not strictly a statistical issue. I suspect we
> can all agree that one cannot reject a statistical test/hypothesis based
> on AIC, but we frequently reject a model based on AIC.
as a reply to my warning:
> This means that you should avoid the terminology "to reject a model
> based on the AIC", since the word "reject" belongs to hypothesis
> testing.
Of course you are perfectly right that no method can claim exclusive rights to a word, especially not to one as common as "to reject". So let me try to do a better job this time.
Without doubt, I prefer the rejection of a model based on biology or on nonsensical predictions over any formal criterion. In fact, when using a statistical test or the AIC, I do not think the method itself can _reject_ a model.
A statistical test accepts or rejects a hypothesis like "model A fits the data significantly better than model B". Based on the rejection or acceptance of this hypothesis, or based on the actual AIC values, you decide which model to prefer over the other. If you like, you may call that rejection of the other model - but be aware that one easily thinks: aha, he proved one model to be significantly better than the other. "Significant" is another word to be used cautiously, at least in a context where statistics is not too far away. Remember: you are allowed to use whatever wording you want - but others will understand it based on their respective backgrounds.
That is why I would recommend avoiding the terminology "to reject a model based on the AIC", while I see no problem with the terminology "to reject a model based on biological insight".
Hans
--
Dr. Hans Mielke
Fed. Institute for Risk Assessment
Thielallee 88-92, D - 14195 Berlin
Tel. 01888 412 3969 Fax 3970
Copyright 1995-2010 David W. A. Bourne (david@boomer.org)