Back to the Top
Statistics again. What are the views regarding:
1. Use of data below the ELOQ/LOQ for PK/PD calculations?
2. Assumptions of normality, without testing, in ANOVAs for TOX data?
3. Using the arcsine transformation of ratio and percent data prior to
analysis?
4. Using non-parametric analysis in place of parametric analysis, rather
than making assumptions of normality?
I appreciate your comments.
Back to the Top
Testing of statistical assumptions for ANOVA in (toxicity) studies:
In general, toxicity studies proceed along standardised protocols yielding much
information on historical data over the years. Since conditional testing
(evidence of statistical property A allows the application of statistical
procedure B) is always a critical issue in modelling [see also subset selection
(!)] giving rise to inflated type I/II error rates, it should be restricted to
cases where it is really necessary. Thus, historical control data should be
scrutinised and the distributional properties as well as the design features
should be used to establish standard statistical evaluation routines.
Otherwise, when testing for normality, equal variances etc. on a routine basis
one may end up with different statistical methods for one item/parameter even
within one study. In addition, as is the case with bioequivalence, no
evidence against normality does not translate into evidence of normality.
The arcsine transformation has proved useful in the analysis of categorical
litter data (reprotoxicity) and also in other areas, such as mutagenicity.
At least the one-way ANOVA and the t test are rather robust against deviations from
normality. However, the question is too general and the choice will depend on
the data at hand. Some introductory statistical textbooks might help.
Back to the Top
1) Data below the LOQ should not be used; the LOQ should be substituted for
any values less than the LOQ.
2) Although ANOVA is fairly robust, I always check for normality and
homoscedasticity (i.e. equal variance among groups) before applying
ANOVA. The other assumptions of random sampling and independence are
usually design protocol issues and somewhat controllable. If the
evidence indicates that the normality and homoscedasticity assumptions
are not met, two approaches may be taken: a) apply a different test
that does not require the rejected assumption (i.e. apply non-parametric
tests such as the Kruskal-Wallis or Friedman test rather than parametric
ANOVA), or b) transform the variable of interest so that it meets the
assumptions.
3) Because of the intrinsic characteristics of a binomial distribution,
the variance is a function of the mean in such a distribution. Hence,
for percentages or ratios, it is frequently necessary and appropriate to
apply the arcsine transformation prior to applying ANOVA. (The arcsine
transformation prevents the variance from changing as a function of
the mean.)
4) As mentioned in answer 2a above, one approach is to apply
non-parametric analysis. However, when the assumptions are met for an
ANOVA and one can choose either parametric ANOVA or non-parametric, it is
more efficient to use the ANOVA. As an example, recently I applied a
non-parametric Kruskal-Wallis to data that met all assumptions of an
ANOVA for analysis of a four group study. The Kruskal-Wallis missed a
significant difference between two groups that was apparent after I transformed
the data and applied an ANOVA.
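The arcsine transformation in answer 3 can be sketched as follows (a hypothetical illustration in Python; the post names no software):

```python
import math

def arcsine_transform(p):
    """Variance-stabilising transform for a proportion p in [0, 1].

    For binomial data the variance is a function of the mean
    (p * (1 - p) / n); arcsin(sqrt(p)) makes the variance roughly
    constant, so the equal-variance assumption of ANOVA becomes
    more plausible for percent or ratio data.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.asin(math.sqrt(p))

# e.g. transform litter survival proportions before running the ANOVA
transformed = [arcsine_transform(p) for p in (0.10, 0.50, 0.90)]
```

The transformed values, rather than the raw proportions, would then go into the ANOVA.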
Briefly, I go through the following steps when analyzing (continuous) data
for a >2 group study:
Test for normality
Test for equality of variances among groups
If both tests "pass", perform ANOVA. (Be aware that for the equal variance
test you "pass" if p>0.01 or >0.05, i.e. you accept your null hypothesis
that the variances are equal.)
If the ANOVA shows between-groups significance (p<0.05), I apply subsequent
tests (Bonferroni's t-test, LSD, Tukey's) to find which groups differ
from each other.
If the test for normality or equal variance "fails", I apply
transformations and re-test normality and equality of variances. If they
pass, I run ANOVA on the transformed variates; if they fail, I retry another
transformation. If I can find no transformation that results in equal
variances and normality, I use a nonparametric analysis.
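The ANOVA step in the workflow above can be sketched as code. The normality and equal-variance tests and the p-values are best left to a statistics package; as a minimal, dependency-free illustration, here is the one-way F statistic computed from scratch:

```python
def one_way_anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of observations.

    F = (between-group mean square) / (within-group mean square).
    A large F suggests that at least one group mean differs; the
    corresponding p-value would come from the F distribution with
    (k - 1, n - k) degrees of freedom.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2
                    for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within
```

With identical group means the statistic is zero; a clearly separated third group drives it up, which is what the between-groups test at p<0.05 would then flag.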
Sorry for the rather lengthy response I have offered.
Michael McLane, Ph.D.
Back to the Top
Reply-To: "Stephen Duffull"
From: "Stephen Duffull"
Subject: Re: PharmPK Statistics -Reply
Date: Mon, 23 Aug 1999 09:14:02 +0100
Michael McLane wrote:
>1) Data below LOQ should not be used.....LOQ should be substituted for
>any values less than LOQ.
I presume that Michael missed the lengthy discussion on this
very topic on the NONMEM users group. While I do not wish
to cover the entire discussion on this topic this was not
the conclusion of the group. Indeed about the only point
that everyone managed to agree on was that values less than
LOQ should not be set at the LOQ. For my "2 cents" worth I
would just like to comment that LOQ is an artefact - the
cut-off value is arbitrary (although seemingly agreed upon
in the literature) and has no modelling value (other than
making things more complex by providing non-random censoring
of data). I think we would all be better off, from a
modelling perspective, if all concentrations were reported
as seen and LOQ ignored.
School of Pharmacy
University of Manchester
Manchester, M13 9PL, UK
Ph +44 161 275 2355
Fax +44 161 275 2396
Date: Mon, 23 Aug 1999 16:52:08 -0700
From: Roger Jelliffe
Subject: Re: PharmPK Statistics -Reply
What are your reasons for stating, "1) Data below LOQ should not be
used.....LOQ should be substituted for any values less than LOQ"?
Further, what are you, and all the rest of us, actually trying to
accomplish with these statistical analyses such as those you describe?
It seems to me that what we are trying to accomplish is not simply to
report what the best overall estimate of a population pharmacokinetic
parameter value is, but rather to be able to take the most informed course
of action (optimal, most precise dosage regimen, for example) based on the
raw data in the population studied.
I think the first thing, before all others, is to know your assay well, and
to know the relationship between what you measure and the precision of that
measure as shown by replicate determinations. These do not have to be
standards, and are probably best obtained as regular samples. One can
nevertheless quantify the relationship between a measured level and its SD,
and can do so - all the way down to a blank. In PK studies and in TDM,
THERE ACTUALLY IS NO LOQ. The important thing is to know the Fisher
information of each assay measurement throughout its entire range. The
whole question of a LOQ arises only from a toxicological point of view,
when the assay info is the ONLY source of info as to whether the drug is
present or not. Then, clearly, you must have a LOQ, a lower detectable
limit of quantification.
What is this limit? Usually about 2 SD above the blank. Why is that? So
you can be confident that something is present if you get a result above
that. But what should this be? 2 SD? 3 SD? It is a statistical convention
that one decides to buy in on, depending on the probability that the result
is not a blank.
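The idea of an assay SD known all the way down to the blank can be sketched as follows; the polynomial form and the coefficients are hypothetical, not taken from any real assay:

```python
def assay_sd(conc, coeffs=(0.05, 0.10, 0.002)):
    """SD of the assay at a given measured concentration.

    SD(C) = a0 + a1*C + a2*C**2, fitted from replicate determinations.
    a0 > 0 is the SD of the blank, so even a zero or below-LOQ
    measurement carries a finite, usable error estimate.
    """
    a0, a1, a2 = coeffs
    return a0 + a1 * conc + a2 * conc ** 2

def fisher_weight(conc, coeffs=(0.05, 0.10, 0.002)):
    """Weight of one measurement in fitting: 1 / SD**2, the Fisher
    information of a single observation with Gaussian error."""
    return 1.0 / assay_sd(conc, coeffs) ** 2
```

A blank still has a defined (large) weight, while high concentrations, with their larger SDs, weigh less: the point being that no measurement need be censored at an LOQ.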
In TDM, though, and in PK/PD studies, we KNOW the sample is
not a blank,
because we know how long it was since the last dose, and we know that the
last molecule of a drug almost never gets excreted. The question here is
thus totally different. We are NOT asking if the drug is present or not. We
know it is there. We are now asking instead, HOW MUCH is there? - a totally
different question. One can report these below-LOQ levels as, for example,
a gentamicin level of 0.2 ug/ml, below our usual detection limit of 0.5 ug/ml.
Then both the toxicologists and the TDM people and the pop modeling people
have what they need to make their pop or patient-specific model, and the
toxicologists also have what they need if there was no other info about the
time since the last dose.
What do you do now with such below-LOQ data, and why? Do you simply
withhold such a result? If so, why? What are the reasons you will give to
support your view that you should substitute the LOQ for the ACTUAL below
LOQ result you obtain? Again, it is perfectly possible to assign accurate
measures of credibility to ANY measured concentration, all the way down to
a blank, if you do it right. This is discussed more fully in an article by
our group in Therapeutic Drug Monitoring, 15: 380-393, 1993. Take a look.
Everyone knows that if you weight the data differently you will get
different parameter estimates. What weighting scheme should you trust? I
would suggest to trust nothing except what you have determined about your
actual assay, over its entire working range. It is good if you have
measured values in the range where they are easily quantifiable. No
argument about that. What concerns me here is the idea of using something
other than the REAL result for a below LOQ value, especially when its
weight can now be determined easily and in a cost effective way. Read the
paper. Then let's talk more.
Finally, what about the parameter values that one gets with a
PK/PD model? How will these be used? It seems to me that the real use one
wishes to make of these is to develop the very best dosage regimen possible
based on the raw data of the previous population of patients studied. How
should we do this? Currently we have used the method of maximum a posteriori
probability (MAP) Bayesian adaptive control to apply our past knowledge of
the behavior of a drug (the population model, with its parameter values,
one for each parameter, as measures of the central tendency, and their
standard deviations or covariances as measures of the dispersion of these
values within the population studied), together with the measured serum
concentrations, to obtain the individualized Bayesian posterior parameter
values for that patient, and then to compute the dosage regimen which is
required to achieve the desired target goals. Exact achievement of the
target is simply assumed.
But, what is the actual objective function being minimized in the MAP
Bayesian fitting process? The denominator in the part describing the serum
concentrations is actually the variance of the measured (or the predicted)
concentrations. This is an important example of the need to start by
knowing the assay error explicitly and optimally, by determining the assay
precision (SD) over its entire working range, so you can give proper
credibility to each measurement during the fitting process.
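The objective function described above can be sketched as follows; this is a simplified illustration with hypothetical names and values, and a real MAP Bayesian fit may carry additional terms:

```python
def map_objective(params, pop_means, pop_sds,
                  observed, predicted, assay_sds):
    """MAP Bayesian objective (smaller is better).

    Sum over serum levels of (obs - pred)^2 / assay_SD^2, plus a sum
    over parameters of (param - pop_mean)^2 / pop_SD^2.  The assay
    variance in the denominator of the data term is why the assay
    error must be determined explicitly: it sets the credibility
    given to each measurement during the fitting process.
    """
    data_term = sum((o - p) ** 2 / sd ** 2
                    for o, p, sd in zip(observed, predicted, assay_sds))
    prior_term = sum((th - mu) ** 2 / s ** 2
                     for th, mu, s in zip(params, pop_means, pop_sds))
    return data_term + prior_term
```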
Yes, there ARE many other sources of error besides the assay error.
However, many people assume that the assay error is just a small part of
the overall errors produced by errors in preparation of each dose, in the
administration of the various doses, and in the recording of the times when
serum concentrations are measured. They use a lumped term for all these
sources of error combined.
In our iterative Bayesian (IT2B) population modeling
software, we prefer
to start with the KNOWN assay error, described as a polynomial describing
the relationship between the measured concentration and its SD, for correct
weighting in the modeling process by the Fisher information of each
measurement. Then, we consider the other sources of error as a separate
measure of intraindividual variability. This we call gamma, and we use it
to scale the assay error polynomial. In this way, if gamma =1, then the
assay error is the only source of variation. If gamma = 2, then half the
overall error is due to the assay. If it is 3, then 1/3rd, etc. In this way
you can see just what fraction of the overall intraindividual variability
is due to the assay error, and you can incorporate all this into the
correctly weighted population PK/PD model.
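The gamma scaling described above can be sketched as follows (the polynomial coefficients are hypothetical):

```python
def scaled_sd(conc, gamma, coeffs=(0.05, 0.10, 0.002)):
    """Total intraindividual SD: gamma times the assay-error polynomial.

    gamma == 1 means the assay is the only source of variation;
    gamma == 2 means the assay accounts for half the overall error;
    gamma == 3 means one third, and so on.
    """
    a0, a1, a2 = coeffs
    assay = a0 + a1 * conc + a2 * conc ** 2
    return gamma * assay

def assay_fraction(gamma):
    """Fraction of the overall intraindividual error due to the assay."""
    return 1.0 / gamma
```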
Now, what will be done with all this? Having used the strength of the
parametric or iterative Bayesian population modeling approach in this way,
to utilize the assay error (properly determined, not simply assumed) and
the value of gamma, we can now make a NONPARAMETRIC population model of the
behavior of the drug. This is nonparametric, NOT noncompartmental. Please
note the difference. Many still have confused these terms or used them
interchangeably. What does this do? Nonparametric population modeling
methods make no assumptions about the shape of the parameter distributions,
in contrast to parametric models. They give us parameter results not only as means, medians, modes,
and variances and correlations, but also the ENTIRE discrete probability
joint density. You get basically one set of parameter values for each
patient you have studied, plus an estimate of the probability of each such
set. You thus get N sets of parameter values (not just one) for the N
subjects studied.
What is the very best population model you could ever get? It would be the
exactly known parameter values for each subject studied. No statistical
summaries can improve on this. Alain Mallet showed this back in the
eighties. The nonparametric methods (either his NPML or Alan Schumitzky's
NPEM approach) are the optimal solution to the problem short of the new
hierarchical Bayesian approaches. What do you have now? You have N sets of
parameter values. Instead of only one prediction of future concentrations
based on the population model, you now have N such predictions. Because of
this, you can now compare each of these predictions which will result from
a candidate dosage regimen with a desired target goal at a target time. You
now have a performance criterion (the weighted squared error of the failure
to achieve the desired goal) when a candidate regimen is given to a
nonparametric population model.
The separation or Heuristic certainty equivalence principle (see
Bertsekas, D., Dynamic Programming: Deterministic and Stochastic Models.
Englewood Cliffs NJ, Prentice-Hall, 1987, pp 144-146) states that whenever
you first get single point parameter estimates for a model, and then use
these single point estimates to control the system (develop the regimen to
achieve the goals) the job is inevitably done suboptimally, as there is no
performance criterion being optimized.
Nonparametric models, with N sets of parameter values for the N subjects
studied, get around this problem. They give us a criterion to predict and
evaluate the performance of a dosage regimen, based on the entire discrete
probability distribution, not just on a single value for each parameter.
Because of this, a dosage regimen can now be specifically optimized so that
the desired target is reached with maximal precision (minimal weighted
squared error). This is the "multiple model" method of adaptive control. It
is well known in flight control systems (the F16, the Boeing 777, the
Airbus, etc.) and in spacecraft and missile guidance and control systems.
The combination of nonparametric models which give us multiple discrete
sets of parameter values and multiple model dosage design now permits
development of dosage regimens which specifically optimize the precision
with which desired targets are hit.
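Multiple model dosage design can be illustrated with a toy sketch, assuming a hypothetical one-compartment steady-state model (Css = infusion rate / CL) and invented support points; a real implementation would use the full PK model and all measured targets:

```python
def expected_sq_error(rate, target, support):
    """Expected squared error of missing the target concentration,
    averaged over the discrete support points (clearance, probability)
    of a nonparametric population model."""
    return sum(prob * (rate / cl - target) ** 2 for cl, prob in support)

def best_rate(candidates, target, support):
    """Pick the candidate infusion rate whose N predictions, one per
    support point, most precisely hit the target (minimal weighted
    squared error) - rather than dosing on a single point estimate."""
    return min(candidates,
               key=lambda r: expected_sq_error(r, target, support))

# Two equally probable clearances (5 and 10 L/h), target Css = 10
support = [(5.0, 0.5), (10.0, 0.5)]
rate = best_rate([50.0, 55.0, 60.0, 65.0, 70.0, 75.0, 80.0], 10.0, support)
```

Note that the chosen rate differs from the one a single "mean clearance" of 7.5 would give (75), which is the separation-principle point made above.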
This is discussed more fully in an article by our group in Clinical
Pharmacokinetics, 34: 57-77, 1998, and in another by Taright N, Mentre F,
Mallet A, and Jouvent R.: Nonparametric Estimation of Population
Characteristics of the Kinetics of Lithium from Observational and
Experimental Data: Individualization of Chronic Dosing Regimen using a new
Bayesian Approach, Therapeutic Drug Monitoring, 16: 258-269, 1994.
We think that the ability to develop optimal dosage regimens
is the real
reason for making models at all, not just to get parameter estimates - to
develop the best (here, the most precise) course of action based on the
raw data of the patients studied in the past.
Then, as feedback is obtained from monitoring serum concentrations, the
set of support points is re-evaluated. Those which now predict the measured
levels well become now much more probable, and those which do so poorly
become much less probable. In this way the Bayesian posterior discrete
joint probability density is obtained, and this revised density is then
used as the individualized model to develop the next dosage regimen to
achieve the goals, again with maximal precision. This is multiple model
Bayesian adaptive control of dosage regimens.
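The multiple model Bayesian update just described can be sketched as follows, assuming Gaussian measurement error with a known assay SD (all numbers hypothetical):

```python
import math

def update_support(support, predictions, measured, sd):
    """Re-weight the support points of a nonparametric model after a
    feedback measurement.

    support:     list of (params, prior_probability) pairs
    predictions: predicted level for each support point
    Returns the posterior probabilities: posterior is proportional to
    prior times the Gaussian likelihood of the measured level, so
    points that predict the measurement well become more probable and
    points that predict it poorly become less probable.
    """
    likes = [p * math.exp(-0.5 * ((measured - pred) / sd) ** 2)
             for (_, p), pred in zip(support, predictions)]
    total = sum(likes)
    return [l / total for l in likes]
```

The revised density then serves as the individualized model for designing the next regimen.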
There is more to statistics, and especially to the design of dosage
regimens, than the usual summaries achieved by the culture of the analysis
of variance and its built-in assumptions based on means and covariances.
Those approaches, based only on single point parameter estimates, are
limited by the separation principle and are suboptimal. Alain Mallet and
Alan Schumitzky really have done something good. It is also good to start
by making no assumptions about the form of the assay error, but to
determine it explicitly for each assay, then to get gamma, and then to do
nonparametric population modeling so that one can then develop multiple
model dosage regimens to hit desired targets most precisely. A clinical
version of the multiple model dosage design software is being implemented,
and should be available for clinical use hopefully within a year.
Hope this helps a bit. I look forward to hearing from you.
Roger W. Jelliffe, M.D., Professor of Medicine, USC
USC Laboratory of Applied Pharmacokinetics
CSC 134-B, 2250 Alcazar St, Los Angeles CA 90033, USA
Phone (323)442-1300, fax (323)442-1302, email jelliffe.at.hsc.usc.edu
Our web site: http://www.usc.edu/hsc/lab_apk
You might also look at our web page for announcements of new software and
upcoming workshops and events.
Back to the Top
Mike: That is in essence how it should be done, from the LOQ (we use the
ELOQ, which is the lowest standard actually run - not a computed value, no
extrapolation). But we haven't a protocol for examining data other than a
Bartlett test and ANOVA - no transforms or non-parametrics set in place.
Thanks for your comments.
Copyright 1995-2010 David W. A. Bourne (firstname.lastname@example.org)