Back to the Top
Dear All,
For average BE analysis of replicated crossover studies, FDA
recommends the following SAS code, taken from the guidance 'Statistical
Approaches to Establishing Bioequivalence':
PROC MIXED;
CLASSES SEQ SUBJ PER TRT;
MODEL Y=SEQ PER TRT/DDFM=SATTERTH;
RANDOM TRT/TYPE=FA0(2) SUB=SUBJ G;
REPEATED/GRP=TRT SUB=SUBJ;
ESTIMATE 'T vs. R' TRT 1 -1/CL ALPHA=0.1;
Why is the TYPE=FA0(2) option used in the RANDOM statement?
What is the use of the REPEATED statement?
What is the main difference between DDFM=SATTERTH and
DDFM=KENWARDROGER? Which one is better?
This code has no SUB(SEQ) effect. If I want to include a
SUB(SEQ) effect, is there any problem with using the following code?
proc mixed data=ABCD method=reml ITDETAILS maxiter0;
class sequence subject period formula;
model ln_auc= sequence period formula/ddfm=KENWARDROGER outp=out;
random subject(sequence);
lsmeans formula;
estimate 'T-R' formula -1 1/cl alpha=0.10;
run;
Is there any problem if I analyse a 2x4 replicate design using this
code?
How can I calculate the intra-subject CVs and the inter-subject CV for
this replicate design?
Regards
Matz
Back to the Top
The following message was posted to: PharmPK
Dear Matz,
You could use the Bioequivalence Wizard in WinNonlin to do this. It
follows the FDA guidance for constructing the model for average BE
analysis. You might find it easier to use as it is specialized for this
task.
Regards,
--
Jason Chittenden
Product Manager
Pharsight Corporation
Back to the Top
The following message was posted to: PharmPK
Hi Matthew,
I was so happy when I saw your question, because I was eager to see
the answer too. However, it appears that you've asked at a time when
either the biostatisticians in the group are on holiday, or perhaps
it's just a complicated question to answer. I'm not a biostatistician,
but wanting to know the answer too, I did a little research, and I can
share what I found with you with the caveat that I'm not a
biostatistician.
First off, if your question is not academic and this is for a
regulatory submission, my direct response would simply be, use what
the FDA specified, because that's what they specified. It's not much
of an answer, but then - so many things in life are like this. Why did
the FDA choose SAS, why did they choose 80-125%, why is the keyboard
I'm typing on arranged in the QWERTY configuration? QWERTY keyboards
were created to slow early typewriters down in the 1800's because the
keys would jam. Not a good reason. OK, I digress.
Now for the "why" part, or at the very least, "where". Your first
question:
Why this TYPE=FA0(2) option is used in the RANDOM statement?
This isn't a simple 2-way crossover study anymore, and we can now estimate the true
between- and within-subject variances because people take a product
more than once. FA0(2) specifies that the Factor Analytic structure be
assumed for the between subject variances for the test and reference
products. You will see in the SAS output that FA0(2) is a 2x2 matrix
that may look something like this:
Covariance Parameter Estimates
Cov Parm    Subject    Group          Estimate
FA(1,1)     Subject                   0.5928
FA(2,1)     Subject                   0.4968
FA(2,2)     Subject                   2E-17
Residual    Subject    Treatment A    0.124
Residual    Subject    Treatment B    0.242
If the following means absolutely anything to you, I found this on FA0
modeling
http://www.asu.edu/sas/sasdoc/sashtml/stat/chap41/sect20.htm
TYPE=FA(q)
specifies the factor-analytic structure with q factors (Jennrich and
Schluchter 1986). This structure is of the form AA'+D, where A is a t
x q rectangular matrix and D is a t x t diagonal matrix with t
different parameters. When q > 1, the elements of A in its upper right-
hand corner (that is, the elements in the ith row and jth column for j
> i) are set to zero to fix the rotation of the structure.
TYPE=FA0(q)
is similar to the FA(q) structure except that no diagonal matrix D is
included. When q < t, that is, when the number of factors is less than
the dimension of the matrix, this structure is nonnegative definite
but not of full rank. In this situation, you can use it for
approximating an unstructured G matrix in the RANDOM statement or for
combining with the LOCAL option in the REPEATED statement. When q = t,
you can use this structure to constrain G to be nonnegative definite
in the RANDOM statement.
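To make the AA' structure concrete, here is a minimal sketch that rebuilds the 2x2 G matrix from the three FA parameters in the example output above. It assumes, as the quoted documentation implies, that FA(1,1), FA(2,1) and FA(2,2) are the elements of the lower-triangular matrix A; the data set and variable names are made up for illustration.

```sas
/* Sketch: rebuild G = AA' from the FA0(2) parameters in the example output.
   Assumption: FA(i,j) is element (i,j) of the lower-triangular matrix A.  */
data g_from_fa0;
  a11 = 0.5928;   /* FA(1,1) */
  a21 = 0.4968;   /* FA(2,1) */
  a22 = 2E-17;    /* FA(2,2) */

  g11 = a11**2;                /* between-subject variance, first treatment level  */
  g21 = a11 * a21;             /* between-subject covariance                       */
  g22 = a21**2 + a22**2;       /* between-subject variance, second treatment level */
  rho = g21 / sqrt(g11 * g22); /* between-subject correlation                      */
run;

proc print data=g_from_fa0 noobs;
  var g11 g21 g22 rho;
run;
```

The G option on the RANDOM statement prints the estimated G matrix directly, so in practice this arithmetic is mainly a cross-check on how the FA parameters map onto it.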
Nice of Jennrich and Schluchter not to name the variance structure
after them, particularly since Schluchter is so hard to say. Moving
along, your inter- and intra-subject variabilities are calculated from
these lines. You can access the covariance matrix by adding these
lines to your model (e.g. for lnauct):
ods output Estimates=se_lnauct;   /* results of the ESTIMATE statement */
ods output CovParms=cov_lnauct;   /* covariance parameter estimates    */
ods output LSMeans=LSM_lnauct;    /* least squares means               */
The intra-subject variability is still calculated from the residuals
as in a 2-way crossover, as:
IntraCV_trtA = 100%*sqrt(exp(MSR_trtA)-1) (in this case,
100*sqrt(exp(0.124)-1) = 36.3%)
IntraCV_trtB = 100%*sqrt(exp(MSR_trtB)-1) (in this case,
100*sqrt(exp(0.242)-1) = 52.3%)
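The same arithmetic can be sketched as a small DATA step. The two residual variances are typed in from the example output above, and the data set and variable names (intra_cv, resid_var, intra_cv_pct) are made up for illustration.

```sas
/* Sketch: intra-subject CV (%) from each treatment's residual variance
   on the log scale, CV = 100*sqrt(exp(variance) - 1).                 */
data intra_cv;
  input trt $ resid_var;
  intra_cv_pct = 100 * sqrt(exp(resid_var) - 1);
  datalines;
A 0.124
B 0.242
;
run;

proc print data=intra_cv noobs;
run;
```

In a real analysis the residual variances would be read from the covariance parameter data set saved by ODS rather than typed in by hand.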
The parameter FA(2,2) relates to the subject-by-formulation interaction,
whose variance is composed of the following terms:
sig_D^2 = sig_BT^2 + sig_BR^2 - 2*rho*sig_BT*sig_BR (sorry no subscript
in plain text, the underscore here denotes subscript; rho is the
between-subject correlation of Test and Reference)
I want to say that FA(1,1) and FA(2,1) give the inter-subject
variability for Test and Reference products but I can't find any
references to back this up. This is an easy question for a
biostatistician in this area.
Your second question: What is the use of the REPEATED statement?
The repeated statement is there to indicate the within-subject
variance estimates for Test and Reference should be derived
separately. This comes out of Scott Patterson and Byron Jones' book,
"Bioequivalence and Statistics in Clinical Pharmacology", Chapman &
Hall/CRC, 2006. Take the repeated statement out of the model, you only
get one residual, and one intraCV. Too bad for you.
Your third question: What is the main difference between DDFM=SATTERTH
and DDFM=KENWARDROGER? Which one is better?
These are two different methods in dealing with degrees of freedom.
SATTERTH comes from Satterthwaite F. Synthesis of variance.
Psychometrika 1941; 6:309-316. The main assumption here is that the
variance term is estimated based on the sum of independent centrally
chi^2 distributed variates. Kenward-Roger is a newer method, and
involves creating a scale factor for a Wald statistic (aren't you
sorry you asked?). Here is an excerpt from their paper:
[Hmm. no excerpt with the message - db]
Back to the Top
The following message was posted to: PharmPK
Hi Matthew,
You can probably tell I was only half-way through my response to you
when I accidentally hit the "send" button. What might even be more
confusing is that the finished post I actually sent midnight this
morning wasn't posted, so this is post #3 of a potential 3 part
series. In any event, here it is, the new and improved version 1.1 of
my reply. [note to DB: if you haven't posted my last version, please
delete - OK, I found it in my PharmPK mailbox - I have anything where
the Subject starts with PharmPK filtered and I don't always see it to
send on the list. As a general note - it is better if posters don't
start their Subject with 'PharmPK' as I may not notice that it is a
new message for the list... - db].
In this post:
- How to calculate Inter- and Intra-subject CV for replicate designs -
finally revealed!!!
- Easier to follow, less rambling, question-and-answer type formatting
- Re-read and spell-checked for flowability
- No needless distracting references to QWERTY keyboards
Getting on with it:
First off, if your question is not academic, and this is for a
regulatory submission, my direct response would simply be, use what
the FDA specified, because that's what they specified. It's not much
of an answer, but then - so many things in life are like this. Why did
the FDA choose SAS, why did they choose 80-125%? Not a helpful answer.
Now for the "why" part, or at the very least, "where". Your first
question:
Why this TYPE=FA0(2) option is used in the RANDOM statement?
This isn't a simple 2-way crossover study anymore, and we can now estimate the true
between- and within-subject variances because people take a product
more than once. FA0(2) specifies that the Factor Analytic structure be
assumed for the between subject variances for the test and reference
products. You will see in the SAS output that the FA0(2) part may look
something like this:
Covariance Parameter Estimates
Cov Parm    Subject    Group          Estimate
FA(1,1)     Subject                   0.5928   <---- (sig_BT, the between-subject standard
                                                      deviation for the Test product)
FA(2,1)     Subject                   0.4968   <---- (rho*sig_BR; with FA(2,2) at zero this
                                                      equals sig_BR, the between-subject standard
                                                      deviation for the Reference product)
FA(2,2)     Subject                   2E-17    <---- (sig_BR*sqrt(1-rho^2), the part of the
                                                      Reference between-subject SD not shared
                                                      with Test; it feeds into sig_D)
Residual    Subject    Treatment A    0.124    <---- (sig_WT^2, the within-subject variance
                                                      for the Test product)
Residual    Subject    Treatment B    0.242    <---- (sig_WR^2, the within-subject variance
                                                      for the Reference product)
If the following means absolutely anything to you, I found this on FA0
modeling (http://www.asu.edu/sas/sasdoc/sashtml/stat/chap41/sect20.htm).
Moving along, your inter- and intra-subject variabilities are
calculated from these lines. You can access the covariance matrix by
adding these lines to your model (e.g. for lnauct):
ods output Estimates=se_lnauct;   /* results of the ESTIMATE statement */
ods output CovParms=cov_lnauct;   /* covariance parameter estimates    */
ods output LSMeans=LSM_lnauct;    /* least squares means               */
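Once the estimates are saved, the T vs. R difference and its 90% confidence limits can be back-transformed to a percent ratio. This is only a sketch: it assumes PROC MIXED has already been run with the ESTIMATE statement and the CL option so that se_lnauct exists, that the response was natural-log transformed, and that the 80-125% limits apply. The names be_ratio, ratio_pct, lower_pct, upper_pct and within_80_125 are made up for illustration.

```sas
/* Sketch: back-transform the log-scale T vs. R estimate and its 90% CI.
   Assumes se_lnauct was created by ODS OUTPUT during PROC MIXED.       */
data be_ratio;
  set se_lnauct;
  ratio_pct = 100 * exp(Estimate);   /* geometric mean ratio, %  */
  lower_pct = 100 * exp(Lower);      /* lower 90% limit, %       */
  upper_pct = 100 * exp(Upper);      /* upper 90% limit, %       */
  within_80_125 = (lower_pct >= 80 and upper_pct <= 125);  /* 1 = inside limits */
  keep Label ratio_pct lower_pct upper_pct within_80_125;
run;

proc print data=be_ratio noobs;
run;
```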
You asked: How I can calculate the intra CV's and inter CV for this
replicate design?
The intra-subject variabilities are still calculated from the
residuals as in a 2-way crossover, using:
IntraCV_T = 100%*sqrt(exp(sig_WT^2)-1) (in this case,
100*sqrt(exp(0.124)-1) = 36.3%)
IntraCV_R = 100%*sqrt(exp(sig_WR^2)-1) (in this case,
100*sqrt(exp(0.242)-1) = 52.3%)
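FA(1,1) and FA(2,1) are the inter-subject standard deviations for Test
and Reference products, respectively. (Strictly, FA(2,1) only equals
sig_BR here because FA(2,2) is essentially zero; in general
sig_BR^2 = FA(2,1)^2 + FA(2,2)^2.)
InterCV_T = 100%*sqrt(exp(sig_BT^2)-1) = 100*sqrt(exp(0.5928^2)-1) = 64.9%
InterCV_R = 100%*sqrt(exp(sig_BR^2)-1) = 100*sqrt(exp(0.4968^2)-1) = 52.9%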
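A minimal sketch of the inter-subject CV arithmetic, written so that it also covers the case where FA(2,2) is not zero. It assumes the FA parameters are the elements of the lower-triangular factor A with G = AA' (as described in the SAS documentation linked above) and that Test is the first treatment level. The data set and variable names are made up for illustration.

```sas
/* Sketch: inter-subject CV (%) from the FA0(2) parameters.
   Assumption: G = AA' with A lower triangular, Test = first level. */
data inter_cv;
  a11 = 0.5928;   /* FA(1,1) */
  a21 = 0.4968;   /* FA(2,1) */
  a22 = 2E-17;    /* FA(2,2) */

  var_bt = a11**2;             /* between-subject variance, Test      */
  var_br = a21**2 + a22**2;    /* between-subject variance, Reference */

  inter_cv_t_pct = 100 * sqrt(exp(var_bt) - 1);   /* about 64.9 */
  inter_cv_r_pct = 100 * sqrt(exp(var_br) - 1);   /* about 52.9 */
run;

proc print data=inter_cv noobs;
  var var_bt var_br inter_cv_t_pct inter_cv_r_pct;
run;
```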
I wasn't 100% sure about this, but then asked a VERY kind colleague of
mine today, who had previously found the answer in Chow and Liu,
Design and Analysis of Crossover Trials (p345). It seems counter-
intuitive that the output for the residual would be a variance, and
the FA(x,x) output would be a standard deviation (note, we didn't
square the number for intra-CV but we did for inter-CV). How are we
supposed to know this just by looking at the SAS output? C'est la vie,
I guess. One of the many mind-boggling issues that keeps
biostatisticians in high demand.
Going beyond your question:
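- The subject-by-formulation interaction is composed of the following
terms:
sig_D^2 = sig_BT^2 + sig_BR^2 - 2*rho*sig_BT*sig_BR (sorry no subscript
in plain text, the underscore here denotes subscript; rho is the
between-subject correlation of Test and Reference)
Since FA0(2) builds the G matrix as AA' from the FA parameters, this
works out to sig_D^2 = (FA(1,1) - FA(2,1))^2 + FA(2,2)^2, so keep in
mind that FA(2,2) on its own is not sig_D.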
- Here's some news, if the sig_D > 0.15, that means you have evidence
of a subject-by-formulation interaction. Otherwise the interaction is
considered negligible.
- The total CV is calculated using:
CV=sqrt(sig_D^2 + (sig_WT^2+sig_WR^2)/2)
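The interaction term and the 0.15 check mentioned above can be sketched as follows. This assumes the FA parameters are the elements of the lower-triangular factor A with G = AA', so that sig_D^2 = G11 + G22 - 2*G21. The data set and variable names are made up for illustration.

```sas
/* Sketch: subject-by-formulation interaction from the FA0(2) parameters.
   Assumption: G = AA' with A lower triangular.                          */
data sig_d_check;
  a11 = 0.5928;   /* FA(1,1) */
  a21 = 0.4968;   /* FA(2,1) */
  a22 = 2E-17;    /* FA(2,2) */

  g11 = a11**2;
  g21 = a11 * a21;
  g22 = a21**2 + a22**2;

  sig_d_sq = g11 + g22 - 2*g21;     /* same as (a11 - a21)**2 + a22**2 */
  sig_d    = sqrt(sig_d_sq);        /* about 0.096 for these numbers   */
  exceeds_015 = (sig_d > 0.15);     /* 1 = above the 0.15 threshold    */
run;

proc print data=sig_d_check noobs;
  var sig_d_sq sig_d exceeds_015;
run;
```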
You asked: What is the use of the REPEATED statement?
The repeated statement is there to indicate the within-subject
variance estimates for Test and Reference should be derived
separately. This comes out of Scott Patterson and Byron Jones' book,
"Bioequivalence and Statistics in Clinical Pharmacology", Chapman &
Hall/CRC, 2006. Take the repeated statement out of the model, you only
get one residual, and one intraCV. Too bad for you.
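One way to see this is to fit the model with and without the REPEATED statement and compare the covariance parameter tables. This is a sketch only: the data set name ABCD and the response ln_auc are borrowed from Matz's code, the class variables follow the FDA code, and the output data set names cov_separate and cov_pooled are made up for illustration.

```sas
/* With REPEATED: one residual variance per treatment. */
proc mixed data=ABCD;
  class seq subj per trt;
  model ln_auc = seq per trt / ddfm=satterth;
  random trt / type=fa0(2) sub=subj g;
  repeated / grp=trt sub=subj;
  ods output CovParms=cov_separate;
run;

/* Without REPEATED: a single pooled residual variance. */
proc mixed data=ABCD;
  class seq subj per trt;
  model ln_auc = seq per trt / ddfm=satterth;
  random trt / type=fa0(2) sub=subj g;
  ods output CovParms=cov_pooled;
run;

proc print data=cov_separate noobs; run;
proc print data=cov_pooled noobs; run;
```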
You asked: What is the main difference between DDFM=SATTERTH and
DDFM=KENWARDROGER? Which one is better?
These are two different methods in dealing with degrees of freedom.
The SATTERTH method comes from Satterthwaite F. Synthesis of variance.
Psychometrika 1941; 6:309-316. The main assumption here (according to
Patterson) is that the variance term is estimated based on the sum of
independent centrally chi^2 distributed variates. Kenward-Roger is a
newer method, and involves creating a scale factor for a Wald
statistic (aren't you sorry you asked?). Here is an excerpt from the
original Kenward-Roger paper:
"The calculation of the scale factor for the Wald statistics allows us
to include as special cases those settings where the statistics have
exact F distributions. Further, it leads to consistency among the
degrees of freedom associated with Wald statistics for nested linear
combinations of fixed effects. The overall procedure can therefore be applied in an
automatic way to construct tests and confidence intervals for fixed
effects without first having to separate out special cases, such as
analysis of variance F-ratios. All such special forms will be
reproduced exactly by the procedure.
The degrees of freedom and scale factor also provide some insight into
the structure of the data and model." (Kenward M, Roger J. Small
sample inference for fixed effects from restricted maximum likelihood.
Biometrics 1997; 53:983-997.)
It makes for great chit-chat at parties. Really though, KENWARDROGER
can be applied to replicate studies, according to Patterson's book,
but the FDA specifies SATTERTH in this case. For everything else in
average bioequivalence, people seem to stick with KENWARDROGER if they
use PROC MIXED. Which one is better? I can't really answer that, since
I don't really have a good working knowledge of either of them.
Comments from the board are welcomed, again. I will tell you one thing
though, if you have a balanced study population, it will make little
if any difference. All of this reminds me of my favourite quote,
"models are to be used, and not believed in." Although this can work
on message boards, it is not to be used as a fall-back during a thesis
defense.
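If you want to check how much the choice matters for your own data, one option is to run the FDA model under both settings and put the results side by side. This is a sketch under assumptions: the data set ABCD and response ln_auc are borrowed from Matz's code, and the output data set names (est_satterth, est_kr, ddfm_compare) are made up for illustration.

```sas
/* Fit the same model with each degrees-of-freedom method. */
proc mixed data=ABCD;
  class seq subj per trt;
  model ln_auc = seq per trt / ddfm=satterth;
  random trt / type=fa0(2) sub=subj g;
  repeated / grp=trt sub=subj;
  estimate 'T vs. R' trt 1 -1 / cl alpha=0.1;
  ods output Estimates=est_satterth;
run;

proc mixed data=ABCD;
  class seq subj per trt;
  model ln_auc = seq per trt / ddfm=kenwardroger;
  random trt / type=fa0(2) sub=subj g;
  repeated / grp=trt sub=subj;
  estimate 'T vs. R' trt 1 -1 / cl alpha=0.1;
  ods output Estimates=est_kr;
run;

/* Stack the two results to compare DF and confidence limits. */
data ddfm_compare;
  length method $12;
  set est_satterth(in=from_satterth) est_kr;
  if from_satterth then method = 'SATTERTH';
  else method = 'KENWARDROGER';
  keep method Estimate StdErr DF Lower Upper;
run;

proc print data=ddfm_compare noobs;
run;
```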
You asked: In this code there is no SUB(SEQ) effect. Suppose if I want
to include SUB(SEQ) effect, is there any problem if I use the
following code. (your code provided).
In so far as the code is wrong, yes, there is a problem. You won't be
able to budget your variances properly and tease out the intraCVs for
Test and Reference products - which means you just blew the analysis
of a really long study. I can see you want to use "random
subject(sequence)", because you're used to seeing it in a 2-way
crossover design. However, you're missing the point of the replicate
analysis by using the 2-way approach on it. This statement does not
belong in the replicate design, it is replaced by the random line
where (I believe) subject is defined as a random variable.
I hope this helps a bit, it sure helped my understanding of the
replicate design analysis (now I feel like I've graduated from barely
scratching the surface, to barely scratching the surface with a few
extra references).
Cheers,
-Dave
--
David Dubins, Ph.D., B.Eng.
Global Bioequivalence Consulting
Assistant Professor, Leslie Dan Faculty of Pharmacy
University of Toronto
Back to the Top
Dear Dave,
Thank you so much for this long explanation about the analysis of
replicate designs. Now I have some idea about the SAS statements and
their interpretation. It will be very useful for me.
One more question: how can I include the SUB(SEQ) effect in the
analysis (as per the sponsor's requirement)? What is the use of the term?
Regards
Matz
Back to the Top
The following message was posted to: PharmPK
Hi Matthew,
You can include it in the model like this:
/* RANDOM SUB(SEQ) */
Declaring Subject(Sequence) as a random variable is acknowledging that
different subjects will be coming in each time. For instance, there
will always be 2 periods in a 2-way crossover (so that's a fixed
variable), but subject is a random variable because you can't set who
walks in the door each time. This statement doesn't go in the
replicate design, it's a different model. I'm not good enough at
modeling to tell you why, only that the statement doesn't get used for
replicates. Perhaps subject is defined as a random variable in this
line?
RANDOM TRT/TYPE=FA0(2) SUB=SUBJ G;
Cheers,
-Dave
Copyright 1995-2011 David W. A. Bourne (david@boomer.org)