- On 8 Mar 2006 at 11:57:00, "Tata, Prasad N" (Prasad.Tata.-at-.TycoHealthcare.com) sent the message

In BE studies it is customary to calculate within-subject
variability for the purpose of power calculations. We routinely
log-transform Cmax and AUCs for calculating the 90% CI. My question is:
do we get different values for within-subject variability (%CV) if we
use log-transformed Cmax and AUC values as opposed to non-transformed
values?

Prasad

Prasad NV Tata, Ph.D., FCP

Manager-Pharmacokinetics

Mallinckrodt, Inc.

675 McDonnell Blvd.

Saint Louis, MO 63134

Tel: (314) 654-5325

Fax: (314) 654-9325

e-mail: prasad.tata.at.tycohealthcare.com - On 9 Mar 2006 at 14:39:27, "Heuvel, M.W. van den (Michiel)" (michiel.vandenheuvel.at.organon.com) sent the message

The following message was posted to: PharmPK

Hello,

You certainly get different values, because you use different formulae.

Since we standardly use log transformations in BE studies (because PK
parameters tend to be more lognormally distributed), the within-subject
CV estimate based on log-transformed data [calculated as
CV=100*sqrt(exp(MSE)-1)] is a better estimate of the real
within-subject variation than the one based on non-transformed data
[calculated as CV=100*SD/arithmeticmean].
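As a rough illustration of the two formulae, here is a minimal Python sketch. The function names, the MSE value of 0.04 and the three raw values are made up for the example; they do not come from any study.

```python
import math
import statistics


def cv_from_mse(mse):
    """CV (%) from the residual variance (MSE) of ln-transformed data."""
    return 100.0 * math.sqrt(math.exp(mse) - 1.0)


def cv_from_raw(values):
    """CV (%) of untransformed data: 100 * SD / arithmetic mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical inputs, for illustration only.
print(round(cv_from_mse(0.04), 2))
print(round(cv_from_raw([80.0, 100.0, 120.0]), 2))
```

The first function needs only the MSE from the ANOVA on the log scale; the second works directly on the raw values and ignores any model effects.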

Kind regards,

Michiel van den Heuvel

Organon - On 10 Mar 2006 at 15:54:16, "Hans Proost" (j.h.proost.aaa.rug.nl) sent the message

Dear Michiel,

You wrote:

> Since we standardly use log transformations in BE studies (because PK
> parameters tend to be more lognormally distributed), the within-subject
> CV estimate based on log transformed data [calculated as
> CV=100*sqrt(exp(MSE)-1)] is a better estimate of the real
> within-subject variation than the one based on non-transformed data
> [calculated as CV=100*SD/arithmeticmean].

I agree, but why do you not simply use

CV=100*sqrt(MSE)

instead of

CV=100*sqrt(exp(MSE)-1)

The latter equation converts the standard deviation of the log-normal
distribution to the standard deviation of the 'corresponding' normal
distribution. But I do not see any reason for doing so, since for the
statistical comparison of the geometric (logarithmic) means the standard
deviation of the log-normal distribution [ sqrt(MSE) ] is used, so again
there is no need for conversion. Please note that the latter equation
gives higher values for CV than the first equation; at CV = 20% the
difference is small (20.2% versus 20%), but the difference increases
rapidly with CV.
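To see how quickly the two expressions diverge, the following Python sketch tabulates both for a few values of sigma = sqrt(MSE). The function names and the chosen sigma values are illustrative assumptions.

```python
import math


def exact_cv(sigma):
    """100 * sqrt(exp(MSE) - 1), with MSE = sigma**2 on the ln scale."""
    return 100.0 * math.sqrt(math.exp(sigma ** 2) - 1.0)


def approx_cv(sigma):
    """100 * sqrt(MSE), i.e. 100 * sigma."""
    return 100.0 * sigma


for sigma in (0.1, 0.2, 0.3, 0.5, 1.0):
    print(f"sigma={sigma:.1f}  100*sqrt(MSE)={approx_cv(sigma):6.1f}  "
          f"100*sqrt(exp(MSE)-1)={exact_cv(sigma):6.1f}")
```

At sigma = 0.2 the two agree to within about 0.2 percentage points; at sigma = 0.5 the gap is already more than 3 points.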

Best regards,

Hans Proost

Johannes H. Proost

Dept. of Pharmacokinetics and Drug Delivery

University Centre for Pharmacy

Antonius Deusinglaan 1

9713 AV Groningen, The Netherlands

tel. 31-50 363 3292

fax 31-50 363 3247

Email: j.h.proost.-at-.rug.nl - On 10 Mar 2006 at 15:24:22, "Wang, Yaning" (yaning.wang.aaa.fda.hhs.gov) sent the message

Dear Michiel,

You wrote:

> Since we standardly use log transformations in BE studies (because PK
> parameters tend to be more lognormally distributed), the within-subject
> CV estimate based on log transformed data [calculated as
> CV=100*sqrt(exp(MSE)-1)] is a better estimate of the real
> within-subject variation than the one based on non-transformed data
> [calculated as CV=100*SD/arithmeticmean].

If a random variable (e.g. Cmax) follows a log-normal distribution,
those two equations should give you the same answer (not exactly the
same numbers, but both equations are correct). In a BE study,
CV=100*sqrt(exp(MSE)-1) is used because other fixed effects, such as
sequence, period and treatment effects, need to be corrected for first.
In other words, those Cmax values are not just a random sample from one
log-normal distribution. They are a mixture of many log-normal
distributions with different means. Therefore,
CV=100*SD/arithmeticmean cannot be directly used to estimate the
true CV.
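The mixture argument can be sketched in Python as follows. This is only an illustration under stated assumptions: two hypothetical groups with geometric means 100 and 150 share the same sigma of 0.2 on the ln scale, and evenly spaced quantiles stand in for random samples so that the result is reproducible.

```python
import math
import statistics


def lognormal_quantile_sample(geo_mean, sigma, n=2001):
    """Deterministic stand-in for a random sample: n evenly spaced quantiles."""
    z = statistics.NormalDist()
    return [geo_mean * math.exp(sigma * z.inv_cdf((i + 0.5) / n))
            for i in range(n)]


def cv_percent(values):
    """100 * SD / arithmetic mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


group_a = lognormal_quantile_sample(100.0, 0.2)  # hypothetical group
group_b = lognormal_quantile_sample(150.0, 0.2)  # same sigma, shifted mean

print("CV within group A:", round(cv_percent(group_a), 1))
print("CV within group B:", round(cv_percent(group_b), 1))
print("CV of pooled data:", round(cv_percent(group_a + group_b), 1))
```

Each group on its own gives a CV of roughly 20%, but pooling the two without correcting for the difference in means inflates the naive CV considerably.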

Dear Hans:

You wrote:

>I agree, but why do you not use simply
>CV=100*sqrt(MSE)
>instead of
>CV=100*sqrt(exp(MSE)-1)

CV=100*sqrt(MSE) is simply an approximation of CV=100*sqrt(exp(MSE)-1)
(the exact equation to calculate CV) for a log-normally distributed
variable. Reporting the EXACT CV cannot be worse than reporting an
approximation. As you mentioned, when CV increases, the difference
increases. Since the approximation is lower than the exact CV, the
exact CV should be reported.

Yaning Wang, Ph.D.

Senior Pharmacometrician and Clinical Pharmacologist

Office of Clinical Pharmacology and Biopharmaceutics

Center of Drug Evaluation and Research

Food and Drug Administration

Office: 301-796-1624

The contents of this message are mine personally and do not necessarily

reflect any position of the Government or the Food and Drug

Administration. - On 13 Mar 2006 at 09:51:57, "Hans Proost" (j.h.proost.at.rug.nl) sent the message

Dear Yaning,

Thank you for your comments. You wrote:

> CV=100*sqrt(MSE) is simply an approximation of CV=100*sqrt(exp(MSE)-1)
> (the exact equation to calculate CV) for a log-normally distributed
> variable.

I assume that we are talking about a different definition of CV. What
is a CV of a log-normal distribution? We can approach this in two ways:

1) CV is defined as the CV of the data assuming that the distribution is
normal. Then it can be calculated:

a) in the same way as for a normal distribution, i.e. CV = 100 * SD /
arithmeticmean

b) from CV = 100 * sqrt(exp(MSE)-1)

These equations provide different numbers, but for large n the two
values converge asymptotically (as can be concluded from the source code
given below). I agree that equation b) should be preferred.

The problem with this definition is that it calculates CV assuming that
the distribution is normal, in spite of the assumption that the
distribution is log-normal! This makes no sense to me.

2) CV is defined as 100 * SD of the log-normal distribution (only valid
if logarithms with base e are used!). This implies that
CV = 100 * sqrt(MSE). In this case the distribution is really treated
as a log-normal distribution.

To demonstrate the differences between the methods, I made a small
program in Pascal. In total 1,000,000 numbers were randomly drawn from a
log-normal distribution with mean = 100 and sigma = 0.5.

The arithmetic mean is 113.4 (!), the standard deviation is 60.4, and CV
according to 1a) is 53.3%.

The geometric mean is 100.0, the standard deviation is 0.500, and the CV
according to 2) is 50.0%.

The CV according to method 1b) is 53.3%.

Methods 1a) and 1b) provide very similar results, and therefore it can
be concluded that method 1b) indeed gives the CV of the data assuming a
normal distribution. Method 2) gives CV = 50% if sigma = 0.5. This
sounds good to me.
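The simulated figures can be cross-checked against the closed-form moments of the log-normal distribution. The Python sketch below assumes, as in the simulation, a geometric mean of 100 and sigma = 0.5 on the ln scale; the variable names are chosen for this example only.

```python
import math

geo_mean = 100.0   # geometric mean of X, i.e. exp(mean of ln X)
sigma = 0.5        # standard deviation of ln X

# Closed-form moments of a log-normal variable.
arith_mean = geo_mean * math.exp(sigma ** 2 / 2.0)
sd = arith_mean * math.sqrt(math.exp(sigma ** 2) - 1.0)

cv_1a = 100.0 * sd / arith_mean                         # 100 * SD / mean
cv_1b = 100.0 * math.sqrt(math.exp(sigma ** 2) - 1.0)   # from the ln variance
cv_2 = 100.0 * sigma                                    # 100 * SD of ln X

print(f"arithmetic mean = {arith_mean:.1f}, SD = {sd:.1f}")
print(f"CV 1a = {cv_1a:.1f}%, CV 1b = {cv_1b:.1f}%, CV 2 = {cv_2:.1f}%")
```

In closed form 1a) and 1b) are the same quantity, which is why the simulated values agree to within sampling error, while 2) is a different number by construction.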

--

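PROGRAM TestDist;
{ Draws N values from a log-normal distribution and prints mean, SD and CV
  according to methods 1a), 2) and 1b) described above. }

CONST
  N     = 1000000;
  Mean  = 100;    { geometric mean of X }
  Sigma = 0.5;    { standard deviation of Ln(X) }

VAR
  I : LONGINT;
  X, Sx, Sxx, Slx, Slxx, M, S, S2 : DOUBLE;

FUNCTION NormRandom : DOUBLE;
{ Generates a random value (mean value 0 and standard deviation 1) : N(0,1),
  using the Box-Muller method. 1-Random lies in (0,1], which avoids Ln(0). }
BEGIN
  NormRandom:=Cos(2*Pi*Random)*Sqrt(-2*Ln(1-Random))
END; {NormRandom}

BEGIN
  Randomize;
  Sx:=0; Sxx:=0; Slx:=0; Slxx:=0;
  FOR I:=1 TO N DO
  BEGIN
    X:=Mean*Exp(Sigma*NormRandom);
    Sx:=Sx+X;                { sum of X }
    Sxx:=Sxx+Sqr(X);         { sum of X squared }
    Slx:=Slx+Ln(X);          { sum of Ln(X) }
    Slxx:=Slxx+Sqr(Ln(X))    { sum of Ln(X) squared }
  END;
  WriteLn;
  { Method 1a: arithmetic mean, SD and CV of the untransformed values }
  M:=Sx/N;
  S2:=(Sxx-Sqr(Sx)/N)/(N-1);
  S:=Sqrt(S2);
  WriteLn('normal 1a)',M:20:2,S:20:2,100*S/M:20:2);
  { Method 2: geometric mean, and CV = 100 * SD of Ln(X) }
  M:=Exp(Slx/N);
  S2:=(Slxx-Sqr(Slx)/N)/(N-1);
  S:=Sqrt(S2);
  WriteLn('log    2) ',M:20:2,S*M:20:2,100*S:20:2);
  { Method 1b: CV = 100 * Sqrt(Exp(variance of Ln(X)) - 1) }
  S:=Sqrt(Exp(S2)-1);
  WriteLn('log    1b)',M:20:2,S*M:20:2,100*S:20:2);
  ReadLn    { wait for Enter before closing }
END.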

--

Best regards,

Hans Proost

Johannes H. Proost

Dept. of Pharmacokinetics and Drug Delivery

University Centre for Pharmacy

Antonius Deusinglaan 1

9713 AV Groningen, The Netherlands

tel. 31-50 363 3292

fax 31-50 363 3247

Email: j.h.proost.at.rug.nl - On 15 Mar 2006 at 16:31:08, "Heuvel, M.W. van den (Michiel)" (michiel.vandenheuvel.-a-.organon.com) sent the message

Dear Hans,

You wrote:

"1) CV is defined as the CV of the data assuming that the

distribution is normal. Then it can be calculated:

b) from CV = 100 * sqrt(exp(MSE)-1)

I agree that equation b) should be preferred. The problem with this

definition is that it calculates CV assuming that the distribution is

normal, in spite of the assumption that the distribution is log-normal!

This makes no sense to me."

Let me explain where this formula comes from. This is just stats
theory; unfortunately I do not have a good reference at hand.

Let's assume X is log-normally distributed; then Y = ln(X) is
normally distributed, let's say with mean MU and variance S2 (ln is
the natural logarithm).

Then it can be derived that the expectation of X equals
E(X) = exp(MU+0.5*S2), known as geometric mean

(as received, the equation above read E(X) = exp(MUT.5*S2), where the T
had a little tail to the right; I'm not sure what will be sent in the
plain text conversion. Michiel, I hope the rest of your message got
through OK -db)

And variance Var(X) = exp(2*MU+S2)*(exp(S2)-1)

So that the relative standard deviation, i.e. coefficient of

variation CV, equals

CV(X) = 100 * sqrt(Var(X)) / E(X) = 100 * sqrt(exp(S2)-1)

Where in ANOVA settings S2 is estimated by MSE from your ANOVA.

As noted in a previous mail, sqrt(exp(MSE)-1) can be approximated by
sqrt(MSE), which is a simple mathematical Taylor approximation of
exp(MSE)-1 by MSE for MSE close to zero. This is the case which you
incorrectly called a log-normal distribution in your formula 2). The
only thing we assumed here is that the log values of the
concentrations are normally distributed.

Apart from this mathematics: the most obvious difference between the
normal and lognormal distributions has to do with this CV. For a
log-normal distribution CV is independent of MU (mean), which means
that the RELATIVE standard deviation is the same for all levels and
thus standard deviation increases with levels, which we observe very
much in our PK data. For a normal distribution one assumes that
standard deviation is constant with mean, which means that CV
decreases with increasing mean.

With respect to the simulation you made in Pascal:

I think you should calculate X as Exp(Mean+Sigma*NormRandom). Then

you will find other results.

Furthermore you used a rather small SD of 0.5 compared to the mean of

100 and then the difference between a normal and a lognormal

distribution is not so big yet. Maybe try SD of 50 and mean of 100?

Kind regards,

Michiel van den Heuvel

Organon

Netherlands - On 15 Mar 2006 at 18:37:59, "Heuvel, M.W. van den (Michiel)" (michiel.vandenheuvel.at.organon.com) sent the message

[This might be clearer, Michiel resent the message as plain text. My

attempt wasn't quite right - db]

Dear Hans,

You wrote:

"1) CV is defined as the CV of the data assuming that the distribution

is normal. Then it can be calculated:

b) from CV = 100 * sqrt(exp(MSE)-1)

I agree that equation b) should be preferred. The problem with this

definition is that it calculates CV assuming that the distribution is

normal, in spite of the assumption that the distribution is log-normal!

This makes no sense to me."

Let me explain where this formula comes from. This is just stats theory;
unfortunately I do not have a good reference at hand.

Let's assume X is log-normally distributed; then Y = ln(X) is normally
distributed, let's say with mean MU and variance S2 (ln is the natural
logarithm).

Then it can be derived that the expectation of X equals
E(X) = exp(MU+0.5*S2), known as geometric mean

And variance Var(X) = exp(2*MU+S2)*(exp(S2)-1)

So that the relative standard deviation, i.e. coefficient of variation

CV, equals

CV(X) = 100 * sqrt(Var(X)) / E(X) = 100 * sqrt(exp(S2)-1)

Where in ANOVA settings S2 is estimated by MSE from your ANOVA.

As noted in a previous mail, sqrt(exp(MSE)-1) can be approximated by
sqrt(MSE), which is a simple mathematical Taylor approximation of
exp(MSE)-1 by MSE for MSE close to zero. This is the case which you
incorrectly called a log-normal distribution in your formula 2). The
only thing we assumed here is that the log values of the concentrations
are normally distributed.
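For reference, the derivation can be written out compactly, with MU written as \mu and S2 as \sigma^2. This is the standard log-normal result restated, not an additional assumption.

```latex
\begin{align*}
Y = \ln X &\sim N(\mu, \sigma^2) \\
E(X) &= e^{\mu + \sigma^2/2} \\
\operatorname{Var}(X) &= e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right) \\
CV(X) &= \frac{\sqrt{\operatorname{Var}(X)}}{E(X)}
       = \frac{e^{\mu + \sigma^2/2}\sqrt{e^{\sigma^2} - 1}}{e^{\mu + \sigma^2/2}}
       = \sqrt{e^{\sigma^2} - 1} \\
e^{\sigma^2} - 1 &= \sigma^2 + \frac{\sigma^4}{2} + \dots
  \quad\Longrightarrow\quad CV(X) \approx \sigma \text{ for small } \sigma^2
\end{align*}
```

Multiplying by 100 gives the percentage form, with \sigma^2 estimated by the MSE.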

Apart from this mathematics: the most obvious difference between the
normal and lognormal distributions has to do with this CV. For a
log-normal distribution CV is independent of MU (mean), which means that
the RELATIVE standard deviation is the same for all levels and thus
standard deviation increases with levels, which we observe very much in
our PK data. For a normal distribution one assumes that standard
deviation is constant with mean, which means that CV decreases with
increasing mean.
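A small Python sketch of this point, using the moment formulae given earlier in this message. The function name and the values of MU and S2 are hypothetical, chosen only to show the pattern.

```python
import math


def lognormal_moments(mu, s2):
    """Return (mean, SD, CV%) of X when ln(X) ~ N(mu, s2)."""
    mean = math.exp(mu + 0.5 * s2)
    sd = math.sqrt(math.exp(2.0 * mu + s2) * (math.exp(s2) - 1.0))
    return mean, sd, 100.0 * sd / mean


s2 = 0.04  # hypothetical variance on the ln scale
for geo_mean in (10.0, 100.0, 1000.0):
    mean, sd, cv = lognormal_moments(math.log(geo_mean), s2)
    print(f"geometric mean={geo_mean:7.1f}  mean={mean:8.2f}  "
          f"SD={sd:7.2f}  CV={cv:5.2f}%")
```

The SD grows in proportion to the level while the CV stays the same in every row.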

With respect to the simulation you made in Pascal:

I think you should calculate X as Exp(Mean+Sigma*NormRandom). Then you

will find other results.

Furthermore you used a rather small SD of 0.5 compared to the mean of

100 and then the difference between a normal and a lognormal

distribution is not so big yet. Maybe try SD of 50 and mean of 100?

Kind regards,

Michiel van den Heuvel

Organon

Netherlands - On 15 Mar 2006 at 16:19:06, "Wang, Yaning" (yaning.wang.aaa.fda.hhs.gov) sent the message

Dear Hans:

You wrote:

>I assume that we are talking about a different definition of CV.

As far as I know, there is only one kind of definition for CV. That is
SD/Mean irrespective of the distribution. If you want to convert it to a
percentage, then do 100*SD/Mean.

>What is a CV of a log-normal distribution?

See Michiel van den Heuvel's reply or read

http://en.wikipedia.org/wiki/Log-normal_distribution and apply

CV=SD/Mean=sqrt[Var(X)]/E(X). The only thing I want to add is that the

expectation of X, E(X)=exp(MU+0.5*S2), is the arithmetic mean, not the

geometric mean, of X. The geometric mean of X is equal to the median

of X.

(Note: X follows log-normal distribution, i.e. ln(X)~N(MU, S2))
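The distinction between the three measures of location can be checked numerically. The Python sketch below is an illustration only: MU = ln(100) and S2 = 0.25 are assumed values, and evenly spaced quantiles are used instead of random draws so the output is reproducible.

```python
import math
import statistics

MU = math.log(100.0)   # assumed mean of ln(X)
S2 = 0.25              # assumed variance of ln(X)

# Deterministic stand-in for a large sample: evenly spaced quantiles.
n = 10001
z = statistics.NormalDist()
sample = [math.exp(MU + math.sqrt(S2) * z.inv_cdf((i + 0.5) / n))
          for i in range(n)]

arithmetic_mean = statistics.mean(sample)
geometric_mean = statistics.geometric_mean(sample)
median = statistics.median(sample)

print("arithmetic mean:", round(arithmetic_mean, 1),
      " exp(MU+0.5*S2):", round(math.exp(MU + 0.5 * S2), 1))
print("geometric mean: ", round(geometric_mean, 1),
      " exp(MU):", round(math.exp(MU), 1))
print("median:         ", round(median, 1))
```

The geometric mean and the median both land on exp(MU), while the arithmetic mean is pulled upward to about exp(MU+0.5*S2).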

Yaning Wang, Ph.D.

Senior Pharmacometrician and Clinical Pharmacologist

Office of Clinical Pharmacology and Biopharmaceutics

Center of Drug Evaluation and Research

Food and Drug Administration

Office: 301-796-1624

The contents of this message are mine personally and do not necessarily

reflect any position of the Government or the Food and Drug

Administration. - On 16 Mar 2006 at 14:43:27, "J.H.Proost" (J.H.Proost.-a-.rug.nl) sent the message

Dear Michiel,

Thank you for your reply. Your theoretical explanation is

correct and beyond discussion. You wrote:

> As noted in a previous mail, sqrt(exp(MSE)-1) can be

> approximated by sqrt(MSE), which is simple mathematical

> Taylor approximation from exp(MSE)-1 into MSE for MSE

> close to zero.

Yes, stated this way, sqrt(MSE) is an approximation. But

this is not what I meant. Sqrt(MSE) is also exactly equal

to the standard deviation of the lognormal distribution.

> This is the case which you called incorrectly a

> log-normal distribution in your formula 2). The

> only thing we assumed here is that the logvalues of

> the concentrations are normally distributed.

According to my sources, and to your statement given

above, both expressions are equivalent. What did I say

incorrectly?

> Apart from this mathematics: the most obvious

> differences between normal and lognormal distribution

> has to do with this CV. For a log-normal

> distribution CV is independent from MU (mean), which

> means that the RELATIVE standard deviation is the

> same for all levels

Yes, we agree. But, a few lines earlier you wrote:

> variance Var(X) = exp(2*MU+S2)*(exp(S2)-1)

This would imply that var(x) is dependent on MU (albeit

only to a limited extent).

> With respect to the simulation you made in Pascal:

> I think you should calculate X as

> Exp(Mean+Sigma*NormRandom).

> Then you will find other results.

No. In my simulations, Mean = 100, and refers to the

geometric mean of the assumed log-normal distribution.

This would give values around 2.688E+43, which is not a

usual range.
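The two parameterisations can be compared directly. In this Python sketch the value z = 1.0 is an arbitrary standard normal deviate picked for illustration.

```python
import math

geo_mean = 100.0   # "Mean" in the Pascal program: geometric mean of X
sigma = 0.5        # standard deviation of ln(X)
z = 1.0            # an arbitrary standard normal deviate

# As coded in the program: the geometric mean is a multiplier.
x_as_coded = geo_mean * math.exp(sigma * z)

# Equivalent form on the ln scale: the location is ln(geometric mean).
x_equivalent = math.exp(math.log(geo_mean) + sigma * z)

# Taking Mean = 100 literally as the mean of ln(X), at z = 0.
x_literal = math.exp(100.0)

print(x_as_coded, x_equivalent, x_literal)
```

The first two expressions are the same number; the third is exp(100), of the order of 1E+43.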

> Furthermore you used a rather small SD of 0.5

> compared to the mean of 100 and then the difference

> between a normal and a lognormal

> distribution is not so big yet. Maybe try SD of 50 and

> mean of 100?

I did, so I don't understand your question. I used the

term Sigma for the standard deviation of the log-normal

distribution.

In my earlier statement I wrote:

>> The problem with this definition is that it calculates

>> CV assuming that the distribution is normal, in spite

>> of the assumption that the distribution is log-normal!

>> This makes no sense to me.

I still do not have a reply to this comment. Perhaps I can

state it in a different way. Assume, we have a series of

data, and we have sufficient evidence that the data are

log-normally distributed, or some authority states that we

have to analyse the data assuming a log-normal

distribution.

Why would one calculate a CV?

a) To report a value. OK, any method is acceptable.

b) To get some idea of the degree of variability. OK, any

method is acceptable.

c) To apply in a statistical test.

Ad c): The only meaningful tests are either a

distribution-free test, or a test based on the assumption

of a log-normal distribution, which is equivalent to a

test based on the assumption of a normal distribution

applied to the logarithms of the values. In such a test

one should use the variance or sd. This is not

variance Var(X) = exp(2*MU+S2)*(exp(S2)-1)

Instead the variance of Y should be used, which is S2, and

is obtained e.g. from ANOVA Var(Y) = MSE = Sigma^2 in my

code example.

So we do not need to calculate CV according to 'your'

definition. We need MSE or Sigma as a measure of

variability. And I don't see why one should not interpret

a value Sigma = 0.5 as the 'log-normal equivalent' of a CV

= 50%. This is not an approximation, but a different view.

If data are log-normally distributed, they should be

analysed and interpreted as their logarithms. For

practical purposes, mean(Y) is usually transformed back to

the original units, thus providing a geometric mean as the

measure of central tendency. For Var, SD and CV there is

no reason for back-transformation.

In the example of my previous message: In my view, the

rational value of 'mean' is 100, sigma = 0.5, and CV 50%.

Best regards,

Hans Proost

Johannes H. Proost

Dept. of Pharmacokinetics and Drug Delivery

University Centre for Pharmacy

Antonius Deusinglaan 1

9713 AV Groningen, The Netherlands

tel. 31-50 363 3292

fax 31-50 363 3247

Email: j.h.proost.aaa.rug.nl - On 16 Mar 2006 at 15:00:53, "J.H.Proost" (J.H.Proost.aaa.rug.nl) sent the message

Dear Yaning,

Thank you for your reply. You wrote:

> As far as I know, there is only one kind of definition
> for CV. That is SD/Mean irrespective of the distribution.

OK. Perhaps my statement was somewhat provocative, and not

fully to the point. Indeed, an (arithmetic) mean and a

standard deviation can be calculated irrespective of the

distribution. But what I actually meant, is that mean and

sd cannot be interpreted without assuming a distribution.

Yes, they can be interpreted as a measure of central

tendency and a measure of variability, respectively. But

they cannot be used for a statistical test if the

distribution is unknown. And to me it makes no sense to

calculate a statistical parameter that cannot be

interpreted. But if a log-normal distribution is assumed,

we can calculate the geometric mean as a useful measure

of central tendency and the standard deviation of that

distribution as a useful measure of variability. This

standard deviation is 'sqrt(MSE)', in terms of the earlier

messages, and NOT 'sqrt(exp(MSE)-1)', as was demonstrated

in the example of my earlier message.

As explained more extensively in my message to Michiel, I

do not see any use in the equation 'sqrt(exp(MSE)-1)'.

Best regards,

Hans Proost

Johannes H. Proost

Dept. of Pharmacokinetics and Drug Delivery

University Centre for Pharmacy

Antonius Deusinglaan 1

9713 AV Groningen, The Netherlands

tel. 31-50 363 3292

fax 31-50 363 3247

Email: j.h.proost.aaa.rug.nl - On 25 May 2006 at 21:00:11, "Weining" (wyv666888.-a-.yahoo.com) sent the message

In a BE study, what is the formula to calculate the intersubject CV?
Where in the output of PROC MIXED can the components needed to
calculate the intersubject CV be found?

W - On 28 May 2006 at 22:07:24, Helmut Schütz (helmut.schuetz.-at-.bebac.at) sent the message

Hi Weining!

You wrote:

>

>In BE study, what is the formula to calculate the intersubject CV?

>

If you have performed your analysis on ln-transformed data (which I hope):

CV = sqrt ( exp (MSE) -1)

where MSE = Sum of squared residuals / DF

DF = degrees of freedom = n1 + n2 -2

n1, n2 = number of subjects in sequences 1 and 2

for a standard 2x2 cross-over design
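A minimal Python sketch of this calculation for a 2x2 cross-over, assuming complete data and using the within-subject period differences of the ln-transformed values (their pooled sum of squares, halved, is the residual sum of squares). The function names and the subject values are hypothetical. Note that the MSE obtained this way is the residual mean square, so the resulting CV reflects the residual (within-subject) variability.

```python
import math


def crossover_mse(seq1, seq2):
    """Residual mean square for a 2x2 cross-over on ln-transformed data.

    seq1, seq2: lists of (period 1, period 2) values per subject,
    on the original (untransformed) scale.
    """
    ss_resid = 0.0
    for seq in (seq1, seq2):
        d = [math.log(p1) - math.log(p2) for p1, p2 in seq]
        d_bar = sum(d) / len(d)
        # Var(period difference) = 2 * residual variance, hence the / 2.
        ss_resid += sum((x - d_bar) ** 2 for x in d) / 2.0
    df = len(seq1) + len(seq2) - 2
    return ss_resid / df


def cv_percent(mse):
    """CV (%) = 100 * sqrt(exp(MSE) - 1)."""
    return 100.0 * math.sqrt(math.exp(mse) - 1.0)


# Hypothetical Cmax values, three subjects per sequence.
sequence_1 = [(100.0, 110.0), (80.0, 75.0), (120.0, 140.0)]
sequence_2 = [(95.0, 90.0), (130.0, 115.0), (70.0, 85.0)]

mse = crossover_mse(sequence_1, sequence_2)
print("MSE:", round(mse, 5), " CV%:", round(cv_percent(mse), 1))
```

In practice the MSE would be read from the ANOVA table of the statistical package; the sketch only shows where the number comes from.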

best regards,

Helmut

--

Helmut Schuetz

BEBAC

Consultancy Services for Bioequivalence and Bioavailability Studies

Neubaugasse 36/11

1070 Vienna/Austria

tel/fax +43 1 2311746

Web http://BEBAC.at

BE/BA Forum http://forum.bebac.at - On 21 Aug 2006 at 09:11:33, gang li (gangli_stat.-at-.yahoo.com) sent the message

Hi,

I saw the discussion on this interesting topic. I have a few comments

to make:

1) Both STD and CV are statistics used to measure the variability of
the data. The question is "Which one is more appropriate to use?". If
it is appropriate to assume that the data follow the normal
distribution, then STD is independent of the mean and thus it is
enough to characterize the variability. If it is more appropriate to
assume that the data follow the log-normal distribution, then STD
depends on the mean and thus it alone is not appropriate to
characterize the variability, while CV is independent of the mean and
is more appropriate for measuring the variability.

2) If the data follow a log-normal distribution, CV=sqrt(exp(std^2)-1)
from probability theory, where std is the standard deviation of the
log-transformed data. In such a case, a linear mixed model with
random subject effects is usually fitted. To calculate the intra-subject
CV, the std^2 is estimated by the MSE; to calculate the inter-subject
CV, the std^2 is estimated by the variance estimate for the random
subject effect from the proc mixed procedure (proc glm can be used
too but with some more calculations).
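As a sketch of point 2), the same conversion can be applied to each variance component once the estimates are available. The Python snippet below assumes the two estimates have already been read from the model output; the dictionary keys and the numbers are made up for illustration and are not actual PROC MIXED output.

```python
import math


def cv_percent(variance):
    """CV (%) = 100 * sqrt(exp(variance) - 1), variance on the ln scale."""
    return 100.0 * math.sqrt(math.exp(variance) - 1.0)


# Hypothetical variance estimates from a mixed model on ln-transformed data.
estimates = {
    "subject": 0.09,    # variance of the random subject effect
    "residual": 0.04,   # residual variance (MSE)
}

inter_subject_cv = cv_percent(estimates["subject"])
intra_subject_cv = cv_percent(estimates["residual"])

print("inter-subject CV%:", round(inter_subject_cv, 1))
print("intra-subject CV%:", round(intra_subject_cv, 1))
```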

Gerry Li

Senior Statistician, PhD

Collegeville, PA, 19426 - On 26 Aug 2006 at 13:36:30, "yogesh sonawane" (yogesh.sonawane.at.rediffmail.com) sent the message

Dear Gerry,

Yes, CV is independent of the mean and it is more appropriate to be
used to measure the variability. Also, it is independent of the unit.

About STD: can we say directly that std is independent of the mean?

std is the sqrt of the variance, where
variance = 1/n * sum over i (i=1 to n) of (Xi - Mean of X)^2

Yogesh - On 28 Aug 2006 at 09:27:53, "gang li" (gangli01.-at-.gmail.com) sent the message

Hi Yogesh,

If the measurements are independent and follow the same normal
distribution (this condition may be relaxed further), then the sample
STD by your formula (which is not unbiased in small samples, but
works fine in large samples) is independent of the sample mean. Which
one (CV or STD) is more appropriate to describe the variability really
depends on the data-generating mechanism.
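The small-sample remark concerns the divisor: the formula with 1/n gives a variance that is systematically too small, while the usual sample variance divides by n-1. A short Python sketch with made-up data shows the difference.

```python
import statistics

data = [4.0, 6.0, 8.0, 10.0]   # hypothetical measurements

sd_n = statistics.pstdev(data)          # divides the sum of squares by n
sd_n_minus_1 = statistics.stdev(data)   # divides the sum of squares by n - 1

print("variance with 1/n:    ", sd_n ** 2)
print("variance with 1/(n-1):", sd_n_minus_1 ** 2)
```

With only four values the two variances differ by a factor of 4/3; the ratio n/(n-1) approaches 1 as n grows.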

Gerry

Want to post a follow-up message on this topic? If this link does not work with your browser send a follow-up message to PharmPK@boomer.org with "Simple Statistics question in BE studies" as the subject

Copyright 1995-2010 David W. A. Bourne (david@boomer.org)