- On 30 Aug 2005 at 11:00:12, "sulagna" (sdas.aaa.clinsearchlabs.com) sent the message

Dear All

Could anyone tell me the distinction between the multiple comparison tests, Tukey's and Student-Newman-Keuls?

How is Tukey's test better, and why? (as it seems so)

Regards

Sulagna

- On 30 Aug 2005 at 11:50:52, Navdeep Randhawa (Navdeep.Randhawa.aaa.biovail.com) sent the message

Hi Sulagna,

The Student-Newman-Keuls test has more power than Tukey's test. Having said that, it does not control the Type I error rate (i.e. it could be greater than 5%), whereas with Tukey's test you can adjust for multiple comparisons. Also, the Student-Newman-Keuls test does not generate 95% CIs for each difference.

I would suggest using Tukey's test.

Best regards,

Nav

--

Nav Randhawa, MSc

Biostatistician, Pharmacokinetics and Statistics

Biovail Contract Research

Email: navdeep.randhawa.aaa.biovail.com

URL: www.biovail-cro.com

- On 30 Aug 2005 at 12:17:25, Angusmdmclean.-at-.aol.com sent the message

The multiple comparison tests (of means) you cite are brought into

play when you want to compare all pairs of means you have: the Tukey

and Student-Newman-Keuls tests are related and report identical results

when comparing the largest with the smallest means. With other

comparisons, Tukey's method is more conservative, but may miss real

differences too often. On the other hand the Student-Newman-Keuls

method is more discriminating and may mistakenly find differences too

often. There may not be general agreement about which one to use

i.e. which is the better of the two tests (of the means).

Perhaps the best approach is to test both ways and report the results

with an interpretive commentary appropriate for the science under

consideration.

Hope above helps,

Angus McLean Ph.D,

8125 Langport Terrace,

Suite 100,

Gaithersburg,

MD 20877

tel 301-869-1009

fax 301-869-5737

BioPharm Global Inc.

- On 30 Aug 2005 at 20:33:55, pharmaco kinetics (phar_res1.-a-.yahoo.com) sent the message

Dear Sulagna,

The Newman-Keuls test has more power. This means it can find that a difference between two groups is 'statistically significant' in some cases where the Tukey test would conclude that the difference is 'not statistically significant'. But this extra power comes at a price. Although the whole point of multiple comparison post tests is to keep the chance of a Type I error in any comparison at 5%, in fact the Newman-Keuls test doesn't do this. In some cases, the chance of a Type I error can be greater than 5%. Another problem is that, because the Newman-Keuls test works in a sequential fashion, it cannot produce 95% confidence intervals for each difference. Because the Newman-Keuls test doesn't control the error rate and doesn't generate confidence intervals, the Tukey test is better.

- On 31 Aug 2005 at 11:09:47, "Hans Proost" (j.h.proost.-a-.rug.nl) sent the message

The following message was posted to: PharmPK

Dear Angus,

You wrote:

> Tukey's method is more conservative, but may miss real

> differences too often.

This sounds quite strange. How do you know if you miss a 'real difference' (by the way, what is a 'real difference')?

> Perhaps the best approach is to test both ways and report the results

> with an interpretive commentary appropriate for the science under

> consideration.

This also sounds strange. Using two different tests is considered bad practice in statistics. The appropriate statistical test must be selected before the experiment is performed, or at least before the data are known.

Others have pointed out that the Newman-Keuls test does not restrict the Type I error to the chosen alpha (usually 0.05). This indeed means that it may mistakenly conclude that there is a significant difference 'too often'. In my opinion this also means that the Newman-Keuls test is not an appropriate test.

Any comments?

Johannes H. Proost

Dept. of Pharmacokinetics and Drug Delivery

University Centre for Pharmacy

Antonius Deusinglaan 1

9713 AV Groningen, The Netherlands

tel. 31-50 363 3292

fax 31-50 363 3247

Email: j.h.proost.aaa.rug.nl

- On 31 Aug 2005 at 12:13:09, Angusmdmclean.-at-.aol.com sent the message

To Johannes H. Proost:

From a regulatory perspective one should definitely define a priori, in a clinical or preclinical protocol, the statistical test and criteria for difference one will use, and be able to justify them: it is not at all good practice to use multiple tests subsequently. That is for sure, since it makes it look as though one is searching for a test allowing interpretation of the data the "way you want to." In other words, you are selecting a statistical test which is supportive of an argument you have already decided to make.

On the other hand, for inspection of preliminary experimental data I have not seen a lot wrong with exploring different statistical options upfront and discussing them upfront. Perhaps at that point one would justify selecting a statistical test for future studies. The commentary on Newman-Keuls/Tukey provided by yourself and others in the discussion group is most helpful information and could be referred to at that point, evidently generally in favor of Tukey.

One wonders whether there are any instances, depending on the type of data you have and the comparison you are making (where you have other relevant information available), where it would be more appropriate, despite the limitations described, to use the more discriminating test, i.e. Newman-Keuls; or should our position for multiple comparisons be to discard the Newman-Keuls completely in favor of the Tukey test?

Best Regards,

Angus McLean Ph.D,

8125 Langport Terrace,

Suite 100,

Gaithersburg,

MD 20877

tel 301-869-1009

fax 301-869-5737

- On 1 Sep 2005 at 13:06:04, "J.H.Proost" (J.H.Proost.aaa.rug.nl) sent the message

The following message was posted to: PharmPK

Dear Angus,

Thank you for your reply. I agree with your view on statistics for inspection of preliminary experimental data, i.e. this is indeed the usual way, I presume.

But my comment was also intended to point to this practice, which is at least a dangerous one. The results of such a statistical test practice are no more than 'preliminary statistical data', which is almost a 'contradictio in terminis'. We apply statistics to make clear decisions; they are always arbitrary (e.g. alpha = 0.05 is completely arbitrary), but should not be subjective. There is nothing like 'tend to be statistically different' or 'likely to be statistically different'.

Please note that I am not an expert in statistics, and I did not provide the information on the Newman-Keuls and Tukey tests, but I trust this information was correct. So I would like to hear the answer to your final question from the experts.

Johannes H. Proost

Dept. of Pharmacokinetics and Drug Delivery

University Centre for Pharmacy

Antonius Deusinglaan 1

9713 AV Groningen, The Netherlands

tel. 31-50 363 3292

fax 31-50 363 3247

Email: j.h.proost.at.rug.nl - On 1 Sep 2005 at 12:33:51, Angusmdmclean.at.aol.com sent the message

Back to the Top

Johannes H. Proost

I am not a statistical expert either; I put my question, reproduced here below, to Harvey Motulsky, MD, president of GraphPad Software in San Diego.

Again, my question relating to optional multiple comparison tests is:

"One wonders whether there are any instances, depending on the type of data you have and the comparison you are making (where you have other relevant information available), where it would be more appropriate, despite the limitations described, to use the more discriminating test, i.e. Newman-Keuls; or should our position for multiple comparisons be to discard the Newman-Keuls completely in favor of the Tukey test?"

Harvey sent me to his Website and I reproduce it here:

"How do I decide between the Tukey and Newman-Keuls multiple comparison test? FAQ# 1093

Both the Tukey test (also called the Tukey-Kramer test) and the Newman-Keuls (also called the Student-Newman-Keuls test) are used to compare all pairs of means following one-way ANOVA. Although these are called post tests, they can be performed regardless of the overall ANOVA results.

The Newman-Keuls test has more power. This means it can find that a difference between two groups is 'statistically significant' in some cases where the Tukey test would conclude that the difference is 'not statistically significant'. But this extra power comes at a price. Although the whole point of multiple comparison post tests is to keep the chance of a Type I error in any comparison at 5%, in fact the Newman-Keuls test doesn't do this [1]. In some cases, the chance of a Type I error can be greater than 5%. Another problem is that, because the Newman-Keuls test works in a sequential fashion, it cannot produce 95% confidence intervals for each difference.

Because the Newman-Keuls test has two strikes against it (doesn't control the error rate, doesn't generate confidence intervals) we recommend that you use the Tukey test instead.

[1] MA Seaman, JR Levin and RC Serlin, Psychological Bulletin 110:577-586, 1991."

Additionally, Harvey wrote to me personally on this topic of multiple comparison tests (see below):

"My understanding (really just repeating what I've read, and not an independent judgement) is that Newman-Keuls belongs in the same category as Fisher's LSD and Duncan's: of historical interest, and not to be used.

So what should you use? If you want both significance testing and confidence intervals, then I don't think anything beats Tukey's (all comparisons) or Dunnett's (all vs. control). If you just want significance testing and don't want or need confidence intervals, then I believe that Holm's test is what you want. I do plan to add this to Prism 5, but haven't done so yet. It isn't hard to do by hand. Glantz's "Primer of Biostatistics" explains it well."

Perhaps this question should be posted to a statistical discussion group.
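As a purely illustrative aside, the familywise error-rate inflation that the FAQ describes can be seen with a short calculation. This sketch assumes, for simplicity, that the pairwise comparisons are independent (real all-pairs comparisons share data, so the true inflation is somewhat smaller, but still well above 5%):

```python
from math import comb

alpha = 0.05  # per-comparison Type I error rate
for k in (3, 5, 10):                    # number of group means
    m = comb(k, 2)                      # all-pairs comparisons
    fwer = 1 - (1 - alpha) ** m         # chance of at least one false positive
    bonf = 1 - (1 - alpha / m) ** m     # with a Bonferroni-adjusted alpha
    print(f"{k} groups, {m} comparisons: "
          f"unadjusted FWER = {fwer:.3f}, Bonferroni = {bonf:.3f}")
```

With 5 groups (10 comparisons) the unadjusted chance of at least one spurious 'significant' pair is already about 40%, which is exactly why adjusted procedures such as Tukey's exist.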

Hope above helps,

Best Regards,

Angus McLean Ph.D,

8125 Langport Terrace,

Suite 100,

Gaithersburg,

MD 20877

tel 301-869-1009

fax 301-869-5737

[The GraphPad Prism URL http://www.graphpad.com/prism/Prism.htm and the link to the FAQ http://www.graphpad.com/faq/viewfaq.cfm?faq=1093 - db]

- On 1 Sep 2005 at 15:07:07, Prah.James.-a-.epamail.epa.gov sent the message

The following message was posted to: PharmPK

In response to Dr. Proost's comment that: "We apply statistics to make clear decisions; they are always arbitrary (e.g. alpha = 0.05 is completely arbitrary)"

A p-value decision is often made based on what is typically done, without much thought about why. The decision to select a p value should not be "completely arbitrary"; it should be based upon considering several factors:

1. A p value of 0.05 is commonly used as an adequate risk that the experimental differences had a relatively small likelihood of occurring by chance alone. A p-value (alpha) of 0.05 implies that if one did the same study 20 times, only once would the difference be attributable to chance, so one can fairly reject the null hypothesis with a probability of 0.05 of falsely rejecting the null hypothesis when it is true (Type I error). One can also determine the probability of accepting the null hypothesis when it is false (Type II error, beta), and consequently the probability of not making a Type II error (the power, 1 - beta) can be determined.

2. Whether a study has adequate statistical power is determined by 1 - beta. The number of subjects needed to achieve adequate statistical power can be calculated given the means and standard deviations. The larger the mean difference, the greater the statistical power (1 - beta). Power is the probability of detecting a true difference. Statistical power should be at least 0.80.

3. One should also know whether a one-tailed or a two-tailed test is desirable. If one knows in which direction the effect should occur, then a one-tailed test would require fewer subjects to be statistically adequate, both for significance and statistical power.

4. How important is the outcome? For example: in determining the effectiveness of a new anesthetic, which carries the possibility of an adverse outcome (death), one would want the p value to be much smaller than in determining the preferred color of a new medication.

5. Multiple statistical comparisons on the same data set carry the penalty of correcting for the number of tests (Bonferroni correction). This is an acceptable procedure for pilot studies, which should be confirmed with another study.
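Points 1-3 above can be made concrete with a small sketch. The numbers below are illustrative assumptions only (a difference of 20 between means of 80 and 100, SD 20, alpha = 0.05, power = 0.80), and the normal approximation is used, so dedicated software based on the t distribution will give a slightly larger n:

```python
import math
from statistics import NormalDist  # standard-normal quantiles (stdlib)

def n_per_group(delta, sigma, alpha=0.05, power=0.80, tails=2):
    """Subjects per group to detect a mean difference `delta` between two
    groups with common SD `sigma` (two-sample z-test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / tails)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

print("two-tailed:", n_per_group(delta=20, sigma=20))
print("one-tailed:", n_per_group(delta=20, sigma=20, tails=1))  # fewer subjects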

James D. Prah, PhD

US EPA

Human Studies Division MD (58B)

Research Triangle Park, NC, 27711

919 966 6244

919 966 6367 FAX

- On 1 Sep 2005 at 16:50:13, "Edmond B. Edwards, Ph.D." (editr.-at-.sympatico.ca) sent the message

I've noticed in all the previous discussions that Duncan's Multiple Range Test, or Cramer's Modified Duncan's Multiple Range Test, does not seem to be in vogue any longer. At the risk of showing my advancing age, would someone please tell me the current status of this particular test? At one time it was very popular (demonstrated in Federer's Experimental Design) and I even have a paper somewhere showing that this test was the 'best' (however that was defined) after an extensive set of computer simulations involving all the multiple range tests. Obviously something happened to it along the way. If it won't take up too much time, would someone please comment on the fate of this historical multiple range test?

Thank you.

Edmond B. Edwards, Ph.D.

- On 2 Sep 2005 at 09:42:54, Navdeep Randhawa (Navdeep.Randhawa.-a-.biovail.com) sent the message

Hi Edmond,

In brief, Duncan's test does not adjust for multiple comparisons as some of the other tests do. This means that when conducting a number of tests, a significance level of 0.05 (alpha: Type I error) is used for each, and therefore the possibility of finding a significant difference by chance alone is inflated. Whereas, when one uses a test such as Tukey's, the significance level is adjusted to account for the number of tests being conducted, and this in turn controls the Type I error rate.
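This inflation can also be checked empirically. The sketch below is an illustrative toy only (five identical groups, normal data with a known SD of 1, a plain two-sided z-test on every pair, no adjustment): even with no true differences at all, a 'significant' pair turns up far more often than 5% of the time:

```python
import random
from itertools import combinations
from statistics import NormalDist, mean

random.seed(1)                  # reproducible toy simulation
norm = NormalDist()
k, n, sims, alpha = 5, 10, 2000, 0.05
se = (2 / n) ** 0.5             # SE of a difference of two group means (sigma = 1)

families_with_false_positive = 0
for _ in range(sims):
    group_means = [mean(random.gauss(0, 1) for _ in range(n)) for _ in range(k)]
    # unadjusted two-sided z-test on every pair of groups
    p_values = [2 * (1 - norm.cdf(abs(a - b) / se))
                for a, b in combinations(group_means, 2)]
    if min(p_values) < alpha:   # at least one spurious 'significant' pair?
        families_with_false_positive += 1

fwer = families_with_false_positive / sims
print(f"Observed familywise Type I error: {fwer:.2f}")  # well above 0.05
```

Replacing `alpha` in the comparison with `alpha / 10` (a Bonferroni-style adjustment over the 10 pairs) brings the observed familywise rate back down to roughly 5%, which is the behaviour Tukey-type procedures are designed to give.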

You might find useful an article by Gerard Dallal that discusses various procedures for multiple comparisons, http://www.tufts.edu/~gdallal/mc.htm

Best regards,

Nav

--

Nav Randhawa, MSc

Biostatistician, Pharmacokinetics and Statistics

Biovail Contract Research

Tel: (416) 752-3636 Ext. 369

Email: navdeep.randhawa.-a-.biovail.com

URL: www.biovail-cro.com

- On 5 Sep 2005 at 09:13:24, "Hans Proost" (j.h.proost.aaa.rug.nl) sent the message

The following message was posted to: PharmPK

Dear Angus,

Thanks for your reply. I agree with most of your comments. I understand your view with respect to 'inspection of preliminary experimental data', but my counter question is: do we need a statistical test for these data? What do we learn from a statistical test saying that there is a statistical difference with test A, but not with test B? This creates a grey area. We have observed a difference in our 'preliminary experimental data' and we can draw our (preliminary) conclusions from the order of magnitude of the difference. But this could be concluded also from test A or test B only, and very likely also from inspection of the data without any statistical test.

Statistical tests are designed to draw conclusions from complete experimental data. I do not see any rationale for a statistical test at this stage (but I admit that I do it myself quite often ... it is a good thing to be aware of strange habits). In short, one should use an (or the) appropriate test, and nothing more, and apply statistical tests only to complete experimental data.

Best regards,

Hans Proost

Johannes H. Proost

Dept. of Pharmacokinetics and Drug Delivery

University Centre for Pharmacy

Antonius Deusinglaan 1

9713 AV Groningen, The Netherlands

tel. 31-50 363 3292

fax 31-50 363 3247

Email: j.h.proost.-at-.rug.nl

- On 5 Sep 2005 at 09:32:24, "Hans Proost" (j.h.proost.-a-.rug.nl) sent the message

Dear James,

Thank you for your extensive comments. A few remarks:

ad 1: OK. But the value of 0.05 remains arbitrary. It could be 0.03, 0.06 or whatever. Intuitively 0.05 is a nice value, but it is arbitrary. The most important thing is that almost everybody uses 0.05, so the meaning of 'significantly different' is, at least on this point, identical in more than 99% of the (biomedical) literature.

ad 2:

> The difference is a true difference. Statistical power should be at least 0.80.

This is a specific demand that is used (or published) only in special cases. Should one do this always? I guess that in the majority of 'non-significant differences' the power is less than 0.80. This implies that no conclusion can be drawn. How often do we read in the results: 'the difference between A and B was not significant' (and say, the mean value of A is 80 and of B is 100), and in the conclusions and abstract (and in each citation): 'there was no difference in effect between A and B', or even 'A and B were similar'?

ad 3:

> 3. One should also know whether a one-tailed or a two-tailed test is desirable. If one knows in which direction the effect should occur, then a one-tailed test would require fewer subjects to be statistically adequate both for significance and statistical power.

I am rather sceptical about one-tailed testing, unless there is really a valid reason to do so. E.g. assuming that the effect of a drug is in the 'therapeutic direction' is not allowed, in my opinion.

Best regards,

Hans Proost

Johannes H. Proost

Dept. of Pharmacokinetics and Drug Delivery

University Centre for Pharmacy

Antonius Deusinglaan 1

9713 AV Groningen, The Netherlands

tel. 31-50 363 3292

fax 31-50 363 3247

Email: j.h.proost.-at-.rug.nl

- On 5 Sep 2005 at 13:13:41, Angusmdmclean.-a-.aol.com sent the message

Johannes: an example of the preliminary work I have in mind could be a preclinical study not done under GLP conditions. The purpose of such an upfront study is exploratory in nature for a number of reasons, particularly from the point of view of establishing study conditions [ "The best laid schemes o' Mice an' Men, gang aft agley." ] prior to designing a larger formal study under GLP.

Often people do such studies first to evaluate the prospect of performing a larger study under GLP conditions. Usually such a preliminary study (at least to the limit of my experience) is not subject to any formal statistical criteria. As such you may not bother performing statistics at all, and sometimes I do not. Usually the sample size in such a study is a priori too small to allow meaningful tests and conclusions to be made at the p value you need. I confess that often, despite being well aware of the limitations on making conclusions, I have done some statistical testing at this stage (sometimes altering the p value) just to satisfy my curiosity and see what I get.

Best Regards,

Angus McLean Ph.D,

8125 Langport Terrace,

Suite 100,

Gaithersburg,

MD 20877

tel 301-869-1009

fax 301-869-5737

- On 6 Sep 2005 at 11:34:52, "Edmond B. Edwards, Ph.D." (editr.at.sympatico.ca) sent the message

Good Morning Nav,

Thank you for putting this old chestnut in perspective for me. I particularly appreciated the reference to Gerard Dallal's URL - are you a graduate of Tufts by any chance? It looks like this man provides a superior education in statistical analysis.

Best Wishes,

Edmond

- On 6 Sep 2005 at 12:25:56, Navdeep Randhawa (Navdeep.Randhawa.at.biovail.com) sent the message

Hi Edmond,

I am glad you found the reference useful. You might also like his 'The Little Handbook of Statistical Practice'; it's on the net.

Best regards,

Nav

[See http://www.tufts.edu/~gdallal/LHSP.HTM - db]

- On 7 Sep 2005 at 10:05:01, Prah.James.-a-.epamail.epa.gov sent the message

The following message was posted to: PharmPK

Dear Dr. Proost,

> ad 1: OK. But the value of 0.05 remains arbitrary. It could be 0.03, 0.06 or whatever. Intuitively 0.05 is a nice value, but it is arbitrary. The most important thing is that almost everybody uses 0.05, so the meaning of 'significantly different' is at least at this point identical in more than 99% of (biomedical) literature.

The choice of 0.05 is made by convention and is based on the differences between the experimental and control conditions. It should not be an arbitrary choice. One should have knowledge of the statistical background for applying p values to medical research and not rely on what one regards as an arbitrary decision. While one uses "significantly different" for p values of both 0.05 and 0.01, the statistical difference between them is a factor of 5; thus the "meaning" is very different. It begs the question to say that everyone does it so I can ignore the rational basis for selecting p values a priori.

> ad 2:

>> The difference is a true difference. Statistical power should be at least 0.80.

> This is a specific demand that is used (or published) only in special cases. Should one do this always? I guess that in the majority of 'non-significant differences' the power is less than 0.80. This implies that no conclusion can be drawn. How often we read in the results: 'the difference between A and B was not significant' (and say, the mean value of A is 80 and of B is 100), and in the conclusions and abstract (and in each citation): 'there was no difference in effect between A and B', or even 'A and B were similar'?

Power lends insight into how much confidence one can have in the significant outcome. Most research articles do not publish the power, but should. In fact, prior to doing a study, the statistical plan should be laid out with justification, including the tests, the p value, the power, and the number of subjects one needs for an adequate study. Couple this with knowledge of the direction of the outcome and one can indeed apply a one-sided test to determine if the result is beneficial, assuming that the goal is a unidirectional outcome. If one just wants to know whether the experimental condition is different from the control then, of course, use a two-tailed test.

James D. Prah, PhD

US EPA

Human Studies Division MD (58B)

Research Triangle Park, NC, 27711

919 966 6244

919 966 6367 FAX

- On 8 Sep 2005 at 09:13:54, "J.H.Proost" (J.H.Proost.-a-.rug.nl) sent the message

The following message was posted to: PharmPK

Dear James,

Thank you for your reply.

Ad 1: I agree that the choice of 0.05 is made by convention, but I don't see how this can be based on the differences between the experimental and control conditions. And I still see no argument for how p = 0.05 can be argued firmly, i.e. not intuitively and not by convention ('are doctors happy because 95% of their patients will not die due to their intervention?'). I fully agree that one cannot ignore the rational basis for the selection of p values, but I don't see how this can be done in a non-intuitive way. And I repeat that the convention 'p < 0.05' is not bad in this case.

ad 2: OK. But again there are some practical problems.

(a) For prior estimations of power, e.g. for sample size determinations, one should have some reasonable estimate of variability, which is often unknown.

(b) Determination of power requires that the true difference between A and B that can be detected must be defined. This difference should be based on a relevant difference between A and B, and will be different in different situations. A 20% difference is useful in many cases, but certainly not in all cases. Of course one can calculate a graph of power against the true difference between A and B (an operational chart), but this is impossible in practice, I guess.
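For what it is worth, such an 'operational chart' of power against the true difference is easy to compute these days. The sketch below is illustrative only: it uses a two-sample two-sided z-test approximation with assumed values sigma = 1 and n = 16 per group, and neglects the negligible lower-tail term:

```python
from statistics import NormalDist

def power(delta, sigma, n, alpha=0.05):
    """Approximate power of a two-sample two-sided z-test with n per group
    to detect a true mean difference `delta` (lower-tail term neglected)."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf(delta / (sigma * (2 / n) ** 0.5) - z_alpha)

# a rough operational chart: power as a function of the true difference
for delta in (0.25, 0.5, 0.75, 1.0, 1.25):
    print(f"true difference {delta:4.2f}: power {power(delta, sigma=1, n=16):.2f}")
```

The curve makes Dr. Proost's point visible: the study is well powered only for differences near the one it was designed to detect, and quoting 'no difference' for anything smaller overstates the evidence.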

> Couple this with knowledge of the direction of the outcome then one can indeed apply a one-sided test to determine if the result is beneficial assuming that the goal is a unidirectional outcome.

As explained in my previous mail, one should be aware of misleading conclusions when applying one-sided tests. I agree that such knowledge can indeed be used in some cases, but I do not agree with your remark as a general statement.

Best regards,

Hans Proost

Johannes H. Proost

Dept. of Pharmacokinetics and Drug Delivery

University Centre for Pharmacy

Antonius Deusinglaan 1

9713 AV Groningen, The Netherlands

tel. 31-50 363 3292

fax 31-50 363 3247

Email: j.h.proost.-a-.rug.nl

- On 8 Sep 2005 at 15:30:50, jose-antonio.cordero.at.ipsen.com sent the message

The following message was posted to: PharmPK

Dear Dr. Proost and Dr. Prah,

Very interesting conversation.

I'm not a statistical expert but, please, let me add one comment regarding the significance level of 0.05 and expressions like "significantly different". I think that finding "statistically significant" differences is only a matter of the sample size that you use, regardless of the significance level considered. More difficult, from my view, is to establish "relevant" differences with the classical statistical tools. How can we take decisions if the statistical tool that we use (p-value or CI) ignores the technical opinion on a particular matter (i.e. a "relevant" decrease in blood pressure, etc.) and is not able to "learn" or accumulate experience from one experiment to another? What do you think about the Bayesian approach and the possibility of its "extensive" implementation?

Best Regards,

Jose-Antonio Cordero

Barcelona

SPAIN

- On 8 Sep 2005 at 16:01:12, Prah.James.aaa.epamail.epa.gov sent the message

Dear Dr. Proost,

My comments are embedded below.

> Ad 1: I agree that the choice of 0.05 is made by convention, but I don't see how this can be based on the differences between the experimental and control conditions. And I still see no argument how p = 0.05 can be argued firmly, i.e. not intuitively and not by convention ('are doctors happy because 95% of their patient will not die due to their intervention?'). I fully agree that one cannot ignore the rational basis for selection of p values, but I don't see how this can be done in a non-intuitive way. And I repeat that the convention 'p<0.05' is not bad in this case.

Prah's REPLY: Intuition is defined as "the immediate knowing or learning of something without the conscious use of reasoning." I would hope that the choice of a p value and other scientific decisions are not made on this basis. The conventional use of p values of 0.01 and 0.05 came about before the ready availability of computers. Now it is much easier to have a stat program provide an exact p value. Depending upon one's willingness to take a risk of making Type I or Type II errors, power calculations should also be made. If one knows the difference one would conclude is clinically significant or confirms one's hypothesis, then one can easily determine the number of subjects needed for the study for a given statistical power. Given the importance of statistics in ultimate decision making in drug development, this subject should be understood as thoroughly as one understands the drug's biological mechanism or how one's analytical instruments are calibrated. As an aside, a doctor might be happy if 95% of the patients don't die if 100 percent would die without treatment.

> ad 2: OK. But again there are some practical problems.

> (a) For prior estimations of power, e.g. for sample size determinations, one should have some reasonable estimate of variability, which is often unknown.

> (b) Determination of power requires that the true difference between A and B that can be detected must be defined. This difference should be based on a relevant difference between A and B, and will be different in different situations. A 20% is useful in many cases, but certainly not in all cases. Of course one can calculate a graph of power against the true difference between A and B (operational chart), but this is impossible in practice, I guess.

>> Couple this with knowledge of the direction of the outcome then one can indeed apply a one-sided test to determine if the result is beneficial assuming that the goal is a unidirectional outcome.

> As explained in my previous mail, one should be aware of misleading conclusions when applying one-sided tests. I agree that such knowledge can indeed be used in some cases, but I do not agree with your remark as a general statement.

Prah's REPLY: The statement wasn't intended to be a general statement, but one-sided tests can be used in some circumstances, which could reduce the number of subjects tested, thereby reducing risk and cost.

James D. Prah, PhD

US EPA

Human Studies Division MD (58B)

Research Triangle Park, NC, 27711

919 966 6244

919 966 6367 FAX

Copyright 1995-2010 David W. A. Bourne (david@boomer.org)