Back to the Top
The hypothetical data with equal means presented by Reeve,
Treat 1   Treat 2
   50        70
   70        80
   90        90
  110       100
  130       110
can be subject to an entirely different interpretation. The means are
identical, but the standard deviations are very different (28.3 vs 14.1).
Imagine that there is an optimal value for some physiological variable
and it is (in this hypothetical case) 90. If we look upon the data as
paired outcomes from 2 treatments then treatment 2 is clearly better
than treatment 1 in getting all subjects closer to the optimal value.
However, any test of means will show the means not significantly different
from each other. An F-Test of Two-Samples for Variances (as implemented
in Excel) will also not show a significant difference between the
variances, as the sample size is too small. Doubling each observation will yield a
significant result. The point is that your knowledge and understanding
of the data should be primary and statistical tests secondary (unless
you only know statistics).
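As a rough check of the numbers above, here is a minimal Python sketch using only the standard library. It is an illustration, not the Excel procedure mentioned in the post: the variable names are made up for this example, the target of 90 is the hypothetical optimum from the text, and the critical value 6.39 is assumed to be the upper 5% point of F with (4, 4) degrees of freedom as read from a standard F table.

```python
import statistics

# Hypothetical data from the post.
treat1 = [50, 70, 90, 110, 130]
treat2 = [70, 80, 90, 100, 110]
TARGET = 90  # hypothetical optimal value from the text

# Means are identical, so any test statistic based on their difference is 0.
mean1 = statistics.mean(treat1)
mean2 = statistics.mean(treat2)

# The SDs quoted in the post (28.3 and 14.1) match the divide-by-n formula.
sd1 = statistics.pstdev(treat1)
sd2 = statistics.pstdev(treat2)

# Average distance of each treatment's outcomes from the target.
dist1 = statistics.mean([abs(x - TARGET) for x in treat1])
dist2 = statistics.mean([abs(x - TARGET) for x in treat2])

# Variance ratio (F statistic) from the sample variances (divide by n - 1).
f_ratio = statistics.variance(treat1) / statistics.variance(treat2)
F_CRIT_5PCT = 6.39  # assumed table value: upper 5% point of F(4, 4)
significant = f_ratio > F_CRIT_5PCT

print(f"means: {mean1} vs {mean2}")
print(f"SDs (divide by n): {sd1:.1f} vs {sd2:.1f}")
print(f"mean distance from target: {dist1} vs {dist2}")
print(f"F ratio: {f_ratio:.1f}, significant at 5%: {significant}")
```

The F ratio of 4 falls short of the tabulated critical value, which is consistent with the statement that the variance test is not significant at this sample size, even though treatment 2 sits closer to the target on average.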
=========================================
Meyer Katzper, Ph.D.
FDA/CDER/ODEV/DAAODP
katzper.aaa.cder.fda.gov
Tel (301) 827-2563
Back to the Top
Does knowledge from 5 data points convince you that trt 2 is better than
trt 1? Why is it better than an appropriate statistical test? I think
statistics should test whether what we know about the data is correct; the
less we assume, the less bias will be introduced. Having said that, the
clinical judgement of the investigators is equally important. So it would
be nice if we could find a balance.
Of course, looking at the data provided, you can argue that, given the size
of the variability, there is not enough power to detect a difference. But
that doesn't necessarily mean a second sample of 5 in each treatment group
will replicate the first experiment exactly. After all, before a trial
you don't know what is going to happen; maybe trt 2 will be even worse.
Peace.
Back to the Top
Regarding statistical testing:
To me it seems that people tend to forget what statistics is all about. When
someone states that "Statistics should test if what we know is correct", I
think that the matter has gotten way out of line.
Statistics deals with stochastic variables and has in its essence nothing to
do with "truth" or proof of hypotheses.
This is exactly why our null hypothesis usually states that two populations
are equal. What statistics can tell us something about is how likely it is
that what we observe came about through stochastic variation alone. So in my
mind statistics is a base-level filter. Its purpose is to rule out (at a
certain level of confidence) that the differences we see can be ascribed to
random variation.
Now, with that possibility ruled out, we can go ahead and ask ourselves what
might then have caused the differences observed. What has to be kept in
mind is that any non-random error (i.e. non-stochastic) can of course cause
such observed differences between populations. So again, statistics does not
tell us any truth, only that what we see is probably not caused by stochastic
variation. And of course, if we cannot rule out that possibility, then any
further discussion is a waste of time.
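One way to make the "base-level filter" idea concrete is a permutation test, sketched below in Python. This is not a method proposed in the post, only an illustration under stated assumptions: it reuses Reeve's hypothetical data from earlier in the thread, treats the two groups as unpaired, and uses the ratio of the larger to the smaller sample variance as the test statistic. The function and variable names are made up for this example.

```python
from itertools import combinations
import statistics

# Reeve's hypothetical data from earlier in the thread.
treat1 = [50.0, 70.0, 90.0, 110.0, 130.0]
treat2 = [70.0, 80.0, 90.0, 100.0, 110.0]
pooled = treat1 + treat2


def variance_ratio(a, b):
    """Ratio of the larger to the smaller sample variance (always >= 1)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return max(va, vb) / min(va, vb)


observed = variance_ratio(treat1, treat2)

# Enumerate every way of splitting the 10 pooled values into two groups of 5.
# Under the null hypothesis the group labels are arbitrary, so each split is
# equally likely and shows what random variation alone could produce.
indices = range(len(pooled))
n_splits = 0
n_extreme = 0
for chosen in combinations(indices, len(treat1)):
    chosen_set = set(chosen)
    group_a = [pooled[i] for i in chosen]
    group_b = [pooled[i] for i in indices if i not in chosen_set]
    n_splits += 1
    # Small tolerance guards against floating-point ties.
    if variance_ratio(group_a, group_b) >= observed - 1e-12:
        n_extreme += 1

p_value = n_extreme / n_splits
print(f"observed variance ratio: {observed:.1f}")
print(f"splits at least as extreme: {n_extreme} of {n_splits}")
print(f"permutation p-value: {p_value:.3f}")
```

The p-value is the fraction of relabellings that give a variance ratio at least as large as the one observed. A small value says random variation is an unlikely explanation; it says nothing about what else caused the difference.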
N.O Hoem
Back to the Top
Hi All,
I have some comments on a recent email posted under the above topic.
Reeve proposed the hypothetical dataset:
tmt1   tmt2
  50     70
  70     80
  90     90
 110    100
 130    110
Meyer Katzper commented:
i) The standard deviations are 28.3 and 14.1.
ii) Regarding the F ratio test, "doubling each observation will yield
a significant result".
iii) Treatment 2 is "clearly better" regarding hitting a target of 90.
Unfortunately,
i) is incorrect. This is surely a 'sample' drawn from all possible
subjects. Hence the 'sample' standard deviations should be
calculated (31.6 and 15.8 respectively).
ii) is incorrect. From my days at school, the F ratio = (sample var1
/ sample var2). If I double each observation, var1 becomes 4*var1,
and var2 becomes 4*var2. Hence the F ratio, if used, is (4*var1) /
(4*var2). The 4's cancel. This makes sense.
iii) What a confident statement! Aren't pharmaceutical companies
silly spending lots of money when N=5 could be sufficient for a NDA
to demonstrate 'clearly better'. Faced with this question:
Researcher 1 (bright) says, "I can see tmt1 is more variable", and
perhaps additionally calculates a correct SD for each sample.
Researcher 2 (brighter) does the same as Researcher 1, but considers
and uses appropriate statistical methods, because he/she knows that
these are only samples (which are subject to random error), and
wishes to generalise the result to the population as a whole.
Researcher 3 (brightest) does the same as Researcher 2, but asks
themselves the question afterwards "How did I end up with data which
is unable to clearly answer my question?", and vows not to do the
same again.
Researcher 4 (new role model for Researcher 3) always takes the easy
option. She/he prefers not to guess. She/he
i) Thinks through the question they wish to answer.
ii) Prospectively plans what data, how much data, AND what analysis
they will carry out.
iii) Does ii).
iv) Gets results which are clear. Perhaps not positive, but clear.
I like Researcher 4, although perhaps I 'only know statistics'.
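Points i) and ii) above are easy to verify. The short Python sketch below (standard library only, variable names chosen for this example) computes the sample standard deviations and shows that multiplying every observation by 2 leaves the F ratio unchanged.

```python
import statistics

tmt1 = [50, 70, 90, 110, 130]
tmt2 = [70, 80, 90, 100, 110]

# i) Sample standard deviations (divide by n - 1).
s1 = statistics.stdev(tmt1)
s2 = statistics.stdev(tmt2)
print(f"sample SDs: {s1:.1f} and {s2:.1f}")

# ii) F ratio before and after multiplying every observation by 2.
f_original = statistics.variance(tmt1) / statistics.variance(tmt2)

doubled1 = [2 * x for x in tmt1]
doubled2 = [2 * x for x in tmt2]
f_doubled = statistics.variance(doubled1) / statistics.variance(doubled2)

# Each variance is multiplied by 4, so the factor cancels in the ratio.
print(f"var1: {statistics.variance(tmt1)} -> {statistics.variance(doubled1)}")
print(f"F ratio: {f_original} -> {f_doubled}")
```

The sample SDs come out at 31.6 and 15.8, and the F ratio stays at 4 after doubling.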
RA Fisher
Back to the Top
Dear Fisher,
I have used the book of statistical tables extensively.
Which is the latest edition now available? In case you have the tables in
electronic form, would you be able to send them to me?
Dr. Prashant
Copyright 1995-2010 David W. A. Bourne (david@boomer.org)