On Jun 6, 7:39 am, Fred <fred.l.kleinschm...@boeing.com> wrote:
> On Jun 5, 12:22 pm, Lew Pitcher <lpitc...@teksavvy.com> wrote:
>
>
>
>
>
> > On June 5, 2011 14:57, in comp.lang.c, lpitc...@teksavvy.com wrote:
>
> > > On June 5, 2011 14:25, in comp.lang.c, nos...@nspam.invalid wrote:
>
> > >> I have some code here and a snippet of unfinished, untested code
> > >> which
> > >> is an attempt at a function called stddev. This is of course meant to
> > >> calculate a standard deviation.
> > > [snip]
> > >> double stddev(double mean, double *prices)
> > >> {
> > >> double price = 0.0;
> > >> int i = 0;
> > >> for (; i < prices; ++i) {
> > >> if (prices[i] > mean) {
> > >> price = prices - mean;
> > >> return prices;
> > >> } else if (prices[i] < mean) {
> > >> price = mean - prices;
> > >> return prices;
> > >> }
>
> > FWIW, from the algorithm and data given on the Wikipedia page, I coded this
>
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <math.h>
>
> > double StdDev(unsigned int samplesize, double population[])
> > {
> > double sum, mean, spread;
> > unsigned int index;
>
> > if (samplesize == 0) return 0.0; /* catch obvious error */
>
> > /* compute mean of sample population */
> > for (index = 0, sum = 0.0 ; index < samplesize; ++index)
> > sum += population[index];
> > mean = sum / samplesize;
>
> > /* compute variances */
> > for (index = 0, sum = 0.0 ; index < samplesize; ++index)
> > {
> > double delta;
>
> > delta = population[index] - mean;
> > sum += (delta * delta);
> > }
> > return sqrt(sum/samplesize); /* standard deviation */
> > }
>
> > /*
> > ** Population values taken from the Wikipedia example
> > */
> > int main(void)
> > {
> > double pop[] = {2,4,4,4,5,5,7,9};
> > unsigned int popsize = (sizeof(pop) / sizeof(pop[0]));
>
> > printf("The standard deviation = %f\n",StdDev(popsize,pop));
>
> > return EXIT_SUCCESS;
> > }
>
> > When I compile and run this code
> > $ cc -lm -o stddev stddev.c
> > $ stddev
> > The standard deviation = 2.000000
> > $
> > I get the same Standard Deviation value as the Wikipedia article's example
>
> The above algorithm, while mathematically correct, is not good enough
> for a computer. If your population is very large, or the individual
> items in the population vary greatly in magnitude, you may run into
> severe truncation and roundoff errors.
>
> A more accurate way is to include the Leveque computational
> correction,
> computing the variance as:
>
> var = { sum[(x[i] - mean)^2] - (1/n)*sum[(x[i] - mean) } / (n-1)
> then stddev = sqrt(var)
Oops, missing a square. The variance with Leveque correction is
{ sum[(x[i] - mean)^2] - (1/n)* sum[(x[i] - mean)]^2 } / (n-1)
i.e., in the first term you sum the squares of x[i]-mean,
and in the second term you square the sum of x[i]-mean
See the Stanford Computer Science report by Chan, Golub, and Leveque
> Note that you computing the mean is not really as simple as summing
> the
> items and dividing by the number of items. What happens on a 32-bit
> machine if the first item is of magnitude 10^18, followed by 10^20
> items that are of magnitude 1? None of the latter items will
> contribute to your sum, and your answer will be a couple of orders of
> magnitude in error.
-- Fred K