For Math-Phobic Triathletes: An Introduction to the Standard Deviation

Ed Colet
October 30, 2002

The Average and Standard Deviation have graciously appeared alongside the results of each race on our Results page. While the Average is known and understood, its statistical sibling, the Standard Deviation, doesn't garner the attention that it deserves. Were it not for the attention of the statistically inclined, the Standard Deviation could very well be overlooked, lurking in the shadow of its more popular sibling, the Average. So today, we've brought the Average and the Standard Deviation along to chat with you, so that you can get to know the Standard Deviation.

Average (Avg)   "Thanks for having us. I'm always glad to speak to Triathletes. While most of you are already familiar with me, let me tell you about a race we recently worked. I saw 1800 of yous Numbers at Lake Placid. Each number is a race finish time. The first number was 8:39:19, and somewhere down the line we had a 9:43:12, and then a little later we got a 10:42:25 and a 12:52:40 . . . and then a 13:24:56 and so on until we got to the last number, a 17:00:13."

"I tell you, 1800 different numbers can be a bit cumbersome and clumsy to handle, so what we like to do is find a single number that best represents the entire field of 1800 numbers. If we can agree on what this single number should be, then we've neatly summarized the field of 1800 different and unique numbers and boiled it down to a single number. I think you know where I'm going with this, and while I'm tempted to go on talking about myself, let me instead make a long story short and say that the agreed-upon way to choose one number to represent the field is to choose -- Me, Mr. Average. Sometimes, I like to go by the term Mean -- especially because I'm mean if people call me average. But I digress. Anyway, for this group of 1800 folks, I am (Drumroll sound here . . .) 12:58:27."

Standard Deviation (Std.Dev)  "Hey, I thought this was supposed to be about me! You gonna talk about me at all?"

Avg (continuing)   "Yes. I was just getting there before you rudely interrupted."
"Where was I? . . . Yes . . . I am 12:58:27, and I have been chosen to represent the 1800 different IM finish times -- the fast, the less fast, the males, the females, the young, the old . . . "

Std.Dev   "Yo! Can we get to me TODAY?"

Avg   "Please don't be rude. Or else I'll be mean."

Std.Dev  "You're already the Mean!"

Avg  "Where was I? Yes . . . some of you may not like me as much as others do. And perhaps there's good reason for that. One reason may be that when you look at me at 12:58:27 you can rightly point out that if I'm supposed to represent everybody, Mr 8:39:19 doesn't seem close to me at all. He's over 4 hours different. And Ms 9:43:12 is over 3 hours different. And I'm not just talking about the Elites. On the other side of me, Mr 17:00:13 is also over 4 hours different. Sure, Mr 12:52:40 is pretty close to me, and so is Ms 13:04:45. But what do we do about all of these Differences? In the spirit of tolerance, everyone's differences should be respected no matter how small or large. And that's where my buddy, Standard Deviation, can really help. So without further ado, let me call on my . . . "

     (Standard Deviation bounds up onto the podium to stand beside Average)

Std.Dev  "Finally. Thought you were never going to get to the Differences, Meanie."
"OK here I go. This is the important stuff, so listen carefully, folks. I am here to address your differences. Mr 8:39 and change, I am your friend. Mr 17:00+, I am here to help. Here's what I do. Bring me your differences and I will organize them. To keep things orderly, let's start in the order of your finishing times. Just call out your differences. Mr 8:39:19, you're first. 4 hours, 20 minutes less than the mean? Check. Next!"
(a computationally quick nanosecond later, we're already halfway through the field of 1800 numbers)
"A 5 minute difference below the mean? Check. . . . 3 minute difference? Check. 20 seconds off? Check."
(another nanosecond later we're finishing up)
"4 hours and 2 minutes above the mean. Is that the last one? OK Great. Thanks."

Narrator   At this point there are now 1800 numbers again. But now, each number is the difference from the average of 12:58:27. There are 1800 such Differences.
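
The roll call above can be sketched in a few lines of Python. This is only an illustration: it uses the handful of finish times quoted so far (not the full field of 1800), takes the stated Average of 12:58:27 as given, and assumes times are written as H:MM:SS. The helper names (to_seconds, to_signed_hms, difference_from_mean) are made up for this sketch.

```python
# Finish times quoted in the article (a handful of the 1800, not the full field).
QUOTED_TIMES = ["8:39:19", "9:43:12", "12:52:40", "13:04:45", "17:00:13"]
FIELD_MEAN = "12:58:27"  # the Average as stated in the article


def to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' string to a whole number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_signed_hms(total_seconds: int) -> str:
    """Format a signed number of seconds as '+H:MM:SS' or '-H:MM:SS'."""
    sign = "-" if total_seconds < 0 else "+"
    minutes, seconds = divmod(abs(total_seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{sign}{hours}:{minutes:02d}:{seconds:02d}"


def difference_from_mean(hms: str, mean_hms: str = FIELD_MEAN) -> int:
    """Return finish time minus the mean, in seconds (negative means faster)."""
    return to_seconds(hms) - to_seconds(mean_hms)


if __name__ == "__main__":
    for finish_time in QUOTED_TIMES:
        print(finish_time, to_signed_hms(difference_from_mean(finish_time)))
```

Run as a script, this prints one signed Difference per quoted time. The first is -4:19:08, which Std.Dev rounds to "4 hours, 20 minutes less than the mean", and the last is +4:01:46, the "4 hours and 2 minutes above the mean".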

Std.Dev (calling out to AVG.)   "Hey Bro, 1800 Differences gets a little clumsy to handle. You want to help me out here?"

Narrator  Recall that when we had 1800 numbers to deal with, we came up with a single number that best represents the field of 1800 numbers. We averaged the 1800 numbers.
(Idea lightbulb lights up for the audience)
So how 'bout we take an average of these 1800 Differences? After all, it's just another set of numbers. We can call the result the "Average Difference".
(A dimming of lights, a puff of smoke, and a quick nanosecond computation and then. . . )

Ladies and Gentlemen, please allow me to re-introduce the Standard Deviation (drumroll sound here). It is 1:43:45.

Std.Dev  "Yes, I am the Standard Deviation and for you, I am 1:43:45."
(pause for applause to settle down)
"Thank you. While The Average has a cool alias The Mean, I don't really have an alias. Please don't call me Stan. But thanks to my pal Webster Thesaurus, you can remember me this way. Another word for Deviation is Difference. I suppose you can call me Standard Difference if you like, but we can do even better. Another word for Standard is Average. And if you think of me as the Average Difference, well then you're pretty close to knowing how I came to be, and understanding me really well. Yeah, that's it. I am essentially the "Average Difference" from the mean. So, I'm glad we've finally met. Just remember to ask about me whenever you see an Average."

Avg  "Thanks Stan. So, let me wrap things up. While I am the single number to represent the field, the Standard Deviation tells you how different the numbers are from me. Together, we provide a good summary because, when the numbers follow a roughly bell-shaped curve, about 68% of the field lie within 1 standard deviation of either side of me. And did you know that about 95% of all the numbers lie within 2 standard deviations of either side of me? That's no lie."
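
To see what "within 1 standard deviation" and "within 2 standard deviations" mean in clock time, here is a small Python sketch. It only does the arithmetic on the two figures stated above (Average 12:58:27, Standard Deviation 1:43:45); the function name band is made up for illustration, and nothing here checks how many of the 1800 finishers actually land in each range.

```python
from datetime import timedelta

MEAN = timedelta(hours=12, minutes=58, seconds=27)    # stated in the article
STD_DEV = timedelta(hours=1, minutes=43, seconds=45)  # stated in the article


def band(k: int) -> tuple[timedelta, timedelta]:
    """Return (low, high) finish times k standard deviations either side of the mean."""
    return MEAN - k * STD_DEV, MEAN + k * STD_DEV


if __name__ == "__main__":
    for k in (1, 2):
        low, high = band(k)
        print(f"within {k} standard deviation(s): {low} to {high}")
```

The two ranges come out as 11:14:42 to 14:42:12 and 9:30:57 to 16:25:57. Mr 8:39:19 and Mr 17:00:13 both fall outside the wider range.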

Footnote.
The computation to arrive at 1:43:45 wasn't exactly the computation of an average as we know it. But intuitively, it's still pretty close. We had to use the smoke and dim the lights earlier only to hide what happened behind the scenes. You can't simply add up all the differences and divide by the total as you would in computing an average, because things cancel out. If you tried to simply sum things up, differences above the average (e.g. +20 secs) would cancel out differences below the average (e.g. -20 secs), and your sum would be zero. We could have avoided this by taking the absolute value of each difference, but instead we took each Difference and squared it. Squaring also gives extra weight to outliers like Mr. 8:39:19 and Mr. 17:00:13, because they're kinda special. What we really sum up is all 1800 squared differences. This is called a Sum of Squares (we kept him behind the scenes because he's kinda square and always prefers to remain behind the scenes anyway). We divided the Sum of Squares by about 1800, pulled out a square root, and out popped the Standard Deviation. And that's who it really is.
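
For the curious, here is the behind-the-scenes recipe as a short Python sketch. It runs on only the seven finish times quoted in this article, so it will not reproduce 12:58:27 or 1:43:45, which come from the full field of 1800. The footnote says "about 1800", which fits dividing by either n or n - 1; which one was used is not stated, so the sketch offers both through a hypothetical sample flag. The function names are also made up for illustration.

```python
import math

# The seven finish times quoted in the article (not the full field of 1800).
QUOTED_TIMES = [
    "8:39:19", "9:43:12", "10:42:25", "12:52:40",
    "13:04:45", "13:24:56", "17:00:13",
]


def to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' string to a whole number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def standard_deviation(values: list[float], sample: bool = False) -> float:
    """Square root of the Sum of Squares divided by n (or n - 1 if sample=True)."""
    n = len(values)
    mean = sum(values) / n
    differences = [value - mean for value in values]
    sum_of_squares = sum(diff * diff for diff in differences)
    divisor = n - 1 if sample else n
    return math.sqrt(sum_of_squares / divisor)


if __name__ == "__main__":
    seconds = [to_seconds(t) for t in QUOTED_TIMES]
    mean = sum(seconds) / len(seconds)

    # Raw differences cancel out, which is why they cannot simply be averaged.
    print("sum of raw differences:", sum(s - mean for s in seconds))

    print("std dev, dividing by n:    ", standard_deviation(seconds))
    print("std dev, dividing by n - 1:", standard_deviation(seconds, sample=True))
```

The two divisors correspond to pstdev and stdev in Python's standard statistics module, which can be used to cross-check the results. With 1800 numbers, the choice of divisor barely changes the answer.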

Questions/comments?  Contact the author, Ed Colet: edc3@prodigy.net

 

Copyright © 2002, Westchester Triathlon Club. All rights reserved.