How To Analyze Data Using the Average

@franz: Thanks, glad you liked it. I think expected value would be a great follow-up article.

In my mind, I see expected value as a form of weighted average, where the more-likely scenarios have more impact than the least-likely ones. When you don’t have clear-cut outcomes, you have to take the probability into account as best you can. Definitely a good idea for a follow-up.

this was fantastic! a definite thumbs up on stumble.

For the life of me I cannot understand how you get 7.14 widgets/hour with the Machine productivity model when the slowest the machine runs is 10 widgets/hour.
My intuition will just not let me get that.
I understand how you got 14.28 widgets/hour as the harmonic mean just as I understand how 60mph one way and 30mph back gets you 40mph average speed.
Why wouldn

@minsoo: Thanks!

@Anonymous: Great question. The core issue is separating the time needed to process an individual widget vs. how fast “things” are coming out of the pipeline.

Let’s say you can wash 2 pants/hour, and can dry 4 pants/hour. How long does it take to wash and dry 1 pair of pants?

Well, there’s 1/2 hour for washing and 1/4 hour for drying, for a total of 3/4 hour (45 mins). Even though the slowest machine takes 30 mins per item, it takes 45 mins for the entire cycle (for one item).

Now, once you have a bunch of backlog, pants may be leaving the dryer every 15 mins. But each of those pants had an additional 30-minute wash cycle at some point. These means help you figure out how fast a process is for 1 item (without pipelining speedups).
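To make the per-item arithmetic concrete, here is a minimal sketch of the wash/dry numbers above. The function name `per_item_hours` is made up for illustration; it just adds the time each stage needs for a single item.

```python
from fractions import Fraction

def per_item_hours(rates_per_hour):
    """Hours for ONE item to pass through every stage, one after another."""
    return sum(1 / Fraction(rate) for rate in rates_per_hour)

rates = [2, 4]                      # wash 2 pants/hour, dry 4 pants/hour
cycle = per_item_hours(rates)       # 1/2 + 1/4 hour
single_item_rate = 1 / cycle        # pants/hour when there is no pipelining

print(cycle * 60)                   # minutes for one pair of pants
print(single_item_rate)             # pants per hour, one item at a time
```

The single-item rate comes out lower than the slowest stage on its own, which is the same effect as the 7.14 widgets/hour figure in the question.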

math great

Hi Khalid. You did very nice work! Thank you!
I had difficulty grasping the harmonic average from the car speed example, and then I think I found a somewhat simpler algorithm to calculate it. The point is that you may not always know which quantity is supposed to be the output and which the input. So here’s my algorithm:

1. Identify what is being asked for.
(For instance:
Question: what is requested?
Answer: an average speed, AS.)

2. Get SF (the speed formula):
SPEED = DISTANCE/TIME

3. Get ASF (the average speed formula):
average speed (AS) = TOTAL distance / TOTAL time

4. Refine ASF to get RASF (the refined ASF):
AS = (distance1 + distance2)/(time1 + time2)

5. Trace all the components of the RASF.
We need four items to calculate AS: two distances and two times. Here is the procedure:
We don’t know the two distances exactly, so we use letters instead of numbers:
Distance 1 is D (from home to the workplace)
Distance 2 is D (from the workplace to home)
To get time 1, rearrange SF into TF (the time formula): if SPEED = DISTANCE/TIME, then TIME = DISTANCE/SPEED.
Time 1 is then D/60 (60 miles per hour is the first speed)
Time 2 is D/30 (30 miles per hour is the second speed)

6. Do the calculation:
AS = 2D/(D/60 + D/30) = 2D/(3D/60) = 40 miles per hour.
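The steps above can be sketched in a few lines. This is only an illustration: the function names and the `leg_distance` parameter are my own, and it assumes every leg covers the same distance, as in the home-to-work example.

```python
import math

def average_speed(speeds, leg_distance=1.0):
    """Total distance / total time, for equal-length legs driven at `speeds`."""
    total_distance = leg_distance * len(speeds)
    total_time = sum(leg_distance / s for s in speeds)  # TIME = DISTANCE/SPEED
    return total_distance / total_time

def harmonic_mean(values):
    return len(values) / sum(1 / v for v in values)

print(average_speed([60, 30]))        # about 40 mph
print(average_speed([60, 30], 7.0))   # same answer: the distance D cancels
print(math.isclose(average_speed([60, 30]), harmonic_mean([60, 30])))
```

Sticking to the speed formula and taking the harmonic mean turn out to be the same calculation when the legs are equal in length.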

Sorry for the mistakes in my previous comment. I meant “harmonic mean”, not “harmonic average” (I’m Romanian, so my English is not so good). I also listed the speeds backwards (60-30) instead of (30-60).
It seems to me that the example is not about whether one uses the arithmetic mean or the harmonic mean, but about sticking to the speed formula in order not to get wrong results.

I puzzled over this line:
"Year-over-year average: (.83)^(1/4) = 4.6% loss per year."
I kept getting 0.955 and wondering what I didn’t understand. Finally it occurred to me that the line should read “Year-over-year average: 100*(1-(.83)^(1/4)) = 4.6% loss per year.”
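A quick check of that arithmetic, as a small sketch. The variable names are mine; the 0.83 overall factor and the 4 years come from the quoted line.

```python
total_factor = 0.83   # value left after 4 years, as a fraction of the start
years = 4

# Geometric (year-over-year) average: the single yearly factor that,
# applied `years` times, gives the same overall result.
yearly_factor = total_factor ** (1 / years)

# The factor is what remains each year; the loss is what is missing from 1.
yearly_loss_pct = 100 * (1 - yearly_factor)

print(round(yearly_factor, 3))     # the 0.955 figure
print(round(yearly_loss_pct, 1))   # the 4.6 figure
```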

Amazing article. Wonderful insight here, and not just for beginners =)

This is a great thing to help kids in math, so keep it up. Let the kids learn more things about mean and mode, most imp…

A boat takes t1 time to travel a certain distance when it travels in the direction of flow of water, and takes t2 time to travel the same distance against the stream. How much time will it take to travel the same distance in still water ?

Is it a case of Harmonic Mean ?

Please Explain, how ?
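One way to check this, sketched under the usual textbook assumptions (constant current, constant boat speed relative to the water, same distance d each way). The function name is made up. With v the still-water speed and c the current, d/t1 = v + c and d/t2 = v - c, so v = (d/t1 + d/t2)/2, and the still-water time d/v works out to 2*t1*t2/(t1 + t2), which is the harmonic mean of t1 and t2.

```python
def still_water_time(t1, t2):
    """Time to cover the same distance with no current.

    t1: time with the current, t2: time against it (same units).
    The distance cancels out, so it never appears.
    """
    return 2 * t1 * t2 / (t1 + t2)

# Example with made-up numbers: d = 12, boat speed 4, current 2.
# Downstream speed 6 gives t1 = 2; upstream speed 2 gives t2 = 6.
print(still_water_time(2, 6))   # 12 / 4 = 3.0
```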

In problems involving tanks and taps or problems such as follows I find solving the problem without reciprocals easier.
Ex: If 12 men and 16 boys can do a piece of work in 5 days; 13 men and 24 boys can do it in 4 days, then blah…
Soln: From the first condition, 1 job = 12*5 man-days + 16*5 boy-days.
From the second condition, 1 job = 13*4 man-days + 24*4 boy-days.
Equating the two we get 60 man-days + 80 boy-days = 52 man-days + 96 boy-days.
So, 1 man-day = 2 boy-days.
We can solve any question from here onward.
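For anyone who wants to verify the man-days arithmetic, here is a small sketch. The variable names are mine and the numbers are the ones from the example.

```python
from fractions import Fraction

# (men, boys, days) for each condition in the example.
first = (12, 16, 5)
second = (13, 24, 4)

man_days_1, boy_days_1 = first[0] * first[2], first[1] * first[2]      # 60, 80
man_days_2, boy_days_2 = second[0] * second[2], second[1] * second[2]  # 52, 96

# 60m + 80b = 52m + 96b  =>  (60 - 52) m = (96 - 80) b
boy_days_per_man_day = Fraction(boy_days_2 - boy_days_1,
                                man_days_1 - man_days_2)

# Size of the whole job, measured in boy-days, from either condition.
job_in_boy_days = man_days_1 * boy_days_per_man_day + boy_days_1

print(boy_days_per_man_day)   # 2
print(job_in_boy_days)        # 200
```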

Ex: If one tap takes 12 hours to fill a tank and another tap takes 10 hours to fill the same tank, how long will both taps together take to fill the tank?
Multiply: 12*10 = 120.
In 120 hours the first tap fills 10 tanks and the second tap fills 12 tanks, a total of 22 tanks.
So both taps together take 120/22 hours per tank.
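The same answer falls out of adding the rates, which is the reciprocal route. A minimal sketch (the function name is made up, and it assumes whole-number fill times):

```python
from fractions import Fraction

def combined_fill_time(hours_per_tank):
    """Hours for all taps, running together, to fill one tank."""
    tanks_per_hour = sum(Fraction(1, h) for h in hours_per_tank)
    return 1 / tanks_per_hour

together = combined_fill_time([12, 10])
print(together)          # 60/11, the reduced form of 120/22
print(float(together))   # roughly 5.45 hours
```

Multiplying the two times first, as in the comment, avoids fractions by picking a time span in which both taps fill a whole number of tanks.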

The data transmission example is incorrect for GB/$.
If you’re only paying for one side, the cost is exactly what you pay there; there’s no average to be had.
But if you’re paying for both client and server, each is not doing half the work; each is doing the whole job. If you get 10GB/$ on both of them and transmit 10GB from the client to the server, both have consumed 10GB worth of transmitted data, so you pay 2 dollars. In effect, two 10GB/$ machines average to 5GB/$.
That is half of what the harmonic mean gives for two 10GB/$ machines, 2/(1/10+1/10) = 10GB/$, which is clearly wrong.
A 20GB/$ server and a 10GB/$ client transmitting the data would average 6.67GB/$, since you have to compare the costs for the same number of GB.
Take this to the extreme with a 1000GB/$ or 10000000GB/$ server and a 10GB/$ client, and the combined figure should move toward 10GB/$, because the server’s cost is almost nothing.
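This commenter’s model can be sketched as follows. It assumes, as the comment does, that every GB transferred is billed once on each machine, so the dollars per GB add up; the function name is hypothetical.

```python
import math

def combined_gb_per_dollar(rates):
    """GB per dollar when every GB is billed on each machine in `rates`."""
    dollars_per_gb = sum(1 / r for r in rates)  # costs add, rates do not
    return 1 / dollars_per_gb

print(combined_gb_per_dollar([10, 10]))          # about 5
print(combined_gb_per_dollar([20, 10]))          # about 6.67
print(combined_gb_per_dollar([10_000_000, 10]))  # just under 10
```

Under this model the result is the harmonic mean divided by the number of machines, which is where the factor of two in the comment comes from.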

Just amazing. Such great ideas you shared with us. Thank you very much.

I found it all, from 4 “Easy Permutations and Combinations” to 7 “How To Analyze Data Using the Average”, extremely interesting and presented in a very clear way, so we can not only understand it better but also connect it to other areas of statistics and probability.

It is an amazing presentation.

Sincerely

Luis H. Alvarez

Hi Adam, that’s an awesome analogy. I hadn’t thought of that numerator vs. denominator difference, but that’s exactly it. Normally we measure “output per input” i.e. (miles per gallon) or “dollars per hour”, and we don’t really think of averaging the “input” side of things. I like it.

After giving this serious consideration, there are four things I now realize which seem fundamental to the correct application of the harmonic mean.

  1. Any of the basic arithmetic operations (+ - / *) that are applied to a fraction can be thought of as being applied to the numerators of those fractions, while the denominators stay constant.

For example, the arithmetic mean of 1/4 and 3/4 is 2/4, in which the numerators are “averaged” while the denominators are constant. In this sense, adding, subtracting, multiplying and even dividing affect the numerator while leaving the denominator constant. As another example, 3/4 divided by 2 could be thought of as (3/2)/4, exemplifying the direct impact of the division on the numerator.

  2. The harmonic mean is the arithmetic mean of the denominators. For example, the harmonic mean of 1/3 and 1/5 is 1/4.

  3. “Miles per hour” and “hours per mile” refer to the same physical abstraction, namely speed. The difference is that arithmetic operates on each differently. Doubling the “miles per hour” doubles speed, while doubling the “hours per mile” halves it. In the same way, “widgets per hour” refers to the same abstraction as “hours per widget”.

  4. So the question is, what quantity do I want to operate on? In choosing between the harmonic mean and the arithmetic mean, what quantities do I want to average? If they are the numerators, the correct choice is the arithmetic mean. If they are the denominators, the correct choice is the harmonic mean.

As an example, if I want the average of the hours spent in traveling a mile and I am given a list of “miles per hour” units, then I would use the harmonic mean which would calculate an average of the hours using “hours per mile” units and invert the result back into the “miles per hour” unit. If I was given a list of “hours per mile” units, I would just use the arithmetic mean to make the same calculation.
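The “invert, average, invert back” description can be written out directly. This is a small sketch using exact fractions; the function names are mine.

```python
from fractions import Fraction

def arithmetic_mean(values):
    return sum(values) / len(values)

def harmonic_mean(values):
    """Arithmetic mean of the reciprocals, inverted back to the original unit."""
    return 1 / arithmetic_mean([1 / v for v in values])

# Averaging the denominators: the harmonic mean of 1/3 and 1/5.
print(harmonic_mean([Fraction(1, 3), Fraction(1, 5)]))   # 1/4

# "Miles per hour" in, average taken over "hours per mile", result in mph.
mph = [Fraction(60), Fraction(30)]
hours_per_mile = [1 / s for s in mph]
print(harmonic_mean(mph))                     # 40
print(1 / arithmetic_mean(hours_per_mile))    # same calculation, spelled out
```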

I’d notice if someone swapped a walrus for a person.