@Bart
Your second question points toward the answer to some of the missteps in the first question.
Please don’t be offended by my use of the word ‘missteps’. I’ve made plenty of them and will make more. In the physical sciences it is a beautiful thing to make missteps and wrong guesses, greater still to voice them and ask questions. It demonstrates growth.
I can answer your question by telling you a thing that I claim is true. I only warn you ahead of time that it’s kind of a lie. Imagine the chagrin of the poor souls who came up with the corpuscular theory of light (http://en.wikipedia.org/wiki/Corpuscular_theory_of_light)
only to learn it’s all wrong, light is a wave not a particle. Then along comes quantum physics and tells us it is a particle, absolutely it’s a particle. We know this because the equations that govern it are, well, wave equations… but I digress.
Here is the great lie I have to tell you that, for now, is “true”.
You are right, the sound arriving at the microphone is already the sum of all frequencies. What arrives at the microphone is just a single pressure wave with a complicated shape, not a whole bunch of sin and cos waves. The Fourier Transform (FT) just lets us look at it as if it were a bunch of sin and cos waves. Then we can perform some tricky math with it. We can send it through a computer and do all sorts of jazzy things, like take an old Parlophone mono recording of a concert with a rude telephone ringing in the background and remove just that sound. I saw a recording engineer at Abbey Road Studios do this with some pretty fancy software, blew my mind!
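To make the "look at it as if it were a bunch of sin and cos waves" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the window of 64 samples, the two made-up tones, and the slow textbook `dft` helper (real software would use a fast library routine). It builds one complicated wave and then asks the transform which simple waves are hiding in it.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform, O(n^2); fine for a tiny demo."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

N = 64  # samples in one analysis window (assumed for illustration)

# One "complicated" wave: a strong tone at 3 cycles per window
# plus a weaker tone at 10 cycles per window.
signal = [
    1.0 * math.sin(2 * math.pi * 3 * t / N)
    + 0.5 * math.sin(2 * math.pi * 10 * t / N)
    for t in range(N)
]

spectrum = dft(signal)

# For a real signal only the bins below N/2 carry independent information.
# Scaling by 2/N turns each bin's magnitude into the tone's amplitude.
amplitudes = [2 * abs(c) / N for c in spectrum[: N // 2]]
components = [(k, round(a, 3)) for k, a in enumerate(amplitudes) if a > 0.01]

print(components)  # [(3, 1.0), (10, 0.5)]
```

The microphone only ever saw `signal`; the list of (frequency, amplitude) pairs is the other way of describing the very same thing.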
You also start asking, “What is the sensor device that can sense all those different frequencies to allow sampling?” For a lovely game of chasing rabbits down holes, go ahead and see what Wikipedia has to say about the Nyquist Sampling Theorem. It’s good fun and a lovely headache. Let me see if I can shorten it.
Let’s say you have the setup you’ve provided, a human voice saying the word hello, and that sound hits a microphone. We want to record it to play back later, but we want to record it digitally! Start with what the mic gives us, an electric voltage wave that has the same complicated shape as the pressure wave that hit the mic. Take a ‘sample’ to see how strong the voltage is every so often. The strength just gets converted to a number (a binary number that can be stored digitally). As the voltage varies from -100% to +100%, the digital number that comes out of our ADC (analog to digital converter) is a number between 0 and 255 (for 8 bit encoding, which gives 256 possible values). But how often is ‘every so often’? By the Nyquist Theorem, at least twice for each cycle of the highest frequency we want to represent. Humans have a tough time hearing any sound at a frequency higher than 20 kHz (one cycle every 0.05 ms). So you take 2 samples every 0.05 ms, which is 40,000 samples per second. Each sample is an 8 bit number. Store it as a file.
To play it back: read the first 8 bits, convert them in a DAC (3 guesses what DAC stands for) to a voltage between -2V and +2V, send it to an amp and then speakers, and repeat for each subsequent group of 8 bits. I left out a detail or two. This lets you faithfully record a sound up to 20 kHz. If a higher frequency sound hit the mic, our sampling could not represent it correctly.
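The record and playback steps above can be sketched in a few lines of Python. This is only an illustration under the numbers used in this comment (40,000 samples per second, 8 bits per sample, a DAC range of -2V to +2V); the function names `encode`, `decode` and `record` are made up here, and a pure 1 kHz tone stands in for the voice.

```python
import math

SAMPLE_RATE = 40_000  # samples per second: 2 per cycle of a 20 kHz tone
LEVELS = 256          # 8 bits gives 256 codes, numbered 0..255

def encode(level):
    """ADC step: map a level in [-1.0, +1.0] (-100%..+100%) to a code 0..255."""
    level = max(-1.0, min(1.0, level))
    return round((level + 1.0) / 2.0 * (LEVELS - 1))

def decode(code, full_scale_volts=2.0):
    """DAC step: map a code 0..255 back to a voltage between -2 V and +2 V."""
    return (code / (LEVELS - 1) * 2.0 - 1.0) * full_scale_volts

def record(freq_hz, n_samples):
    """Sample a pure tone at SAMPLE_RATE and quantize every sample."""
    return [
        encode(math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(n_samples)
    ]

codes = record(1_000, 40)            # one full cycle of a 1 kHz tone
stored = bytes(codes)                # one byte per sample, ready for a file
volts = [decode(c) for c in stored]  # what the DAC would hand to the amp

print(min(codes), max(codes))  # 0 255
print(min(volts), max(volts))  # -2.0 2.0
```

Among the details left out: real converters usually use more than 8 bits, and the in-between voltages get smoothed on playback rather than stepped.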
As for my ‘true’ lie… the Fourier Transform isn’t just about sound. It can be about anything that is represented as a wave (like a quantum wave function for an electron, or an atom, or a cat, or the universe). It lets us take a ‘thing’ that we want to look at and express it in one of 2 ways:
- a single waveform with a complicated shape
- many waveforms, each with a simple shape
From the standpoint of sound waves we like to think the ‘real’ thing is the single complicated waveform. The other mess, an infinite number of basic sin and cos waves, is not really ‘real’; it is just a convenient mathematical way to treat the object. If the wave is the quantum waveform of a photon we may like to view it the same way. The photon is a ‘real’ object. The ‘real’ part is the single quantum waveform, or is it? However, the single waveform can have a position operator applied to it, to ask the question “where are you, oh little photon?” Likewise it can have a momentum operator applied to it, to ask the question “how fast are you going?” Werner Heisenberg discovered an interesting relationship between those two tasks. Yes, it was math alone that caused him to propose his Uncertainty Principle. In this case the sum of all the ‘little’ equations (where, when, how fast, how much charge, how much mass, etc.) may seem to be more real than just the composite wave function, which really is just a probability cloud of nothingness.
The Fourier Transform just gives us a way to go back and forth between the parts and the composite. It’s up to us to decide which version we want to work with.
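Here is a toy version of that back and forth, loosely in the spirit of the ringing telephone mentioned earlier. It is a sketch under generous assumptions, not how studio software works: the ‘voice’ and the ‘ring’ are each a single pure tone that fits the window exactly, and `dft` and `inverse_dft` are slow textbook helpers written for this example.

```python
import cmath
import math

def dft(samples):
    """Composite -> parts (naive O(n^2) transform)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def inverse_dft(spectrum):
    """Parts -> composite; keeps the real part since the input was real."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]

N = 64         # samples in the window (assumed)
RING_BIN = 10  # the unwanted tone: 10 cycles per window (assumed)

voice = [math.sin(2 * math.pi * 3 * t / N) for t in range(N)]
ring = [0.5 * math.sin(2 * math.pi * RING_BIN * t / N) for t in range(N)]
recording = [v + r for v, r in zip(voice, ring)]

# There and back again: the round trip returns the original samples.
restored = inverse_dft(dft(recording))
round_trip_error = max(abs(a - b) for a, b in zip(restored, recording))

# Work on the parts: silence the ring, then rebuild the composite.
parts = dft(recording)
parts[RING_BIN] = 0
parts[N - RING_BIN] = 0  # a real signal has a mirror-image bin as well
cleaned = inverse_dft(parts)
cleanup_error = max(abs(c - v) for c, v in zip(cleaned, voice))

print(round_trip_error < 1e-9, cleanup_error < 1e-9)  # True True
```

A real telephone ring is not one tidy tone sitting exactly on one bin, so actual cleanup is far messier, but the idea of editing the parts and rebuilding the whole is the same.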