It’s one thing to have an idea, another to put it into practice on the workbench, a third to bring it to scale, and a fourth to compete successfully to be widely adopted. There were many attempts to join sound with the moving image–people tried playing a phonograph record alongside a movie–each with its own flaw, such as the record drifting out of sync with the film. Not all early loudspeakers were equally efficient or readily produced. But when sound-on-film motion picture playback met the moving coil loudspeaker, the talkies were ready for business.
The sound of one person’s voice recorded by one mic needs only one channel of sound to reproduce it. One mono signal can be fed to one speaker or thirteen, but it’s all still monophonic sound. It lacks the sense of sound in space:
Sound arrives at one ear at a slightly different time than at the other, and humans generally are able to interpret distance and position from that (which, in prehistory, might’ve helped spindly hominids experimenting with tools not get eaten). Stereophonic sound mimics this. Two mics are placed in two different positions. (There are many, but not infinite, combinations of mic types and positions.) What happens acoustically in front of those mics gets recorded to two separate tracks of audio. One track is played back on one speaker on the left side of a room, and the other on another speaker on the right. When people sit or stand in the right part of such a room, they’re likely to experience sound in many of the ways they would have “if they were there.” (Headphones work on the same principle, just with tiny speakers in our ears.)
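To put a rough number on that "slightly different time," here is a minimal sketch in Python. It assumes ears about 0.18 m apart, sound traveling at about 343 m/s, and a simplified straight-line path that ignores how sound bends around the head. Those figures and the function name are illustrative assumptions, not measurements from this page.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature
EAR_SPACING_M = 0.18        # assumed: rough distance between adult ears


def interaural_time_difference(angle_degrees):
    """Seconds by which sound reaches the nearer ear first.

    0 degrees is straight ahead; 90 degrees is directly to one side.
    Simplified model: extra path = ear spacing * sin(angle).
    """
    extra_path_m = EAR_SPACING_M * math.sin(math.radians(angle_degrees))
    return extra_path_m / SPEED_OF_SOUND_M_S


for angle in (0, 30, 90):
    delay_ms = interaural_time_difference(angle) * 1000
    print(f"{angle:>2} degrees: {delay_ms:.2f} ms")
```

Under these assumptions the largest difference, for a sound directly to one side, is only about half a millisecond, which gives a sense of how fine the timing cues are that stereo playback tries to preserve.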
“Lifelike” was a big concept in selling stereo, and it was hailed with great fanfare when movies and commercially released music recordings went stereo, but here’s the rub. Hearing-abled people generally hear with two ears, but we hear sound coming from all around us–a twig snapping behind you in the woods, for instance. And, wherever you put two speakers in a room, there’s still a sense of the sound coming from that direction (behind the screen, in a cinema) but not anywhere else.
So what if there were a speaker at each corner of a room with audience in the middle? That arrangement was called quadraphonic sound, also sometimes called 4.0 surround:
It didn’t last long but, with variations and additions, it led quickly to 5.1 surround sound, which is still in use today:
Dialogue is so important that it’s helpful to have a channel dedicated to it in the center of the screen. Unless a character is hollering from the next room, dialogue most typically is mixed as coming from on screen, predominantly or exclusively in the center channel. Compared with quadraphonic, this is the fifth channel–the “5” in 5.1.
The lower a frequency is, the more energy it takes to propagate at the same perceptual loudness as higher frequencies, but the less directional it is. The higher a frequency, the less power it takes to propagate, but the more it feels like it’s coming from where the loudspeaker is.
The lowest frequencies audible by humans take so much power to reproduce that another channel was developed to go to a separate amp dedicated to powering a specialized speaker (often called a subwoofer). This channel is called “low-frequency effects” (LFE). LFE is the “.1” in a surround name, such as 5.1.
Can a QuickTime file contain mono audio? Sure. Can it have stereo (or two unrelated mono tracks, if that’s of use)? You bet. What if it could contain six channels? It can. (Do you have equipment to play six channels of audio? Headphones only reproduce two.) But it’s possible, and, with a couple extra steps, you can export such a thing that can be played via QLab in Weitz Cinema, although it’s less error-prone and more widely useful to the rest of the world to make a DCP. (And depending what software you use to do that, you might need to make a QuickTime file with 5.1 audio and ProRes video as an intermediate file that then generates a DCP.)
There are standard channel assignments:
1 L (left)
2 R (right)
3 C (center)
4 LFE (low-frequency effects)
5 Ls (left surround)
6 Rs (right surround)
Use these (unless someone asks for different channel assignments, or a step later in your workflow indicates it).
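If you ever need to compare a file's channel labels against this list, the order can be written down as data. The following is a minimal Python sketch. The function name and the example "delivered" layout are hypothetical, made up for illustration rather than taken from any particular tool.

```python
# Standard 5.1 channel order as listed above (channels are numbered from 1).
STANDARD_5_1 = ("L", "R", "C", "LFE", "Ls", "Rs")


def mismatched_channels(labels):
    """Return the channel numbers where `labels` differs from the standard order."""
    if len(labels) != len(STANDARD_5_1):
        raise ValueError(f"expected {len(STANDARD_5_1)} channels, got {len(labels)}")
    return [
        number
        for number, (got, want) in enumerate(zip(labels, STANDARD_5_1), start=1)
        if got != want
    ]


# Hypothetical file where center and right have been swapped.
delivered = ("L", "C", "R", "LFE", "Ls", "Rs")
print(mismatched_channels(delivered))     # channels that need attention
print(mismatched_channels(STANDARD_5_1))  # empty list: nothing to fix
```

An empty list means the labels match the standard assignments. Anything else tells you which channel numbers to ask your mixer about.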
Let’s say you hire an audio mixer to create a surround mix of your work and they send you back a .WAV file, which you open with QuickTime Player, use the Media Inspector (command + I), and see that it has six channels, assigned in the standard way:
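For a second opinion outside QuickTime Player, the channel count and sample rate of a .WAV file can also be read with a few lines of Python, using only the standard library's `wave` module. This is a sketch under some assumptions: it reports how many channels there are, not how they are assigned, and depending on your Python version the `wave` module may reject some multichannel files with a `wave.Error` if they use an extended header. The demo file it writes is silent and exists only so the example can run on its own.

```python
import os
import tempfile
import wave


def wav_summary(path):
    """Return (channel count, sample rate in Hz) for a PCM .WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels(), wav.getframerate()


def write_silent_test_wav(path, channels=6, rate=48000, frames=480):
    """Write a short, silent 16-bit .WAV so the example has something to read."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * frames)


demo_path = os.path.join(tempfile.mkdtemp(), "demo_5_1.wav")
write_silent_test_wav(demo_path)
channels, rate = wav_summary(demo_path)
print(f"{channels} channels at {rate} Hz")
```

To check a real file, call `wav_summary` with the path to the .WAV your mixer sent instead of the demo path.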
Let’s say you drop this onto the timeline in Premiere. Surround won’t be reproduced properly unless you make a new surround sequence. Under Audio, look for “Master” and choose 5.1 instead of stereo:
You also have to make a track in this sequence be a 5.1 track. Look under Track Type. (It’s also a good idea to give the track a name more meaningful than “Audio 1” under Track Name.)
Now you’re ready to drop your 5.1 surround .WAV file onto the 5.1 track in your surround sequence. No surprise, there are a few details to attend to at export time. Here’s the Export window in Adobe Premiere after following the workflow I just described:
Two really important things to notice here. See under Summary –> Source where it says things about the video and then it says “48000 Hz 5.1”? This verifies that, in the previous steps, we successfully created a surround sequence with at least one 5.1 track. If Source doesn’t say 5.1 in the Export window, you need to revisit those steps.
Finally, find the dropdown box under Audio Channel Configuration –> Output Channels. Be sure to choose 5.1 (L, R, C, LFE, Ls, Rs) before exporting.
Find the file you just exported. Open it with QuickTime Player and use the Media Inspector (command + I). If it has six channels assigned in the standard way, there’s a good chance it will play back as intended via QLab in the cinema (and serve you well as an intermediate file in DCP-o-Matic):
If you’re reading this page on a phone or laptop, maybe with headphones, you won’t be able to hear in surround, but you can see on the meters how channels 5 & 6 (left surround and right surround) are generally lower than 1 & 2 (left and right), and notice when voiceover comes in on channel 3 (center channel). This mix doesn’t seem to have much, if any, signal on channel 4 (LFE):
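What a meter shows per channel can be approximated in code. The sketch below, in Python, assumes interleaved 16-bit little-endian PCM audio in the standard channel order and computes one RMS level per channel in dBFS (decibels relative to full scale). Real meters may use peak readings or other averaging, so treat RMS as an assumption. The demo data is synthetic, chosen to echo the pattern described above, and is not taken from the mix on this page.

```python
import math
import struct

CHANNEL_NAMES = ("L", "R", "C", "LFE", "Ls", "Rs")


def channel_levels_dbfs(frames, channels=6):
    """Return one RMS level in dBFS per channel for interleaved 16-bit PCM bytes."""
    usable = len(frames) - len(frames) % (2 * channels)  # drop any partial frame
    samples = struct.unpack(f"<{usable // 2}h", frames[:usable])
    levels = []
    for ch in range(channels):
        channel_samples = samples[ch::channels]  # every Nth sample belongs to this channel
        if not channel_samples:
            levels.append(float("-inf"))
            continue
        mean_square = sum(s * s for s in channel_samples) / len(channel_samples)
        rms = math.sqrt(mean_square) / 32768.0  # scale so full scale is 1.0
        levels.append(20 * math.log10(rms) if rms > 0 else float("-inf"))
    return levels


# Synthetic demo: loud fronts, quieter center, silent LFE, quiet surrounds.
positive = struct.pack("<6h", 16384, 16384, 8192, 0, 4096, 4096)
negative = struct.pack("<6h", -16384, -16384, -8192, 0, -4096, -4096)
demo_frames = (positive + negative) * 50

levels = channel_levels_dbfs(demo_frames)
for name, level in zip(CHANNEL_NAMES, levels):
    print(f"{name:>3}: {level:.1f} dBFS")
```

A silent channel comes out as negative infinity, which is the numeric version of a meter that never moves.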