Binaural Cues – Part 2

In my last post, I explained the importance of Interaural Time Differences (ITDs) and Interaural Intensity Differences (IIDs) in helping us to locate sounds accurately. These time and level differences give the brain enough information to determine whether a sound came from the left or the right (Hofman and Van Opstal, 2002). However, they are sometimes insufficient for the brain to deduce whether a sound originated in front of or behind the head. A sound emanating from the front can produce exactly the same ITD and IID as the same sound source positioned behind the head. Additionally, changing the elevation of a sound source whilst keeping its horizontal position the same does not necessarily affect the ITD or the IID.

The Cone of Confusion

The set of locations where ITDs and IIDs are identical regardless of sound source elevation or front-back position can be visualised as a cone radiating out from the ear, known as the Cone of Confusion. A Cone of Confusion exists at each ear. Every point on a cross-section of one of these cones is a location from which a sound source produces the same time and level differences, so these binaural cues alone cannot tell those locations apart.
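To make the geometry concrete, here is a minimal sketch that treats the ears as two points in free space and ignores the head altogether. The ear spacing, the speed of sound and the function names are my own assumptions for illustration. It places a source at several positions around one cone (in front, raised, overhead, behind) and prints the resulting time difference for each.

```python
import math

EAR_OFFSET_M = 0.09      # assumed distance from head centre to each ear
SPEED_OF_SOUND = 344.0   # assumed speed of sound in air, m/s


def point_on_cone(distance_m, cone_angle_deg, roll_deg):
    """Return (x, y, z) of a source on a cone around the interaural axis.

    x runs through both ears (positive towards the right ear), y points
    forwards and z points up. roll_deg = 0 is in front, 90 is overhead,
    180 is behind.
    """
    a = math.radians(cone_angle_deg)
    r = math.radians(roll_deg)
    return (
        distance_m * math.cos(a),
        distance_m * math.sin(a) * math.cos(r),
        distance_m * math.sin(a) * math.sin(r),
    )


def itd_microseconds(source):
    """Time difference between the two point ears (left minus right)."""
    left_ear = (-EAR_OFFSET_M, 0.0, 0.0)
    right_ear = (EAR_OFFSET_M, 0.0, 0.0)
    path_difference = math.dist(source, left_ear) - math.dist(source, right_ear)
    return path_difference / SPEED_OF_SOUND * 1e6


for roll in (0, 60, 90, 180):
    source = point_on_cone(2.0, 50.0, roll)
    print(f"roll {roll:3d} deg -> ITD {itd_microseconds(source):.1f} us")
```

Every position prints the same value, because rotating the source around the axis through the ears never changes its distance to either ear. A real head adds shadowing and diffraction, which this sketch deliberately leaves out.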

figure-4

Cones of Confusion should be thought of as extending outwards from the ear indefinitely. Fortunately, the human auditory system has mechanisms for resolving this confusion. These are:

  • Head-Related Transfer Functions – Spectral changes caused by the natural filtering effects of head, torso, and pinnae (explained below in more detail). HRTFs have the most effect on higher frequencies.
  • Head Movements – For sound waves of lower frequencies, head movements can be relied upon. Rotating the head changes interaural cues which in turn provide unambiguous information about the position of a sound source (Figure 5).

figure-5
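The head-movement cue can be sketched with a simple sine-law approximation of ITD. The sine law, the 673 microsecond maximum and the function name are assumptions chosen for illustration rather than measured values. A source 30° to the right in front and its mirror image at 150° behind start with the same ITD, then the head turns 10° to the right.

```python
import math

MAX_ITD_US = 673.0  # assumed ITD for a source directly to one side


def itd_us(source_azimuth_deg, head_yaw_deg=0.0):
    """Approximate ITD in microseconds using a sine law.

    Angles are measured clockwise from straight ahead, so 30 is front-right
    and 150 is back-right. A positive result means the right ear leads.
    """
    relative_azimuth = math.radians(source_azimuth_deg - head_yaw_deg)
    return MAX_ITD_US * math.sin(relative_azimuth)


for yaw in (0.0, 10.0):
    print(
        f"head yaw {yaw:4.1f} deg: "
        f"front source {itd_us(30.0, yaw):.1f} us, "
        f"back source {itd_us(150.0, yaw):.1f} us"
    )
```

With the head still, the two sources are indistinguishable. After the turn, the ITD of the front source shrinks while that of the back source grows, and the direction of that change tells the listener which of the two positions is the real one.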

Head-Related Transfer Functions (HRTFs)

As stated above, HRTFs are relied upon for ascertaining the elevation of a sound source as well as whether the sound is coming from behind or from the front. The term HRTF relates to how the head, torso, and pinnae interfere with sound waves before they enter the auditory canal and reach the sensory structures within the middle and inner ear. Each of these external body parts modifies incoming sound waves, altering their frequency spectra and perceived volume through specific diffractions and reflections depending on the angle at which the sound waves strike them. These natural filtering effects produce subtle colourations that are very important in allowing the auditory system to determine the origin of a sound.
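Because an HRTF is a filter, its time-domain form (a head-related impulse response, or HRIR) can be applied to a signal by convolution, one filter per ear. The sketch below shows only the mechanics. The two impulse responses are made-up toy values, not measurements, and the function names are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


# Toy HRIRs for a source on the listener's right (invented numbers):
# the right ear hears the sound immediately, the left ear hears it
# two samples later and at a lower level.
HRIR_RIGHT = [1.0, 0.3, 0.0, 0.0]
HRIR_LEFT = [0.0, 0.0, 0.5, 0.2]


def render_binaural(mono):
    """Filter one mono signal into a (left, right) pair of ear signals."""
    return convolve(mono, HRIR_LEFT), convolve(mono, HRIR_RIGHT)


left, right = render_binaural([1.0, 0.0, -1.0])
print("left: ", left)
print("right:", right)
```

Measured HRIRs are much longer and differ for every direction and every listener, but the operation applied to the signal is the same.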

HRTFs are specific to an individual. The shape and size of the head, torso and ears differ from person to person, and the variation can be great. The pinnae are arguably the most important outer-ear structures for localisation. A pinna acts as an acoustic antenna, capturing and collecting sound. Its geometry is made up of unique skin folds and cartilage which cause certain frequencies to be attenuated, while resonant cavities, such as the concha, amplify other frequencies. Hofman and Van Opstal (2002) demonstrated the significance of the role of the pinnae in sound localisation through listening experiments in which moulds were inserted into the pinnae of subjects in order to alter their shape and thereby disrupt the original spectral cues. Results suggested that the moulds hindered the ability of subjects to accurately identify the elevation of sound stimuli in both bilateral and unilateral test conditions.

Just how these binaural cues are incorporated and implemented in the creation of binaural recordings will be explained in next week’s post.

Bibliography

Hofman, M., Van Opstal, J. (2002) Binaural Weighting of Pinna Cues in Human Sound Localisation. Berlin: Springer-Verlag.

Plack, C. J. (2005) The Sense of Hearing. New York: Psychology Press.

Wenzel, E., Begault, D. R. (2012) The Role of Dynamic Information in Virtual Acoustic Displays. Available from the NASA website: http://humansystems.arc.nasa.gov/groups/ACD/projects/dynamic_info.php.

 


Binaural Cues – Part 1

Binaural recordings are reproductions of sound that create a three-dimensional effect which in turn provides the listener with the sensation of being immersed within an environment or scene. Effective binaural audio creates convincing impressions of 360º sound direction. Recordists are able to create this effect through successfully capturing key elements of physical acoustics within their recordings which provide the human auditory system with information that helps with deducing the location of the sound source without visual aid.

Francis Rumsey in his book, Spatial Audio, explained that “Binaural approaches to spatial sound representation are based on the premise that the most accurate reproduction of natural spatial listening cues will be achieved if the ears of the listener can be provided with the same signals that they would have experienced in the source environment or during natural listening.” (2001).

So, what are the all-important binaural cues needed for accurate spatial perception? Here I will begin to outline them for you.

Interaural Time Difference (ITD)

The term ITD is used in the field of acoustics to describe the difference between the time taken for a sound to arrive at the ear closest to the source and the time taken for it to reach the ear furthest away. These time differences are an auditory cue relied upon by the human auditory system to accurately locate sound sources. As human ears are separated by the head, the time taken for a sound wave to reach each ear will differ depending on the angle, distance and direction of the sound source in relation to the listener.

The sound waves emitted from a source located directly in front of or behind the listener will arrive at both ears simultaneously. If the source deviates from these positions, time differences between the two ears will occur, as illustrated in Figure 1.

figure-1

As well as the time difference introduced by the angle of incidence, additional time delays are created by the head, which sound must travel around in a curved path to reach the far ear. For a particular angle in relation to the head, there is a maximum frequency beyond which ITD is no longer a reliable cue for sound localisation. Above this maximum frequency the head begins to interfere with sound waves more significantly, preventing them from diffracting efficiently and thus attenuating sound intensity level. According to Howard and Angus (2006: 100), for steady-state signals the auditory system analyses the Interaural Time Difference as an Interaural Phase Difference (IPD) when determining the direction a sound is coming from. Howard and Angus (2006: 101) calculated that, when the angle of incidence is 90°, the maximum frequency that can be successfully localised using IPD is 743 Hz; at that frequency the time difference amounts to half a cycle, so the phase difference becomes ambiguous. Jan Mohamed and Cabrera (2008: 2) also state that the effectiveness of the IPD cue is limited to below 700 Hz, supporting Plack’s view (2014: 161) that ITDs work best for low-frequency tones.
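The 743 Hz figure can be reproduced with a short calculation. This is a sketch under my own assumptions: a spherical head of radius 9 cm, a speed of sound of 344 m/s, and the common spherical-head approximation ITD = r(θ + sin θ)/c, where the r·θ term is the curved path around the head. These values are chosen because they give the quoted result, and the function names are hypothetical.

```python
import math

HEAD_RADIUS_M = 0.09     # assumed radius of a spherical head
SPEED_OF_SOUND = 344.0   # assumed speed of sound in air, m/s


def itd_seconds(angle_deg):
    """Spherical-head ITD approximation for a source at angle_deg.

    The angle is measured from straight ahead, so 90 means the source is
    directly to one side of the listener.
    """
    theta = math.radians(angle_deg)
    return HEAD_RADIUS_M * (theta + math.sin(theta)) / SPEED_OF_SOUND


def max_ipd_frequency_hz(angle_deg):
    """Frequency at which the ITD equals half a period (phase becomes ambiguous)."""
    itd = itd_seconds(angle_deg)
    if itd == 0.0:
        return math.inf  # no time difference, so no phase ambiguity
    return 1.0 / (2.0 * itd)


print(round(itd_seconds(90.0) * 1e6))       # 673 (microseconds)
print(round(max_ipd_frequency_hz(90.0)))    # 743 (Hz)
```

Smaller angles give smaller ITDs and therefore a higher limit, which is why the maximum frequency is quoted for a particular angle.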

Interaural Intensity Difference (IID)

This cue is relied upon by the auditory system for locating relatively high-frequency sounds. A listener’s head acts as a physical barrier for these sounds, creating interpretable intensity level differences at each ear.

This effect could be described as an acoustic shadowing effect because sound waves are unable to pass through the head to the ear furthest away from the source, thus leaving it ‘in the shade’ (Figure 3). Figure 2 illustrates how the angle at which a sound source is situated in relation to a listener’s head affects intensity level (amplitude). When the source is on the median plane, its sound is received at equal levels at each ear, whereas amplitude progressively increases at one ear and decreases at the other as the source is moved further to one side.

figure-2
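Level differences like these are usually expressed in decibels. As a minimal sketch, the snippet below compares the RMS level of two made-up ear signals, with the far-ear signal being the near-ear signal at half the amplitude. The signals and function names are invented for illustration and are not taken from Figure 2.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def iid_db(near_ear, far_ear):
    """Interaural intensity difference in decibels (near ear relative to far ear)."""
    return 20.0 * math.log10(rms(near_ear) / rms(far_ear))


# Four cycles of a sine wave at the near ear; the far ear receives the
# same wave at half the amplitude (an arbitrary amount of head shadow).
near = [math.sin(2.0 * math.pi * k / 16.0) for k in range(64)]
far = [0.5 * s for s in near]

print(round(iid_db(near, far), 2))  # 6.02
```

Halving the amplitude at the far ear corresponds to an IID of roughly 6 dB.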

According to Howard and Angus (2006: 102), an object does not act as a significant sound diffuser until its size is around two-thirds of the wavelength (λ) of a particular sound. IID is therefore less useful as a localisation cue below a minimum frequency. Figure 2 demonstrates how lower frequency sounds (250 Hz) are less inhibited by the head, leading to less pronounced IIDs.
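The two-thirds-of-a-wavelength rule can be turned into a rough frequency. In this sketch the head diameter of 18 cm and the speed of sound of 344 m/s are my own assumed values, and the function names are hypothetical.

```python
SPEED_OF_SOUND = 344.0    # assumed speed of sound in air, m/s
HEAD_DIAMETER_M = 0.18    # assumed head diameter


def frequency_for_size_hz(fraction_of_wavelength, size_m=HEAD_DIAMETER_M):
    """Frequency at which an object of size_m spans the given fraction of a wavelength."""
    return fraction_of_wavelength * SPEED_OF_SOUND / size_m


def size_in_wavelengths(frequency_hz, size_m=HEAD_DIAMETER_M):
    """How many wavelengths an object of size_m spans at frequency_hz."""
    return size_m * frequency_hz / SPEED_OF_SOUND


print(round(frequency_for_size_hz(2.0 / 3.0)))   # 1274 (Hz)
print(round(size_in_wavelengths(250.0), 2))      # 0.13
```

Under these assumptions the head spans two-thirds of a wavelength at about 1.3 kHz, whereas at 250 Hz it spans only about an eighth of a wavelength, which fits the weak IIDs at that frequency.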

Additional phenomena worth noting are:
• Larger heads create a greater IID at a given frequency.
• Higher frequencies create larger IIDs at a given angle.

figure-3

Look out for the next post, which will cover additional cues and examples of when, even with the aid of ITDs and IIDs, locating a sound source can be very difficult.

Bibliography

Howard, H. M., Angus, J. A. (2006). Acoustics and Psychoacoustics. Oxford: Focal Press.

Jan Mohamed, M. I., Cabrera, D. (2008) Human Sensitivity to Interaural Phase Difference for Very Low Frequency Sound. Acoustics 2008: Proceedings of the Australian Acoustical Society Conference. Geelong, Australia.

Plack, C. J. (2014) The Sense of Hearing: Second Edition. New York: Psychology Press.

Rumsey, F. (2001) Spatial Audio. Oxford: Focal Press.

Shannan, B. (2010) Audiology Update. Available from the Scottish Sensory Centre, University of Edinburgh website: http://www.ssc.education.ed.ac.uk/courses/deaf/dnov10i.html.

 

Binaural Audio Intro.

Advances in mobile computing power, coupled with the affordability of new mobile technology, have enabled more people than ever before to consume audiovisual media using smartphones and tablet computers. Mobile devices are now major entertainment platforms in their own right, providing unbounded access to film and music releases, and millions of applications for those with internet connectivity.

Headphones are widely used for consuming audiovisual media on mobile devices and are often preferred to built-in speakers because they provide superior sound quality and isolation from the surrounding environment. Coinciding with the rise in use and popularity of mobile devices and headphones for AV media consumption, commercial interest in 3D binaural audio has also increased. Creators of games, radio, music, and film are looking to binaural audio as a method of enhancing consumer envelopment and engagement.

Past research has provided evidence that binaural audio delivers realistic three-dimensional listening experiences through accurate reproduction of natural spatial listening cues. In comparison, headphone reproduction of conventional stereo is known to sound as if audio events occur inside the head, rather than externally. In reality, however, are the benefits that significant, and is there really enough demand for a supposed improvement to stereo that only works on headphones?

Over the next few weeks, I will be posting a series of blog entries with the aim of researching binaural audio a little further. Hopefully some pretty interesting stuff will be revealed!

Recording with Sugar House Music


At the beginning of March, I recorded a new Glass Ankle song at Catalyst Studios in St. Helens, Merseyside, England. The sessions were led by Lee and Ady of Sugar House Music, a music production team whose tracks have received radio play on national UK stations, including Viola Beach’s “Swings & Waterslides”. They were chosen by the label Playing With Sound (PWS) to record, develop, and mix a version of this particular Glass Ankle song. Late last year I signed a record deal with PWS to release a single, the details of which can be found on the Glass Ankle website.

The new song was efficiently recorded and mixed in the space of just 16 hours over two days. This is the first time I will release a song I have written which I have not also recorded and produced myself.

Keep checking back for more details about the song, which is to be called “Without You Here”.