Head-related transfer function
The head-related transfer function (HRTF), also called the anatomical transfer function (ATF), describes how a given sound wave input (parameterized as frequency and source location) is filtered by the diffraction and reflection properties of the head, pinna, and torso before the sound reaches the transduction machinery of the eardrum and inner ear (see auditory system). Biologically, the source-location-specific prefiltering effects of these external structures aid in the neural determination of source location, particularly the determination of source elevation.
Linear systems analysis defines the transfer function as the complex ratio between the output signal spectrum and the input signal spectrum as a function of frequency. Blauert (1974; cited in Blauert, 1981) initially defined the transfer function as the free-field transfer function (FFTF). Other terms include the free-field to eardrum transfer function and the pressure transformation from the free field to the eardrum. Less specific descriptions include the pinna transfer function, the outer ear transfer function, the pinna response, the directional transfer function (DTF) and, most commonly, the head-related transfer function (HRTF).
The transfer function H(f) of any linear time-invariant system at frequency f is:
- H(f) = Output(f) / Input(f)
One method used to obtain the HRTF from a given source location is therefore to measure the head-related impulse response (HRIR), h(t), at the eardrum for an impulse δ(t) placed at the source. The HRTF H(f) is the Fourier transform of the HRIR h(t).
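The relationship between the HRIR and the HRTF can be illustrated with a short numerical sketch. The impulse response below is a made-up toy signal (a direct impulse plus one weaker delayed copy), and the function name `hrtf_from_hrir` and the 48 kHz sampling rate are assumptions chosen for illustration, not part of any measurement standard.

```python
import numpy as np

def hrtf_from_hrir(hrir, fs):
    """Return (frequencies in Hz, complex HRTF) for a real-valued HRIR.

    The HRTF is taken as the discrete Fourier transform of the HRIR.
    """
    hrir = np.asarray(hrir, dtype=float)
    freqs = np.fft.rfftfreq(hrir.size, d=1.0 / fs)
    return freqs, np.fft.rfft(hrir)

# Hypothetical toy HRIR: direct sound plus a weaker copy 12 samples later.
fs = 48000                      # assumed sampling rate in Hz
hrir = np.zeros(256)
hrir[10] = 1.0                  # direct path
hrir[22] = 0.5                  # delayed, attenuated copy

freqs, H = hrtf_from_hrir(hrir, fs)
magnitude_db = 20 * np.log10(np.abs(H))   # magnitude response in dB
phase_rad = np.angle(H)                   # phase response in radians
```

The magnitude and phase of `H` together carry the same information as the time-domain response, so either form can be stored and converted to the other.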
Even when measured for a dummy head of idealized geometry, head-related transfer functions are complicated functions of frequency and the three spatial variables. For distances greater than 1 m from the head, however, the HRTF can be said to attenuate inversely with range. It is this far-field HRTF, H(f, θ, φ), that is normally measured.
HRTFs are typically measured in an anechoic chamber to minimize the influence of early reflections and reverberation on the measured response. HRTFs are measured at small increments of the azimuth θ, such as 15° or 30° in the horizontal plane, with interpolation used to synthesize HRTFs for arbitrary values of θ. Even with small increments, however, interpolation can lead to front-back confusion, and optimizing the interpolation procedure is an active area of research. Humans are less sensitive to changes in the elevation, φ, and HRTFs are often measured only on the horizontal plane or with 45° increments in the median plane.
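As a rough illustration of interpolating between measured directions, the sketch below linearly blends the two measured impulse responses that bracket a target angle in the horizontal plane. This is only the simplest possible scheme, shown under stated assumptions: the function name `interpolate_hrir`, the 30° grid, and the random stand-in data are all hypothetical, and practical systems generally use more careful methods.

```python
import numpy as np

def interpolate_hrir(angles_deg, hrirs, target_deg):
    """Linearly blend the two measured HRIRs that bracket target_deg.

    angles_deg: sorted 1-D array of at least two measured angles in [0, 360)
    hrirs:      array of shape (len(angles_deg), n_taps)
    """
    az = np.asarray(angles_deg, dtype=float)
    hrirs = np.asarray(hrirs, dtype=float)
    target = target_deg % 360.0
    # Index of the first measured angle above the target, wrapping past 360.
    hi = np.searchsorted(az, target, side="right") % az.size
    lo = (hi - 1) % az.size
    span = (az[hi] - az[lo]) % 360.0          # angular gap between neighbours
    w = ((target - az[lo]) % 360.0) / span    # 0 at lo, approaching 1 at hi
    return (1.0 - w) * hrirs[lo] + w * hrirs[hi]

# Stand-in data: random "measurements" every 30 degrees (not real HRIRs).
measured_angles = np.arange(0, 360, 30)
rng = np.random.default_rng(0)
measured = rng.standard_normal((measured_angles.size, 128))

h45 = interpolate_hrir(measured_angles, measured, 45.0)    # between 30 and 60
h345 = interpolate_hrir(measured_angles, measured, 345.0)  # wraps 330 -> 0
```

Blending raw impulse responses in this way ignores differences in arrival time between neighbouring directions, which is one reason simple interpolation can produce audible artifacts.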
To maximize the signal-to-noise ratio (SNR) of a measured HRTF, the generated impulse must be loud. In practice, however, loud impulses are difficult to generate and, if generated, can damage human ears, so it is more common for HRTFs to be calculated directly in the frequency domain using a frequency-swept sine wave or maximum length sequences. Subject fatigue is still a problem, however, which highlights the need to interpolate from fewer measurements.
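The frequency-domain approach can be sketched as follows: a known sweep is played, the response is recorded, and the transfer function is estimated as the ratio of output to input spectra. Everything specific here is an assumption for illustration: the sweep range, the toy "true" impulse response, the simulated noise-free recording, and the small regularization constant `eps` that guards against division by near-zero spectral values.

```python
import numpy as np

fs = 48000                       # assumed sampling rate in Hz
n = 4096                         # sweep length in samples
t = np.arange(n) / fs
duration = n / fs
f0, f1 = 100.0, 20000.0          # assumed sweep start and end frequencies

# Linear sine sweep from f0 to f1.
rate = (f1 - f0) / duration
sweep = np.sin(2 * np.pi * (f0 * t + 0.5 * rate * t**2))

# Hypothetical system under test: two taps standing in for an HRIR.
true_ir = np.zeros(64)
true_ir[5] = 1.0
true_ir[17] = -0.4

# Simulated recording at the ear (noise-free for simplicity).
recorded = np.convolve(sweep, true_ir)

# Estimate H(f) = Output(f) / Input(f), with regularization.
n_fft = recorded.size
X = np.fft.rfft(sweep, n_fft)
Y = np.fft.rfft(recorded, n_fft)
eps = 1e-8
H = Y * np.conj(X) / (np.abs(X) ** 2 + eps)

# Back to the time domain to recover the impulse response.
est_ir = np.fft.irfft(H, n_fft)[: true_ir.size]
```

Because the sweep spreads its energy over time, it can deliver far more total energy than a single click at the same peak level, which is what makes this approach attractive for measurements on human subjects.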
Head-related transfer functions underlie a person's ability to localize sound. This ability is not innate: a baby has to train its auditory system to recognize where sounds are coming from.
The head-related transfer function is involved in resolving the cone of confusion, a set of points for which the interaural time difference (ITD) and interaural intensity difference (IID) are identical, so that sound sources at many locations on the cone cannot be distinguished by those cues alone. When a sound arrives at the ear, it can either travel straight into the ear canal or be reflected off the pinna and enter the ear canal a fraction of a second later. The sound contains many frequencies, so many copies of the signal travel down the ear canal at different times depending on their frequency (according to reflection, diffraction, and the relation between the wavelength and the size of the structures of the ear). These copies overlap, and in doing so some frequencies are enhanced (where the phases of the copies match) while others are canceled out (where the phases do not match). Essentially, the brain looks for frequency notches in the signal that correspond to particular known directions of sound. If another person's ears were substituted, the individual would initially be unable to localize sound accurately, as the patterns of enhancement and cancellation would differ from those learned in infancy.
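The cancellation described above can be made concrete with a deliberately simplified model: one direct sound plus a single equal-strength reflection delayed by τ. In that model the two copies are in opposite phase, and cancel, at the frequencies (2k + 1) / (2τ). The function name `notch_frequencies` and the 62.5 µs delay are hypothetical values chosen for illustration; a real pinna produces many reflections of differing strength.

```python
import numpy as np

def notch_frequencies(delay_s, f_max):
    """Frequencies (Hz) up to f_max at which a direct sound and an
    equal-strength copy delayed by delay_s arrive in opposite phase."""
    # Notches sit at (2k + 1) / (2 * delay) for k = 0, 1, 2, ...
    count = int(np.floor(f_max * delay_s - 0.5)) + 1
    k = np.arange(max(count, 0))
    return (2 * k + 1) / (2.0 * delay_s)

def two_path_gain(freq_hz, delay_s):
    """Magnitude of direct sound plus one equal-strength delayed copy."""
    return np.abs(1.0 + np.exp(-2j * np.pi * freq_hz * delay_s))

delay = 62.5e-6                               # hypothetical reflection delay (s)
notches = notch_frequencies(delay, 20000.0)   # one notch near 8 kHz
gain_at_notch = two_path_gain(notches, delay) # close to zero (cancellation)
gain_at_peak = two_path_gain(16000.0, delay)  # close to 2 (reinforcement)
```

Changing the delay, as happens when the source moves and the reflection path length changes, shifts the notch frequencies, which is the kind of direction-dependent spectral pattern the text describes.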
- Binaural recording
- Environmental audio extensions
- Dummy head recording
- Sound Retrieval System