ESE250: Digital Audio Basics. Week 6 (February 19, 2013): Human Psychoacoustics. ESE 250 S’13: DeHon, Kadric, Kod, Wilson-Shah


  • Slide 1
  • Slide 2
  • ESE250: Digital Audio Basics, Week 6 (February 19, 2013): Human Psychoacoustics. ESE 250 S'13: DeHon, Kadric, Kod, Wilson-Shah
  • Slide 3
  • Course Map
  • Slide 4
  • Where are we?
      o Week 2: Received signal is sampled and quantized: q = PCM[r]
      o Week 3: Quantized signal is coded: c = code[q]
      o Week 4: Sampled signal is first transformed into the frequency domain: Q = DFT[q]
      o Week 5: Signal is oversampled and low-pass filtered: Q = LPF[DFT(q + n)]
      o Week 6: Transformed signal is analyzed using human psychoacoustic models
      o Week 7: Acoustically interesting signal is perceptually coded: C = MP3[Q]
      o [Block diagram: r(t), Oversample, DFT, LPF, Perceptual Coding, Store / Transmit, Decode, Produce, p(t); intermediate signals q + n, Q + N, Q, C]
    [Painter & Spanias. Proc. IEEE, 88(4):451–512, 2000]
  • Slide 5
  • The Physical Ear: External sound waves are guided by the outer ear into the auditory canal. They excite the inner ear through the mechanical linkage connecting the eardrum to the cochlea. [R. Munkong and B.-H. Juang. IEEE Sig. Proc. Mag., 25(3):98–117, 2008]
  • Slide 6
  • The Physical Ear: Initiates signal processing, namely frequency-domain analysis via analog computation. Video: Cochlea. What part of the cochlea vibrates for an 800 Hz square wave? [R. Munkong and B.-H. Juang. IEEE Sig. Proc. Mag., 25(3):98–117, 2008]
  • Slide 7
  • The Cognitive Ear: Modern psychoacoustics benefits greatly from decades of neural recording and from contemporary brain imaging technology. [R. Munkong and B.-H. Juang. IEEE Sig. Proc. Mag., 25(3):98–117, 2008]
  • Slide 8
  • Power Spectrum Model of Hearing. Rough picture (main content of today's lecture):
      o Critical bands: the auditory system contains a finite array of adaptively tunable, overlapping bandpass filters
      o Frequency bins: humans process a signal's component (against a noisy background) in the one filter with the closest center frequency
      o Masking: certain signal components in a given band are favored and others are filtered out
    Established through decades of psychoacoustic experiments. [B.C.J. Moore. Int. Rev. Neurobiol., 70:49–86, 2005]
  • Slide 9
  • Auditory Thresholds: In the lab, you varied the frequency, amplitude, and phase of signals. What was the effect of each, if any, on the sound you heard? Consider each in turn: frequency, amplitude, phase.
  • Slide 10
  • Auditory Thresholds: Harvey Fletcher (1940)
      o Played pure tones varying in frequency, f [Hz], and intensity, I [dyn cm^-2], where $1\,\mathrm{dyn\,cm^{-2}} = 10^{-5}\,\mathrm{N\,cm^{-2}} = 0.1\,\mathrm{Pa}$
      o Phase changes tend to be inaudible
      o Large listener population: young, with acute hearing
      o Recorded extreme thresholds: faintest audible and greatest tolerable
    (http://www.et.byu.edu/)
  • Slide 11
  • Auditory Thresholds: Results: the pain-free hearing range extends at most over 20 Hz to 20 kHz, with sensitivity down to $2 \times 10^{-4} \times 0.1\,\mathrm{Pa} = 20\,\mu\mathrm{Pa}$. [H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]
  • Slide 12
  • The decibel unit
      o Define standard pressure: $p_0 = 0.0002 \times 0.1\,\mathrm{Pa} = 20\,\mu\mathrm{Pa}$, the threshold of human hearing
      o Compute Sound Pressure Level as: $L_{SPL} = 20 \log_{10}(p/p_0)$ dB
      o Exercise: find $L_{SPL}$ for $p_1 = 20\,\mu\mathrm{Pa}$, for $p_2 = 200\,\mu\mathrm{Pa}$, and for $p_3 = 20\,\mathrm{mPa}$
      o Compare to ambient sea-level pressure: 1 atmosphere $\approx 10^{5}$ Pa
      o Q: Why use a log-log scale? A1: dynamic range. A2: loudness is a power function.
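The SPL formula on this slide can be checked numerically. The sketch below is a minimal illustration, assuming the exercise pressures are 20 µPa, 200 µPa, and 20 mPa and that the reference is 20 µPa; the function name `spl_db` is ours, not from the course.

```python
import math

# Reference pressure in pascals: 20 micropascals, the hearing threshold from the slide.
P0 = 20e-6


def spl_db(pressure_pa: float) -> float:
    """Sound pressure level in dB relative to P0: 20 * log10(p / p0)."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure_pa / P0)


if __name__ == "__main__":
    # The three exercise pressures (assumed values) plus one atmosphere for comparison.
    for label, p in [("p1", 20e-6), ("p2", 200e-6), ("p3", 20e-3), ("1 atm", 1e5)]:
        print(f"{label}: {spl_db(p):.1f} dB SPL")
```

Each factor of 10 in pressure adds 20 dB, which is why the three exercise values land on 0, 20, and 60 dB.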
  • Slide 13
  • The decibel unit: hearing intensity (http://www.dspguide.com/ch22/1.htm)
  • Slide 14
  • Let's try to reproduce these results! We will listen to single sine tones starting at a frequency of 10 kHz and going all the way up to 20 kHz, so each student can find their cut-off frequency. Suggestions to improve this experiment? (http://www.dspguide.com/ch22/1.htm)
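One way to build such a test signal is sketched below using only the Python standard library. The 1 kHz step, 1 s tone length, amplitude, sample rate, and output file name are all assumptions for illustration; the slide does not specify them.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # Hz; must exceed 2 x 20 kHz to represent the highest tone


def tone_frequencies(start_hz=10_000, stop_hz=20_000, step_hz=1_000):
    """Frequencies to play, inclusive of both ends (the step size is an assumption)."""
    return list(range(start_hz, stop_hz + 1, step_hz))


def sine_tone(freq_hz, duration_s=1.0, amplitude=0.3, sample_rate=SAMPLE_RATE):
    """Return 16-bit integer samples of a sine tone; amplitude is a fraction of full scale."""
    n = int(round(duration_s * sample_rate))
    peak = int(amplitude * 32767)
    return [
        int(round(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate)))
        for i in range(n)
    ]


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono 16-bit PCM samples to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))


if __name__ == "__main__":
    samples = []
    for f in tone_frequencies():
        samples.extend(sine_tone(f))
    write_wav("hearing_range_test.wav", samples)  # hypothetical output file name
```

Keep the amplitude low when listening: high-frequency tones can be uncomfortable even when they seem faint.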
  • Slide 15
  • Animal hearing ranges
      o Dogs: greater hearing range, 40 Hz to 60 kHz; ultrasonic dog whistles
      o Mice: large ears in comparison to their bodies; hearing range 1 kHz to 70 kHz; can't hear low-frequency noises; communicate at high frequencies, e.g. a distress call (40 kHz) to alert others of a predator
    [Pictures from Wikipedia]
  • Slide 16
  • Why Sinusoids?
      o Why not some other harmonic series? Fourier's analysis shows that harmonic analysis could be based on an arbitrary smooth periodic fundamental.
      o Why does the animal receiver use sinusoids? Hamiltonian mechanics: the simplest physical model of vibrating masses, coupled spring-mass-damper mechanics, produces sinusoidal harmonics.
      o All sound is produced by vibrating masses.
      o Video: Cochlea. [Figure: spring-mass-damper system with mass m, displacement x, damping b, spring constant k]
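The spring-mass-damper claim can be made concrete with the standard equation of motion for a single mass. The symbols m, b, k, and x follow the slide's figure; the amplitude A, phase φ, and damped frequency ω_d are introduced here for illustration, and the solution shown assumes the underdamped case.

```latex
m\,\ddot{x} + b\,\dot{x} + k\,x = 0
\quad\Longrightarrow\quad
x(t) = A\, e^{-\frac{b}{2m} t} \cos(\omega_d t + \varphi),
\qquad
\omega_d = \sqrt{\frac{k}{m} - \frac{b^2}{4m^2}}
\quad (b^2 < 4mk).
```

With no damping (b = 0) the motion is a pure sinusoid at frequency $\sqrt{k/m}$, which is the sense in which vibrating masses naturally produce sinusoids.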
  • Slide 17
  • Masking - Spatial. Masking paradigms: a masker masks a maskee.
      o Tone masking noise: a pure tone of 80 dB SPL at 1 kHz just masks critical-band noise of 56 dB SPL centered at 1 kHz
      o Masker-to-maskee ratio: constant for fixed relative frequency and varying amplitude; changes with varying relative frequency
      o [Figure label: 1 Bark frequency interval]
    [T. Painter and A. Spanias. Proc. IEEE, 88(4):451–512, 2000]
  • Slide 18
  • Masking: The first graph shows the masking pattern for a 200 Hz tone: it mostly masks tones around 200 Hz, but also at the harmonics. The second graph shows the same plot for different frequencies, but only the fundamental part. Notice that the band gets wider for increasing frequencies. A masker at the fundamental can somewhat mask maskees at the harmonics, but the spreading curve is traditionally depicted over the fundamental only. [H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]
  • Slide 19
  • Tone Masking Noise: Are the following signals masked?
      o 200 Hz tone at 80 dB
      o 200 Hz tone at 40 dB
      o 300 Hz tone at 40 dB
      o 400 Hz tone at 40 dB
      o 700 Hz tone at 30 dB
  • Slide 20
  • Masking [H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]
      o Tone masking noise (Fig. 12): the value above the quiet threshold such that a signal at the abscissa frequency can be heard in the presence of a 200 Hz tone (top) or tones of various frequencies (bottom)
      o Noise masking tone (Fig. 13): dots show the pure-tone magnitude (in dB) required to be audible above noise of the magnitude on the middle curve, centered at that frequency, with bandwidth at least wider than the bars of Fig. 12
  • Slide 21
  • Noise Masking Tone: Are the following signals masked by the noise? 200 Hz at 60 dB; 1 kHz at 60 dB.
  • Slide 22
  • Noise Masking Tone: Are the following signals masked by the noise? 200 Hz at 60 dB: Yes! 1 kHz at 60 dB.
  • Slide 23
  • Noise Masking Tone: Are the following signals masked by the noise? 200 Hz at 60 dB: No! 1 kHz at 60 dB.
  • Slide 24
  • Noise Masking Tone: Are the following signals masked by the noise? 200 Hz at 60 dB; 1 kHz at 60 dB: No!
  • Slide 26
  • Masking - Temporal. Temporal masking: the masker effect persists for tenths of a second. The masker effect is also acausal (a masker can hide sounds that slightly precede it), on timescales of ~2/100 of a second.
  • Slide 27
  • Pitch JND
      o JND = just noticeable difference: the change in stimulus that just elicits perceptual notice, where "just" means that smaller variations of the stimulus cannot be discerned [H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]
      o What can you say about the JND below 1000 Hz? Roughly constant, ~3 Hz
      o Above 1000 Hz? Roughly log-log linear: $\log \mathrm{JND}(f_2) - \log \mathrm{JND}(f_1) \approx n\,(\log f_2 - \log f_1)$
      o This suggests that as frequency increases, broader frequency bands are assigned to the same length of cochlear tissue (remember the cochlea model)
      o What is n? E.g., for $f_1 = 2000$, $f_2 = 4000$: $6 = 10 - 4 \approx n \log_{10} 2 \;\Rightarrow\; n \approx 20$
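The slope estimate in the last step is a two-point fit on log-frequency. A minimal sketch of the arithmetic follows, assuming the slide's plot readings are 4 at 2000 Hz and 10 at 4000 Hz (the plot's vertical units are not given in the transcript); the function name `loglog_slope` is ours.

```python
import math


def loglog_slope(f1_hz, f2_hz, y1, y2):
    """Slope n in: y2 - y1 = n * (log10(f2) - log10(f1))."""
    return (y2 - y1) / (math.log10(f2_hz) - math.log10(f1_hz))


# Readings assumed from the slide: 4 at 2000 Hz and 10 at 4000 Hz.
n = loglog_slope(2000, 4000, 4, 10)
print(round(n, 1))  # close to the slide's n ~ 20
```

Because the two frequencies are exactly one octave apart, the denominator is just log10(2), about 0.301.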
  • Slide 28
  • JND experiment: The following audio files each contain a single tone playing for 10 seconds. The sine starts at 200 Hz, then changes to a higher frequency (201, 202, 203, 205, or 210 Hz). This change occurs after a number of noises: 1, 2, 3, 4, 5, 6, 7, 8, or 9. Can you notice when the change happens?
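A stimulus of this kind could be generated as sketched below. This is an illustration, not the course's actual files: it assumes the change point is given in seconds, uses a hypothetical 8 kHz sample rate, and accumulates phase so the frequency step does not introduce an audible click.

```python
import math


def jnd_stimulus(delta_hz, change_at_s, base_hz=200.0, duration_s=10.0, sample_rate=8000):
    """Return float samples in [-1, 1]: base_hz until change_at_s, then base_hz + delta_hz.

    Phase is accumulated sample by sample so the waveform stays continuous
    at the moment the frequency changes.
    """
    samples = []
    phase = 0.0
    change_index = change_at_s * sample_rate
    for i in range(int(duration_s * sample_rate)):
        freq = base_hz if i < change_index else base_hz + delta_hz
        samples.append(math.sin(phase))
        phase += 2 * math.pi * freq / sample_rate
    return samples


if __name__ == "__main__":
    # One stimulus per frequency step listed on the slide, changing after 5 s (assumed).
    for delta in (1, 2, 3, 5, 10):
        tone = jnd_stimulus(delta, change_at_s=5)
        print(delta, len(tone))
```

A phase discontinuity at the switch would itself be audible and would give the change away, which is why the sketch avoids recomputing the sine from scratch after the step.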
  • Slide 29
  • Critical Bands: Decades of empirical study reveal that human audio frequency perception is quantized into fewer than 30 critical bands of perceptually near-identical pitch classes, corresponding to roughly equal-length bands of cochlear tissue (neurons).
  • Slide 30
  • Critical Bands: Evidence
      o Tone masking noise (Fig. a & c): the noise audibility threshold for small-bandwidth noise remains constant until the tone frequency locus falls away from the critical bandwidth
      o Noise masking tone (Fig. b & d): same effect, with masker and maskee roles reversed
    [T. Painter and A. Spanias. Proc. IEEE, 88(4):451–512, 2000]
  • Slide 31
  • The Bark Scale: Bark units give a uniform JND scale for frequency. The scale maps frequency intervals into their respective critical band number. [E. Zwicker. J. Acoust. Soc. Am., 33(2):248, February 1961]
  • Slide 32
  • The Bark Scale: Frequency-to-Bark function. First principles vs. empirical modeling. [E. Zwicker. J. Acoust. Soc. Am., 33(2):248, February 1961]
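For experimentation, one widely used closed-form approximation of the frequency-to-Bark mapping is the Zwicker and Terhardt (1980) formula, z(f) = 13 arctan(0.00076 f) + 3.5 arctan((f / 7500)^2). Whether this is the exact function plotted on the slide is an assumption; the sketch below only illustrates the empirical-model side of the comparison.

```python
import math


def hz_to_bark(freq_hz: float) -> float:
    """Approximate critical-band rate in Bark (Zwicker & Terhardt style formula)."""
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)


if __name__ == "__main__":
    for f in (100, 200, 1000, 4000, 10000, 20000):
        print(f"{f:>6} Hz -> {hz_to_bark(f):5.2f} Bark")
```

The mapping is roughly linear at low frequencies and roughly logarithmic at high ones, and it stays below 30 Bark across the audible range, consistent with the "fewer than 30 critical bands" picture.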
  • Slide 33
  • Compression opportunities: Consider the following recording. Any ways to improve the compression?
  • Slide 34
  • Compression opportunities: Zooming in on a smaller portion. Any ways to improve the compression? [Figure: spectrum from 193 Hz to 208 Hz, level in dB (0 to 120); annotation: Masked]
  • Slide 35
  • Compression opportunities: Zooming in on a smaller portion. Any ways to improve the compression? [Figure: spectrum from 193 Hz to 208 Hz, level in dB (0 to 120); annotation: JND: could only represent integer frequency values]
  • Slide 36
  • Compression opportunities: Zooming in on a smaller portion. Any ways to improve the compression? [Figure: spectrum from 193 Hz to 208 Hz, level in dB (0 to 120)]
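The two ideas annotated on these slides, dropping masked components and storing only integer frequencies, can be sketched as a toy filter. This is not a real psychoacoustic model: it assumes all components lie within one critical band and applies a flat 24 dB margin below the strongest component, a number borrowed from the 80 dB vs. 56 dB tone-masking-noise example earlier in the lecture. The function name and the sample data are hypothetical.

```python
def prune_and_quantize(components, masking_margin_db=24.0):
    """Toy compression step for spectral components inside one critical band.

    components: list of (freq_hz, level_db) pairs.
    Drops anything more than masking_margin_db below the strongest component
    (assumed masked), rounds surviving frequencies to integer Hz (a JND-style
    quantization), and keeps the loudest level for each integer frequency.
    """
    if not components:
        return []
    masker_level = max(level for _, level in components)
    threshold = masker_level - masking_margin_db
    kept = {}
    for freq_hz, level_db in components:
        if level_db < threshold:
            continue  # treated as masked by the strongest component
        f = int(round(freq_hz))
        kept[f] = max(level_db, kept.get(f, float("-inf")))
    return sorted(kept.items())


if __name__ == "__main__":
    # Hypothetical spectrum around 200 Hz, loosely modeled on the slide's figure.
    spectrum = [(195.0, 40), (199.6, 70), (200.0, 100), (200.4, 80), (205.0, 60)]
    print(prune_and_quantize(spectrum))
```

A real coder would use a frequency-dependent masking threshold rather than a flat margin, since the masker-to-maskee ratio changes with relative frequency.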
  • Slide 37
  • Next Week: How can we use what we know about human perception to compress music? Frequency hearing range; masking (temporal, spatial); JND; Barks.
  • Slide 38
  • Big Ideas
      o Sound is a pressure wave that makes the cochlea vibrate with frequencies from ~20 Hz (at the tip) to ~20 kHz (at the base)
      o This vibration is sinusoidal (physics), which is why sound harmonics are best represented as sinusoidal signals
      o Masking, temporal: a masker tone can mask another tone that is present either right before or a little after the masker
      o Masking, spatial: a single tone can mask an entire frequency band (that contains the tone) if its intensity is high enough
      o There are