Audio Scene Analysis and Music Cognitive Elements of Music Listening Kevin D. Donohue Databeam Professor Electrical and Computer Engineering University of Kentucky


Audio Scene Analysis and Music Cognitive Elements of Music Listening

Kevin D. Donohue
Databeam Professor

Electrical and Computer Engineering
University of Kentucky


What is Music?

1 a : the science or art of ordering tones or sounds in succession, in combination, and in temporal relationships to produce a composition having unity and continuity b : vocal, instrumental, or mechanical sounds having rhythm, melody, or harmony

Merriam-Webster Online Dictionary:

http://www.m-w.com/dictionary/music


Auditory Scene: Input

Sensory organs (ears) separate acoustic energy into frequency bands and convert band energy into neural firings.

The auditory cortex receives the neural responses and abstracts an auditory scene.
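As a rough illustration of separating acoustic energy into frequency bands, the sketch below sums the DFT power of one short frame into a few bands. It is an engineering stand-in, not a model of the ear; the function name `band_energies`, the sample rate, the frame length, and the band edges are all assumptions chosen for this example.

```python
import cmath
import math

def band_energies(frame, sample_rate, band_edges_hz):
    """Sum DFT power of one frame into the bands defined by band_edges_hz."""
    n = len(frame)
    # Hann window reduces leakage between neighbouring bands.
    windowed = [x * 0.5 * (1 - math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    energies = [0.0] * (len(band_edges_hz) - 1)
    for k in range(n // 2 + 1):               # non-negative frequencies only
        freq = k * sample_rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))
        for b in range(len(energies)):
            if band_edges_hz[b] <= freq < band_edges_hz[b + 1]:
                energies[b] += abs(coeff) ** 2
    return energies

sample_rate = 8000                             # Hz, assumed for the example
frame = [math.sin(2 * math.pi * 1000 * i / sample_rate) for i in range(256)]
# Three illustrative bands: 0-500 Hz, 500-1500 Hz, 1500 Hz up to Nyquist.
energies = band_energies(frame, sample_rate, [0, 500, 1500, 4001])
loudest_band = energies.index(max(energies))   # index of the band holding the tone
```

Repeating this over successive frames gives a time-frequency picture of the input, which is the kind of representation the rest of these slides display as spectrograms.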

http://hyperphysics.phy-astr.gsu.edu/hbase/sound/hearcon.html

[Figure: time-frequency plot; axes: Time (0 to 0.1) and Frequency; labels 1, 2, 3, 4]


Auditory Scene: Perception

Perception derives a useful representation of reality from sensory input.

Auditory Stream refers to a perceptual unit associated with a single happening (A.S. Bregman, 1990).

Acoustic to Neural Conversion -> Organize into Auditory Streams -> Representation of Reality


Auditory Stream Experiment

Bregman & Campbell (1971)

Streams tend to form by grouping notes close in time and frequency (similarity and proximity).

Click on spectrograms to play tone sequence. Identify changes in tone grouping based on separation in time and frequency.

http://www.psych.mcgill.ca/labs/auditory/demo2.html
http://www.psych.mcgill.ca/labs/auditory/demo3.html

Note change in grouping/phrasing from inserting a pair of closely spaced tones around the lower tone.
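For readers without access to the linked demos, a stimulus of this general kind can be sketched as follows. This is not the original experimental stimulus; the function name `alternating_tones` and every frequency, duration, and sample-rate value are assumptions picked only to contrast small and large separations in time and frequency.

```python
import math

def alternating_tones(low_hz, high_hz, tone_ms, n_pairs, sample_rate=8000):
    """Return samples of a high-low-high-low... sequence of pure tones."""
    n = int(sample_rate * tone_ms / 1000)      # samples per tone
    samples = []
    for i in range(2 * n_pairs):
        freq = high_hz if i % 2 == 0 else low_hz
        for j in range(n):
            # Raised-cosine envelope avoids clicks at tone boundaries.
            env = 0.5 * (1 - math.cos(2 * math.pi * j / n))
            samples.append(env * math.sin(2 * math.pi * freq * j / sample_rate))
    return samples

# Small frequency separation, slow rate: tends to be heard as one stream.
one_stream = alternating_tones(low_hz=400, high_hz=450, tone_ms=200, n_pairs=4)
# Large frequency separation, fast rate: tends to split into two streams.
two_streams = alternating_tones(low_hz=400, high_hz=1600, tone_ms=50, n_pairs=16)
```

Both lists hold floating-point samples in the range -1 to 1 and can be written to a sound file with any audio library to compare the two groupings by ear.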


Circularity in Pitch Judgement

Shepard’s Scale (1964)

(Auditory Demonstrations CD, from the Acoustical Society of America)
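The circularity can be illustrated with a minimal sketch in the spirit of Shepard's construction: each tone is a stack of octave-spaced partials under a fixed bell-shaped amplitude envelope over log-frequency, so moving up 12 semitones reproduces the starting tone. The names `shepard_partials` and `shepard_tone`, the base frequency, the number of octaves, and the raised-cosine envelope are assumptions for this example, not the parameters of the recorded demonstration.

```python
import math

def shepard_partials(step, base_hz=27.5, n_octaves=8):
    """Return (frequency, amplitude) pairs for the tone at a semitone step."""
    partials = []
    for octave in range(n_octaves):
        position = octave + (step % 12) / 12   # octaves above base_hz
        freq = base_hz * 2 ** position
        # Fixed envelope over log-frequency: quiet at both extremes.
        amp = 0.5 * (1 - math.cos(2 * math.pi * position / n_octaves))
        partials.append((freq, amp))
    return partials

def shepard_tone(step, duration_s=0.05, sample_rate=22050):
    """Synthesize the tone at a semitone step as a list of samples."""
    partials = shepard_partials(step)
    return [sum(a * math.sin(2 * math.pi * f * i / sample_rate)
                for f, a in partials)
            for i in range(int(duration_s * sample_rate))]

# Thirteen steps: the last one has exactly the same partials as the first.
scale = [shepard_partials(step) for step in range(13)]
```

Because the partial set at step 12 is identical to the set at step 0, a listener who hears each step as "higher" than the previous one ends up back where the sequence started.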


Perceptual Organization

Organization properties:

Belongingness – a sensory element belongs to an organization (or stream) of which it is a part.

Exclusive allocation – a sensory element cannot belong to more than one organization at a time.

Bregman & Rudnicky (1975)

Click on spectrogram to listen to tone sequence. Note that in the first case the later tonal group sounds like one stream due to time proximity. In the second case, flanking the lower tones with a sequence at the same frequency separates the lower tone from the upper tones, creating 2 separate streams.


Perceptual Organization

Organization properties:

Closure – perceived continuity, a tendency to close strong perceptual forms, response to missing evidence.

Click on time waveform plots to listen. In the first case a low-level tone is playing and then stops, but the gap is covered by a white noise mask. Most will hear the tone playing through the mask.

Tone pattern first spectrogram

White noise only, used in masking
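A stimulus of the kind described on this slide can be sketched as a quiet tone with a gap that is either left silent or filled with louder white noise. The function name `tone_with_gap`, the amplitudes, the gap position, and the use of uniformly distributed noise are assumptions for illustration only.

```python
import math
import random

def tone_with_gap(fill_with_noise, freq_hz=440, sample_rate=8000,
                  total_s=1.0, gap=(0.4, 0.6), tone_amp=0.1, noise_amp=0.8):
    """Return a quiet tone interrupted by a gap, optionally filled with noise."""
    rng = random.Random(0)                     # fixed seed for repeatability
    samples = []
    for i in range(int(total_s * sample_rate)):
        t = i / sample_rate
        if gap[0] <= t < gap[1]:
            # Inside the gap the tone itself is physically absent.
            value = noise_amp * rng.uniform(-1, 1) if fill_with_noise else 0.0
        else:
            value = tone_amp * math.sin(2 * math.pi * freq_hz * t)
        samples.append(value)
    return samples

silent_gap = tone_with_gap(fill_with_noise=False)  # the interruption is exposed
masked_gap = tone_with_gap(fill_with_noise=True)   # the tone may seem to continue
```

The two signals are identical outside the gap, so any difference in perceived continuity comes from what fills the gap rather than from the tone.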


Sequential and Spectral Integration

Sequential Integration
Grouping sensory elements over time, or events at different times, and considering them as coming from the same source.
Melody, rhythm

Spectral Integration
Fusing simultaneous sensory elements over frequency into one.
Timbre, harmony


Timbre and Spectral Integration

The harmonic structure (spectral envelope) and time envelope give rise to the timbre of the sound.

Click on spectra to hear sound. Note impact of spectral and time envelopes.

[Figure: three example sounds, each shown as a magnitude spectrum (Hertz, 0 to 12000, vs. dB, -60 to 0) and a time waveform (Seconds, 0 to 1, vs. Amplitude)]
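To make the roles of the two envelopes concrete, the sketch below builds two tones with the same pitch but different harmonic weights and different attack/decay shapes. The function name `harmonic_tone`, the harmonic amplitudes, and the envelope form (linear attack, exponential decay) are assumptions for this example and are not taken from the sounds on the slide.

```python
import math

def harmonic_tone(f0_hz, harmonic_amps, attack_s, decay_s,
                  duration_s=0.5, sample_rate=8000):
    """Sum harmonics of f0_hz and shape them with an attack/decay envelope."""
    samples = []
    for i in range(int(duration_s * sample_rate)):
        t = i / sample_rate
        # Spectral envelope: relative strength of each harmonic.
        wave = sum(a * math.sin(2 * math.pi * (k + 1) * f0_hz * t)
                   for k, a in enumerate(harmonic_amps))
        # Time envelope: linear attack followed by exponential decay.
        envelope = min(t / attack_s, 1.0) * math.exp(-t / decay_s)
        samples.append(envelope * wave)
    return samples

# Same pitch (220 Hz), different spectral and time envelopes.
mellow = harmonic_tone(220, [1.0, 0.3, 0.1], attack_s=0.10, decay_s=1.0)
bright = harmonic_tone(220, [1.0, 0.8, 0.7, 0.6, 0.5],
                       attack_s=0.005, decay_s=0.2)
```

Swapping only the harmonic weights, or only the attack and decay times, between the two calls is a simple way to hear how much each envelope contributes to the perceived timbre.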


Timbre and Spectral Integration

Simultaneous tones grouped by timbre

Click on spectrograms to play sounds. Note that different spectral bands do not sound like different streams. Just one stream is heard.

[Figure: two spectrograms (Seconds, 0.1 to 0.4, vs. Hertz, 0 to 5000); panel titles: "Same Note (A)" and "2 Notes (F and A)"]


Auditory Scene Organization

Primitive Stream Segregation
Inherent constraints in auditory scene analysis (perceptual organization demonstrated by infants/children)
Music: Organization of musical sensory units

Schema-based segregation
Learned constraints in auditory scene analysis (differences in perceptual organization resulting from training and culture)
Music: Differences between musicians and non-musicians
Music: Differences resulting from acculturation

(A.S. Bregman, Auditory Scene Analysis, MIT Press 1990, pp. 1-45)


Music Related Terms

Pitch – Perceived frequency/fundamental tone (20Hz-20kHz Range)

Melody – Pattern of tones identified by the intervals between consecutive pitches

Contour – Shape of the melody without regard to intervals

Loudness – Perceived intensity of sound (0 dB to 120 dB)

Timbre – Nature of a sound defined mostly by its harmonic structure and time envelope

Rhythm – Repeated pattern of strong and weak sounds

Tempo – Rate of the rhythm
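The difference between melody (intervals) and contour (shape) in the definitions above can be shown with a small sketch. Pitches are written as MIDI note numbers, which is an assumption of this example, as are the function names `intervals` and `contour` and the three note sequences.

```python
def intervals(pitches):
    """Semitone steps between consecutive pitches (MIDI note numbers)."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def contour(pitches):
    """Direction of each step: +1 up, -1 down, 0 for a repeated note."""
    return [(step > 0) - (step < 0) for step in intervals(pitches)]

melody = [60, 60, 67, 67, 69, 69, 67]          # C C G G A A G
transposed = [p + 5 for p in melody]           # same melody, higher pitch
same_shape = [60, 60, 64, 64, 65, 65, 62]      # different intervals, same contour
```

Here `intervals(transposed)` equals `intervals(melody)`, so the melody survives a change of pitch, while `same_shape` matches `melody` only in `contour`, which discards interval sizes and keeps the up/down pattern.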


Melody Invariance

A melody can typically be recognized over changes in pitch, loudness, timbre, tempo, spatial location, and reverberations.

Contours are typically recalled better than actual melodies (intervals) for unfamiliar tunes (Massaro, Kallman, and Kelly 1980).

(Daniel J. Levitin, Memory for Musical Attributes, in Music Cognition and Computerized Sound, ed. P.R. Cook, MIT Press, 1999, pp. 209-227)


Primitive Musical Perception

Distinguish between cognitive components present at an early age and those resulting from acculturation.

Infant: Grasp of musical structures

Adult: Develop cognitive strategies for applying musical structures

(W. Jay Dowling, The Development of Music Perception and Cognition, in The Psychology of Music, Academic Press, 1999, pp. 603-625)


Summary

Innate perceptual organization separates sounds from different sources. Grouping by pitch, contour, rhythm (phrasing), and timbre is exhibited by infants.

Acculturation refines melody distinctions and the relationship of melody to harmonies and rhythms based on cultural scales and patterns.

Melodic memory is enhanced for melodies that follow the notes of a known scale.

Auditory scene analysis operations apply broadly to all sounds (speech, noise, music). Why some auditory streams become pleasurable/stimulating/interesting (music), while others are simply used to form a perception of reality, is still not clear.


How many streams are there?

[Figure: "Tell Me Ma - Spectrogram in dB"; axes: Seconds (0 to 15) and Hertz (1000 to 8000); dB scale 0 to 120]


Interesting Websites

Mind, Music, and Machine
http://www.nici.kun.nl/mmm/

Auditory Scene Analysis
http://www.psych.mcgill.ca/labs/auditory/introASA.html

Joe Wolfe’s Web Page

http://www.phys.unsw.edu.au/~jw/Joe.html