Thesis committee: Roger Dannenberg (Chair) Guy Blelloch Robert Harper


1

Thesis committee:
Roger Dannenberg (Chair)
Guy Blelloch
Robert Harper
Perry Cook, Princeton University

Temporal type constructors for computer music programming

2

Computer music programming

Subdomains:
  Digital signal processing.
  Response to asynchronous events.
  Representations of musical and sonic structure.

Example applications:
  Synthesize audio from a musical score.
  Abstract features from audio; alter features.
  Transform audio to compress it.

3

Analysis of audio

[Figure: audio shown as amplitude (pp to f) against frequency, with the processing steps analyze, abstract, (modify), resynthesize, and render.]

4

The goals

Computer music programming should be
  expressive: programs are clear and concise.
  general: programs fall within the expressive range.

5

The current tradeoff

[Figure: a chart with axes "general" and "expressive", locating unit-generator programming (Csound), low-level programming (C++), and "the promised land".]

6

What we have now

“Unit generator” programming (Csound).
  User configures black-box audio processors.
  Can’t express new DSP or new kinds of data.
  New kind of data: spectral frames, for example.

Low-level programming (C++).
  Cumbersome without a computer music library.
  Libraries don’t support new kinds of data, and don’t give much benefit for new DSP.

7

What do we need?

Write arbitrary DSP in a high-level language.
  No more writing unit generators in C.

Types higher and lower than “audio stream”.
  higher: analysis frames for a new representation.
  lower: access to individual samples for new DSP.

8

My proposal

Temporal type constructors.
  Proposed set: event, vector, infinite vector.
  Enable a pure applicative programming style.

Through temporal type constructors, computer music programming can be both expressive and general.

9

A taste of the results

Chronic is a prototype system using this idea.

The FOF synthesis algorithm can’t be written in Csound.

C implementation is 235 lines, and awkward.

Chronic implementation is 34 lines, and closer to our idea of the algorithm.

10

Outline

Temporal type constructors. Code examples in Chronic. Related work. Chronic internals. Future work. Conclusions.

11

Temporal type constructors

'a event: timestamped event, e.g. as a pair ('a, time).
'a vec: finite vector of 'a, an array of elements.
'a ivec: infinite vector of 'a, a time-indexed stream.
time: integer sample count.
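One way to read the four constructors as O'Caml types. This is a sketch, not Chronic's code: the concrete representations (a pair, an array, a function of time), the helper at, and the example values are all assumptions made for illustration.

```ocaml
(* A sketch of the four constructors; the representations are assumptions. *)
type time = int                 (* integer sample count *)
type 'a event = 'a * time       (* a value paired with its timestamp *)
type 'a vec = 'a array          (* finite vector *)
type 'a ivec = time -> 'a       (* infinite vector: one element per time index *)

(* Composite types are built by nesting the constructors. *)
type audio = float ivec
type spectra = Complex.t vec ivec
type chordseq = float vec event vec   (* pitches as floats: an assumption *)

(* at x t: the value x stamped with time t (hypothetical helper). *)
let at (x : 'a) (t : time) : 'a event = (x, t)

let silence : audio = fun _ -> 0.0

(* Two chords, the second one second later at 44100 samples per second. *)
let two_chords : chordseq =
  [| at [| 60.; 64.; 67. |] 0; at [| 62.; 65.; 69. |] 44100 |]
```

Because every composite type is assembled from the same few constructors, generic functions over vec, ivec, and event apply to all of them.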

12

Digital audio stream

audio = sample ivec
(float might be chosen as the sample type.)

[Figure: an unbounded row of samples, S S S S S ...]

13

Multi-channel audio stream

multi_audio = sample ivec vec

[Figure: several parallel rows of samples S, one unbounded row per channel.]

14

Short-time spectrum data

spectra = complex vec ivec

[Figure: an unbounded sequence of frames, each frame a vector of complex values C.]

15

A chord sequence

chordseq = pitch vec event vec

[Figure: four timestamps (@), each attached to a vector of pitches P.]

16

Musical-keyboard events

MIDI = (pitch * velocity) event ivec

[Figure: an unbounded sequence of timestamps (@), each attached to a pitch and velocity pair, P V.]

17

Gestural musical events

violin = (pitch vec * bowing vec) event ivec

[Figure: an unbounded sequence of timestamps (@), each attached to a vector of pitches P and a vector of bowing values B.]

18

Explicit vs. implicit time

Implicit (Csound): out = -in
  Code runs in a context holding the current time:
    for (t=0; ; ++t) out[t] = -in[t]
  Looped unavoidably: hope it’s what you want.

Explicit (Chronic): out = map (λx. -x) in
  out and in are of type float ivec, i.e. they are explicitly temporal data.
  Explicit model, with map, subsumes implicit.
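The explicit form can be sketched in a few lines. The model below (an ivec as a function from time to a value) and the names map, input, output, and take are assumptions for illustration; Chronic's own ivec is implemented differently.

```ocaml
type time = int
type 'a ivec = time -> 'a            (* assumed model of an infinite vector *)

(* map stands outside of time: it transforms the whole signal at once. *)
let map (f : 'a -> 'b) (x : 'a ivec) : 'b ivec = fun t -> f (x t)

(* Example input: the ramp 1., 2., 3., ... *)
let input : float ivec = fun t -> float_of_int (t + 1)

(* The slide's example: negate every sample. *)
let output : float ivec = map (fun x -> -. x) input

(* Observe the first n elements. *)
let take (n : int) (x : 'a ivec) : 'a list = List.init n x
```

The implicit model's per-sample loop is still there, but it lives inside map rather than around the user's code.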

19

Explicit time

Time information is built into the data.
  Code can stand outside of time, vs. operating within some implicit “now”.

Advantages:
  A strictly more powerful model of time.
  Implicit time can do delay, but can’t do the inverse.
  Types are more tractable than code.
  The FOF example will show how this works.

20

Chronic

Built inside O’Caml as a set of libraries.

core:
  E    event
  V    vec
  IV   ivec
  EV   event vec
  EIV  event ivec

library:
  L    float and other
  LV   vec
  …

21

A couple of IV functions

IV.iterate (fun x -> x +. 0.5) 1.
[| 1.; 1.5; 2.; 2.5; ... |]

(* y = IV.map succ (IV.delay 2 y) *)
IV.delay_rec 2 [| 0; 5 |] (IV.map succ)
[| 0; 5; 1; 6; ... |]
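The two results above can be reproduced with a small stream model. This is a sketch only: the thunk-based ivec type and the definitions of iterate, map, take, and delay_rec are written for illustration and are not Chronic's implementation. In particular, delay_rec here assumes the initial array has exactly d elements.

```ocaml
(* A stream: forcing the thunk yields the head and the rest. *)
type 'a ivec = Ivec of (unit -> 'a * 'a ivec)

let rec iterate (f : 'a -> 'a) (x0 : 'a) : 'a ivec =
  Ivec (fun () -> (x0, iterate f (f x0)))

let rec map (f : 'a -> 'b) (Ivec g : 'a ivec) : 'b ivec =
  Ivec (fun () -> let (x, rest) = g () in (f x, map f rest))

let rec take (n : int) (Ivec g : 'a ivec) : 'a list =
  if n <= 0 then [] else let (x, rest) = g () in x :: take (n - 1) rest

(* delay_rec d init f: the stream y that starts with the d elements of init
   and continues with f y, i.e. y = f (delay d y) with initial contents. *)
let delay_rec (d : int) (init : 'a array) (f : 'a ivec -> 'a ivec) : 'a ivec =
  if d <= 0 || Array.length init <> d then invalid_arg "delay_rec";
  let rec y () = prefix 0
  and prefix i =
    if i < d then Ivec (fun () -> (init.(i), prefix (i + 1)))
    else Ivec (fun () -> let Ivec g = f (y ()) in g ())
  in
  y ()
```

This version recomputes earlier elements on every access, so it is only suitable for looking at short prefixes.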

22

A couple of library functions

let fs = 44100. (* sampling frequency *)

LV.osc_sine 1000 (220./.fs) 0.25
(* 1000 samples of 220-Hz cosine *)

LIV.para_eq (1200./.fs) 12. 0.5 x
(* filter x to boost a 0.5-octave band around 1200 Hz by 12 dB *)
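A guess at what osc_sine computes, consistent with the comment above: a phase of 0.25 cycles turns the sine into a cosine. The argument order follows the slide; the units (frequency in cycles per sample, phase in cycles) are assumptions, and this is not the library's source.

```ocaml
let pi = 4.0 *. atan 1.0

(* osc_sine n freq phase: n samples of sin (2*pi*(phase + freq*i)).
   freq in cycles per sample and phase in cycles are both assumptions. *)
let osc_sine (n : int) (freq : float) (phase : float) : float array =
  Array.init n (fun i -> sin (2.0 *. pi *. (phase +. freq *. float_of_int i)))

let fs = 44100.  (* sampling frequency *)

(* 1000 samples of a 220 Hz cosine *)
let tone = osc_sine 1000 (220. /. fs) 0.25
```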

23

Examples built with Chronic

FOF synthesis. Computer-music scores. Two reverberators. An FFT-based pitch shifter.

24

FOF synthesis

Makes a sound with a peak in its spectrum.

[Figure: spectrum plots of level against frequency, with the pitch and the peak marked.]

25

The FOF waveform

Series of enveloped sine-wave ‘grains’.

[Figure: the FOF waveform, annotated with the intervals 1 / Fpitch and 1 / Fpeak.]

26

A fitting data type

grains: float vec event ivec

[Figure: an unbounded sequence of timestamps (@), each attached to a vector of floats F, one vector per grain.]

27

Skeleton of FOF code

let fof (f_pitch: float ivec) phase0 f_peak bandwidth db
        rise fall dur (risefalltab: float vec) =
  let grain_times = LIV.phasor_wrap f_pitch phase0 in   (* float ivec *)
  let fgrain t =                                        (* float -> float vec event *)
    ... (* miracle occurs *) in
    grain @@ (int t)
  in
  let grains = IV.map fgrain grain_times                (* float vec event ivec *)
  in EIV.vfold (+.) 0. grains                           (* float ivec *)

28

The missing piece

let fgrain t =
  let sine = LV.osc_sine dur f_peak (~-.(frac t) *. f_peak) in
  let kenv = exp(~-.pi*.bandwidth) in
  let env = V.iterate (fun x -> kenv *. x) 1.0 dur in
  let smooth_ph = EV.pwl_list 0.0 [(rise, 1.); (dur-1-fall, 1.); (dur-1, 0.)] in
  let smooth = LV.tablei risefalltab smooth_ph in
  let ampl = L.db_to_amp db in
  let grain = V.map3 (fun x y z -> ampl *. x*.y*.z) sine env smooth
  in grain @@ (int t)

29

FOF in Chronic vs. in C

What I showed you was slightly simplified.
  Less time-varying control, no “octaviation”.
  This was 19 lines; full FOF is 34.

Csound’s FOF in C is 235 lines.
  More importantly, it’s unintuitive.

30

FOF in C: what goes wrong?

#include "cs.h"                 /* UGENS7.C */
#include "ugens7.h"
#include <math.h>

/* loosely based on code of Michael Clarke, University of Huddersfield */

#define FZERO (0.0f)
#define FONE  (1.0f)

static int newpulse(FOFS *, OVRLAP *, float *, float *, float *);

void fofset0(FOFS *p, int flag)
{
    if ((p->ftp1 = ftfind(p->ifna)) != NULL &&
        (p->ftp2 = ftfind(p->ifnb)) != NULL) {
        OVRLAP *ovp, *nxtovp;
        long olaps;
        p->durtogo = (long)(*p->itotdur * esr);
        if (*p->iphs == FZERO)                  /* if fundphs zero, */
            p->fundphs = MAXLEN;                /* trigger new FOF */
        else
            p->fundphs = (long)(*p->iphs * fmaxlen) & PMASK;
        if ((olaps = (long)*p->iolaps) <= 0) {
            initerror("illegal value for iolaps");
            return;
        }
        auxalloc((long)olaps * sizeof(OVRLAP), &p->auxch);
        ovp = &p->basovrlap;
        nxtovp = (OVRLAP *) p->auxch.auxp;
        do {
            ovp->nxtact = NULL;
            ovp->nxtfree = nxtovp;              /* link the ovlap spaces */
            ovp = nxtovp++;
        } while (--olaps);
        ovp->nxtact = NULL;
        ovp->nxtfree = NULL;
        p->fofcount = -1;
        p->prvband = FZERO;
        p->expamp = FONE;
        p->prvsmps = 0;
        p->preamp = FONE;
        p->xincod = (p->XINCODE & 0x7) ? 1 : 0;
        p->ampcod = (p->XINCODE & 0x2) ? 1 : 0;
        p->fundcod = (p->XINCODE & 0x1) ? 1 : 0;
        p->formcod = (p->XINCODE & 0x4) ? 1 : 0;
        if (flag)
            p->fmtmod = (*p->ifmode == FZERO) ? 0 : 1;
    }
    p->foftype = flag;
}

void fofset(FOFS *p)
{
    fofset0(p, 1);
}

void fofset2(FOFS *p)
{
    fofset0(p, 0);
}

void fof(FOFS *p)
{
    OVRLAP *ovp;
    FUNC *ftp1, *ftp2;
    float *ar, *amp, *fund, *form;
    long nsmps = ksmps, fund_inc, form_inc;
    float v1, fract, *ftab;

    if (p->auxch.auxp==NULL) {                  /* RWD fix */
        initerror("fof: not initialized");
        return;
    }
    ar = p->ar;
    amp = p->xamp;
    fund = p->xfund;
    form = p->xform;
    ftp1 = p->ftp1;
    ftp2 = p->ftp2;
    fund_inc = (long)(*fund * sicvt);
    form_inc = (long)(*form * sicvt);
    do {
        if (p->fundphs & MAXLEN) {              /* if phs has wrapped */
            p->fundphs &= PMASK;
            if ((ovp = p->basovrlap.nxtfree) == NULL)
                perferror("FOF needs more overlaps");
            if (newpulse(p, ovp, amp, fund, form)) {    /* init new fof */
                ovp->nxtact = p->basovrlap.nxtact;      /* & link into */
                p->basovrlap.nxtact = ovp;              /* actlist */
                p->basovrlap.nxtfree = ovp->nxtfree;
            }
        }
        *ar = FZERO;
        ovp = &p->basovrlap;
        while (ovp->nxtact != NULL) {           /* perform cur actlist: */
            float result;
            OVRLAP *prvact = ovp;
            ovp = ovp->nxtact;                  /* formant waveform */
            fract = PFRAC1(ovp->formphs);       /* from JMC Fog*/
            ftab = ftp1->ftable + (ovp->formphs >> ftp1->lobits);/*JMC Fog*/
/*          printf("\n ovp->formphs = %ld, ", ovp->formphs); */ /* TEMP JMC*/
            v1 = *ftab++;                       /*JMC Fog*/
            result = v1 + (*ftab - v1) * fract; /*JMC Fog*/
/*          result = *(ftp1->ftable + (ovp->formphs >> ftp1->lobits) ); */
            if (p->foftype) {
                if (p->fmtmod)
                    ovp->formphs += form_inc;   /* inc phs on mode */
                else
                    ovp->formphs += ovp->forminc;
            }
            else {
#define kgliss ifmode
                /* float ovp->glissbas = kgliss / grain length. ovp->sampct is
                   incremented each sample. We add glissbas * sampct to the
                   pitch of grain at each a-rate pass (ovp->formphs is the
                   index into ifna; ovp->forminc is the stepping factor that
                   decides pitch) */
                ovp->formphs += (long)(ovp->forminc + ovp->glissbas * ovp->sampct++);
            }
            ovp->formphs &= PMASK;
            if (ovp->risphs < MAXLEN) {         /* formant ris envlp */
                result *= *(ftp2->ftable + (ovp->risphs >> ftp2->lobits) );
                ovp->risphs += ovp->risinc;
            }
            if (ovp->timrem <= ovp->dectim) {   /* formant dec envlp */
                result *= *(ftp2->ftable + (ovp->decphs >> ftp2->lobits) );
                if ((ovp->decphs -= ovp->decinc) < 0)
                    ovp->decphs = 0;
            }
            *ar += (result * ovp->curamp);      /* add wavfrm to out */
            if (--ovp->timrem)                  /* if fof not expird */
                ovp->curamp *= ovp->expamp;     /* apply bw exp dec */
            else {
                prvact->nxtact = ovp->nxtact;   /* else rm frm activ */
                ovp->nxtfree = p->basovrlap.nxtfree;/* & ret spc to free */
                p->basovrlap.nxtfree = ovp;
                ovp = prvact;
            }
        }
        p->fundphs += fund_inc;
        if (p->xincod) {
            if (p->ampcod) amp++;
            if (p->fundcod) fund_inc = (long)(*++fund * sicvt);
            if (p->formcod) form_inc = (long)(*++form * sicvt);
        }
        p->durtogo--;
        ar++;
    } while (--nsmps);
}

static int newpulse(FOFS *p, OVRLAP *ovp, float *amp, float *fund, float *form)
{
    float octamp = *amp, oct;
    long rismps, newexp = 0;

    if ((ovp->timrem = (long)(*p->kdur * esr)) > p->durtogo)    /* ringtime */
        return(0);
    if ((oct = *p->koct) > FZERO) {             /* octaviation */
        long ioct = (long)oct, bitpat = ~(-1L << ioct);
        if (bitpat & ++p->fofcount)
            return(0);
        if ((bitpat += 1) & p->fofcount)
            octamp *= (FONE + ioct - oct);
    }
    if (*fund == FZERO)                         /* formant phs */
        ovp->formphs = 0;
    else
        ovp->formphs = (long)(p->fundphs * *form / *fund) & PMASK;
    ovp->forminc = (long)(*form * sicvt);
    if (*p->kband != p->prvband) {              /* bw: exp dec */
        p->prvband = *p->kband;
        p->expamp = (float)exp((double)(*p->kband * mpidsr));
        newexp = 1;
    }

    /* Init grain rise ftable phase. Negative kform values make the kris
       (ifnb) initial index go negative and crash csound. So insert another
       if-test with compensating code. */
    if (*p->kris >= onedsr && *form != FZERO) { /* init fnb ris */
        if (*form < FZERO && ovp->formphs != 0)
            ovp->risphs = (long)((MAXLEN - ovp->formphs) / -*form / *p->kris);
        else
            ovp->risphs = (long)(ovp->formphs / *form / *p->kris);
        ovp->risinc = (long)(sicvt / *p->kris);
        rismps = MAXLEN / ovp->risinc;
    }
    else {
        ovp->risphs = MAXLEN;
        rismps = 0;
    }
    if (newexp || rismps != p->prvsmps) {       /* if new params */
        if (p->prvsmps = rismps)                /* redo preamp */
            p->preamp = (float)pow(p->expamp, -rismps);
        else
            p->preamp = FONE;
    }
    ovp->curamp = octamp * p->preamp;           /* set startamp */
    ovp->expamp = p->expamp;
    if ((ovp->dectim = (long)(*p->kdec * esr)) > 0)     /* fnb dec */
        ovp->decinc = (long)(sicvt / *p->kdec);
    ovp->decphs = PMASK;
    if (!p->foftype) {
        /* Make fof take k-rate phase increment:
           Add current iphs to initial form phase */
        ovp->formphs += (long)(*p->iphs * fmaxlen);     /* krate phs */
        ovp->formphs &= PMASK;
        /* Set up grain gliss increment: ovp->glissbas will be added to
           ovp->forminc at each pass in fof2. Thus glissbas must be equal to
           kgliss / grain playing time. Also make it harmonic, so integer
           kgliss can represent octaves (ie pow() call). */
        ovp->glissbas = ovp->forminc * (float)pow(2.0, (double)*p->kgliss);
        /* glissbas should be diff of start & end pitch*/
        ovp->glissbas -= ovp->forminc;
        ovp->glissbas /= ovp->timrem;
        ovp->sampct = 0;   /* Must be reset in case ovp was used before */
    }
    return(1);
}

static int rngflg=0;

Can’t represent grains.
Can’t stand outside of time.
  Has to loop over output samples, and think
  “What is the set of active grains right now? Are some dying? Are new ones born? Which envelopes are in their rise phase? entering fall phase? …”

You don’t want to think that way about FOF.
  Want to loop over grains, not samples.
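A stripped-down illustration of looping over grains rather than samples. This is not the Chronic FOF code: it assumes a fixed grain period, omits the rise/fall smoothing, the fractional grain phase, and all time-varying control, and the names grain and simple_fof are hypothetical. The envelope follows the slides (each sample is the previous one times exp(-pi * bandwidth)).

```ocaml
let pi = 4.0 *. atan 1.0

(* One grain: an exponentially decaying sine of dur samples.
   f_peak and bandwidth are in cycles per sample (an assumption). *)
let grain ~dur ~f_peak ~bandwidth : float array =
  Array.init dur (fun i ->
      let t = float_of_int i in
      exp (-. pi *. bandwidth *. t) *. sin (2.0 *. pi *. f_peak *. t))

(* Sum grains into n output samples, starting one grain every period samples.
   The outer loop runs over grains; no list of "active" grains is maintained. *)
let simple_fof ~n ~period ~dur ~f_peak ~bandwidth : float array =
  if period <= 0 then invalid_arg "simple_fof";
  let out = Array.make n 0.0 in
  let g = grain ~dur ~f_peak ~bandwidth in
  let onset = ref 0 in
  while !onset < n do
    Array.iteri
      (fun i x ->
        if !onset + i < n then out.(!onset + i) <- out.(!onset + i) +. x)
      g;
    onset := !onset + period
  done;
  out
```

The bookkeeping that dominates the C version (free lists, per-grain phase counters, expiry tests) reduces to the single overlap-add inside the loop.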

31

Computer music scores

Construct a score, and synthesize from it:
  type note = float * (float vec) (* dB, Hz *)
  score: note event vec
  synth_beep: note -> float vec
  let sound = EV.vfold (+.) 0. (V.map (E.lift synth_beep) score)

Hierarchical structure.
  type 'a element = Note of 'a | Riff of 'a element event vec

Measure event timestamps in fractional beats.
  Tempo-map from beats to samples.
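A small stand-alone sketch of the same pipeline. Assumptions: events are (value, time) pairs, lift and mix are guesses at the meaning of E.lift and EV.vfold (+.) 0., and the note type is simplified to (amplitude, duration in samples) so that synth_beep fits in one line. flatten shows one way a hierarchical score could be reduced to a flat list of events; it is not taken from Chronic.

```ocaml
type time = int
type 'a event = 'a * time

(* lift f: apply f to an event's payload and keep its timestamp. *)
let lift (f : 'a -> 'b) ((x, t) : 'a event) : 'b event = (f x, t)

(* Overlap-add float vectors at their (non-negative) timestamps. *)
let mix (evs : float array event array) : float array =
  let n = Array.fold_left (fun m (v, t) -> max m (t + Array.length v)) 0 evs in
  let out = Array.make n 0.0 in
  Array.iter
    (fun (v, t) -> Array.iteri (fun i x -> out.(t + i) <- out.(t + i) +. x) v)
    evs;
  out

(* Hypothetical note: (amplitude, duration in samples), rendered as a constant. *)
let synth_beep ((amp, dur) : float * int) : float array = Array.make dur amp

let score = [| ((1.0, 3), 0); ((0.5, 2), 2) |]
let sound = mix (Array.map (lift synth_beep) score)

(* Hierarchical scores: flatten nested riffs by accumulating time offsets. *)
type 'a element = Note of 'a | Riff of 'a element event array

let rec flatten (offset : time) (e : 'a element) : 'a event list =
  match e with
  | Note x -> [ (x, offset) ]
  | Riff parts ->
      List.concat_map
        (fun (sub, t) -> flatten (offset + t) sub)
        (Array.to_list parts)
```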

32

The components of a pitch shifter

[Figure: the pitch shifter as a pipeline from float ivec to float ivec. Stages: overlapped FFT, correct frequencies, rescale frequencies, compute spectrum, overlapped IFFT. The intermediate types are complex vec ivec and (float * float) vec ivec. Groupings drawn over the pipeline: pitch shifter (float ivec to float ivec); sinusoidal analyzer (float ivec to (float * float) vec ivec); spectral modifier (float ivec to float ivec, parameterized by f: complex vec ivec -> complex vec ivec).]

33

Related work

Languages with temporal type constructors.
Languages with atomic signals and events.
  Events with explicit time.
  Events in implicit time.
  Events not first-class.
Languages with signals only.
Languages with events only.

34

Fran

Elliott and Hudak, 1997. “Functional reactive animation”

Used to define objects’ trajectories, etc.
  Animation, not video: no frames or pixels.
'a Behavior is Time -> 'a.
'a Event is time-sorted stream of Time * 'a.
Time is continuous.

35

Continuous versus discrete time

Animation is continuous change.
DSP is discrete.
  Digital filters are based on unit delays.
  The FFT relies on discrete time and frequency.
  A delay line can’t hold a continuous-time signal.

So “delay x by 1” is λt. (x (t-1)).
  Feedback delay involves x (t-1), x (t-2), x (t-3), …

Two different ways of programming.
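The function-of-time reading of delay can be written down directly. The sketch below uses integer time so that it runs; the ivec model and the names delay, feedback, and impulse are assumptions for illustration. It shows why feedback is awkward in this style: each lookup of y at time t recurses through y at t-d, t-2d, and so on.

```ocaml
type time = int
type 'a ivec = time -> 'a     (* a signal as a function of time: assumed model *)

(* "delay x by d": the value at time t is the input's value at t - d;
   before the input has arrived, use init. *)
let delay (d : int) (init : 'a) (x : 'a ivec) : 'a ivec =
  fun t -> if t < d then init else x (t - d)

(* Feedback y = x + g * (y delayed by d), for d >= 1. Evaluating y t walks
   back through the whole past of the signal. *)
let rec feedback (g : float) (d : int) (x : float ivec) : float ivec =
  fun t -> if t < 0 then 0.0 else x t +. g *. feedback g d x (t - d)

let impulse : float ivec = fun t -> if t = 0 then 1.0 else 0.0
```

A delay line avoids this regress by storing past outputs, which is only possible when time is discrete.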

36

ALDiSP

Freericks, 1996. For digital signal processing.

stream: demand-driven (like ivec).
pipe: producer-driven.
  A pipe is a channel for asynchronous events.
  Event timing is implicit.
Representing temporal data is not the goal.

37

Signals and events

Atomicity of signals precludes general DSP.
Some languages have events with explicit time:
  Arctic (Dannenberg et al., 1986): applicative programming for reactive systems.
  SuperCollider (McCartney, 1996): scores are lazy lists of particular events.
Some have events in implicit time.
Or events not first-class: score sublanguage.

38

Inside Chronic

Everything besides ivecs is pretty easy.

The properties of a good ivec.
Chronic’s ivec implementation.
Phases: building and computation.
A little benchmark on static dataflow.

39

Desirable properties of an ivec

Correct asymptotic space and time use.
Block computation.
Consumer control of block length.
Efficient fan-out to multiple consumers.
In-place update.

40

Chronic’s ivec implementation

An ivec is a reference to an ivec_dat.
An ivec_dat is an object.
  Has method compute (upto: time) -> unit
  Writes output into a shared buffer.

41

The building phase

let evens = IV.iterate (fun x -> x+2) 0
let powtwo = IV.iterate (fun x -> x*2) 1
let powfour = IV.peekiv evens powtwo (* index into powtwo by evens *)

[Figure: the objects built by this code. evens: an iterate_dat with f: λx. x+2 and x0: 0. powtwo: an iterate_dat with f: λx. x*2 and x0: 1. powfour: a peekiv_dat with fields t and x.]

42

The computation phase

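Demand-driven dataflow:

[Figure: a request for elements [0], [1] arrives at peekiv. peekiv asks the evens iterate (f: λx. x+2, x0: 0) for [0], [1] and receives 0, 2. It then asks the powtwo iterate (f: λx. x*2, x0: 1) for [0], [2] and receives 1, 4, which it returns. For reference, evens is [0, 2, 4, 6, …] and powtwo is [1, 2, 4, 8, …].]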
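The same demand-driven behavior can be imitated with plain thunk-based streams. This is a sketch for reading the example, not Chronic's mechanism: the ivec type, nth, take, and this definition of peekiv are assumptions, and the version here re-walks its inputs on every access instead of computing blocks into buffers.

```ocaml
type 'a ivec = Ivec of (unit -> 'a * 'a ivec)

let rec iterate f x0 = Ivec (fun () -> (x0, iterate f (f x0)))

(* nth x i: walk i steps down the stream; nothing is computed until asked for. *)
let rec nth (Ivec g) i =
  let (x, rest) = g () in
  if i = 0 then x else nth rest (i - 1)

let rec take n (Ivec g) =
  if n <= 0 then [] else let (x, rest) = g () in x :: take (n - 1) rest

(* peekiv idx x: element t of the result is element idx[t] of x. *)
let peekiv idx x =
  let rec from t = Ivec (fun () -> (nth x (nth idx t), from (t + 1))) in
  from 0

let evens = iterate (fun x -> x + 2) 0
let powtwo = iterate (fun x -> x * 2) 1
let powfour = peekiv evens powtwo
```

Asking powfour for two elements pulls elements 0 and 1 of evens, then elements 0 and 2 of powtwo, as in the figure.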

43

Function calls are expensive

function call: IV.map2 (+.) x y

inlined: IV.add2 x y

C++ inlined: for (i=0; i<len; ++i) z[i] = x[i]+y[i];

Relative times for optimal block length (256):
  map2  9.3
  add2  1.0
  C++   0.36

44

Future work

Comprehensions. Sampling rates. Arbitrary feedback delay. Lazy vectors. Real-time?

45

vec and ivec comprehensions

Instead of
  IV.map2 (fun xi yi -> xi + 2*yi) x y,
write
  { xi + 2*yi: xi in x, yi in y }
or just
  { x + 2*y }

More readable.
Can generate specialized code.
Accomplish with camlp4 preprocessor? or with C++ template tricks?

46

Signals with sampling rates

Now you can use signals of differing rates, but you get no checking of rate mismatches.
  Audio signal: 44100 Hz.
  Control signal: 1000 Hz.
Incorporate sampling rate into sig, isig types.
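One possible way to carry a sampling rate in a type is a phantom type parameter. The sketch below is an illustration of that general technique, not a design from the thesis: the rate tags, the signal type, and the functions audio, control, add, hold_to_audio, and samples are all hypothetical, and signals are finite arrays to keep it short.

```ocaml
(* Rate tags: types with no values, used only as phantom parameters. *)
type hz44100
type hz1000

(* A finite signal whose type records its sampling rate. *)
type 'rate signal = Signal of float array

let audio (a : float array) : hz44100 signal = Signal a
let control (a : float array) : hz1000 signal = Signal a

(* Addition requires both operands to carry the same rate tag, so
   add (audio a) (control c) is rejected by the type checker. *)
let add (Signal a : 'r signal) (Signal b : 'r signal) : 'r signal =
  Signal (Array.map2 ( +. ) a b)

(* Explicit conversion: sample-and-hold a non-empty control signal
   to produce n audio-rate samples. *)
let hold_to_audio (n : int) (Signal c : hz1000 signal) : hz44100 signal =
  let last = Array.length c - 1 in
  Signal (Array.init n (fun i -> c.(min last (i * 1000 / 44100))))

let samples (Signal a : 'r signal) : float array = a
```

With this encoding a rate mismatch becomes a compile-time error, and mixing rates requires an explicit conversion such as hold_to_audio.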

47

Conclusions

Unify computer-music sublanguages.

Think and program outside of time.

If we construct types, we can take them apart.

48

Unify sublanguages

Csound has three separate languages: event placement, signal routing, and DSP (“score”, “orchestra”, and C).

The divisions cut across useful interaction.

Nyquist unifies the event and routing levels.

Chronic unifies all three.

49

Stand outside of time

Program in time: logical time advances as the program runs.
  An event’s time is “now”.
Out of time: all time is explicitly in the data.
  The program’s execution is atemporal.
  Allows vfold (in FOF code) to be factored out.
Out-of-time is often the way we think about an algorithm.

50

Deconstructing constructed types

Traditional computer music languages make the audio signal an atomic type, a black box.
  Then there is no notion of an audio sample.
Other types: spectra, LPC frames, …?
  Add them as more atomic types?
  Add corresponding suite of unit generators, too.
A constructed type is no longer a black box.

51

Questions we can now address

How can computer music languages support writing new DSP and new representations?

How can libraries for low-level languages support new DSP and new representations?

How can we build better tools for researching music and DSP algorithms?

52

Summary

Temporal type constructors lead to a better way of doing computer music programming.

53

54

Synthesis from a score

let motif = λbase bend. [Figure: a pitch contour shaped by base and bend.]
let bleep = λpitch. [Figure: an instrument built from osc and filter, driven by pitch.]

let shorten = λx. timescale 0.5 x
let double = sequence [motif, motif]
let zeno = sequence (iterate 10 shorten motif)
let score = [Figure: an arrangement of zeno and double elements.]

let audio = apply bleep score

55

What’s wrong?

Csound aims to be expressive, high-level: audio signal, not audio sample, is atomic.

Csound types go no higher and no lower.
  Higher: stream of frames of analysis data.
    Can’t express a new analysis system.
  Lower: access to individual samples.
    Can’t express new DSP.

56

Music languages (Csound)

New DSP generally cannot be expressed.
  No access to individual elements of audio data.
  Recursive delay is restricted.
Code is really scalar, mapped over time.
  Time is factored out, unavailable.
Can’t construct new types.

57

Low-level languages (C++)

It is hard to write a good support library.
  Most assume all data is synchronous signals.
  Infinite data is awkward.
Libraries don’t help new DSP code much.
  Fine-grained primitives are hard to identify.

58

Implementations

Chronic is a prototype implementation of this style of programming.

One possible future use: a framework for
  developing computer music algorithms.
  analyzing and manipulating sonic data.
Similar niche to Matlab.

59

Something implicit time can’t do

The implicit model can represent delay:
  out = temp; temp = in

It cannot represent the inverse operation.
  Undesired delay leaks out, breaking modularity.

Explicit time supports the inverse:
  out = drop 1 in
  drop 2 [1 0 6 6] = [6 6]

Explicit time, with map, subsumes implicit.
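A sketch of drop as the inverse of delay, under the assumption that an ivec is modeled as a function of time. The names delay, drop, take, and drop_list are written for this illustration; drop_list reproduces the finite example from the slide.

```ocaml
type time = int
type 'a ivec = time -> 'a     (* assumed model: a signal as a function of time *)

(* delay d init x: d copies of init, then x. *)
let delay (d : int) (init : 'a) (x : 'a ivec) : 'a ivec =
  fun t -> if t < d then init else x (t - d)

(* drop d x: discard the first d elements, shifting the rest earlier. *)
let drop (d : int) (x : 'a ivec) : 'a ivec = fun t -> x (t + d)

let take (n : int) (x : 'a ivec) : 'a list = List.init n x

(* The same operation on a finite list, matching the slide's example. *)
let rec drop_list (d : int) (l : 'a list) : 'a list =
  match l with
  | _ :: tl when d > 0 -> drop_list (d - 1) tl
  | _ -> l
```

Because the whole signal is a value, drop can look ahead in it; code that only ever sees the current sample has no way to do so.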

60

Why this matters

A DSP operation may add undesired delay.
  In explicit time, this can be removed.
  In implicit time, the delay leaks out.
    Must delay other signals to keep them aligned.
    A signal’s delayedness is not part of its type.

61

A couple of EIV functions

EIV.pwl 3. [| 4.@@2; 1.@@5 |]
[| 3.; 3.5; 4.; 3.; 2.; 1.; ... |]
(at times 0 1 2 3 4 5)

EIV.vfold (+) 0 [| [| 1; 1; 1; 1 |] @@ 2; [| 2; 2; 2; 2 |] @@ 4 |]
[| 0; 0; 1; 1; 3; 3; 2; 2; ... |]
(at times 0 1 2 3 4 5 6 7)

[Figure: the vectors [ 1 1 1 1 ] and [ 2 2 2 2 ] placed at times 2 and 4.]
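Finite versions of the two functions, written to reproduce the outputs above. These are sketches under stated assumptions: events are plain (value, time) pairs, the result is truncated to n samples instead of being an infinite vector, breakpoint times are assumed to increase, and holding the last value after the final breakpoint is a guess at what follows the "...".

```ocaml
(* pwl x0 bps n: the first n samples of a piecewise-linear envelope that starts
   at x0 at time 0 and passes through each breakpoint (value, time) in turn. *)
let pwl (x0 : float) (bps : (float * int) array) (n : int) : float array =
  let out = Array.make n x0 in
  let v0 = ref x0 and t0 = ref 0 in
  Array.iter
    (fun (v1, t1) ->
      for i = !t0 to min t1 (n - 1) do
        let frac =
          if t1 = !t0 then 1.0
          else float_of_int (i - !t0) /. float_of_int (t1 - !t0)
        in
        out.(i) <- !v0 +. frac *. (v1 -. !v0)
      done;
      v0 := v1;
      t0 := t1)
    bps;
  (* assumption: hold the last breakpoint value from then on *)
  for i = !t0 to n - 1 do out.(i) <- !v0 done;
  out

(* vfold f zero events n: combine each timestamped vector into the output
   with f, element by element, starting at the event's time. *)
let vfold (f : 'a -> 'a -> 'a) (zero : 'a) (events : ('a array * int) array)
    (n : int) : 'a array =
  let out = Array.make n zero in
  Array.iter
    (fun (v, t) ->
      Array.iteri
        (fun i x ->
          let k = t + i in
          if k >= 0 && k < n then out.(k) <- f out.(k) x)
        v)
    events;
  out
```

With f = (+) and zero = 0, vfold is the overlap-add used at the end of the FOF skeleton.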

62

Two reverberators

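Based on feedback-delay structures.
  Moorer: filtered comb.
  Gardner: nested allpass.
Feedback delay, y = delay (f y):

[Figure: block diagrams of a feedback loop with input x, output y, delay D, and gain g, and of the general form with feedback function f, where (f y) is the delayed signal.]

let f x y = IV.map2 (+.) x (IV.map (fun y -> g *. y) y)
let echo length x = IV.delayz_rec2 length f x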
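What the echo amounts to, spelled out on a finite array. This assumes that IV.delayz_rec2 length f x denotes the signal y with y = delay length (f x y), zero-filled for the first length samples; that reading, and the function below, are a sketch rather than the library's definition.

```ocaml
(* echo ~g ~length x: y.(t) = x.(t - length) +. g *. y.(t - length),
   with y.(t) = 0 for t < length. *)
let echo ~(g : float) ~(length : int) (x : float array) : float array =
  if length <= 0 then invalid_arg "echo";
  let n = Array.length x in
  let y = Array.make n 0.0 in
  for t = length to n - 1 do
    y.(t) <- x.(t - length) +. g *. y.(t - length)
  done;
  y
```

Fed an impulse, the output repeats every length samples, scaled by g each time. The array y plays the role of the delay queue that a low-level implementation would have to manage by hand.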

63

A complication

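Can’t access the inside of a feedback delay.

[Figure: a reverberator flow graph from X to Y with sections N1 and N2, a lowpass filter, feedback gain g, two 0.5 gains, and delay D. The feedback part is expressed with IV.delayz_rec2, and N1 and N2 appear a second time outside it.]

Kludge: duplicate part of it instead.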

64

Feedback delay: a comparison

In low-level languages: you have to maintain grungy delay queues.
In computer music languages: you often can’t represent feedback delay.
In Chronic: high-level representation of feedback loops, but not arbitrary flow graphs.

65

Why not just a stream?

type 'a ivec = Ivec of (unit -> 'a * 'a ivec)

Has

66

An ivec is an ivec_dat ref

class ['a] ivec_dat                      (mutable: in-place)
  method get_buf () -> 'a vec            (fan-out from buf)
  method compute (upto: time) -> unit    (control of block length)
    (* side effect: writes to buf *)
  method seek (upto: time)

type 'a ivec = 'a ivec_dat ref

compute 10; use buf.(0) to buf.(9);
compute 20; use buf.(0) to buf.(9); …

67

Subclassing ivec_dat

class ['a, 'b] map_dat (f: 'a -> 'b) (x: 'a ivec) =
  object
    inherit ['b] ivec_dat
    method compute_hook
      (* call !x#compute; use !x#get_buf (); write to buf *)
  end

let map (f: 'a -> 'b) x =
  ref ((new map_dat f x) :> ('b ivec_dat))
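A runnable miniature of this object design, to make the block protocol concrete. It is a sketch under several assumptions and differs from Chronic: the hook is called compute_block and returns a fresh array (Chronic writes in place into a shared buffer), seek is omitted, and fan-out only works here if every consumer requests the same block boundaries.

```ocaml
type time = int

(* Base class: holds the most recent block and the time reached so far. *)
class virtual ['a] ivec_dat =
  object (self)
    val mutable buf : 'a array = [||]
    val mutable pos : time = 0
    method get_buf () : 'a array = buf
    (* compute_block start n: produce the n samples beginning at time start *)
    method virtual compute_block : time -> int -> 'a array
    (* compute upto: advance to time upto; a repeated request is a no-op,
       so several consumers can read the same block *)
    method compute (upto : time) : unit =
      if upto > pos then begin
        buf <- self#compute_block pos (upto - pos);
        pos <- upto
      end
  end

type 'a ivec = 'a ivec_dat ref

class ['a] iterate_dat (f : 'a -> 'a) (x0 : 'a) =
  object
    inherit ['a] ivec_dat
    val mutable state = x0
    method compute_block _start n =
      Array.init n (fun _ -> let x = state in state <- f state; x)
  end

class ['a, 'b] map_dat (f : 'a -> 'b) (x : 'a ivec) =
  object
    inherit ['b] ivec_dat
    method compute_block start n =
      !x#compute (start + n);          (* demand the input block *)
      Array.map f (!x#get_buf ())
  end

let iterate f x0 : _ ivec = ref (new iterate_dat f x0 : _ ivec_dat)
let map f x : _ ivec = ref (new map_dat f x : _ ivec_dat)
```

As on the previous slide, calling compute 4 and then compute 6 on a map over an iterate leaves the first four results, and then the next two, in the buffer, indexed from 0 each time.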

68

69

The components of a pitch shifter

[Figure: the pipeline from float ivec to float ivec. Stages: overlapped FFT, correct frequencies, rescale frequencies, compute spectrum, overlapped IFFT. The intermediate types are complex vec ivec and (float * float) vec ivec.]

70

[Figure: the whole pipeline grouped as a pitch shifter, from float ivec to float ivec.]

71

[Figure: the first stages grouped as a sinusoidal analyzer, from float ivec to (float * float) vec ivec.]

72

[Figure: the pipeline regrouped as a spectral modifier, from float ivec to float ivec, with the inner stages replaced by a parameter f: complex vec ivec -> complex vec ivec.]

73

Reusing the components

Sinusoidal analyzer: overlapped FFT, then correct frequencies.
  output: (float * float) vec ivec

Spectral manipulator: overlapped FFT, then apply function f, then overlapped IFFT.
  f: complex vec ivec -> complex vec ivec
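The reuse can be expressed in types alone. The functor below is a sketch: the stage names are taken from the diagram, but their signatures as O'Caml functions, the ratio parameter of rescale_frequencies, and the module names STAGES and Make are assumptions. No stage is implemented here; the point is only that the three tools are compositions of the same five stages.

```ocaml
(* The five stages, abstracted over how vec and ivec are represented. *)
module type STAGES = sig
  type 'a vec
  type 'a ivec
  val overlapped_fft : float ivec -> Complex.t vec ivec
  val correct_frequencies : Complex.t vec ivec -> (float * float) vec ivec
  val rescale_frequencies :
    float -> (float * float) vec ivec -> (float * float) vec ivec
  val compute_spectrum : (float * float) vec ivec -> Complex.t vec ivec
  val overlapped_ifft : Complex.t vec ivec -> float ivec
end

module Make (S : STAGES) = struct
  open S

  (* float ivec -> (float * float) vec ivec *)
  let sinusoidal_analyzer x = correct_frequencies (overlapped_fft x)

  (* (complex vec ivec -> complex vec ivec) -> float ivec -> float ivec *)
  let spectral_modifier f x = overlapped_ifft (f (overlapped_fft x))

  (* The pitch shifter is a spectral modifier with a particular f. *)
  let pitch_shifter ratio x =
    spectral_modifier
      (fun s ->
        compute_spectrum (rescale_frequencies ratio (correct_frequencies s)))
      x
end
```

Because the intermediate data are ordinary constructed types, a new tool only needs a new function between two existing stages.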
