Introduction to Freesound.org
Gerard Roma
Music Technology Group
UPF
Outline
1. What is Freesound?
2. The freesound community
3. Architecture
4. Searching sounds
5. Freesound API
6. Applications:
1. Sound maps and education
2. Virtual world sonification
3. Web-based music creation
4. Music Hacks
1 What is Freesound?
media sharing
• Media sharing has become a prominent use of the internet
• Users share text, photos, videos, music ... sounds
• “Sounds” have a different, specific purpose: not enjoyed online, but downloaded for audiovisual production or music creation
• Sample libraries previously sold as CDs are now distributed over the internet
foley
• Practice named after Jack Donovan Foley (1891-1967), who worked at Universal Studios
• Sound effects used in cinema, video, games
• Added in post-production to enhance the soundtrack
• With digital technology, existing sound FX recordings can be reused
audio cultures in music
• musique concrète - first uses of sound recordings for music
• plunderphonics - explicit appropriationism as a critique of the copyright system
• soundscape composition - field recordings for preservation and composition
• loops - hip-hop, techno, minimalism ...
• glitch - errors, broken software/hardware
attitudes when working with recordings
• Avoid other people’s sounds
• Extreme transformations
• Collage & citation
• Anti-Copyright activism
• Plagiarism
freesound.org
• Collaborative audio database focused on sound samples
• Started at ICMC 2005
• 210,000+ sounds under CC licenses
• ~4M users
• About 10k contributors
Life of a (free) sound
• Recording
• Upload
• Describe (text, tags)
• Preview is computed (mp3, image)
• Moderation (group of freesound users)
• Downloads, comments, ratings
moderation
• Sounds are moderated mainly for copyright infringement. Many difficult cases:
• Digital synths
• Toys
• Street music
• Also description quality
functionalities
• Sounds can be shared, described, commented, rated, packed
• Forum: users request samples, exchange recording tips, show their work
• Geotagging: put sounds on the map
• Find similar sounds based on analysis of the audio signal
Freesound 2.0
• New version developed by Bram de Jong
• Design by PixelShell
• Available under the GNU Affero GPL
• Freesound 2.0 public API
2.0 Licenses
• CC zero (public domain)
• CC Attribution
• CC Attribution non-commercial
2 The freesound community
Freesound community
“We made a thunder storm track to desensitize our dog. When we got our puppy, it seemed to be somewhat scared of thunder. So we made up a storm track from many Freesound clips. We slowly played the storm track at increasing volumes while we 'acted normally', ignoring the sound, so as to keep the dog calm. It worked! We've had various requests from friends for the storm track, and I've recently had a request for a traffic track for a dog that's afraid of motorbikes.”
- Nik
Nik, Cassie and Wicket (and Richard)
“I am the mother of 2 autistic children, who are both intelligent and go to regular schools. Next week I will be giving a lecture about autism for all the teachers my son has to deal with this year. One of his problems is that he hears all sounds at the same level, so the buzzing of a fly, a zipper, a clicking pen, street workers outside the school, [...] and the teacher's voice all sound equally loud and all compete for the first prize, so to say. I am looking for a short sound fragment which I can use during this lecture that contains most of the noises I mentioned above, or is similar to what I want to demonstrate.”
- Bianca
3 Architecture
Freesound 1
• The first version of freesound was coded by Bram de Jong in PHP over MySQL in 2005
• User management and forum based on phpBB
• This architecture couldn’t scale with the success of the site, due to both performance and maintenance issues
• Also, the university initially capped bandwidth usage (later removed)
Freesound 2
• Rewritten from scratch in Python (Django) + Postgres by Bram + a team of developers and researchers at the MTG
• Better maintenance, scalability and data integrity
• Distributed architecture
• Hosted on VM infrastructure at UPF. Currently maintained by PhD students
• Again reaching bandwidth problems!
FREESOUND2 Components Block Diagram
Licenses
1 - Web front-end, RESTful API
Both the web front-end and the RESTful API are built using several Python modules, listed below:
• django - BSD license
• django-extensions - BSD
• django-debug-toolbar - BSD
• django-piston - BSD license
• markdown - BSD
• pycrypto - Public Domain
• BeautifulSoup - BSD
• functional - PSF
• FeedParser - MIT license (http://www.opensource.org/licenses/mit-license.php)
• scikits.audiolab - LGPL
• Pygments - BSD
• python-cjson - LGPL
• gunicorn - MIT
• numpy - BSD
• networkx - BSD
• gearman (Python API for Gearman) - Apache license
• pyzmq (binding for ZeroMQ, http://www.zeromq.org/) - LGPLv3+ license
• python-memcached (Python API for memcached) - Python license (http://docs.python.org/license.html)
• psycopg2 (Python API for Postgres) - GPL license with exceptions or ZPL (http://pypi.python.org/pypi/psycopg2/2.0.5.1)

Along with Python modules we also use some JavaScript libraries.
Audio processing
• Sounds are processed in worker machines:
• Preview images (waveform, spectrogram) -> wav2png
• Compressed audio previews (mp3, ogg; low quality and high quality)
• Content-based analysis (Essentia)
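The waveform-preview step can be illustrated with a small sketch of the min/max peak extraction a tool like wav2png performs: one (min, max) pair per pixel column of the image. This is an illustration in numpy, not wav2png's actual code, and the function name is made up.

```python
import numpy as np

def waveform_peaks(samples, width):
    """Reduce an audio signal to `width` (min, max) pairs,
    one pair per horizontal pixel of the waveform preview."""
    # Split the signal into `width` roughly equal chunks
    chunks = np.array_split(np.asarray(samples, dtype=float), width)
    return [(float(c.min()), float(c.max())) for c in chunks]

# A one-second 440 Hz tone reduced to a 100-pixel-wide preview
t = np.linspace(0, 1, 44100, endpoint=False)
peaks = waveform_peaks(np.sin(2 * np.pi * 440 * t), 100)
```

A renderer would then draw one vertical line per (min, max) pair.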
Essentia
• Library for audio analysis developed at MTG-UPF during the last 6 years
• Large quantity of algorithms and descriptors:
• Low-level (based on the magnitude spectrum)
• Tonal (chords, key ...)
• Rhythm (tempo, meter ...)
• High-level (live/studio, male/female voice ...)
• SFX (inharmonicity, tristimulus ...)
• Optimized for polyphonic music
Gaia
• Specialized server for indexing documents according to distances in a vector space
• Used for content-based similarity in freesound
• Vectors are formed from Essentia descriptors
• After several years of exclusive licensing, Essentia and Gaia are available under the AGPL
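Gaia's core operation, finding the sounds whose descriptor vectors lie closest to a query in the vector space, can be sketched with plain numpy (an illustration of the idea, not Gaia's implementation; the descriptor values are toy data):

```python
import numpy as np

def nearest_sounds(query, descriptors, k=3):
    """Return indices of the k descriptor vectors closest
    to `query` under Euclidean distance."""
    d = np.linalg.norm(descriptors - query, axis=1)
    return np.argsort(d)[:k]

# 5 sounds described by 4 Essentia-style descriptors each (toy values)
db = np.array([[0.0, 0, 0, 0], [1, 1, 1, 1], [0.1, 0, 0, 0],
               [5, 5, 5, 5], [1, 0, 1, 0]])
print(nearest_sounds(np.zeros(4), db, k=2))  # -> [0 2]
```

A real index would avoid the brute-force scan, but the distance-in-descriptor-space idea is the same.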
4 searching sounds
Text search
• Based on the Solr text indexing engine
• Default: search in filename, description, tags
• Rank: by text-based relevance by default; many other options
• Results are also grouped into packs by default
• Advanced search: select different fields or ranges
Query filters
• Several fields are split into discrete options
• Solr returns a classification of search results into facets
• Examples: licenses, sample rates, tags ...
• Can be used to filter search
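A selected facet is typically sent back to the search engine as a filter query; a hypothetical sketch of composing such a Solr-style filter string (the helper function and the exact field syntax are illustrative, not Freesound's actual code):

```python
def facet_filter(**facets):
    """Compose a Solr-style filter string from selected facets,
    e.g. license, samplerate, tag (field names are illustrative)."""
    parts = []
    for field, value in sorted(facets.items()):
        # Quote values containing spaces
        v = f'"{value}"' if ' ' in str(value) else str(value)
        parts.append(f"{field}:{v}")
    return " ".join(parts)

print(facet_filter(samplerate=44100, license="Attribution"))
# -> license:Attribution samplerate:44100
```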
Content-based search
• Similarity search is supported for each sound
• Analogous to Query by Example (QbE) in MIR
• Filtering by descriptor range is also supported via the API
• Combined search: use a content or text target with content or text filters
5 freesound API
Web APIs
• Allow programs to access web resources
• Supported by most major social media apps: YouTube, Flickr, Facebook ...
• Promote new applications of the same data
• Usually require API keys (e.g. Google Maps) and possibly three-legged authentication (e.g. Facebook)
Freesound API
• Access the most important features of freesound.org programmatically
• Clients for python, javascript, actionscript, supercollider
• Based on REST principles
http://www.freesound.org/docs/api/
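A minimal sketch of building a text-search request against the REST API. The endpoint path and parameter names here are assumptions based on the docs URL above, not a verified contract; a real client would also send the request and parse the JSON response.

```python
from urllib.parse import urlencode

BASE = "http://www.freesound.org/api"  # see /docs/api/ for the real paths

def search_url(query, api_key="YOUR_KEY", page=1):
    """Build the URL for a text search.
    Parameter names (`q`, `p`, `api_key`) are assumptions."""
    params = {"q": query, "p": page, "api_key": api_key}
    return f"{BASE}/sounds/search/?{urlencode(params)}"

print(search_url("thunder"))
```

The official clients listed above (python, javascript, actionscript, supercollider) wrap exactly this kind of request construction.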
6 applications
6.1 Sound maps and education
Sound maps
• R. Murray Schafer (b. 1933) initiated Acoustic Ecology as the study of environmental sounds (The Tuning of the World, 1977)
• Since the start, Freesound has included a Google Maps interface with geotagged sounds
• Other maps: escoitar.org, Radio Aporee ...
sonsdebarcelona
• Started in 2008 to foster a local community of freesound users
• Workshops at schools. Current directions: sons de la natura, sons de les cultures
• http://barcelona.freesound.org
• Example: sound fictions
• http://barcelona.freesound.org/post/43831199917/tallers-de-ficcions-sonores-talleres-de-ficciones
• http://www.freesound.org/people/sonsdebarcelona/packs/11877/
6.2 Virtual world sonification
Metaverse
• European project focused on standardization for virtual worlds
• Bridging with the real world
• Our contribution: methods for content-based retrieval and soundscape generation
metaverse environments
Towards user-contributed content
• Many environments support in-world building
• Users may upload 3D models
• Models are available in online databases
• Second Life allows uploading sounds to attach to objects
Problems when searching
• Text-based search
• Polysemy
• Insufficient annotations
• Noise
• Content-based search
• Query specification
• Different descriptors for different kinds of sound
Ecological acoustics
• Gaver (1993) recalled Schaeffer’s concept of musical listening and opposed it to everyday listening
• Musical listening: we listen to the properties of sounds without focusing on their sources
• Everyday listening: we use the properties of sounds to extract information about their sources
• He also proposed a taxonomy based on the interactions between different kinds of materials
Taxonomy

Interacting materials:
• Vibrating solids: impact, scraping, deformation, rolling
• Aerodynamic sounds: whoosh, explosion, wind
• Liquid sounds: drip, pour, splash, ripple
Separate branches: human & animal voices, musical sounds
Classification of audio clips

Training: training database -> feature extraction -> training -> classification model
Annotation: unlabelled samples -> feature extraction -> classification model -> annotation
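The train/annotate pipeline above can be sketched with a toy nearest-centroid classifier; the production system would use real Essentia features and stronger models, so everything below is illustrative.

```python
import numpy as np

def train_centroids(features, labels):
    """Training: compute one centroid per class
    from labelled feature vectors."""
    feats = np.asarray(features, dtype=float)
    labs = np.asarray(labels)
    return {c: feats[labs == c].mean(axis=0) for c in sorted(set(labels))}

def annotate(model, sample):
    """Annotation: label an unseen sample with the class
    of the nearest centroid."""
    return min(model, key=lambda c: np.linalg.norm(model[c] - sample))

# Toy 2-D features for two Gaver-style classes
model = train_centroids([[0, 0], [0, 1], [5, 5], [6, 5]],
                        ["impact", "impact", "wind", "wind"])
print(annotate(model, [5.5, 5.2]))  # -> wind
```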
Audio features
• Frame-level features
• MFCC - most commonly used description of timbre
• MPEG-7 - large set of widely used measures (spectral shape, pitch, zero-crossing rate ...)
• Temporal aggregation
• Mean, variance
• 1st, 2nd derivatives (mean, variance)
• Attack, decay
• Temporal moments (centroid, kurtosis, skewness)
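Temporal aggregation can be sketched in numpy: a variable-length sequence of frame features becomes one fixed-size clip vector. This minimal illustration uses only the mean/variance and first-derivative statistics from the list above.

```python
import numpy as np

def aggregate(frames):
    """Aggregate an (n_frames, n_coeffs) matrix of frame-level
    features into a single clip-level vector: mean and variance
    of the features and of their first derivative."""
    frames = np.asarray(frames, dtype=float)
    d1 = np.diff(frames, axis=0)  # first derivative across frames
    return np.concatenate([frames.mean(axis=0), frames.var(axis=0),
                           d1.mean(axis=0), d1.var(axis=0)])

# 100 frames of 13 MFCC-like coefficients -> one 52-dim clip vector
vec = aggregate(np.random.randn(100, 13))
print(vec.shape)  # -> (52,)
```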
CONCEPT: a graph model sequencer and a set of sound events (samples) perceived as a single semantic unit.

ZONE: part of the soundscape that presents a specific characteristic. Composed of a set of concepts.

SOUNDSCAPE: complex temporal-spatial structure of sound objects, organized as a set of layers or zones.
authoring soundscapes
[Screenshots: graph-based soundscape authoring interface, showing zones and concepts with timed sound events]
6.3 Web-based music creation
Network music
from Barbosa, A. 2006. “Computer-Supported Cooperative Work for Music Applications.” PhD Thesis, Music Technology Group, Pompeu Fabra University, Barcelona, Spain
web-based music creation
• Anonymous users
• Unknown musical / technical background
• Collective vs individual goals
Creativity
• Value / innovation
• Levels:
• artifact
• individual
• collective
• Csikszentmihalyi (1988) Systems view of creativity
• Montuori (1995) Deconstructing the Lone Genius Myth
detached creation process

[Diagram: music creation happens offline, in single-user music creation systems; sharing happens online, through friendship and collaboration networks]
Radio freesound
• Radio station for discovery of combinations of sounds from the database
• Allowed internet users to create and share short compositions without any assumption about their musical training
• Based on a human-based evolutionary algorithm (selection, mutation, crossover)
sample patch
• Simple data structure to represent mixes and sequences of sounds
• Accommodates different levels of expertise
• Allows users to quickly sketch and share short compositions
representing nested structures
• Limit sample patch to rooted tree (no loops!)
• Add virtual start and end nodes
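One way to sketch this structure in code: a rooted tree of sound nodes between virtual start and end nodes, where a whole patch can itself appear as a single node. Class and attribute names are illustrative, not Radio Freesound's actual schema.

```python
class PatchNode:
    """A node in a sample patch: either a sound id, or another
    patch embedded as a single node (nesting, but no cycles:
    an embedded patch never references an ancestor)."""
    def __init__(self, sound_id=None, patch=None):
        self.sound_id = sound_id
        self.patch = patch
        self.children = []

class SamplePatch:
    """Rooted tree between virtual start and end nodes."""
    def __init__(self, author):
        self.author = author
        self.start = PatchNode()
        self.end = PatchNode()

    def add(self, parent, node):
        parent.children.append(node)
        return node

# patch_2 (by user_2) embedded as a single node inside patch_1
patch_2 = SamplePatch("user_2")
patch_1 = SamplePatch("user_1")
a = patch_1.add(patch_1.start, PatchNode(sound_id=1))
patch_1.add(a, PatchNode(patch=patch_2))
```

Keeping the structure a tree is what makes the nesting and authorship tracking below well defined.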
[Diagram: a sample patch as a rooted tree of sounds (A, B, C, A, D) between virtual start and end nodes]

Embedding

[Diagram: patch_2 embedded as a single node, between X and Y, inside patch_1]

Authorship tree

[Diagram: a final song by user_1, composed of patches by user_1 and user_2, which in turn reference sounds by user_3]
Radio freesound 2.0
search panel
composition panel
edit panel
Programming audio in the browser: flash
• Origins: the web is conceived as a network of hyperlinked rich text documents. Multimedia supported via plug-ins
• Neither JavaScript nor ActionScript timers are accurate enough for audio
• Around 2005 (?) a hack was found to support “accurate” (+/-) audio timing in the Flash player using the SOUND_COMPLETE event
• 2008: a Flash update breaks the workaround; “Adobe make some noise” campaign
• Adobe adds basic low-level audio functionality
Web Audio API
• 2008: HTML5 first public draft; HTML5 adds the <audio> tag
• 2009: Mozilla starts working on a low-level audio API (Audio Data API)
• 2010: W3C Audio Incubator Group. Google and Apple promote an alternative Web Audio API
• 2013: the Web Audio API is implemented in WebKit browsers (Chrome and Safari, including mobile Safari)
floop
• Motivations:
• More than 10k sounds are tagged as “loop” in freesound, but there is no global separation between musical and non-musical sounds
• Many non-musical sounds are loopable
• Music descriptors give “random” results for non-musical audio
beat spectrum (Foote, 2001)

Similarity matrix -> beat spectrum
• Not as biased towards polyphonic music
• Can be computed over different features
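The beat spectrum can be sketched as follows: build a frame self-similarity matrix from any feature sequence, then average along its diagonals; a peak at lag l indicates repetition with period l frames. This is a simplified illustration of Foote's method, not his exact algorithm (which adds windowing and specific feature choices).

```python
import numpy as np

def beat_spectrum(features):
    """Foote-style beat spectrum: cosine self-similarity matrix
    of the frames, averaged along each diagonal (lag)."""
    f = np.asarray(features, dtype=float)
    norms = np.linalg.norm(f, axis=1, keepdims=True) + 1e-12
    s = (f / norms) @ (f / norms).T  # similarity matrix
    n = len(f)
    # Mean similarity at each lag 1 .. n//2 - 1
    return np.array([np.mean(np.diag(s, k=lag)) for lag in range(1, n // 2)])

frames = np.tile(np.eye(4), (8, 1))  # a pattern repeating every 4 frames
bs = beat_spectrum(frames)
# bs peaks at lag 4 (index 3) and its multiples
```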
loop detection
• Assumption: the user has cut the loop correctly, so that its duration matches the rhythmic content
• Pick N peaks from the beat spectrum
• Detect harmonics of the duration (‘loop periods’)
spatial mapping
• Select sounds with matching loop periods
• Organize by similarity:
• Compute distances from MFCC
• Compute knn graph
• Map to 2D using force-directed graph layout
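The steps above can be sketched with numpy and networkx (the latter already appears in the site's dependency list); descriptor values and parameters here are illustrative, not floop's actual code.

```python
import numpy as np
import networkx as nx

def similarity_layout(descriptors, k=3):
    """Build a k-nearest-neighbour graph from descriptor vectors
    (e.g. MFCC statistics) and map it to 2D with a force-directed
    (spring) layout."""
    x = np.asarray(descriptors, dtype=float)
    g = nx.Graph()
    g.add_nodes_from(range(len(x)))
    for i in range(len(x)):
        d = np.linalg.norm(x - x[i], axis=1)
        for j in np.argsort(d)[1:k + 1]:  # skip self (distance 0)
            g.add_edge(i, int(j), weight=float(d[j]))
    return nx.spring_layout(g, seed=0)  # {node: (x, y)}

positions = similarity_layout(np.random.default_rng(0).normal(size=(10, 5)))
```

Similar sounds end up connected in the graph and therefore close together in the 2D map.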
http://labs.freesound.org/floop
6.4 Music hacks
Music Hack Day
• Meeting between music technology companies and hackers
• Make connections, create a prototype (hack) in 24 hours
• Freesound API has been used in a number of hacks
Cantamañanas
• Parse RSS feeds from the news (e.g. sports …)
• Retrieve some loops from freesound with matching key and tempo (using API)
• Generate a matching melody algorithmically
• Have a singing-voice synth sing the text (using the now-extinct Canoris API / Vocaloid)
The Daily soundscape
• Obtain some key words from the day’s news
• Search for sounds in freesound
• Compose a stream
• http://ginsonic.wordpress.com/2013/03/13/python-music-hack/
Free CC it
• Audio Mosaicing using freesound sounds
• Get descriptors and beat positions of a song using EchoNest API
• Analyze with Essentia
• Resynthesize with similar sounds from freesound
• http://labs.freesound.org/freeccit/
Free Maschine!
• Map sounds to NI Maschine (MPC-style grid) using different search methods
• Mutate sounds while sequencing