Introduction to Freesound.org
Gerard Roma
Music Technology Group
UPF
Outline
1. What is Freesound?
2. The freesound community
3. Architecture
4. Searching sounds
5. Freesound API
6. Applications:
1. Sound maps and education
2. Virtual world sonification
3. Web-based music creation
4. Music Hacks
1 What is Freesound?
media sharing
• Media sharing has become a prominent use of the internet
• Users share text, photos, videos, music ... sounds
• “Sounds” have a different, specific purpose: not enjoyed online, but downloaded for audiovisual production or music creation
• Sample libraries previously sold as CDs are now distributed over the internet
foley
• Practice named after Jack Donovan Foley (1891-1967), who worked at Universal Studios
• Sound effects used in cinema, video, games
• Added in post-production to enhance the soundtrack
• With digital technology, existing sound FX recordings can be reused
audio cultures in music
• musique concrète - first uses of sound recordings for music
• plunderphonics - explicit appropriationism as a critique of the copyright system
• soundscape composition - field recordings for preservation and composition
• loops - hip-hop, techno, minimalism ...
• glitch - errors, broken software/hardware
attitudes when working with recordings
• Avoid other people’s sounds
• Extreme transformations
• Collage & citation
• Anti-Copyright activism
• Plagiarism
freesound.org
• Collaborative audio database focused on sound samples
• Started at ICMC 2005
• 210,000+ sounds under CC licenses
• ~4M users
• About 10k contributors
Life of a (free) sound
• Recording
• Upload
• Describe (text, tags)
• Preview is computed (mp3, image)
• Moderation (group of freesound users)
• Downloads, comments, ratings
moderation
• Sounds are moderated mainly for copyright infringement. Many difficult cases:
• Digital synths
• Toys
• Street music
• Also description quality
functionalities
• Sounds can be shared, described, commented, rated, packed
• Forum: users request samples, exchange recording tips, show their work
• Geotagging: put sounds on the map
• Find similar sounds based on analysis of the audio signal
Freesound 2.0
• New version developed by Bram de Jong
• Design by PixelShell
• Available under the GNU Affero GPL
• Freesound 2.0 public API
2.0 Licenses
• CC zero (public domain)
• CC Attribution
• CC Attribution non-commercial
2 The freesound community
Freesound community
“We made a thunder storm track to desensitize our dog. When we got our puppy, it seemed to be somewhat scared of thunder. So we made up a storm track from many Freesound clips. We slowly played the storm track at increasing volumes while we 'acted normally', ignoring the sound, so as to keep the dog calm. It worked! We've had various requests from friends for the storm track, and I've recently had a request for a traffic track for a dog that's afraid of motorbikes.”
- Nik
Nik, Cassie and Wicket (and Richard)
“I am the mother of 2 autistic children, who are both intelligent and go to regular schools. Next week I will be giving a lecture about autism for all the teachers my son has to deal with this year. One of his problems is that he hears all sounds at the same level, so the buzzing of a fly, a zipper, a clicking pen, street workers outside the school, [...] and the teacher's voice all sound equally loud and all compete for the first prize, so to say. I am looking for a short sound fragment which I can use during this lecture that contains most of the noises I mentioned above, or is similar to what I want to demonstrate.”
- Bianca
3 Architecture
Freesound 1
• The first version of freesound was coded by Bram de Jong in PHP over MySQL in 2005
• User management and forum based on phpBB
• This architecture couldn’t scale with the success of the site, due to both performance and maintenance issues
• Also, the university initially capped bandwidth usage (later removed)
Freesound 2
• Rewritten from scratch in Python (Django) + Postgres by Bram + a team of developers and researchers at the MTG
• Better maintenance, scalability and data integrity
• Distributed architecture
• Hosted on VM infrastructure at UPF. Currently maintained by PhD students
• Again reaching bandwidth problems!
FREESOUND2 Components Block Diagram
Licenses
1 - Web front-end, RESTful API
Both the web front-end and the RESTful API are built using several Python modules, listed below:
• django - BSD license
• django-extensions - BSD
• django-debug-toolbar - BSD
• django-piston - BSD license
• markdown - BSD
• pycrypto - Public Domain
• BeautifulSoup - BSD
• functional - PSF
• FeedParser - MIT license (http://www.opensource.org/licenses/mit-license.php)
• scikits.audiolab - LGPL
• Pygments - BSD
• python-cjson - LGPL
• gunicorn - MIT
• numpy - BSD
• networkx - BSD
• gearman (Python API for Gearman) - Apache license
• pyzmq (binding for ZeroMQ, http://www.zeromq.org/) - LGPLv3+ license
• python-memcached (Python API for memcached) - Python license (http://docs.python.org/license.html)
• psycopg2 (Python API for Postgres) - GPL license with exceptions or ZPL (http://pypi.python.org/pypi/psycopg2/2.0.5.1)

Along with Python modules we also use some JavaScript libraries.
Audio processing
• Sounds are processed in worker machines:
• Preview images (waveform, spectrogram) -> wav2png
• Compressed audio previews (mp3, ogg; low quality and high quality)
• Content-based analysis (Essentia)
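The waveform-preview step can be illustrated with a small sketch of the min/max peak extraction a tool like wav2png performs: one (min, max) pair per pixel column of the image. This is an illustration in numpy, not wav2png's actual code, and the function name is made up.

```python
import numpy as np

def waveform_peaks(samples, width):
    """Reduce an audio signal to `width` (min, max) pairs,
    one pair per horizontal pixel of the waveform preview."""
    # Split the signal into `width` roughly equal chunks
    chunks = np.array_split(np.asarray(samples, dtype=float), width)
    return [(float(c.min()), float(c.max())) for c in chunks]

# A one-second 440 Hz tone reduced to a 100-pixel-wide preview
t = np.linspace(0, 1, 44100, endpoint=False)
peaks = waveform_peaks(np.sin(2 * np.pi * 440 * t), 100)
```

A renderer would then draw one vertical line per (min, max) pair.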
Essentia
• Library for audio analysis developed at MTG-UPF during the last 6 years
• Large quantity of algorithms and descriptors:
• Low-level (based on the magnitude spectrum)
• Tonal (chords, key ...)
• Rhythm (tempo, meter ...)
• High-level (live/studio, male/female voice ...)
• SFX (inharmonicity, tristimulus ...)
• Optimized for polyphonic music
Gaia
• Specialized server for indexing documents according to distances in a vector space
• Used for content-based similarity in freesound
• Vectors are formed from Essentia descriptors
• After several years of exclusive licensing, Essentia and Gaia are available under the AGPL
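Gaia's core operation, finding the sounds whose descriptor vectors lie closest to a query in the vector space, can be sketched with plain numpy (an illustration of the idea, not Gaia's implementation; the descriptor values are toy data):

```python
import numpy as np

def nearest_sounds(query, descriptors, k=3):
    """Return indices of the k descriptor vectors closest
    to `query` under Euclidean distance."""
    d = np.linalg.norm(descriptors - query, axis=1)
    return np.argsort(d)[:k]

# 5 sounds described by 4 Essentia-style descriptors each (toy values)
db = np.array([[0.0, 0, 0, 0], [1, 1, 1, 1], [0.1, 0, 0, 0],
               [5, 5, 5, 5], [1, 0, 1, 0]])
print(nearest_sounds(np.zeros(4), db, k=2))  # -> [0 2]
```

A real index would avoid the brute-force scan, but the distance-in-descriptor-space idea is the same.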
4 searching sounds
Text search
• Based on the Solr text indexing engine
• Default: search in filename, description, tags
• Rank: by text-based relevance by default; many other options
• Results are also grouped into packs by default
• Advanced search: select different fields or ranges
Query filters
• Several fields are split into discrete options
• Solr returns a classification of search results into facets
• Examples: licenses, sample rates, tags ...
• Can be used to filter search
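A selected facet is typically sent back to the search engine as a filter query; a hypothetical sketch of composing such a Solr-style filter string (the helper function and the exact field syntax are illustrative, not Freesound's actual code):

```python
def facet_filter(**facets):
    """Compose a Solr-style filter string from selected facets,
    e.g. license, samplerate, tag (field names are illustrative)."""
    parts = []
    for field, value in sorted(facets.items()):
        # Quote values containing spaces
        v = f'"{value}"' if ' ' in str(value) else str(value)
        parts.append(f"{field}:{v}")
    return " ".join(parts)

print(facet_filter(samplerate=44100, license="Attribution"))
# -> license:Attribution samplerate:44100
```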
Content-based search
• Similarity search is supported for each sound
• Analogous to Query by Example (QbE) in MIR
• Filtering by descriptor range is also supported via the API
• Combined search: use a content or text target with content or text filters
5 freesound API
Web APIs
• Allow programs to access web resources
• Supported by most major social media apps: YouTube, Flickr, Facebook ...
• Promote new applications of the same data
• Usually require API keys (e.g. Google Maps) and possibly three-legged authentication (e.g. Facebook)
Freesound API
• Access the most important features of freesound.org programmatically
• Clients for python, javascript, actionscript, supercollider
• Based on REST principles
http://www.freesound.org/docs/api/
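A minimal sketch of building a text-search request against the REST API. The endpoint path and parameter names here are assumptions based on the docs URL above, not a verified contract; a real client would also send the request and parse the JSON response.

```python
from urllib.parse import urlencode

BASE = "http://www.freesound.org/api"  # see /docs/api/ for the real paths

def search_url(query, api_key="YOUR_KEY", page=1):
    """Build the URL for a text search.
    Parameter names (`q`, `p`, `api_key`) are assumptions."""
    params = {"q": query, "p": page, "api_key": api_key}
    return f"{BASE}/sounds/search/?{urlencode(params)}"

print(search_url("thunder"))
```

The official clients listed above (python, javascript, actionscript, supercollider) wrap exactly this kind of request construction.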
6 applications
6.1 Sound maps and education
Sound maps
• R. Murray Schafer (b. 1933) initiated Acoustic Ecology as the study of environmental sounds (The Tuning of the World, 1977)
• Since the start, Freesound has included a Google Maps interface with geotagged sounds
• Other maps: escoitar.org, Radio Aporee ...
sonsdebarcelona
• Started in 2008 to foster a local community of freesound users
• Workshops at schools. Current directions: sons de la natura, sons de les cultures
• http://barcelona.freesound.org
• Example: sound fictions
• http://barcelona.freesound.org/post/43831199917/tallers-de-ficcions-sonores-talleres-de-ficciones
• http://www.freesound.org/people/sonsdebarcelona/packs/11877/
6.2 Virtual world sonification
Metaverse
• European project focused on standardization for virtual worlds
• Bridging with the real world
• Our contribution: methods for content-based retrieval and soundscape generation
metaverse environments
Towards user-contributed content
• Many environments support in-world building
• Users may upload 3D models
• Models are available in online databases
• Second Life allows uploading sounds to attach to objects
Problems when searching
• Text-based search
• Polysemy
• Insufficient annotations
• Noise
• Content-based search
• Query specification
• Different descriptors for different kinds of sound
Ecological acoustics
• Gaver (1993) recalled Schaeffer’s concept of musical listening and opposed it to everyday listening
• Musical listening: we listen to the properties of sounds without focusing on their sources
• Everyday listening: we use the properties of sounds to extract information about their sources
• He also proposed a taxonomy based on the interactions between different kinds of materials
Taxonomy

Interacting materials:
• Vibrating solids: impact, scraping, deformation, rolling
• Aerodynamic sounds: whoosh, explosion, wind
• Liquid sounds: drip, pour, splash, ripple
Separate branches: human & animal voices, musical sounds
Classification of audio clips

Training: training database -> feature extraction -> training -> classification model
Annotation: unlabelled samples -> feature extraction -> classification model -> annotation
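The train/annotate pipeline above can be sketched with a toy nearest-centroid classifier; the production system would use real Essentia features and stronger models, so everything below is illustrative.

```python
import numpy as np

def train_centroids(features, labels):
    """Training: compute one centroid per class
    from labelled feature vectors."""
    feats = np.asarray(features, dtype=float)
    labs = np.asarray(labels)
    return {c: feats[labs == c].mean(axis=0) for c in sorted(set(labels))}

def annotate(model, sample):
    """Annotation: label an unseen sample with the class
    of the nearest centroid."""
    return min(model, key=lambda c: np.linalg.norm(model[c] - sample))

# Toy 2-D features for two Gaver-style classes
model = train_centroids([[0, 0], [0, 1], [5, 5], [6, 5]],
                        ["impact", "impact", "wind", "wind"])
print(annotate(model, [5.5, 5.2]))  # -> wind
```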
Audio features
• Frame-level features
• MFCC - most commonly used description of timbre
• MPEG-7 - large set of widely used measures (spectral shape, pitch, zero-crossing rate ...)
• Temporal aggregation
• Mean, variance
• 1st, 2nd derivatives (mean, variance)
• Attack, decay
• Temporal moments (centroid, kurtosis, skewness)
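Temporal aggregation can be sketched in numpy: a variable-length sequence of frame features becomes one fixed-size clip vector. This minimal illustration uses only the mean/variance and first-derivative statistics from the list above.

```python
import numpy as np

def aggregate(frames):
    """Aggregate an (n_frames, n_coeffs) matrix of frame-level
    features into a single clip-level vector: mean and variance
    of the features and of their first derivative."""
    frames = np.asarray(frames, dtype=float)
    d1 = np.diff(frames, axis=0)  # first derivative across frames
    return np.concatenate([frames.mean(axis=0), frames.var(axis=0),
                           d1.mean(axis=0), d1.var(axis=0)])

# 100 frames of 13 MFCC-like coefficients -> one 52-dim clip vector
vec = aggregate(np.random.randn(100, 13))
print(vec.shape)  # -> (52,)
```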
CONCEPT: a graph model sequencer and a set of sound events (samples) perceived as a single semantic unit.

ZONE: part of the soundscape that presents a specific characteristic. Composed of a set of concepts.

SOUNDSCAPE: complex temporal-spatial structure of sound objects, organized as a set of layers or zones.
authoring soundscapes
[Screenshots: graph-based soundscape authoring interface, showing zones and concepts with timed sound events]
6.3 Web-based music creation
Network music
from Barbosa, A. 2006. “Computer-Supported Cooperative Work for Music Applications.” PhD Thesis, Music Technology Group, Pompeu Fabra University, Barcelona, Spain
web-based music creation
• Anonymous users
• Unknown musical / technical background
• Collective vs individual goals
Creativity
• Value / innovation
• Levels:
• artifact
• individual
• collective
• Csikszentmihalyi (1988) Systems view of creativity
• Montuori (1995) Deconstructing the Lone Genius Myth
detached creation process

[Diagram: music creation happens offline, in single-user music creation systems; sharing happens online, through friendship and collaboration networks]
Radio freesound
• Radio station for discovery of combinations of sounds from the database
• Allowed internet users to create and share short compositions without any assumption about their musical training
• Based on a human-based evolutionary algorithm (selection, mutation, crossover)
sample patch
• Simple data structure to represent mixes and sequences of sounds
• Accommodates different levels of expertise
• Allows users to quickly sketch and share short compositions
representing nested structures
• Limit sample patch to rooted tree (no loops!)
• Add virtual start and end nodes
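One way to sketch this structure in code: a rooted tree of sound nodes between virtual start and end nodes, where a whole patch can itself appear as a single node. Class and attribute names are illustrative, not Radio Freesound's actual schema.

```python
class PatchNode:
    """A node in a sample patch: either a sound id, or another
    patch embedded as a single node (nesting, but no cycles:
    an embedded patch never references an ancestor)."""
    def __init__(self, sound_id=None, patch=None):
        self.sound_id = sound_id
        self.patch = patch
        self.children = []

class SamplePatch:
    """Rooted tree between virtual start and end nodes."""
    def __init__(self, author):
        self.author = author
        self.start = PatchNode()
        self.end = PatchNode()

    def add(self, parent, node):
        parent.children.append(node)
        return node

# patch_2 (by user_2) embedded as a single node inside patch_1
patch_2 = SamplePatch("user_2")
patch_1 = SamplePatch("user_1")
a = patch_1.add(patch_1.start, PatchNode(sound_id=1))
patch_1.add(a, PatchNode(patch=patch_2))
```

Keeping the structure a tree is what makes the nesting and authorship tracking below well defined.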
[Diagram: a sample patch as a rooted tree of sounds (A, B, C, A, D) between virtual start and end nodes]

Embedding

[Diagram: patch_2 embedded as a single node, between X and Y, inside patch_1]

Authorship tree

[Diagram: a final song by user_1, composed of patches by user_1 and user_2, which in turn reference sounds by user_3]
Radio freesound 2.0
search panel
composition panel
edit panel
Programming audio in the browser: flash
• Origins: the web is conceived as a network of hyperlinked rich text documents. Multimedia supported via plug-ins
• Neither JavaScript nor ActionScript timers are accurate enough for audio
• Around 2005 (?) a hack was found to support “accurate” (+/-) audio timing in the Flash player using the SOUND_COMPLETE event
• 2008: a Flash update breaks the workaround; “Adobe make some noise” campaign
• Adobe adds basic low-level audio functionality
Web Audio API
• 2008: HTML5 first public draft; HTML5 adds the <audio> tag
• 2009: Mozilla starts working on a low-level audio API (Audio Data API)
• 2010: W3C Audio Incubator Group. Google and Apple promote an alternative Web Audio API
• 2013: the Web Audio API is implemented in WebKit browsers (Chrome and Safari, including mobile Safari)
floop
• Motivations:
• More than 10k sounds are tagged as “loop” in freesound, but there is no global separation between musical and non-musical sounds
• Many non-musical sounds are loopable
• Music descriptors give “random” results for non-musical audio
beat spectrum (Foote, 2001)

Similarity matrix -> beat spectrum
• Not as biased towards polyphonic music
• Can be computed over different features
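The beat spectrum can be sketched as follows: build a frame self-similarity matrix from any feature sequence, then average along its diagonals; a peak at lag l indicates repetition with period l frames. This is a simplified illustration of Foote's method, not his exact algorithm (which adds windowing and specific feature choices).

```python
import numpy as np

def beat_spectrum(features):
    """Foote-style beat spectrum: cosine self-similarity matrix
    of the frames, averaged along each diagonal (lag)."""
    f = np.asarray(features, dtype=float)
    norms = np.linalg.norm(f, axis=1, keepdims=True) + 1e-12
    s = (f / norms) @ (f / norms).T  # similarity matrix
    n = len(f)
    # Mean similarity at each lag 1 .. n//2 - 1
    return np.array([np.mean(np.diag(s, k=lag)) for lag in range(1, n // 2)])

frames = np.tile(np.eye(4), (8, 1))  # a pattern repeating every 4 frames
bs = beat_spectrum(frames)
# bs peaks at lag 4 (index 3) and its multiples
```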
loop detection
• Assumption: the user has cut the loop correctly, so that its duration matches the rhythmic content
• Pick N peaks from the beat spectrum
• Detect harmonics of the duration (‘loop periods’)
spatial mapping
• Select sounds with matching loop periods
• Organize by similarity:
• Compute distances from MFCC
• Compute knn graph
• Map to 2D using force-directed graph layout
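The steps above can be sketched with numpy and networkx (the latter already appears in the site's dependency list); descriptor values and parameters here are illustrative, not floop's actual code.

```python
import numpy as np
import networkx as nx

def similarity_layout(descriptors, k=3):
    """Build a k-nearest-neighbour graph from descriptor vectors
    (e.g. MFCC statistics) and map it to 2D with a force-directed
    (spring) layout."""
    x = np.asarray(descriptors, dtype=float)
    g = nx.Graph()
    g.add_nodes_from(range(len(x)))
    for i in range(len(x)):
        d = np.linalg.norm(x - x[i], axis=1)
        for j in np.argsort(d)[1:k + 1]:  # skip self (distance 0)
            g.add_edge(i, int(j), weight=float(d[j]))
    return nx.spring_layout(g, seed=0)  # {node: (x, y)}

positions = similarity_layout(np.random.default_rng(0).normal(size=(10, 5)))
```

Similar sounds end up connected in the graph and therefore close together in the 2D map.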
http://labs.freesound.org/floop
6.4 Music hacks
Music Hack Day
• Meeting between music technology companies and hackers
• Make connections, create a prototype (hack) in 24 hours
• Freesound API has been used in a number of hacks
Cantamañanas
• Parse RSS feeds from the news (e.g. sports …)
• Retrieve some loops from freesound with matching key and tempo (using API)
• Generate a matching melody algorithmically
• Have a singing-voice synth sing the text (using the now-extinct Canoris API / Vocaloid)
The Daily soundscape
• Obtain some key words from the day’s news
• Search for sounds in freesound
• Compose a stream
• http://ginsonic.wordpress.com/2013/03/13/python-music-hack/
Free CC it
• Audio Mosaicing using freesound sounds
• Get descriptors and beat positions of a song using EchoNest API
• Analyze with Essentia
• Resynthesize with similar sounds from freesound
• http://labs.freesound.org/freeccit/
Free Maschine!
• Map sounds to NI Maschine (MPC-style grid) using different search methods
• Mutate sounds while sequencing