VisualBox: An OpenGL Music Visualiser
Alexander Conrad Stevens
41719882
Visualization, Computer Graphics & Data Analysis
Computer Graphics Project
April 2012
Contents
1 Project Overview
  1.1 Introduction
  1.2 Aim of Project
2 The VisualBox Platform
  2.1 The Development Environment
  2.2 Sampling and Playing the Sound
  2.3 The Graphics Library
  2.4 Miscellaneous Python Libraries
3 Design of Visualisation
  3.1 Inspiration
  3.2 Design Direction
4 Implementation
  4.1 GStreamer: Playing and Decoding
  4.2 NumPy and the Fast Fourier Transform
  4.3 OpenGL, GLU and GLUT
5 Results and Conclusions
  5.1 Result
  5.2 What Could Be Improved?
  5.3 Conclusions
Appendices
A Program listings
Chapter 1
Project Overview
1.1 Introduction
Stimulation through sound is such a large industry in today's culture that there are
many ways of satisfying the desire for it, such as playing an instrument, playing
music through a device, or attending live concerts. These certainly stimulate the
auditory senses of the subject, but they do not generally stimulate the visual senses.
This is the gap that VisualBox, like many other music visualisers, fills.
1.2 Aim of Project
The general idea of a music visualiser is to take properties of the music input
(frequency, level, tempo, etc.) and convert them to a visual representation of the
sound. This can be done in 2D or 3D; however, the focus of this project is 3D
visualisation. The aim of the project is therefore to develop a 3D visualisation that
takes at least frequency and level samples from the provided music and converts them
into an aesthetically pleasing and entertaining 3D animation. The 3D world does not
need to be interactive, as the music being played through the visualiser can be
considered the interactive medium. However, the visual representation needs to show
the subject clearly that manipulation of the world is directed by the music.
Chapter 2
The VisualBox Platform
2.1 The Development Environment
For simplicity, and because of prior knowledge and experience of developing on the
Ubuntu operating system, it was decided that the beta of Ubuntu 12.04 would be used
to develop VisualBox. The Ubuntu platform is quick, and provides many libraries at
the developer's disposal.
Python will be the language of choice for developing the application. It makes it
quick to prototype simple (and complicated) operations on the fly, and can provide
good insight into what is happening within the program.
2.2 Sampling and Playing the Sound
Since VisualBox is being developed within Ubuntu, it made sense to use a multimedia
framework that was installed by default and had a highly customisable pipeline.
The obvious choice from a developer's standpoint would be the GStreamer multimedia
framework.
It allows existing extensions to be modified to play any audio file that GStreamer
supports, and also allows the decoder, mux, sinks, pads, and many other aspects of
the pipeline to be modified. This would allow decoded sound samples to be acquired
and played at the same time.
2.3 The Graphics Library
Once again, since Ubuntu is the operating system of choice for VisualBox, libraries
that are compatible with the system are needed. Unfortunately, this automatically
rules out Direct3D (since it is a Windows/Xbox-exclusive technology). There are,
however, the Simple DirectMedia Layer (SDL) and OpenGL.
SDL is an attempt to standardise input and output across operating systems and
platforms. This includes the standardisation of audio, keyboard and mouse inputs;
however, since the visualiser does not need keyboard or mouse input and already has
a medium in which to play music, SDL is redundant. This leaves OpenGL along with its
GLU and GLUT libraries to develop upon. This also enables a more in-depth learning
experience of how OpenGL functions, compared with using a higher level of
abstraction like SDL.
2.4 Miscellaneous Python Libraries
Libraries like SciPy/NumPy (algorithms and mathematics libraries) and threading
libraries will be used to supplement the GStreamer (py-gst) and OpenGL
(python-opengl) libraries. Use of these libraries is self-explanatory (except where
mentioned) and needs no detailed explanation.
Chapter 3
Design of Visualisation
3.1 Inspiration
Attaining an entertaining and aesthetically pleasing visualisation can be relatively
difficult. People have different opinions on what is pleasing to them, so a theme
that generally appeals to most people was chosen.
Outer space is a mysterious place, and often unpredictable. In many cultures, people
have a fascination with the void beyond their own little planet. So from a designer's
standpoint, the sun and the stars and their mysterious secrets could be considered an
interesting and appealing direction for a visualisation. That is why a star and black
hole binary system can be used to appeal to a large audience.
Figure 3.1: Example of Black Hole and Star Binary System
3.2 Design Direction
As can be seen in Figure 3.1, there are solar flares, solar activity on the surface
of the star, and the gravitational effect of the black hole slowly devouring the
larger star. Unfortunately, in real time, one could imagine that the process isn't
quite as dynamic at a macro level. However, the activity on the surface of the star
is quite dynamic and could be considered micro-level activity. If one were to imagine
that this micro-level activity could be represented in a macro sense, a visualiser
could effectively show dynamic properties of the music on the surface of the star.
This could include many particles or shapes skipping about and moving around the
star according to the beat of the music.
VisualBox, though, will assign a frequency to each particle, and have the level of
that frequency dictate a miniature solar flare. This gives the effect of an exploding
star for music with infrequent, heavy beats. Hence, this would satisfy the requirement
that a subject could determine that the visualiser is based on the music and not just
a pre-set loop. Since each particle (or solar flare) has its own frequency, the
jumping of particles will vary over the star.
When a solar flare jumps high off the star and close to the black hole, one would
expect the black hole to trap the solar flare in a spinning orbit until its impending
doom. In VisualBox this is kept simple: if the particle is too close to the black
hole, the particle will be trapped in orbit. Once in orbit, the particle retains its
assigned frequency, but instead of jumping (or flaring), the particle will speed up
and slow down according to the sound level. After a randomly assigned time, the
particle will decay and then finally reinitialise itself on the surface of the star.
If any particles are in an idle state, they will just roam across the surface of the
star.
To populate the black void of space, simple stars can be added; they have no purpose
other than to add depth and a sense of vastness. To show the depth of 3D to the
viewer, the camera can also rotate about the scene, using the centre point of the
star as a reference.
Chapter 4
Implementation
4.1 GStreamer: Playing and Decoding
GStreamer is a pipeline-based multimedia framework. This means that a developer can
take a stream input (file, device, etc.) and pass the media data down the pipeline
to be modified, decoded, played, saved to a file, or just sent into nothingness. This
basic set of extensions and plug-ins can be rearranged to suit the developer's
interests.
Figure 4.1: Example of a simple audio pipeline in GStreamer
GStreamer also has a very well implemented extension called Playbin2, which uses the
GStreamer codecs already installed on the viewer's system to play any video or audio
file with little to no effort from the developer. That satisfies the requirement that
VisualBox needs to play the music with the visualisation, but it does not link the
decoded music to the visualisation.
This is where the pipeline framework becomes extremely useful. The developer only has
to make a separate bin/pipeline, plug it into the original Playbin2 (at the audio
playing sink, or ALSA element), and redirect that pipeline to a custom audio playing
sink and a decoded output sink, similar to the layout shown in Figure 4.2.
Figure 4.2: Layout of modified audio pipeline in GStreamer
The audio sink will play the music, while the decoded sink will send the pull-buffer
signal, allowing the decoded Pulse Code Modulated (PCM) audio data to be saved. This
PCM data is channel-interleaved (Left Channel Sample, Right Channel Sample, Left,
Right, etc.) and, as specified in the GStreamer initialisation, has a depth of 16
bits, as can be seen in Figure 4.3. This data is saved as an array of 16-bit integers.
For parallelisation, the GStreamer component of VisualBox runs in its own thread
alongside the OpenGL component.
Figure 4.3: Example of a Decoded PCM Audio Data Sample
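The interleaved layout described above can be illustrated with a short NumPy sketch.
This is not the VisualBox code itself: the function name deinterleave and the sample
data are assumptions made for illustration, and the bytes are assumed to be
native-endian 16-bit stereo samples.

```python
import numpy as np

def deinterleave(pcm_bytes):
    """Split interleaved 16-bit stereo PCM bytes into (left, right) arrays."""
    # Assumption: samples are native-endian signed 16-bit integers
    samples = np.frombuffer(pcm_bytes, dtype=np.int16)
    # Drop a trailing unpaired sample, if the buffer has an odd length
    samples = samples[: len(samples) - (len(samples) % 2)]
    left = samples[0::2]    # even positions: left channel
    right = samples[1::2]   # odd positions: right channel
    return left, right

# Four stereo frames laid out as L0, R0, L1, R1, ...
raw = np.array([1, -1, 2, -2, 3, -3, 4, -4], dtype=np.int16).tobytes()
left, right = deinterleave(raw)
```

Slicing with a step of two returns views onto the same memory, so no samples are
copied when the channels are separated.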
4.2 NumPy and the Fast Fourier Transform
Now that the decoded data has been acquired, the PCM data needs to be converted
into something workable. Ideally, VisualBox needs a set of frequencies and their
corresponding levels. This is where the NumPy library comes into play.
NumPy provides a Fast Fourier Transform (FFT), an efficient algorithm for computing
the Discrete Fourier Transform. This essentially converts a signal into its frequency
and level components. VisualBox, though, does not need to know frequencies in Hertz
or levels in decibels; it just needs to know the dominating regions of the music in
order to visualise them.
Figure 4.4: Real Example of the Mirrored FFT output from VisualBox
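As a rough illustration of why the output in Figure 4.4 is mirrored, the sketch below
applies numpy.fft.fft to a synthetic tone. The function name channel_levels and the
test tone are assumptions for this example, not part of VisualBox; the buffer length
of 1152 and the 44100 Hz rate are taken from the text.

```python
import numpy as np

def channel_levels(samples):
    """Return the magnitude of every FFT bin for one audio channel."""
    spectrum = np.fft.fft(np.asarray(samples, dtype=np.float64))
    return np.abs(spectrum)

rate = 44100   # sample rate in Hz
n = 1152       # samples per buffer
t = np.arange(n) / rate

# A pure tone chosen to fall exactly on bin 30 (frequency = 30 * rate / n)
tone = np.sin(2 * np.pi * (30 * rate / n) * t)
levels = channel_levels(tone)

# Only the first half carries unique information for real-valued input
peak = int(np.argmax(levels[: n // 2]))
```

For real-valued input the magnitude at bin k equals the magnitude at bin n - k, which
is the mirroring seen in the figure. numpy.fft.rfft returns only the non-redundant
half and would avoid computing the mirrored bins.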
On each iteration of the OpenGL state, VisualBox will check the sampled data,
deinterleave the left and right channels, apply the FFT to this data, and save
the output to two arrays of floats containing the levels in order of frequency. From
this, the OpenGL visualiser can use the raw data to compute particle effects and
displacements according to ascending frequency. The only drawback of this method is
that not all samples are used. At a 44100 Hz sample rate and a frame rate of around
20 frames per second, about 2205 samples arrive per frame, but each buffer holds only
1152 16-bit samples; hence, only about half of the audio data would be captured,
computed and utilised. However, waiting for the OpenGL state is less CPU intensive,
since FFTs and deinterleaving are only computed when OpenGL can actually display the
next visual.
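The "only half" estimate can be checked with a few lines of arithmetic. The sketch
assumes that exactly one buffer is consumed per rendered frame and that the 1152
samples are counted on the same basis as the 44100 Hz rate; both are simplifying
assumptions rather than details confirmed by the implementation.

```python
sample_rate = 44100   # samples per second
frame_rate = 20       # approximate OpenGL frames per second
buffer_size = 1152    # samples in one decoded buffer

# Samples that arrive during the time it takes to draw one frame
samples_per_frame = sample_rate / frame_rate

# Share of the incoming audio that one buffer per frame actually covers
fraction_used = buffer_size / samples_per_frame
```

Under these assumptions roughly 52% of the audio is analysed, which matches the
"about half" figure quoted above.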
4.3 OpenGL, GLU and GLUT
The OpenGL component of VisualBox (the class OpenGL_Main) is where all of the
graphical components are brought together and visualised. The OpenGL class is
initialised and started from a thread using the start() function. This will
initialise GLUT and its display modes, display windows, window resizing functions,
draw functions, and key press event handlers. It then moves on to the OpenGL
initialisation component of VisualBox.
Within the initialisation function (InitGL()), depth is set to check for any objects
with a depth less than that of the stored depth (GL_LESS). Polygons are configured
to render only the outside (GL_BACK) and to display only the polygon edge lines
inside (GL_LINE); this is to speed up rendering. To enable depth checking between
objects so that they're displayed in order, GL_DEPTH_TEST is enabled, and finally the
shade model is chosen to be GL_SMOOTH to keep a smoother shade over objects. As a
final touch, OpenGL is hinted to use the cleanest rendering techniques instead of
the most efficient (glHint and GL_NICEST), as VisualBox aims to provide an
aesthetically pleasing experience.
Figure 4.5: In order: Solar Texture, Star Sprite Texture, Particle Sprite Texture
Next, the starmap is initialised into memory as static vertices that are randomly
spawned over a large area of the scene. Then the particles that are spawned over the
sun are initialised with random lifetimes, random inclinations, and random colours
(tending to the red/orange spectrum) using the SphericalEmit() function. These
particles are also assigned a unique frequency that they'll hold for the lifetime of
the visualisation. Once this is complete, the textures can finally be initialised
using InitTexturing(), where the solar texture is the only clamped texture, but all
are loaded with alpha channels. These textures can be seen in Figure 4.5. Finally,
the OpenGL draw sequence can start.
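The particle initialisation described above might look something like the following
sketch. It is only a guess at the shape of SphericalEmit(): the name spherical_emit,
the dictionary layout, and every numeric range are hypothetical and chosen purely for
illustration.

```python
import math
import random

def spherical_emit(index, radius=1.0):
    """Return a new particle resting at a random point on a sphere.

    The dict layout and the value ranges are illustrative assumptions.
    """
    return {
        "freq_index": index,                             # unique per particle
        "radius": radius,                                # distance from the Sun's centre
        "inclination": random.uniform(0.0, math.pi),     # angle from the +z axis
        "azimuth": random.uniform(0.0, 2.0 * math.pi),   # angle around the z axis
        "lifetime": random.randint(100, 500),            # frames spent in orbit
        # Full red, partial green, no blue: red through orange
        "colour": (1.0, random.uniform(0.0, 0.6), 0.0),
    }

# About 1200 point sprites were used on the development machine
particles = [spherical_emit(i) for i in range(1200)]
```

Drawing the inclination uniformly clusters particles slightly towards the poles; a
uniform distribution over the sphere would draw the cosine of the inclination
uniformly instead. Which approach VisualBox takes is not stated.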
The main draw function starts with a buffer clear, and then proceeds to translate
everything 3 units into the scene. This effectively positions the camera, ready to
rotate the whole scene by half a degree every frame. Alpha tests are then enabled,
with a check passing only if the alpha is greater than 0.1. The blend function is
also set up with the recommended layout for transparency; this is mainly for the
background stars and the particles as they pass over each other and over solid
objects.
Once the preliminary configuration has been set up, the ARB Point Sprites are ready
to be loaded. These point sprites are essentially vertices in space that are
hardware rendered with a single texture instead of a pixel or quad. The quadratic
is set up to dictate how distance should affect the size of the star or particle. The
quadratic is fairly arbitrary in this case; it was chosen to give the best-looking
effect for distant stars. The maximum point size is then specified and the actual
point size set to the maximum size. The only times that the sprites should fade out
of the scene are when they are too small to be perceived (when their size is less
than 3.0, from GL_POINT_FADE_THRESHOLD_SIZE_ARB) or when the sprite is no longer on
the screen.
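The effect of the quadratic can be reproduced outside OpenGL. The sketch below uses
the distance attenuation formula from the ARB_point_parameters extension, in which
the point size is multiplied by sqrt(1 / (a + b*d + c*d^2)) and then clamped. The
function name and the coefficient values are assumptions for illustration; the
coefficients VisualBox actually uses are not given in the text.

```python
import math

def attenuated_point_size(size, distance, a, b, c, min_size, max_size):
    """Point size after distance attenuation, clamped to [min_size, max_size]."""
    derived = size * math.sqrt(1.0 / (a + b * distance + c * distance ** 2))
    return max(min_size, min(max_size, derived))

# Hypothetical coefficients: purely quadratic fall-off with distance
a, b, c = 0.0, 0.0, 0.01

near = attenuated_point_size(64.0, 1.0, a, b, c, 1.0, 64.0)     # clamped to the maximum
far = attenuated_point_size(64.0, 100.0, a, b, c, 1.0, 64.0)    # shrinks with distance
```

With these coefficients a sprite 100 units away is drawn at a tenth of its nominal
size, while nearby sprites are capped at the maximum point size.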
The point sprites are now set up, and the stars can finally be rendered into the
scene. Textures are bound to the point sprites (in this case, the star sprite texture)
and the pre-calculated star map vertices are sent via the glBegin(GL_POINTS)
function. The vertices for the stars are kept static throughout the whole
visualisation. The particles that are orbiting the Black Hole and the Sun, however,
are dynamic.
Generation of the dynamic particles will start only when GStreamer has signalled
that a buffer is ready to be pulled and the buffer has been stored. Once this has
happened, VisualBox will deinterleave the stereo channels that make up the single
decoded audio buffer, and turn them into two arrays of data called lftch and rhtch.
The data will then be passed through a Fast Fourier Transform, and scaled down by 5
orders of magnitude. The particle texture is then bound to the following point
sprites, and the point sprite size is set to 2. When the point sprites for the
particles are sent to the buffer, every even particle will use the left channel's
frequency, and every odd particle will use the right channel's. This allows for
balance and beat for various streams in music.
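The even/odd channel selection and the scaling by five orders of magnitude can be
sketched as below. The function name particle_level, its parameters, and the
wrap-around of the frequency index are assumptions made for this example; the text
does not say how an index beyond the end of the FFT array is handled.

```python
def particle_level(index, freq_index, left_levels, right_levels, scale=1e-5):
    """Pick the channel by particle parity and scale the FFT level down."""
    # Even particles read the left channel, odd particles the right
    levels = left_levels if index % 2 == 0 else right_levels
    # Assumption: wrap the frequency index so it always lands inside the array
    return levels[freq_index % len(levels)] * scale

# Made-up FFT magnitudes standing in for the lftch and rhtch results
left_levels = [100000.0, 200000.0, 300000.0]
right_levels = [400000.0, 500000.0, 600000.0]

even = particle_level(0, 1, left_levels, right_levels)   # left channel, bin 1
odd = particle_level(1, 1, left_levels, right_levels)    # right channel, bin 1
```

Scaling by 1e-5 brings raw FFT magnitudes of 16-bit audio down to displacements of a
few scene units.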
Each particle will then be given a nudge (through randomNudge()), according to
its assigned frequency (from initialisation). This nudge will simulate a solar flare
and expel the particle by a displacement dictated by the level of the associated
frequency. If the particle doesn't enter the radius of the Black Hole (half the
distance between the Black Hole and the Sun) and get trapped, the particle returns to
an orbit position on the surface of the Sun and continues to randomly shift and roam
along the surface. However, if the particle does get trapped in the grip of the Black
Hole, the particle will assume an orbit around the Black Hole, with a speed dictated
once again by the sound level of the particle's associated frequency. Once trapped,
the particle will start a countdown until the count hits its designated lifetime, at
which point the particle is removed from the Black Hole and re-designated upon the
Sun, using SphericalEmit() again.
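The particle life cycle in this paragraph amounts to a small two-state machine. The
following sketch shows one possible reading of it; the function step_particle, the
dictionary fields, and the simplified straight-line flare are all hypothetical and
are not taken from the VisualBox source.

```python
import math

def step_particle(p, level, hole_pos, trap_radius):
    """Advance one particle by a frame; `p` is a dict with a hypothetical layout."""
    if p["state"] == "surface":
        # Flare: displace the particle outwards from the Sun by the level
        r = p["surface_radius"] + level
        p["pos"] = tuple(r * c for c in p["direction"])
        if math.dist(p["pos"], hole_pos) < trap_radius:
            p["state"] = "orbit"
            p["age"] = 0
    else:
        # Trapped: the level now drives orbital speed, and the particle ages
        p["orbit_angle"] += level
        p["age"] += 1
        if p["age"] >= p["lifetime"]:
            # Stand-in for re-emitting the particle with SphericalEmit()
            p["state"] = "surface"
            p["pos"] = tuple(p["surface_radius"] * c for c in p["direction"])
    return p

hole_pos = (4.0, 0.0, 0.0)
trap_radius = 2.0   # half the Sun-to-Black-Hole distance
p = {"state": "surface", "direction": (1.0, 0.0, 0.0), "surface_radius": 1.0,
     "pos": (1.0, 0.0, 0.0), "orbit_angle": 0.0, "age": 0, "lifetime": 2}

step_particle(p, 0.5, hole_pos, trap_radius)   # small flare: stays on the Sun
state_after_small = p["state"]
step_particle(p, 1.5, hole_pos, trap_radius)   # large flare: gets trapped
state_after_large = p["state"]
step_particle(p, 0.1, hole_pos, trap_radius)   # first frame in orbit
step_particle(p, 0.1, hole_pos, trap_radius)   # lifetime reached: back to the Sun
state_after_decay = p["state"]
```

Keeping the state in the particle itself means each frame only needs the current
level for that particle's frequency and the Black Hole's position.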
Finally, the Sun, which is a simple gluSphere, can be textured with the Solar
Texture using texture coordinates generated from gluQuadricTexture, and displayed
at the [0, 0, 0] coordinate. It is considered a static object for the entirety of
the visualisation. The Black Hole, however, has no texture (since it's supposed to
be black), and randomly orbits the Sun at a randomly incremented radius and
rotation. The position to which the Black Hole moves next is calculated by the
blackHoleNudge() function. All of the calculations used in VisualBox use a polar
coordinate system that's converted to Cartesian coordinates to calculate the
vertex positions around the Sun and the Black Hole. The Black Hole also uses this
same system to orbit the Sun.
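The conversion from polar to Cartesian coordinates is the standard spherical
coordinate transform. The sketch below assumes the inclination is measured from the
z axis and the azimuth from the x axis; the axis convention and the function name
are assumptions, since the text does not specify which convention VisualBox uses.

```python
import math

def spherical_to_cartesian(r, inclination, azimuth, centre=(0.0, 0.0, 0.0)):
    """Convert (r, inclination, azimuth) to an (x, y, z) point around `centre`."""
    x = r * math.sin(inclination) * math.cos(azimuth)
    y = r * math.sin(inclination) * math.sin(azimuth)
    z = r * math.cos(inclination)
    return (centre[0] + x, centre[1] + y, centre[2] + z)

# A point on the Sun's equator, and the same offset around a body at (4, 0, 0)
on_sun = spherical_to_cartesian(1.0, math.pi / 2, 0.0)
on_hole = spherical_to_cartesian(1.0, math.pi / 2, 0.0, centre=(4.0, 0.0, 0.0))
```

Passing a centre lets the same function place particles around either the Sun at the
origin or the moving Black Hole.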
Chapter 5
Results and Conclusions
5.1 Result
VisualBox provided an entertaining experience while listening to music. It
successfully utilised the frequency and corresponding level to calculate motion that
could be clearly seen by the viewer. However, due to the method of sending the Point
Sprites to the buffer, the performance of VisualBox was not quite as high as expected
on lower-end graphics hardware; in this case, the program was developed on an
Intel Core i5-2557 with HD3000 graphics. This limited the number of Point Sprites
on screen to about 1200 at 20 frames per second, instead of a potential 12000.
Figure 5.1: Close up of the stars rendered in VisualBox
Python was also considered a limiting factor, since it is a runtime-based
(interpreted) language and the amount of data passed about VisualBox was significant.
Removing one of the audio channels from the deinterleaving and FFT process actually
increased the framerate of VisualBox. Array manipulation routines of this kind would
likely be far faster in a language like C.
Figure 5.2: Close up of Sun and Black Hole with Particles
As can be seen in Figure 5.2, the addition of particles randomly orbiting the Sun and
the Black Hole proved to be a worthwhile inclusion. The particles help to obscure
the stretching of the Solar Texture (as can be seen in the centre of the Sun, if
observed closely), and the particles around the Black Hole show that there is
actually an object there, since they pass around and behind what is otherwise an
almost black shape.
Finally, Figure 5.3 clearly shows that all of the elements have come together in a
neat and entertaining fashion. The user does not need to have much of an input to
the visualisation, and the camera will slowly revolve about the Sun, giving a sense
of depth and dimension.
Figure 5.3: A still of the whole scene from within VisualBox
5.2 What Could Be Improved?
As discussed in the last section, Python was considered a restriction in terms of
performance. A compiled language like C would therefore be used instead of a
runtime-based language like Python.
OpenGL Vertex Buffer Objects (VBOs) should also have been used, instead of the
slow process of looping through an array of vertices and sending them to the buffer
using glBegin() and glEnd().
A true gravitational physics model could replace the simple displacement algorithm
based on the frequency and levels. This would provide a much more dynamic and fluid
experience. Particles could actually follow paths, and use the sound level to
accelerate the flow of particles to the Black Hole.
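As a rough idea of what such a model could involve, the sketch below advances one
particle under Newtonian attraction towards the Black Hole using a semi-implicit
Euler step. Nothing here comes from VisualBox: the function gravity_step, the
strength parameter gm, and the time step are hypothetical, and a real implementation
would also need to handle particles that reach the Black Hole's centre.

```python
import math

def gravity_step(pos, vel, hole_pos, gm, dt):
    """Advance position and velocity by one step under inverse-square attraction."""
    offset = [h - p for h, p in zip(hole_pos, pos)]
    dist = math.sqrt(sum(d * d for d in offset))
    # Acceleration of magnitude gm / dist^2, directed towards the Black Hole
    accel = [gm * d / dist ** 3 for d in offset]
    # Semi-implicit Euler: update the velocity first, then move with it
    vel = [v + a * dt for v, a in zip(vel, accel)]
    pos = [p + v * dt for p, v in zip(pos, vel)]
    return pos, vel

# One step for a particle at rest, two units away from the Black Hole
pos, vel = gravity_step([0.0, 0.0, 0.0], [0.0, 0.0, 0.0],
                        [2.0, 0.0, 0.0], gm=4.0, dt=1.0)
```

One way to tie this to the music would be to scale gm by the sound level of the
particle's frequency, so that louder passages pull particles in faster.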
A beat-detection algorithm could be added to find the Beats Per Minute of a song.
However, which algorithm would be best to use is still a large topic of debate,
since different styles of music have different beat signatures.
With the combination of all of these improvements, a true, fluid particle motion
with possibly tens or hundreds of thousands of particles could be implemented. This
could potentially look like the inspirational Black Hole and Sun binary system
shown in Figure 3.1.
5.3 Conclusions
Overall, VisualBox was a success, albeit with a few improvements still to be made.
It is a 3-dimensional visualiser that uses the frequency and level from sound
samples to display an aesthetic and entertaining visualisation synchronised with the
beat of the music.
The code can now act as a base for developing even more complicated visualisations.
It can be ported quite simply to other programming languages, and a plug-in could be
created for various media players. VisualBox could be considered a prototype and
learning experience for other programmers looking into the world of GStreamer and
OpenGL.
Appendix A
Program listings
Currently, VisualBox uses the Bazaar revisioning system and stores its code on
Launchpad.
Install Bazaar, Python GStreamer, Python OpenGL, and NumPy using:
sudo apt-get install bzr python-gst0.10 python-opengl python-numpy
And get the source code using:
bzr clone lp:alex-stevens/+junk/VisualBox
The code can also be viewed online at this address:
http://code.launchpad.net/alex-stevens/+junk/VisualBox
The revision that is referenced in this version of the document is revision 36.