DEGREE PROJECT IN MEDIA TECHNOLOGY, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2017
The Effect of Social Media on the Numbers of Streams of Unsigned Artists’ Music
BJÖRN LUNDKVIST
KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF COMPUTER SCIENCE AND COMMUNICATION
The Effect of Social Media on the Numbers of Streams of Unsigned Artists’ Music
Sociala mediers påverkan på antalet streams av osignerade artisters musik
Björn Lundkvist
Master’s thesis in Computer Science and Communication, DA222X
Degree Progr. in Media Technology 300 credits
Supervisor: Emma Frid
Examiner: Roberto Bresin
KTH Royal Institute of Technology
CSC School of Computer Science and Communication
SE-100 44 Stockholm, Sweden
2017-07-01
Abstract
Social media has provided a way for music artists to reach many people with their music, without having to rely on record labels
to perform marketing tasks. Most previous research within the area has focused on how already established music artists can use
social media as part of their marketing strategies and how digital technologies have transformed the music industry. This study
focuses on how unsigned music artists’ followers and fans on social media have an impact on their music streaming numbers. The
main research question of the study is: how does unsigned artists’ social media performance affect the number of streams of their
music? To investigate this, a robust regression model was defined with the aim of predicting the number of artists’ music streams
based on their social media data. The robust regression model showed that the social media variables did not have significant effects
on the number of streams. Therefore, an analysis of each individual artist in the data was conducted. The results showed that the
social media data in this study could not be used to explain changes in the number of streams for unsigned music artists. Instead, an analysis
of each individual artist and of the content that the artist posts on the different social media channels is
suggested. An information visualization tool was developed to give analysts an overview of the
social media data and to let them inspect each artist’s social media feeds in order to understand how artists’ social media
activities affect their music streaming data.
Keywords
Social media, unsigned music artists, music streaming, robust regression analysis, information visualization.
Sammanfattning (Summary)
Social media has made it possible for music artists to reach many people with their music without having to rely on record labels.
Previous research in the area has focused on how already established music artists can use social media as part of
their marketing strategies and how digital technology has changed the music industry. This study focuses on how unsigned
music artists’ numbers of followers and fans on social media affect the number of streams of the artists’ music. The main
research question of the study is: how does unsigned artists’ social media performance affect the number of streams of their music? To
investigate this, a robust regression model was defined with the aim of predicting the number of streams of the artists’ music based on their
social media data. The robust regression model showed that the social media variables did not have significant effects on the number of
streams of the artists’ music. Therefore, an analysis of each individual artist in the data was conducted. The results showed that the social media data
in this study could not be used to explain changes in the number of streams for unsigned music artists. An analysis based
on each individual artist and the content that each individual artist posts on the different social media channels is suggested instead. An
information visualization tool was developed with the purpose of giving analysts an overview of the social media data
and letting analysts look at each artist’s social media feeds to understand how the artists’ social media affects their
music streaming data.
Nyckelord (Keywords)
Social media, unsigned music artists, music streaming, robust regression analysis, information visualization.
1 INTRODUCTION
Marketing through social media is not a new phenomenon.
It has been an increasingly popular and effective way to
market one’s brand ever since the breakthrough of social
media. Social media has made engagement and collaboration
with consumers much easier for companies, mainly because
it lets them reach large numbers of people and build closer
customer relationships (Hudson et al., 2015). The consumers’ impact on
company sales and brand marketing has become more
significant with the emergence of social media, which
has made it possible for people to communicate and share
content with far more people than before.
Hudson and colleagues (2015) state that “when people hear
about a product from a friend, they become a customer at a
15 percent higher rate than when they find out about it
through other means”.
Music consumers are increasingly finding music through
social media. The technology, tools and platforms that have
made music so easily accessible have changed consumer
behaviors in terms of music consumption. Dewan and
Ramaprasad (2014) state that “technology has had a great
impact on music sales and the shape and size of the music
market”.
Most of the studies investigating the impact of social media
on the music industry have focused on how consumer-generated
content and peer-to-peer communication have
affected the music industry (Lin & Lu, 2011). Relatively few
studies have focused on social media as a tool for marketing
in the music industry (Boyd & Ellison, 2008). The research
that has been conducted within the music industry has
largely investigated how the roles of record labels and
artists have changed because of the digitization of the
industry (Leenders et al., 2015).
This study focuses on the effect of social media on the music
streams of artists who are not yet signed to a record label.
The main aim of the study was to investigate whether unsigned
artists’ streaming data could be predicted from the
artists’ social media data.
2 RELATED WORK
2.1 Social Media Data Visualization
Social media data is often large in volume and usually
user-generated, which makes it challenging to
analyze. However, visual analytics tools can be used
to make sense of the big data generated
from social media (Schreck & Kleim, 2013). The goal of
information visualization systems is to make
large data sets understandable by drawing on the human visual sense
(Card et al., 1999). Schreck and Kleim (2013) state that when
it comes to social media data, visual representations can help
provide an initial overview of the data. An initial overview
can in turn enable an analyst to locate certain points of
interest in the data (Schreck & Kleim, 2013). There are a
variety of frameworks that have been developed for
graphically visualizing data. Which framework
is most suitable depends on the data to
be visualized. Bostock and colleagues (2011) present
D3.js, a JavaScript data visualization
library focused on producing interactive and dynamic
data visualizations for web browsers. One advantage
of D3.js over comparable libraries is that it gives the user
fine-grained control over the final visual result. It is also more
efficient in terms of page-loading and data-loading
time than other libraries and frameworks. D3.js can
be applied to a wide range of data sets and has been designed
to let the user customize the visualization
to the data (Bostock et al., 2011). D3.js was
used to develop the information visualization system in this
study.
2.2 Social Media and the Music Industry
The music industry is in a constant state
of change. A large
part of its product has been transformed from a physical
product that can be bought and owned by consumers into a
digital product that is streamed via digital
platforms rather than bought and owned
(Wagman & Kopp, 2006). The digitization of the industry
has forced actors in the music industry to develop their
business models and focus on more service-oriented and
customer relationship oriented marketing strategies than
before (Vaccaro & Cohn, 2004).
Social media has made consumer-to-consumer
communication easier. Social networking has made it
possible for one person to reach hundreds of people with
quite little effort. Although marketing managers cannot
control the direct consumer-to-consumer conversations, they
can influence and shape the discussion using different
methods (Mangold & Faulds, 2009). Marketing has
previously been a one-way communication channel, but with
social media, the customers’ opinions and voices become
important for how a brand or an organization is perceived
(Mangold & Faulds, 2009). Marketing organizations
therefore need to coordinate so that they include social
media in their marketing strategy and send one unified and
customer-focused message to the consumer (Boone & Kurtz,
2013).
There has not been a lot of research conducted on the impact
of social media on the music industry, but lessons can be
learnt from research conducted on other industries.
Rothschild (2011) investigated the use of social media for
marketing purposes by sports and entertainment venues
and found that venues with a defined
social media strategy experienced higher revenues than
those without one. This
suggests that social media can be used strategically for
marketing purposes and that a strategic social media presence
can have a positive effect on sales and revenue (Rothschild,
2011).
2.3 Predicting Future Revenue
Asur and Huberman (2010) conducted a study focusing on
how much attention movies received on Twitter and how it
affected the revenues of the movies. Their study showed that
movies that got a lot of attention on social media performed
better in terms of viewers and revenues. The authors focused
specifically on Twitter and found that the number of tweets
that were generated over a specific period of time could be
used to create a model for predicting movie box-office
revenue. The authors state that although the focus of their
study was on movie revenue prediction, the method that they
used can be applied in other areas and markets. The model
that they present in their study can be generalized for
predicting revenues for products using social media (Asur &
Huberman, 2010). The generalized model can be seen in
Equation 1.
$$ y = \beta_1 A + \beta_2 P + \beta_3 D + \epsilon \qquad (1) $$
In Equation 1, y represents the predicted revenue. A
represents the rate of attention seeking. P represents the
polarity of sentiments and reviews, which means the views
and opinions that are expressed on social media. D
represents the distribution parameter and ε is the error term.
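As a minimal sketch of how Equation 1 yields a point prediction, the function below evaluates the linear combination with the error term left out. The coefficient and input values are hypothetical and chosen only for illustration; they are not taken from Asur and Huberman (2010).

```python
from typing import Tuple


def predict_revenue(attention: float, polarity: float, distribution: float,
                    betas: Tuple[float, float, float]) -> float:
    """Evaluate y = b1*A + b2*P + b3*D (error term omitted for a point prediction)."""
    b1, b2, b3 = betas
    return b1 * attention + b2 * polarity + b3 * distribution


# Hypothetical coefficients and inputs, for illustration only.
example = predict_revenue(attention=2.0, polarity=0.5, distribution=10.0,
                          betas=(1.5, 4.0, 0.2))
```

In practice the coefficients would be estimated from observed data rather than set by hand.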
Dhar and Chang (2009) investigated the effect of online
activity surrounding music albums and how online activity
affected sales of these albums. In their study, they
discovered that there was a positive correlation between the
amount of blog posts about a music album and the sales of
the album. A linear regression model was defined to estimate
the number of sales of a music album (see Equation 2). The
authors discovered that the sales of an album were dependent
on factors such as if it was released by a major record label
or if it had gotten reviews from mainstream sources. The
authors state that conclusions about the causality of blog
posts on album sales could not be drawn from their study.
However, the study clearly indicates that online activity
surrounding a music album had an impact on the album’s
sales. The word of mouth channel that the internet provides
for consumers had a significant impact on music sales,
according to the authors (Dhar & Chang, 2009).
$$ \ln(s)_{i,t+n} = a + \beta_1 \ln(b)_{i,t} + \beta_2 \ln(m)_{i,t} + \beta_3 \ln(r)_{i,t} + \beta_4 \ln(f)_{i,t} + \beta_5 \ln(c)_{i,t} + \beta_6 \ln(l)_{i,t} + \varepsilon_{i,t} \qquad (2) $$
In Equation 2, i represents the observed album. t represents
the date of observation. s represents the number of sales. n
represents the number of weeks since the observation date. b
represents the amount of blog chatter, meaning the number
of times an album has appeared on different blogs. m
represents the number of mainstream reviews. r represents
the average consumer rating. f represents the percentage
change in Myspace friends per week. c represents the level of
blog chatter multiplied by the number of Myspace friends.
l represents whether the album was backed by a major label
or not. ε is the error term.
2.4 Marketing Music in Social Media
Kaplan and Hanlein (2012) investigated how Britney Spears
and her employees used social media to interact with the
audience and to build, as well as maintain, the celebrity
brand image. They identified five areas in which Britney
Spears and her employees can be used as an example of how
to use social media in a successful way. The authors state
that it is important to concentrate one’s efforts on few
channels and to choose strategically which channels to focus
on in order to avoid excessive complexity. Social media
channels lose momentum over time and are
replaced by new channels. It is therefore important to maintain
a channel of one’s own, such as a website or blog, that is under the full
control of the artist’s team. Kaplan and Hanlein (2012)
highlighted the importance of making sure that content
created on different channels is aligned and consistent.
Britney Spears’ media team used Britney Spears’ website as
the center for her social media strategy and they even
ensured that content created by third party applications was
consistent with the media strategy of the brand. Actively
responding to one's audience is necessary in order to be
successful on social media. It is important to build
a community around the artist and to interact with the fans.
Britney Spears and her team achieved this by regularly selecting a few
of the fan-created videos that were posted on YouTube
and displaying them on Britney Spears’ website.
Britney Spears and her team also focused a lot on
keeping their integrity on social media by making it clear
when it was Britney Spears herself who was posting and
when it was a member of the marketing team. It is of
importance to be honest and differentiate between when a
post is made by the artist and when a post is made by a
member of the artist’s marketing team. Otherwise it can be
difficult to keep the credibility of the brand and the messages
that are posted on different social media (Kaplan & Hanlein,
2012).
Since not all artists have the resources that already
established artists like Britney Spears have, it is important to
distinguish artists that are in early stages of their careers
from artists that already have established themselves on the
music market. Leenders and colleagues (2015) studied how
young music artists have been configuring their media and
sales platforms in the digital age. Their study showed that
independent artists who were not signed to any record labels
benefitted from using social media in combination with live
performances in order to generate revenues. The authors also
support Kaplan and Hanlein’s (2012) view that marketing
performed by record labels plays an important role for
success for artists who are already signed. This is because
record labels can provide broader media access on many
different channels (Leenders et.al, 2015).
Technologies have paved the way for a new structure of music
production and distribution. They have made it
possible for individual musicians to produce and distribute
music independently from anywhere at any time. The low
entry barriers for new actors in the music industry that new
technologies have enabled have altered the music market in
a radical way. Independent music artists are now required to
perform a much wider range of tasks in order to succeed on
the music market. It has become a responsibility of the
independent artist to perform tasks that are of a more
business-like nature rather than of a creative nature. The
changes on the music market have forced independent music
artists to become entrepreneurs at the same time as becoming
artists (Hracs, 2012).
2.5 The Importance of Fans and Followers
An important term in this context is the notion of fandom.
Although there are many definitions of this term, most
researchers agree that it involves a group of people who
are connected through their shared interest in popular
culture objects (Baym, 2007). Much of the research that has
been conducted on fandom since the middle of the 1990s has
focused on the internet as a center for fan participation and
activity (Baym, 2007). Baym and Burnett (2009) argue that
an artist’s fan base can be a powerful force to take advantage
of when spreading and marketing a brand. They argue that
by understanding the fundamentals of fan participation,
strategies can be developed to use fan participation as a
marketing tool. By understanding the circumstances under
which people share and spread an artist's music or brand,
strategies can be used to manage the level of fan
participation in marketing an artist or a brand (Baym &
Burnett, 2009). The importance of fan participation has
never been larger than it is today. The development of
technology has made it possible for amateurs and fans to
produce content on their own. New technologies have made
it possible for user-generated content to measure against
professionally produced content in terms of quality (Jenkins,
2006). Baym and Burnett (2009) state that “fans are gatekeepers,
filters, and influencers on a scale they never were before the
internet. They are needed by both industry and other fans.”
However, one must be careful not to make the fans feel
exploited (Baym & Burnett, 2009).
3 METHOD
The main research question of this master’s thesis is: how does
unsigned artists’ social media performance affect the
number of streams of their music? The research question was
divided into three sub-questions.
1. How can stream numbers of an unsigned artist’s
music be predicted based on the artist’s social
media data?
2. Is there a relationship between the social media
data and the stream numbers of unsigned music
artists’ music?
3. Do certain social media channels have a larger
effect on the stream numbers of unsigned music
artists’ music?
The data in this study was gathered from a music distribution
service which is provided by one of the world’s largest music
corporations. Unsigned artists can use the distribution
service to upload their songs to most of the major streaming
services and music sales platforms. It is also possible for the
unsigned artists to connect their Facebook, Twitter,
YouTube and Soundcloud accounts to the distribution
service. This allows the distribution service to gather data
about the unsigned artists’ social media performance as
well as their music sales and streaming data.
The music streaming data in this study consisted of the
number of streams per day of each release of each artist on
the music streaming service Spotify. A release is defined as
a single, an EP or an album belonging to a specific artist.
The social media data, in contrast,
consisted of the cumulative (all-time) number of Facebook page likes,
YouTube fans, Twitter followers and Soundcloud followers
that the artists had on each day. Since not all unsigned artists
had connected their social media accounts to the distribution
service, and some did not have social media accounts, social media
data was not available for all artists.
An initial data analysis was conducted by visual inspection
of the data. The visual inspection of the data enabled
identification of outliers and errors in the data, as well as
localizing where there were missing values. The data set
contained data about the artists’ releases and social media
between the dates 2014-12-28 and 2017-03-05. The data
contained a total of 11,744 releases from 7,484 artists. Before 2015, the
distribution service tracked Spotify streaming data in a
different system from the one used for the data collection
in this study. Therefore, all releases that were
released before 2015 were removed from the data.
In order to calculate correlations between the social media
variables and the number of streams, the original data set
was filtered down to contain only the artists that had data
available for all the four different social media channels
during a period of 100 days after the release date of a release.
The interval of 100 days was chosen
because few artists had a consistent stream of
data for a longer period than 100 days. After filtering,
the data set contained 13 releases from a total of 11
artists. So few artists remained
because few had all four
social media channels connected to the distribution service
and because the data contained many missing values and error values.
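The filtering step described above can be sketched as follows. The record layout, the field names and the choice to count days 0 to 99 from the release date are assumptions made for illustration; the study does not specify how the data was stored or how the filter was implemented.

```python
from typing import Dict, List, Optional

# Hypothetical field names for the four social media channels.
CHANNELS = ("fb_likes", "tw_followers", "sc_followers", "yt_fans")
WINDOW_DAYS = 100  # window length stated in the text

DailyRow = Dict[str, Optional[int]]


def has_complete_window(rows: List[DailyRow]) -> bool:
    """True if every day 0..WINDOW_DAYS-1 has a value for all four channels."""
    by_day = {row["day"]: row for row in rows}
    for day in range(WINDOW_DAYS):
        row = by_day.get(day)
        if row is None or any(row.get(ch) is None for ch in CHANNELS):
            return False
    return True


def filter_releases(releases: Dict[str, List[DailyRow]]) -> Dict[str, List[DailyRow]]:
    """Keep only the releases whose first WINDOW_DAYS days are fully observed."""
    return {rid: rows for rid, rows in releases.items() if has_complete_window(rows)}
```

A release is dropped as soon as a single day in the window is missing or lacks a value for any channel, which is one way to read why so few releases remained after filtering.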
The assumptions for a linear regression model were not met.
Instead, a robust regression model was estimated from the
data. Robust regression is a form of regression analysis
that is not critically affected by violations of the assumptions
about the data. A robust regression model using maximum
likelihood estimation was defined to predict music
streaming data. The purpose of the
model was to predict how many streams an artist’s release
would have a certain number of days after the release
date, depending on the number of followers and fans that the
artist had on social media on the day of release. To
define the regression model, the original data set was filtered
down to contain only the artists that had their Facebook
accounts connected to the distribution service. This was
because there were strong correlations between all the
different social media channels in the data, so only
one social media channel was needed in the
regression model to represent the effects of social media on
the number of streams. The data analysis was conducted
using the statistical programming language R.
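To illustrate the general idea of robust regression, the sketch below fits a line with one predictor using a Huber M-estimator and iteratively reweighted least squares. This is an assumption about the kind of estimator involved: the study was carried out in R, and the text does not name the estimator, the tuning constant (1.345 here) or the scale estimate (median absolute deviation here).

```python
from statistics import median
from typing import List, Tuple


def huber_regression(x: List[float], y: List[float],
                     k: float = 1.345, iterations: int = 50) -> Tuple[float, float]:
    """Fit y = a + b*x with a Huber M-estimator via iteratively reweighted least squares."""
    weights = [1.0] * len(x)
    a = b = 0.0
    for _ in range(iterations):
        # Weighted least squares in closed form for a single predictor.
        sw = sum(weights)
        mx = sum(w * xi for w, xi in zip(weights, x)) / sw
        my = sum(w * yi for w, yi in zip(weights, y)) / sw
        sxx = sum(w * (xi - mx) ** 2 for w, xi in zip(weights, x))
        sxy = sum(w * (xi - mx) * (yi - my) for w, xi, yi in zip(weights, x, y))
        b = sxy / sxx
        a = my - b * mx
        residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust scale: median absolute residual, rescaled for normal consistency.
        scale = median(abs(r) for r in residuals) / 0.6745
        if scale == 0:
            break  # at least half of the points are fitted exactly
        # Huber weights: 1 inside the threshold, k*scale/|r| outside it.
        weights = [1.0 if abs(r) <= k * scale else k * scale / abs(r)
                   for r in residuals]
    return a, b
```

Observations with large residuals receive weights below 1, so a single extreme release pulls the fitted line far less than it would under ordinary least squares.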
An information visualization system was developed using
the D3.js framework. The aim of the information
visualization system was to give a visual representation of
the large data set for visual inspection and analysis to gain a
deeper understanding of the social media and streaming data.
The tool was developed with the purpose of making it
possible to locate certain points of interest in the data. The
points of interest could then be used to look at the artist’s
social media feeds on certain dates in order to
understand what activities may have affected the streaming
and social media data.
4 RESULTS
4.1 Social Media Data
Visual inspection of the data was carried out as a first step of
the data analysis and the descriptive statistics for the original
social media data can be seen in Table 1. The data was
filtered down to contain only the artists that had data
available for all four different social media. This was done
in order to analyze the correlations between the different
social media and the number of streams for different artists.
It was also done to see if there were some social media
channels that had larger effects on the number of streams for
the artists in the data set. After filtering the data, there were
11 artists left in the data set. The data set
became much smaller when filtered
because much of the data could not be used
owing to missing values and error values.
The median values were specified in the descriptive statistics
instead of the mean values since the data was not normally
distributed. As can be seen in Table 2, the differences between
the median, the 1st quartile and the 3rd quartile for the filtered
data set were quite large for all social media variables. This
means that the artists in the filtered data set varied a lot in
terms of Facebook page likes, Twitter followers,
Soundcloud followers and YouTube fans. The median,
1st quartile and 3rd quartile for Twitter followers
were much larger than the corresponding values
for the other variables, which indicates that
the artists in the filtered data set had more Twitter followers
than followers on the other social media channels.
Soundcloud was the social media variable with the smallest
spread between the median, the 1st quartile and the 3rd
quartile, and it was also the variable with
the lowest median and quartile values. This indicates that
the artists in the filtered data set had the fewest followers on
Soundcloud compared to the other social media channels.
When comparing the descriptive statistics for the original
data set in Table 1 with those for the
filtered data in Table 2, it becomes clear that the filtered data
had larger median and quartile values for all social media
variables than the original data set. This shows that
the filtered data set was not very representative of the
original data set.
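The descriptive statistics reported in Tables 1 and 2 can be computed along the following lines. This is a minimal sketch: the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, since the text does not state which convention was used, and the function name is hypothetical.

```python
from statistics import median, quantiles
from typing import Dict, Sequence


def describe(values: Sequence[float]) -> Dict[str, float]:
    """Median, 1st and 3rd quartile, and number of observations."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {"median": median(values), "q1": q1, "q3": q3, "n": len(values)}
```

Reporting the median and quartiles rather than the mean keeps a few very large accounts from dominating the summary of a skewed variable.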
The initial data analysis also revealed that the different social
media variables were all strongly correlated. As can be seen
in Table 3, there were strong positive Spearman correlations
between the social media variables in the data set. Spearman
correlation is a nonparametric measure of the rank
correlation between two variables. The Spearman
correlation coefficient takes values between -1 and 1,
where 1 is total positive correlation and -1 is total negative
correlation (Lehman et al., 2005). The strong positive
correlations between the social media variables showed that
the data for the different social media followed the same
increasing pattern. This means that when one social media
channel increased in value, the other channels increased as
well. Since all the correlations were strong, significant and
positive between the social media variables, only one
variable was selected to represent the impact of social media
on an artist's music streams in a regression model. This
allowed using a larger data set, since there were more artists
who had one social media connected to the distribution
service than artists who had all four social media connected
to the distribution service. Facebook was chosen as the
variable to focus on since it allowed for the largest data set.
When Facebook was chosen as social media variable, the
filtered data set contained 40 releases from a total of 33
artists.
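As a sketch of how a Spearman coefficient such as those in Table 3 is obtained, the code below ranks both variables (tied values share their average rank) and then computes the Pearson correlation of the ranks. It uses only the standard library, and the function names are illustrative rather than taken from the study.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values receive the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, any strictly increasing relationship between two follower counts gives a coefficient of 1, even when the relationship is far from linear.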
Table 1. Descriptive statistics for the original data set.
                MEDIAN   Q1    Q3     N
FB LIKES        298      96    929    4123453
TW FOLLOWERS    201      33    790    3342177
SC FOLLOWERS    42       12    147    4672194
YT FANS         15       2     68     3758094
FB = Facebook, YT = YouTube, TW = Twitter, SC = Soundcloud, Q1 = 1st quartile, Q3 = 3rd quartile, N = number of observations.
Table 2. Descriptive statistics for the social media variables after filtering the data set.
                MEDIAN   Q1    Q3      N
FB LIKES        389      204   1501    1148
TW FOLLOWERS    5342     622   12924   1052
SC FOLLOWERS    172      16    300     1148
YT FANS         838      107   950     1148
FB = Facebook, YT = YouTube, TW = Twitter, SC = Soundcloud, Q1 = 1st quartile, Q3 = 3rd quartile, N = number of observations.
Table 3. Spearman correlation coefficient matrix for the social media variables.
            FB LIKES   YT FANS    TW FANS    SC FANS
FB LIKES    1          -          -          -
YT FANS     0.973***   1          -          -
TW FANS     0.899***   0.862***   1          -
SC FANS     0.787***   0.996***   0.708***   1
FB = Facebook, YT = YouTube, TW = Twitter, SC = Soundcloud, *** p ≤ 0.001.
It became clear that the artists in the data set could be divided
into three groups in terms of number of Facebook page likes.
Two artists in the data set stood out since they
had far more Facebook page likes than the rest of the artists:
68,209 and 106,405 Facebook page
likes on the day of release. As can be seen in Figure 1, there
were six artists in the filtered data set that had more than 1,000
but fewer than 6,000 Facebook page likes. Most of the artists
in the filtered data set had fewer than 750 Facebook page likes
on the day of release. As can be seen in Figure 2,
25 of the 33 artists in the filtered data set had
fewer than 750 Facebook page likes.
Figure 1. Facebook page likes on the day of release for
the artists that had fewer than 6,000 likes but more than
1,800 likes.
Figure 2. Facebook page likes on the day of release for
the artists that had less than 750 likes.
4.2 Streaming Data
As mentioned previously, the data set was filtered to include
releases from artists who had their Facebook accounts
connected to the distribution service. This resulted in a
filtered data set containing 40 releases. Visual data analysis
of the 40 releases revealed that they could be divided into
three groups in terms of number of streams over the first 100
days after release.
The first group contained four releases from four different
artists that had much larger numbers of streams over the first
100 days after release compared to the rest of the releases.
The number of streams per individual release was plotted
against the number of days since release for the first group,
as can be seen in Figure 3. The four releases plotted in Figure
3 had more than 2,000 streams over the first 100 days after
release.
The second group consisted of releases that had more than
100 but fewer than 2,000 total streams during the first 100 days
after release. There were 20 releases in this group. The
releases in group 2 are plotted in Figure 4.
The third group consisted of all releases that had fewer than
100 total streams over the first 100 days after release. There
were 16 releases in the third group. This group is plotted in
Figure 5. Each line in Figures 3, 4 and 5 represents one
release. The visual inspection indicated no distinguishable
patterns between the releases in the data set in terms of
6
number of streams per day. The number of streams per day
for each individual release varied a lot.
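The grouping described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the function name stream_group and the example totals are hypothetical, and the handling of releases with exactly 100 or exactly 2,000 streams is an assumption, since the text does not specify it.

```python
def stream_group(total_streams):
    """Assign a release to group 1, 2 or 3 from its total streams
    over the first 100 days after release.

    Assumption: totals of exactly 2,000 or exactly 100 fall into the
    lower group; the text leaves these boundary cases unspecified.
    """
    if total_streams > 2000:
        return 1
    if total_streams > 100:
        return 2
    return 3


# Hypothetical totals, not taken from the study's data set.
example_totals = {"release_a": 5400, "release_b": 730, "release_c": 42}
groups = {name: stream_group(total) for name, total in example_totals.items()}
print(groups)
```

Counting how many releases fall into each group would then reproduce the kind of split reported in this section (4, 20 and 16 releases).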
Figure 3. Number of streams/day for the releases in
group 1. The x-axis represents the number of days since
release and the y-axis represents the number of streams.
Figure 4. Number of streams/day for the releases in
group 2. The x-axis represents the number of days since
release and the y-axis represents the number of streams.
Figure 5. Number of streams/day for the releases in
group 3. The x-axis represents the number of days since
release and the y-axis represents the number of streams.
The number of streams was aggregated for all artists in order
to plot the average number of streams per day. The average
number of streams per day for all 40 releases in the data set
was plotted against the number of days since release, as can
be seen in Figure 6. The average number of streams per day
increased over the first 100 days after release. However, it
is important to consider that one of the artists accounted
for a very large share of the total number of streams in the
data set. As can be seen in Figure 3, one artist had more
streams than most other artists, and that artist’s streams per
day also followed a linear pattern. This artist is therefore a
major reason why the aggregated data in Figure 6 shows a
linear increase in the number of streams with the number of
days since release. The correlation between the average
number of streams per day and the number of days since
release was 0.74 (p < 0.001), which confirms that the average
number of streams increased as the number of days since
release increased for the aggregated data.
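The influence of a single dominant release on the aggregated curve can be illustrated with a small sketch. The code below averages daily stream counts across releases and computes a Pearson correlation with the number of days since release. It is an assumption that the reported correlation is a Pearson correlation, since the text does not state the type, and all function names and data values are hypothetical.

```python
from math import sqrt


def average_per_day(releases):
    """Average the daily stream counts across releases, day by day."""
    n_days = min(len(r) for r in releases)
    return [sum(r[d] for r in releases) / len(releases) for d in range(n_days)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one steadily growing release and two irregular ones.
releases = [
    [10 + 2 * d for d in range(10)],
    [3, 1, 4, 1, 5, 9, 2, 6, 5, 3],
    [2, 7, 1, 8, 2, 8, 1, 8, 2, 8],
]
avg = average_per_day(releases)
days = list(range(1, len(avg) + 1))

r_all = pearson(days, avg)                                # all releases
r_without = pearson(days, average_per_day(releases[1:]))  # growing release removed
print(round(r_all, 2), round(r_without, 2))
```

In this made-up example the correlation is clearly higher when the steadily growing release is included, which mirrors the caution expressed above about interpreting the aggregated data.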
Figure 6. Average number of streams per day for the
entire data set.
4.3 Robust Regression Model
A robust regression model was defined to predict the number
of streams based on the number of days since release and the
number of Facebook likes on the artist’s profile page on the
day of release. The number of Facebook page likes on the day
of release was used, rather than the number of Facebook page
likes on each day of the observed time period, because future
numbers of Facebook page likes are unknown when predicting
the future number of streams for an artist. If the daily
number of Facebook page likes were used as an independent
variable instead, it would itself have to be estimated before
being used in the model. This would lead to a larger
uncertainty in the estimated number of streams for the artist,
since the estimate would be based on an estimated number of
Facebook page likes. The robust regression was carried out
in R using the command that can be seen in Equation 3.
library(MASS); summary(m <- rlm(s ~ d + f)) (3)
m in Equation 3 represents the model, s represents the
number of streams, d represents the number of days since
release and f represents the number of Facebook page likes
on the day of release. A regression equation was defined and
can be seen in Equation 4.
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$ (4)
$x_1$ represents the number of days since release and $x_2$
represents the number of Facebook page likes on the day of
release. The p-values for both Facebook page likes and days
since release were below 0.001. However, the $\beta_1$
coefficient was 0.015 and the $\beta_2$ coefficient was 0.005,
which shows that the relationship between the independent
variables and the dependent variable was very small. The
relationship was detected because of the large sample size,
but it is too small to be of practical significance. When
there is a small relationship in the data, the p-values become
smaller as the data set grows larger. Even though the p-values
indicate that the independent variables were statistically
significant, the relationship between the independent
variables and the dependent variable was too small to explain
the number of streams. Therefore, the robust regression model
does not support the hypothesis that the number of streams
can be explained by the number of days since release and the
number of Facebook page likes on the day of release. A
similar analysis was performed for each of the other social
media variables to see whether any of them had a significant
effect on the number of streams. As in the case of Facebook
likes, the other social media variables did not have
significant effects on the predicted number of streams.
Figure 7. View of the information visualization system for one
selected artist’s release. The x-axis represents the number of
days since release. The y-axis represents the number of Spotify
streams, Facebook page likes, Twitter followers, Soundcloud
followers and YouTube fans.
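To illustrate how small the reported coefficients are, the following sketch computes the change in the predicted number of streams implied by Equation 4 when one independent variable changes and the other is held constant. Only the two coefficients are taken from the results above. The intercept is not reported, so only differences are computed, and the function name and example inputs are hypothetical.

```python
BETA_DAYS = 0.015   # reported coefficient for days since release
BETA_LIKES = 0.005  # reported coefficient for Facebook page likes on release day


def predicted_change(extra_days=0, extra_likes=0):
    """Change in the predicted number of streams under the linear model
    when days since release and/or Facebook page likes increase."""
    return BETA_DAYS * extra_days + BETA_LIKES * extra_likes


# 100 additional days since release: about 1.5 additional predicted streams.
print(round(predicted_change(extra_days=100), 3))

# 1,000 additional Facebook page likes: about 5 additional predicted streams.
print(round(predicted_change(extra_likes=1000), 3))
```

Under these coefficients, even large changes in the independent variables shift the prediction by only a handful of streams, which is why the low p-values do not translate into a useful predictive model.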
4.4 Observed Behaviors Among Artists’ Social Media Data
Since the robust regression model analysis showed that the
number of days since release and the number of Facebook likes,
or any of the other social media variables, did not have a
significant effect on the predicted number of streams, the
data for each individual release was analyzed separately. The
analysis showed that the patterns in the data differed
considerably between individual artists. Spearman correlations
between the number of streams and the number of Facebook likes
were calculated for each artist. The only significant
correlation that could be observed was for an artist that is
referred to as artist 30636 in this study. Artist 30636 had a
Spearman correlation of 0.69 between the number of Facebook
likes and the number of streams. This means that when artist
30636’s total number of Facebook likes increased, the number
of streams per day tended to increase as well. Artist 30636
also had a much larger number of Facebook page likes on
average than most of the artists in the data set, with an
average of 68,258 total Facebook likes over the days that were
examined. All other artists in the data set had correlation
values below 0.55. This suggests that there was only one
artist in the data set for whom the number of Facebook likes
was related to the number of streams. The same analysis was
conducted for the other three social media variables. There
were 27 artists who only had Soundcloud connected to their
account on the distribution service, 32 artists who only had
Twitter connected and 21 artists who only had YouTube
connected. No significant correlations could be observed
between the number of streams and the number of YouTube fans,
Twitter followers or Soundcloud followers. No discernible
patterns could be found between the number of streams and the
number of fans or followers on the different social media
channels. The number of streams differed considerably from
day to day for most of the artists.
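A Spearman correlation of the kind reported above is a Pearson correlation computed on ranks. The sketch below shows one way to compute it for a single artist, giving tied values the mean of their ranks. It is an illustration under assumptions, not the code used in the study: the function names and the example values are hypothetical, and no significance test is included.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical daily values for one artist, not taken from the study.
likes = [100, 102, 102, 105, 110, 111]
streams = [3, 5, 4, 9, 8, 12]
rho = spearman(likes, streams)
print(round(rho, 2))
```

Because only the ordering of the values matters, a high value indicates that streams tend to rise when likes rise, without implying that the relationship is linear or causal.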
4.5 Information Visualization System
An overview of the information visualization system can be
seen in Figure 7. The system was designed to give a visual
representation of individual artists’ streaming and social
media data. First, the user chooses a release from a specific
artist. The system then displays a chart showing how the
number of Spotify streams has developed over time since the
release date. The chart also displays how the artist’s social
media data has developed during the same period. Each line in
the chart represents one data channel for the artist. The
green line represents the number of Spotify streams of the
release, the dark blue line represents the artist’s number of
Facebook page likes, the light blue line represents the
artist’s number of Twitter followers, the orange line
represents the artist’s number of Soundcloud followers and the
red line represents the artist’s number of YouTube fans. The
user can also choose to hide or show the different social
media data in the chart by clicking on the social media logos
beneath the chart. The user can interact with the chart by
hovering the mouse cursor over a specific data point in the
chart area. The chart then displays pop-ups on each line with
the exact values of the data points aligned with the mouse
cursor on the x-axis. It also shows the specific date that
each day on the x-axis represents. Additionally, the
visualization system displays the activity feeds from each
social media channel for the selected artist to the right of
the chart. This allows the user to look at the activity feed
and understand what activities may have caused changes in the
data. For example, if there is a sudden increase in Spotify
streams at some point in the chart, such as around point 60 on
the x-axis in Figure 7, the user can look at the different
social media activity feeds around that date to see what may
have caused the increase in Spotify streams. A walkthrough of
the information visualization system can be found in this
video (see footnote 1).
5 DISCUSSION
5.1 Predicting Number of Streams from Social Media Data
Considering the results, it is not possible to answer the
question of how stream numbers of an unsigned artist’s music
can be predicted based on the artist’s social media data with
the data that was used in this study. The robust regression
model showed that the number of days since release and the
number of followers, fans and page likes on the four different
social media channels did not have significant effects on the
number of streams. However, it is important to consider that
there were outliers in the data set that had considerably
larger numbers of streams and social media followers, fans and
page likes than most of the artists in the data set. Although
the robust regression model that was defined in this report
indicated no significant effect, there could be unobserved
social media variables that have an effect on the number of
streams. For instance, the number and quality of the artists’
posts on the different social media channels were not
accounted for in this study. Marketing activities beyond
social media, such as press releases and live performances by
the artists, were not observed in this study either. This is
because the gathered data only contained the number of
followers, fans and page likes on the different social media
channels for the artists, and the aim of the study was to see
how those variables affected the artists’ number of streams.
The distribution service did not track social media data of a
more qualitative nature. The amount of fan participation
around an artist on social media was not accounted for in the
regression model either. Social media are complex channels in
which fan participation and different marketing activities by
the artists can be the difference between a successful artist
and an unsuccessful artist.
1 https://www.youtube.com/watch?v=S7ICiGEuQx4
The study by Asur and Huberman (2010) showed that movies that
generated a lot of attention on Twitter performed better in
terms of revenues and viewers. The authors showed that the
number of tweets generated around a movie could be used as an
independent variable in a regression model for predicting
revenue. Although the focus of Asur and Huberman’s (2010)
study was on movies, their model can be applied to other
products that use social media for marketing purposes. This
shows that there are other aspects of social media that can
be examined to gain a deeper understanding of how social
media can influence music streaming numbers for unsigned
artists.
Dhar and Chang (2009) examined the effect of blog posts on
the sales of music albums. They concluded that a causal
connection between blog posts and sales of an album could not
be drawn from their study, because there may be an unobserved
variable that causes both an increase in blog posts and an
increase in music album sales. The same can be said about the
number of followers, fans and likes of the unsigned artists
in this study. The direction of causality between the number
of followers, fans and likes and the number of streams cannot
be determined with certainty. One could argue that it is the
popularity of the artists’ music that draws attention to the
artists’ social media channels, not the other way around. A
deeper analysis of each individual artist must be made before
conclusions can be drawn on that matter. The information
visualization system that was developed in the current study
was designed to function as a tool for this purpose. It was
designed to give analysts a good overview of the social media
and music streaming data, as well as a way to look more
closely at each social media channel for the artists. The
tool allows analysts to look at the quantitative data and to
analyze qualitative data about each artist’s social media
activities. The chart with the social media and music
streaming data can be used to identify points of interest in
the music streaming data. As can be seen in Figure 7, the
social media feeds on the right side of the information
visualization tool can then be used to look at the specific
posts that might have affected music streaming.
5.2 Relationships Between Social Media and Music Streaming Data
The results showed that there were strong positive Spearman
correlations between all social media variables in the data
set. However, there was only one strong positive Spearman
correlation between an artist’s number of Facebook likes and
number of streams. This artist also had a much larger number
of Facebook page likes than the rest of the artists in the
data set. This would suggest that the number of Facebook page
likes only affects the number of streams for artists who have
gained a certain number of Facebook page likes. However, this
cannot be generalized to all artists since this behavior was
only observed for one artist in the data set. That artist
also had strong positive Spearman correlations between the
number of streams and the number of days since release, as
well as between the number of Facebook page likes and the
number of days since release. That means that the number of
streams and the number of Facebook page likes both tended to
increase over time.
The people who look at an artist’s Facebook posts are mostly
people who already follow the artist’s Facebook page. This
suggests that artists who have larger numbers of Facebook
likes will reach larger numbers of people and in turn get
more streams of their music. Artists who have smaller numbers
of followers, fans or likes will not reach as many people
with their posts and will therefore not gain as much
attention, which in turn will not generate as many streams of
their music. However, there was only one artist that had a
larger number of Facebook page likes than most of the other
artists. The same pattern could not be observed among the
other artists in the data set. Therefore, it cannot be
concluded that artists with a certain number of Facebook page
likes have a strong positive correlation between the number
of streams and the number of Facebook page likes.
There were no strong correlations between the number of
streams and the number of Twitter followers, the number of
streams and the number of Soundcloud followers or the
number of streams and the number of YouTube fans, for any
artist in the data set. This suggests that the social media data
that was used in this study did not correlate with the number
of music streams.
5.3 Social Media Channel Effects
As mentioned above, Facebook was the only social media
channel for which a correlation was found between the number
of streams and the number of followers, fans or page likes.
However, the correlation only existed for one artist who had
considerably larger numbers of Facebook likes and streams than
the rest of the artists in the data set. The robust
regression analysis showed that none of the social media
variables had significant effects on the number of streams.
When looking at the descriptive statistics for the original
data set in Table 1, Facebook was the social media variable
with the biggest variation between the first and the third
quartile, as well as the largest median value. This shows
that unsigned artists in general have more Facebook page
likes than Twitter followers, Soundcloud followers and
YouTube fans.
An important thing to consider regarding the effect of
followers on Soundcloud and fans on YouTube is that
Soundcloud and YouTube provide their own media players
for playing music on their platforms. This means that people
who follow artists on Soundcloud and YouTube do not have
to use a second platform to listen to the artist’s music. The
streams that the artists get on Soundcloud and YouTube are
not reflected in the streaming data in this study, since the
streaming data only consisted of Spotify streams.
Based on the results of this study, there was only one
observation where the number of Facebook page likes had a
strong positive correlation with the number of streams. That
was for an artist who had a relatively large number of page
likes on Facebook compared to the rest of the artists in the
data set. Leenders and colleagues (2015) showed that social
media can function as a successful marketing tool for
independent music artists when used in combination with live
performances. The data in this study did not capture the
activities that the artists performed on the different social
media channels. This, once again, suggests that qualitative
data about the social media marketing activities performed by
the artists could be more suitable to look at when trying to
answer the question of how social media can affect the number
of streams of an unsigned artist’s music.
5.4 General Discussion
Record labels still play an important role in marketing and
gaining exposure for artists. Dhar and Chang (2009), Kaplan
and Haenlein (2012) as well as Leenders and colleagues (2015)
all concluded that record labels play a big role in an
artist’s success because of the broad media coverage that the
record labels can provide in terms of marketing. This shows
that artists who are signed to record labels have an
advantage over artists who are unsigned. Being unsigned makes
it the responsibility of the artists themselves to perform
all marketing tasks. Competing for the audience’s attention
with the record labels as well as all the other unsigned
artists is a big task for artists with small fanbases. As can
be seen in Table 1, most of the artists had fewer than 929
Facebook page likes, 790 Twitter followers, 167 Soundcloud
followers and 48 YouTube fans. As can be seen in Figure 5,
many artists in the data set had a very small number of
streams of their music per day. This shows that the artists
represented in Figure 5 are losing the battle against the
other artists in terms of bringing attention to their music.
The fact that so many artists in the data set had a small
number of streams per day made it difficult to predict how
many streams an artist would have on a release based on their
social media data. Since a few artists had considerably
larger numbers in terms of music streams and social media
data, they became outliers compared to the rest of the
artists. As can be seen in Figure 3, only four artists in the
data set were more successful than the rest of the artists in
terms of number of music streams.
5.5 Limitations and Future Work
Since there were many missing values and errors in the data,
the data set had to be filtered before conducting an analysis
of the effect of social media on streams. After filtering,
the data set only contained a small part of all the artists
that were in the original data set. As can be seen when
comparing the descriptive statistics in Table 1 with those in
Table 2, the quartile and median values differ considerably
between the two tables. This means that the filtered data did
not give an accurate representation of the raw data.
This study was also limited by only including data from the
four social media channels Facebook, Twitter, Soundcloud and
YouTube. These are the four social media channels that the
distribution service has chosen to gather data from and
considers the most important. There are, however, many more
social media channels on the market, and new channels and
platforms emerge all the time. Which social media platforms
and channels are the most effective can change over time.
Therefore, future research should investigate how other
social media channels can function as marketing channels for
unsigned music artists and how marketing efforts on other
channels can affect the number of streams of unsigned
artists’ music.
The data in this study was limited to the number of followers,
fans and page likes on the four social media channels and did
not cover qualitative data about the content that the artists
in the study were posting on the different social media
channels. Future research should focus on what kind of posts
and content on social media have the biggest effect on gaining
attention and followership for unsigned artists.
Future research should also focus on what kind of marketing
strategies unsigned artists should apply to gain more
attention. Previous research in this area has mostly focused
on artists that have already established themselves on the
music market. As mentioned by Kaplan and Haenlein (2012), who
investigated the social media marketing strategy used by the
artist Britney Spears, already established artists are often
backed by record labels and marketing teams.
6 CONCLUSIONS
Although this study did not find a concrete answer to how
unsigned artists’ social media performance affects the streams
of their music, some conclusions can be drawn. The research
question was addressed by creating a robust regression model
and by conducting an individual inspection of the artists in
the data set. This showed that the number of days after
release and the number of followers, fans or likes on
Facebook, Twitter, Soundcloud and YouTube did not have
significant effects on the number of streams of unsigned
music artists’ releases. This in turn suggests that the number
of followers, fans and likes cannot be used as sole predictors
of the number of streams of unsigned artists’ music. Instead,
it suggests that one must look more closely at each individual
artist’s activities on the different social media channels.
One reason why the regression model did not yield any
significant results could be that the data contained a lot of
missing values. It is also possible that the data is very
specific, since it only contains data about artists who have
used the distribution service. It is possible that a
significant regression model could be found for artists who
are not connected to the distribution service.
An information visualization tool was developed. It was
designed to give an analyst a good overview of artists’
quantitative data about the number of followers, fans and
likes as well as the number of music streams of the artists’
releases. The tool allows analysts to locate points of interest
in the data and then look at each artist’s social media channel
feeds to gain an understanding of which kind of activities
might have affected changes in the music streaming and
social media data.
Social media has proven to be a powerful channel for
marketing purposes and for reaching a large number of people
in previous studies, but it is also a complex channel with
many layers. Interesting future research would include an
in-depth analysis of which social media content is most
effective in terms of generating music streams for unsigned
music artists. One way of performing such an analysis would
be to use information visualization tools such as the one
developed in this study to analyze unsigned music artists’
social media and music streaming data.
7 REFERENCES
Asur, S., & Huberman, B. A. (2010, August). Predicting the future with social media. In Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on (Vol. 1, pp. 492-499). IEEE.
Baym, N. K., & Burnett, R. (2009). Amateur experts: International fan labour in Swedish independent music. International Journal of Cultural Studies, 12(5), 433-449.
Baym, N. K. (2007). The new shape of online community: The example of Swedish independent music fandom. First Monday, 12(8).
Bostock, M., Ogievetsky, V., & Heer, J. (2011). D³ data-driven documents. IEEE Transactions on Visualization and Computer Graphics, 17(12), 2301-2309.
Boyd, D. M., & Ellison, N. B. (2008). Social networking sites: Definition, history and scholarship. Journal of Computer-Mediated Communication, 13(1), 2-21.
Card, S. K., Mackinlay, J. D., & Shneiderman, B. (1999). Readings in information visualization: Using vision to think. Morgan Kaufmann.
Boone, L. E., & Kurtz, D. L. (2013). Contemporary marketing. Cengage Learning.
Dewan, S., & Ramaprasad, J. (2014). Social media, traditional media, and music sales. MIS Quarterly, 38(1), 101-121.
Dhar, V., & Chang, E. A. (2009). Does chatter matter? The impact of user-generated content on music sales. Journal of Interactive Marketing, 23(4), 300-307.
Hracs, B. J. (2012). A creative industry in transition: The rise of digitally driven independent music production. Growth and Change, 43(3), 442-461.
Hudson, S., Roth, M. S., Madden, T. J., & Hudson, R. (2015). The effects of social media on emotions, brand relationship quality, and word of mouth: An empirical study of music festival attendees. Tourism Management, 47, 68-76.
Jenkins, H. (2006). Convergence culture: Where old and new media collide. NYU Press.
Kaplan, A. M., & Haenlein, M. (2012). The Britney Spears universe: Social media and viral marketing at its best. Business Horizons, 55(1), 27-31.
Leenders, M. A., Farrell, M. A., Zwaan, K., & ter Bogt, T. F. (2015). How are young music artists configuring their media and sales platforms in the digital age? Journal of Marketing Management, 31(17-18), 1799-1817.
Lehman, A., O’Rourke, N., Hatcher, L., & Stepanski, E. (2005). JMP for basic univariate and multivariate statistics. SAS Institute Inc., Cary, NC, 481.
Lin, K., & Lu, H. (2011). Why people use social networking sites: An empirical study integrating network externalities and motivation theory. Computers in Human Behavior, 27(3), 1152-1161.
Mangold, W. G., & Faulds, D. J. (2009). Social media: The new hybrid element of the promotion mix. Business Horizons, 52(4), 357-365.
Rothschild, P. C. (2011). Social media use in sports and entertainment venues. International Journal of Event and Festival Management, 2(2), 139-150.
Schreck, T., & Keim, D. (2013). Visual analysis of social media data. Computer, 46(5), 68-75.
Vaccaro, V. L., & Cohn, D. Y. (2004). The evolution of business models and marketing strategies in the music industry. International Journal on Media Management, 6(1/2), 46-58.
Wagman, M. R., & Kopp, R. E. (2006). The digital revolution is being downloaded: Why and how the copyright act must change to accommodate an ever-evolving music industry. Vill. Sports & Ent. LJ, 13, 271.