  • Associative Framing

    A unified method for measuring media frames and the media agenda

    Wouter van Atteveldt1, Nel Ruigrok2, and Jan Kleinnijenhuis1

    1 Department of Communication Science, Free University Amsterdam, [email protected], [email protected]

    2 Department of Communication Science, University of Amsterdam, [email protected]

    Abstract. Communication theories such as Framing use increasingly sophisticated models of message content and transfer. Supporting this theoretical work requires more sophisticated methods of content analysis, because the manual thematic content analysis often practised is too expensive and too specific to a single theory to provide the large corpora needed to test sophisticated communication models. This paper proposes Associative Framing, a Relational Content Analysis method based on the marginal and conditional reading chance of concepts. This provides a communication-theoretic interpretation of linguistic co-occurrence analysis, and can help communication research by increasing the reach of automatic analyses and creating more generic data sets, allowing for the simultaneous testing of different theories. We illustrate this method with a case study on the associations of Islam and Terror in newspapers from three countries.

    Keywords: Content Analysis, Computer Content Analysis, Relational Content Analysis, Framing, Agenda Setting, Co-occurrence

    Introduction

    Theoretical progress in Communication Science depends largely on empirical evidence to test and sharpen theories. Such empirical evidence often consists wholly or partially of media content, making Content Analysis an important technique for progress in Communication Science.

    The last decades have seen a sharp increase in the number and complexity of communication theories, among which Framing takes a prominent place. This increased theoretical complexity has not been fully accompanied by increased sophistication of Content Analysis methods. In particular, content is generally analyzed using manual Content Analysis specifically aimed at extracting information for a particular research question. Such often expensive types of research rely on one-off coding schemes, leading to relatively small data sets that are very difficult to combine. This makes it very difficult to compare the explanatory power of competing theories, hindering theoretical progress.

    This article proposes Associative Framing, a probabilistic Relational Content Analysis approach. In this approach, we extract connections between concepts based on textual proximity. These concepts are relatively close to the text, making automated extraction feasible. Theoretically relevant variables, such as frames, are then defined as patterns or features over the resulting graph. By introducing this layer between the actual text analysis (nodes and edges) and the theoretical concepts (frames), it becomes easier to combine data sets and test different theories on the same coded texts.

    To measure associative frames, we rely on a method for extracting asymmetric association patterns based on the conditional reading chance. This method is similar to existing methods, but its probabilistic approach allows for easier extension and combination with existing work.

    In taking this approach, this article makes two contributions. Firstly, we provide an interpretation of co-occurrence graphs in terms of recent theoretical work in Communication Science, decreasing the gap between theory and measurement. Secondly, we propose a probabilistic operationalization of co-occurrence, give a substantive interpretation to the edge weights of the co-occurrence graph, and suggest extensions based on more thorough linguistic analysis.

    The next section discusses the communication theories that this method aims at. This is followed by a conceptual definition of Associative Framing in the third section and a concrete operationalization using a probabilistic model of co-occurrence in section 4. The fifth section gives an extended example, showing some of the information that can be extracted using this method. The final two sections offer a brief discussion of other potential uses and extensions of this method, and the conclusions.

    Theoretical Framework

    Within the communication literature we can distinguish a number of theories dealing with the transfer of salience on different levels. The most general level looks at the transfer of salience of concepts in media messages to the concepts in the public mind. This is the core idea of Agenda Setting theory. A more specific level focuses on the transfer of salience of concept-attribute pairs. This is the core idea of both the second level of Agenda Setting and some research into Framing, such as Issue Framing. Framing research, however, also includes different views about the process of Framing. Both first and second level Agenda Setting originate from a linear model of influence, stating that there is a linear relationship between the messages sent and the reception of those messages by the public. Other researchers focusing on Framing theories look at the process differently, stating that it is more complex than just a transfer of salience. They argue that a certain media frame can strengthen particular frames in the audience’s mind. Moreover, they argue that a single concept, for example the picture of the Omarska detention camps in Bosnia, can trigger previously established associations in people’s minds, such as associations with the Holocaust. In the next sections we discuss these theories in more detail, finishing with a theoretical argument for associative frames as the common denominator.

    Agenda Setting

    The core of Agenda Setting research was stated by Bernard Cohen years before the actual term was coined: the mass media ‘may not be successful much of the time in telling people what to think, but it is stunningly successful in telling its readers what to think about’ (Cohen, 1963, p.13). In their seminal Chapel Hill study, McCombs and Shaw (1972) introduced the term Agenda Setting to describe this influence after finding a nearly perfect correlation between the public agenda and issue visibility in the media. If Agenda Setting occurs, people come to believe an issue is more important after exposure to the issue through the mass media than before.

    In the years since this first study, Agenda Setting has turned out to be a robust and conceptually clear theory, with numerous studies reproducing these effects and elaborating on the theory (see for example Dearing and Rogers, 1996; McCombs and Bell, 1996; Rogers et al., 1993). Dearing and Rogers (1996, p.22) formulated a three-component model of the Agenda Setting process, consisting of (a) the media agenda, which influences (b) the public agenda, which in turn may influence (c) the policy agenda. Expanding the original model with influences on the media agenda, researchers divided the theory into Agenda Building and Agenda Setting processes, with the media agenda being the dependent variable in the building phase and the audience the dependent variable in the setting phase (see Scheufele, 1999). The common theoretical base underlying the large variety of Agenda Building and Agenda Setting studies is the transfer of salience, with salience interpreted as the “degree to which an issue on the agenda is perceived as relatively important” (Dearing and Rogers, 1996, p.22).

    Second Level Agenda Setting

    Elaborating on the original Agenda Setting hypothesis, McCombs and Shaw (1993, p.62) argue that Agenda Setting “is a theory about the transfer of salience, both the salience of objects and the salience of their attributes.” McCombs and Ghanem (2001) speak of a second level of Agenda Setting and distinguish between objects and their attributes: “Beyond the agenda of objects there is another aspect to consider. Each of these objects has numerous attributes, those characteristics and properties that fill out the picture of each object” (p.68). The attributes connected to the objects form the central part of this Second Level Agenda Setting, or Attribute Agenda Setting. According to the researchers, ‘these attributes suggests that the media also tells us how to think about some objects’ (p.69).

    Framing

    During the last decades, the study of Framing has gained an important place in the field of communication research. Like the second level of Agenda Setting, Framing theory deals with the influence on how to think about objects. This becomes clear in the seminal definition of Framing by Entman (1993), who defines the concept as selecting “some aspects of a perceived reality and make them more salient in a communicating text, in such a way as to promote a particular problem definition, causal interpretation, moral evaluation, and/or treatment recommendation for the item described.” (p.52)

    Entman’s definition shows the multi-faceted nature and complexity of Framing research. Besides a transfer of salience, it involves selection and recommendation, and it concerns not only the communicator but also the audience. In line with Agenda Building and Agenda Setting, researchers distinguish a frame building process with the media as the dependent variable and a frame setting process where the audience is the dependent variable (Scheufele, 1999; De Vreese, 2002), as presented in Figure 1.

    [Figure 1. Diagram of the communication pipeline: external influences (elite, real world, …) act on the sender (media frames, media agenda) through frame building and agenda building; the message reaches the receiver (audience frames, audience agenda) through frame setting and agenda setting.]

    Fig. 1: Setting and Building in the Communication Pipeline

    The many theoretical complexities attached to Framing cause Entman (1993) to complain about a lack of structure and paradigmatic unity. Starting from the arguments of Rosengren (1993) and Beniger (1993) that three paradigms infuse communication research (the constructionist, critical and cognitive approaches), D’Angelo (2002) takes a more optimistic view and sees Framing as a multiparadigmatic research program. Due to the inherent link between salience and cognition, we focus here on the cognitive perspective.

    The Cognitive Approach

    The construction of mental models or frames is central to the cognitive approach to Framing. Within this approach, Framing can ‘encourage particular trains of thoughts about political phenomena’ (Price et al., 1997, p.483). Grounded in cognitive psychology, the approach uses the associative network model of human memory (Collins and Quillian, 1969), which proposes that the concepts in semantic memory are represented as nodes in a complex hierarchical network. Each concept in the network is directly linked to other related concepts. Collins and Loftus (1975) refined this model by introducing important changes regarding the processing of information in the network, in particular the automatic spreading of activation. According to them, the processing of a concept is manifested in the network as the activation of the node that represents it. When a concept node is activated, activation continues automatically to all connected nodes. Minsky (1975) connected this view to Framing when he defined a frame as a structure containing various pieces of information. These discursive or mental structures are closely related to the description of a schema as ‘a cognitive structure that represents knowledge about a concept or type of stimulus, including its attributes and the relation among those attributes’ (Fiske and Taylor, 1991, p.98).

    Framing: Definition and Process

    Within frame setting research there are two critical questions: ‘what are frames?’ and ‘how are frames transferred between media and audience?’ In the following sections we discuss some distinctions made by researchers trying to answer these questions.

    Equivalency versus Emphasis Frames

    Within the answers to the question of what frames are, an important distinction is made between equivalency frames and emphasis frames. ‘Equivalency frames’ present an issue in different ways with “the use of different, but logically equivalent, words or phrases” (Druckman, 2001, p.228). In experiments, researchers found systematic changes in audience preferences when the same problem is presented in different wordings, such as rescuing some versus sacrificing others (Tversky and Kahneman 1981; Quattrone and Tversky 1988; for a general discussion see Levin et al. 1998).

    Emphasis frames, later renamed issue frames (Druckman, 2004), on the other hand highlight a particular “subset of potentially relevant considerations” (Druckman, 2001, p.228). In line with Entman’s definition, issue Framing can be defined as a process of selecting and emphasizing certain aspects of an issue, on the basis of which the audience can evaluate the issue described or the protagonists associated with it. In our study we focus on issue Framing rather than equivalency Framing, since we are interested in the relationship between different concepts and their attributes rather than in the different descriptions of a certain concept. Issue frames reflect networks of concepts and attributes.

    Linear versus Interactive Frame Setting

    The second question has also led to a number of different hypotheses. The behavioralists consider the transfer of salience a linear process, running straight from the sender to the audience. According to critics, these accessibility models “portray the individual as rather mindless, as automatically incorporating into the final attitude whatever ideas happen to pop into mind”, that is, whatever is suggested in a mediated frame. By contrast, other researchers suggest a more complex situation in which meanings are produced and exchanged between the sender, the receiver and the larger community in which they operate.

    For example, Nelson et al. (1997) show that Framing effects occur through a complex psychological process in which receivers of messages consciously think about the importance of different considerations suggested by a frame. In other words, the Framing process can be regarded as an interaction between textual features and the interpreter’s social knowledge. This interaction leads to the construction of a mental model as the resulting state of interpretation (Rhee, 1997). According to Graber (1988), people use schematic thinking to handle information. They extract only those limited amounts of information from news stories that they consider important for incorporation into their schemata. She adds that the media make major contributions to this schema formation. Besides creating these mental models, the Framing process can trigger an already existing mental model, or frame, within the receiver’s perception. In their studies, Snow and Benford focus on social movement Framing and its effects on individual and collective action. The researchers define frames as “interpretive schemata” that organize the meanings of objects, situations, events, experiences, and sequences of action for social actors (Snow and Benford, 1992, p.137). They state that media frames and audience frames interact through ‘frame alignment’ (Snow et al., 1986) and ‘frame resonance’ (Snow and Benford, 1988).

    Association: the Common Denominator of Agenda Setting and Framing

    In the sections above we described both Agenda Setting and Framing research conducted over the last decades. Despite the big differences between the angles taken by the researchers, we perceive a common denominator in these studies in the use of associations, whether between concepts, between concepts and attributes, or as more complex networks of concepts. In this study, therefore, we focus on what we call ‘Associative Framing’. Associative frames, which extend the object-attribute links of second level Agenda Setting, consist of associations between objects and other objects. Taking the cognitive perspective described above, these frames refer to the schemata of interpretation of Goffman (1974). The corresponding audience frames are seen as associative networks as described by Collins and Quillian (1969). Analogous to Agenda Setting as a transfer of salience of concepts, this allows us to see associative frame setting as a transfer of salience of the links between concepts. Where these concepts are the attributes of other concepts, this is identical to Second Level Agenda Setting, but we believe that it is fruitful to look at the associative network as one large interconnected network rather than as the attributes of individual concepts.

    Thus, this model is related to the audience frames or schemata in which individuals form strong associations between different mental concepts. The relationships between these concepts can be triggered through outside cues, such as news messages, consistent with the second level of Agenda Setting of McCombs and Estrada (1997) and the strong media effect postulated by Graber (1988). Although the general transfer of salience from the media to the audience has been found by numerous Agenda Setting studies, there are various hypotheses about the transfer of attribute and relation salience from media frames to audience frames. Thus, rather than hypothesizing a direct transfer of relation salience, we propose developing a framework for testing these different hypotheses in a systematic way, in which an automatic measurement of associative media frames such as the one proposed here is an important first step.

    Associative Framing: Measuring the News

    For testing hypotheses regarding media effects or media logics, it is necessary to measure the news content. For Agenda Setting research, which hypothesizes the transfer of salience of concepts, a thematic content analysis suffices (Holsti, 1969; Krippendorff, 2004). Second Level Agenda Setting and Framing theories, on the other hand, postulate more complicated patterns. It is possible to measure these frames directly using thematic content analysis, and well described and tested methodologies exist, such as the one described in Semetko and Valkenburg (2000). The measurement variables in such an approach, however, are far removed from the text, making it difficult to automate such analyses. Moreover, material annotated using this method has low reusability, as a slight change in the definition of frames necessitates a reannotation. Also, the opaque nature of human annotation makes it difficult to judge the effects of coder culture and bias on the obtained results.

    For these reasons, we think it would be beneficial to use Relational Content Analysis methods for Framing research (Roberts, 1997; Carley, 1997). These methods represent message contents as a graph of relations between concepts that are relatively close to the text. The target variables, such as frames, are then defined as patterns or metrics on the graph representation. This allows for the post-hoc redefinition of frames as long as they can be expressed using the same concepts and relations (nodes and edges). Additionally, it allows for easier testing of competing hypotheses, as they can be based on the same annotated material. Finally, graph-based data structures are intensively studied in Graph Theory, Knowledge Representation, and Social Network Analysis, which has led to the development of toolkits and techniques that can be used to gain more insight into relational content data (see for example Van Atteveldt, Kleinnijenhuis, and Carley, 2006).

    Within Relational Content Analysis, some methods, such as Automap (Carley, 1993; Diesner and Carley, 2004) and TABARI (Schrodt and Gerner, 1994; Schrodt, 2001), rely on simple keyword and co-occurrence based information to derive this graph. Other methods, such as NET (Van Cuilenburg et al., 1988; Kleinnijenhuis et al., 1997), assume a more complicated data structure with different edge types (action, causation, affinity) and signs (association versus dissociation). This makes these methods more challenging to automate, although work in that direction is under way, drawing on ‘subjective’ language resources (Van Atteveldt et al., 2004) and grammatical structure (Kleinnijenhuis, 2006).

    We propose a new method called Associative Framing. This method uses simple occurrence and co-occurrence to derive edges between predefined nodes. In particular, it uses a probabilistic model, with the marginal reading chance of concepts as a measure of visibility and the conditional reading chance as a measure of association. The combination of concept visibility and associative links constitutes an Associative Frame.

    Using unspecified association as links makes the data structure proposed here relatively simple. We realize that some of the frames proposed in the literature employ more complex relations between concepts than unspecified association. However, we think that simple association is sufficient to study a large proportion of these theories, and can still be a valuable tool for other studies as an exploratory step. This simple data structure allows for the automatic analysis of large amounts of text. Furthermore, as will be discussed in the penultimate section of this article, it is easy to extend this method to more complicated structures if linguistic markers can be found, extending the reach of this method.

    Associative Frames can be found both as Media Frames, in terms of linguistic co-occurrence patterns, and as Audience Frames, based on concept and attribute salience and associative memory structures such as described by Collins and Quillian (1969) and Collins and Loftus (1975). This method specifically does not assume a particular Frame Setting process. We can imagine direct ‘hypodermic’ transfer of salience, but also more complicated filtering, interaction or frame activation mechanisms. Which combination of these processes best describes media effects is a very important question, and a formal and structural frame representation method such as our proposal can help in obtaining empirical evidence to answer it.

    A Probabilistic Model of Associative Framing

    In Associative Framing we assume that media messages can be reduced to contexts containing atomic events which have a certain probability of occurring, and within which co-occurrence is meaningful. More concretely, we assume that there is a contextual unit, such as a sentence, paragraph, document, or newspaper, for which we can measure the occurrence of our target concepts and within which we want to know whether they co-occur.

    The occurrence of concepts is measured using synonyms or keywords as indicators of the target concept, along with disambiguating conditions. An example of this is requiring the phrase “President Bush” to occur in a document as a condition for accepting the word ‘Bush’ as an indicator of the concept. This would prevent articles about George H.W. Bush or Governor Jeb Bush from being mistakenly counted, while still allowing the use of just ‘Bush’ as an indicator in other sentences within the article. For example, the two sentences below, taken from a constructed article, yield the following keyword counts, assuming sensible keywords for the target concepts Bush, Immigration, and American Values (the latter including speaking English).

    Sentence                                             Bush  Immigration  Values
    Bush Continues Campaign for Immigration Reform         1        1          0
    New arrivals to this country must adopt American       1        1          2
      values and learn English, President Bush said
      Wednesday
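
    The counting step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the keyword lists, the `CONDITIONS` table and the function name `count_concepts` are assumptions chosen so that the two constructed sentences give the counts in the table above.

```python
import re

# Hypothetical keyword lists (lower-case); the text only assumes "sensible keywords".
KEYWORDS = {
    "Bush": ["bush"],
    "Immigration": ["immigration", "new arrivals"],
    "Values": ["american values", "learn english"],
}
# Hypothetical disambiguating conditions: the phrase must occur somewhere in
# the article before any keyword of that concept is counted.
CONDITIONS = {"Bush": "president bush"}


def count_concepts(sentences):
    """Return one {concept: keyword count} dict per sentence of an article."""
    article = " ".join(sentences).lower()
    rows = []
    for sentence in sentences:
        text = sentence.lower()
        row = {}
        for concept, keywords in KEYWORDS.items():
            condition = CONDITIONS.get(concept)
            if condition is not None and condition not in article:
                row[concept] = 0  # condition fails: ignore this concept's keywords
                continue
            # Count whole-word matches of every keyword of the concept.
            row[concept] = sum(
                len(re.findall(r"\b" + re.escape(kw) + r"\b", text))
                for kw in keywords
            )
        rows.append(row)
    return rows


ARTICLE = [
    "Bush Continues Campaign for Immigration Reform",
    "New arrivals to this country must adopt American values and "
    "learn English, President Bush said Wednesday",
]
COUNTS = count_concepts(ARTICLE)
```

    Because the condition is checked at the article level, the bare word ‘Bush’ in the headline is counted once “President Bush” appears anywhere in the article, while an article that only mentions Governor Jeb Bush contributes nothing to the concept.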

    To transform these keyword counts into probabilities, we need a function from [0, ∞) to [0, 1]. We want this function to increase monotonically but sublinearly from zero towards one, and the probability of a concept with two synonyms should equal the probability of encountering either of the two synonyms if they were separate keywords. A family of functions satisfying these constraints is given in equation 1, where c and m stand for the concept and message being investigated, and the parameter 1/b is the probability assigned to a concept encountered only once. This parameter is difficult to derive on theoretical grounds. It could be estimated empirically, but we believe that results should be fairly stable across different settings. We would advise a value around .5 for short contextual units (sentence or paragraph), and lower values (such as .25 or even .1) for longer units such as documents.

    $$p(c \mid m) = 1 - \left(1 - \frac{1}{b}\right)^{\mathrm{count}(c,m)} \qquad (1)$$

    Using this to assign probabilities to the sentences above, taking 1/b = 50%, yields:

                          Bush             Immigration      Values
    Sentence              Count  p(c|m)    Count  p(c|m)    Count  p(c|m)
    Bush Continues ...      1     .5         1     .5         0     0
    New arrivals ...        1     .5         1     .5         2     .75
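
    As a quick check of equation 1, the probabilities in this table can be recomputed with a few lines of Python. The function name `reading_chance` and the parameter name `one_over_b` are illustrative assumptions; only the formula itself comes from the text.

```python
def reading_chance(count, one_over_b=0.5):
    """Equation 1: p(c|m) = 1 - (1 - 1/b) ** count(c, m)."""
    if count < 0:
        raise ValueError("keyword counts cannot be negative")
    if not 0.0 < one_over_b < 1.0:
        raise ValueError("1/b is a probability strictly between 0 and 1")
    return 1.0 - (1.0 - one_over_b) ** count


# Keyword counts of the two constructed sentences (Bush, Immigration, Values).
COUNTS = [(1, 1, 0), (1, 1, 2)]
PROBS = [tuple(reading_chance(n) for n in row) for row in COUNTS]
```

    The synonym requirement also holds for this function: two separate keywords found once each give 1 - (1 - .5)(1 - .5) = .75, which is the same value as a single concept with a count of two.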

    Associative Frames are defined for the unit of analysis, which is a set of documents typically corresponding to a given time period and/or medium. In essence, the formula above translates this set into a term by document matrix containing probabilities as cell values. On this matrix we define the two measures comprising Associative Frames, visibility and associations, as the marginal and conditional probabilities.

    Visibility is the marginal probability of a concept, in other words the chance that, if a single message is received from the set of messages, that message will contain the concept. This probability is based on the chance of a concept occurring in a message and the chance of receiving that message. The latter probability could be based on message properties, such as the position of an article in a newspaper and the reach of that newspaper, but also on characteristics of the potential receiver of the message in individual-level analyses. In a formula, where p(m) is the chance of receiving a message (the normalized weight of a message):

    $$\mathrm{Visibility}(c) = p(c) = \sum_{m} p(m)\, p(c \mid m) \qquad (2)$$
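
    Equation 2 can be sketched as a weighted average. The function name `visibility`, the default of equal message weights and the example weights `[3.0, 1.0]` are assumptions made for illustration; the text leaves the choice of p(m) open.

```python
def visibility(p_concept, weights=None):
    """Equation 2: Visibility(c) = sum over m of p(m) * p(c|m)."""
    if weights is None:
        weights = [1.0] * len(p_concept)  # assumption: equal message weights
    if len(weights) != len(p_concept):
        raise ValueError("need exactly one weight per message")
    total = float(sum(weights))
    # Normalising the raw weights turns them into reading chances p(m).
    return sum((w / total) * p for w, p in zip(weights, p_concept))


# p(Values|m) for the two constructed sentences, taking 1/b = .5.
P_VALUES = [0.0, 0.75]
EQUAL = visibility(P_VALUES)
# Hypothetical weighting in which the first sentence is three times as
# likely to be read as the second (for instance because it is a headline).
HEADLINE = visibility(P_VALUES, weights=[3.0, 1.0])
```

    With equal weights the two-sentence example gives a visibility of .375 for Values. Giving the headline more weight lowers that figure, because the headline does not mention the concept.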

    The association between two concepts, called the base and target concepts, is defined as the conditional probability of a message containing the target concept given that the message contains the base concept. This corresponds to the following formula, where $c_t$ is the target concept and $c_b$ is the base concept.

    $$\mathrm{ass}(c_b \rightarrow c_t) = \frac{\sum_{m} p(m)\, p(c_b \mid m)\, p(c_t \mid m)}{\mathrm{Visibility}(c_b)} \qquad (3)$$

    In our two-sentence example, this leads to the associations below:

                                    Association with
    Base concept    Visibility    Bush    Immigration    Values
    Bush               .5           -         .5           .19
    Immigration        .5          .5          -           .19
    Values             .38         .38        .38           -
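
    The following sketch applies equation 3 exactly as written to the two-sentence example, assuming both sentences are equally likely to be read. The function name `association` and the dictionary `P` are illustrative. Under these assumptions the sketch gives .5 for Bush → Immigration, but .375 for Bush → Values and .5 for Values → Bush, which differ from the .19 and .38 printed in the table; the printed figures may rest on a weighting or normalization that the text does not state.

```python
def association(p_base, p_target, weights=None):
    """Equation 3: sum_m p(m) p(cb|m) p(ct|m) divided by Visibility(cb)."""
    if weights is None:
        weights = [1.0] * len(p_base)  # assumption: equal message weights
    total = float(sum(weights))
    p_m = [w / total for w in weights]
    joint = sum(pm * pb * pt for pm, pb, pt in zip(p_m, p_base, p_target))
    base_visibility = sum(pm * pb for pm, pb in zip(p_m, p_base))
    if base_visibility == 0:
        raise ValueError("base concept never occurs; association undefined")
    return joint / base_visibility


# p(c|m) per sentence for the constructed example, taking 1/b = .5.
P = {
    "Bush": [0.5, 0.5],
    "Immigration": [0.5, 0.5],
    "Values": [0.0, 0.75],
}
# Directed association for every ordered (base, target) pair.
ASSOC = {(b, t): association(P[b], P[t]) for b in P for t in P if b != t}
```

    The measure is directed: the value for (base, target) is computed relative to the visibility of the base, so swapping the two concepts generally changes the result.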

    Motivation and Relation to other Methods

    In the preceding section, we proposed using marginal and conditional probability to describe association patterns. This section gives a number of reasons why we think this is a good representation, and discusses how it relates to existing association measures.

    The simplest alternative to probabilities is using the raw keyword co-occurrence counts, as in Automap (Diesner and Carley, 2004). These numbers, however, are very difficult to compare between data sets and even between concepts, and also suffer from strong autocorrelation between different edges from the same node (cf. Krackhardt, 1987). Moreover, outliers such as very long documents can strongly influence these counts, necessitating a normalization using local and global weighting in studies such as Deerwester et al. (1990). The resulting normalized numbers are hard to interpret. Probabilities do not suffer from these problems, having a very clear substantive interpretation as (conditional) reading chance.

    Marginal and conditional probabilities are a direct extension of the simple dichotomous occurrence of concepts. This is similar to the distinction between crisp and partial set membership used in fuzzy logic. Because the method is a direct extension, it is valid to use deterministic concept occurrence, i.e. a concept is either present or not, while still remaining within the Associative framework. In that case, visibility reduces to the weighted proportion of documents mentioning a concept, and associations are the proportion of documents containing the base concept that also contain the target. In fact, Tversky (1977) describes a similarity measure quite similar to a dichotomized version of Associative Frames.
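
    The dichotomous special case can be illustrated with exact fractions. The five-document corpus below is hypothetical, as are the helper names; equal document weights are assumed, so visibility becomes a plain proportion.

```python
from fractions import Fraction

# Hypothetical corpus: each document is the set of concepts it mentions.
DOCS = [
    {"Bush", "Immigration"},
    {"Bush"},
    {"Bush"},
    {"Bush", "Immigration"},
    {"Values"},
]


def indicator(concept):
    """Deterministic occurrence: p(c|m) is 1 if the document mentions c, else 0."""
    return [1 if concept in doc else 0 for doc in DOCS]


def visibility(p_concept):
    """Equation 2 with equal weights p(m) = 1/n."""
    return Fraction(sum(p_concept), len(p_concept))


def association(p_base, p_target):
    """Equation 3 with equal weights p(m) = 1/n."""
    joint = Fraction(sum(b * t for b, t in zip(p_base, p_target)), len(p_base))
    return joint / visibility(p_base)


bush, immigration = indicator("Bush"), indicator("Immigration")
VIS_BUSH = visibility(bush)                    # share of documents with Bush
BUSH_TO_IMM = association(bush, immigration)   # share of Bush documents with Immigration
IMM_TO_BUSH = association(immigration, bush)   # share of Immigration documents with Bush
```

    In this toy corpus four of the five documents mention Bush and two of those four also mention Immigration, so the association from Bush to Immigration is one half, while every Immigration document mentions Bush.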

    Another advantage of using probabilities is that Statistical Natural Language Processing methods generally return a probability distribution over possible outcomes, or at least a confidence estimation of the best outcome. Using a probabilistic graph representation allows the seamless integration of such qualified information.

    10

  • Additionally, probability calculus is a well established field of mathematics,and many other methods are built on its foundations. Although this is beyondthe scope of this article, it is thinkable that one could uses Bayesian Networks forrepresenting frames (Jensen, 2001). Also, probability calculus gives us a naturalway to extend the models presented here by making the (conditional) probabilitymodels more complicated, some examples of which will be given below. Anotherpossibility is using a generative model for the media producer, viewing mediacontent as something produced based on an internal state of the media producerusing. This could be a natural way to estimate confidence of media data andmight be a useful way to model theories on media production. Although allof this would require substantial theoretical and methodological work beforeyielding results, building the graph representation on a probabilistic foundationmakes it easier to use such established methods and might be a first step infruitful interdisciplinary research.

    Another important choice was to use an asymmetric association measure. The main argument for this is substantive: currently John Bolton only appears in the news in articles on the United Nations, making the association from Bolton to UN very strong, while the reverse association is fairly weak. Tversky (1977) also notes that the semantic distance from one concept to another often differs from its inverse; for example, he finds that Hospital is more similar to Building than the other way around. This choice rules out many existing metrics, such as correlation and the cosine distance often used in Information Retrieval systems.

    As a final note, we would like to state that our association metric is fairly similar to metrics like cosine distance, correlation, and regression. All these metrics are based on the dot product of the variable vectors with some normalization. Cosine distance and correlation both normalize on the length (standard deviation) of both vectors, while correlation is also based on mean-centered variables. Regression coefficients normalize only on the standard deviation of the target variable, making them equivalent to our metric except for the centering. A statistic often used in linguistic co-occurrence analysis, the χ2 test, is a measure of the significance rather than the strength of the association (Manning and Schütze, 1999), making it less relevant for describing media frames. It could be used to test whether the frames found deviate from some prior expected distribution, although a QAP test or an explicit modeling of message production as a Bernoulli process might be more useful for this purpose.
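    The contrast between the asymmetric association and the symmetric metrics can be illustrated with a small sketch. It assumes, for illustration only, that the association is the dot product of the two occurrence vectors normalized by the sum of the base vector, which matches the dichotomous description given earlier; the probabilistic formula used in the paper may differ in detail. The Bolton and UN vectors are invented toy data.

```python
import math


def association(base, target):
    """Dot product normalized by the base vector only (assumed form; asymmetric)."""
    norm = sum(base)
    if norm == 0:
        return 0.0
    return sum(b * t for b, t in zip(base, target)) / norm


def cosine(x, y):
    """Dot product normalized by the lengths of both vectors (symmetric)."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


def correlation(x, y):
    """Pearson correlation: the cosine of the mean-centered vectors (symmetric)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical corpus of ten documents: Bolton appears only in UN articles.
bolton = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
un = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]

bolton_to_un = association(bolton, un)  # 2 / 2
un_to_bolton = association(un, bolton)  # 2 / 8
```

    In this toy corpus the association from Bolton to UN is maximal while the reverse is weak, whereas cosine and correlation each return a single value regardless of direction.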

    Case Study: Islamic Terrorism and Terrorist Islamists

    As a case study and proof of concept of the method described above, we performed an explorative Associative Framing analysis of Islam and Terror in Dutch, British and U.S. newspapers from 2000 to 2005. This section does not attempt to give a full description of this analysis or to add substantively to the body of knowledge on newspaper coverage of this issue.3 Rather, it is meant to showcase the possibilities of the Associative Framing method.

    We analyzed one ‘popular’ and one ‘quality’ newspaper in each country. For the U.S., these were USA Today and The Washington Post; for the U.K., The Guardian and The Sun; and for the Netherlands, de Volkskrant and De Telegraaf. We selected all articles mentioning Terrorism or Islam, or words related to these concepts, from January 2001 until September 2005. In total, this yielded 114,751 articles containing over half a million mentions of either Terrorism or Islam. Table 1 gives a quantitative summary of the corpus.

    Table 1: Overview of the analyzed newspapers

    Newspaper            Country         Type       #articles  #hits    Sum of hits
    De Telegraaf         Netherlands     ‘Popular’      7,025   12,215       27,613
    De Volkskrant        Netherlands     ‘Quality’     16,528   29,340       81,560
    The Sun              United Kingdom  ‘Popular’     14,499   21,644       40,962
    The Guardian         United Kingdom  ‘Quality’     21,567   35,920       95,860
    USA Today            United States   ‘Popular’     12,003   19,882       60,612
    The Washington Post  United States   ‘Quality’     43,129   71,813      206,097
    Total                                             114,751  190,814      512,704

    This corpus was analyzed by counting occurrences of a number of concepts, including Islam, Terror, Government Actors, Legislative Actors, and a number of word lists for positive and negative associations and ‘patriotic’ phrases such as ‘our great nation’ and ‘God bless America’. These keyword counts were transformed using the formula listed above and 1b = 0.25. As we selected articles on the first two concepts, we can only report the visibility of these concepts and the associations of these concepts with each other and with the other concepts.

    Figure 2 displays the unnormalized visibility of Islam and Terror in the three countries during the examined time period. The vertical scale is the total number of articles about the concept, with the year and month on the horizontal scale. In all countries, there was a steep increase in visibility of both Terror and Islam after 9/11, and again after the attacks in Madrid on March 11, 2004 and the London bombings of July 7, 2005. In all three countries, but most noticeably in the U.S. newspapers, visibility of both concepts was at a consistently higher level after 9/11 than before. Islam peaks at each of the three terror events, but also around the spring of 2002, a particularly violent period in Israel, and in the Netherlands after the murder of the filmmaker Theo van Gogh by a fundamentalist Muslim in November 2004.

    [Fig. 2: Visibility of Terror and Islam. Panels: (a) U.S.; (b) U.K.; (c) Netherlands.]

    Figure 3 shows the association patterns of Islam and Terror with the political actors and with Positive, Negative, and Patriotic words. The middle column shows the pattern during the first three months after 9/11, and the left and right columns show the periods before and after that, respectively. Associations below 10% are not shown.

    3 For this purpose, please see a recent conference paper (Ruigrok and Van Atteveldt, 2006) or contact the authors for a preview of a recently submitted article on this topic.

    Both the UK and the Netherlands show an increase in the association of Islam with negativity in the period directly after 9/11, falling back to the pre-9/11 level after three months. The US started out with a high level of negative associations, although that also drops from 2002. Associations with positive words decrease in all countries. In all countries both Terror and Islam are more strongly associated with the executive than with the legislative branch, with only the US showing a pattern of associating (the fight against) Terror with legislative actors. The use of patriotic words in both Terror and Islam contexts increases strongly in the American press, falling back after the first three months but staying higher than before 9/11. This pattern is also seen to a lesser degree in the British and Dutch press.

    [Fig. 3: Associations of Terror and Islam. Each panel is a network over the nodes Government, Legislative, Terror, Islam, Negative, Positive, and Patriotism, with association strengths as edge labels. Panels: (a) US, before 9/11; (b) US, 9/11 – 12/11; (c) US, after 12/11; (d) UK, before 9/11; (e) UK, 9/11 – 12/11; (f) UK, after 12/11; (g) Neth, before 9/11; (h) Neth, 9/11 – 12/11; (i) Neth, after 12/11.]

    Finally, Figure 4 plots the change in association between Terror and Islam, in both directions, over the studied time period. In both the UK and the US the association of Terror with Islam is fairly steady around 0.15 throughout the investigated period. The reverse association clearly peaks at 9/11 and stays high in the US newspapers. In the British press it also peaks, but falls steadily after that, coming to almost pre-9/11 levels right before the London bombings, after which it shoots up again. The Netherlands shows a different picture. The association of Terror with Islam is higher than in the other countries, although this is difficult to compare directly because the keywords differ (the language being different). The change in value is easier to compare. Although the association of Islam with Terror also peaks after 9/11, it actually declines after the murder of Van Gogh, which is unexpected, since a local ‘terror’ event was expected to lead to an increase in the association of Islam with Terror. The reverse association did increase, indicating that Terror was mainly discussed in the context of (fundamentalist) Islam, even though Islam itself was associated with other concepts as the debate shifted to integration, culture, and civil rights.

    [Fig. 4: Changes in association between Terror and Islam. Panels: (a) U.S.; (b) U.K.; (c) Netherlands.]

    Use Cases and Extensions

    The results presented in the previous section give a high-level overview of the associations of a small number of concepts over a long time span. This section attempts to answer the question of what we can do with this kind of data by describing a number of use cases. Moreover, it briefly describes some possible extensions to the method within the probabilistic association framework.

    Explorative Research. The method presented here can be used as a relatively quick and cheap explorative step to take before a more detailed investigation. It clearly indicates the time spans in which the ‘action’ occurs and can also be used to select countries or newspapers to study. For example, suppose one would like to investigate how Islam is portrayed in the international press. The results above suggest that the beginning of 2002 might be an interesting period to study next to the ‘obvious’ periods around the attacks, and that a comparison of the Dutch and US press captures more variety than a comparison of the US and UK press. These are not answers to research questions per se, but they can help focus research on periods where answers to such questions can be found and avoid some ‘sampling bias’ in selecting periods and newspapers to investigate.

    Direct hypothesis testing. A second possibility is to use this data to answer a number of research questions quantitatively. As described in Ruigrok and Van Atteveldt (2006), chi-squared analyses can determine whether differences between periods or associations are significant. Also, time series analysis could be run to detect a ‘pack journalism’ or intra-media Agenda Setting effect, either at the visibility or at the association level.
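    As a rough illustration of such a test, the following sketch computes the Pearson chi-squared statistic for a 2x2 table comparing two periods. The counts are invented, the function name is hypothetical, and this is not necessarily the exact procedure used in the cited paper; it merely shows the standard statistic without continuity correction, compared against the usual 5% critical value for one degree of freedom.

```python
def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column of the table is empty")
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical counts of Islam articles that do / do not also mention Terror.
before_with, before_without = 30, 70  # period before the event
after_with, after_without = 60, 40    # period after the event

statistic = chi_squared_2x2(before_with, before_without, after_with, after_without)

CRITICAL_VALUE_DF1_ALPHA_05 = 3.841  # chi-squared critical value, 1 degree of freedom, alpha = 0.05
significant = statistic > CRITICAL_VALUE_DF1_ALPHA_05
```

    With these invented counts the share of Islam articles mentioning Terror rises from 30% to 60%, and the statistic clearly exceeds the critical value.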

    Part of a Model. A third, and most interesting, possibility is to use this data as a variable in a model that includes information on audience or political attitudes and beliefs, such as survey data or a text analysis of open survey questions, political speeches, or text from Internet fora or newsgroups. This can be used to test different linear or non-linear models of media effects at the associational level.

    Possible Extensions

    The above sections described a simple mechanism to extract and represent Associative Frames. Although these Frames are a powerful tool for both explorative research and hypothesis testing, we realize that for certain research questions it might be necessary to measure more complex frames. This section discusses two possible extensions to the basic model.

    Typed or signed edges. In the current proposal, edges are limited to a quantitative representation of the strength of association between nodes. Measuring the evaluative content of text is difficult, but a lot of work has recently been done on this in Computational Linguistics (Van Atteveldt et al., 2004; Esuli, 2006). If these techniques are sufficiently accurate, it is quite easy to extend our model to incorporate them.

    In the section above we calculated the direct association of Islam with positive and negative keywords. This effectively measures an attribute of Islam as described by Second Level Agenda Setting. We can also use this to enrich the relation between Terror and Islam. If we determine the probability of encountering a negative word given that we encountered both concepts, we essentially capture the ‘mood’ of the association. This can be used to create typed (multiplex) edges rather than just associations. Also, we can subtract the association with negative from the association with positive, and create a signed association from a set of antonyms.
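    A minimal sketch of this idea, using dichotomous occurrence for simplicity: each document is reduced to the set of concepts it mentions, the ‘mood’ is the share of documents mentioning both Terror and Islam that also contain a negative (or positive) word, and the signed edge is the difference between the two. The function name and the six toy documents are hypothetical.

```python
def conditional_share(docs, given, target):
    """Share of the documents containing every concept in `given` that also contain `target`."""
    selected = [doc for doc in docs if given <= doc]  # subset test on sets
    if not selected:
        return 0.0
    return sum(1 for doc in selected if target in doc) / len(selected)


# Hypothetical corpus: each document is the set of concepts found in it.
docs = [
    {"islam", "terror", "negative"},
    {"islam", "terror", "negative"},
    {"islam", "terror", "positive"},
    {"islam", "terror"},
    {"islam", "positive"},
    {"terror", "negative"},
]

pair = {"islam", "terror"}
mood_negative = conditional_share(docs, pair, "negative")  # 2 of the 4 joint documents
mood_positive = conditional_share(docs, pair, "positive")  # 1 of the 4 joint documents

# Signed edge: positive minus negative association, as described above.
signed_association = mood_positive - mood_negative
```

    Keeping the two conditional shares separate yields typed (multiplex) edges; collapsing them into their difference yields a single signed edge.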

    Integrating Grammatical Features. The extraction mechanism proposed in this article uses surface proximity to determine relations. Since two words need to be relatively close to express a relation between them, surface proximity is actually a reasonable indicator of relatedness. However, if the grammatical structure of sentences is available, it might be useful to base relatedness on syntactic structure rather than surface structure. This is especially true if one is interested in more specific relationship types, such as negative or causal relations, as it is quite possible for a negative word to be used in a sentence without applying to the concept under investigation.

    The probabilistic model presented here can also be adapted to those circumstances. For example, if one has the syntactic tree structure of all sentences under investigation, instead of measuring the chance of two items co-occurring within a paragraph or 10-word window, we can measure the co-occurrence within a clause. Alternatively, we can count the number of ‘steps’ or edges in the syntactic or dependency tree and use that instead of surface word distance.
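    Counting ‘steps’ in a dependency tree amounts to a shortest-path search. The sketch below assumes the parse is already available as a list of (head, dependent) pairs; the edges are hand-built for one invented sentence rather than produced by a real parser, and the function name is hypothetical.

```python
from collections import deque


def tree_distance(edges, source, target):
    """Number of edges on the path between two tokens, or None if they are not connected."""
    neighbours = {}
    for head, dependent in edges:
        neighbours.setdefault(head, set()).add(dependent)
        neighbours.setdefault(dependent, set()).add(head)
    queue = deque([(source, 0)])
    seen = {source}
    while queue:  # breadth-first search over the undirected tree
        node, distance = queue.popleft()
        if node == target:
            return distance
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, distance + 1))
    return None


# Hand-built, hypothetical dependency edges (head, dependent) for the sentence
# "The government condemned the attack near the mosque".
edges = [
    ("condemned", "government"),
    ("government", "the_1"),
    ("condemned", "attack"),
    ("attack", "the_2"),
    ("attack", "mosque"),
    ("mosque", "near"),
    ("mosque", "the_3"),
]

steps_government_mosque = tree_distance(edges, "government", "mosque")
steps_attack_mosque = tree_distance(edges, "attack", "mosque")
```

    Here ‘government’ and ‘mosque’ are six word positions apart on the surface but only three edges apart in the tree, and such a tree distance could replace the surface distance in the windowing step.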

    Conclusion

    This paper presented Associative Frames, a probabilistic Relational Content Analysis method based on keyword co-occurrence. Agenda Setting, Second Level Agenda Setting, and Emphasis or Issue Framing can all be seen as theories about the transfer of association patterns or networks from the media to the receiver. By measuring the individual concepts rather than the whole frame, we are able to use computer techniques to automate this measurement. Moreover, the data obtained using this method is less dependent on the specific definition of the measured Frames, increasing the reusability of data sets and making it easier to test different theories on the same data. In this way, we have given a clear interpretation of linguistic co-occurrence in terms of current Communication Theory, which helps to bridge the gap between theoretical sophistication and measurement techniques.

    We have also presented a methodology for calculating association scores as a conditional reading probability. This measure is asymmetric, conforming to substantive intuitions about the nature of association. Moreover, probability calculus is a well-understood field of mathematics, making the model easy to extend and to compare with other methods.

    Finally, we gave an example content analysis of the associations between Islam and Terror in the Dutch, British and American press between 2000 and 2005, showcasing the power of this method to analyze large amounts of text. We also proposed a number of other research use cases possible with this method, and ways to extend this method to incorporate more sophisticated linguistic knowledge.


    This method has some limitations, however. First, we accept that not all frames can be expressed as simple association patterns. For some frames, it will be necessary to extend the current model, for example by distinguishing between types of relations or by measuring whether the relation is positive or negative. Although these measurements are difficult linguistically, a lot of work is being done on such problems in the Computational Linguistics community, and the model can easily be extended as soon as acceptable accuracy is reached on the linguistic extraction. Also, certain Framing theories, such as Equivalency Framing, are even more difficult to fit into this model, since they are not directly based on association networks.

    Another limitation is the difficulty of interpreting co-occurrence measures on keywords. An overly broad or narrow set of keywords for a concept can easily skew the results or measure something completely different from the intended concept. For this reason, it is very important to go ‘back to the text’ and qualitatively assess whether the found contexts actually express the relation that one is interested in. This makes the initial phase of creating keyword lists more labor-intensive than an ‘automatic approach’ might suggest, but once the lists are in place and well tested, there is little extra cost in analysing more documents, making this approach particularly well suited for research topics that are ongoing or cover a large amount of text. The creation and evaluation of keyword lists can be done more systematically using Keyword-in-context programs or manual coding of a subset of documents, but it is ultimately the interaction between the quantitative measurement and the qualitative control that ensures a correct interpretation of the results.

    These limitations notwithstanding, this paper provides a clear communication theoretical interpretation and probabilistic operationalization of co-occurrence. This yields a powerful and flexible method for the automatic analysis of text, which is a contribution to the measurement techniques currently available to the Communication Scientist. This will aid theory development by allowing multiple theories to be tested simultaneously on large corpora.


    References

    Beniger, J. (1993). Communication: embrace the subject, not the field. Journal of Communication 43(3), 18–25.

    Carley, K. (1993). Coding choices for textual analysis: A comparison of content analysis and map analysis. Sociological Methodology 23, 75–126.

    Carley, K. (1997). Network text analysis: The network position of concepts. In C. Roberts (Ed.), Text Analysis for the Social Sciences, pp. 79–100. Mahwah, NJ: Lawrence Erlbaum Associates.

    Cohen, B. (1963). The press and foreign policy. Princeton, NJ: Princeton University Press.

    Collins, A. and Loftus (1975). A spreading activation theory of semantic memory. Psychological Review 82, 407–428.

    Collins, A. and M. Quillian (1969). Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior 8, 240–248.

    D’Angelo, P. (2002). News framing as a multi-paradigmatic research program: A response to Entman. Journal of Communication 52(4), 870–888.

    De Vreese, C. (2002). Framing Europe: Television News and European Integration. Amsterdam: Aksant.

    Dearing, J. and E. Rogers (1996). Agenda setting. Thousand Oaks, CA: Sage.

    Deerwester, S. C., S. T. Dumais, T. K. Landauer, G. W. Furnas, and R. A. Harshman (1990). Indexing by latent semantic analysis. Journal of the American Society of Information Science 41(6), 391–407.

    Diesner, J. and K. Carley (2004). Automap1.2 - extract, analyze, represent, and compare mental models from texts. Technical Report CMU-ISRI-04-100, Carnegie Mellon University, School of Computer Science, Institute for Software Research International.

    Druckman, J. N. (2001). The implications of framing effects for citizen competence. Political Behavior September 2001, 225–256.

    Druckman, J. N. (2004). Political preference formation: Competition, deliberation, and the (ir)relevance of framing effects. American Political Science Review 98(4), 671–686.

    Entman, R. M. (1993). Framing: Toward clarification of a fractured paradigm. Journal of Communication 43(4), 51–58.

    Esuli, A. (2006). Sentiment classification bibliography. Annotated bibliography maintained at http://liinwww.ira.uka.de/bibliography/Misc/Sentiment.html.

    Fiske, S. and S. Taylor (1991). Social Cognition, 2nd Ed. New York: McGraw-Hill.

    Goffman, E. (1974). Frame analysis: an essay on the organization of experience. Boston: Northeastern University Press.

    Graber, D. (1988). Processing the News: How People Tame the Information Tide. Lanham, MD: University Press of America.

    Holsti, O. (1969). Content Analysis for the Social Sciences and Humanities. Reading, MA: Addison-Wesley.

    Jensen, F. V. (2001). Bayesian Networks and Decision Graphs. Springer.

    Kleinnijenhuis, J. (2006). Applications of graph theory to cognitive communication research. In K. Krippendorff and M. Bock (Eds.), The Content Analysis Reader (forthcoming). Thousand Oaks: Sage.

    Kleinnijenhuis, J., J. De Ridder, and E. Rietberg (1997). Reasoning in economic discourse: an application of the network approach to the Dutch press. In C. Roberts (Ed.), Text Analysis for the Social Sciences; Methods for Drawing Statistical Inferences from Texts and Transcripts, pp. 191–207. Mahwah, New Jersey: Lawrence Erlbaum Associates.

    Krackhardt, D. (1987). QAP partialling as a test of spuriousness. Social Networks 9, 171–186.

    Krippendorff, K. (2004). Content Analysis: An Introduction to Its Methodology (second edition). Sage Publications.

    Levin, I., S. Schneider, and G. Gaeth (1998). All frames are not created equal: A typology and critical analysis of framing effects. Organizational Behavior & Human Decision Processes 76, 149–88.

    Manning, C. and H. Schütze (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.

    McCombs, M. and T. Bell (1996). The agenda-setting role of mass communication. In M. Salwen and D. Stacks (Eds.), An integrated approach to communication theory and research, pp. 93–110. Mahwah, NJ: Lawrence Erlbaum.

    McCombs, M. and G. Estrada (1997). The news media and the pictures in our heads. In S. Iyengar and R. Reeves (Eds.), Do the media govern?, pp. 237–247. London: Sage.

    McCombs, M. and S. Ghanem (2001). The convergence of agenda setting and framing. In S. Reese, O. Gandy, and A. Grant (Eds.), Framing public life, pp. 95–106. Mahwah, NJ: Lawrence Erlbaum.

    McCombs, M. and D. Shaw (1993). The evolution of agenda-setting research: Twenty-five years in the marketplace of ideas. Journal of Communication 43(2), 58–67.

    McCombs, M. E. and D. L. Shaw (1972). The agenda-setting function of mass media. Public Opinion Quarterly 36, 176–187.

    Minsky, M. (1975). A framework for representing knowledge. In P. H. Winston (Ed.), The Psychology of Computer Vision. New York: McGraw-Hill.

    Nelson, T. E., Z. Oxley, and R. A. Clawson (1997). Toward a psychology of framing effects. Political Behavior 19(3), 221–46.

    Price, V., D. Tewksbury, and E. Power (1997). Switching trains of thought. The impact of news frames on readers’ cognitive responses. Communication Research 24, 481–506.

    Quattrone, G. and A. Tversky (1988). Contrasting rational and psychological analyses of political choice. American Political Science Review 82, 719–736.

    Rhee, J. (1997). Strategy and issue frames in election campaign coverage: A social cognitive account of framing effects. Journal of Communication 47, 26–48.

    Roberts, C. W. (Ed.) (1997). Text Analysis for the Social Sciences: Methods for Drawing Statistical Inferences from Texts and Transcripts. Mahwah, NJ: Lawrence Erlbaum.

    Rogers, E., J. Dearing, and D. Bregman (1993). The anatomy of agenda-setting research. Journal of Communication 43(2A), 68–84.

    Rosengren, K. E. (1993). From field to frog ponds. Journal of Communication 43(3), 6–17.

    Ruigrok, N. and W. Van Atteveldt (2006). Global angling with a local angling: How US, British and Dutch newspapers frame global and local terrorist attacks. Presentation at the 47th Annual Convention of the International Studies Association (ISA), 22–25 March, San Diego.

    Scheufele, D. (1999). Framing as a theory of media effects. Journal of Communication 29, 103–123.

    Schrodt, P. (2001). Automated coding of international event data using sparse parsing techniques. In Annual meeting of the International Studies Association, Chicago.

    Schrodt, P. and D. Gerner (1994). Validity assessment of a machine-coded event data set for the Middle East, 1982-1992. American Journal of Political Science 38(3), 825–854.

    Semetko, H. A. and P. M. Valkenburg (2000). Framing European politics: A content analysis of press and television news. Journal of Communication 50(2), 93–109.

    Snow, D. A. and R. D. Benford (1988). Ideology, frame resonance, and participant mobilization. International Social Movement Research 1, 197–217.

    Snow, D. A. and R. D. Benford (1992). Master frames and cycles of protest. In A. D. Morris and C. M. Mueller (Eds.), Frontiers in Social Movement Theory, pp. 133–155. New Haven: Yale University Press.

    Snow, D. A., E. B. Rochford, S. K. Worden, and R. D. Benford (1986). Frame alignment processes, micromobilization, and movement participation. American Sociological Review 51, 464–481.

    Tversky, A. (1977). Features of similarity. Psychological Review 84(4), 327–352.

    Tversky, A. and D. Kahneman (1981). The framing of decisions and the psychology of choice. Science 211, 453–458.

    Van Atteveldt, W., J. Kleinnijenhuis, and K. Carley (2006, 19-23 June). Rcadf: Towards a relational content analysis standard. Presented at the International Communication Association (ICA), Dresden.

    Van Atteveldt, W., D. Oegema, E. van Zijl, I. Vermeulen, and J. Kleinnijenhuis (2004). Extraction of semantic information: New models and old thesauri. In Proceedings of the RC33 Conference on Social Science Methodology, Amsterdam.

    Van Cuilenburg, J., J. Kleinnijenhuis, and J. De Ridder (1988). Tekst en Betoog: naar een Computergestuurde Inhoudsanalyse van Betogende Teksten. Muiderberg (Netherlands): Coutinho.