
A Soft Modularity Function For Detecting Fuzzy Communities in Social Networks


1170 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 21, NO. 6, DECEMBER 2013


Timothy C. Havens, James C. Bezdek, Christopher Leckie, Kotagiri Ramamohanarao, and Marimuthu Palaniswami

Abstract: We discuss a new formulation of a fuzzy validity index that generalizes the Newman–Girvan (NG) modularity function. The NG function serves as a cluster validity functional in community detection studies. The input data is an undirected weighted graph that represents, e.g., a social network. Clusters correspond to socially similar substructures in the network. We compare our fuzzy modularity with two existing modularity functions using the well-studied Karate Club and American College Football datasets.

Index Terms: Clique discovery, community detection, fuzzy communities, fuzzy modularity, modularity.

I. INTRODUCTION

Social network analysis has become an important topic in pattern recognition and machine learning. Applications range from artificial intelligence in massive multiplayer online games to the discovery of leaders of terrorist organizations. Arguably, the “bible” for social network analysis is Wasserman and Faust, in its 18th printing as of 2009 [1], but it does not mention fuzzy models of social networks! This is probably due to the well-known disconnect between various communities of scholars working in related but uncommunicative fields. Selected readings in the literature from various groups indicate that this is probably quite accidental, most likely due to the lack of time to explore what may be essentially similar approaches advanced by disparate groups of researchers. Perhaps, this is where social network analysis could help (us) scholars the most.

However, many recent papers do exhibit fuzzy or possibilistic clusters in social networks. We will begin with a short review of the evolution of fuzzy models in social networks, which culminates with current work about overlapping (fuzzy) communities in social networks and the motivation for this study. Then, we will develop a new measure of fuzzy modularity for community detection, and compare it, both theoretically and empirically, with existing ones using Zachary’s Karate Club [2] and the American College Football [3] datasets. Our analysis and experiments show that our proposed generalized modularity index is superior, both in theory as well as in practice, to existing fuzzy modularities.

II. FUZZY COMMUNITIES IN SOCIAL NETWORKS

Social network analysis usually begins with a crisp (meaning not fuzzy, probabilistic, or possibilistic) graph-theoretic representation of the social network, say $G = (V, E, W)$, where $V$ is the vertex set, $E$ is the edge set, and $W$ is the set of edge weights. Different social situations are realized by graphs with various properties: directed or not, weighted or not, connected or not, complete or not, and so on. In this paper, $G$ is undirected and weighted. Clusters (cliques, subtrees, etc.) in $G$ (subsets of vertices in $V$) represent groups of individuals that are somehow related to each other more closely than to the individuals in the other clusters.

Manuscript received July 29, 2012; revised November 19, 2012; accepted January 14, 2013. Date of publication February 5, 2013; date of current version November 25, 2013.

T. C. Havens is with the Departments of Electrical and Computer Engineering and Computer Science, Michigan Technological University, Houghton, MI 49931 USA (e-mail: [email protected]).

J. C. Bezdek and M. Palaniswami are with the Department of Electrical and Electronic Engineering, University of Melbourne, Parkville, Vic. 3010, Australia (e-mail: [email protected]; [email protected]).

C. Leckie and K. Ramamohanarao are with the Department of Computer and Information Systems, University of Melbourne, Parkville, Vic. 3010, Australia (e-mail: [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TFUZZ.2013.2245135

Any weighted graph can be thought of as a (possibly unnormalized) fuzzy graph, or a fuzzy similarity relation on pairs of nodes, which was first discussed by Zadeh [4]. The earliest work on the use of fuzzy relations for social network analysis was done by Blin [5], who introduced the idea of using fuzzy relations in group decision theory. Bezdek et al. [6]–[8] collected data from small groups of students in communications classes, and developed models based on reciprocal fuzzy relations that quantified notions such as distance to consensus.

An idea that is gaining traction in social network analysis is the notion of overlapping communities in social networks [9]. In [10], communities are defined as groups of densely interconnected nodes that are only loosely connected to the rest of the network. There is no clustering algorithm in [10]. Instead, overlapping clusters are seen visually as off-diagonal content in co-appearance images of the connection data. The model in [11] finds fuzzy communities by MULTICUT spectral clustering. Clustering is done by both hard and fuzzy c-means (HCM/FCM, [12]), and validation is done with an index that the authors call fuzzy modularity. Liu [13] introduces yet another fuzzy modularity functional and finds clusters via simulated annealing.

III. PARTITIONS AND MODULARITY

Clustering in unlabeled data is the assignment of labels to the objects (for social networks, the vertices $V$). Let $c$ be an integer, $1 < c < n$, where $n$ is the number of objects or vertices in $V$. A $c$-partition of $V$ is a set of $cn$ values $\{u_{ki}\}$ arrayed as a $c \times n$ matrix $U = [u_{ki}]$. Element $u_{ki}$ is the membership of $v_i$ in cluster $k$. There are three sets of partition matrices [12]

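$$
M_{pcn} = \left\{ U \in \mathbb{R}^{c \times n};\; 0 \le u_{ki} \le 1 \;\forall k, i;\; \sum_{k=1}^{c} u_{ki} \le c \;\forall i;\; 0 < \sum_{i=1}^{n} u_{ki} < n \;\forall k \right\} \tag{1}
$$

$$
M_{fcn} = \left\{ U \in M_{pcn};\; \sum_{k=1}^{c} u_{ki} = 1 \;\forall i \right\} \tag{2}
$$

$$
M_{hcn} = \left\{ U \in M_{fcn};\; u_{ki} \in \{0, 1\} \;\forall k, i \right\}. \tag{3}
$$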

Equations (1)–(3) define, respectively, the sets of nondegenerate (no row is all zeroes) possibilistic, constrained fuzzy or probabilistic, and crisp $c$-partitions of $X$. Note that $M_{hcn} \subset M_{fcn} \subset M_{pcn}$.
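The three sets in (1)–(3) can be checked mechanically. The following is a minimal sketch, assuming NumPy and a small numerical tolerance; the function name `partition_type` and the toy matrices are illustrative and not part of the original paper.

```python
import numpy as np

def partition_type(U, tol=1e-9):
    """Return the smallest of the sets in (1)-(3) that contains U, or None.

    U is a c x n membership matrix (rows = clusters, columns = vertices).
    """
    U = np.asarray(U, dtype=float)
    c, n = U.shape
    # 0 <= u_ki <= 1 for all k, i (this also implies sum_k u_ki <= c).
    in_unit_interval = np.all((U >= -tol) & (U <= 1 + tol))
    # 0 < sum_i u_ki < n for every row k, as written in (1).
    row_sums = U.sum(axis=1)
    rows_ok = np.all((row_sums > tol) & (row_sums < n - tol))
    if not (in_unit_interval and rows_ok):
        return None
    # Constraint (2): every column sums to 1.
    if not np.allclose(U.sum(axis=0), 1.0, atol=tol):
        return "possibilistic"
    # Constraint (3): every entry is 0 or 1.
    if np.all(np.isclose(U, 0.0) | np.isclose(U, 1.0)):
        return "hard"
    return "fuzzy"

U_fuzzy = [[0.7, 0.2, 0.1],
           [0.3, 0.8, 0.9]]
U_hard = [[1, 0, 0],
          [0, 1, 1]]
U_poss = [[1.0, 0.6, 0.0],
          [0.4, 0.9, 1.0]]

print(partition_type(U_fuzzy), partition_type(U_hard), partition_type(U_poss))
```

Because the sets are nested, the function reports only the most specific one; a hard partition is also fuzzy and possibilistic.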

Each run of a clustering algorithm on any dataset produces one or more $U$s in some $M_{pcn}$. For example, fixing all control and model parameters except $c$, then applying any $c$-means algorithm to $X$ produces one $U(c)$ in $M_{pcn}$, for each $c = 2, 3, \ldots, n - 1$. Other runs with variations of the $c$-means parameters, or other clustering algorithms, produce other $U$s for consideration. We collect all the candidate partitions into a set named $CP$ and ask the following: Which $U \in CP$ is the most satisfactory explanation of substructure in $V$? This is the cluster validity (more simply, “validation”) problem [12].

1063-6706 © 2013 IEEE


Social network data are represented by a graph $G = (V, E, W)$, where $V$ is a set of $n$ vertices, $E$ is a set of $m$ edges, and $W$ is an $n \times n$ edge weight matrix with elements $w_{ij}$, $i, j = 1, \ldots, n$. Clustering in the graph $G$ means finding partitions $U \in M_{pcn}$ of $V$. Because the data are not object vectors or dissimilarity data, as is usually the case in cluster analysis, finding candidate partitions often requires special methods. In addition, the derivative problem of validating the found clusters (cluster validity) for this special type of data structure is also treated somewhat differently than the validation schemes often employed in pattern recognition.

Validity indices for partitions of $G$ are usually called quality functions in the community detection literature. According to Fortunato [9]:

A quality function is a function that assigns a number to each partition of a graph. This way, one can rank partitions based on their score given by the quality function. Partitions with high scores are “good,” so the one with the largest score is by definition the best. . . . The most popular quality function is the modularity of Newman and Girvan [14].

The basic rationale for modularity is that a random graph does not have cluster structure; therefore, the existence of clusters is revealed by comparing the actual density of edges in a subgraph with the expected density under some null hypothesis. The expected edge density depends on the chosen null model.

The most popular form of modularity assumes that $W$ is an $n \times n$ positive, symmetric edge weight matrix of $G$. Let $V$ be partitioned into $c$ crisp subsets of vertices, say $\{V_1, \ldots, V_c\}$, and let $U \in M_{hcn}$ be the crisp $c$-partition of $G$. The modularity of $U$ for $G = (V, E, W)$ is

$$
Q_h = \sum_{k=1}^{c} \left[ \frac{S(V_k, V_k)}{S(V, V)} - \left( \frac{S(V_k, V)}{S(V, V)} \right)^2 \right] \tag{4}
$$

where $S(V_a, V_b) = \sum_{i \in V_a, j \in V_b} w_{ij}$ and where the subscript $h$ attached to $Q$ indicates that $U$ is crisp (hard). A second, and equivalent, form of (4), which is given in [9], follows by letting $m_i = S(\{i\}, V)$ and $\|W\| = \sum_{i,j=1}^{n} w_{ij}$. Then, (4) becomes

$$
Q_h = \frac{1}{\|W\|} \sum_{k=1}^{c} \sum_{i,j \in V_k} \left( w_{ij} - \frac{m_i m_j}{\|W\|} \right). \tag{5}
$$

Although $U$ takes part in the calculation of (4) and (5), its role is somewhat obscured by these forms of the modularity index.

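Proposition 1: The modularity index in (4) and (5) can also be written as

$$
Q_h = \operatorname{tr}(U B U^T) / \|W\|, \quad U \in M_{hcn} \tag{6}
$$

where $B = W - (m m^T / \|W\|)$ and where $m = (m_1, \ldots, m_n)^T$ [15].

Proof: Using the partition entries $u_{ki}$ as selection variables, (5) can be written as

$$
Q_h = \frac{1}{\|W\|} \sum_{k=1}^{c} \sum_{i,j=1}^{n} \left( w_{ij} - \frac{m_i m_j}{\|W\|} \right) u_{ki} u_{kj} = \frac{1}{\|W\|} \sum_{k=1}^{c} U_k B U_k^T = \operatorname{tr}(U B U^T) / \|W\|
$$

where $U_k$ denotes the $k$th row of $U$. $\blacksquare$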

Equation (6) explicitly reveals the role played by the partition $U$ in the computation of $Q_h$. The very important point about (6) is that it is well defined for any partition of $V$ and not just crisp ones. Hence, we can define the generalized modularity of $U$ with respect to $G = (V, E, W)$ as

$$
Q_g = \operatorname{tr}(U B U^T) / \|W\|, \quad U \in M_{pcn}. \tag{7}
$$

$Q_g$ is a proper generalization of the Newman–Girvan (NG) modularity because (7) reduces to (4) or (5) when $U$ is a crisp $c$-partition of $V$. Consequently, we are entitled to call (7) the fuzzy modularity of $U$ when $U \in M_{fcn}$ is a fuzzy $c$-partition of the vertices $V$.
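The reduction of (7) to (4) for crisp partitions can be illustrated numerically. The sketch below assumes NumPy; the function names and the toy graph (two triangles joined by a single edge) are hypothetical choices made for illustration, not data from the paper.

```python
import numpy as np

def generalized_modularity(W, U):
    """Q_g = tr(U B U^T) / ||W|| for any c x n partition matrix U, as in (7)."""
    W = np.asarray(W, dtype=float)
    U = np.asarray(U, dtype=float)
    total = W.sum()                      # ||W|| = S(V, V)
    m = W.sum(axis=1)                    # m_i = S({i}, V)
    B = W - np.outer(m, m) / total       # entries w_ij - m_i m_j / ||W||
    return np.trace(U @ B @ U.T) / total

def crisp_modularity(W, labels):
    """Q_h from (4), computed from vertex sets given as integer labels."""
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    total = W.sum()
    q = 0.0
    for k in np.unique(labels):
        inside = labels == k
        s_kk = W[np.ix_(inside, inside)].sum()   # S(V_k, V_k)
        s_kv = W[inside].sum()                   # S(V_k, V)
        q += s_kk / total - (s_kv / total) ** 2
    return q

# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} joined by edge (2,3).
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0

labels = np.array([0, 0, 0, 1, 1, 1])
U_crisp = np.array([labels == 0, labels == 1], dtype=float)
U_fuzzy = np.array([[1, 1, 0.8, 0.2, 0, 0],
                    [0, 0, 0.2, 0.8, 1, 1]])

q_h = crisp_modularity(W, labels)
q_g_crisp = generalized_modularity(W, U_crisp)   # agrees with q_h
q_g_fuzzy = generalized_modularity(W, U_fuzzy)   # (7) applied to a fuzzy U
print(q_h, q_g_crisp, q_g_fuzzy)
```

On this graph the crisp value is 5/14, and sharing the two bridge vertices between the clusters lowers the index, which matches the intuition that the two triangles are well separated.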

A. Other Fuzzy Modularity Functionals

In addition to our proposed generalized soft modularity in (7), there are two other modularity functionals identified in the literature as fuzzy modularity: that of Zhang et al. [11] and that of Liu [13].

1) Zhang Fuzzy Modularity: The community detection algorithm in [11] begins by partitioning $V$ with spectral clustering applied to $G$ using FCM, once the eigenvector representation of $G$ is selected. After a fuzzy $c$-partition $U \in M_{fcn}$ is found this way, the partition is converted to a possibilistic $c$-partition $U_\lambda \in M_{pcn}$ of $V$ as follows: A threshold $\lambda$ is chosen (presumably $0 < \lambda < 1$), and then used to extract from the $k$th row of $U \in M_{fcn}$ the vertex set $V_k = \{i \mid u_{ki} > \lambda, 1 \le i \le n\}$. For each vertex $i$ in $V_k$, the corresponding entry of $(U_\lambda)_k$ (the $k$th row of $U_\lambda$) is set to 1. After a pass over all $c$ rows of $U$ is completed, the remaining (non-1) entries in $U_\lambda$ are set to 0.

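As an example, consider the fuzzy 3-partition of five vertices

$$
U = \begin{bmatrix} 0.02 & 0.12 & 0.40 & 0.44 & 0.91 \\ 0.08 & 0.18 & 0.22 & 0.56 & 0.05 \\ 0.90 & 0.70 & 0.38 & 0.00 & 0.04 \end{bmatrix}.
$$

The corresponding $U_{0.1}$ and $U_{0.2}$ are

$$
U_{0.1} = \begin{bmatrix} 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 \end{bmatrix}, \quad
U_{0.2} = \begin{bmatrix} 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 \end{bmatrix}.
$$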

Imagine that FCM is applied to spectral data from a network of five people at $c = 3$, and terminates at the fuzzy 3-partition $U$ shown above. Then, the Zhang et al. conversion yields the possibilistic 3-partitions $U_{0.1}$ and $U_{0.2}$ shown above. The “fuzzy communities” in $U_{0.1}$ are thus interpreted as follows. Persons 2 and 3 belong to all three groups; person 4 belongs to groups 1 and 2, while person 1 is only in group 3 and person 5 is only in group 1. When $\lambda$ is increased to 0.2, the requirement for joint membership in fuzzy communities is more stringent. Now, person 2 only belongs to group 3 (rather than all three groups as in $U_{0.1}$). Thus, the joint membership of an individual in various communities is a function of the threshold $\lambda$.

Zhang et al. do not specify the range of $\lambda$, but clearly it is $0 < \lambda < 1$, for otherwise the bounds of this conversion procedure would be $U_{\lambda \le 0} = [1]_{c \times n}$ and $U_{\lambda \ge 1} = [0]_{c \times n}$. There are many candidate partitions available from this procedure because we can apply this process to candidates generated by FCM at each $c = 2, 3, \ldots, n - 1$, and within each $c$, for any $0 < \lambda < 1$. Zhang et al. choose a “best” possibilistic $U_\lambda$ by maximizing their version of fuzzy modularity, which is defined as follows:
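The threshold conversion in this example is a one-line operation. The following sketch, assuming NumPy, reproduces $U_{0.1}$ and $U_{0.2}$ from the fuzzy 3-partition $U$ given in the text; the helper names `threshold_partition` and `communities_of` are illustrative and do not come from [11].

```python
import numpy as np

# Fuzzy 3-partition of five vertices from the example in the text.
U = np.array([[0.02, 0.12, 0.40, 0.44, 0.91],
              [0.08, 0.18, 0.22, 0.56, 0.05],
              [0.90, 0.70, 0.38, 0.00, 0.04]])

def threshold_partition(U, lam):
    """Set entries with u_ki > lam to 1 and all remaining entries to 0."""
    return (np.asarray(U) > lam).astype(int)

def communities_of(U_lam, person):
    """1-based group labels that contain the given 1-based person index."""
    return [int(k) + 1 for k in np.flatnonzero(U_lam[:, person - 1])]

U_01 = threshold_partition(U, 0.1)
U_02 = threshold_partition(U, 0.2)

for person in range(1, 6):
    print(person, communities_of(U_01, person), communities_of(U_02, person))
```

Raising the threshold from 0.1 to 0.2 removes person 2 from groups 1 and 2, which is the change described in the paragraph above.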

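$$
V_k = \{ i \mid u_{ki} > \lambda,\; 1 \le i \le n \}, \quad k = 1, \ldots, c \tag{8a}
$$

$$
S_z(V_k, V_k) = \sum_{i,j \in V_k} \left( \frac{u_{ki} + u_{kj}}{2} \right) w_{ij} \tag{8b}
$$

$$
S_z(V_k, V) = S_z(V_k, V_k) + \sum_{i \in V_k,\; j \in V - V_k} \left( \frac{u_{ki} + (1 - u_{kj})}{2} \right) w_{ij} \tag{8c}
$$

$$
Q_z = \sum_{k=1}^{c} \left[ \frac{S_z(V_k, V_k)}{S(V, V)} - \left( \frac{S_z(V_k, V)}{S(V, V)} \right)^2 \right]. \tag{8d}
$$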


The values $u_{ki}$ that appear in (8b) and (8c) are from the fuzzy $c$-partition before possibilistic conversion. Clearly, (8b) and (8c) reduce to $S(V_k, V_k)$ and $S(V_k, V)$, respectively, for crisp partitions. Hence, $Q_z = Q_h$ for crisp partitions. In Section IV, we will discuss the theoretical aspects of this modularity function further, but first we would like to introduce another fuzzy variant.
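Equations (8a)–(8d) translate directly into array operations. The sketch below is one possible reading, assuming NumPy; the function name and the toy graph (two triangles joined by one edge) are hypothetical, and the code is not the implementation used in [11].

```python
import numpy as np

def zhang_modularity(W, U, lam):
    """Q_z from (8a)-(8d); U is the fuzzy partition before thresholding."""
    W = np.asarray(W, dtype=float)
    U = np.asarray(U, dtype=float)
    total = W.sum()                                   # S(V, V)
    q = 0.0
    for u in U:                                       # u is the k-th row of U
        inside = u > lam                              # V_k, as in (8a)
        outside = ~inside                             # V - V_k
        w_in = (u[:, None] + u[None, :]) / 2.0 * W           # (u_ki + u_kj)/2 * w_ij
        w_out = (u[:, None] + 1.0 - u[None, :]) / 2.0 * W    # (u_ki + 1 - u_kj)/2 * w_ij
        s_kk = w_in[np.ix_(inside, inside)].sum()                # (8b)
        s_kv = s_kk + w_out[np.ix_(inside, outside)].sum()       # (8c)
        q += s_kk / total - (s_kv / total) ** 2                  # (8d)
    return q

# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} joined by edge (2,3).
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0

U_crisp = np.array([[1, 1, 1, 0, 0, 0],
                    [0, 0, 0, 1, 1, 1]], dtype=float)
U_fuzzy = np.array([[1, 1, 0.8, 0.2, 0, 0],
                    [0, 0, 0.2, 0.8, 1, 1]])

q_z_crisp = zhang_modularity(W, U_crisp, lam=0.5)   # equals Q_h on this graph
q_z_fuzzy = zhang_modularity(W, U_fuzzy, lam=0.5)
print(q_z_crisp, q_z_fuzzy)
```

For the crisp partition both pair weights equal 1, so the function returns the NG value 5/14, consistent with the statement that $Q_z = Q_h$ for crisp partitions.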

2) Liu Fuzzy Modularity: Liu’s modularity function [13] begins with a partition $U \in M_{fcn}$ and then classifies the vertices according to the maximum membership rule (which is commonly called hardening the partition), i.e., set $i \in V_k$ if $k = \arg\max_{j=1,\ldots,c} \{u_{ji}\}$, for $i = 1, \ldots, n$ and $k = 1, \ldots, c$. The modularity is then computed as

$$
Q_l = \frac{1}{\|W\|} \sum_{k=1}^{c} \sum_{i,j \in V_k} \left[ \left( \frac{u_{ki} + u_{kj}}{2} \right) w_{ij} - p_e(i, j) \right] \tag{9a}
$$

where $p_e(i, j)$ is the null model, or expected weight of edge $e(i, j)$. The null model is defined as

$$
p_e(i, j) = \frac{d_f(i)\, d_f(j)}{\|W\|}, \quad i, j \in V_k \tag{9b}
$$

where $d_f(i)$ is the fuzzy degree of vertex $i$ in community $V_k$, which is given by

$$
d_f(i) = \sum_{j \in V_k} \left( \frac{u_{ki} + u_{kj}}{2} \right) w_{ij} + \sum_{r \notin V_k} \left( \frac{u_{ki} + (1 - u_{kr})}{2} \right) w_{ir}. \tag{9c}
$$

Let us now compare $Q_z$ and $Q_l$. First, the vertex sets $V_k$ are created differently; Zhang et al. convert $U$ to a possibilistic partition, while Liu uses the max membership rule. Past that, there is sharp similarity. The following proposition says that if the $V_k$s, viz., the sets of vertices $\{V_1, V_2, \ldots, V_c\}$, are constructed in the same way, then Zhang et al.’s modularity is equivalent to Liu’s modularity.

Proposition 2: Consider crisp sets of vertices, $V_k$, $k = 1, \ldots, c$, and a corresponding fuzzy $c$-partition $U$ of $V$. Then, $Q_z = Q_l$.

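Proof: The first step of the proof is to note that $S(V, V) = \|W\|$ and rewrite $Q_l$ as

$$
Q_l = \sum_{k=1}^{c} \left[ \frac{S_z(V_k, V_k)}{S(V, V)} - \sum_{i,j \in V_k} \frac{p_e(i, j)}{S(V, V)} \right]. \tag{10}
$$

Now, we show that $\sum_{i,j \in V_k} p_e(i, j) / S(V, V) = \left( S_z(V_k, V) / S(V, V) \right)^2$, which will finish the proof. We start by writing

$$
\sum_{i,j \in V_k} \frac{p_e(i, j)}{S(V, V)} = \sum_{i,j \in V_k} \frac{d_f(i)\, d_f(j)}{S(V, V)^2}. \tag{11}
$$

Substituting (9c) into $\sum_{i,j \in V_k} d_f(i)\, d_f(j)$ and collecting terms gives

$$
\sum_{i,j \in V_k} d_f(i)\, d_f(j) = \left( \sum_{i \in V_k} d_f(i) \right)^2 = S_z(V_k, V)^2
$$

where the last equality holds because summing (9c) over $i \in V_k$ reproduces the two sums in (8b) and (8c). Substituting this result back into (11) and then (10) finishes the proof. $\blacksquare$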

Remark 1: Proposition 2 shows that the only difference between $Q_z$ and $Q_l$ is the way that the vertex sets $\{V_1, V_2, \ldots, V_c\}$ are created.
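Proposition 2 can be checked numerically by evaluating (9a)–(9c) and (8b)–(8d) on the same vertex sets. This is a minimal sketch assuming NumPy; the function names are hypothetical, both indices use maximum-membership sets so that only the formulas differ, and the random test graph is synthetic.

```python
import numpy as np

def pair_weights(u, W):
    """Weighted terms shared by (8b)/(9a) and by the cross sums in (8c)/(9c)."""
    w_in = (u[:, None] + u[None, :]) / 2.0 * W          # (u_ki + u_kj)/2 * w_ij
    w_out = (u[:, None] + 1.0 - u[None, :]) / 2.0 * W   # (u_ki + 1 - u_kr)/2 * w_ir
    return w_in, w_out

def liu_modularity(W, U):
    """Q_l from (9a)-(9c) with V_k formed by the maximum membership rule."""
    total = W.sum()
    labels = U.argmax(axis=0)
    q = 0.0
    for k, u in enumerate(U):
        inside = labels == k
        w_in, w_out = pair_weights(u, W)
        # Fuzzy degree (9c) for every vertex; only members of V_k are used below.
        d_f = w_in[:, inside].sum(axis=1) + w_out[:, ~inside].sum(axis=1)
        null = np.outer(d_f[inside], d_f[inside]) / total        # p_e(i, j), (9b)
        q += (w_in[np.ix_(inside, inside)] - null).sum()         # inner sum of (9a)
    return q / total

def zhang_modularity_on_sets(W, U, labels):
    """Q_z from (8b)-(8d) evaluated on externally supplied crisp sets V_k."""
    total = W.sum()
    q = 0.0
    for k, u in enumerate(U):
        inside = labels == k
        w_in, w_out = pair_weights(u, W)
        s_kk = w_in[np.ix_(inside, inside)].sum()
        s_kv = s_kk + w_out[np.ix_(inside, ~inside)].sum()
        q += s_kk / total - (s_kv / total) ** 2
    return q

# Synthetic symmetric weights and a random fuzzy 3-partition of 7 vertices.
rng = np.random.default_rng(0)
A = rng.random((7, 7))
W = A + A.T
np.fill_diagonal(W, 0.0)
U = rng.random((3, 7))
U /= U.sum(axis=0)            # columns sum to 1, so U is a fuzzy partition

q_l = liu_modularity(W, U)
q_z = zhang_modularity_on_sets(W, U, U.argmax(axis=0))
print(q_l, q_z)
```

The two values agree up to floating-point rounding, as the proposition predicts; with Zhang's thresholded sets instead of the maximum-membership sets they would generally differ.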

IV. DISCUSSION

We now compare theoretically the three fuzzy modularity functions (recall that $Q_z$ and $Q_l$ are equivalent if the $V_k$s are formed in the same way). First, we look back at (4), examining each quantity. The ratio $S(V_k, V_k)/S(V, V)$ is the empirical probability, which we denote as $\hat{p}_{c,c}$, that both ends of a randomly selected edge from $G$ lie in cluster $c$. Similarly, $S(V_k, V)/S(V, V)$ is the empirical probability $\hat{p}_c$ that an end of a randomly selected edge (a stub) lies in cluster $c$. Reformulating these quantities in terms of $U$ gives

$$
S(V_k, V_k) = u_k^T W u_k = \sum_{i,j=1}^{n} u_{ki} u_{kj} w_{ij} \tag{12a}
$$

$$
S(V_k, V) = u_k^T W 1_n = \sum_{i,j=1}^{n} u_{ki} w_{ij} \tag{12b}
$$

$$
S(V, V) = 1_n^T W 1_n = \sum_{i,j=1}^{n} w_{ij} \tag{12c}
$$

where $1_n$ is the $n$-length vector of ones. It can also be shown that these equations hold true for the fuzzy modularity in (7) if we consider $V_k$ (or cluster $k$) to be a fuzzy set. Thus, $Q_g$ can be written in the same form as (4) by using (12a)–(12c).
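The step from the trace form (7) to the form of (4) can be spelled out as follows. This is a short sketch of the algebra, assuming that $u_k$ denotes the $k$th row of $U$ written as a column vector, that $m = W 1_n$ (which follows from $m_i = S(\{i\}, V)$), and that $B = W - m m^T / \|W\|$, as implied by the entries in (5).

$$
\begin{aligned}
\operatorname{tr}(U B U^T) &= \sum_{k=1}^{c} u_k^T B u_k
= \sum_{k=1}^{c} \left[ u_k^T W u_k - \frac{(u_k^T m)^2}{\|W\|} \right] \\
&= \sum_{k=1}^{c} \left[ S(V_k, V_k) - \frac{S(V_k, V)^2}{S(V, V)} \right],
\end{aligned}
$$

where the second line uses $u_k^T m = u_k^T W 1_n = S(V_k, V)$ from (12b) and $\|W\| = S(V, V)$ from (12c). Dividing by $\|W\| = S(V, V)$ gives

$$
Q_g = \sum_{k=1}^{c} \left[ \frac{S(V_k, V_k)}{S(V, V)} - \left( \frac{S(V_k, V)}{S(V, V)} \right)^2 \right],
$$

which has the same form as (4), with the sums now weighted by the memberships in $U$.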

Conceptually, $S(V_a, V_b)$ is the $W$-induced inner product of the set memberships of the individual vertices in sets $V_a$ and $V_b$. To put it another way, $S(V_a, V_b)$ represents the weighted sum of the edge weights between $V_a$ and $V_b$, where the summation weight of edge $(i, j)$ is $u_{ai} u_{bj}$. This has the interpretation of the joint a posteriori probability $p(a|i, b|j) = u_{ai} u_{bj}$. It can also be interpreted as a fuzzy intersection operator, which we will explore more in an upcoming paper. Going further with this idea, the membership (or probability) of any vertex $i$ in the universal set $V$ is always 1; hence, (12b) and (12c) also make theoretical sense. If $u_{ki} = p(k|i)$, the a posteriori probability of cluster $k$ given vertex $i$, then (12a) divided by (12c) is $\hat{p}_{c,c}$, and (12b) divided by (12c) is $\hat{p}_c$, for a given mixture model. Since the idea is to maximize $Q$, we want to maximize the membership of large edge weights in $V_k$, while minimizing the membership of the large stub weights (the weights of the edges that connect $V_k$ to some other cluster). For these reasons, we believe that our fuzzy modularity in (7) is the most theoretically pleasing extension, as it both preserves the original probabilistic interpretation and allows for a fuzzy interpretation in the presence of fuzzy (or probabilistic or possibilistic) clusters.

The equivalents of (12a) and (12b) for $Q_z$ and $Q_l$ are (8b) and (8c), where $V_k$ is now a crisp set formed by the respective hardening method. All three fuzzy modularities, $Q_g$, $Q_z$, and $Q_l$, represent $S(V, V)$ as (12c). Therefore, the probabilistic interpretation does not hold true for $Q_z$ and $Q_l$. Furthermore, it is easy to see that $S_z(V_a, V_b)$ does not have a fuzzy intersection interpretation either: it violates the boundary condition, i.e., $(u_{ai} + 1)/2 \neq u_{ai}$. For these reasons, we believe that $Q_g$ provides a more theoretically consistent fuzzy interpretation of NG modularity. We will now compare the modularity functions for a real social network clustering application.

V. EXPERIMENTS

Fig. 1. Karate club social network [14].

To test the modularity functions, we implemented the modification of the MULTICUT spectral clustering algorithm described in [16], applying FCM to the resulting spectral features extracted from the graph $G$. To compute $Q_h$, we hardened the resulting partition according to the maximum membership rule and applied equation (6). For $Q_z$, we will show results for $\lambda = 0.01$, $0.1$, $0.25$, and $0.5$. Our proposed fuzzy modularity in (7) was computed directly on the fuzzy partition $U$. For all experiments, we set the FCM fuzzifier $m = 2$. FCM was initialized randomly by selecting $c$ vectors from the spectral features as initial cluster centers. For each trial, we then ran the clustering algorithm for $c = 2, 3, \ldots, 10$ (for example). The modularity indices were calculated for each $c$-partition, and the maximum determined the chosen value of $c$. One hundred trials comprised the full experiment.
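One trial of the procedure described above can be sketched as follows. This is a hedged illustration only: the spectral embedding below (leading eigenvectors of the degree-normalized weight matrix) is a generic stand-in and is not the modified MULTICUT algorithm of [16], the FCM routine is a minimal textbook version, and all function names and the toy graph are assumptions. NumPy is assumed.

```python
import numpy as np

def spectral_features(W, c):
    """Stand-in embedding (assumption): c leading eigenvectors of D^-1/2 W D^-1/2.

    Requires every vertex to have positive degree.
    """
    d = W.sum(axis=1)
    vals, vecs = np.linalg.eigh(W / np.sqrt(np.outer(d, d)))
    return vecs[:, -c:]                     # eigh returns eigenvalues in ascending order

def fcm(X, c, m=2.0, n_iter=200, tol=1e-8, rng=None):
    """Minimal fuzzy c-means; returns the c x n membership matrix and the centers."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    # Random initialization: c feature vectors serve as the initial centers.
    centers = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)   # c x n
        inv = np.maximum(d2, 1e-12) ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)        # columns sum to 1
        Um = U ** m
        new_centers = (Um @ X) / Um.sum(axis=1, keepdims=True)
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return U, centers

def generalized_modularity(W, U):
    """Q_g = tr(U B U^T) / ||W||, as in (7)."""
    total = W.sum()
    m_vec = W.sum(axis=1)
    B = W - np.outer(m_vec, m_vec) / total
    return np.trace(U @ B @ U.T) / total

def choose_c(W, c_values, rng):
    """One trial: cluster at each c and keep the c with the largest Q_g."""
    scores = []
    for c in c_values:
        U, _ = fcm(spectral_features(W, c), c, rng=rng)
        scores.append(generalized_modularity(W, U))
    return c_values[int(np.argmax(scores))], scores

# Hypothetical toy graph: two 4-cliques joined by a single edge.
W = np.zeros((8, 8))
W[:4, :4] = 1.0
W[4:, 4:] = 1.0
np.fill_diagonal(W, 0.0)
W[3, 4] = W[4, 3] = 1.0

rng = np.random.default_rng(0)
best_c, scores = choose_c(W, [2, 3, 4], rng)
U2, _ = fcm(spectral_features(W, 2), 2, rng=rng)
print(best_c, scores)
```

Repeating `choose_c` with different random seeds gives the per-trial choices of $c$ that the experiment aggregates; the other indices can be swapped in for `generalized_modularity` in the same loop.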

We produce two visualizations to show the results of the experiments. The first visualization is an image, as shown in Fig. 2(a), that displays the number of trials in which each modularity function was maximized at each value of $c$. White indicates that the modularity function was maximized at that value of $c$ in most of the trials, while black indicates that the modularity function was rarely maximized at that value of $c$. The second visualization, as shown in Fig. 2(b), is a plot of the normalized (divided by the maximum) average value of the modularity function at each value of $c$ over the 100 trials.

Two datasets are used: Zachary’s Karate Club data [2] and the American College Football data [3].¹ Each of these datasets can be modeled as an undirected graph $G = (V, E, W)$ and has been extensively studied in the social network literature.
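The two summaries behind these visualizations can be computed from a table of index values. The sketch below assumes NumPy and a hypothetical array `Q` holding one index value per trial and per candidate $c$; the numbers are made up for illustration and are not results from the paper.

```python
import numpy as np

def summarize_trials(Q):
    """Q has shape (n_trials, n_c): one modularity value per trial and per c.

    Returns the number of trials won by each c and the mean index value
    at each c divided by its maximum (assumes the largest mean is positive).
    """
    Q = np.asarray(Q, dtype=float)
    winners = Q.argmax(axis=1)                          # index of the best c per trial
    counts = np.bincount(winners, minlength=Q.shape[1])
    mean_q = Q.mean(axis=0)
    return counts, mean_q / mean_q.max()

c_values = [2, 3, 4]
# Made-up values for three trials (rows) and three candidate c (columns).
Q = [[0.30, 0.40, 0.35],
     [0.31, 0.38, 0.39],
     [0.29, 0.41, 0.36]]

counts, normalized_mean = summarize_trials(Q)
print(dict(zip(c_values, counts.tolist())))
print(normalized_mean)
```

The counts correspond to one row of the image in view (a), and the normalized means to one curve in view (b).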

A. Karate Club Data

The Karate Club data is an undirected graph with 34 vertices that show links between 34 members of a university karate club; it was collected by Zachary in 1977 [2]. Edge weight $w_{ij}$ indicates the relative strength of the association between individuals $i$ and $j$ (number of situations in and outside the club in which interactions occurred). The maximum value in $W$ is 7 for the edge between members 26 and 32.

This dataset is a favorite among social network-ists, because the evolution of the relationship between pairs of members in the Karate club (which was known and recorded by Zachary) provides a sort of “ground truth” for various social network analyses.

Zachary used these data and an information flow model of network conflict resolution to explain the split up of this group into two factions (the squares and circles in Fig. 1), following disputes among the members. The principals in the split were the karate instructor (vertex 1) and the president of the club (vertex 34).

Fig. 1 shows Zachary’s karate club network, as depicted by Newman and Girvan [14]. Square nodes represent the instructor’s faction, and circular nodes represent the president’s faction. The original belief was that this network should decompose well into two clusters, as shown in Fig. 1, because its members bifurcated, following either the president or the instructor. However, various discussions of these data in the literature disagree, and the consensus is that there are three clusters: persons 5, 6, 7, 11, and 17 make a third cluster.

¹Both of these datasets can be downloaded from the UCI Network Data Repository at http://networkdata.ics.uci.edu/.

Fig. 2. Results of modularity functions on Karate Club data over 100 independent trials. (a) Choice of c. (b) Normalized average modularity value.

Our experimental results also back up the claim that the Karate data have three clusters. Fig. 2 shows the two visualizations of our results. View (a) shows that all the modularity indices clearly prefer $c = 3$, except for $Q_z$ ($\lambda = 0.01$), which seems to weakly indicate $c = 8$. This points to what we believe is the biggest problem with $Q_z$: the choice of $\lambda$ can affect the results. Zhang et al. also provide no guidance as to how to choose $\lambda$ for a specific problem. In the Football data example, this problem will be further exemplified.

Fig. 2(b) shows that $Q_z$ ($\lambda = 0.01$) deviates dramatically from the trend shown by the other modularity indices. Needless to say, it seems that choosing low values of $\lambda$ is a bad idea. The plots of the normalized mean index value of the other six indices clearly show a preference for $c = 3$. Our generalized modularity has the sharpest peak of all the indices, which indicates its strength in indicating the preferred cluster structure. Furthermore, $Q_g$ is also the only index that takes into full account the fuzziness of the partition.

B. American College Football Data

The Football data are also an undirected graph; they have 115 vertices that show links between 115 Division I college football teams during the 2000 regular season [3]. Unlike the Karate data, the Football data are unweighted, as $w_{ij} \in \{0, 1\}$; $w_{ij} = 1$ if teams $i$ and $j$ played each other and 0 otherwise. The network represents 613 total games of the regular season. The teams come from 11 conferences, and teams typically play most of their games against members of their own conference; hence, this outside information would suggest that $c = 11$. However, the Sunbelt conference teams play quite a few games outside of their conference (mostly with the Western Athletic and Mid-American conferences). Thus, many feel that this network has $c = 10$ or $c = 11$ clusters [11].

Fig. 3. Results of modularity functions on American College Football data over 100 independent trials. (a) Choice of c. (b) Normalized average modularity value.

The visualization of our experimental results shown in Fig. 3(a)backs up the conjecture that there are c = 10 or c = 11 clusters inthese data. Qh and Qg choose c = 10 or c = 11 for most of the trials,and three of the other five modularity indices show at least a moderatepreference for c = 10 or c = 11. Interestingly, Qg also indicates somecluster structure at c = 9; at c = 9, the partition shows more mixing be-tween the Sunbelt, Western Athletic, and Mid-American conferences,which matches well with our outside knowledge of these data. In con-trast with Qh and Qg , the other fuzzy indices also show a frequentchoice of c = 5 (note that Qg also shows a very weak preference orinfrequent choice of c = 5). Furthermore, the results of Qz are highlydependent on the value of λ. At λ = 0.5, c = 10 is the most frequentchoice, while at λ = 0.25, c = 5 is the most frequency choice, and atλ = 0.1, c = 6 is chosen most often. Again, the results for λ = 0.01seem to indicate that this is a bad choice for this parameter.

Fig. 3(b) plots the normalized mean index value over the 100 trials for all seven modularity functions. Of these, Qg and Qz (λ = 0.5) show the strongest peaks at c = 10 and c = 11; although Qz (λ = 0.5) has a very strong peak at c = 5, Qg has a somewhat less pronounced peak at c = 5. Interestingly, Qh is quite flat, compared with the other modularities, but from Fig. 3(a), we know that Qh usually maximizes at c = 10 or c = 11. Qz for λ = 0.25, 0.1, and 0.01 shows behavior that is contrary to the other modularities, providing more evidence that choosing λ can be problematic.
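The two summaries discussed above (how often each c is chosen, and the normalized mean index value) can be computed from a table of per-trial index values. The sketch below is illustrative only: the function name `summarize_trials` and the numbers are hypothetical, and normalizing the mean curve by its largest value is an assumption, since the normalization used for the figure is not specified here. It also assumes the largest mean is positive.

```python
from collections import Counter


def summarize_trials(index_values, c_values):
    """index_values[t][k] is the validity index on trial t for c_values[k] clusters."""
    n_trials = len(index_values)
    n_c = len(c_values)

    # Tally of the c that maximizes the index on each trial
    # (ties go to the smallest position in c_values).
    choices = Counter()
    for row in index_values:
        best = max(range(n_c), key=lambda k: row[k])
        choices[c_values[best]] += 1

    # Mean over trials, scaled so that the largest mean equals 1
    # (assumed normalization; requires a positive peak).
    means = [sum(row[k] for row in index_values) / n_trials for k in range(n_c)]
    peak = max(means)
    normalized = [m / peak for m in means]
    return choices, normalized


# Hypothetical index values for 3 trials and c in {9, 10, 11}.
c_values = [9, 10, 11]
index_values = [
    [0.50, 0.60, 0.58],
    [0.52, 0.57, 0.59],
    [0.51, 0.61, 0.60],
]
choices, normalized = summarize_trials(index_values, c_values)
```

In this toy table, c = 10 is chosen on two trials and c = 11 on one, and the normalized mean curve peaks at c = 10. A flat normalized curve can still coincide with a consistent per-trial choice, which is the situation described for Qh.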

Overall, we argue that the results shown in Fig. 3(a) present a strong case for our generalized modularity as compared with the other fuzzy modularities. Qg takes into account the fuzzy memberships when computing the modularity but does not seem to be plagued by the problems that Qz and Ql exhibit, i.e., choosing λ in the case of Qz, and the frequency at which Qz and Ql choose c = 5. To be fair, cluster validity is an inexact science, and there could be a cluster structure in the Football data at c = 5; however, based on our knowledge of this dataset, c = 10 and c = 11 are more reasonable solutions. Finally, our generalized modularity most closely mimics the statistically inspired NG modularity, both in theory and in results.

VI. CONCLUSION

Our generalization of the NG modularity function in (7) represents a step forward in the problem of finding fuzzy communities in graph data. First, it is a direct generalization of the underlying theory of graph modularity, namely, providing a measure of the probability of a graph structure relative to a null hypothesis. Second, it is a parameter-free instantiation of fuzzy modularity. Finally, we showed that, on the well-studied Karate Club and American College Football datasets, our fuzzy modularity index performs comparably with Newman's crisp modularity function, the Zhang et al. fuzzy index, and Liu's fuzzy index. Furthermore, we argue that our modularity function more effectively identifies the cluster structure in the American College Football data, as compared with the other two fuzzy indices.
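As a rough illustration of the first two points, the sketch below evaluates a soft modularity under the assumption that (7) takes the trace form tr(U B Uᵀ)/‖W‖, with B = W − m mᵀ/‖W‖, m the vector of weighted vertex degrees, and ‖W‖ the sum of all entries of W. This form should be checked against (7) before use. The function name `soft_modularity` and the toy graph are hypothetical.

```python
def soft_modularity(U, W):
    """Evaluate tr(U B U^T) / ||W|| with B = W - m m^T / ||W|| (assumed form of (7)).

    U[k][i] is the membership of vertex i in community k.
    W is a symmetric weight matrix given as a list of lists.
    """
    n = len(W)
    m = [sum(row) for row in W]  # weighted degree of each vertex
    total = sum(m)               # ||W||: sum of all entries of W
    q = 0.0
    for u in U:                  # one term of the trace per community
        for i in range(n):
            for j in range(n):
                q += u[i] * u[j] * (W[i][j] - m[i] * m[j] / total)
    return q / total


# Toy graph (hypothetical): two disjoint triangles on vertices 0-2 and 3-5.
W = [[0] * 6 for _ in range(6)]
for a, b in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    W[a][b] = W[b][a] = 1

crisp = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]  # one community per triangle
uniform = [[0.5] * 6, [0.5] * 6]                   # maximally fuzzy memberships

q_crisp = soft_modularity(crisp, W)      # equals the NG modularity of this crisp partition
q_uniform = soft_modularity(uniform, W)  # no structure relative to the null model
```

No tuning parameter appears in the computation. With a crisp U the value coincides with the NG modularity of the same partition (0.5 for the two triangles), and the uninformative uniform memberships score zero.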

As we stated earlier, the functions in (12a)–(12c) can be thought of as fuzzy intersection operations; the intersection operator used in Qg is multiplication. In the future, we will extend S(Va, Vb) to other fuzzy intersection operators

$$S_g(V_a, V_b) = \sum_{i,j=1}^{n} i(u_{ai}, u_{aj})\, w_{ij}$$

where i(·, ·) is a fuzzy intersection operator, e.g., min or multiplication. It is our conjecture that different fuzzy intersection operators will elicit different types of communities in the social network (or, more generally, the undirected weighted graph). We will do a thorough examination with real data to show how different operators affect the clustering results.
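To make the role of the intersection operator concrete, the following sketch evaluates the sum above with the operator passed in as a function. The names `soft_link` and `product`, the path graph, and the membership values are hypothetical. The function accepts two membership vectors for generality; passing the same vector twice reproduces the expression as printed.

```python
def soft_link(u_a, u_b, W, intersect):
    """Sum of intersect(u_a[i], u_b[j]) * w_ij over all vertex pairs (i, j)."""
    n = len(W)
    return sum(
        intersect(u_a[i], u_b[j]) * W[i][j]
        for i in range(n)
        for j in range(n)
    )


def product(x, y):
    """Algebraic product, the intersection operator used in Qg."""
    return x * y


# Toy path graph 0 - 1 - 2 with hypothetical memberships in one community.
W = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
u = [0.8, 0.5, 0.2]

s_product = soft_link(u, u, W, product)  # 2 * (0.8*0.5 + 0.5*0.2)
s_min = soft_link(u, u, W, min)          # 2 * (min(0.8, 0.5) + min(0.5, 0.2))
```

For memberships in [0, 1], min never falls below the product, so the min operator credits partially shared edges more generously. This is one way the choice of operator could change which communities score well.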

We will also examine how our fuzzy modularity in (7) can be directly maximized. Initial work in this direction indicates that (7) can be posed as a generalized eigenvalue problem. Our current efforts are focused on establishing this maximization problem within the existing spectral clustering framework and on stabilizing the numerical issues involved in reaching the solution. Second, we will examine how fuzzy modularity can be generalized as a cluster validity index (or quality function) for asymmetric adjacency matrices or directed graphs. Finally, we will do a comprehensive comparison of various validity indices and clustering algorithms on a wide assortment of graph-based and social network data. As always, we believe that there is no free lunch in the clustering game and that the best attacks at these problems will involve an ensemble of clustering tools.

REFERENCES

[1] S. Wasserman and K. Faust, Social Network Analysis. Cambridge, U.K.: Cambridge Univ. Press, 1994.

[2] W. Zachary, “An information flow model for conflict and fission in small groups,” J. Anthropol. Res., vol. 33, pp. 452–473, 1977.

[3] M. Girvan and M. E. J. Newman, “Community structure in social and biological networks,” Proc. Nat. Acad. Sci., vol. 99, pp. 7821–7826, 2002.

[4] L. A. Zadeh, “Similarity relations and fuzzy orderings,” Inf. Sci., vol. 3, pp. 177–200, 1971.

[5] J. M. Blin, “Fuzzy relations in group decision theory,” J. Cybern., vol. 4, pp. 17–22, 1974.

[6] J. C. Bezdek, B. Spillman, and R. Spillman, “A fuzzy relation space for group decision theory,” Fuzzy Sets Syst., vol. 1, pp. 255–268, 1978.

[7] J. C. Bezdek, B. Spillman, and R. Spillman, “Fuzzy relation spaces for group decision theory: An application,” Fuzzy Sets Syst., vol. 2, pp. 5–14, 1979.

[8] B. Spillman, J. C. Bezdek, and R. Spillman, “Development of an instrument for the dynamic measurement of consensus,” Commun. Monographs, vol. 46, pp. 1–12, 1979.

[9] S. Fortunato, “Community detection in graphs,” Phys. Rep., vol. 486, pp. 75–174, 2010.

[10] J. Reichardt and S. Bornholdt, “Detecting fuzzy community structures in complex networks with a Potts model,” Phys. Rev. Lett., vol. 93, no. 21, pp. 218701-1–218701-4, 2004.

[11] S. Zhang, R. S. Wang, and X. S. Zhang, “Identification of overlapping community structure in complex networks using fuzzy c-means clustering,” Physica A, Statist. Mech. Appl., vol. 374, no. 1, pp. 483–490, 2007.

[12] J. C. Bezdek, J. M. Keller, R. Krishnapuram, and N. R. Pal, Fuzzy Models and Algorithms for Pattern Recognition and Image Processing. Norwell, MA, USA: Kluwer, 1999.

[13] J. Liu, “Fuzzy modularity and fuzzy community structure in networks,” Eur. Phys. J. B, vol. 77, pp. 547–557, 2010.

[14] M. E. J. Newman and M. Girvan, “Finding and evaluating community structure in networks,” Phys. Rev. E, vol. 69, pp. 026113-1–026113-15, 2004.

[15] M. E. J. Newman, “Modularity and community structure in networks,” Proc. Nat. Acad. Sci., vol. 103, no. 23, pp. 8577–8582, 2006.

[16] D. Verma and M. Meila, “A comparison of spectral clustering algorithms,” Dept. Comput. Sci. Eng., Univ. Washington, Seattle, WA, USA, Tech. Rep. 03-05-01, 2003.