Infinite Edge Partition Models for Overlapping Community Detec… · 2020-06-17 ·...


The community-affiliation graph model of Yang & Leskovec (2012) can be considered as a special case if we restrict [expression missing].

A hierarchical gamma process infinite edge partition model is proposed to factorize the binary adjacency matrix of an unweighted undirected relational network under a Bernoulli-Poisson link:

➢ The Bernoulli-Poisson link connects each edge to a latent count that is further partitioned. Each node is assigned to one or multiple latent communities depending on how its edges are partitioned.

➢ The model describes both homophily and stochastic equivalence, and is scalable to big sparse networks by focusing its computation on pairs of linked nodes.

➢ It can not only discover overlapping communities and inter-community interactions, but also predict missing edges.

➢ The number of communities is automatically inferred in a nonparametric Bayesian manner, and efficient inference via Gibbs sampling is derived using novel data augmentation techniques.

Infinite Edge Partition Models for Overlapping Community Detection and Link Prediction

Mingyuan Zhou

Department of Information, Risk, and Operations Management
The University of Texas at Austin, Austin, TX, USA

Introduction

Model and Inference

❑ Modeling Components

Bernoulli-Poisson Link:

Overlapping community structure:

❑ Scalability for Big Sparse Networks

❑ Hierarchical Gamma Process

❑ Protein-Protein interaction network

❑ NIPS234 Coauthor network

Poisson Factor Analysis:

Modeling Assortativity:

Both assortativity and dissortativity:
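One way to read the two headings above: with only per-community weights, two nodes can interact only through communities they share (assortativity); with a full community-by-community weight matrix, they can also interact across different communities (dissortativity). The following is a minimal sketch, assuming nonnegative affiliation vectors `phi_i` and `phi_j`, per-community weights `r`, and an interaction matrix `lam`. These names and the exact bilinear forms are assumptions made for illustration and are not taken from the poster.

```python
def assortative_rate(phi_i, phi_j, r):
    """Pairwise rate when only within-community interactions are modeled (assumed form)."""
    return sum(r_k * a * b for r_k, a, b in zip(r, phi_i, phi_j))


def general_rate(phi_i, phi_j, lam):
    """Pairwise rate with a full community-by-community weight matrix (assumed form)."""
    K = len(phi_i)
    return sum(phi_i[k1] * lam[k1][k2] * phi_j[k2]
               for k1 in range(K) for k2 in range(K))


# Node i belongs only to community 0, node j only to community 1.
phi_i = [1.0, 0.0]
phi_j = [0.0, 2.0]
r = [0.5, 0.5]                      # hypothetical per-community weights
lam = [[0.5, 0.3], [0.3, 0.5]]      # hypothetical interaction weights

print(assortative_rate(phi_i, phi_j, r))  # 0.0: no shared community
print(general_rate(phi_i, phi_j, lam))    # 0.6: the off-diagonal weight links them
```

With a diagonal `lam`, `general_rate` reduces to `assortative_rate`, so in this sketch the second form strictly generalizes the first.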

Link binary to count:

Marginal distribution:

Conditional posterior:
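The three items above correspond to the usual pieces of a Bernoulli-Poisson link: a binary value is obtained by thresholding a latent Poisson count at one, so the marginal link probability is 1 - exp(-rate), and given a link the latent count follows a zero-truncated Poisson. The following is a minimal sketch under those assumptions; the function names and the generic rate `lam` are hypothetical, and the samplers are simple textbook choices rather than the inference scheme used in the poster.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small rates."""
    threshold = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count


def link_probability(lam):
    """Marginal P(b = 1) when b = 1(m >= 1) and m ~ Poisson(lam)."""
    return 1.0 - math.exp(-lam)


def sample_latent_count(b, lam, rng):
    """Draw m given b: zero if b == 0, otherwise a zero-truncated Poisson."""
    if b == 0:
        return 0
    while True:  # rejection sampling; slow when lam is very small
        m = sample_poisson(lam, rng)
        if m >= 1:
            return m


rng = random.Random(0)
lam = 0.7  # hypothetical rate for one node pair
draws = [1 if sample_poisson(lam, rng) >= 1 else 0 for _ in range(20000)]
print(sum(draws) / len(draws), link_probability(lam))
```

The empirical link frequency should land close to `link_probability(lam)`, which is a quick sanity check on the thresholding construction.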

The count (indexed by i, j, k1, and k2) represents how often nodes i and j interact due to their affiliations with communities k1 and k2, respectively.
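To make the idea of partitioning an edge's latent count concrete, the sketch below splits a total count for a node pair across community pairs (k1, k2). It assumes the split is multinomial with weights proportional to `phi_i[k1] * lam[k1][k2] * phi_j[k2]`, which is what a sum of independent Poisson counts would imply. All names here are hypothetical.

```python
import random


def partition_count(m_ij, phi_i, phi_j, lam, rng):
    """Allocate a total count m_ij over community pairs (k1, k2)."""
    K = len(phi_i)
    counts = [[0] * K for _ in range(K)]
    if m_ij == 0:
        return counts  # nothing to partition for an unlinked pair
    pairs = [(k1, k2) for k1 in range(K) for k2 in range(K)]
    weights = [phi_i[k1] * lam[k1][k2] * phi_j[k2] for k1, k2 in pairs]
    # Each unit of the count picks one community pair independently.
    for k1, k2 in rng.choices(pairs, weights=weights, k=m_ij):
        counts[k1][k2] += 1
    return counts


rng = random.Random(1)
phi_i = [1.0, 0.0]                  # node i is affiliated with community 0 only
phi_j = [0.5, 2.0]
lam = [[1.0, 0.2], [0.2, 1.0]]
counts = partition_count(5, phi_i, phi_j, lam, rng)
print(counts)
```

Row 1 of `counts` stays at zero because node i has no affiliation with community 1, which illustrates how a node's community memberships follow from how its edges are partitioned.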

Computation is mainly spent on pairs of linked nodes: if b_ij = 0, then all of the corresponding latent counts are equal to zero almost surely.

The cost is O(dN) instead of O(N^2), where d is the average node degree.
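The scaling argument can be illustrated by iterating over an adjacency list instead of over all node pairs. This is a sketch with a hypothetical toy graph and a hypothetical helper `linked_pairs`; it only counts the pairs that would need latent-count updates.

```python
def linked_pairs(adjacency):
    """Yield each undirected edge (i, j) with i < j exactly once."""
    for i, neighbors in adjacency.items():
        for j in neighbors:
            if i < j:
                yield i, j


# Toy undirected graph with 5 nodes and 3 edges (node 4 is isolated).
adjacency = {0: {1, 2}, 1: {0}, 2: {0, 3}, 3: {2}, 4: set()}
n = len(adjacency)
edges = list(linked_pairs(adjacency))
all_pairs = n * (n - 1) // 2

print(len(edges), all_pairs)  # 3 10
```

The loop touches each neighbor entry once, so its work is proportional to the sum of node degrees (dN) rather than to the number of node pairs.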

❑ Hierarchical Gamma Process Edge Partition Model

❑ Gamma Process Edge Partition Model

❑ Gibbs Sampling via Data Augmentation and Marginalization

Using inference techniques developed for the Bernoulli-Poisson link, and the Poisson, multinomial, and negative binomial distributions.

2015  

Example Results
