Community Structure in Large Social and Information...

Community Structure in LargeCommunity Structure in LargeSocial and Information NetworksSocial and Information Networks

Michael W. Mahoney

Stanford University

(For more info, see: http://cs.stanford.edu/people/mmahoney)

Lots and lots of large data!

• DNA micro-array data and DNA SNP data

• High energy physics experimental data

• Hyper-spectral medical and astronomical image data

• Term-document data

• Medical literature analysis data

• Collaboration and citation networks

• Internet networks and web graph data

• Advertiser-bidded phrase data

• Static and dynamic social network data

Networks and networked data

Interaction graph model ofnetworks:• Nodes represent “entities”• Edges represent “interaction”between pairs of entities

Lots of “networked” data!!• technological networks

– AS, power-grid, road networks

• biological networks– food-web, protein networks

• social networks– collaboration networks, friendships

• information networks– co-citation, blog cross-postings,advertiser-bidded phrase graphs...

• language networks– semantic networks...

• ...

Algorithmic vs. Statistical Perspectives

Computer Scientists• Data: are a record of everything that happened.• Goal: process the data to find interesting patterns and associations.• Methodology: Develop approximation algorithms under differentmodels of data access since the goal is typically computationally hard.

Statisticians• Data: are a particular random instantiation of an underlying processdescribing unobserved patterns in the world.• Goal: is to extract information about the world from noisy data.• Methodology: Make inferences (perhaps about unseen events) bypositing a model that describes the random variability of the dataaround the deterministic model.

Lambert (2000)

Perspectives are NOT incompatible

• Statistical/probabilistic ideas are central to recent work on developingimproved randomized algorithms for matrix problems.

• Intractable optimization problems on graphs/networks yield toapproximation when assumptions are made about network participants.

• In boosting (a statistical technique that fits an additive model byminimizing an objective function with a method such as gradient descent)the computation parameter (i.e., the number of iterations) also serves asa regularization parameter.

Sponsored (“paid”) SearchText-based ads driven by user query

Community Structure in Large Social and Information...

Documents

INCREASE/DECREASE 3,337,685 · SAN DIEGO MSA - PER SUBMARKET ANALYSIS RENT & OCCUPANCY STATS SAN DIEGO MSA - PER SUBMARKET ANALYSIS SALES DATA (50+) SALES DATA (5-49) San Diego MSA

Algorithmic+and+Statistical+Perspectives++ on+Large6Scale ...hyperion.usc.edu/UQ-SciML-2018/Mahoney.pdf · (Lin and Altman (2005) Am J Hum Genet) • The figure renders visual support

KANSAS CITY SUBMARKET DETAIL REPORT...Submarket Detail Report | March 2017 # Property Units Year 1 Antioch Crossing 96 2016 2 Brighton Creek Apartments 286 2016 3 Haven at Shoal Creek

MB Real Estate's 2012 4th Quarter Chicago Market Overview Submarket Snapshots

Using sub query. Sub query A query inside another query

CRESTVIEW NORTH APARTMENTS - 151 UNITS...Townhomes are located in well-established Northeast Carmichael/West Fair Oaks submarket of Sacramento. This submarket has led the Sacramento

Chapter 7 The Query Compiler Query Processor ： Query Parser Tree Logical Query Plan Physical Query Plan Query Structure Relational Algebraic Expression

Greater Downtown Miami Mid-Year Report...2 Introduction 4 Greater Downtown Miami Market Submarket Map 5 What’s Changed Since Year-End 2016? 9 Submarket Analysis 11 Unit Sizes and

Query Processing and Query Optimization

Nannie Helen Burroughs - Neighborhood Retail Submarket

Extracting Structure from Matrices & Tensors by Random ...tgkolda/tdw2004/Mahoney.pdf · Extracting Structure from Matrices & Tensors by Random Sampling Michael W. Mahoney Yale University

SUBLEASE AVAILABLE IN HIGH-GROWTH SUBMARKET 4350 LOCKHILL ... · SUBLEASE AVAILABLE IN HIGH-GROWTH SUBMARKET 4350 LOCKHILL SELMA 4350 LOCKHILL SELMA SAN ANTONIO, TEXAS 8249 The information

© 2007 John Wiley & Sons Chapter 4 - Market and Submarket AnalysisPPT 4-1 Market/Submarket Analysis Chapter Four Copyright © 2007 John Wiley & Sons, Inc

Buffalo Niagara Advanced Business Services · SOUTH SUBMARKET EAST SUBMARKET Operational Advantages Buffalo Niagara’s low cost of real estate and wages help a business’ bottom

LoopNet · OFFICE SUBMARKET REPORT Submarket Key Statistics 2 Leasing 3 Rent 7 Construction 9 Sales 13 ... Net Absorption SF 628 K 810,518 374,938 5,942,916 2015 Q2 (66,765) 2010

Beginning Query Conference 2015 1. Aeries Query 2. Using Query 3

Houston Office Submarket Update 12_2012

Q2 2019 CHARLOTTE...April 2019 SUBMARKET SNAPSHOT 5 YEAR CHANGE ... prestigious neighborhoods of Myers Park, Dilworth, and SouthPark, the Airport submarket offers a wealth of executive

Introduction to Database Systems CSE 444...Query Evaluation Steps Parse & Check Query Decide how best to answer query: query optimization Query Execution SQL query 3 Return Results

CHMA: Greenville-Spartanburg, South Carolina...demand. Includes an estimated demand for 200 mobile homes in the Greenville submarket and 100 mobile homes in the Spartanburg submarket