C Faloutsos 15 781 10 701 CMU SCS Graph analysis laws tools Christos Faloutsos Carnegie Mellon University CMU SCS Overall Outline Laws mainly power laws Generators and Tools 15 781 10 701 c 2008 C Faloutsos 3 CMU SCS Outline Problem definition Motivation Static dynamic laws generators Tools CenterPiece graphs Tensors Other projects Virus propagation e bay fraud detection Conclusions 15 781 10 701 c 2008 C Faloutsos 4 1 C Faloutsos 15 781 10 701 CMU SCS Motivation Data mining find patterns rules outliers Problem 1 How do real graphs look like Problem 2 How do they evolve Problem 3 How to generate realistic graphs TOOLS Problem 4 Who is the master mind Problem 5 Track communities over time 15 781 10 701 c 2008 C Faloutsos 5 CMU SCS Problem 1 Joint work with Dr Deepayan Chakrabarti CMU Yahoo R L 15 781 10 701 c 2008 C Faloutsos 6 CMU SCS Graphs why should we care web hyper text graph IR bi partite graphs doc terms T1 D1 DN TM and more 15 781 10 701 c 2008 C Faloutsos 7 2 C Faloutsos 15 781 10 701 CMU SCS Graphs why should we care Internet Map lumeta com Friendship Network Moody 01 15 781 10 701 Food Web Martinez 91 Protein Interactions genomebiology com c 2008 C Faloutsos 8 CMU SCS Graphs why should we care network of companies board of directors members viral marketing web log blog news propagation computer network security email IP traffic and anomaly detection 15 781 10 701 c 2008 C Faloutsos 9 CMU SCS Problem 1 network and graph mining 15 781 10 701 How does the Internet look like How does the web look like What is normal abnormal which patterns laws hold c 2008 C Faloutsos 10 3 C Faloutsos 15 781 10 701 CMU SCS Graph mining Are real graphs random 15 781 10 701 c 2008 C Faloutsos 11 CMU SCS Laws and patterns Are real graphs random A NO Diameter in and out degree distributions other surprising patterns 15 781 10 701 c 2008 C Faloutsos 12 CMU SCS Solution 1 Power law in the degree distribution SIGCOMM99 internet domains att com log degree ibm com 0 82 log rank 15 781 10 701 c 2008 C Faloutsos 13 4 C Faloutsos 15 781 10 701 CMU SCS Solution 1 Eigen Exponent E Eigenvalue Exponent slope E 0 48 May 2001 Rank of decreasing eigenvalue A2 power law in the eigenvalues of the adjacency matrix 15 781 10 701 c 2008 C Faloutsos 14 CMU SCS But How about graphs from other domains 15 781 10 701 c 2008 C Faloutsos 15 CMU SCS Web In and out degree distribution of web sites Barabasi IBM CLEVER log indegree from Ravi Kumar Prabhakar Raghavan Sridhar Rajagopalan Andrew Tomkins 15 781 10 701 log freq c 2008 C Faloutsos 16 5 C Faloutsos 15 781 10 701 CMU SCS Web In and out degree distribution of web sites Barabasi IBM CLEVER log freq from Ravi Kumar Prabhakar Raghavan Sridhar Rajagopalan Andrew Tomkins 15 781 10 701 log indegree c 2008 C Faloutsos 17 CMU SCS The Peer to Peer Topology Jovanovic Frequency versus degree Number of adjacent peers follows a power law 15 781 10 701 c 2008 C Faloutsos 18 CMU SCS More power laws citation counts citeseer nj nec com 6 2001 100 cited pdf log count log count 10 Ullman 1 100 15 781 10 701 1000 log citations c 2008 C Faloutsos 10000 log citations 19 6 C Faloutsos 15 781 10 701 CMU SCS Swedish sex web Albert Laszlo Barabasi Nodes people Females Males Links sexual relationships http www nd edu networks Publication 20Categories 04 20Talks 2005 norway3hours ppt 4781 Swedes 18 74 59 response rate 15 781 10 701 c 2008 C Faloutsos 20 Liljeros et al Nature 2001 CMU SCS Swedish sex web Albert Laszlo Barabasi Nodes people Females Males Links sexual relationships http www nd edu networks Publication 20Categories 04 20Talks 2005 norway3hours ppt 4781 Swedes 18 74 59 response rate 15 781 10 701 c 2008 C Faloutsos 21 Liljeros et al Nature 2001 CMU SCS More power laws web hit counts w A Montgomery Web Site Traffic log count Zipf ebay users sites log in degree 15 781 10 701 c 2008 C Faloutsos 22 7 C Faloutsos 15 781 10 701 CMU SCS epinions com who trusts whom Richardson Domingos KDD 2001 count trusts 2000 people user out degree 15 781 10 701 c 2008 C Faloutsos 23 CMU SCS Outline Problem definition Motivation Static dynamic laws generators Tools CenterPiece graphs Tensors Other projects Virus propagation e bay fraud detection Conclusions 15 781 10 701 c 2008 C Faloutsos 24 CMU SCS Motivation Data mining find patterns rules outliers Problem 1 How do real graphs look like Problem 2 How do they evolve Problem 3 How to generate realistic graphs TOOLS Problem 4 Who is the master mind Problem 5 Track communities over time 15 781 10 701 c 2008 C Faloutsos 25 8 C Faloutsos 15 781 10 701 CMU SCS Problem 2 Time evolution with Jure Leskovec CMU MLD and Jon Kleinberg Cornell sabb CMU 15 781 10 701 c 2008 C Faloutsos 26 CMU SCS Evolution of the Diameter Prior work on Power Law graphs hints at slowly growing diameter diameter O log N diameter O log log N What is happening in real data 15 781 10 701 c 2008 C Faloutsos 27 CMU SCS Evolution of the Diameter Prior work on Power Law graphs hints at slowly growing diameter diameter O log N diameter O log log N What is happening in real data Diameter shrinks over time 15 781 10 701 c 2008 C Faloutsos 28 9 C Faloutsos 15 781 10 701 CMU SCS Diameter ArXiv citation graph Citations among physics papers 1992 2003 One graph per year 2003 diameter 29 555 papers 352 807 citations time years 15 781 10 701 c 2008 C Faloutsos 29 CMU SCS Diameter Autonomous Systems Graph of Internet One graph per day 1997 2000 2000 diameter 6 000 nodes 26 000 edges 15 781 10 701 number of nodes c 2008 C Faloutsos 30 CMU SCS Diameter Affiliation Network Graph of collaborations in physics authors linked to papers 10 years of data 2002 diameter 60 000 nodes 20 000 authors 38 000 papers 133 000 edges 15 781 10 701 time years c 2008 C Faloutsos 31 10 C Faloutsos 15 781 10 701 CMU SCS Diameter Patents diameter Patent citation network 25 years of data 1999 2 9 million nodes 16 5 million edges time years 15 781 10 701 c 2008 C Faloutsos 32 CMU SCS Temporal Evolution of the Graphs N t nodes at time t E t edges at time t Suppose that N t 1 2 N t Q what is your guess for E t 1 2 E t 15 781 10 701 c 2008 C Faloutsos 33 CMU SCS Temporal Evolution of the Graphs N t nodes at time t E t edges at time t Suppose that N t 1 2 N t Q …
View Full Document