Machine Learning with Graph 笔记 Lecture 2 网络特性与随机图模型

Propertities of Networks and Random Graphs

1. Key network propertities

1. degree distribution p(k)p(k)
probability that a randomly chosen node has degree k:
p(k)=Nk/Np(k)=N_k/N
in which,Nk=number of nodes with degree kin\ which, N_k=number\ of\ nodes\ with\ degree\ k.
2. path length hh

  • distance: the shortest path between two nodes
  • diameter: the maximum distance between any pair of nodes in a graph
  • average path length
    h=12Emaxi,jihij\overline h=\frac 1{2E_{max}}\cdot \sum_{i,j\neq i}h_{ij}
    Emax=n(n1)2E_{max}=\frac{n(n-1)}{2}
    3. clustering coefficient (for undirected graph) CC
    to what extent do my friends know each other
    Ci=2eiki(ki1)C_i=\frac {2e_i}{k_i(k_i-1)}
    in which,ei is the number of edges of nodeiin\ which, e_i\ is\ the\ number\ of\ edges\ of\ node_i; ki is the degree of nodeik_i\ is\ the\ degree\ of\ node_i.
    4. connectivity ss
    size of the largest connected component
    BFS (breath first search)

2. Erdos_Renyi random graphs

GnpG_{np}: undirected graph with nn nodes where each edge (u,v)(u,v) appears i.i.d. (independently) with probability pp
GnmG_{nm}: undirected graph with nn nodes and mm edges picked uniformly at random

GnpG_{np}:

1. degree distribution

  • it is binomial:
    P(k)=(n1k)pk(1p)n1kP(k)={n-1\choose k}p^k(1-p)^{n-1-k}
  • average degree:
    k=p(n1)pn\overline k=p(n-1)\approx{pn}
  • variance
    σ2=p(1p)(n1)\sigma^2=p(1-p)(n-1)
  • an important property
    σk=1pp1n11(n1)1/2\frac{\sigma}{\overline k}=\sqrt{ \frac{1-p}{p}\frac{1}{n-1}}\approx{\frac{1}{(n-1)^{1/2}}}
    as n increases,the distribution becomes narrowand the degree of a node is in the vincinity of kas\ n\ increases, the\ distribution\ becomes\ narrow\, and\ the\ degree\ of\ a\ node\ is\ in\ the\ vincinity\ of\ k

2. clustering coefficient

  • def:
    given Ci=2ei/ki(ki1)given\ C_i=2e_i/k_i(k_i-1)
    then:
    the expected E(ei) is pki(ki1)/2the\ expected\ E(e_i)\ is\ p\cdot k_i(k_i-1)/2
    then:
    E(Ci)=p=kn1knE(C_i)=p=\frac{\overline k}{n-1}\approx \frac {\overline k}{n}

3. path length

  • fact: the path length of E-R graph is O(logn)O(\log n)
  • def: expansion: α\alpha
    Machine Learning with Graph 笔记 Lecture 2 网络特性与随机图模型
    basically it means given a subset of nodes SS, the proportion of edges leaving S.
    Machine Learning with Graph 笔记 Lecture 2 网络特性与随机图模型

4. connectivity

  • k>1,largest conn. comp. exists\overline k>1,largest\ conn.\ comp.\ exists
    Machine Learning with Graph 笔记 Lecture 2 网络特性与随机图模型
    Although real networks are not random graphs, the model GnpG_{np} is still extremely useful. It is the reference model for the rest of the class.

3. small-world network

  1. key characteristics: high clustering and low path length
    Machine Learning with Graph 笔记 Lecture 2 网络特性与随机图模型
  2. generating approach:
    Machine Learning with Graph 笔记 Lecture 2 网络特性与随机图模型
  3. property:
    Machine Learning with Graph 笔记 Lecture 2 网络特性与随机图模型Machine Learning with Graph 笔记 Lecture 2 网络特性与随机图模型

4. Kronecker graph modelMachine Learning with Graph 笔记 Lecture 2 网络特性与随机图模型