John L. Taylor
Network science is an interdisciplinary field which studies complex networks.
A complex network is a graph with non-trivial topological features- features that do not occur in simple networks such as lattices or random graphs, but often occur in real graphs.
The burgeoning field of computer science has shifted our view of the physical world from that of a collection of interacting material particles to one of a seething network of information. In this way of looking at nature, the laws of physics are a form of software, or algorithm, while the material world-the hardware-plays the role of a gigantic computer.
A graph is composed of vertexes and edges: G = (V, E). The vertexes represent things and the edges pair-wise relationships.
| Network | Type | Vertex | Edge |
|---|---|---|---|
| The Internet | Technological | Computer/Router | Cable/Wireless |
| Power Grid | Technological | Generating Station/Substation | Transmission Line |
| Affiliation Network | Social | Person or Group | Membership |
| Friendship Network | Social | Person | Friendship |
| WWW | Information | Web Page | Link |
| Citation Networks | Information | Article | Citation |
| Metabolic Networks | Biological | Metabolite | Metabolic Reaction |
| Neural Networks | Biological | Neuron | Synapse |
| Food Networks | Biological | Species | Predation |
Complex networks may be considered in the following ways:
Networks model significant features of complex distributed systems, leading to new ways to predict, explain and influence natural and artificial systems.
Network Science + ??? = FUN & PROFIT!
The simple graph is composed of vertices and edges that can be represented as an edge list:
E(g)
Edge sequence:
[1] B -- A
[2] C -- B
Or an adjacency matrix which represents which vertices are adjacent to which other vertices.
get.adjacency(g)
3 x 3 sparse Matrix of class "dgCMatrix"
A B C
A . 1 .
B 1 . 1
C . 1 .
The edge sequence is directed in the following way:
E(g)
Edge sequence:
[1] A -> B
[2] B -> C
Note the asymmetry of the adjacency matrix:
get.adjacency(g)
3 x 3 sparse Matrix of class "dgCMatrix"
A B C
A . 1 .
B . . 1
C . . .
In addition to the edge sequence structure, the weighted graph has a vector of weights applied to its edges:
E(g)$weight
[1] 6 3 1
Edge sequence:
E(bg)
Edge sequence:
[1] Group1 -- A
[2] Group2 -- A
[3] Group1 -- B
[4] Group3 -- B
[5] Group1 -- C
[6] Group3 -- C
[7] Group2 -- D
[8] Group4 -- D
[9] Group3 -- E
[10] Group4 -- E
Adjacency matrix (V x V):
get.adjacency(bg)
9 x 9 sparse Matrix of class "dgCMatrix"
A B C D E Group1 Group2 Group3 Group4
A . . . . . 1 1 . .
B . . . . . 1 . 1 .
C . . . . . 1 . 1 .
D . . . . . . 1 . 1
E . . . . . . . 1 1
Group1 1 1 1 . . . . . .
Group2 1 . . 1 . . . . .
Group3 . 1 1 . 1 . . . .
Group4 . . . 1 1 . . . .
An incidence matrix represents the relationship between two classes of objects: V x G
get.incidence(bg)
Group1 Group2 Group3 Group4
A 1 1 0 0
B 1 0 1 0
C 1 0 1 0
D 0 1 0 1
E 0 0 1 1
There are three primary levels at which a graph may be analyzed:
This is a data set of Padgett (1994), consisting of weddings among leading Florentine families in Renaissance Florence (1282-1500).
data(flo)
g=graph.adjacency(flo,mode="undirected"
,weighted=NULL)
m <- as.matrix(flo)
glayout <- layout.fruchterman.reingold(g)
summary(g)
IGRAPH UN-- 16 20 --
attr: name (v/c)
Graph density measures how many edges are in set E compared to the maximum possible number of edges between vertices in set V. Low density means lo connectivity, while higher density means higher connectivity. Dense graphs may be more resistant to link failures, while communicating information, disease, etc. more readily.
graph.density(g)
[1] 0.1667
The hub degree is the degree of the highest-degree node/s.
detach("package:sna", unload=TRUE)
max(degree(g))
[1] 6
library("sna")
Average path length is defined as the average number of steps along the shortest paths for all possible pairs of network.
average.path.length(g)
[1] 2.486
The diameter of a graph is the length of the shortest path between the most distanced nodes.
diameter(g)
[1] 5
The clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together. This is a closely related calculation called the transitivity ratio, which is the number of closed triplets in a graph over the number of connected triples.
clustering_coef<-transitivity(g, type="global")
clustering_coef
[1] 0.1915
The degree distribution is the probability distribution of these degrees over the whole network, meaning it represents the likelihood of a randomly selected node to have a given number of degrees.
| Network | Random | Scale-Free | Small-World | n-Regular |
|---|---|---|---|---|
| Density | .040 | .039 | .040 | .040 |
| Hub Degree | 10 | 40 | 6 | 4 |
| Avg. Path Length | 3.318 | 2.749 | 4.755 | 12.878 |
| Diameter | 6 | 5 | 10 | 25 |
| Clust. Coef. | .036 | .063 | .330 | .5 |
Cliques in an undirected graph are subsets of its vertices such that every two vertices in the subset are connected by an edge. In other words, cliques are complete subgraphs of a network.
These sorts of structures are particularly useful in identifying communities.
A few of the most significant centrality measures include:
A network is said to have community structure if the nodes of the network can be easily grouped into (potentially overlapping) sets of nodes such that each set of nodes is densely connected internally.
Some networks may not have any meaningful community structure, such as the random graphs and Barabasi-Albert scale-free graphs.
Social network between members of a university karate club, led by president John A. and karate instructor Mr. Hi (pseudonyms).
Wayne W. Zachary studied conflict and fission in this network, as the karate club was split into two separate clubs, after long disputes between two factions of the club, one led by John A., the other by Mr. Hi.
Random walks made to determine how likely one will stay within group.
Network diffusion is the process whereby states are spread over networks. These states may be informational, viral, failures or otherwise.
There are many models of diffusion, including the SI and SIR models.
The Susceptible-Infected Model works simply. Given a network g: g(V, E):