In mathematics, graph theory is the study of graphs, which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of vertices, nodes, or points which are connected by edges, arcs, or lines. A graph may be undirected, meaning that there is no distinction between the two vertices associated with each edge, or its edges may be directed from one vertex to another; see Graph (discrete mathematics) for more detailed definitions and for other variations in the types of graph that are commonly considered. Graphs are one of the prime objects of study in discrete mathematics.

In the following study, we try to introduce you some R packages such as “igraph” and “statnet” which can help you to find out more about this theory and it’s beneficial functions.

  1. Package Installation
  2. Matrices in R
  3. Simple Network Generation
  4. Network Attributes
  5. Special Types of Graphs (Standard Structures)
  6. Preparing Network Data
  7. Network Descriptives
  8. Centrality and Distances
  9. Subgraphs and Communities
  10. Statistical Inferences
  11. Graph Visualization
  12. Network plot improvement

1. “igraph” Package Installation

2. Matrices in R

Matrices are rectangular array of numbers, symbols, or expressions, arranged in rows and columns. Each matrix can be a representation of a graph. For creating matrices in R, you can use below fuctions: Create a matrix:

##   X Y Z
## X 0 1 0
## Y 1 0 1
## Z 1 0 0

A two columns matrix as edgelist:

##      [,1] [,2]
## [1,] "X"  "Y" 
## [2,] "Y"  "X" 
## [3,] "Y"  "Z" 
## [4,] "Z"  "X"
## [1] "matrix"

3. Simple Network Generation

Generate simple graph with three nodes with numbers of 1 to 4 as IDs. The edges will be look like as 1–2, 2–3, 3–4,4–1

## IGRAPH U--- 4 4 -- 
## + edges:
## [1] 1--2 2--3 3--4 1--4

Graph with six vertices , and directed by default:

## IGRAPH D--- 6 4 -- 
## + edges:
## [1] 1->2 2->3 3->4 4->1

Graph named by characters:

When the edge list has vertex names, the number of nodes is not needed

## IGRAPH DN-- 3 3 -- 
## + attr: name (v/c)
## + edges (vertex names):
## [1] Mehran->Payman Payman->Iman   Iman  ->Mehran

Isolate vertices in a graph: In named graphs we can specify isolates by providing a list of their names. “vertex.label.dist” is the distance of the label from the center of the vertex. If it is 0 then the label is centered on the vertex. If it is 1 or more than 1 then the label is displayed beside the vertex.

For small graphs, we can also use a description of this kind of operators:

+- or -+ for directed ties pointing left & right,

++ for a symmetric tie, and

for sets of vertices.

4. Network Attributes

All objects in igraph can have attributes including vertices, edges and graph such as name, weight (for edges) and type (e.g. bi-partite graphs). The explanation of an igraph object starts with up to four letters:

  1. D or U, for a directed or undirected graph
  2. N for a named graph (where nodes have a name attribute)
  3. W for a weighted graph (where edges have a weight attribute)
  4. B for a bipartite (two-mode) graph (where nodes have a type attribute)
  5. The two numbers that follow the letters, refer to the number of nodes and edges in the graph.

And it also lists node & edge attributes, such as:

(g/c) - graph-level character attribute (v/c) - vertex-level character attribute (e/n) - edge-level numeric attribute

Building a sample graph using “igraphdata” package:

## IGRAPH DN-- 755 23473 -- US airports
## + attr: name (g/c), name (v/c), City (v/c), Position (v/c),
## | Carrier (e/c), Departures (e/n), Seats (e/n), Passengers (e/n),
## | Aircraft (e/n), Distance (e/n)
## + edges (vertex names):
##  [1] BGR->JFK BGR->JFK BOS->EWR ANC->JFK JFK->ANC LAS->LAX MIA->JFK
##  [8] EWR->ANC BJC->MIA MIA->BJC TEB->ANC JFK->LAX LAX->JFK LAX->SFO
## [15] AEX->LAS BFI->SBA ELM->PIT GEG->SUN ICT->PBI LAS->LAX LAS->PBI
## [22] LAS->SFO LAX->LAS PBI->AEX PBI->ICT PIT->VCT SFO->LAX VCT->DWH
## [29] IAD->JFK ABE->CLT ABE->HPN AGS->CLT AGS->CLT AVL->CLT AVL->CLT
## [36] AVP->CLT AVP->PHL BDL->CLT BHM->CLT BHM->CLT BNA->CLT BNA->CLT
## + ... omitted several edges
## $name
## [1] "US airports"

Vertex attributes:

## [1] "name"     "City"     "Position"
## [1] "Bangor, ME"    "Boston, MA"    "Anchorage, AK" "New York, NY" 
## [5] "Las Vegas, NV" "Miami, FL"

Edges attributes: The weight attribute is treated in a special way for edges and can affect the result of some calculations, such as path lengths and centrality measures.

## [1] "Carrier"    "Departures" "Seats"      "Passengers" "Aircraft"  
## [6] "Distance"
## [1] "British Airways Plc"       "British Airways Plc"      
## [3] "British Airways Plc"       "China Airlines Ltd."      
## [5] "China Airlines Ltd."       "Korean Air Lines Co. Ltd."

Sequences: “V()” for vertices and “E()” for edges are the sequence functions, allow selections and access to attributes. They return special “igraph.vs” and “igraph.es” objects that can be reused.

## + 5/755 vertices, named:
## [1] BGR BOS ANC JFK LAS
## + 1/755 vertex, named:
## [1] JFK

Access all attributes of a vertex with double square brackets:

## + 1/755 vertex, named:
##   name         City         Position
## 4  JFK New York, NY N403823 W0734644

Adding Attributes: The dollar sign allows us to manipulate attributes much like a list.

## [1] "Bangor, ME"    "Boston, MA"    "Anchorage, AK" "New York, NY" 
## [5] "Las Vegas, NV"
## + 5/755 vertices, named:
##   name          City         Position Group
## 1  BGR    Bangor, ME N444827 W0684941     B
## 2  BOS    Boston, MA N422152 W0710019     A
## 3  ANC Anchorage, AK N611028 W1495947     A
## 4  JFK  New York, NY N403823 W0734644     B
## 5  LAS Las Vegas, NV N360449 W1150908     B

Edge Selectors: We can also access edges between named vertices using the special operators:

E(network)[X %–% Y] selects edges between vertex sets X and Y, ignoring direction E(network)[X %->% Y] selects edges from vertex sets X to vertex set Y E(network)[X %<-% Y] selects edges from vertex sets Y to vertex set X

## + 26/23473 edges (vertex names):
##  [1] BOS->JFK BOS->JFK JFK->BOS JFK->BOS BOS->JFK JFK->BOS BOS->JFK
##  [8] JFK->BOS BOS->JFK BOS->JFK BOS->JFK BOS->JFK BOS->JFK JFK->BOS
## [15] JFK->BOS JFK->BOS JFK->BOS BOS->JFK JFK->BOS BOS->JFK BOS->JFK
## [22] JFK->BOS JFK->BOS BOS->JFK JFK->BOS BOS->JFK
## + 12/23473 edges (vertex names):
##  [1] JFK->BOS JFK->BOS JFK->BOS JFK->BOS JFK->BOS JFK->BOS JFK->BOS
##  [8] JFK->BOS JFK->BOS JFK->BOS JFK->BOS JFK->BOS
## + 14/23473 edges (vertex names):
##  [1] BOS->JFK BOS->JFK BOS->JFK BOS->JFK BOS->JFK BOS->JFK BOS->JFK
##  [8] BOS->JFK BOS->JFK BOS->JFK BOS->JFK BOS->JFK BOS->JFK BOS->JFK

All Carriers from JFK to BOS.

## [1] "JetBlue Airways"              "Compass Airlines"            
## [3] "Pinnacle Airlines Inc."       "Comair Inc."                 
## [5] "Atlantic Southeast Airlines"  "American Eagle Airlines Inc."
## [7] "Chautauqua Airlines Inc."

Edges Between Groups: The edge selectors can be used between groups of vertices:

## + 35/23473 edges (vertex names):
##  [1] LAX->JFK LAX->JFK LAX->JFK LAX->JFK SAN->JFK SFO->JFK SFO->JFK
##  [8] SFO->JFK BUR->JFK LAX->JFK LGB->JFK OAK->JFK SAN->JFK SFO->JFK
## [15] SJC->JFK SMF->JFK LAX->JFK LAX->JFK LAX->JFK SAN->JFK SAN->JFK
## [22] SFO->JFK SFO->JFK SNA->JFK LAX->ALB LAX->JFK LAX->JFK SFO->JFK
## [29] SFO->JFK SFO->JFK BUR->FRG LAX->JFK LAX->JFK SFO->JFK SFO->JFK

Induced subgraph: To get a new graph containing the selected vertices we must also copy over all of the edges between those vertices. This is done by the “induced_subgraph” function.

## IGRAPH DN-- 34 381 -- US airports
## + attr: name (g/c), name (v/c), City (v/c), Position (v/c), Group
## | (v/c), Carrier (e/c), Departures (e/n), Seats (e/n), Passengers
## | (e/n), Aircraft (e/n), Distance (e/n)
## + edges (vertex names):
##  [1] LAX->SFO LAX->SFO LAX->SFO LAX->SFO LAX->SFO LAX->SFO LAX->SFO
##  [8] LAX->SFO LAX->SFO LAX->SFO LAX->SFO LAX->SFO LAX->SFO LAX->SFO
## [15] LAX->SFO LAX->SFO LAX->SAN LAX->SAN LAX->SAN LAX->SAN LAX->SAN
## [22] LAX->SAN LAX->SAN LAX->SAN LAX->SAN LAX->SAN LAX->SAN LAX->SAN
## [29] LAX->SAN LAX->SAN LAX->SMF LAX->SMF LAX->SMF LAX->SMF LAX->SMF
## [36] LAX->SMF LAX->SMF LAX->SNA LAX->BUR LAX->OAK LAX->OAK LAX->OAK
## + ... omitted several edges

Neighbourhoods: A common task is to subset all of the neighbours of a particular vertex. To return all neighbours within a distance, d, of a number of targets.

## [[1]]
## + 487/755 vertices, named:
##   [1] JFK BGR BOS ANC LAS MIA LAX PBI PIT SFO IAD ABE AVP BDL BNA BUF BWI
##  [18] CLE CLT CMH CVG DCA DTW ILM IND MSP MSY ORF PHL PWM RDU RIC SAV SRQ
##  [35] STL SYR ALB BTV PVD ROC SWF FLL LWB MCO TPA CKB IAH ORD OKC ATL AUS
##  [52] DEN DFW HOU PDX PHX RSW SAN SAT SEA SLC SMF SNA ACY JAX SJU STT BUR
##  [69] OAK SJC EGE BQN LGB PSE FAR FWA FOE EWR LGA MHT PIE SFB CAK GSO MDT
##  [86] MKE MYR PHF XNA SCE BHB PBG PQI MCI MDW MEM FRG PTK PGD IAG ACK LEB
## [103] MVY PVC BMG AUG HYA RKD RUT SLK TEB BFI HNL OGG HOM AIN BTI FAI GAL
## [120] KSM NUI OTZ PIZ RBY SCC ADQ MCG TAL BET AKN ANI BRW CDV DLG EMK ENA
## [137] UNK VDZ OME RDB SHH SMK SVA WBB ADK JNU KTN YAK CDB DRF DUT KNW KVC
## [154] ILI NLG KCL KNK SDP SNP STG NNL PDB AEX GEG ICT BHM HPN LIT SDF MAF
## + ... omitted several vertices

Which returns a list containing the vertices within 2 of JFK . If we want the neighbourhood of a vertex as a new graph we can do

## [[1]]
## IGRAPH DN-- 487 20484 -- US airports
## + attr: name (g/c), name (v/c), City (v/c), Position (v/c), Group
## | (v/c), Carrier (e/c), Departures (e/n), Seats (e/n), Passengers
## | (e/n), Aircraft (e/n), Distance (e/n)
## + edges (vertex names):
##  [1] BGR->JFK BGR->JFK BOS->EWR ANC->JFK JFK->ANC LAS->LAX MIA->JFK
##  [8] EWR->ANC BJC->MIA MIA->BJC TEB->ANC JFK->LAX LAX->JFK LAX->SFO
## [15] AEX->LAS BFI->SBA ELM->PIT GEG->SUN ICT->PBI LAS->LAX LAS->PBI
## [22] LAS->SFO LAX->LAS PBI->AEX PBI->ICT PIT->VCT SFO->LAX IAD->JFK
## [29] ABE->CLT ABE->HPN AGS->CLT AGS->CLT AVL->CLT AVL->CLT AVP->CLT
## [36] AVP->PHL BDL->CLT BHM->CLT BHM->CLT BNA->CLT BNA->CLT BNA->DCA
## + ... omitted several edges

Adding: The edges and vertices functions can be used to add edges and nodes.

Deleting: For deletion it is easy to use edge and vertex selectors. Note this will remove all attached edges. All edge and vertex subsetting can be used here.

Now, we want to check some network attributes of a graph that is made by ourselves. In order to view vertices and edges:

## + 5/5 edges (vertex names):
## [1] Mehran--Noah   Noah  --Siamak Noah  --Siamak Siamak--Siamak
## [5] Mehran--Mehran
## + 7/7 vertices, named:
## [1] Mehran Noah   Siamak Mina   Milad  Parsa  Sara

Directly examining the network:

## 7 x 7 sparse Matrix of class "dgCMatrix"
##        Mehran Noah Siamak Mina Milad Parsa Sara
## Mehran      1    1      .    .     .     .    .
## Noah        1    .      2    .     .     .    .
## Siamak      .    2      1    .     .     .    .
## Mina        .    .      .    .     .     .    .
## Milad       .    .      .    .     .     .    .
## Parsa       .    .      .    .     .     .    .
## Sara        .    .      .    .     .     .    .
## Mehran   Noah Siamak   Mina  Milad  Parsa   Sara 
##      1      1      0      0      0      0      0

For adding attributes to the network, vertices, or edges:

## [1] "Mehran" "Noah"   "Siamak" "Mina"   "Milad"  "Parsa"  "Sara"
## $name
## [1] "Mehran" "Noah"   "Siamak" "Mina"   "Milad"  "Parsa"  "Sara"  
## 
## $gender
## [1] "male"   "male"   "male"   "female" "male"   "male"   "female"
## $type
## [1] "facebook" "facebook" "facebook" "facebook" "facebook"
## 
## $weight
## [1] 20 20 20 20 20

Set attributes to graph object:

## IGRAPH UNW- 7 5 -- facebook Network
## + attr: name (g/c), ID (g/c), name (v/c), gender (v/c), type
## | (e/c), weight (e/n)
## + edges (vertex names):
## [1] Mehran--Noah   Noah  --Siamak Noah  --Siamak Siamak--Siamak
## [5] Mehran--Mehran

How to specify graphical parameters?

There are three ways to give values to the parameters described below, in section ‘Parameters’. We give these three ways here in the order of their precedence.

The first method is to supply named arguments to the plotting commands: plot.igraph, tkplot or rglplot. Parameters for vertices start with prefix ‘vertex.’, parameters for edges have prefix ‘edge.’, and global parameters have no prefix. Eg. the color of the vertices can be given via argument vertex.color, whereas edge.color sets the color of the edges. layout gives the layout of the graphs.

The second way is to assign vertex, edge and graph attributes to the graph. These attributes have no prefix, ie. the color of the vertices is taken from the color vertex attribute and the color of the edges from the color edge attribute. The layout of the graph is given by the layout graph attribute. (Always assuming that the corresponding command argument is not present.) Setting vertex and edge attributes are handy if you want to assign a given ‘look’ to a graph, attributes are saved with the graph is you save it with save or in GraphML format with write_graph, so the graph will have the same look after loading it again.

If a parameter is not given in the command line, and the corresponding vertex/edge/graph attribute is also missing then the general igraph parameters handled by igraph_options are also checked. Vertex parameters have prefix ‘vertex.’, edge parameters are prefixed with ‘edge.’, general parameters like layout are prefixed with ‘plot’. These parameters are useful if you want all or most of your graphs to have the same look, vertex size, vertex color, etc. Then you don’t need to set these at every plotting, and you also don’t need to assign vertex/edge attributes to every graph.

List names of the graph attributes:

## [1] "name" "ID"
## [1] "facebook Network"
## $name
## [1] "facebook Network"
## 
## $ID
## [1] "connections"
## $name
## [1] "facebook Network"

The graph gr4 has two edges between Siamak to Noah, and loops from Siamak and Mehran to themselves. For simplifying the graph and remove loops & multiple edges between the same nodes, you can use edge.attr.comb to show how edge attributes are to be combined - possible options include ignore, sum, prod, min, max, random, first, last, mean, Coursesn, concat.

The following combination behaviors are predefined:

list(“ignore”) The attribute is ignored and dropped.

list(“sum”) The sum of the attributes is calculated. This does not work for character attributes and works for complex attributes only if they have a sum generic defined. (E.g. it works for sparse matrices from the Matrix package, because they have a sum method.)

list(“prod”) The product of the attributes is calculated. This does not work for character attributes and works for complex attributes only if they have a prod function defined.

list(“min”) The minimum of the attributes is calculated and returned. For character and complex attributes the standard R min function is used.

list(“max”) The maximum of the attributes is calculated and returned. For character and complex attributes the standard R max function is used.

list(“random”) Chooses one of the supplied attribute values, uniformly randomly. For character and complex attributes this is implemented by calling sample.

list(“first”) Always chooses the first attribute value. It is implemented by calling the head function.

list(“last”) Always chooses the last attribute value. It is implemented by calling the tail function.

list(“mean”) The mean of the attributes is calculated and returned. For character and complex attributes this simply calls the mean function.

list(“Coursesn”) The Coursesn of the attributes is selected. Calls the R Coursesn function for all attribute types.

list(“concat”) Concatenate the attributes, using the c function. This results almost always a complex attribute.

For instance, you can specify that the weight of the new edge should be sum of the weights of the corresponding edges in the old graph; and that the rest of the attributes should be ignored (=dropped).

## IGRAPH UNW- 7 2 -- facebook Network
## + attr: name (g/c), name (v/c), gender (v/c), weight (e/n)
## + edges (vertex names):
## [1] Mehran--Noah   Noah  --Siamak

5. Special Types of Graphs (Standard Structures)

In this section, we are going to learn how to viasulize some well known graph structures:

“make_graph” knows the following graphs:

Bull The bull graph, 5 vertices, 5 edges, resembles to the head of a bull if drawn properly.

Chvatal This is the smallest triangle-free graph that is both 4-chromatic and 4-regular. According to the Grunbaum conjecture there exists an m-regular, m-chromatic graph with n vertices for every m>1 and n>2. The Chvatal graph is an example for m=4 and n=12. It has 24 edges.

Coxeter A non-Hamiltonian cubic symmetric graph with 28 vertices and 42 edges.

Cubical The Platonic graph of the cube. A convex regular polyhedron with 8 vertices and 12 edges.

Diamond A graph with 4 vertices and 5 edges, resembles to a schematic diamond if drawn properly.

Dodecahedral, Dodecahedron Another Platonic solid with 20 vertices and 30 edges.

Folkman The semisymmetric graph with minimum number of vertices, 20 and 40 edges. A semisymmetric graph is regular, edge transitive and not vertex transitive.

Franklin This is a graph whose embedding to the Klein bottle can be colored with six colors, it is a counterexample to the neccessity of the Heawood conjecture on a Klein bottle. It has 12 vertices and 18 edges.

Frucht The Frucht Graph is the smallest cubical graph whose automorphism group consists only of the identity element. It has 12 vertices and 18 edges.

Grotzsch The Groetzsch graph is a triangle-free graph with 11 vertices, 20 edges, and chromatic number 4. It is named after German mathematician Herbert Groetzsch, and its existence demonstrates that the assumption of planarity is necessary in Groetzsch’s theorem that every triangle-free planar graph is 3-colorable.

Heawood The Heawood graph is an undirected graph with 14 vertices and 21 edges. The graph is cubic, and all cycles in the graph have six or more edges. Every smaller cubic graph has shorter cycles, so this graph is the 6-cage, the smallest cubic graph of girth 6.

Herschel The Herschel graph is the smallest nonhamiltonian polyhedral graph. It is the unique such graph on 11 nodes, and has 18 edges.

House The house graph is a 5-vertex, 6-edge graph, the schematic draw of a house if drawn properly, basicly a triangle of the top of a square.

HouseX The same as the house graph with an X in the square. 5 vertices and 8 edges.

Icosahedral, Icosahedron A Platonic solid with 12 vertices and 30 edges.

Krackhardt kite A social network with 10 vertices and 18 edges. Krackhardt, D. Assessing the Political Landscape: Structure, Cognition, and Power in Organizations. Admin. Sci. Quart. 35, 342-369, 1990.

Levi The graph is a 4-arc transitive cubic graph, it has 30 vertices and 45 edges.

McGee The McGee graph is the unique 3-regular 7-cage graph, it has 24 vertices and 36 edges.

Meredith The Meredith graph is a quartic graph on 70 nodes and 140 edges that is a counterexample to the conjecture that every 4-regular 4-connected graph is Hamiltonian.

Noperfectmatching A connected graph with 16 vertices and 27 edges containing no perfect matching. A matching in a graph is a set of pairwise non-adjacent edges; that is, no two edges share a common vertex. A perfect matching is a matching which covers all vertices of the graph.

Nonline A graph whose connected components are the 9 graphs whose presence as a vertex-induced subgraph in a graph makes a nonline graph. It has 50 vertices and 72 edges.

Octahedral, Octahedron Platonic solid with 6 vertices and 12 edges.

Petersen A 3-regular graph with 10 vertices and 15 edges. It is the smallest hypohamiltonian graph, ie. it is non-hamiltonian but removing any single vertex from it makes it Hamiltonian.

Robertson The unique (4,5)-cage graph, ie. a 4-regular graph of girth 5. It has 19 vertices and 38 edges.

Smallestcyclicgroup A smallest nontrivial graph whose automorphism group is cyclic. It has 9 vertices and 15 edges.

Tetrahedral, Tetrahedron Platonic solid with 4 vertices and 6 edges.

Thomassen The smallest hypotraceable graph, on 34 vertices and 52 edges. A hypotracable graph does not contain a Hamiltonian path but after removing any single vertex from it the remainder always contains a Hamiltonian path. A graph containing a Hamiltonian path is called tracable.

Tutte Tait’s Hamiltonian graph conjecture states that every 3-connected 3-regular planar graph is Hamiltonian. This graph is a counterexample. It has 46 vertices and 69 edges.

Uniquely3colorable Returns a 12-vertex, triangle-free graph with chromatic number 3 that is uniquely 3-colorable.

Walther An identity graph with 25 vertices and 31 edges. An identity graph has a single graph automorphism, the trivial one.

Zachary Social network of friendships between 34 members of a karate club at a US university in the 1970s. See W. W. Zachary, An information flow model for conflict and fission in small groups, Journal of Anthropological Research 33, 452-473 (1977).

Simple graph

The “make_graph” function comparing with “graph_literal” is a new function, which can be do the “graph_literal” process too.

“~” is a sign of path in make_graph function.

Multigraph or pseudograph

##  [1] FALSE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE  TRUE
## [12]  TRUE FALSE  TRUE

If you wanted to use “graph” function to create a weighted graph, you can give the edgelist as a dataframe which has three columns, first two columns illustrates each edge and the third column specifies their weights.

Directed Graph (digraph)

Weighted graph

Empty graph

Full graph

Simple star graph

Tree graph

Paths tree

Stars tree

Ring graph

Bipartite Graph

##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
## [12]  TRUE  TRUE  TRUE  TRUE

Erdos-Renyi random model

A random graph is the simplest of the graph models.

Here, every edge exists with the probability value, and we only fix the average number of edges (the density).

## IGRAPH U--- 10 10 -- Erdos renyi (gnp) graph
## + attr: name (g/c), type (g/c), loops (g/l), p (g/n)
## + edges:
##  [1] 1-- 2 1-- 4 1-- 5 3-- 5 1-- 6 6-- 7 5-- 9 7-- 9 6--10 8--10

Watts-Strogatz small-world model

Creates a lattice (with dim=dimensions and size=nodes across dimension) and rewires edges randomly with probability p. The neighborhood in which edges are connected= nei. You can allow loops and multiple edges.