Hot questions for using ggplot2 in ggnetwork

Question:

Problem:

I want to use a different size scale for edges and nodes in a network created with the ggnetwork package. If you look at the image below you may think I already did that since they all have different sizes but in fact edge and node size are not scaled independently in this plot. They are both scaled by the scale_size_area() function towards the same maximum (30). If I get larger nodes my edges will shrink. I guess my problem boils down to scaling the size of different geoms with different functions. So for example how can I scale the size of the nodes with scale_size_area(max_size = 30) and the size of the edges with scale_size_continuous(range = c(1,6))?

Example code:
#load packages
library(network)
library(sna)
library(ggplot2)
library(ggnetwork)

#create data
#data for edges
edge_df<-data.frame(group1=c("A","A","B"),
           group2=c("B","C","C"),
           connection_strength=c(1,2,3))

#data for nodes/vertexes
vertex_df<-data.frame(group=c("A","B","C"),
                      groupsize=c(2,3,4))

#create network
my_network<-network(edge_df[,1:2],directed = FALSE)

#add edge attribute (interaction strength) to network object
set.edge.attribute(my_network, "connection_strength", edge_df$connection_strength)

#add node/vertex info to network object with the special %v% operator
my_network %v% "groupsize" = vertex_df$groupsize 

#plot
ggplot(my_network, aes(x = x, y = y, xend = xend, yend = yend,color=vertex.names)) +
  #edge size depends on connection strength
  geom_edges(color = "black",aes(size=connection_strength/20)) +
  #node size depends on groupsize
  geom_nodes(aes(size=groupsize)) +
  #scale size area is good for nodesize but I want a different scaling for the edges
  scale_size_area(max_size = 30,guide="none")+
  scale_color_discrete(guide="none")+#remove colour legend
  scale_x_continuous(limits=c(-0.15,1.15))+#add some space to x-axis
  scale_y_continuous(limits=c(-0.15,1.15))+#add some space to y-axis
  theme_bw()#simple plot layout

Answer:

Almost a duplicate of this question.

I'm not a ggplot2 expert, but as far as I understand, dual scaling (e.g. having two y-axes or two color scales) contradicts the grammar of graphics.

The solution to the aforementioned question might work, but it's a hack.
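
A different workaround, sketched here under assumptions rather than taken from the linked question, is to do both rescalings by hand before plotting so that ggplot2 only ever sees one size scale. The helpers `rescale_linear()` and `rescale_area()` below are hypothetical names, not functions from ggplot2 or ggnetwork. They only approximate what scale_size_continuous(range = c(1, 6)) and scale_size_area(max_size = 30) do.

```r
# Hypothetical helpers (not part of ggplot2 or ggnetwork).

# Linear rescaling into a target range, similar in spirit to
# scale_size_continuous(range = c(1, 6)).
rescale_linear <- function(v, to = c(1, 6)) {
  rng <- range(v, na.rm = TRUE)
  # Constant input: fall back to the middle of the target range.
  if (rng[1] == rng[2]) return(rep(mean(to), length(v)))
  to[1] + (v - rng[1]) / (rng[2] - rng[1]) * (to[2] - to[1])
}

# Area-style rescaling: size grows with sqrt(value), so 0 maps to 0 and the
# largest value maps to max_size, similar in spirit to
# scale_size_area(max_size = 30).
rescale_area <- function(v, max_size = 30) {
  max_size * sqrt(v / max(v, na.rm = TRUE))
}

edge_size <- rescale_linear(c(1, 2, 3), to = c(1, 6))  # connection_strength
node_size <- rescale_area(c(2, 3, 4), max_size = 30)   # groupsize

print(edge_size)
print(round(node_size, 2))
```

The precomputed values could then be stored as columns of the data frame returned by ggnetwork(), mapped with aes(size = ...) in geom_edges and geom_nodes, and drawn unchanged with scale_size_identity(). That last step is untested against the plot above.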

Question:

I have a graph of vertices and edges which I'd like to plot using a fruchtermanreingold layout.

Here's the graph edges matrix:

edge.mat <- matrix(as.numeric(strsplit("3651,0,0,1,0,0,0,0,2,0,11,2,0,0,0,300,0,1,0,0,66,0,78,9,0,0,0,0,0,0,11690,0,1,0,0,0,0,0,0,0,0,493,1,1,0,4288,5,0,0,36,0,9,7,3,0,6,1,0,1,7,490,0,0,0,6,0,0,628,6,12,0,0,0,0,0,641,0,0,4,0,0,0,0,0,0,66,0,0,0,0,3165,0,281,0,0,0,0,0,0,0,0,45,1,0,0,35248,0,1698,2,0,1,0,2,99,0,0,6,29,286,0,31987,0,1,10,0,8,0,16,0,21,1,0,0,1718,0,51234,0,0,17,3,12,0,0,7,0,0,0,1,0,2,16736,0,0,0,3,0,0,4,630,0,0,0,9,0,0,29495,53,6,0,0,0,0,5,0,0,0,0,3,0,19,186,0,0,0,482,8,12,0,1,0,7,1,0,6,0,26338",
                              split = ",")[[1]]),
                   nrow = 14,
                   dimnames = list(LETTERS[1:14], LETTERS[1:14]))

I then create an igraph object from that using:

gr <- igraph::graph_from_adjacency_matrix(edge.mat, mode="undirected", weighted=T, diag=F)

And then use ggnetwork to convert gr to a data.frame, with specified vertex colors:

set.seed(1)
gr.df <- ggnetwork::ggnetwork(gr,
                              layout="fruchtermanreingold", 
                              weights="weight", 
                              niter=50000, 
                              arrow.gap=0)

And then I plot it using ggplot2 and ggnetwork:

vertex.colors <- strsplit("#00BE6B,#DC2D00,#F57962,#EE8044,#A6A400,#62B200,#FF6C91,#F77769,#EA8332,#DA8E00,#C59900,#00ACFC,#C49A00,#DC8D00",
                          split=",")[[1]]

library(ggplot2)
library(ggnetwork)

ggplot(gr.df, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(color = "gray", aes(size = weight)) +
  geom_nodes(color = "black")+
  geom_nodelabel(aes(label = vertex.names),
                 color = vertex.colors, fontface = "bold")+
  theme_minimal() + 
  theme(axis.text=element_blank(), 
        axis.title=element_blank(),
        legend.position="none")

In my case each vertex actually represents many points, where each vertex has a different number of points. Adding that information to gr.df:

gr.df$n <- NA
gr.df$n[which(is.na(gr.df$weight))] <- as.integer(runif(length(which(is.na(gr.df$weight))), 100, 500))

What I'd like to do is add to the plot gr.df$n radially jittered points around each vertex (i.e., with its corresponding n), with the same vertex.colors coding. Any idea how to do that?


Answer:

I think sampling points and then plotting them with geom_point is a reasonable strategy (otherwise you could create your own geom).

Here is some rough code, starting from the relevant bit of your question:

gr.df$n <- 1
gr.df$n[which(is.na(gr.df$weight))] <- as.integer(runif(length(which(is.na(gr.df$weight))), 100, 500))

# function to sample
# https://stackoverflow.com/questions/5837572/generate-a-random-point-within-a-circle-uniformly
circSamp <- function(x, y, R=0.1){
    n <- length(x)
    A <- a <- runif(n,0,1)
    b <- runif(n,0,1)

    ind <- b < a
    a[ind] <- b[ind]
    b[ind] <- A[ind]

    xn = x+b*R*cos(2*pi*a/b)
    yn = y+b*R*sin(2*pi*a/b)
    cbind(x=xn, y=yn)
 }


# sample  
d <- with(gr.df, data.frame(vertex.names=rep(vertex.names, n),
                            circSamp(rep(x,n), rep(y,n))))

# p is your plot   
p + geom_point(data=d, aes(x, y, color = vertex.names),
               alpha=0.1, inherit.aes = FALSE) +
    scale_color_manual(values = vertex.colors)

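
As a side note, the swap trick in circSamp is one of several ways to sample uniformly inside a circle. A minimal sketch of another common approach draws the angle uniformly and takes the radius as R times the square root of a uniform number. The name `circSampSqrt` is made up for this illustration, and it keeps the same interface as circSamp.

```r
# Hypothetical alternative to circSamp(): uniform sampling in a disc of
# radius R around each (x, y).
circSampSqrt <- function(x, y, R = 0.1) {
  n <- length(x)
  r <- R * sqrt(runif(n))        # sqrt keeps the density uniform over the area
  theta <- runif(n, 0, 2 * pi)   # uniform angle
  cbind(x = x + r * cos(theta), y = y + r * sin(theta))
}

set.seed(1)
pts <- circSampSqrt(rep(0.5, 1000), rep(0.5, 1000), R = 0.1)

# Largest distance of any sampled point from the centre (0.5, 0.5);
# it should not exceed R.
max_dist <- max(sqrt((pts[, "x"] - 0.5)^2 + (pts[, "y"] - 0.5)^2))
print(max_dist <= 0.1)
```

Without the square root the points would cluster near the centre, because a ring at a small radius covers less area than a ring of the same width further out.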

Question:

library(dplyr)
library(ggnetwork)
library(ggplot2)
library(igraph)
library(sna)

I have a data frame which looks like this, representing connections in a network between a number of objects:

origin <- c("A", "A", "B", "B", "C", "C", "B", "B")

dest <- c("D", "C", "D", "C", "B", "E", "E", "F")

net <- data.frame(origin, dest)

Then I summarise the data frame for use in ggnetwork like this, to show every combination of origin and destination as its own row:

df_edges <- net %>% group_by(origin, dest) %>% summarize(weight = n())

Then I convert to an igraph object, then a ggnetwork object like this:

net_igraph <- graph_from_data_frame(df_edges, directed = TRUE)

df_net <- ggnetwork(net_igraph)

Finally, I want to plot in ggplot2. If I want to plot all connections together I can plot like this:

ggplot(df_net, aes(x = x, y = y, xend = xend, yend = yend, label = vertex.names)) + 
    geom_edges() +
    geom_nodetext() +
    geom_nodes()

But I want to plot as a facet_wrap, so that each origin is given its own panel, showing the connections to each connected destination. The problem is that when I plot like this, the destination nodes are not displayed:

ggplot(df_net, aes(x = x, y = y, xend = xend, yend = yend, label = vertex.names)) + 
    geom_edges() +
    geom_nodetext() +
    geom_nodes() + 
    facet_wrap(~ vertex.names)

How can I get the destination nodes to be displayed in each panel?

I looked at the help file for ggnetwork() and found the by = argument, but I am not sure what my chosen "edge attribute" would be.


Answer:

I couldn't find any direct way to achieve that, which is not surprising given that

head(df_net, 2)
#           x         y  na.x vertex.names      xend      yend na.y weight
# 1 1.0000000 0.1356215 FALSE            A 1.0000000 0.1356215   NA     NA
# 2 0.3039919 0.5152220 FALSE            B 0.3039919 0.5152220   NA     NA

That is, in every row there is only the origin vertex name. So, while adding the destination vertices is actually easy, adding their names requires some extra work.

The structure of df_net is as follows. First come as many rows as there are vertices, with weight being NA; those rows just define the vertex positions (notice also that x coincides with xend and y with yend). Then come as many rows as there are edges, each giving the start and end coordinates of one edge.
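
To make that layout concrete, here is a small sketch on a toy data frame. The values in `toy_net` are made up and only mimic the columns of df_net. It also assumes, as in this question, that every edge has a non-missing weight.

```r
# Toy data frame with the same layout as df_net (values are made up).
toy_net <- data.frame(
  x            = c(1.0, 0.3, 0.5, 0.3, 1.0),
  y            = c(0.1, 0.5, 0.0, 0.5, 0.1),
  vertex.names = c("A", "B", "C", "B", "A"),
  xend         = c(1.0, 0.3, 0.5, 0.48, 0.51),
  yend         = c(0.1, 0.5, 0.0, 0.02, 0.01),
  weight       = c(NA, NA, NA, 1, 1)
)

node_rows <- toy_net[is.na(toy_net$weight), ]   # one row per vertex
edge_rows <- toy_net[!is.na(toy_net$weight), ]  # one row per edge

# On node rows the start and end coordinates coincide.
print(all(node_rows$x == node_rows$xend & node_rows$y == node_rows$yend))
print(nrow(node_rows))
print(nrow(edge_rows))
```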

However, there is an issue. For instance,

df_net[c(3, 7), ]
#            x        y  na.x vertex.names      xend       yend  na.y weight
# 3  0.4846586 0.000000 FALSE            C 0.4846586 0.00000000    NA     NA
# 31 0.3039919 0.515222 FALSE            B 0.4763860 0.02359162 FALSE      1

The second row corresponds to an edge from B to C. The problem is that xend and yend of the second row are not exactly equal to x and y of the first row (presumably because ggnetwork shortens the edges of a directed graph to leave room for arrowheads, see its arrow.gap argument). Hence, we cannot directly identify that this edge actually goes to C. For this purpose, we can use an approximate matching function defined as follows:

apprMatch <- function(x, y) apply(x, 1, function(z) which.min(colSums((t(y) - z)^2)))

It takes two matrices (two columns each) and for each row of x it finds the closest row of y. Given that the graph is not extremely dense, it should work without any problems (even when it is dense I would expect it to work).
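
A quick check of apprMatch on made-up coordinates may help (the numbers are illustrative, loosely based on the rows shown above, and the function is repeated so the snippet runs on its own):

```r
apprMatch <- function(x, y) apply(x, 1, function(z) which.min(colSums((t(y) - z)^2)))

# Vertex positions: rows 1, 2, 3 stand for A, B, C.
vertices <- rbind(c(1.0000, 0.1356),
                  c(0.3040, 0.5152),
                  c(0.4847, 0.0000))

# Edge end points that lie close to, but not exactly on, a vertex.
edge_ends <- rbind(c(0.48, 0.02),   # nearest vertex is row 3 (C)
                   c(0.99, 0.14))   # nearest vertex is row 1 (A)

idx <- apprMatch(edge_ends, vertices)
print(idx)
```

Each element of idx is the row number in vertices that is closest, in squared Euclidean distance, to the corresponding row of edge_ends.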

Hence, let

ends1 <- with(df_net, cbind(xend, yend)[!is.na(weight), ])
ends2 <- with(df_net, cbind(x, y)[is.na(weight), ])

be those two matrices that we want to match. Then

df_net$to[!is.na(df_net$weight)] <- as.character(df_net$vertex.names[apprMatch(ends1,ends2)])

yields

tail(df_net, 2)
#    x         y  na.x vertex.names      xend        yend  na.y weight to
# 10 1 0.1356215 FALSE            A 0.5088354 0.006362567 FALSE      1  C
# 11 1 0.1356215 FALSE            A 0.8644390 0.614776499 FALSE      1  D

i.e., a column to holding the destination vertex names. Thus, all in all we have

apprMatch <- function(x, y) apply(x, 1, function(z) which.min(colSums((t(y) - z)^2)))
ends1 <- with(df_net, cbind(xend, yend)[!is.na(weight), ])
ends2 <- with(df_net, cbind(x, y)[is.na(weight), ])
df_net$to[!is.na(df_net$weight)] <- as.character(df_net$vertex.names[apprMatch(ends1,ends2)])

ggplot(df_net, aes(x = x, y = y, xend = xend, yend = yend, label = vertex.names)) + 
  geom_edges() +
  geom_nodetext(vjust = 1, hjust = 1) + 
  geom_nodetext(aes(label = to, x = xend, y = yend), vjust = 1, hjust = 1) +
  geom_nodes() +
  geom_nodes(aes(x = xend, y = yend)) +
  facet_wrap(~ vertex.names)

where I also added vjust and hjust so that the vertex names are clearer.

Question:

I have a graph of vertices and edges which I'd like to plot using a fruchtermanreingold layout.

Here's the graph edges matrix:

edge.mat <- matrix(as.numeric(strsplit("3651,0,0,1,0,0,0,0,2,0,11,2,0,0,0,300,0,1,0,0,66,0,78,9,0,0,0,0,0,0,11690,0,1,0,0,0,0,0,0,0,0,493,1,1,0,4288,5,0,0,36,0,9,7,3,0,6,1,0,1,7,490,0,0,0,6,0,0,628,6,12,0,0,0,0,0,641,0,0,4,0,0,0,0,0,0,66,0,0,0,0,3165,0,281,0,0,0,0,0,0,0,0,45,1,0,0,35248,0,1698,2,0,1,0,2,99,0,0,6,29,286,0,31987,0,1,10,0,8,0,16,0,21,1,0,0,1718,0,51234,0,0,17,3,12,0,0,7,0,0,0,1,0,2,16736,0,0,0,3,0,0,4,630,0,0,0,9,0,0,29495,53,6,0,0,0,0,5,0,0,0,0,3,0,19,186,0,0,0,482,8,12,0,1,0,7,1,0,6,0,26338",split=",")[[1]]),nrow=14,dimnames=list(LETTERS[1:14],LETTERS[1:14]))

I then create an igraph object from that using:

gr <- igraph::graph_from_adjacency_matrix(edge.mat,mode="undirected",weighted=T,diag=F)

Then I follow examples of R's ggnetwork to convert gr to a data.frame:

set.seed(1)
gr.df <- ggnetwork::ggnetwork(gr,layout="fruchtermanreingold",weights="weight",niter=50000, arrow.gap=0)

And then I plot it using ggplot2 and ggnetwork:

ggplot2::ggplot(gr.df,ggplot2::aes(x=x,y=y,xend=xend,yend=yend))+
  ggnetwork::geom_edges(color="gray")+
  ggnetwork::geom_nodes(color="black")+
  ggnetwork::geom_nodelabel_repel(aes(label=vertex.names,color=vertex.names),fontface="bold",box.padding=unit(1,"lines"))+
  ggplot2::theme_minimal()+ggplot2::theme(axis.text=ggplot2::element_blank(),axis.title=ggplot2::element_blank(),legend.position="none")


My question is how to get the widths of the edge lines in this plot to be proportional to the edge weights in gr (igraph::E(gr)$weight)?

Looking at gr.df:

> head(gr.df)
          x         y  na.x vertex.names      xend      yend na.y weight
1 0.3637960 0.8873783 FALSE            A 0.3637960 0.8873783   NA     NA
2 0.7480217 0.4129375 FALSE            B 0.7480217 0.4129375   NA     NA
3 0.1306538 0.0000000 FALSE            C 0.1306538 0.0000000   NA     NA
4 0.4828271 0.6498561 FALSE            D 0.4828271 0.6498561   NA     NA
5 0.2243358 0.4484766 FALSE            E 0.2243358 0.4484766   NA     NA
6 1.0000000 0.6396669 FALSE            F 1.0000000 0.6396669   NA     NA

I see that the edges are not transferred from gr to gr.df.


Answer:

You can map the size aesthetic inside the geom_edges call, e.g. aes(size=weight):

ggplot2::ggplot(gr.df,ggplot2::aes(x=x,y=y,xend=xend,yend=yend))+
  ggnetwork::geom_edges(ggplot2::aes(size=weight),color="gray")+
  ggnetwork::geom_nodes(color="black")+
  ggnetwork::geom_nodelabel_repel(ggplot2::aes(label=vertex.names,color=vertex.names),fontface="bold",box.padding=ggplot2::unit(1,"lines"))+
  ggplot2::theme_minimal()+ggplot2::theme(axis.text=ggplot2::element_blank(),axis.title=ggplot2::element_blank(),legend.position="none")

As a note, the data frame does contain the vertices as well as the edges and their weights. The vertices are listed first and the edges come next, so if you look at the whole data frame (or just its tail), you'll see that the weight values are there for the edges.