In this lab we will review concepts of community detection and how to represent our network in a map.

First we will load the data and the libraries we will be using. Remember that on the previous lab we exported a data set at the end, we will continue using this objects.

# libraries we will use.
library(dplyr) # Data manipulation
library(sf) # Spatial data manipulation
library(tidygraph) # Network manipulation and analysis
library(purrr) # for data manipulation
library(ggraph) # for network visualization
library(ggpubr) # for arranging plots

# If you are starting a new session, load the files and libraries again 
# Load the data for nodes
node <- STNet::SwinePrem %>% # load the data from the STNet library
  mutate(id = as.character(id)) # change the id to variable to character
# Load the data for edges
edge <- STNet::SwineMov %>% 
  mutate(id_orig = as.character(id_orig), id_dest = as.character(id_dest))

# create the network object
net <- as_tbl_graph(edge) %N>% 
  left_join(node, by = c('name' = 'id'))  # Now we join to the node data frame to include other variables

1 Community detection

In this section we will use different algorithms to identify communities in our network.

# First we need to simplify the network
c <- net %E>%  # we call our network and activate the edges
  mutate(N = as.integer(1)) %>%  # create a variable for the number of movements (each row is 1 movement)
  convert(to_simple) %E>%  # now we will convert it to a simple network
  mutate(weight = map_int(.orig_data, ~.x %>% pull(N) %>% sum())) %N>%  # We have to sum all the repeated movements
  mutate(walktrap = factor(group_walktrap(weights = weight)), # use the walktrap algorithm for community detection
         infomap = factor(group_infomap(weights = weight))) # use the infomap for community detection

Then we will create an empty list to fill with plots and compare the different algorithms.

# Create the empty list
CP <- list()
# Make a plot for the edges only
pc <- c %>% # our simplified network
  ggraph(layout = 'nicely') + # call the ggraph function
  geom_edge_link() + # add the edges
  theme(legend.position = 'bottom') # set the legend position to bottom

CP[['Infomap']] <- pc + # We call our plot with only the edges
  geom_node_point(aes(col = infomap), size = 2) + # we add the nodes
  labs(title = 'Infomap') # title of our plot

CP[['Walktrap']] <- pc +
  geom_node_point(aes(col = walktrap), size = 2) +
  labs(title = 'Walktrap') 

# We arrange our plots in a single figure
ggarrange(plotlist = CP)

## Warning: Using the `size` aesthetic in this geom was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` in the `default_aes` field and elsewhere instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

We can select specific communities to show by filtering by the newly created community variable.

c %N>% 
  filter(walktrap == '1') %>% # filter for community 1
  ggraph(layout = 'nicely') + # call the ggraph function
  geom_edge_link() + # add the edges
  geom_node_point(aes(col = walktrap), size = 2) + # we add the nodes
  theme(legend.position = 'bottom') + # set the legend position to bottom
  labs(title = 'Subset of the nework for the community 1')

Exercise: Now increase the number of steps in the walk trap algorithm, what happens when we increase the number of steps? What do you think would be the optimal number of steps for this example

2 Spatial representation of the network

Now we will use the network we created and the spatial location of our farms to see the movements on a map.
We will be using the sf package to manipulate the spatial objects, and the ggplot2 package for visualization.
In the STNet package there is a spatial polygons data, which includes the counties in the state of Iowa.

# Loading the packages
library(sf) # Package for spatial objects
library(ggplot2) # package for plots

# We load the spatial object from the package STNet
iowa <- st_read(system.file("data/Io.shp", package = "STNet"))

## Reading layer `Io' from data source 
##   `/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/STNet/data/Io.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 99 features and 2 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: -96.63567 ymin: 40.37454 xmax: -90.13931 ymax: 43.50465
## Geodetic CRS:  WGS 84

# plot map using sf
plot(iowa$geometry)

Next we will transform the nodes as a spatial points object, for this we use the function st_to_sf() and we need to specify the names of the columns that have the spatial coordinates.

NodeSp <- net %N>% # This is our node data.frame
  mutate(indegree = centrality_degree(mode = 'in')) %>% # calculate the indegree
  data.frame() %>% 
  st_as_sf(coords = c("long", "lat"), # Variables for the coordinates
           crs = st_crs(iowa)) # This is the CRS we are using

2.1 Plotting our map

One of the nice things of ggplot is that we can create a map and store it in an object and later we can keep adding layers to this map. So first we will create a map of the state.

map <- ggplot() +
  geom_sf(data = iowa, # name of the spatial dataset
          color="grey20", # color of the shape border
          fill="white", # fill of the shape
          size=0.4) + # width of the border
  theme_void() # This is a theme form ggplot

2.2 Plot the nodes

Once we have the base map of the state, we can add the spatial points data we created previously. We can specify the size of the points using a variable.

map + geom_sf(data = NodeSp, # name of our data
              aes(color = farm_type, # we color the nodes by farm type
                  size = indegree)) +
  ggtitle("Farms and their indegree") # the title of our plot

Exercise: Make the same plot, but make the size of the nodes relative to outdegree

2.3 Adding the euclidean contacts

We can also add the connection between the nodes represented by euclidean distance (fancy word for straight line). But first we will need to add the coordinates to the edge data using joins:

# First we need to include the information of origins and destination
edge <- edge %>%
  left_join(node[c("id", "lat", "long")], by = c("id_orig" = "id")) %>%
  rename(O_Lat = lat, O_Long = long) %>%
  left_join(node[c("id", "lat", "long")], by = c("id_dest" = "id")) %>%
  rename(D_Lat = lat, D_Long = long)

In the previous lab we calculated the euclidean distance between each pair of farms involved in a movement. Here we will visualize those movements.

# The function geom_segment adds a straight line between two coordinates:
map +
  geom_segment(data=edge, 
               aes(x=O_Long, y=O_Lat, # this is where the line starts
                   xend=D_Long, yend=D_Lat)) # this is where it ends

# We can add the information of the type of movement to change the color of the line and the number of animals for the transparency
map +
  geom_segment(data=edge, 
               aes(x=O_Long, y=O_Lat,
                   xend=D_Long, yend=D_Lat, 
                   color=type_orig,
                   alpha = pigs.moved))

2.4 Putting everything together

Now we will add both the farm locations and the direction of the movements between the farms on a map.

#plot nodes & edges - add both commands geom_segment and geom_point#
map +
  geom_segment(data=edge, 
               aes(x=O_Long, y=O_Lat,
                   xend=D_Long, yend=D_Lat,
                   alpha = pigs.moved), 
               show.legend=F) +
  geom_sf(data = NodeSp, 
          aes(color = farm_type, 
              size = indegree), show.legend = "point")

2.5 Subsetting the data

Sometimes we are interested in a particular type of movements. We can subset this using the dplyr functions such as filter(). In the next plot we will select only the movements that comes from sow farms.

# plot movements from sow farms only
map +
  edge %>%
  filter(type_orig == "sow farm") %>%
  geom_segment(data = ., 
               aes(x=O_Long, y=O_Lat, 
                   xend=D_Long, yend=D_Lat, 
                   color = type_orig), show.legend = F) +
  geom_sf(data = NodeSp, 
          aes(color = farm_type), size=3, show.legend = "point")

We can be even more specific and filter the movements from sow farms that are directed to GDU.
We will also add at the end the function ggplotly() from the package plotly to obtain a map were we can zoom and hover over some features to obtain more information.

# We store the map of movements between GDU to sow farm
m <- map +
  edge %>%
  filter(type_orig == 'GDU', type_dest == "sow farm") %>%
  geom_segment(data = ., aes(x=O_Long, y=O_Lat,
                             xend=D_Long, yend=D_Lat, 
                             color = type_orig), show.legend = F) +
  geom_sf(data = NodeSp, 
          aes(color = farm_type),
          size=3, show.legend = "point") +
  ggtitle("GDU to Sow farm Movments")

# We use the function from plotly to transform our map into n interactive map.  
library(plotly)

ggplotly(m)

3 More on interactive maps.

If you are interested in more about interactive maps plotly also has option for using background maps, but for this you need to get a public Mapbox access token, which is free, but requires registration. Some great resources for more information:

Here I provide an example of the kind of maps you can get using mapbox, but this This code will not run unless you use your own API Key

# Sys.setenv('MAPBOX_TOKEN' = yourKey)
plot_mapbox() %>%
  add_segments(
    data = group_by(edge, id_orig, id_dest),
    x = ~O_Long, xend = ~D_Long,
    y = ~O_Lat, yend = ~D_Lat,
    alpha = 0.1, size = I(1), hoverinfo = "none"
  )   %>%
  add_markers(data = node,
    x = ~long, y = ~lat, text = ~id,
    split = ~farm_type,
    hoverinfo = "text"
  ) %>%
  layout(
    mapbox = list(
      style = 'open-street-map',
      zoom = 6,
      center = list(lat = 42, lon = -93)
    ))

This lab has been developed with contributions from: Jose Pablo Gomez-Vazquez, Jerome Baron, and Beatriz Martinez-Lopez.
Feel free to use these training materials for your own research and teaching. When using the materials we would appreciate using the proper credits. If you would be interested in a training session, please contact: jpgo@ucdavis.edu

Network visualization