
Holy ifelse() statements Batman!

Found in [R, ggraph, data-visualization] by @awhstin

If you were like me, Batman cartoons, movies, and television shows were a staple of your Saturday mornings for years. It all started with the ‘Bat-man’ first appearing in comic books in 1939, and the character has come in many iterations since, from the dark and brooding to the fun and campy. Sadly, the world recently lost the original Batman, Adam West, who starred in the 1960s Batman TV series. I recently stumbled upon an article on Mental Floss that detailed the different villains from that series and decided to make a little tribute to the show and to Adam West.
Update!
When I first converted my website to the blogdown package a few things broke, like this post, and I decided to just remove them and revisit them later. A couple of days ago someone contacted me on Twitter to ask whether this post was still around. Nothing cures procrastination like a little bit of accountability.

Code

Before we start, here are some really handy tips on the key elements of the ggraph package: nodes, edges, and layouts. I decided to do this in part to teach myself ggraph, because my mind struggles to grasp the concept of networks with nodes and edges, and those links helped a lot. We will be getting our data from a really cool Mental Floss article that you should give a read, A Visual Guide to All 37 Villains in the Batman TV Series.

library(rvest)
library(ggraph)
library(igraph)
library(tidyverse)

#rvest
#read the article once and reuse the parsed page
page<-read_html('http://mentalfloss.com/article/60213/visual-guide-all-37-villains-batman-tv-series')

#villain headings; piping into data.frame() names the text column '.'
chars<-page%>%
  html_nodes('#article-1 > div:nth-child(1) > div:nth-child(2) > div:nth-child(1) > div:nth-child(1) > div:nth-child(2) > h4')%>%
  html_text()%>%
  data.frame(stringsAsFactors = FALSE)
chars$name<-sub(".+?. ","",chars$.) #strip the leading number
chars$id<-as.integer(lapply(strsplit(chars$.,'. '),'[',1)) #keep the leading number

#season and episode appearances for each villain
apps<-page%>%
  html_nodes('strong i')%>%
  html_text()%>%
  data.frame(stringsAsFactors = FALSE)

#number the appearances in page order so they line up with the heading ids
apps$id<-seq_len(nrow(apps))
villians<-inner_join(apps,chars,by = c('id'))
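If you want to try the scraping pattern without hitting the live page, here is a minimal sketch on an inline HTML snippet. The markup, the `N. Name` heading format, and the episode numbers are all made up for illustration, and the real article's structure may well differ.

```r
library(rvest)

# Hypothetical stand-in for the article markup (not the real page)
page <- minimal_html('
  <h4>1. The Riddler (Frank Gorshin)</h4>
  <p><strong><i>SEASON 1 (EPISODES 1, 2)</i></strong></p>
  <h4>2. The Penguin (Burgess Meredith)</h4>
  <p><strong><i>SEASON 1 (EPISODES 3, 4)</i></strong></p>
')

# Headings hold "<id>. <name>"; split them into two columns
headings <- page %>% html_nodes("h4") %>% html_text()
toy_chars <- data.frame(
  id   = as.integer(sub("\\..*$", "", headings)),
  name = sub("^[0-9]+\\.\\s*", "", headings),
  stringsAsFactors = FALSE
)

# Appearance text, numbered in page order so it lines up with the headings
toy_apps <- data.frame(
  text = page %>% html_nodes("strong i") %>% html_text(),
  stringsAsFactors = FALSE
)
toy_apps$id <- seq_len(nrow(toy_apps))

# Base R merge() plays the role of inner_join() here
toy <- merge(toy_apps, toy_chars, by = "id")
```

Naming the text column explicitly (`text`) instead of relying on the `.` column also avoids the `..x` and `..y` names that show up after the join in the real code.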

Now we need to clean and organize the data.

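#massage data
#after the join, '..x' holds the appearance text and '..y' the heading text
#one row per season
raw.seasons<-separate_rows(villians,..x,sep = "SEASON ")
#reuse '..y' for the season number, the first character of each piece
raw.seasons$..y<-as.integer(substr(raw.seasons$..x,1,1))
#split on every non-digit so each number gets its own row
raw.seasons<-separate_rows(raw.seasons,..x,sep = "([^0-9])")
raw.seasons$..x<-as.numeric(raw.seasons$..x)
#drop the empty pieces left over from the split
batman<-subset(raw.seasons,!is.na(..x))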
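To see what `separate_rows()` is doing, here is a small sketch on a single made-up appearance string. The text format is an assumption, and this is a simplified variant of the cleaning above (it pulls out the episode list explicitly), not the exact steps used in the post.

```r
library(tidyr)

# Made-up appearance text for one villain
toy <- data.frame(
  name = "The Riddler",
  text = "SEASON 1 (EPISODES 1, 2) SEASON 2 (EPISODES 5)",
  stringsAsFactors = FALSE
)

# One row per season; drop any empty piece left before the first "SEASON "
by_season <- separate_rows(toy, text, sep = "SEASON ")
by_season <- by_season[nzchar(trimws(by_season$text)), ]

# The season number is the first character of each piece
by_season$season <- as.integer(substr(by_season$text, 1, 1))

# Keep only the comma-separated episode list inside the parentheses
by_season$text <- sub("^.*\\(EPISODES? ", "", by_season$text)
by_season$text <- sub("\\).*$", "", by_season$text)

# One row per episode
episodes <- separate_rows(by_season, text, sep = ",\\s*")
episodes$episode <- as.integer(episodes$text)
```

The result has one row per villain, season, and episode, which is the long shape the edge list needs.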

#arrange to plot
#rename by column name rather than position, so the result does not depend on column order
batman<-dplyr::rename(batman,from = id,season = ..y,char = name,to = ..x)
batman<-batman[,c('from','to','season','char')]
batman$to<-paste0(batman$season,batman$to) #end node label: season followed by episode number
batman$from<-gsub(' \\(','\n\\(',batman$char) #this bit makes nice names
#create igraph object
graph<-graph_from_data_frame(batman)
V(graph)$degree<-degree(graph)
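If nodes and edges are as fuzzy for you as they were for me, a tiny edge list helps. This sketch uses placeholder villain names and season/episode labels (all hypothetical) to show how `graph_from_data_frame()` reads the first two columns as the edge endpoints and keeps the remaining columns as edge attributes.

```r
library(igraph)

# Each row is one edge: villain -> season/episode label
edges <- data.frame(
  from   = c("Villain A", "Villain A", "Villain B"),
  to     = c("11", "12", "11"),
  season = c(1, 1, 1),
  stringsAsFactors = FALSE
)

g <- graph_from_data_frame(edges)

V(g)$name                 # every distinct value in 'from' and 'to' becomes a node
E(g)$season               # extra columns travel along as edge attributes
V(g)$degree <- degree(g)  # number of edges touching each node
```

Here "Villain A" touches two edges and the shared episode node "11" also touches two, which is what later drives the point sizes in the plot.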

Data! Now for the part I wanted to get to when I started this thing: the graph. There are a ton of ifelse() statements to style the end points (the season and episode nodes) differently from the villain nodes, which makes the code look a little unwieldy, but I think the end product looks great.

#nodes whose name contains a digit are the season/episode end points
n.names<-grep("[[:digit:]]",V(graph)$name,value=TRUE)
is.ep<-V(graph)$name %in% n.names

#base_family and title_family assume the Roboto fonts are installed
ggraph(graph,layout='fr') + 
  geom_edge_link(aes(colour = factor(season)))+
  geom_node_point(aes(size=ifelse(is.ep,1,degree)),
                  colour=ifelse(is.ep,'#363636','#ffffff'),
                  show.legend = FALSE)+ 
  theme_graph(background = 'grey20',text_colour = 'white',base_family="Roboto Light",
              base_size = 10,
              subtitle_size = 10,
              title_family = 'Roboto Slab',
              title_size = 22,
              title_face = "plain")+ 
  theme(legend.position = 'bottom')+ 
  scale_edge_color_brewer('Season',palette = 'Dark2')+ 
  geom_node_text(aes(label = name,fontface = 'bold'),
                 color = ifelse(is.ep,'grey40','white'),
                 size = ifelse(is.ep,1.75,2.5),repel = TRUE,check_overlap = TRUE)+
  labs(title = 'Batman Villains',subtitle = 'Plotting 37 Batman villains across 3 seasons with\nnode ends representing season & episode number',
       caption = 'ggraph walkthroughs available at: http://www.data-imaginist.com/\n Data from: http://mentalfloss.com/article/60213/visual-guide-all-37-villains-batman-tv-series')
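One way to tame the ifelse() pile is to compute each style once and store it on the nodes. This is only a sketch on a toy graph with placeholder names; the attribute names (`point_size`, `point_colour`, `label_size`) are my own, not anything ggraph requires.

```r
library(igraph)
library(ggraph)

# Toy edge list: placeholder villains linked to season/episode labels
edges <- data.frame(
  from   = c("Villain A", "Villain A", "Villain B"),
  to     = c("11", "12", "11"),
  season = c(1, 1, 1),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(edges)

# End points are the nodes whose name contains a digit
is_episode <- grepl("[[:digit:]]", V(g)$name)

# Compute each style once and keep it as a node attribute
V(g)$point_size   <- ifelse(is_episode, 1, degree(g))
V(g)$point_colour <- ifelse(is_episode, "#363636", "#ffffff")
V(g)$label_size   <- ifelse(is_episode, 1.75, 2.5)

# Node attributes can be mapped in aes() or passed as plain vectors
p <- ggraph(g, layout = "fr") +
  geom_edge_link(aes(colour = factor(season))) +
  geom_node_point(aes(size = point_size),
                  colour = V(g)$point_colour,
                  show.legend = FALSE) +
  geom_node_text(aes(label = name), size = V(g)$label_size)
```

The plotting call then reads as a list of layers rather than a list of conditions, and the styling rules live in one place.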

I think the question on everyone’s mind now is: when are we getting that King Tut villain movie? As a final note, if you are part of the Twitter-verse, I suggest giving the ggraph creator @thomasp85 a follow, as well as @dataandme, who provided the much-needed push to get this back up.