Workout Wednesday with pollstR

I frequent the FiveThirtyEight website. Nearly every day I am on there reading an article over lunch or just admiring the visualizations. One visualization I like looking at is their ‘How Popular/Unpopular is Donald Trump’ page. While reading their methodology, I came across some of the data behind it and learned that part of it comes from the Huffington Post Pollster, which has a handy R client already built called pollstR. So today I am using the pollstR data to try to make a visualization similar to the one from FiveThirtyEight.



#packages
library(pollstR)
library(tidyverse)
library(hrbrthemes)

Now we can follow the pollstR documentation to grab the data.



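# chart identifier used to request the polls and trendlines
slug <- "donald-trump-favorable-rating"
polls <- pollster_charts_polls(slug)[["content"]]
trendlines <- pollster_charts_trendlines(slug)[["content"]]
# reshape to long format (one row per poll and response); missing values become 0
data <- gather(polls, response, value, Favorable, Unfavorable) %>%
  mutate(value = if_else(is.na(value), 0, value))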
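To make the reshaping step easier to picture, here is a minimal sketch on a tiny made-up table. The `toy_polls` data and its numbers are invented for illustration and are not real Pollster results; only the `gather()` and `mutate()` calls mirror the code above.

```r
library(tidyr)
library(dplyr)

# Made-up polls in the wide layout: one column per response (illustration only)
toy_polls <- tibble(
  end_date    = as.Date(c("2017-02-01", "2017-02-08")),
  Favorable   = c(42, NA),
  Unfavorable = c(53, 55)
)

# Stack the two response columns into response/value pairs,
# then replace the missing value with 0 as in the post
toy_long <- gather(toy_polls, response, value, Favorable, Unfavorable) %>%
  mutate(value = if_else(is.na(value), 0, value))

print(toy_long)
```

Each poll now appears once per response, which is the shape ggplot2 wants when mapping `response` to color. Note that a missing value turned into 0 still takes part in any average or smoother computed later.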

I am only looking at favorable vs. unfavorable for now, but you will notice undecided is also available. Next we subset the data to polls that started on or after Trump’s first day in office, January 20, 2017, and then split it by response so each group can get its own label. (There may be a better way to do this, so if you have tips please let me know.)



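# keep polls that started on or after inauguration day
t.poll <- subset(data, start_date >= as.Date('2017-01-20'))
# one data frame per response, used later for the labels
un <- subset(t.poll, response == 'Unfavorable')
fa <- subset(t.poll, response == 'Favorable')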
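One possible alternative to the two subsets above is to build a single small table of label positions with `group_by()` and `summarise()`. This is only a sketch on invented numbers; `toy_long`, `label_data`, and the column names `x`, `y`, and `label` are hypothetical names chosen for the example.

```r
library(dplyr)

# Made-up long-format polls (illustration only, not real Pollster data)
toy_long <- tibble(
  end_date = as.Date(c("2017-02-01", "2017-02-08", "2017-02-01", "2017-02-08")),
  response = c("Favorable", "Favorable", "Unfavorable", "Unfavorable"),
  value    = c(42, 44, 53, 55)
)

# One row per response: label position (50 days past the last poll)
# and the rounded mean used as the label text
label_data <- toy_long %>%
  group_by(response) %>%
  summarise(x = max(end_date) + 50, y = round(mean(value), 1)) %>%
  mutate(label = paste(y, response))

print(label_data)
```

A single `geom_label(data = label_data, aes(x = x, y = y, label = label, color = response))` layer could then take the place of two separate label layers, and, as far as I can tell, it draws each label once rather than once per poll row.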

Now plot!



ggplot() +
  geom_point(data = t.poll, aes(x = end_date, y = value, color = response), alpha = 0.5) +
  # loess trend per response, on the same date axis as the points
  geom_smooth(data = t.poll, method = "loess",
              aes(x = end_date, y = value, color = response, fill = response)) +
  theme_ipsum(grid = FALSE) +
  geom_hline(yintercept = 50, color = '#444444', alpha = .5) +
  scale_x_date(limits = c(min(t.poll$end_date), min(t.poll$end_date) + 365)) +
  # dotted line marks the date of the most recent poll
  geom_vline(xintercept = as.numeric(max(t.poll$end_date)), color = '#444444', alpha = .5, linetype = 'dotted') +
  geom_label(data = fa, aes(x = max(end_date) + 50, y = round(mean(value), 1),
                            label = paste(sep = ' ', round(mean(value), 1), 'Favorable'), color = response)) +
  geom_label(data = un, aes(x = max(end_date) + 50, y = round(mean(value), 1),
                            label = paste(sep = ' ', round(mean(value), 1), 'Unfavorable'), color = response)) +
  labs(x = '', caption = 'The label displays the mean value rating per response group') +
  theme(legend.position = 'none', panel.grid.major.y = element_line(color = '#dedede'), panel.grid.major.x = element_line(color = '#dedede'))