Hurricane Dorian in the News

The ‘newsflash’ package is something I played with back in 2017 during Hurricane Irma. Having recently been under watch for Dorian, I’m revisiting it to visualize the storm’s news coverage.

# Set CRAN Mirror
options(repos = c(CRAN = "https://cran.rstudio.com"))

# Set time zone
options(tz="America/New_York")

# Newsflash documentation
# https://github.com/hrbrmstr/newsflash

# Install packages
# devtools::install_github("hrbrmstr/newsflash")
# install.packages("tidyverse")
# install.packages("ggalt")
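# install.packages("hrbrthemes")
# install.packages("anytime")
# install.packages("ggthemes") # used below via ggthemes::
# install.packages("ggrepel")  # used below via ggrepel::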

# Load libraries
library(newsflash)
## NOTE: There are breaking changes to the package API due to GDELT's v2 API
library(tidyverse)
## ── Attaching packages ───────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.0.0     ✔ purrr   0.2.5
## ✔ tibble  1.4.2     ✔ dplyr   0.7.6
## ✔ tidyr   0.8.1     ✔ stringr 1.3.1
## ✔ readr   1.1.1     ✔ forcats 0.3.0
## ── Conflicts ──────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(ggalt)
library(hrbrthemes)
## NOTE: Either Arial Narrow or Roboto Condensed fonts are *required* to use these themes.
##       Please use hrbrthemes::import_roboto_condensed() to install Roboto Condensed and
##       if Arial Narrow is not on your system, please see http://bit.ly/arialnarrow
library(anytime) # Get errors without this
## Warning: package 'anytime' was built under R version 3.5.2

Let’s look at chyrons before we examine network TV.

# Look at chyrons (on-screen text banners)
ch <- read_chyrons("2019-09-02")
head(ch)
## # A tibble: 6 x 5
##   ts                  channel duration details           text             
##   <dttm>              <chr>      <int> <chr>             <chr>            
## 1 2019-09-02 00:00:00 MSNBCW        20 MSNBCW_20190901_… "NATIONAL HURRIC…
## 2 2019-09-02 00:01:00 CNNW          59 CNNW_20190902_00… HURRICANE DORIAN…
## 3 2019-09-02 00:02:00 MSNBCW        30 MSNBCW_20190901_… NEW UPDATE ON HU…
## 4 2019-09-02 00:03:00 CNNW          45 CNNW_20190902_00… "HURRICANE DORIA…
## 5 2019-09-02 00:04:00 CNNW          62 CNNW_20190902_00… HURRICANE WARNIN…
## 6 2019-09-02 00:04:00 MSNBCW        49 MSNBCW_20190901_… "NATIONAL HURRIC…

Within the news on Labor Day, how often was Dorian mentioned? We’ll create a column just for that purpose.

# Flag chyrons that mention Dorian ("mention" column)
# Extract the hour of day from the timestamp ("hour" column)
ch2 <- mutate(ch, 
  hour = lubridate::hour(ts),
  text = tolower(text),
  mention = grepl("dorian", text))
head(ch2)
## # A tibble: 6 x 7
##   ts                  channel duration details     text       hour mention
##   <dttm>              <chr>      <int> <chr>       <chr>     <int> <lgl>  
## 1 2019-09-02 00:00:00 MSNBCW        20 MSNBCW_201… "nationa…     0 TRUE   
## 2 2019-09-02 00:01:00 CNNW          59 CNNW_20190… hurrican…     0 TRUE   
## 3 2019-09-02 00:02:00 MSNBCW        30 MSNBCW_201… new upda…     0 TRUE   
## 4 2019-09-02 00:03:00 CNNW          45 CNNW_20190… "hurrica…     0 TRUE   
## 5 2019-09-02 00:04:00 CNNW          62 CNNW_20190… hurrican…     0 FALSE  
## 6 2019-09-02 00:04:00 MSNBCW        49 MSNBCW_201… "nationa…     0 TRUE
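
As a side note, the flagging step can be tried without downloading anything. The sketch below uses a small data frame with invented chyron text (not real data) and compares the lowercase-then-match approach used above with base R's `ignore.case` argument, which leaves the original text untouched.

```r
# Toy chyron text, invented for illustration (not real chyron data)
toy <- data.frame(
  channel = c("CNNW", "MSNBCW", "CNNW"),
  text = c("HURRICANE DORIAN NEARS BAHAMAS",
           "Hurricane warning issued",
           "dorian update"),
  stringsAsFactors = FALSE
)

# Approach used above: lowercase the text, then match a lowercase pattern
toy$mention_lower <- grepl("dorian", tolower(toy$text))

# Alternative: keep the text as-is and ignore case while matching
toy$mention_icase <- grepl("dorian", toy$text, ignore.case = TRUE)

# Both approaches flag the same rows
same_flags <- identical(toy$mention_lower, toy$mention_icase)

# Share of toy chyrons that mention Dorian
share <- mean(toy$mention_lower)
```

Either approach works; `ignore.case = TRUE` is handy if you want to keep the original capitalization for display later.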

Now, let’s visualize this to see how it trended throughout the day.

ch2 %>% filter(mention) %>% 
  count(hour, channel) %>% 
  ggplot(aes(hour, n)) +
  geom_segment(aes(xend=hour, yend=0), 
               color = "lightslategray", size=1) + 
  scale_x_continuous(name="Hour (GMT)", breaks=seq(0, 23, 6),
                   labels=sprintf("%02d:00", seq(0, 23, 6))) +
  scale_y_continuous(name="# Chyrons", limits=c(0,20)) +
  facet_wrap(~channel, scales="free") +
  labs(title="'Dorian' Mentions per Hour on 2019 Labor Day") 
## Warning: Removed 3 rows containing missing values (geom_segment).
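
The warning above most likely comes from `limits=c(0,20)`: by default, a continuous ggplot2 scale turns values outside its limits into `NA`, and the geom then drops those rows. A minimal sketch with invented hourly counts (not the real chyron data) shows one way to zoom the axis without discarding rows, using `coord_cartesian()`.

```r
library(ggplot2)

# Toy hourly counts, invented for illustration; two values exceed 20
toy <- data.frame(hour = 0:5, n = c(3, 12, 25, 8, 31, 19))

# Rows that a scale limit of c(0, 20) would convert to NA and drop
dropped <- sum(toy$n > 20)

# coord_cartesian() zooms the view without removing any rows;
# segments taller than 20 are simply clipped at the top of the panel
p <- ggplot(toy, aes(hour, n)) +
  geom_segment(aes(xend = hour, yend = 0), color = "lightslategray") +
  coord_cartesian(ylim = c(0, 20))
```

Dropping the limits altogether is another option if you would rather see the full height of the busiest hours.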

Next, let’s examine networks. First, let’s see a sample of what’s available.

# How about networks?
ne <- list_networks(widget=FALSE)
head(ne,20)
## # A tibble: 20 x 6
##    StationID Description         Market      Network StartDate  EndDate   
##    <chr>     <chr>               <chr>       <chr>   <date>     <date>    
##  1 ALJAZ     Al Jazeera          Internatio… ALJAZ   2017-09-11 2017-09-11
##  2 ALJAZAM   Al Jazeera America  NationalDi… ALJAZAM 2013-08-20 2013-08-20
##  3 BBCNEWS   BBC News            Internatio… BBCNEWS 2017-01-01 2017-01-01
##  4 BETW      BET - San Francisc… San Franci… BET     2016-12-13 2016-12-13
##  5 BLOOMBERG Bloomberg           National    BLOOMB… 2013-12-05 2013-12-05
##  6 CNBC      CNBC                National    CNBC    2009-07-02 2009-07-02
##  7 CNN       CNN                 National    CNN     2009-07-02 2009-07-02
##  8 COM       Comedy Central      NationalSp… COM     2011-05-10 2011-05-10
##  9 CSPAN     CSPAN               National    CSPAN   2009-06-04 2009-06-04
## 10 CSPAN2    CSPAN2              National    CSPAN   2009-06-04 2009-06-04
## 11 CSPAN3    CSPAN3              National    CSPAN   2012-01-26 2012-01-26
## 12 CURRENT   CurrentTV - San Fr… San Franci… Curren… 2012-01-13 2012-01-13
## 13 DW        DeutscheWelle       Internatio… DW      2017-09-19 2017-09-19
## 14 FBC       FOX Business        National    FBC     2012-08-20 2012-08-20
## 15 FOXNEWS   FOX News            National    FOXNEWS 2009-07-02 2009-07-02
## 16 HLN       HLN - Maryland (HL… Maryland    HLN     2009-07-02 2009-07-02
## 17 KBCW      CW - San Francisco… San Franci… CW      2010-07-16 2010-07-16
## 18 KCAU      ABC - Sioux City (… Sioux City  ABC     2015-10-13 2015-10-13
## 19 KCCI      CBS - Des Moines (… Des Moines  CBS     2015-10-14 2015-10-14
## 20 KCNC      CBS - Denver (KCNC) Denver      CBS     2016-01-01 2016-01-01
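
To narrow a listing like this down to national stations, a simple filter on the `Market` column is enough. The sketch below rebuilds a few rows from the output above by hand (only rows whose fields are not truncated) so that it runs without calling `list_networks()`.

```r
# A few rows copied from the listing above (untruncated fields only)
ne_toy <- data.frame(
  StationID = c("CNBC", "CNN", "FBC", "FOXNEWS", "KCNC"),
  Market    = c("National", "National", "National", "National", "Denver"),
  Network   = c("CNBC", "CNN", "FBC", "FOXNEWS", "CBS"),
  stringsAsFactors = FALSE
)

# Keep only stations whose market is exactly "National"
national <- ne_toy[ne_toy$Market == "National", ]
national$StationID
```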

Similar to our chyron analysis, let’s plot mentions over time. This time, let’s look at news coverage across a period of just over two weeks.

# Query Dorian data
# Search term = Dorian
# Starting just before Labor Day weekend
# https://rdrr.io/github/hrbrmstr/newsflash/man/query_tv.html
dorian <- query_tv("Dorian", mode = "TimelineVol",
                   start_date = "2019-08-27", end_date = "2019-09-12")
head(dorian)
## # A tibble: 6 x 3
##   network   date                 value
##   <chr>     <dttm>               <dbl>
## 1 Bloomberg 2019-08-27 12:00:00 0.0301
## 2 Bloomberg 2019-08-28 12:00:00 0.580 
## 3 Bloomberg 2019-08-29 12:00:00 0.573 
## 4 Bloomberg 2019-08-30 12:00:00 1.87  
## 5 Bloomberg 2019-08-31 12:00:00 0.314 
## 6 Bloomberg 2019-09-01 12:00:00 0.386

Let’s plot this, breaking up the different networks into individual facets.

# Visualize network coverage from August 27 to September 12
# Themes causing errors, so commenting out parts
# https://github.com/hrbrmstr/hrbrthemes
# Reuse the 'dorian' tibble from above instead of querying again
dorian %>% 
  arrange(date) %>% 
  ggplot(aes(date, value, group=network)) +
  ggalt::geom_xspline(aes(color=network)) +
  ggthemes::scale_color_tableau(name=NULL) +
  labs(x=NULL, y="Volume Metric", 
       title="'Dorian' Trends Across National Networks") +
  facet_wrap(~network) +
  # theme_ipsum_rc(grid="XY") +
  theme(legend.position="none") 

We see that coverage built up, was significant for a while, and then quickly dropped off.
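
To put a date on the peak rather than eyeballing it, one option is to pull the row with the highest value for each network. The sketch below uses only the six Bloomberg rows printed by `head(dorian)` earlier, so the peak it finds applies to those six days and not necessarily to the full query window.

```r
# The six Bloomberg rows shown in head(dorian), typed in by hand
bloomberg <- data.frame(
  network = "Bloomberg",
  date = as.Date(c("2019-08-27", "2019-08-28", "2019-08-29",
                   "2019-08-30", "2019-08-31", "2019-09-01")),
  value = c(0.0301, 0.580, 0.573, 1.87, 0.314, 0.386),
  stringsAsFactors = FALSE
)

# Split by network and keep the row with the largest value in each group
peaks <- do.call(
  rbind,
  lapply(split(bloomberg, bloomberg$network),
         function(d) d[which.max(d$value), ])
)
peaks
```

The same split-and-pick pattern should work on the full `dorian` tibble, since it has the same three columns.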

Another way the ‘newsflash’ package allows us to examine this topic is via a word cloud. When Dorian was mentioned, what else was said?

# What are the top words associated with Dorian?
# Let's do a word cloud
wc <- query_tv('Dorian', mode = "WordCloud", 
               start_date = "2019-09-01", end_date = "2019-09-03")
head(wc)
## # A tibble: 6 x 2
##   label     count
##   <chr>     <dbl>
## 1 Dorian    100  
## 2 Hurricane  71.2
## 3 Storm      59.1
## 4 Bahamas    41.5
## 5 Now        35.3
## 6 Florida    35.0

Next, let’s plot this!

ggplot(wc, aes(x=1, y=1)) +
  ggrepel::geom_label_repel(aes(label=label, size=count),
                            segment.colour="#00000000", segment.size=0) +
  scale_size_continuous(trans="sqrt") +
  labs(x=NULL, y=NULL) +
  theme_ipsum_rc(grid="") +
  theme(axis.text=element_blank()) +
  theme(legend.position="none") 

This isn’t my favorite word cloud viz, but it gets the job done.
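
If the word cloud feels hard to read, a sorted bar chart is one alternative. This is a minimal sketch that uses only the six rows shown by `head(wc)` above; the axis label and title are my own wording, not something the package provides.

```r
library(ggplot2)

# The six rows shown in head(wc), typed in by hand
wc_top <- data.frame(
  label = c("Dorian", "Hurricane", "Storm", "Bahamas", "Now", "Florida"),
  count = c(100, 71.2, 59.1, 41.5, 35.3, 35.0),
  stringsAsFactors = FALSE
)

# Order the labels by count so the bars come out sorted
wc_top$label <- reorder(wc_top$label, wc_top$count)

# Horizontal bars keep the word labels easy to read
p <- ggplot(wc_top, aes(label, count)) +
  geom_col(fill = "lightslategray") +
  coord_flip() +
  labs(x = NULL, y = "Count (as returned by the query)",
       title = "Top Words Alongside 'Dorian'")
```

Swapping `wc_top` for the full `wc` tibble (perhaps after keeping the top 20 rows) should give the complete picture.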
