How Heavily Armed Is Your State?

We will answer this question using R. First, load the dataset. It contains state names, census regions, census divisions, 2013 populations and the number of registered gun owners (also in 2013). This data comes to us from CBS News.

# Set working directory
setwd("~/Desktop")

# Load dataset
library(readxl)
Firearms_Data_KAD <- read_excel("~/Desktop/R/Per Capita Firearms Registered/Firearms_Data_KAD.xlsx")

head(Firearms_Data_KAD)
## # A tibble: 6 x 6
##   StateName  StateLower  CensusRegion Division  `2013Pop` `2013Registered…
##   <chr>      <chr>       <chr>        <chr>         <dbl>            <dbl>
## 1 Wyoming    wyoming     West         Mountain     582658           114052
## 2 District … district o… South        South At…    646449            42897
## 3 Arkansas   arkansas    South        West Sou…   2959373           123130
## 4 New Mexico new mexico  West         Mountain    2085287            84471
## 5 Virginia   virginia    South        South At…   8260405           248939
## 6 Idaho      idaho       West         Mountain    1612136            39019

Next, let’s load all of the packages we’ll use for data transformations and visualizations.

# Load needed packages
library(ggplot2)
library(RColorBrewer)
library(lattice)
library(ggthemes)
library(maps)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(sqldf)
## Loading required package: gsubfn
## Loading required package: proto
## Loading required package: RSQLite
library(choroplethr)
## Loading required package: acs
## Loading required package: stringr
## Loading required package: XML
## 
## Attaching package: 'acs'
## The following object is masked from 'package:dplyr':
## 
##     combine
## The following object is masked from 'package:base':
## 
##     apply

Let’s do some basic analysis and create a preliminary boxplot.

# Add percentage value
Firearms_Data_KAD$Pct <- Firearms_Data_KAD$`2013RegisteredOwners`/Firearms_Data_KAD$`2013Pop`

# Add value for ordering bar charts
Firearms_Data_KAD$Order <- 1-Firearms_Data_KAD$Pct

# Box and whisker plot using Lattice
boxplot <- bwplot(Pct ~ Division, data=Firearms_Data_KAD,
                  xlab="US Census Regional Divisions",
                  ylab="PerCapita Registered Gun Owners",
                  main="How Heavily Armed is Your Division? (2013)")
boxplot

Next, let’s use ggplot2 and a Wall Street Journal Theme to make this easier on the eyes.

# Box and whisker plot using ggplot2
boxplot2 <- ggplot(Firearms_Data_KAD, aes(Division, Pct)) +
  geom_boxplot() + ylab("PerCapita Registered Gun Owners") +
  xlab("US Census Regional Divisions") + 
  labs(title="Division Owners")

boxplot2 + coord_flip() + theme_wsj()

Now let’s step back a level of detail and see if there are similarities in Census Regions.

boxplot3 <- ggplot(Firearms_Data_KAD, aes(CensusRegion, Pct)) +
  geom_boxplot() + ylab("PerCapita Registered Gun Owners") +
  xlab("US Census Regions") + 
  labs(title="Regions")

boxplot3 + coord_flip() + theme_wsj()

Next, let’s dive into state-level detail to see the outliers.

# Bar plot by state

# Order states
# states <- c("Wyoming", "District of Columbia", "Arkansas", "New Mexico", "Virginia", "Idaho", "Alabama", "Nevada", "Alaska", "Louisiana", "Pennsylvania", "Maryland", "New Hampshire", "Georgia", "Indiana", "Kentucky", "Utah", "Texas", "Oklahoma", "Colorado", "South Carolina", "South Dakota", "Ohio", "Oregon", "Connecticut", "Montana", "Tennessee", "North Carolina", "Kansas", "Florida", "Minnesota", "North Dakota", "Arizona", "West Virginia", "Illinois", "Maine", "Washington", "Missouri", "Wisconsin", "California", "Nebraska", "Mississippi", "Vermont", "Iowa", "New Jersey", "Hawaii", "Massachusetts", "Michigan", "Delaware", "Rhode Island", "New York")
Firearms_Data_KAD$StateName <- factor(Firearms_Data_KAD$StateName, 
                                      levels = Firearms_Data_KAD$StateName[order(Firearms_Data_KAD$Pct)])
Firearms_Data_KAD$StateName
##  [1] Wyoming              District of Columbia Arkansas            
##  [4] New Mexico           Virginia             Idaho               
##  [7] Alabama              Nevada               Alaska              
## [10] Louisiana            Pennsylvania         Maryland            
## [13] New Hampshire        Georgia              Indiana             
## [16] Kentucky             Utah                 Texas               
## [19] Oklahoma             Colorado             South Carolina      
## [22] South Dakota         Ohio                 Oregon              
## [25] Connecticut          Montana              Tennessee           
## [28] North Carolina       Kansas               Florida             
## [31] Minnesota            North Dakota         Arizona             
## [34] West Virginia        Illinois             Maine               
## [37] Washington           Missouri             Wisconsin           
## [40] California           Nebraska             Mississippi         
## [43] Vermont              Iowa                 New Jersey          
## [46] Hawaii               Massachusetts        Michigan            
## [49] Delaware             Rhode Island         New York            
## 51 Levels: New York Rhode Island Delaware Michigan ... Wyoming
# Bar plot by state
barplot <- qplot(StateName, data=Firearms_Data_KAD, geom="bar",
            weight=Pct, ylab="PerCapita Registered Gun Owners") 
barplot <- barplot + theme(axis.text.x = element_text(angle=90, hjust=1))

barplot + theme_wsj() + coord_flip()

What does this look like on a map? Do we see any regional trends?

Firearms_Data_KAD$region <- Firearms_Data_KAD$StateLower
  
Firearms_Data_KAD$value <- Firearms_Data_KAD$Pct

map <- state_choropleth(
      Firearms_Data_KAD,
      legend = "Pct",
      title = "PerCapita Registered Gun Owners (2013)",
      num_colors = 1,
      zoom = NULL)

map + theme_map()

Rather than looking at distributions or state-specific data, what is the average gun ownership rate for each region?

# Average by Region

regiondata <- sqldf("select distinct Division, sum(`2013Pop`) as Pop, sum(`2013RegisteredOwners`) as Owners  from Firearms_Data_KAD group by Division")
head(regiondata)
##             Division      Pop Owners
## 1 East North Central 46662180 430545
## 2 East South Central 18716202 249693
## 3    Middle Atlantic 34469280 204755
## 4           Mountain 22881245 466634
## 5        New England 14618806 112220
## 6            Pacific 51373178 415969
regiondata$Pct <- regiondata$Owners/regiondata$Pop
head(regiondata)
##             Division      Pop Owners         Pct
## 1 East North Central 46662180 430545 0.009226851
## 2 East South Central 18716202 249693 0.013341008
## 3    Middle Atlantic 34469280 204755 0.005940217
## 4           Mountain 22881245 466634 0.020393733
## 5        New England 14618806 112220 0.007676414
## 6            Pacific 51373178 415969 0.008097007
regiondata$Division <- factor(regiondata$Division, 
                                      levels = regiondata$Division[order(regiondata$Pct)])
head(regiondata)
##             Division      Pop Owners         Pct
## 1 East North Central 46662180 430545 0.009226851
## 2 East South Central 18716202 249693 0.013341008
## 3    Middle Atlantic 34469280 204755 0.005940217
## 4           Mountain 22881245 466634 0.020393733
## 5        New England 14618806 112220 0.007676414
## 6            Pacific 51373178 415969 0.008097007
barplot2 <- qplot(Division, data=regiondata, geom="bar",
                  weight=Pct, ylab="PerCapita Registered Gun Owners")

barplot2 + theme_wsj() + coord_flip()

In summation, we find the following:
  • Wyoming has, by far, the highest percentage of residents who are registered gun owners
  • New York state has the lowest rate of gun ownership
  • In general, Western and Mountain states have the highest rate of registered gun ownership
  • However, don’t forget that the District of Columbia is #2

    [THE END]