The Most Lucrative Traffic Camera in D.C.

5 minute read

Introduction

I like to keep an eye on the DC open data portal as there are often interesting stories tucked away in municipal data. We’ll use R and various bits of the tidyverse to explore the January 2017 Moving Violation data. My goal is to find the traffic camera that produced the most tickets, and the camera with the highest total fines, using R. Chunks of R code and output are included throughout this post and the entire file is avaialble on my github. Download the code to perform your own analysis. If you find something interesting please let me know.

Moving Violation Data

The DC open data portal is a great resource for interesting municipal data. We’ll look at the January 2017 Moving Violation data downloaded on 1/13/2018 as a CSV file. The data are regularly revised so if you’re running the same analysis you could see different results. The data contain 158,414 rows with fields like violation type, address, fine amounts, location, date, and several geographic indiciators. Here are some interesting columns from the first 5 rows in the CSV file.

LOCATION TICKETTYPE FINEAMT TICKETISSUEDATE VIOLATIONDESC
600 BLK KENILWORTH AVE NE S/B PHOTO 100 2017-01-24 SPEED 11-15 MPH OVER THE SPEED LIMIT
2200 BLOCK K ST NW E/B PHOTO 200 2017-01-24 SPEED 21-25 MPH OVER THE SPEED LIMIT
DC295 NE .1MILE S/O EASTERN AVE S/B PHOTO 150 2017-01-24 SPEED 16-20 MPH OVER THE SPEED LIMIT
100 BLK IRVING ST NW W/B PHOTO 100 2017-01-24 SPEED 11-15 MPH OVER THE SPEED LIMIT
600 BLK KENILWORTH AVE NE S/B PHOTO 100 2017-01-24 SPEED 11-15 MPH OVER THE SPEED LIMIT

Prepare

This post assumes a level of fluency tidyverse style of R. For an introduction to the tidy way of thinking about code check out R for Data Science, and the documentation for dplyr, tibble, and magittr.

Explore

Totals and Sums

First we’ll look at the total number of tickets, and the total amount fined throughout January 2017, using the dplyr summarize verb and the n() function.

# violations is the moving violation data in a tibble 
# loaded with read_csv
violations %>% 
    summarize(count=n(), total=sum(FINEAMT))
count total
158414 18607200

D.C issued $18.6M in moving violation fines January 2017. That total is kind of incredible.

Ticket Type

First we’ll start with TICKETTYPE, using dplyr (and group_by) to sum the fine amounts and count the total number of tickets per ticket type.

violations %>% 
  group_by(TICKETTYPE) %>% 
  summarize(count=n(), total=sum(FINEAMT))
TICKETTYPE count total
MOVING 2466 237130
NO TICKET TYPE 48 4770
PHOTO 155900 18365300

The vasty majority (155,900) of moving violation tickets issued are photo tickets so we’ll focus on those from now on.

photos <- violations %>% 
    filter(TICKETTYPE=="PHOTO")

Fine Amount

Another interesting looking field is the fine amount (FINEAMT) We’ll count the different amounts and the total value of fines for each ticket amount.

photos %>% 
  group_by(FINEAMT) %>% 
  summarize(count=n(),total=sum(FINEAMT))
FINEAMT count total
50 1090 54500
100 106248 10624800
150 42848 6427200
200 4554 910800
300 1160 348000

The distribution is fairly regular, with $100 as the most common ticketed amount. As a group those $100 tickets account for $10.6 million dollars in fines.

Location

Finally lets look at counts of tickets by location. We’ve got several options for location - Maryland State Pane Coordinates, a block address with direction, and a numerical ID by address. Lets take a look at the LOCATION and ADDRESS_ID fields first.

photos %>% 
    group_by(LOCATION) %>%
    summarise(count=n(),total=sum(FINEAMT)) %>%
    arrange(desc(count)) %>% head(10)
LOCATION ADDRESS_ID count total
600 BLK KENILWORTH AVE NE S/B 813891 28642 3434500
DC295 SW .7 MILES S/O EXIT 1 S/B 810381 13590 1602200
2200 BLK K ST NW W/B 805065 13266 1561800
1400 BLK S CAPITOL ST SE S/B 277954 10338 1143500
2200 BLOCK K ST NW E/B 805065 9464 1118000
1400 BLK S CAPITOL ST SE N/B 277954 7020 785300
2200 BLK SOUTH DAKOTA AVE NE SE/B 286707 6682 732400
600 BLK NEW YORK AVENUE NE W/B 288753 6562 745800
3RD ST TUNNEL NW N/B BY MA AVE 814983 4054 452400
4600 BLK MASSACHUSETTS AVE NW NW/B 301549 3580 374600

Simply grouping byLOCATION doesn’t give us the whole picture. The bold rows show that ADDRESS_ID could also be called intersection. Lets group by ADDRESS_ID and see how the ranking looks.

photos %>% 
    group_by(ADDRESS_ID) %>% 
    summarise(addresses=toString(sort(unique(LOCATION))), count=n(),total=sum(FINEAMT)) %>% 
    arrange(desc(count)) %>% 
    select(addresses, count, total) %>% 
    head(10)
addresses count total
600 BLK KENILWORTH AVE NE S/B 28642 3434500
2200 BLK K ST NW W/B, 2200 BLOCK K ST NW E/B 22730 2679800
1400 BLK S CAPITOL ST SE N/B, 1400 BLK S CAPITOL ST SE S/B 17358 1928800
DC295 SW .7 MILES S/O EXIT 1 S/B 13590 1602200
600 BLK NEW YORK AVE NE E/B, 600 BLK NEW YORK AVENUE NE W/B 9484 1073600
2200 BLK SOUTH DAKOTA AVE NE SE/B 6682 732400
3RD ST TUNNEL NW N/B BY MA AVE 4054 452400
4600 BLK MASSACHUSETTS AVE NW NW/B 3580 374600
DC295 NE .1MILE S/O EASTERN AVE N/B, DC295 NE .1MILE S/O EASTERN AVE S/B 3116 349900
3RD ST TUNNEL NW S/B BY 3RD ST 3008 348500

The camera at the southbound 600 block of Kenilworth Ave NE is located right off 295 where the limit drops from 55 to 25, and issued the most tickets and total fines. The combined east and west bound cameras at the 2200 block of K St NW accounted for the second most. The K street cameras are cleverly placed in an uphill section of a 25MPH zone on either end of an underpass.

Conclusion and Future Directions

We’ve identified the most lucrative camera for the District - 600 Kenilworth Ave NE, which issued 28642 photo tickets worth almost $3.5M in January 2017. R made it fairly easy to perform this analysis but we’ve just scratched the surface of whats possible. There are still several interesting directions with these data. For example: look at non-photo moving violations, or weekday vs weekend activity, or do some cameras issue a particular type of ticket -red light vs speeding? Or you could combine the photo ticket data with accident data to find similar intersections with dissimilar accidents or ticket incidents. If you find anything interesting here, or have other questions about hacking on municipal data with R, I’d love to hear about it. Thanks for reading.

Categories:

Updated: