The Most Lucrative Traffic Camera in D.C.
Introduction
I like to keep an eye on the DC open data portal as there are often interesting stories tucked away in municipal data. We’ll use R and various bits of the tidyverse to explore the January 2017 Moving Violation data. My goal is to find the traffic camera that produced the most tickets, and the camera with the highest total fines, using R. Chunks of R code and output are included throughout this post and the entire file is avaialble on my github. Download the code to perform your own analysis. If you find something interesting please let me know.
Moving Violation Data
The DC open data portal is a great resource for interesting municipal data. We’ll look at the January 2017 Moving Violation data downloaded on 1/13/2018 as a CSV file. The data are regularly revised so if you’re running the same analysis you could see different results. The data contain 158,414 rows with fields like violation type, address, fine amounts, location, date, and several geographic indiciators. Here are some interesting columns from the first 5 rows in the CSV file.
LOCATION | TICKETTYPE | FINEAMT | TICKETISSUEDATE | VIOLATIONDESC |
---|---|---|---|---|
600 BLK KENILWORTH AVE NE S/B | PHOTO | 100 | 2017-01-24 | SPEED 11-15 MPH OVER THE SPEED LIMIT |
2200 BLOCK K ST NW E/B | PHOTO | 200 | 2017-01-24 | SPEED 21-25 MPH OVER THE SPEED LIMIT |
DC295 NE .1MILE S/O EASTERN AVE S/B | PHOTO | 150 | 2017-01-24 | SPEED 16-20 MPH OVER THE SPEED LIMIT |
100 BLK IRVING ST NW W/B | PHOTO | 100 | 2017-01-24 | SPEED 11-15 MPH OVER THE SPEED LIMIT |
600 BLK KENILWORTH AVE NE S/B | PHOTO | 100 | 2017-01-24 | SPEED 11-15 MPH OVER THE SPEED LIMIT |
Prepare
This post assumes a level of fluency tidyverse style of R. For an introduction to the tidy way of thinking about code check out R for Data Science, and the documentation for dplyr, tibble, and magittr.
Explore
Totals and Sums
First we’ll look at the total number of tickets, and the total amount fined throughout January 2017, using the dplyr summarize verb and the n() function.
count | total |
---|---|
158414 | 18607200 |
D.C issued $18.6M in moving violation fines January 2017. That total
is kind of incredible.
Ticket Type
First we’ll start with TICKETTYPE, using dplyr (and group_by) to sum the fine amounts and count the total number of tickets per ticket type.
TICKETTYPE | count | total |
---|---|---|
MOVING | 2466 | 237130 |
NO TICKET TYPE | 48 | 4770 |
PHOTO | 155900 | 18365300 |
The vasty majority (155,900) of moving violation tickets issued are photo tickets so we’ll focus on those from now on.
Fine Amount
Another interesting looking field is the fine amount (FINEAMT
) We’ll count the different amounts and the total value of fines for each ticket amount.
FINEAMT | count | total |
---|---|---|
50 | 1090 | 54500 |
100 | 106248 | 10624800 |
150 | 42848 | 6427200 |
200 | 4554 | 910800 |
300 | 1160 | 348000 |
The distribution is fairly regular, with $100 as the most common ticketed amount. As a group those $100 tickets account for $10.6 million dollars in fines.
Location
Finally lets look at counts of tickets by location. We’ve got several options for location - Maryland State Pane Coordinates, a block address with direction, and a numerical ID by address. Lets take a look at the LOCATION
and ADDRESS_ID
fields first.
LOCATION | ADDRESS_ID | count | total |
---|---|---|---|
600 BLK KENILWORTH AVE NE S/B | 813891 | 28642 | 3434500 |
DC295 SW .7 MILES S/O EXIT 1 S/B | 810381 | 13590 | 1602200 |
2200 BLK K ST NW W/B | 805065 | 13266 | 1561800 |
1400 BLK S CAPITOL ST SE S/B | 277954 | 10338 | 1143500 |
2200 BLOCK K ST NW E/B | 805065 | 9464 | 1118000 |
1400 BLK S CAPITOL ST SE N/B | 277954 | 7020 | 785300 |
2200 BLK SOUTH DAKOTA AVE NE SE/B | 286707 | 6682 | 732400 |
600 BLK NEW YORK AVENUE NE W/B | 288753 | 6562 | 745800 |
3RD ST TUNNEL NW N/B BY MA AVE | 814983 | 4054 | 452400 |
4600 BLK MASSACHUSETTS AVE NW NW/B | 301549 | 3580 | 374600 |
Simply grouping byLOCATION
doesn’t give us the whole picture. The bold rows show that ADDRESS_ID
could also be called intersection. Lets group by ADDRESS_ID
and see how the ranking looks.
addresses | count | total |
---|---|---|
600 BLK KENILWORTH AVE NE S/B | 28642 | 3434500 |
2200 BLK K ST NW W/B, 2200 BLOCK K ST NW E/B | 22730 | 2679800 |
1400 BLK S CAPITOL ST SE N/B, 1400 BLK S CAPITOL ST SE S/B | 17358 | 1928800 |
DC295 SW .7 MILES S/O EXIT 1 S/B | 13590 | 1602200 |
600 BLK NEW YORK AVE NE E/B, 600 BLK NEW YORK AVENUE NE W/B | 9484 | 1073600 |
2200 BLK SOUTH DAKOTA AVE NE SE/B | 6682 | 732400 |
3RD ST TUNNEL NW N/B BY MA AVE | 4054 | 452400 |
4600 BLK MASSACHUSETTS AVE NW NW/B | 3580 | 374600 |
DC295 NE .1MILE S/O EASTERN AVE N/B, DC295 NE .1MILE S/O EASTERN AVE S/B | 3116 | 349900 |
3RD ST TUNNEL NW S/B BY 3RD ST | 3008 | 348500 |
The camera at the southbound 600 block of Kenilworth Ave NE is located right off 295 where the limit drops from 55 to 25, and issued the most tickets and total fines. The combined east and west bound cameras at the 2200 block of K St NW accounted for the second most. The K street cameras are cleverly placed in an uphill section of a 25MPH zone on either end of an underpass.
Conclusion and Future Directions
We’ve identified the most lucrative camera for the District - 600 Kenilworth Ave NE, which issued 28642 photo tickets worth almost $3.5M in January 2017. R made it fairly easy to perform this analysis but we’ve just scratched the surface of whats possible. There are still several interesting directions with these data. For example: look at non-photo moving violations, or weekday vs weekend activity, or do some cameras issue a particular type of ticket -red light vs speeding? Or you could combine the photo ticket data with accident data to find similar intersections with dissimilar accidents or ticket incidents. If you find anything interesting here, or have other questions about hacking on municipal data with R, I’d love to hear about it. Thanks for reading.