Preparing for a Data Science Career in Crime Data Analysis with a Master’s Degree

New York City was a dangerous place in 1990. There were 2,605 murders in the city that year, more than 5,000 rapes, and over 200,000 reported incidents of violent crime overall.

Much of that crime happened in the subway.

A Transit cop named Jack Maple slapped a map of the subway system up on his office wall and started sticking pins in the locations where robberies happened. He noticed clusters in the incidents, places that were more dangerous than others. He began dispatching his units there proactively before any crimes were even reported. Robberies in those places dropped, only to rise elsewhere, a story told by his pins. Maple continued to use the data to predict places where crimes would occur, calling the map on his wall a “chart of the future.”

Perhaps surprisingly, no one had attempted such a metrics-based approach to law enforcement before. Other officers made fun of Maple’s charts, calling them “wallpaper.” But the laughter didn’t last long.

Crime in the subways dropped 27 percent during Maple’s tenure with the Transit police, and when Bill Bratton was appointed commissioner of the NYPD, he took Maple to One Police Plaza to implement the “charts of the future” department-wide. Maple called it “CompStat.”

By the end of the decade, violent crime in New York had declined by more than half and CompStat was in the process of being implemented by almost every major police agency across the country.

Data now drives many policing decisions and is increasingly incorporated into daily operations in departments around the world:

  • Census data is analyzed to anticipate social pressures and trends, which in turn inform patrol sectors and scheduling
  • Crime rates are tracked and compared against enforcement efforts to find the most efficient policing responses
  • Incident responses are analyzed to inform decisions on the most effective methods of crowd control

CompStat Becomes the New Face of Policing

In an age when every business decision is calculated on a spreadsheet, it can seem surprising that police agencies took so long to apply serious data analysis to the business of preventing crime. But when Jack Maple joined the force, policing was more about reacting than acting: officers answered calls for help. Most didn’t consider that there might be a way to prevent those calls from occurring in the first place.

It wasn’t that crime data wasn’t compiled and reported. The FBI has compiled national crime statistics since 1930 through its Uniform Crime Reporting program. The innovation of CompStat was in using the numbers on a weekly or even daily basis, deploying officers according to where the statistics suggested crime was most likely to occur.
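The idea of acting on weekly numbers can be illustrated with a small sketch. The Python below is only an illustration and not CompStat's actual method: it bins incidents into grid cells, much as Maple's pins clustered on his map, and ranks the cells by average incidents per week. The incident records, the grid size, and every function name are hypothetical.

```python
from collections import Counter

# Hypothetical incident records: (week number, x, y) in arbitrary map units.
INCIDENTS = [
    (1, 0.2, 0.3), (1, 0.4, 0.1), (1, 2.5, 2.7),
    (2, 0.1, 0.9), (2, 0.7, 0.6), (2, 2.1, 2.2),
    (3, 0.3, 0.5), (3, 1.5, 0.2),
]


def cell_of(x, y, cell_size=1.0):
    """Map a coordinate to the grid cell that contains it."""
    return (int(x // cell_size), int(y // cell_size))


def weekly_rates(incidents, cell_size=1.0):
    """Average number of incidents per week for each grid cell."""
    weeks = {week for week, _, _ in incidents}
    counts = Counter(cell_of(x, y, cell_size) for _, x, y in incidents)
    return {cell: n / len(weeks) for cell, n in counts.items()}


def top_hotspots(incidents, n=2, cell_size=1.0):
    """Return the n cells with the highest weekly incident rate."""
    rates = weekly_rates(incidents, cell_size)
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)[:n]


if __name__ == "__main__":
    for cell, rate in top_hotspots(INCIDENTS):
        print(f"cell {cell}: {rate:.2f} incidents per week")
```

A real system would work from geocoded reports and would need to account for displacement, since Maple's own pins showed robberies dropping in one place only to rise in another.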

The process has moved past push-pins in a map and into heavy-duty databases where the data that drives the success of CompStat is stored and organized. The data has branched out far past just crime reports, however. Today, CompStat systems may interface with:

  • Geographic Information Systems, revealing underlying terrain and building types that would impact enforcement efforts
  • Other city-services databases, such as utility records or housing data
  • Jail and prison parole records, predicting recidivism rates for repeat offenders
  • Public reporting systems, which produce non-classified versions of crime numbers for public access over the Internet

In New York, the system now ties into animal control databases and looks for instances of animal cruelty, which have been found to correlate with domestic violence incidents.

People Aren’t Numbers: The Challenge of Integrating Data with Community Policing

CompStat and similar data-driven policing efforts drew pushback from many community groups, which claimed that the focus on numbers effectively dehumanized citizens and criminals alike. In New York, CompStat was tied to Commissioner William Bratton’s “Broken Windows” initiative, which stressed arrest and prosecution for minor crimes that might previously have been overlooked, in an effort to address larger crime trends in a given area. Together, the two resulted in a lot of heavy-handed enforcement in poor and minority communities.

There have also been recurring claims that the reduction in crime numbers is less due to effective law enforcement and more likely related to playing with the numbers to show the desired results. A crime previously recorded as a felony, for example, might be altered by an officer or commander to be reported as a misdemeanor in an effort to make felony numbers appear to drop in their sector.

In Milwaukee, as reported in a 2012 article in the Milwaukee Journal Sentinel, hundreds of assault cases were miscategorized, making it appear that violent crime in the city had dropped by 2.3 percent from 2010 to 2011. In fact, violent crime had increased by more than one percent during that period.

But data scientists have an important role in addressing these criticisms. By designing algorithms that are difficult to game, they reduce the likelihood of such manipulation occurring. And new efforts by police agencies to track their use-of-force patterns against certain demographics can help officers become more sensitive to their own habits and patterns.
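One simple check of this kind can be sketched as follows. This is a hypothetical illustration, not a method described by any agency: it compares the share of incidents recorded as felonies in each sector across two periods and flags sectors where that share fell sharply. The sector names, counts, and threshold are invented for the example.

```python
def felony_share(counts):
    """Fraction of recorded incidents that were classified as felonies."""
    total = counts["felony"] + counts["misdemeanor"]
    return counts["felony"] / total if total else 0.0


def flag_classification_shift(before, after, threshold=0.10):
    """Return sectors whose felony share fell by more than the threshold."""
    flagged = []
    for sector in sorted(before):
        if sector not in after:
            continue  # no later data to compare against
        drop = felony_share(before[sector]) - felony_share(after[sector])
        if drop > threshold:
            flagged.append(sector)
    return flagged


# Hypothetical counts per sector for two consecutive reporting periods.
BEFORE = {
    "A": {"felony": 40, "misdemeanor": 60},
    "B": {"felony": 30, "misdemeanor": 70},
}
AFTER = {
    "A": {"felony": 38, "misdemeanor": 62},
    "B": {"felony": 15, "misdemeanor": 85},
}

if __name__ == "__main__":
    print(flag_classification_shift(BEFORE, AFTER))
```

A flag here is a prompt for human review of the underlying reports rather than evidence of manipulation, since a genuine change in the mix of offenses would produce the same signal.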

The philosophy that came with the implementation of CompStat was that if a metric can be monitored, it can be altered; this holds as true for police behavior as criminal behavior.

Beyond Crime: Community Policing and Service with Big Data

Police do far more than just respond to crime. Their role, to protect and to serve, extends into various types of community service. Data science has found traction in supporting those roles as well.

In Los Angeles, the LAPD has designed a system to help predict earthquake aftershocks and impacts on various neighborhoods in the city. Although accurately predicting when quakes will occur is still beyond the reach of science, aftershocks for a known event can be anticipated with more accuracy. The LAPD uses the information to prepare post-quake responses and determine where to send additional units for assistance even when communications systems may be down.

In Seattle, police have been working with community social networking site Nextdoor to monitor reports of suspicious activity and citizen complaints. Although the interactions the agency has with citizens via Nextdoor (or other social media platforms, for that matter) amount to good old-fashioned beat-cop information gathering and community policing, SPD used analytics to determine which social media sites it should be listening to in the first place. Sites without a critical mass of the department’s target audience, it decided, would be a poor use of monitoring resources.

Big Data is also likely to have a hand in cataloging, tracking, and securing the proliferation of records coming from new police body-camera and audio systems. Devising secure methods to store this information while allowing it to be quickly referenced in order to comply with public records laws and inform decisions on future enforcement actions will be a significant challenge for data scientists working in the area of law enforcement.
