Building a National Database on Fatal and Non-Fatal Police Shootings in the US, 2015

Yuchen Hou, Criminal Justice
NML Award: The Social Justice Award (April 2019)

Project Description

No single existing database allows researchers to determine how often, and under what circumstances, civilians die from or survive police shootings in the US. To help fill this gap, Yuchen Hou’s dissertation research uses open sources to build a crowdsourced national database on fatal and non-fatal police shootings (FNFPS) in the US in 2015.

The project follows a three-step approach to identify the counts and attributes of FNFPS as completely as possible: (1) The universe of FNFPS that occurred in 2015 will be extracted from the Gun Violence Archive database. (2) Additional open source documents for validating each incident will be located through Google searches for incident information. (3) A coding team of 7 undergraduates from John Jay College will examine and code the relevant attributes of each FNFPS incident based on at least three different open source materials.
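The source-count requirement in step (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Incident` record, its field names, and the rule that "different" sources means different website hosts are hypothetical, and the project's actual codebook may define them otherwise.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse

MIN_SOURCES = 3  # "at least three different open source materials"


@dataclass
class Incident:
    """Hypothetical record for one shooting incident (field names are assumptions)."""
    incident_id: str
    date: str
    source_urls: list = field(default_factory=list)


def distinct_outlets(incident):
    """Return the set of distinct website hosts among the incident's sources.

    Assumption: two documents count as different sources only when they
    come from different hosts; a leading "www." is ignored.
    """
    outlets = set()
    for url in incident.source_urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if host:
            outlets.add(host)
    return outlets


def ready_for_coding(incident):
    """True when the incident is backed by at least MIN_SOURCES distinct outlets."""
    return len(distinct_outlets(incident)) >= MIN_SOURCES
```

For example, an incident whose three links come from only two hosts would be sent back for another round of searching, while one with links from three hosts would pass to the coders.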

By capturing incident-, context-, and agency-related information, the database allows us to understand how selected features of situations, contexts, and police agencies interact with key parts of police shootings: officers, civilians, and places. Once the database is completed, it will support multilevel analyses that estimate how factors at the individual, situational, contextual, and organizational levels may contribute differently to fatal and non-fatal police shootings.

To make findings available to broader audiences, this research aims to create a CUNY Academic Commons website that displays a series of interactive maps visualizing the geographical distribution of FNFPS at national, regional, state, and city levels, and presents results from quantitative analyses. The web-embedded interactive maps will be generated using CARTO or Tableau online software, which are free of charge for students. In consultation with web developers, Hou will embed a searchable database on the website.

This website will not only allow for information audit and exchange to improve data accuracy, but it will also provide communities and social organizations with opportunities for external review of police legitimacy.

This research is one of the first attempts to apply emerging open source research methods to the study of police-citizen encounters, which are most commonly measured by self-reports and official records. The creation of this crowdsourced database can help us understand how selected features of situations, contexts, and police agencies interact with key parts of police shootings: officers, civilians, and places. This open data collection effort also offers a guideline that federal and local police agencies can follow in devising data reporting and collection systems on police use of deadly force. Findings will help police organizations, communities, and other stakeholders rethink this problem and collaboratively develop evidence-based reforms of policing policies and practices.

Work Completed to Date

In September 2018, Hou, in collaboration with one of his dissertation committee members, formed a faculty-student research team to build the crowdsourced national database (CND) on FNFPS. Hou has recruited and trained 7 students from John Jay College to use the SPSS statistical program to code incident-level data. Data collection began in late September 2018. Currently, the research team has completed the incident-level coding of police shootings that occurred from January to May 2015.

Preliminary Descriptive Results of January Cases

The following results were presented at the American Society of Criminology Annual Meeting. Descriptive analyses of the January cases show that the frequencies of FNFPS vary across states, racial/ethnic groups, and levels of threat posed by the civilians encountered.


The main issue in our data collection is missing data. Missing data may result from inconsistent coding across coders, but they are largely attributable to the limited information that open sources provide. For example, information on the involved officer is rarely reported by the media. This prevents us from testing officer-related hypotheses, such as whether there is “One Trigger Finger for Whites and Another for Blacks” among officers with different demographic and educational backgrounds.

The most common technique for handling missing data is listwise deletion, which has been criticized for discarding available information and for producing biased models when “missingness” is not completely at random (van Buuren, 2012). Because fatal outcomes are more newsworthy, more detailed information, such as civilian race, may be available for fatal than for non-fatal shootings. Thus, civilian race and other attributes of shooting incidents could be missing at random if the probability that information is missing depends only on the completely observed shooting outcomes. This study attempts to address the missing-at-random problem, which may be more salient in incident-level predictors (e.g., civilian race, weapon status, assaultive behavior, signs of mental illness), by fitting conditional multiple imputation models. Hou will compare the imputed data with the complete data, together with results from qualitative content analysis, in order to understand patterns of “missingness” in the individual and situational information on FNFPS described in open sources.
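The contrast between listwise deletion and imputation conditional on the shooting outcome can be illustrated with a small Python sketch. It is not the project's imputation model: the counts are made up, a single binary predictor (`armed`) stands in for the incident-level variables, and the imputation step is a simple Beta-Bernoulli draw within each outcome group, pooled with Rubin's rules. The toy data assume that, within each outcome, unreported cases resemble reported ones, so the overall armed share is 0.55 by construction.

```python
import random
from statistics import mean, variance


def make_toy_data():
    """Deterministic toy records (fatal, armed); armed is None when unreported.

    Made-up counts: 10% of fatal cases and 60% of non-fatal cases lack the
    'armed' attribute, mimicking thinner coverage of non-fatal shootings.
    """
    rows = [(True, True)] * 630 + [(True, False)] * 270 + [(True, None)] * 100
    rows += [(False, True)] * 160 + [(False, False)] * 240 + [(False, None)] * 600
    return rows


def listwise_estimate(rows):
    """Share armed among complete cases only (listwise deletion)."""
    observed = [armed for _, armed in rows if armed is not None]
    return sum(observed) / len(observed)


def impute_once(rows, rng):
    """Fill missing 'armed' values conditional on the observed outcome."""
    probs = {}
    for outcome in (True, False):
        obs = [a for fatal, a in rows if fatal == outcome and a is not None]
        k = sum(obs)
        # Draw the group's armed rate from its Beta posterior (uniform prior)
        # so that uncertainty about the rate carries into the imputations.
        probs[outcome] = rng.betavariate(1 + k, 1 + len(obs) - k)
    completed = []
    for fatal, armed in rows:
        if armed is None:
            armed = rng.random() < probs[fatal]
        completed.append((fatal, armed))
    return completed


def multiple_imputation(rows, m=20, seed=0):
    """Pool the armed share over m imputed datasets using Rubin's rules."""
    rng = random.Random(seed)
    n = len(rows)
    estimates, within = [], []
    for _ in range(m):
        completed = impute_once(rows, rng)
        q = sum(armed for _, armed in completed) / n
        estimates.append(q)
        within.append(q * (1 - q) / n)  # sampling variance of a proportion
    pooled = mean(estimates)
    total_var = mean(within) + (1 + 1 / m) * variance(estimates)
    return pooled, total_var
```

In this toy setup the complete cases over-represent fatal shootings, so `listwise_estimate` returns 790/1300 (about 0.61) rather than 0.55, while the pooled estimate from `multiple_imputation` should land close to 0.55. A real analysis would impute several predictors jointly and condition on more than the outcome.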