The rcrimeanalysis package is designed for the analysis of crime incident data from records management systems used by policing and security agencies throughout the United States. The package was contributed to provide a robust open source platform with crime analysis functionality, since wide variations exist in the techniques that departments use, the levels of sophistication that police managers possess, and the frequency with which agencies engage in crime mapping.
Crime analysis as an operation involves the following:
plotting incident locations
differentiating incidents by crime type and adding topographic information for spatial context
identifying significant geographic relationships in the occurrence of criminal activity
discovering patterns within temporal data of crimes
understanding crime distributions and their contributing factors
Crime mapping is essentially exploratory data analysis that helps to uncover distributions and the distances between observations, separate small- and large-scale variation, identify spatial patterns, and generate hypotheses that may explain the observed patterns. Discovered relationships or linkages among crime incidents are the intelligence that facilitates developing strategies to deal with a given problem, making better decisions, targeting and deploying resources effectively, formulating a resolution of the problem, and sustaining efforts to ensure a lasting solution.
This document introduces you to the set of tools in the rcrimeanalysis package, and shows you how to apply them to criminal record data.
The rcrimeanalysis package can be installed from CRAN.
The development version is also available from the author’s GitHub repo.
Installation is as follows:
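A minimal sketch of both installation routes. The CRAN call follows directly from the text. The GitHub call assumes the remotes package, and `<github-user>` is a placeholder for the author's account, which is not given here.

```r
# Install the released version from CRAN.
install.packages("rcrimeanalysis")

# Development version (assumes the 'remotes' package is used for GitHub installs).
# "<github-user>" is a placeholder; substitute the author's GitHub account.
# install.packages("remotes")
# remotes::install_github("<github-user>/rcrimeanalysis")
```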
The rcrimeanalysis package contains a sample dataset, crimes, which contains information on 25,000 crime incidents in Chicago, IL from 2017-2019 [1]. This data was chosen because it is similar in structure to other police records management system (RMS) data in the United States. There are 22 different data elements for each crime incident.
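The sample data can be loaded and inspected roughly as follows. This is a sketch that assumes the package is installed; the comments only restate what the text says about the dataset.

```r
library(rcrimeanalysis)

# Load the bundled sample dataset.
data("crimes")

# Check its size: the text describes 25,000 incidents with 22 data elements.
dim(crimes)

# List the data elements and preview the first few incidents.
names(crimes)
head(crimes)
```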
The geocode_address() function leverages the Google Maps API for batch geocoding of physical addresses. See the ggmap package for instructions on how to register Google Cloud credentials.
Geocoding is an essential task in crime analysis and policing. It is commonly needed because location data is collected at the time of report (call for service or statement/affidavit) as an address given by the caller or person involved. Once the address is transformed into a geospatial coordinate (latitude, longitude), analysis can take place.
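A hedged sketch of a batch geocoding call. It assumes a valid Google Maps API key stored in an environment variable (the name GOOGLE_MAPS_API_KEY is an assumption), that geocode_address() accepts a character vector of addresses as its first argument, and the two addresses are illustrative only.

```r
library(ggmap)
library(rcrimeanalysis)

# Register the Google Maps API key with ggmap.
# The environment variable name is an assumption; use your own key storage.
register_google(key = Sys.getenv("GOOGLE_MAPS_API_KEY"))

# Illustrative addresses; replace with addresses from your own records.
addresses <- c("121 N LaSalle St, Chicago, IL",
               "3510 S Michigan Ave, Chicago, IL")

# Batch geocode the addresses and inspect the returned coordinates.
coords <- geocode_address(addresses)
coords
```

The call needs network access and a billing-enabled key, so it is worth geocoding a handful of addresses first to confirm the credentials work before submitting a large batch.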
The kde_map() function performs a key crime analysis task: kernel density estimation for crime heat mapping. The function computes a kernel density estimate of the input crime data and returns an interactive leaflet widget of the incidents. The following computes a heat map of narcotics incidents with and without the incidents plotted:
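A sketch of those two calls. The pts argument is described in the text; taking the narcotics subset from the primary_type column of crimes is an assumption about how the subset was built.

```r
library(rcrimeanalysis)
data("crimes")

# Keep only narcotics incidents; primary_type holds the crime category.
narcotics <- subset(crimes, primary_type == "NARCOTICS")

# Heat map with the individual incidents plotted as points.
kde_map(narcotics, pts = TRUE)

# Heat map without the incident points.
kde_map(narcotics, pts = FALSE)
```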
The above maps illustrate hot spots and concentrations of criminal activity by place. The pts parameter was included to facilitate visualization of the underlying data. The pop-up boxes are populated from the data automatically, provided the data is structured like the example dataset.
The id_repeat() function identifies crime incidents that occur at the same location. Identifying repeat crimes can be vital to linking crime incidents and assessing places. The following code detects 168 repeat crime series. The second detected repeat series is printed, showing that 4 different possession incidents occurred at the same location (recorded as sidewalk or street) throughout 2019. This information could be used to investigate whether narcotics are being dealt at this location, or whether there is another explanation for these incidents.
narco_repeats <- id_repeat(narcotics)
narco_repeats[2]
#> [[1]]
#> # A tibble: 4 x 22
#>       id case_number date  block iucr  primary_type description location_descri~
#>    <dbl> <chr>       <chr> <chr> <chr> <chr>        <chr>       <chr>
#> 1 1.16e7 JC195914    3/22~ 0000~ 2024  NARCOTICS    POSS: HERO~ SIDEWALK
#> 2 1.17e7 JC305783    6/14~ 0000~ 2024  NARCOTICS    POSS: HERO~ STREET
#> 3 1.14e7 JB384254    8/8/~ 0000~ 2024  NARCOTICS    POSS: HERO~ SIDEWALK
#> 4 1.16e7 JC143516    2/7/~ 0000~ 2024  NARCOTICS    POSS: HERO~ SIDEWALK
#> # ... with 14 more variables: arrest <lgl>, domestic <lgl>, beat <dbl>,
#> #   district <dbl>, ward <dbl>, community_area <dbl>, fbi_code <chr>,
#> #   x_coordinate <dbl>, y_coordinate <dbl>, year <dbl>, updated_on <chr>,
#> #   latitude <dbl>, longitude <dbl>, location <chr>
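The idea behind repeat detection can be illustrated in base R: group incidents by coordinate pair and keep the groups with more than one incident. This is a sketch on made-up data, not the implementation used by id_repeat(), and matching on exact coordinates is an assumption.

```r
# Toy incidents (hypothetical data): ids 1, 3 and 4 share one coordinate pair.
incidents <- data.frame(
  id        = 1:5,
  latitude  = c(41.8800, 41.9000, 41.8800, 41.8800, 41.7500),
  longitude = c(-87.6300, -87.6500, -87.6300, -87.6300, -87.6000)
)

# Group the rows by exact coordinate pair.
location_key <- paste(incidents$latitude, incidents$longitude)
by_location  <- split(incidents, location_key)

# Keep only locations with more than one incident.
repeats <- Filter(function(group) nrow(group) > 1, by_location)
names(repeats) <- NULL

length(repeats)   # number of repeat series
repeats[[1]]      # incidents in the first series
```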
The ts_daily_decomp() and ts_month_decomp() functions perform time series decomposition of crime incident data for pattern detection. Each function is optimized for a certain frequency of data in the time series (daily or monthly). The decomposition functions transform the raw crime data into a time series, perform locally weighted regression for smoothing, and plot the decomposed series as seasonal, trend, and irregular components. An example of the ts_month_decomp() function is given below for the narcotics crimes:
The different components can be useful for both the development and evaluation of strategic action. For example, seasonality can inform administrative planning through personnel and resource allocation. Changepoints in the trend component can be used to identify changes in crime patterns. For example, if a policy change took place in mid-2017, it may have stimulated narcotics activity or increased the detection and closure rates of narcotics incidents.
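The steps named above (build a time series, smooth with locally weighted regression, plot the components) can be sketched in base R with stats::stl(). This uses simulated incident dates and is not the package's own code; the seasonal pattern and the choice of s.window are assumptions.

```r
set.seed(1)

# Simulated incident dates (hypothetical data) over three years, with more
# incidents in summer months to create a seasonal pattern.
days    <- seq(as.Date("2017-01-01"), as.Date("2019-12-31"), by = "day")
month_n <- as.integer(format(days, "%m"))
weights <- 1 + 0.5 * sin(2 * pi * (month_n - 4) / 12)
dates   <- sample(days, size = 5000, replace = TRUE, prob = weights)

# Step 1: count incidents per month and convert the counts to a time series.
month_levels   <- format(seq(as.Date("2017-01-01"), by = "month",
                             length.out = 36), "%Y-%m")
monthly_counts <- table(factor(format(dates, "%Y-%m"), levels = month_levels))
crime_ts <- ts(as.numeric(monthly_counts), start = c(2017, 1), frequency = 12)

# Step 2: decompose with STL, which uses loess (locally weighted regression).
fit <- stl(crime_ts, s.window = "periodic")

# Step 3: plot the data with its seasonal, trend and remainder components.
plot(fit)
```

The remainder panel corresponds to the irregular component; large spikes there mark months that neither trend nor seasonality explains.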
Crime forecasting is not yet widely practiced by police agencies. Predictive modeling is the process of developing a model and quantifying how accurately it predicts future, not-yet-observed data. This can be useful in a forward-looking sense to gain an understanding of what future criminal activity could look like. The ts_forecast() function uses the crime data to predict the future crime trend and daily frequency, with different confidence levels, for the next 365 days.
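To make the idea of a 365-day forecast with several confidence levels concrete, here is a base R sketch on simulated daily counts. The ARIMA(0,1,1) model is a stand-in chosen for simplicity; the model used inside ts_forecast() is not described here, so nothing below should be read as its implementation.

```r
set.seed(42)

# Simulated daily incident counts for three years (hypothetical data).
n_days <- 1095
daily_counts <- rpois(n_days,
                      lambda = 20 + 5 * sin(2 * pi * seq_len(n_days) / 365))
daily_ts <- ts(daily_counts)

# Fit a simple ARIMA(0,1,1) model, which is equivalent to simple
# exponential smoothing.
fit <- arima(daily_ts, order = c(0, 1, 1))

# Forecast the next 365 days: point predictions and standard errors.
fc   <- predict(fit, n.ahead = 365)
pred <- as.numeric(fc$pred)
se   <- as.numeric(fc$se)

# Build prediction intervals at a given confidence level.
interval <- function(level) {
  z <- qnorm(1 - (1 - level) / 2)
  data.frame(lower = pred - z * se, upper = pred + z * se)
}
int80 <- interval(0.80)
int95 <- interval(0.95)

head(data.frame(forecast = pred, int80, int95), 3)
```

The intervals widen as the horizon grows, which is a useful reminder that forecasts a year out carry far more uncertainty than forecasts for next week.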
Traditionally, space and time have been treated as isolated entities in crime analysis. Combining these data into a spatio-temporal crime analysis workflow can facilitate a more robust analysis, because an incident occurs as an interaction between persons or objects within both the space and time domains.
The kde_int_comp() function extends the traditional heat map (as seen with kde_map()) to perform a comparison across time intervals. Combining the distribution of incidents with time intervals effectively visualizes the time series data in space, which can be very useful in identifying displacement of criminal activity. The kde_int_comp() function was used to compare the narcotics incidents from the beginning of 2017 with those from the beginning of 2018. It returns a net difference raster and 3 leaflet widgets (Interval 1, Interval 2, Net Difference) for detailed comparison.
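The net difference idea can be sketched with MASS::kde2d(): estimate a density surface for each interval on a shared grid, then subtract one from the other. The data are simulated, the grid size is arbitrary, and the sign convention (interval 2 minus interval 1) is an assumption rather than the documented behavior of kde_int_comp().

```r
library(MASS)  # provides kde2d()

set.seed(7)

# Simulated incident coordinates (hypothetical data) for two intervals.
# The hot spot shifts east between interval 1 and interval 2.
int1 <- data.frame(lon = rnorm(300, -87.65, 0.01), lat = rnorm(300, 41.88, 0.01))
int2 <- data.frame(lon = rnorm(300, -87.62, 0.01), lat = rnorm(300, 41.88, 0.01))

# Use one shared bounding box so both density grids are directly comparable.
lims <- c(range(c(int1$lon, int2$lon)), range(c(int1$lat, int2$lat)))

kde1 <- kde2d(int1$lon, int1$lat, n = 50, lims = lims)
kde2 <- kde2d(int2$lon, int2$lat, n = 50, lims = lims)

# Net difference: positive cells gained density in interval 2, negative lost.
net_diff <- kde2$z - kde1$z

image(kde1$x, kde1$y, net_diff, xlab = "Longitude", ylab = "Latitude",
      main = "Net difference (interval 2 - interval 1)")
```

Sharing the bounding box is the important design choice: without it the two grids cover different cells and the subtraction is meaningless.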