Title: | Access Crime Data from the Open Crime Database |
---|---|
Description: | Gives convenient access to publicly available police-recorded open crime data from large cities in the United States that are included in the Crime Open Database <https://osf.io/zyaqn/>. |
Authors: | Matthew Ashby [aut, cre, cph] |
Maintainer: | Matthew Ashby <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.3.5 |
Built: | 2025-03-02 03:48:45 UTC |
Source: | https://github.com/mpjashby/crimedata |
Convert the GEOID of a 2016 US Census block to the name or GEOID for the corresponding state, county, tract or block group.
block_geoid_to(geoid, to, name = FALSE)
block_geoid_to_state(geoid, name = TRUE)
block_geoid_to_county(geoid, name = TRUE)
block_geoid_to_tract(geoid)
block_geoid_to_block_group(geoid)
geoid |
A character vector of 15-digit US Census block GEOIDs. |
to |
One of "state", "county", "tract", "block group" or (as an alias) "blockgroup". |
name |
Should the function return the state/county name rather than FIPS code? |
For details of the format of US Census GEOIDs, see https://www.census.gov/programs-surveys/geography/guidance/geo-identifiers.html.
A character vector of GEOIDs or names.
block_geoid_to("360810443021005", to = "county", name = TRUE)
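The GEOIDs of the enclosing areas are fixed-width prefixes of the 15-digit block GEOID: two digits for the state, three more for the county, six more for the tract, and the first digit of the four-digit block code for the block group. The base R sketch below illustrates that layout only. The helper `split_block_geoid()` is hypothetical, is not part of crimedata, and does not look up state or county names.

```r
# Hypothetical helper (not part of crimedata): slice 15-digit block GEOIDs
# into the GEOIDs of the areas that contain them.
split_block_geoid <- function(geoid) {
  geoid <- as.character(geoid)
  # Every input must be exactly 15 digits
  stopifnot(all(grepl("^[0-9]{15}$", geoid)))
  data.frame(
    state       = substr(geoid, 1, 2),   # 2-digit state FIPS code
    county      = substr(geoid, 1, 5),   # state + 3-digit county code
    tract       = substr(geoid, 1, 11),  # county + 6-digit tract code
    block_group = substr(geoid, 1, 12),  # tract + first digit of block code
    stringsAsFactors = FALSE
  )
}

parts <- split_block_geoid("360810443021005")
print(parts)
```

For names rather than codes, use the package functions with name = TRUE, as in the example above.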
Access incident-level crime data from the Open Crime Database
The Crime Open Database (CODE) is a service that makes it convenient to use crime data from multiple US cities in research on crime. All the data are available to use for free as long as you acknowledge the source of the data.
For more about CODE data, see https://osf.io/zyaqn/.
To access CODE data, call get_crime_data(). Data are returned as a 'tidy' tibble with each row corresponding to one recorded crime.
This site provides applications using data that has been modified for use from its original source, https://www.chicago.gov/, the official website of the City of Chicago. The City of Chicago makes no claims as to the content, accuracy, timeliness, or completeness of any of the data provided at this site. The data provided at this site is subject to change at any time. It is understood that the data provided at this site is being used at one's own risk.
Maintainer: Matthew Ashby [email protected] [copyright holder]
Useful links:
Report bugs at https://github.com/mpjashby/crimedata/issues
Retrieves data from the Open Crime Database for the specified years. Latitude and longitude are specified using the WGS 84 (EPSG:4326) co-ordinate reference system.
get_crime_data(
  years = NULL,
  cities = NULL,
  type = "sample",
  cache = TRUE,
  quiet = !interactive(),
  output = "tbl"
)
years |
A single integer or vector of integers specifying the years for which data should be retrieved. If NULL (the default), data for the most recent year will be returned. |
cities |
A character vector of city names for which data should be retrieved. Case insensitive. If NULL (the default), data for all available cities will be returned. |
type |
Either "sample" (the default), "core" or "extended". |
cache |
Should the result be cached and then re-used if the function is called again with the same arguments? |
quiet |
Should messages and warnings relating to data availability and processing be suppressed? |
output |
Should the data be returned as a tibble by specifying "tbl" (the default) or as a simple features (SF) object using WGS 84 by specifying "sf"? |
By default this function returns a one-percent sample of the 'core' data. The sample is the default so that users do not accidentally request large files over a network.
Setting type = "core" retrieves the core fields (e.g. the type, co-ordinates and date/time of each offense) for each offense. The data retrieved by setting type = "extended" includes all available fields provided by the police department in each city. The extended data fields have not been harmonized across cities, so will require further cleaning before most types of analysis.
Requesting all data (more than 17 million rows) may lead to problems with memory capacity. Consider downloading smaller quantities of data (e.g. using type = "sample") for exploratory analysis.
Setting output = "sf" returns the data in simple features format by calling sf::st_as_sf(..., crs = 4326, remove = FALSE).
For more details see the help vignette:
vignette("introduction", package = "crimedata")
A tibble containing data from the Open Crime Database.
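As a minimal sketch of a typical call, the wrapper below requests the one-percent sample for a single city and year. The helper name `fetch_city_sample()` is hypothetical, and the city and year in the commented call are assumptions: check list_crime_data() to confirm that the combination is available before running it.

```r
# Hypothetical convenience wrapper around get_crime_data().
# Running it needs the crimedata package and a network connection.
fetch_city_sample <- function(city, year) {
  if (!requireNamespace("crimedata", quietly = TRUE)) {
    stop("Install the crimedata package first.", call. = FALSE)
  }
  crimedata::get_crime_data(
    years = year,
    cities = city,
    type = "sample",  # one-percent sample of the core data
    quiet = TRUE
  )
}

# Example call (assumes this city and year are listed by list_crime_data()):
# chicago <- fetch_city_sample("Chicago", 2017)
# nrow(chicago)
```

Starting from the sample keeps downloads small; switching type to "core" or "extended" follows the same pattern once the exploratory analysis works.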
Dataset containing records of homicides in nine large US cities in 2015, obtained from the Crime Open Database.
homicides15
A tibble with 1,922 rows and 15 variables:
an integer unique identifier for the offense
name of the city in which the crime occurred
offense code, modified from the FBI NIBRS offense code
offense type name
date (and, in most cases, time) of the offense
approximate address of the offense*
approximate longitude
approximate latitude
type of location*
category of location type*
two-digit FIPS state code (possibly with leading zero)
three-digit FIPS county code (possibly with leading zero)
six-digit code for 2016 census tract
one-digit code for 2016 census block group
four-digit code for 2016 census block
More details of the data format are available on the Crime Open Database website. Variables marked * are only available for some of the data, due to limitations in the data published by some cities.
The variables in this dataset mirror those obtained by calling get_crime_data(type = "core"), except that some fields have been removed because they are redundant (e.g. if they have the same value for all rows in this dataset).
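Because the state, county, tract and block codes are stored separately, a full 15-digit block GEOID can be rebuilt by zero-padding each code to its fixed width and concatenating them. The sketch below assumes column names (`fips_state`, `fips_county`, `tract`, `block`) and uses illustrative values rather than rows from the dataset. The block group is not pasted in separately because it is the first digit of the block code.

```r
# Illustrative values only; the column names are assumptions.
crimes <- data.frame(
  fips_state  = c("36", "6"),        # second value has lost its leading zero
  fips_county = c("081", "37"),
  tract       = c("044302", "207400"),
  block       = c("1005", "2010"),
  stringsAsFactors = FALSE
)

# Zero-pad a code to a fixed width, e.g. pad("6", 2) gives "06"
pad <- function(x, width) sprintf(paste0("%0", width, "d"), as.integer(x))

geoids <- paste0(
  pad(crimes$fips_state, 2),
  pad(crimes$fips_county, 3),
  pad(crimes$tract, 6),
  pad(crimes$block, 4)
)
print(geoids)
```

The resulting strings have the format expected by the geoid argument of block_geoid_to().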
Get a tibble showing what years of crime data are available from which cities in the Open Crime Database.
list_crime_data(quiet = !interactive())
quiet |
Should messages and warnings relating to data availability and processing be suppressed? |
A tibble
Dataset containing records of thefts of motor vehicles in New York City from 2014 to 2017, obtained from the Crime Open Database.
nycvehiclethefts
A tibble with 35,746 rows and 13 variables:
an integer unique identifier for the offense
date (and, in most cases, time) half-way between the first and last possible dates at which the offense could have occurred
first possible date (and, in most cases, time) at which the offense could have occurred
last possible date (and, in most cases, time) at which the offense could have occurred
approximate longitude
approximate latitude
type of location*
category of location type*
two-digit FIPS state code (possibly with leading zero)
three-digit FIPS county code (possibly with leading zero)
six-digit code for 2016 census tract
one-digit code for 2016 census block group
four-digit code for 2016 census block
More details of the data format are available on the Crime Open Database website. Variables marked * are only available for some of the data, due to limitations in the data published by some cities.
The variables in this dataset mirror those obtained by calling get_crime_data(type = "core"), except that some fields have been removed because they are redundant (e.g. if they have the same value for all rows in this dataset).
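The half-way date described above is the midpoint of the interval between the first and last possible dates. As a worked illustration with made-up timestamps (the variable names and the UTC time zone are assumptions, and this is not necessarily how the database computes the value):

```r
# Made-up interval: the vehicle was last seen at 22:00 and found missing at 06:00
date_start <- as.POSIXct("2016-03-01 22:00:00", tz = "UTC")
date_end   <- as.POSIXct("2016-03-02 06:00:00", tz = "UTC")

# Midpoint = start + half of the elapsed time
date_mid <- date_start + difftime(date_end, date_start, units = "secs") / 2
print(date_mid)
```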