federal-crime-data-analysis

Category: Data mining / data warehousing
Development tool: Jupyter Notebook
File size: 175225 KB
Downloads: 0
Upload date: 2019-02-21 22:01:51
Uploader: sh-1993
Description: Federal Crime Data Standardization and Analysis (The Trace and BuzzFeed News)

File list:
Makefile (639, 2019-02-22)
Pipfile (240, 2019-02-22)
data (0, 2019-02-22)
data\documentation (0, 2019-02-22)
data\documentation\nibrs-translations.csv (128583, 2019-02-22)
data\documentation\nibrs-variables.csv (339054, 2019-02-22)
data\documentation\shr-translations.csv (35112, 2019-02-22)
data\documentation\shr-variables.csv (6142, 2019-02-22)
data\identifier-crosswalk (0, 2019-02-22)
data\identifier-crosswalk\documentation (0, 2019-02-22)
data\identifier-crosswalk\documentation\35158-0001-Codebook.pdf (571485, 2019-02-22)
data\identifier-crosswalk\raw (0, 2019-02-22)
data\identifier-crosswalk\raw\35158-0001-Data.tsv (11577775, 2019-02-22)
data\standardized (0, 2019-02-22)
data\standardized\nibrs-agency-metadata.csv (16990222, 2019-02-22)
data\standardized\nibrs-victims.csv.z01 (52428800, 2019-02-22)
data\standardized\nibrs-victims.csv.z02 (52428800, 2019-02-22)
data\standardized\nibrs-victims.csv.zip (20436394, 2019-02-22)
data\standardized\reta-agency-metadata.csv (34597077, 2019-02-22)
data\standardized\reta-annual-counts.csv.zip (24909370, 2019-02-22)
data\standardized\shr-agency-metadata.csv (2378365, 2019-02-22)
data\standardized\shr-victims.csv (61599601, 2019-02-22)
documentation (0, 2019-02-22)
documentation\Data Dictionaries for Standardized Federal Data.pdf (120779, 2019-02-22)
documentation\fbi-data-specifications (0, 2019-02-22)
documentation\fbi-data-specifications\NIBRS Record Description.pdf (2038352, 2019-02-22)
documentation\fbi-data-specifications\Ret A Rec Descrip.pdf (291455, 2019-02-22)
documentation\fbi-data-specifications\Ret A negative entries.pdf (8765, 2019-02-22)
documentation\fbi-data-specifications\SHR 1962 to 1975.pdf (201293, 2019-02-22)
documentation\fbi-data-specifications\SHR Record Layout 1976 to 1979.pdf (232883, 2019-02-22)
documentation\fbi-data-specifications\SHR Record Layout 1980 to current.pdf (536641, 2019-02-22)
notebooks (0, 2019-02-22)
notebooks\analyze (0, 2019-02-22)
notebooks\analyze\00-analyze-reta.ipynb (1004323, 2019-02-22)
notebooks\analyze\01-analyze-shr.ipynb (1852058, 2019-02-22)
notebooks\analyze\02-analyze-nibrs.ipynb (505260, 2019-02-22)
notebooks\analyze\03-compare-datasets.ipynb (749678, 2019-02-22)
... ...

# Federal Crime Data Standardization and Analysis

### [*Click here for a detailed description of the data sources, methodology, and findings*](https://www.documentcloud.org/documents/5692683-Methodology-for-National-Analysis-of-Clearance.html)

This repository includes methodologies, data, and code supporting the following articles, published by The Trace and BuzzFeed News:

- "Shoot Someone In A Major US City, And Odds Are You’ll Get Away With It" (January 24, 2019): [The Trace](https://www.thetrace.org/features/murder-solve-rate-gun-violence-baltimore-shootings) / [BuzzFeed News](https://www.buzzfeednews.com/article/sarahryley/police-unsolved-shootings)
- "5 Things To Know About Cities’ Failure To Arrest Shooters" (January 24, 2019): [The Trace](https://www.thetrace.org/2019/01/gun-murder-solve-rate-understaffed-police-data-analysis) / [BuzzFeed News](https://www.buzzfeednews.com/article/sarahryley/5-things-to-know-about-cities-failure-to-arrest-shooters)

[*Click here for additional data and code from The Trace and BuzzFeed News's collaboration.*](https://github.com/the-trace-and-buzzfeed-news/introduction)

# Raw Data

The analysis is based, primarily, on three major datasets collected and published by the Federal Bureau of Investigation (FBI): Return A data, Supplementary Homicide Report data, and the National Incident-Based Reporting System. Each serves different purposes (for the FBI, and for our analyses), and each has different benefits and drawbacks. A detailed explanation of each can be found in the methodology linked above.

The FBI's raw data files require many gigabytes of storage in total, and so are not directly included in this repository. The Trace and BuzzFeed News have [uploaded the raw files to the Internet Archive](https://archive.org/details/fbi-raw-data-files-nibrs-shr-return-a), where you can download them.

# Standardized Data

The raw Return A, NIBRS, and SHR data are formatted entirely differently from one another, and use different terminology, different variables, and different data structures. To facilitate the combination and comparison of these three datasets, The Trace and BuzzFeed News created "standardized" versions of them all. You can find the standardized data in the [`data/standardized`](data/standardized) directory, and a data dictionary for the standardized datasets in the [`documentation/`](documentation/) folder.

The code to standardize the raw data can be found in the three Jupyter notebooks, written in the Python programming language, in the `notebooks/standardize` directory. Each notebook processes one of the three main federal datasets, and saves a standardized version to the `data/standardized` directory.

# Data and Analysis

The code to analyze the standardized data can be found in the four Jupyter notebooks, written in the Python programming language, in the `notebooks/analyze` directory. Each of the first three notebooks analyzes one of the three federal datasets; the fourth compares findings from the three datasets to one another. The findings are also summarized in the methodology linked in the first section of this document.

# Data Disclaimer

The data in this repository is a standardization of raw FBI data from the three data collection programs described above. We have carefully checked the accuracy of our analysis, and shared our findings with numerous experts in the law enforcement field prior to publication. We are sharing our data, methodology, and code in order to support further research and reporting on gun violence. However, users of this data may wish to independently verify the accuracy of their findings prior to making them public, as The Trace and BuzzFeed make no representations or warranties as to any third party use of this data.

# Reproducibility

Executing the notebooks above, in order, will reproduce the findings.
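
One possible way to run the notebooks in order is sketched below. It is not part of the repository: it assumes that "in order" means the `notebooks/standardize` notebooks first, then the `notebooks/analyze` notebooks, each group sorted by file name, and that the `jupyter nbconvert` command is available. The helper names `ordered_notebooks` and `run_in_order` are hypothetical.

```python
from pathlib import Path
import subprocess

# Assumed stage order: standardize the raw data first, then analyze it.
STAGES = ("standardize", "analyze")


def ordered_notebooks(repo_root="."):
    """Return notebook paths stage by stage, sorted by file name within each stage."""
    notebooks = []
    for stage in STAGES:
        stage_dir = Path(repo_root) / "notebooks" / stage
        notebooks.extend(sorted(stage_dir.glob("*.ipynb")))
    return notebooks


def run_in_order(repo_root="."):
    """Execute each notebook in place, stopping at the first failure."""
    for notebook in ordered_notebooks(repo_root):
        subprocess.run(
            [
                "jupyter", "nbconvert",
                "--to", "notebook",
                "--execute", "--inplace",
                # Disable the per-cell timeout, since standardization can take hours.
                "--ExecutePreprocessor.timeout=-1",
                str(notebook),
            ],
            check=True,
        )
```

Calling `run_in_order()` from the repository root would then execute every notebook it finds. To skip the standardization step, the `STAGES` tuple could be reduced to `("analyze",)`.
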
You will need Python 3 installed, as well as the Python libraries specified in this repository's `Pipfile`.

Before running the standardization code, you will need to download the raw data files from the Internet Archive, and place them in the `data/raw` directory, so that the folder's structure becomes:

```
data/
    raw/
        nibrs/
        reta/
        shr/
```

You do not need to run the standardization code (which can take several hours to finish) in order to run the analysis code. But if you choose not to run the standardization code, you will first need to unzip the `data/standardized/nibrs-victims.csv.zip` and `data/standardized/reta-annual-counts.csv.zip` files in order for the analysis to work.

# Licensing

All code in this repository is available under the [MIT License](https://opensource.org/licenses/MIT). The standardized data files are available under the [Creative Commons Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/) (CC BY 4.0) license.

# Questions / Feedback

For questions or feedback, please contact Jeremy Singer-Vine ([jeremy.singer-vine@buzzfeed.com](mailto:jeremy.singer-vine@buzzfeed.com)) and Sarah Ryley ([sryley@thetrace.org](mailto:sryley@thetrace.org)).
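
As a supplement to the Reproducibility section above, the following is a minimal sketch of a pre-flight check, not code from the repository. It assumes the `data/raw` layout described there, and it assumes that each zipped standardized file extracts to a CSV with the same name minus the `.zip` suffix. The function names are hypothetical.

```python
from pathlib import Path

# Subdirectories the standardization notebooks expect under data/raw.
RAW_SUBDIRS = ("nibrs", "reta", "shr")

# Assumed names of the extracted CSVs (archive name without ".zip").
UNZIPPED_CSVS = ("nibrs-victims.csv", "reta-annual-counts.csv")


def missing_raw_dirs(repo_root="."):
    """Return the expected data/raw subdirectories that do not exist yet."""
    raw_dir = Path(repo_root) / "data" / "raw"
    return [name for name in RAW_SUBDIRS if not (raw_dir / name).is_dir()]


def missing_standardized_csvs(repo_root="."):
    """Return the standardized CSVs that still need to be unzipped."""
    std_dir = Path(repo_root) / "data" / "standardized"
    return [name for name in UNZIPPED_CSVS if not (std_dir / name).is_file()]
```

An empty list from `missing_raw_dirs()` suggests the raw files are in place for standardization; an empty list from `missing_standardized_csvs()` suggests the analysis notebooks can find their inputs. Note that `nibrs-victims.csv.zip` ships alongside `.z01` and `.z02` parts, and Python's standard `zipfile` module does not handle multi-disk ZIP files, so that archive likely needs an external tool that supports split archives.
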
