Photo credit: David Becker/Stringer/Getty Images
This is a brief summary of our final project for P8105 Data Science. The full project report can be viewed here.
This project, titled “Mass Shootings in America (1966-2016)”, examines mass shooting incidents (three or more casualties) that occurred between 1966 and 2016. The project repo can be found here, and the repo for this website is hosted separately for publishing purposes. The repo for the Shiny app can be found here. Our final project report is available as “p8105_Final_Report.Rmd” and “p8105_Final_Report.html”. Our goal is to analyze the mass shooting events of this period, explore their potential causes, and suggest possible prevention measures. To preserve internal validity, we use only credible data verified against multiple sources.
There are five members in this project (names listed at the bottom of this page). We applied a hybrid of Agile and Waterfall project management methods, which can be seen in the commit and interaction histories in our GitHub repo. Tasks and stories were spread across the whole team, and each member made individual deliveries that contributed to the final product. Collaboration was active, and project activities including visioning, planning, documenting, developing, testing, and presenting were all team efforts.
Although this class established no gold-standard naming conventions or coding standards, we did our best to follow the teaching team's instructions and the feedback we received on previous homework and the midterm project. Source control was handled through the Sourcetree and GitHub Desktop applications, our GitHub repo, and Google Docs.
Our source data came from Stanford’s Mass Shootings in America Project. The dataset we used can be found in the “data” folder in our repo. For more details, please refer to our full report here.
ETL (Extract, Transform, and Load) was performed at the very beginning of this project’s development phase. We used R to load the original dataset and to clean and tidy the data. A new dataset containing the corrected data and the new variables of interest was created and stored as a separate CSV file. In this process, we cleaned and corrected the data for incidents involving the general public. After discovering interesting findings about school mass shooting incidents, we created several new variables to narrow down the school-related data for easier visualization and analysis.
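The ETL steps described above can be sketched roughly as follows. This is a minimal illustration in base R, not the project's actual cleaning script: the column names (`date`, `place_type`, `fatalities`, `injured`), the toy rows, the school-matching pattern, and the output path are all hypothetical stand-ins rather than fields or values from the Stanford dataset.

```r
# Minimal ETL sketch in base R.
# All column names and rows below are hypothetical, made-up stand-ins;
# they are NOT the actual fields or records of the Stanford dataset.
raw <- data.frame(
  date       = c("1/15/1990", "3/2/2005", "7/9/2010"),
  place_type = c("Entertainment venue", "Primary school", "College/University"),
  fatalities = c(2, 1, 0),
  injured    = c(1, 3, 1),
  stringsAsFactors = FALSE
)

clean_shootings <- function(df) {
  # Transform: parse month/day/year strings into Date objects
  df$date <- as.Date(df$date, format = "%m/%d/%Y")
  df$year <- as.integer(format(df$date, "%Y"))

  # Derived variables: total casualties and a school-related flag
  df$casualties <- df$fatalities + df$injured
  df$school_related <- grepl("school|college|university", df$place_type,
                             ignore.case = TRUE)

  # Keep incidents meeting the project's threshold of 3 or more casualties
  df[df$casualties >= 3, , drop = FALSE]
}

cleaned <- clean_shootings(raw)

# Load: store the cleaned data as a separate CSV file (illustrative path)
out_path <- file.path(tempdir(), "shootings_clean.csv")
write.csv(cleaned, out_path, row.names = FALSE)
```

Writing the cleaned data to its own CSV, as the project did, lets each dashboard read the same prepared file instead of repeating the cleaning logic.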
Here are several dashboards for exploratory analysis. Each dashboard focuses on a different aspect of the dataset. Detailed comments and findings can be found in the full report here.
Interactive US Mass Shooting Map and Descriptive Text Analysis
Under the “Mass Shooting Incidence - Heat and Bubble Map” tab, you can see an interactive map showing the total number of mass shooting cases in each state and some of the major cities.
Under the “What are inside Descriptions? - Text Analysis” tab, the keywords from those incidents’ descriptions are shown in different plots.
In this dashboard, you can see visual summaries of the different gun types used.
In this dashboard, you can explore the motives behind mass shootings involving the general public.
This is a Shiny app hosted online. Under its tabs, you can see different statistics about school-related mass shooting events.
Under the “School-a hot spot of mass shooting” tab, you can see the number of mass shooting incidents and victims by place type.
Under the “A bloody history-mass shooting” tab, you can see how the numbers of victims and cases have changed over the years.
Under the “Shooters in school” tab, you can change the data range and explore the shooters’ positions in their schools and why these students committed mass shootings.
Mass shootings are trending upward, and shooters tend to be younger. If we do not take precautions, there may be more serious gun issues in the near future. School shootings are persistent, which is very concerning. Many students committed homicides because of excessive stress or mental health problems. This should alert educators to provide more psychological interventions that guide students on how to deal with pressure and failure. In addition, rising domestic violence has become a real problem, especially over the past decade. To prevent mass shootings, all parties in society should take action. People should learn how to resolve troubles and conflicts by other means and how to protect themselves in an active-shooter situation. The government and Congress should act to regulate firearms more strictly if a gun ban is not a feasible option.
Shaoxuan Chen (tc2900), Pengfei Jiang (pj2325), Shanshan Song (ss5422), Zenghui Xue (zx2231), and Huijuan Zhang (hz2510)
Columbia Mailman School of Public Health
New York, New York 10032