This project analyzes 911 call data from Kaggle to find patterns and trends in emergency calls. The dataset includes fields such as latitude, longitude, description, zip code, and timestamp. The goal is to explore the data, extract insights, and visualize the results with Python and its data science libraries.
The dataset consists of the following columns:
lat: Latitude (float)
lng: Longitude (float)
desc: Description of the Emergency Call (string)
zip: Zipcode (float, with some missing values)
title: Title (string)
timeStamp: Timestamp of the call (string)
twp: Township (string, with some missing values)
addr: Address (string)
e: Dummy variable (always 1, integer)
First, import the necessary libraries:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
Read the CSV file into a DataFrame:
df = pd.read_csv('911.csv')
Check the information and head of the DataFrame:
df.info()
df.head()
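The column descriptions note that zip and twp contain missing values. A minimal sketch of how those gaps could be counted follows; the `sample` frame and its values are made up for illustration and only borrow the column names from 911.csv.

```python
import pandas as pd

# Tiny made-up sample using column names from 911.csv (values are illustrative only)
sample = pd.DataFrame({
    'zip': [19525.0, None, 19401.0, 19446.0],
    'twp': ['NEW HANOVER', 'HATFIELD TOWNSHIP', None, 'NORRISTOWN'],
    'e': [1, 1, 1, 1],
})

# Number of missing values in each column
missing_counts = sample.isna().sum()
print(missing_counts)

# Fraction of rows without a zip code
zip_missing_share = sample['zip'].isna().mean()
print(zip_missing_share)
```

Running the same two calls on `df` should give the corresponding counts for the real data, which `df.info()` reports indirectly as non-null counts.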
Basic Questions
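# Top 5 zip codes for 911 calls
df['zip'].value_counts().head(5)
# Top 5 townships for 911 calls (head() defaults to 5 rows)
df['twp'].value_counts().head()
# Number of unique title codes
df['title'].nunique()
# The title starts with a reason (EMS, Fire, or Traffic) before the colon; store it in a new column
df['Reason'] = df['title'].apply(lambda title: title.split(':')[0])
# Number of calls per reason
df['Reason'].value_counts()
# Count plot of calls by reason
sns.countplot(x='Reason', data=df, palette='viridis')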
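The Reason column can also be derived with the pandas string accessor instead of `apply`. This is only an alternative sketch, not a required change; the titles below are made up to mimic the "Reason: detail" format.

```python
import pandas as pd

# Made-up titles in the "Reason: detail" format (illustrative only)
sample = pd.DataFrame({
    'title': [
        'EMS: BACK PAINS/INJURY',
        'Fire: GAS-ODOR/LEAK',
        'Traffic: VEHICLE ACCIDENT',
        'EMS: CARDIAC EMERGENCY',
    ]
})

# Vectorized equivalent of apply(lambda title: title.split(':')[0])
sample['Reason'] = sample['title'].str.split(':').str[0]

# Calls per reason in the sample
reason_counts = sample['Reason'].value_counts()
print(reason_counts)
```

One difference worth knowing: the `.str` methods pass missing values through as NaN, whereas the lambda version would raise an error if a title were missing.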
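# Convert the timestamp strings to datetime objects
df['timeStamp'] = pd.to_datetime(df['timeStamp'])
# Extract the hour, month, and day of the week (0 = Monday) from each timestamp
df['Hour'] = df['timeStamp'].apply(lambda time: time.hour)
df['Month'] = df['timeStamp'].apply(lambda time: time.month)
df['Day of Week'] = df['timeStamp'].apply(lambda time: time.dayofweek)
# Map the integer day of the week to its abbreviated name
dmap = {0: 'Mon', 1: 'Tue', 2: 'Wed', 3: 'Thu', 4: 'Fri', 5: 'Sat', 6: 'Sun'}
df['Day of Week'] = df['Day of Week'].map(dmap)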
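The same time features can be extracted with the `.dt` accessor, which avoids a Python-level lambda per row. The sketch below uses three made-up timestamps; only the column names come from the dataset.

```python
import pandas as pd

# Made-up timestamps (illustrative only)
sample = pd.DataFrame({
    'timeStamp': ['2015-12-10 17:40:00', '2015-12-13 08:05:00', '2016-01-04 23:59:00']
})
sample['timeStamp'] = pd.to_datetime(sample['timeStamp'])

# Vectorized extraction of hour and month
sample['Hour'] = sample['timeStamp'].dt.hour
sample['Month'] = sample['timeStamp'].dt.month

# dayofweek is 0 for Monday through 6 for Sunday; map it to a short name
dmap = {0: 'Mon', 1: 'Tue', 2: 'Wed', 3: 'Thu', 4: 'Fri', 5: 'Sat', 6: 'Sun'}
sample['Day of Week'] = sample['timeStamp'].dt.dayofweek.map(dmap)
print(sample)
```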
# Calls per day of the week, split by reason
sns.countplot(x='Day of Week', data=df, hue='Reason', palette='viridis')
# Place the legend outside the plot area
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
# Calls per month, split by reason
sns.countplot(x='Month', data=df, hue='Reason', palette='viridis')
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
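Note: Some months are missing from the data, which leaves gaps in the count plot by month. A line plot of the counts per month draws straight across those gaps, which makes the overall trend easier to see.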
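# Count the non-null entries in each column for every month
byMonth = df.groupby('Month').count()
# Line plot of calls per month ('lat' serves as the call count)
byMonth['lat'].plot()
# Linear fit of calls per month; 'twp' counts only rows with a township recorded
sns.lmplot(x='Month', y='twp', data=byMonth.reset_index())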
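To see which months are absent rather than inferring it from a plot, the monthly counts can be reindexed against all twelve months. This is a sketch on made-up data in which only months 1, 2, and 5 appear; `full` and `missing_months` are illustrative names.

```python
import pandas as pd

# Made-up rows covering only months 1, 2, and 5 (illustrative only)
sample = pd.DataFrame({
    'Month': [1, 1, 2, 5, 5, 5],
    'lat': [40.1, 40.2, 40.3, 40.4, 40.5, 40.6],
})
by_month = sample.groupby('Month').count()

# Reindex against months 1..12; months with no rows become NaN
full = by_month['lat'].reindex(list(range(1, 13)))

# Months that have no calls in the sample
missing_months = full[full.isna()].index.tolist()
print(missing_months)
```

Applied to `byMonth['lat']`, the same reindex would list the gaps in the real dataset.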
# Extract the calendar date from each timestamp
df['Date'] = df['timeStamp'].apply(lambda time: time.date())
# Total calls per day
df.groupby('Date').count()['lat'].plot()
plt.tight_layout()
# Calls per day for traffic-related calls
df[df['Reason'] == 'Traffic'].groupby('Date').count()['lat'].plot()
plt.title('Traffic')
plt.tight_layout()
# Calls per day for EMS calls
df[df['Reason'] == 'EMS'].groupby('Date').count()['lat'].plot()
plt.title('EMS')
plt.tight_layout()
# Calls per day for fire calls
df[df['Reason'] == 'Fire'].groupby('Date').count()['lat'].plot()
plt.title('Fire')
plt.tight_layout()
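The three per-reason series above can also be computed in one pass by grouping on both Date and Reason. The following is a sketch on made-up rows, not a replacement for the plots; `daily` is an illustrative name.

```python
import pandas as pd

# Made-up calls spread over two days (illustrative only)
sample = pd.DataFrame({
    'timeStamp': pd.to_datetime([
        '2015-12-10 17:40', '2015-12-10 18:00', '2015-12-10 19:30',
        '2015-12-11 09:15', '2015-12-11 10:00',
    ]),
    'Reason': ['EMS', 'Traffic', 'EMS', 'Fire', 'EMS'],
})
sample['Date'] = sample['timeStamp'].dt.date

# One row per date, one column per reason; combinations with no calls are filled with 0
daily = sample.groupby(['Date', 'Reason']).size().unstack(fill_value=0)
print(daily)
```

With matplotlib available, `daily.plot(subplots=True)` should then draw one panel per reason from this single table.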
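# Matrix of call counts with days of the week as rows and hours as columns
dayHour = df.groupby(by=['Day of Week', 'Hour']).count()['Reason'].unstack()
sns.heatmap(dayHour, cmap='viridis')
# Cluster days and hours with similar call volumes
sns.clustermap(dayHour, cmap='coolwarm')
# Same matrix with months as columns
dayMonth = df.groupby(by=['Day of Week', 'Month']).count()['Reason'].unstack()
sns.heatmap(dayMonth, cmap='coolwarm')
sns.clustermap(dayMonth, cmap='coolwarm')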
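Because Day of Week holds strings, the grouped matrix is sorted alphabetically (Fri, Mon, Sat, ...) rather than in calendar order. If calendar order is preferred for the heatmaps, one option is to reindex the rows before plotting. The sketch below uses made-up rows, and `day_order` and `ordered` are illustrative names.

```python
import pandas as pd

# Made-up calls (illustrative only)
sample = pd.DataFrame({
    'Day of Week': ['Mon', 'Mon', 'Tue', 'Sun', 'Fri', 'Fri'],
    'Hour': [8, 17, 8, 17, 8, 8],
    'Reason': ['EMS', 'Fire', 'EMS', 'Traffic', 'EMS', 'Fire'],
})

# Same construction as dayHour: rows are days, columns are hours
day_hour = sample.groupby(by=['Day of Week', 'Hour']).count()['Reason'].unstack()
print(list(day_hour.index))  # alphabetical order

# Reorder the rows into calendar order; days absent from the sample become NaN rows
day_order = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
ordered = day_hour.reindex(day_order)
print(ordered)
```

This matters only for `sns.heatmap`; `sns.clustermap` rearranges rows and columns itself according to the clustering.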
The analysis of 911 call data reveals insights into the distribution of emergency calls by reason, time, and location. Visualizations such as count plots, heatmaps, and clustermaps help in understanding the patterns and trends in the data.
Feel free to explore further and modify the analysis to suit your needs!
This project is licensed under the MIT License.