Hurricane Maria Mortality Analysis

by Ian Flores Siaca

September 29, 2018

In [1]:
# From SO: https://stackoverflow.com/questions/27934885/how-to-hide-code-from-cells-in-ipython-notebook-visualized-with-nbviewer

from IPython.display import HTML

HTML('''<script>
code_show=true; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit" value="Click here to toggle on/off the raw code."></form>''')
Out[1]:
In [2]:
from IPython.display import HTML, display
import pandas as pd

import tabulate

Context

Hurricane Maria struck Puerto Rico on September 20, 2017 at 6:15 a.m., entering through the municipality of Yabucoa as a Category 4 hurricane. According to the governments of Puerto Rico and the U.S. Virgin Islands, the cost of the damage is estimated at $102 billion USD [1]. However, the impact wasn't only economic. Following a lawsuit filed by the Center for Investigative Journalism (CPI, by its Spanish initials) and CNN, the Government of Puerto Rico, and more specifically the Demographic Registry, was forced to publish the individual-level data of all the deaths that occurred from September 20, 2017 to June 11, 2018.

With access to this data, we focused on answering two main questions. First, which municipalities in Puerto Rico need more aid to prevent deaths in the case of natural disasters? Second, which areas of the municipalities had higher death rates? These questions give us actionable insight to maximize aid and minimize the loss of life in future natural disasters.

The Data

As previously mentioned, the main data source was the individual-level data of all the deaths on the island after the hurricane (Available Here). It is worth noting that this dataset is not confidential but is very sensitive, as it contains identifying information such as names and addresses. I also used the 2017 Annual Estimates of the Resident Population from the United States Census Bureau (Available Here) to get the population estimates per municipality in 2017.

The Analysis

To answer the first of our questions, "Which municipalities in Puerto Rico need more aid to prevent deaths in the case of natural disasters?", I decided it was better to first standardize the data to a comparable scale and then visualize it to see if there were spatial patterns. I did this by calculating the death rate for each municipality.
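As a minimal sketch of that standardization, the snippet below computes deaths per 1,000 residents. The three rows are copied from the tables in this report purely to illustrate the arithmetic; the column names are illustrative assumptions, not the names used in the original analysis files.

```python
import pandas as pd

# Three rows copied from the tables in this report; column names are hypothetical.
df = pd.DataFrame({
    "municipality": ["GUANICA", "VIEQUES", "SAN JUAN"],
    "deaths": [111, 55, 2531],
    "population_2017": [16363, 8669, 337288],
})

# Death rate expressed as deaths per 1,000 residents.
df["death_rate_per_1000"] = df["deaths"] / df["population_2017"] * 1000

# Highest rate first.
print(df.sort_values("death_rate_per_1000", ascending=False))
```

Dividing by population is what makes a small municipality such as Vieques comparable with a large one such as San Juan.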

This map highlights the ten municipalities with the highest death rate. We see a spatial cluster of municipalities in the mountainous central region of the island. We can also observe that municipalities on the west coast of the island have a lower death rate than the other municipalities. This might be because they were the farthest from the hurricane, but there might also be public policy reasons.

Moving on to the second question, "Which areas of the municipalities had higher death rates?", I wanted to compare the death rate in urban areas with the death rate in rural areas and see if the difference was significant, while allowing for greater variability, since an urban area in the north of the island is not very similar to an urban area in the central part. For these reasons, I opted to estimate the death rate in urban and rural areas and compare these estimates. It is worth noting that the rural and urban categories are assigned by the Department of Health of Puerto Rico.
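One simple way to compare two sets of estimates is to take the rural-minus-urban difference for each unit and summarize those differences with a mean and a confidence interval. The report does not say which estimator was used, so the sketch below is only an assumption: it uses a normal approximation, and the input values are made up for illustration and are not the report's data.

```python
import math
import statistics


def mean_with_ci(values, z=1.96):
    """Return (mean, lower, upper) using a normal approximation.

    z=1.96 corresponds to an approximate 95% interval.
    """
    m = statistics.mean(values)
    # Standard error of the mean from the sample standard deviation.
    se = statistics.stdev(values) / math.sqrt(len(values))
    return m, m - z * se, m + z * se


# Made-up rural-minus-urban differences, NOT taken from the report.
toy_differences = [1.5, 2.5, 2.0, 1.0, 3.0]

mean, low, high = mean_with_ci(toy_differences)
print(f"mean={mean:.2f}, CI=({low:.2f}, {high:.2f})")
```

An interval that excludes zero suggests the two kinds of area differ, although with very few observations a t-based or bootstrap interval would be a more cautious choice.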

These estimates yielded a mean difference of 1.98, with a confidence interval from 1.55 to 2.41. This means that, on average, about 2 more people died in rural areas than in urban areas. However, which rural areas might need more focus after another natural disaster? In the table below I present the top 10 municipalities with the highest rural death rate.

In [3]:
data_rural = pd.read_csv("../analysis/data/rural/percentile_100.csv").drop(['Unnamed: 0'], axis = 1).head(n = 10)
display(HTML(tabulate.tabulate(data_rural, tablefmt='html', headers = ['Municipality', 'Zone', 'Number of Deaths', 'Population (2017)', 'Death Rate per 1000 individuals'])))
   Municipality   Zone   Number of Deaths  Population (2017)  Death Rate per 1000 individuals
0  GUANICA        RURAL  111               16363              6.7836
1  SAN SEBASTIAN  RURAL  242               37306              6.48689
2  VIEQUES        RURAL  55                8669               6.34445
3  LARES          RURAL  162               25772              6.28589
4  AGUADILLA      RURAL  320               53164              6.01911
5  RINCON         RURAL  85                14128              6.01642
6  LAS MARIAS     RURAL  50                8402               5.95096
7  NARANJITO      RURAL  161               28306              5.68784
8  CAMUY          RURAL  180               31732              5.67251
9  LAJAS          RURAL  129               22929              5.62606

If we focus on the urban areas instead, these are the top 10 municipalities with the highest urban death rate:

In [4]:
data_urban = pd.read_csv("../analysis/data/urban/percentile_100.csv").drop(['Unnamed: 0'], axis = 1).head(n = 10)
display(HTML(tabulate.tabulate(data_urban, tablefmt='html', headers = ['Municipality', 'Zone', 'Number of Deaths', 'Population (2017)', 'Death Rate per 1000 individuals'])))
   Municipality  Zone    Number of Deaths  Population (2017)  Death Rate per 1000 individuals
0  SAN JUAN      URBANO  2531              337288             7.50397
1  BAYAMON       URBANO  1174              179565             6.53802
2  PONCE         URBANO  919               140859             6.52425
3  CATANO        URBANO  150               24374              6.1541
4  CAROLINA      URBANO  866               154489             5.60558
5  FAJARDO       URBANO  166               31324              5.29945
6  CAGUAS        URBANO  641               129604             4.94584
7  GUAYNABO      URBANO  422               87328              4.83236
8  MAYAGUEZ      URBANO  335               75525              4.43562
9  TOA BAJA      URBANO  287               78092              3.67515

However, on an island with a limited budget and a complex public health system, distinguishing between rural and urban areas might not be the most important factor in deciding where to intervene. This last table shows the top 10 zones on the island with the highest death rate.

In [5]:
data_total = pd.read_csv("../analysis/data/total/percentile_100.csv").drop(['Unnamed: 0'], axis = 1).head(n = 10)
display(HTML(tabulate.tabulate(data_total, tablefmt='html', headers = ['Municipality', 'Zone', 'Number of Deaths', 'Population (2017)', 'Death Rate per 1000 individuals'])))
   Municipality   Zone    Number of Deaths  Population (2017)  Death Rate per 1000 individuals
0  SAN JUAN       URBANO  2531              337288             7.50397
1  GUANICA        RURAL   111               16363              6.7836
2  BAYAMON        URBANO  1174              179565             6.53802
3  PONCE          URBANO  919               140859             6.52425
4  SAN SEBASTIAN  RURAL   242               37306              6.48689
5  VIEQUES        RURAL   55                8669               6.34445
6  LARES          RURAL   162               25772              6.28589
7  CATANO         URBANO  150               24374              6.1541
8  AGUADILLA      RURAL   320               53164              6.01911
9  RINCON         RURAL   85                14128              6.01642
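The notebook loads this combined ranking from a precomputed CSV, and how that file was built is not shown here. As a hedged sketch, a ranking like the one above could be produced by stacking the rural and urban rows and sorting by death rate. The rows below are copied from the first two tables; the column names are illustrative assumptions.

```python
import pandas as pd

# Hypothetical column names; values copied from the rural and urban tables.
columns = ["municipality", "zone", "deaths", "population_2017", "death_rate_per_1000"]

rural = pd.DataFrame([
    ["GUANICA", "RURAL", 111, 16363, 6.7836],
    ["SAN SEBASTIAN", "RURAL", 242, 37306, 6.48689],
], columns=columns)

urban = pd.DataFrame([
    ["SAN JUAN", "URBANO", 2531, 337288, 7.50397],
    ["BAYAMON", "URBANO", 1174, 179565, 6.53802],
], columns=columns)

# Stack both zones, rank by death rate, and renumber the rows.
combined = (
    pd.concat([rural, urban], ignore_index=True)
    .sort_values("death_rate_per_1000", ascending=False)
    .reset_index(drop=True)
)

print(combined.head(10))
```

With these four rows, the order matches the relative order of the same zones in the combined table.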

Conclusion

In a context of climate change and a worsening economic crisis, Puerto Rico's organizations (government and NGOs) need to improve their preparedness and their plans of action to fit the existing economic and infrastructure constraints, so that they can effectively save lives and maximize aid. There has been a lot of talk about how these organizations should act in times of crisis. However, without data from previous disasters we are merely speculating as to which processes happened and how they happened. This analysis shows that more attention needs to be directed to the rural areas of the island, as those are the ones with a higher death rate. Beyond this, remote and hard-to-access locations such as the mountainous region need to be aided more efficiently, as they also suffered higher death rates.