Modeling the COVID-19 Outbreak in Singapore Using Python
Overview
The COVID-19 pandemic is an ongoing outbreak of coronavirus disease 2019, caused by the virus SARS-CoV-2. The outbreak was first identified in Wuhan, China, in December 2019; the World Health Organization declared it a Public Health Emergency of International Concern on January 30th, 2020, and a pandemic on March 11th, 2020. For my ES2 Final project, I wanted to program an SEIR Epidemiology Model to build an understanding of the current pandemic. To narrow down the scope of my project, I focused on the Singaporean population and compared data from before and after social distancing measures were put into place.
Design & Algorithm
An epidemiological model uses a microscopic description (the role of an infectious individual) to predict the macroscopic behavior of disease spread through a population. It models a finite number of individuals in a sample society (in my case, Singapore) interacting with each other on a daily basis, along with the chance of infection each interaction brings. Keep in mind: this is just a model; it is not meant to predict the future. Instead, the SEIR model helps build an understanding of how COVID-19 would have spread through social interactions had we not put social distancing measures into place. The design of my project is centered around four different pieces of code, which I will explain in more detail below.
1. The Baseline SEIR Model
This model is similar to the classic SIR model with one extension: the Exposed category. The SEIR model describes the flow of people between four states: Susceptible, Exposed, Infected and Recovered (Figure 1). Each of these variables represents the number of people in that group. Several equations with Greek-letter parameters (see Figure 2 below) tie the categories to one another. The first parameter is Beta (the average contact rate in the population), the second is Alpha (the inverse of the incubation period), the third is Gamma (the inverse of the mean infectious period), and the last is Rho (a social distancing factor between 0 and 1 that scales the contact rate, which I introduce in Step 2).
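For reference, a common textbook way to write the SEIR system with these four parameters is shown here. It treats S, E, I and R as fractions of the population and is an assumption about what Figure 2 contains, not a transcription of it.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\rho\,\beta\,S\,I \\
\frac{dE}{dt} &= \rho\,\beta\,S\,I - \alpha\,E \\
\frac{dI}{dt} &= \alpha\,E - \gamma\,I \\
\frac{dR}{dt} &= \gamma\,I
\end{aligned}
```

The four right-hand sides sum to zero, so S + E + I + R stays constant over time, and setting Rho to 1 recovers the baseline model with no social distancing.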
As you can see, the SEIR is a type of compartmental model in epidemiology that simplifies the mathematical modeling of infectious diseases. For the first part of my project, I modeled a baseline SEIR model, inputting my own initial values & parameters and creating graphs to model a baseline situation of how COVID-19 would look in this “model society of Singapore” for the S, E, I and R categories. I was able to do this by translating the equations in Figure 2 into a Python script, utilizing a for loop.
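A minimal sketch of how a for loop can step the SEIR equations forward in time is shown here. It assumes the standard SEIR form on population fractions and uses forward Euler steps; the function name `seir_euler` and all parameter values (incubation period, infectious period, reproduction number, initial exposed fraction) are illustrative assumptions, not the values used in the project.

```python
import numpy as np

def seir_euler(alpha, beta, gamma, e0, days, dt=0.1):
    """Forward-Euler integration of the SEIR equations on population fractions."""
    n_steps = int(round(days / dt))
    S = np.empty(n_steps + 1)
    E = np.empty(n_steps + 1)
    I = np.empty(n_steps + 1)
    R = np.empty(n_steps + 1)
    # Start with a small exposed fraction and nobody infected or recovered.
    S[0], E[0], I[0], R[0] = 1.0 - e0, e0, 0.0, 0.0
    for k in range(n_steps):
        new_exposed = beta * S[k] * I[k] * dt   # S -> E
        new_infected = alpha * E[k] * dt        # E -> I
        new_recovered = gamma * I[k] * dt       # I -> R
        S[k + 1] = S[k] - new_exposed
        E[k + 1] = E[k] + new_exposed - new_infected
        I[k + 1] = I[k] + new_infected - new_recovered
        R[k + 1] = R[k] + new_recovered
    t = np.linspace(0.0, days, n_steps + 1)
    return t, S, E, I, R

# Hypothetical parameter values, chosen only for illustration.
alpha = 1 / 5.1        # assumed 5.1-day incubation period
gamma = 1 / 3.3        # assumed 3.3-day infectious period
beta = 2.4 * gamma     # assumed basic reproduction number of 2.4
t, S, E, I, R = seir_euler(alpha, beta, gamma, e0=1e-4, days=100)
print(f"peak infected fraction: {I.max():.3f} on day {t[I.argmax()]:.0f}")
```

Each pass through the loop moves a small amount of the population from one compartment to the next, so the four arrays always sum to 1.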
2. The Social Distancing SEIR Model
For the second part of my project, I adapted the Python code from Step 1 by introducing a new parameter: Rho, a social distancing factor that scales the contact rate. Rho runs from 0 (everyone participates in social distancing) to 1 (no one participates in social distancing). By changing its value, I was able to produce a new SEIR graph comparing the E and I curves for different Rho values to see the effects of social distancing in action. I will talk more about this Social Distancing graph in a later section of this paper, “Results & Analysis”.
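The effect of Rho can be sketched by multiplying the infection term by Rho and comparing the peaks of the E and I curves. This is a sketch under assumptions: the helper `peak_fractions` is hypothetical, the parameter values are illustrative, and the run is extended to 300 days (longer than the 100-day graphs in this paper) so that the slower curves have time to peak.

```python
def peak_fractions(rho, alpha=1 / 5.1, beta=2.4 / 3.3, gamma=1 / 3.3,
                   e0=1e-4, days=300, dt=0.1):
    """Return (peak exposed, peak infected) fractions for a given Rho."""
    S, E, I = 1.0 - e0, e0, 0.0
    peak_E, peak_I = E, I
    for _ in range(int(round(days / dt))):
        new_exposed = rho * beta * S * I * dt   # Rho scales the contact rate
        new_infected = alpha * E * dt
        new_recovered = gamma * I * dt
        S -= new_exposed
        E += new_exposed - new_infected
        I += new_infected - new_recovered
        peak_E = max(peak_E, E)
        peak_I = max(peak_I, I)
    return peak_E, peak_I

# Rho values taken from the three levels of social distancing discussed in this paper.
for rho in (1.0, 0.8, 0.5):
    peak_E, peak_I = peak_fractions(rho)
    print(f"rho={rho:.1f}: peak exposed {peak_E:.3f}, peak infected {peak_I:.3f}")
```

Lower values of Rho reduce the number of new exposures per day, which lowers and delays both peaks.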
3. The SEIR Model Extended
For the third part of my project, I extended my knowledge of the classic SEIR Model by modeling deaths as well as lockdowns enforced on different days. One of the really cool parts of this section is that I used a SciPy module that was new to me, scipy.integrate, which provides routines for numerical integration, including solvers for systems of differential equations. This tool helped me integrate the system of SEIR derivative equations over time. I also added a new function that plots different SEIR graphs based on the given parameters, using an if/else statement. On top of that, I explored three extensions: the exposed category, the death category, and the lockdown category (all similar to the functions in the first 2 steps but adapted to fit a new context).
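One way to hand the SEIR equations to scipy.integrate, with an added death compartment, is sketched here. It uses `scipy.integrate.solve_ivp`; the project may have used a different routine from the same module. The function `seird_rhs`, the way deaths are modeled (a fixed fraction of people leaving the Infected group), and every parameter value, including the 1% death fraction, are assumptions made for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

def seird_rhs(t, y, alpha, beta, gamma, rho, death_fraction):
    """Right-hand side of an SEIR model extended with a Dead compartment."""
    S, E, I, R, D = y
    new_exposed = rho * beta * S * I
    leaving_infected = gamma * I
    return [
        -new_exposed,                                # dS/dt
        new_exposed - alpha * E,                     # dE/dt
        alpha * E - leaving_infected,                # dI/dt
        (1.0 - death_fraction) * leaving_infected,   # dR/dt
        death_fraction * leaving_infected,           # dD/dt
    ]

# Hypothetical parameter values, chosen only for illustration.
params = dict(alpha=1 / 5.1, beta=2.4 / 3.3, gamma=1 / 3.3,
              rho=1.0, death_fraction=0.01)
y0 = [1.0 - 1e-4, 1e-4, 0.0, 0.0, 0.0]
t_eval = np.arange(0, 101)  # one sample per day for 100 days

sol = solve_ivp(lambda t, y: seird_rhs(t, y, **params), (0, 100), y0,
                t_eval=t_eval, rtol=1e-8, atol=1e-10)
S, E, I, R, D = sol.y
print(f"dead fraction after 100 days: {D[-1]:.5f}")
```

The solver chooses its own internal step sizes, so there is no hand-written time loop; `t_eval` only controls where the solution is reported.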
4. The Singapore Specific Functions
For the last part of my project, I used the Pandas library, a fast, powerful, flexible, and easy-to-use open source data analysis and manipulation tool. This was extremely useful for me, especially when it came to sorting the data I collected from Channel News Asia. Through Pandas, I was able to organize and give headings to all of my columns & rows, essentially a “one up” from only using NumPy & SciPy. On top of that, I was able to plot a wide variety of graphs based on the data I collected, starting on January 19th and ending on April 27th, a total of 100 days, which makes the graphs neater & easier to compare with the previous steps above. The main difference between Step 4 and the previous steps is that Step 4 is a direct real-world application of, and comparison with, the SEIR model. In the previous steps, “sample Singaporean societies” were created to make it easier to formulate graphs & data; however, the SEIR Models in Steps 1, 2, and 3 make 2 simplifying assumptions about the real world: (1) people carry lifelong immunity to a disease upon recovery, and (2) the model covers a time period with no migration into or out of the population. With the Singapore Specific Functions, I can compare the real-world data with the sample modeling.
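A small sketch of this kind of Pandas workflow follows. The CSV text, its raw column names, and every number in it are placeholders invented for illustration; they are not the Channel News Asia figures, and the real file layout is not shown in this paper.

```python
import io
import pandas as pd

# Placeholder CSV standing in for the real data file (made-up values).
raw_csv = io.StringIO(
    "date,icu,general_wards,total_discharged,died,total_cases\n"
    "2020-01-19,0,0,0,0,0\n"
    "2020-01-20,0,1,0,0,1\n"
    "2020-01-21,1,2,0,0,3\n"
)

df = pd.read_csv(raw_csv, parse_dates=["date"])

# Give the columns readable headings and index the rows by date.
df = df.rename(columns={
    "date": "Date",
    "icu": "ICU",
    "general_wards": "General Wards",
    "total_discharged": "Total Discharged",
    "died": "Demised/Died",
    "total_cases": "Total Cases",
}).set_index("Date")

# Derived columns: hospital total and day number since the first record.
df["Total Hospitalized"] = df["ICU"] + df["General Wards"]
df["Day"] = (df.index - df.index[0]).days

print(df)
```

Indexing by date and adding a day counter makes it straightforward to plot the real data on the same 0 to 100 day axis as the model curves.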
Results & Analysis
The main purpose of my project was to learn the direct effects social distancing can have on flattening the curve. I focused on the Singaporean population and compared data from before social distancing and after social distancing measures were put into place, but before I could look at the real world data, I needed to simulate what it would look like in a model society. This is where the first three steps of my project come into play.
As you can see from the two graphs above, Figure 4 (from Step 1 of the project) depicts a model example of what Singapore would look like if no social distancing measures were put into place. The graph predicts that slightly more than 25% of people would be exposed to COVID-19 while a little over 10% would actually be infected. In Figure 7 (from Step 2 of the project), the graph changes with different Rho values. Rho, the social distancing factor, is a numerical scale from 0 (everyone participates in social distancing) to 1 (no one participates in social distancing). Singapore went through three different waves of social distancing measures: (1) no social distancing (Rho=1), (2) a little bit of social distancing (Rho=0.8), and (3) half social distancing (Rho=0.5). Based on the graph, the number of people who are exposed to and infected by COVID-19 decreases as Rho decreases, emphasizing the importance of social distancing. This is also one of the reasons why I prefer the SEIR Model to the SIR Model: it is extremely useful when trying to understand the impact social distancing can have on a society.
My Python code in Steps 1 & 2 helped me answer my initial question of interest, but I still wasn’t satisfied with my project. I wanted to go one step further (Step 3). I noticed that Singapore went through three different waves of social distancing, so I wondered how the Singaporean Government enforcing lockdowns at different “critical times” would affect the SEIR Model. I tested lockdowns starting on day 40, 50, 60 and 100.
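A lockdown that starts on a given day can be sketched as a Rho that switches from 1 to a lower value on that day. The helper `peak_infected`, the lockdown strength of Rho = 0.25, and the other parameter values are hypothetical choices for illustration; only the four lockdown days come from the text.

```python
def peak_infected(lockdown_day, rho_lockdown=0.25, alpha=1 / 5.1,
                  beta=2.4 / 3.3, gamma=1 / 3.3, e0=1e-4, days=200, dt=0.1):
    """Peak infected fraction when a lockdown begins on lockdown_day."""
    lockdown_step = int(round(lockdown_day / dt))
    S, E, I = 1.0 - e0, e0, 0.0
    peak = I
    for k in range(int(round(days / dt))):
        # No social distancing before the lockdown, reduced contact after it.
        rho = rho_lockdown if k >= lockdown_step else 1.0
        new_exposed = rho * beta * S * I * dt
        new_infected = alpha * E * dt
        new_recovered = gamma * I * dt
        S -= new_exposed
        E += new_exposed - new_infected
        I += new_infected - new_recovered
        peak = max(peak, I)
    return peak

peaks = {day: peak_infected(day) for day in (40, 50, 60, 100)}
for day, peak in peaks.items():
    print(f"lockdown on day {day:3d}: peak infected fraction {peak:.4f}")
```

Because the runs are identical up to the earliest lockdown day, any difference in the peaks comes purely from how long the outbreak was allowed to grow unchecked.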
As you can see from the graphs above, even the specific day on which the Singaporean Government decides to implement a country-wide lockdown has an important effect on flattening the curve. The ideal situation would be a lockdown anytime within the first 40 days of the outbreak (Figure 8 shows how very few people would be susceptible to, exposed to, or infected by the virus). Figure 9 shows how even a lockdown implemented within the first 50 days would have an immense effect on minimizing the number of people who are exposed and infected. Essentially, these four graphs attempt to show that the longer the Singaporean Government waits to implement the lockdown, the more we see the familiar SEIR curve. The sooner governments around the world choose to lock down and implement social distancing measures, the faster the curve will flatten, allowing medical responders more time and resources to treat sick patients.
For the final part of my project, I wanted to graph a real-life COVID-19 Singapore data set from Channel News Asia to compare the real world with my model. I compiled the data into a CSV file, and then I used the Pandas library to help sort the data into different columns. In total, I had 10 columns: Date, ICU, General Wards, Total Hospitalized, In Care Facilities, Completed Isolation (Recovered), Discharged from Hospital (Recovered), Total Discharged, Demised/Died, and Total Cases. The graphs from the real-life data did not exactly match any of the SEIR Models, but they are interesting and are explored in more detail below.
Let’s start by comparing Figure 12 with Figure 3. In Figure 3, I annotated the graph with an orange circle to draw your attention to the Recovered and Infected portion of the graph, which is the part to compare with Figure 12. In both of these graphs, the trajectories are similar. They both show that in the initial stages of an outbreak, the total number of cases rises exponentially while the total number of people discharged from the hospital grows slowly and cannot keep up with the exponential growth of total cases. The one thing that doesn’t match up is the overall duration. The SEIR Model in Figure 3 suggests that within 100 days society will be back to normal, whereas the real-life Singapore data in Figure 12 shows that this is not the case. The hospitalization trends in Singapore don’t match the SEIR model well either. In Figure 13, the real-world data is not smooth; instead, it’s jumpy and goes up and down at irregular intervals. The peak of the real-world data (80 days) is similar to that of the SEIR model (70 days), but it doesn’t line up quite right. The infected share of the population does not match either: the Singapore data shows roughly 3,000 cases in a population of about 5,000,000, which is 0.06% of the population ((3000/5000000)*100 = 0.06), or a fraction of 0.0006, far below the peak fraction of 0.10 in the SEIR graph. Overall, this shows that SEIR models are a great way of representing data and understanding trends in the big picture, but they are not predictive or representative of what goes on in the real world. At the end of the day, a model is just that: a model. It should be taken lightly in terms of predictions, but it should be taken seriously when considering repeats of a similar situation in the near future.
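The units in the infected-share comparison are easy to mix up, so here is a quick check of the arithmetic. It uses the rounded figures quoted in the text (about 3,000 cases, a population of about 5,000,000, and a model peak of 0.10); the variable names are only for illustration.

```python
cases = 3_000                 # approximate real-world cases, from the text
population = 5_000_000        # approximate population of Singapore, from the text
model_peak_fraction = 0.10    # peak infected fraction read off the SEIR graph

real_fraction = cases / population    # share of the population, as a fraction
real_percent = real_fraction * 100    # the same share, as a percentage

print(f"real-world infected fraction: {real_fraction:.4f}")
print(f"real-world infected percent:  {real_percent:.2f}%")
print(f"model peak / real-world share: {model_peak_fraction / real_fraction:.0f}x")
```

Multiplying by 100 converts the fraction to a percentage, so 0.06 here is a percentage and has to be divided by 100 again before it is compared with the model's fraction of 0.10.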
Conclusions
Two weeks ago, if you had told me that I would try my hand at modeling the COVID-19 outbreak in Singapore, I wouldn’t have believed you. Today, I’m proud of what I’ve accomplished in this ES2 class. I set off on this project planning to explore the effects social distancing might have on a sample population, and I came out having learned more about epidemics and pandemics than I ever had before, which was my original intent. Perhaps the most valuable thing I learned is that models are a great way of gaining an understanding of the real world, but they are not predictions of what is to come. I’m glad that I was able to accomplish all of this within a short time period.
If I were to continue this project in the future, I would compare other countries’ data with the data I collected for Singapore. I would look at the ways Hong Kong managed to prevent COVID-19 from taking a massive toll on its economy by starting social distancing earlier, and I would compare that to countries that went on lockdown later. I would look more into what makes a person susceptible and try to understand better ways of predicting where this pandemic is heading based on the data that I have now. Modeling is good, but prediction at the end of the day is better, and hopefully that’s where our society is heading soon.
Above all, I learned the importance of social distancing. It does a lot for everybody in the world! By staying home, you can not only save lives, but also help flatten the curve for so many other people. This is a great practice that more and more people should take seriously. Even though a lot of places in the world are opening up their borders now, it’s important to remember that the Spanish Flu came in three rounds, and that if we compare the past to the present, we’ve only been hit with the first wave. As the future continues to be something that we can’t predict, we should control what we can: stay home & do our individual parts!
References
https://cmdlinetips.com/2018/03/how-to-change-column-names-and-row-indexes-in-pandas/
https://co.vid19.sg/singapore/dashboard/confirmed?start=06-02-2020&end=11-02-2020
https://ndlib.readthedocs.io/en/latest/reference/models/epidemics/SIR.html
https://towardsdatascience.com/social-distancing-to-slow-the-coronavirus-768292f04296
http://www.mtholyoke.edu/~ahoyerle/math333/ThreeBasicModels.pdf
Acknowledgements
I would like to thank Professor Cross for giving me the initial idea to try my hand at epidemiological modeling of the current pandemic.
I would like to thank Alycia for the idea to compare social distancing measures before and after.
I would like to thank Gaby for the helpful comment and tip on my ES2 Midway Presentation which guided me in the right direction.