A Tutorial on Common Differential Equations and Solutions Useful for Modeling Epidemics Like COVID-19: Linear and Non-Linear Compartmentation Models


Purpose: To review some of the basic models, differential equations and solutions, both analytic and numerical, which produce time courses for the fractions of Susceptible (S), Infectious (I) and Recovered (R) fractions of the population during the epidemic and/or endemic conditions. Methods: Two and three-compartment models with analytic solutions to the proposed linear differential equations as well as models based on the non-linear differential equations first proposed by Kermack and McKendrick (KM) [1] a century ago are considered. The equations reviewed include the ability to slide between so-called Susceptible-Infected-Recovered (SIR), Susceptible-Infectious-Susceptible (SIS), Susceptible-Infectious (SI) and Susceptible-Infectious-Recovered-Susceptible (SIRS) models, effectively moving from epidemic to endemic characterizations of infectious disease. Results: Both the linear and KM model yield typical “curves” of the infected fraction being sought “to flatten” with the effects of social distancing/masking efforts and/or pharmaceutical interventions. Demonstrative applications of the solutions to fit real COVID-19 data, including linear and KM SIR fit data from the first 100 days following “lockdown” in the authors’ locale and to the total number of cases in the USA over the course of 1 year with SI and SIS models are provided. Conclusions: COVID-19 took us all by surprise, all wondering how to help. Spreading a basic understanding of some of the mathematics used by epidemiologists to model infectious diseases seemed like a good place to start and served as the primary purpose for this tutorial.

Share and Cite:

Mulkern, R. and Nosrati, R. (2022) A Tutorial on Common Differential Equations and Solutions Useful for Modeling Epidemics Like COVID-19: Linear and Non-Linear Compartmentation Models. Journal of Applied Mathematics and Physics, 10, 3053-3071. doi: 10.4236/jamp.2022.1010204.

1. Introduction

Since the introduction of coronavirus circa late 2019 (COVID-19), the sentient world has been subjected to media reports with curves representing the time courses of various aspects of the disease like the number of new cases, the number of deaths, the number of Intensive Care Unit (ICU) beds occupied, etc. For example, one common public relations strategy to enhance compliance with recommended health measures has been the concept of “flattening the curve”. The “curves” in such cases are often displayed with “peaks” accompanied by hard horizontal lines hovering over, through, or below the peaks, allowing a ready conception of the limits of various hospital capacities. Generating such epidemic curves and related endemic curves with useful compartmentation models is well-known in the field of epidemiology, with a rich history beginning nearly a century ago with the groundbreaking paper by Kermack and McCendrick [1]. Herein we review in tutorial fashion some of the basic models from which differential equations arise, provide derivations of analytic solutions when possible or provide numerical modeling examples when tackling non-linear aspects of infectious disease transmission. Specifically, the conceptualization of compartmentation and its role in identifying the various differential equations which are used by epidemiologists to model the course of infectious diseases, as well as the relationship between the different models and their solutions, whether analytic or numerical, are emphasized. Problems addressed within this tutorial include understanding the role of the different rate parameters within the model, associated with either social distancing aspects or pharmacological interventions, and to address how measured infection data can be fitted with appropriate models to extract relevant rate parameters.

2. Theory

Life and the diseases that affect it are so complex that it may seem absurd to reduce the effects of contagions upon life by the boxes shown in Figure 1, boxes which represent the susceptible fraction S, the infectious fraction I and the recovered fraction R of the general population. Nevertheless, such boxes and the arrows connecting them are indeed the conceptual starting point for the differential equations whose solutions we will seek in order to better understand and model important, everyday aspects of life during contagious diseases like COVID-19.

The top row of Figure 1 shows what is referred to as the SIR model and the SIR/SIS model. The pure SIR model involves transfer from the S box to the I box at the rate βS in the linear version and the rate βIS in the KM version, the latter reflecting the notion that the more infected people there are the greater the chance of susceptible people becoming infected. In the pure SIR model, it is assumed that the only way to leave the infected box is to go to the recovered box at a rate γ2I with no return to the infected or susceptible box, implying immunity or death. In the SIR/SIS model an additional mechanism exists for people in the

Figure 1. Schematic representations of the SI, SIS, SIR and SIRS models as described in the text. Note the flux rate between the S and I boxes in the linear model (parentheses) is not proportional to the infected fraction I as it is in the Kermack-McKendrick model.

infected box to return to the susceptible box at the rate γ1I, implying a recovery but without achieving immunity.

The middle row shows the simpler SIS/SI models in which individuals never gain immunity or death and so pass back and forth between the susceptible and infectious boxes with no recovery box available. In the SIS/SI models, S to I rate is βS for the linear model and βIS for the KM mode with a backward I to S rate of γI.

The bottom row in Figure 1 shows the SIRS model in which the immunity within the recovered fraction does not last forever. This is reflected by a flux back to the susceptible fraction as shown by the arrow labeled with the rate γ1R from the R box to the S box. In this model, we ignore the possibility of the infected fraction returning to the susceptible fraction. These simple concepts will form the baSIS for the following differential equations, their solutions, expository demonstrations of the effects of individual parameters and finally for modeling some actual data typical of the “curves” being sought to “flatten” which have become routine fixtures in the media.

2.1. The Linear Model

The first linear model (LM) proposed contains three free parameters. These are characterized by the rate constant of β for the virus converting healthy people into sick people, a rate constant γ2 at which sick people leave the infectious pool either through recovery with bestowed immunity or death into the “recovered pool”, and a third rate parameter γ1, which will allow for the possibility of infected people returning to the susceptible pool, i.e., without immunity. The latter provides a sliding parameter that may be adjusted to allow for SIR towards SIS conditions. The units of the rate constants are specified in days−1. The following set of differential equations describes the linearized version of the SIR/SIS conditions of the top row in Figure 1:

d f d t = β f + γ 1 g (1a)

d g d t = β f + γ 1 g γ 2 g (1b)

d m d t = γ 2 g (1c)

where f is the susceptible fraction of the population, g is the infectious fraction of the population and m is the recovered fraction of the population where the vanishing sum of Equations (1a)-(1c) yields the normalization condition,

f + g + m = f 0 + g 0 + m 0 = 1 (2)

where f 0 , g 0 , and m 0 are the starting fractions of each pool at t = 0. With apologies to Kermack and McKendrick [1], who used x, y and z for the actual pool numbers, not fractions, and modern adaptations of KM models which use S, I and R, for fractional populations once properly normalized [2] [3], our notation of f, g and m allow us to reserve their capital counterparts F, G and M for their respective Laplace transforms. Equations (1a)-(1c) are what we refer to as linearized versions of the non-linear equations provided by Kermack and McKendrick, albeit without the γ1 term [1]. Despite the simplification which allows for analytic solutions for f, g and m (vide infra), the essentials of three of the SIR, SIS and SI may be gleaned from them which will also apply to the non-linear KM versions discussed below. In the SIR model, the parameter γ1 is set to 0 which has the following implication. Once an individual passes from the S pool to the I pool, there is no going back. The assumption is that this individual will end up in the R pool, having either recovered with immunity or having died, in either case escaping reinfection. In the SIS model, γ2 is set to 0, effectively implying that there is no recovered pool R and individuals get sick and then return to the healthy but susceptible fraction. With SIS, a steady state level of the infected fraction is achieved. The SI model sets both γ1 and γ2 to 0 and is a limit of the SIS model in which everyone ultimately becomes infected. The SI model is the simplest of the three and interestingly the model selected by Mohazzabi et al. to fit CDC COVID-19 case data over the course of a year within the United States (3) and throughout the world [4]. Regardless, the Laplace transform of the system of Equations (1a)-(1c) leads to the following algebraic relations among F, G and M, with α being the Laplace transform variable:

α F f 0 = β F + γ 1 G (3a)

α G g 0 = β F + γ 1 G γ 2 G (3b)

α M m 0 = γ 2 G (3c)

where F = f e α t d t , etc., as integrated from t = 0 to with the tacit assumption that terms like f e α t at t = vanish. Solving for G from Equations (3a)-(3b) leads to:

G = ( α + β ) g 0 + β f 0 α 2 + α ( β + γ ) + β γ 2 = ( α + β ) g 0 + β f 0 ( α Ψ + ) ( α Ψ ) (4)

with γ γ 1 + γ 2 . The second equality in Equation (4) introduces the roots of the quadratic defining the denominator of the first equality which are:

Ψ ± = [ ( β + γ ) ± ( β + γ ) 2 4 β γ 2 ] / 2 . (5)

A similar derivation for F from Equations (3a)-(3b) yields,

F = f 0 ( α Ψ + ) ( α Ψ ) + γ 1 ( α + β ) g 0 + γ 1 β f 0 ( α + β ) ( α Ψ + ) ( α Ψ ) . (6)

To extract f and g from F and G, the following inverse Laplace transform rule [5] which utilizes the residues (Res) of F and G at their poles is invoked:

g = Res ( G ( α i ) ) e α i t (7a)

f = Res ( F ( α i ) ) e α i t (7b)

where the sum is over two poles for G and three poles for F. Proceeding by inspection, we find that g and f consist of two and three exponential functions of t, respectively, which read:

g = [ ( ( Ψ + + β ) g 0 + β f 0 ) e Ψ + t ( ( Ψ + β ) g 0 + β f 0 ) e Ψ t ] / ( Ψ + Ψ ) (8a)

f = f 0 e β t + γ 1 β f 0 e β t ( β + Ψ + ) ( β + Ψ ) + ( γ 1 ( Ψ + + β ) g 0 + γ 1 β f 0 ) e Ψ + t ( Ψ + + β ) ( Ψ + Ψ ) + ( γ 1 ( Ψ + β ) g 0 + γ 1 β f 0 ) e Ψ t ( Ψ + β ) ( Ψ Ψ + ) . (8b)

One could also solve for m via Equation (3c) and Equation (4) using the inverse Laplace transform but there is no need as the normalization condition of Equation (2) may be used once f and g are specified. A further set of differential equations which will underpin a linearized version of the SIRS model considers a flux from the susceptible fraction S to the infected fraction I and then to the recovered fraction R but then also incorporates the possibility of “immunity” within the recovered fraction to “wane” by introducing a flux from the recovered back to the susceptible fraction, as shown in the lower row of Figure 1. The differential equations governing this situation read:

d f d t = β f + γ 1 m (9a)

d g d t = β f γ 2 g (9b)

d m d t = γ 2 g γ 1 m . (9c)

Again, the Laplace transform approach may be gainfully applied to this system with the details left to the reader. One obtains for f, g and m the following time courses:

f = γ 1 γ 2 β γ 1 + β γ 2 + γ 1 γ 2 + ( f 0 ( β + Ψ + ) ( β + Ψ ) γ 1 γ 2 ) e β t ( β + Ψ + ) ( β + Ψ ) + ( m 0 γ 1 ( β + Ψ + ) ( γ 2 + Ψ + ) + g 0 γ 1 γ 2 ( β + Ψ + ) + γ 1 γ 2 β f 0 ) e Ψ + t Ψ + ( Ψ + + β ) ( Ψ + Ψ ) + ( m 0 γ 1 ( Ψ + β ) ( Ψ + γ 2 ) + g 0 γ 1 γ 2 ( Ψ + β ) + γ 1 γ 2 β f 0 ) e Ψ t Ψ ( Ψ + β ) ( Ψ Ψ + ) (10a)

m = β γ 2 β γ 1 + β γ 2 + γ 1 γ 2 + ( m 0 ( Ψ + + β ) ( Ψ + + γ 2 ) + g 0 γ 2 ( Ψ + + β ) + f 0 β γ 2 ) e Ψ + t Ψ + ( Ψ + Ψ ) + ( m 0 ( Ψ + β ) ( Ψ + γ 2 ) + g 0 γ 2 ( Ψ + β ) + f 0 β γ 2 ) e Ψ t Ψ ( Ψ Ψ + ) (10b)

g = 1 f m (10c)


Ψ ± = [ ( β + γ 1 + γ 2 ) ± ( β + γ 1 + γ 2 ) 2 4 ( β γ 1 + β γ 2 + γ 1 γ 2 ) ] / 2 . (11)

For non-vanishing rates of β, γ1 and γ2, the susceptible, infected, and recovered fractions achieve steady state values as t approaches ∞ which are given by:

f s s = γ 1 γ 2 β γ 1 + β γ 2 + γ 1 γ 2 (12a)

g s s = β γ 1 β γ 1 + β γ 2 + γ 1 γ 2 (12b)

m s s = β γ 2 β γ 1 + β γ 2 + γ 1 γ 2 . (12c)

2.2. SIR and SIRS Condition “Curve Flattening” with the Linear Model

Utilizing Equations (8a)-(8b), we first examine how decreasing the β parameter, the transmissibility rate, through such factors as social distancing, the use of personal protective gear, business closings, limited capacity recreational events, etc., can flatten the curve within the context of pure SIR conditions, e.g., γ 1 = 0 in Equations (1a)-(1c). The γ2 parameter, in contrast and if given no effective treatment, is largely based on the biology of the virus and the host which may be somewhat beyond our control, though the advent of vaccines and more nuanced control of sick patients does allow some control over the γ2 parameter and allows us to demonstrate how increasing this parameter via effective medical practices can also flatten the curve within the context of pure SIR conditions.

Figure 2(a) shows the effects of decreasing β (social distancing, etc.) on “flattening the curve” when β is decreased for a fixed γ2. The solid curves in the figure

Figure 2. (a) Effects of social distancing to “flatten the curve” with the linear model. Solid lines are plots of f, the susceptible fraction, vs time while the dashed traces are plots of g, the infected fraction, vs. time with increasing degrees of social distancing represented by the black, red, and blue curves, respectively. See text for specific parameter values. (b) Effects of potential pharmaceutical interventions evaluated with the linear model for a fixed social distancing parameter β with the solid black curve showing the decay of the susceptible fraction f and with the blue, red and black dashed traces showing increasingly effective pharmaceutical interventions on the infected fraction g, as reflected by increasing γ2 values. See text for specific parameter values. (c) The effects of allowing for a return from the recovered to the susceptible pool with positive values ofγ1 in Equations (9a)-(9c) are demonstrated by plotting the susceptible fraction (solid lines) and the infected fraction (dashed lines) as a function of time for variable immunity periods (see color inset) for fixed values of β and γ2 provided in the text. This unflattening of the curves with waning immunity lifts steady state values for infected and susceptible fractions away from 0, heralding an endemic vs. an epidemic situation. Immunity must last forever (dark blue curves) for an epidemic or pandemic which are the same things, just on different scales.

represent f, the healthy but susceptible fraction of the population. The dashed curves represent g, the sick fraction of the population. The curves were generated with a fixed γ2 value of 0.072 days−1 and β values of 0.096 days−1 (black curves), 0.048 days−1 (blue curves) and 0.024 days−1 (red curves). In this case, the strategy to strongly advise or otherwise mandate various degrees of social distancing to reduce the rate at which the virus spreads clearly worked. With the highest β value, perhaps representing the case of no social distancing, one sees the highest peak height of the g curve, a worst-case scenario with the potential to surpass various hospital capacities while the next two curves with lower β values show a successive lowering of the peak heights, hopefully flattened sufficiently so as to not overwhelm various hospital capacities. It is worth noting however that “flattening the curve” via increasing levels of social distancing also results in more infected people in the later stages of the pandemic. For example, Figure 2(a) indicates that at 70 days from t = 0 approximately 10% of the population remains infected in the case of the highest social distancing while only ~2% remains infected in the absence of social distancing. In fact, one finds identical areas beneath each of the three g curves in Figure 2(a) for this model as proven by calculating the area under g curves over all time from 0 to ∞. For SIR conditions ( γ 1 = 0 ) we have:

g ( t ) d t = f 0 β γ 2 β ( e β t e γ 2 t ) d t + g 0 e γ 2 t d t = f 0 + g 0 γ 2 (13)

thus, proving identical areas under the g curves for fixed γ2 and areas which are independent of β. This result has some implications regarding “flattening the curve”. Namely, within the confines of this model and with no way to affect γ2, then regardless of the degree of social distancing, masking, etc., invoked to “flatten the curve”, the same number of people will ultimately get sick, reach ICU units and/or ventilators, and die or recover.

Figure 2(b) shows the effects of potential pharmaceutical interventions when social distancing is fixed by using a β of 0.072 days−1 with the susceptible fraction f plotted as the solid black curve. The blue, red and black dashed curves plotted the infected fraction g for increasing levels of the effectiveness of potential pharmaceutical interventions and were generated using γ2 values of 0.024, 0.048 and 0.096 days−1, respectively. Figure 2(b) shows that effective interventions can alleviate the burden of disease as quantified by the area under each of the three curves, the red and black dashed curves showing half and one-fourth of the area of the blue curve, in accordance with their respective γ2 values (see Equation (13)).

The effects on the curves when the immunity bestowed upon the recovered fraction is allowed to wane, as portrayed by the lower row of Figure 1 and characterized via Equations (9)-(12) are provided in Figure 2(c) which shows that as one allows a flux from the recovered to the susceptible pool with a non-vanishing γ1 the steady state levels of the infected fraction rise from 0, in the pure SIR condition, to levels commensurate with Equation (12b) while the susceptible fraction also rises to non-vanishing levels (Equation (12a)) at long times. This “unflattening” of the curves are shown as solid and dashed lines in Figure 2(c) for the susceptible fraction f and the infected fraction g, respectively, for fixed β and γ2 values of 0.076 days−1 and 0.096 days−1, respectively, as γ1 value ranges from 0 to 0.07 days−1. This range of values for γ1 has been gleaned from some recent literature regarding various measurements of the length of immunity from COVID-19 following recovery [6] [7] [8]. The lifting of the steady state values from 0 to non-zero values for the infected and susceptible pools heralds the transition from an epidemic situation to an endemic situation.

The linear models proposed above successfully achieved the goal of generating curves amenable to flattening, curves which resemble those reported in the media. Furthermore, the curves so generated behaved quite sensibly with respect to alterations in the parameters such as β and γ2 as indicative of the effects of social distancing type interventions and pharmacological type interventions, respectively. We defer a discussion of the γ1 parameter in Equations (3a)-(3c) which allows for the possibility of a return of an individual from the I pool to the S pool for potential reinfection until after a discussion of the non-linear KM type models. There are a number of differences between the linear models and the KM models more commonly deployed for epidemiological considerations, differences we now discuss in the following exposition of the KM models.

2.3. Kermack and McKendrick (KM) Models SI, SIS and SIR

The article which Kermack and McKendrick published [1] shortly after the end of the “Great Influenza” of the last century [9] laid the foundations for much of the mathematical modeling associated with epidemics since. The following system of equations, normalized [2], slightly modified to include an I to S rate γ1, and cast in modern terminology, represent our KM starting points:

d S d t = β S I + γ 1 I (14a)

d I d t = β S I γ 1 I γ 2 I (14b)

d R d t = γ 2 I (14c)

where the normalization relation:

S + I + R = 1 (15)

follows from the vanishing sum of Equations (14a)-(14c). As Breda et al. [10] have recently and somewhat caustically remarked regarding the 1927 KM paper that “…even experienced experts believe that the paper is just about (this) system…” while pointing out that Kermack and McKendrick introduced this system of equations only as a special case of their more general formulation. Not being such experts, we happily consider these equations, noting that the most notable difference between the linear models and those of Equations (14a)-(14c) is the recognition that in the latter, the rate at which people are removed from the S fraction to the I fraction is not simply proportional to the S fraction but also to the infectious fraction I. The more people who are sick, the more likely a healthy person will encounter a sick person and so, quite reasonably, arises the ± βIS terms in Equations (14a) and (14b). Though this assumption has been challenged, particularly at high infection fractions [11], the equations posed have formed a very important backdrop from which to study not just disease spread but also some interesting mathematics [10] - [17]. Despite statements to the contrary [13], Equations (14a)-(14c) are non-linear so any hope that the Laplace transform approach to their solution might be of value is quickly vanquished, though this also means that notationally we no longer need to reserve capital letters for the Laplace transform counterparts of f, g and m and directly utilize the S, I and R representations for the three fractions. Rather than attempting to solve these equations as written, let’s first look at two simplified versions of them which are commonly deployed [3] [4] [10] [11] [12] [14] [15] [16] [17], the SI and SIS models, both of which have useful analytic solutions but do not lead to curves with “peaks” to flatten.

The SI model involves considering only the susceptible fraction S and the infectious fraction I with no loss from the latter to a recovered fraction. This is done by setting γ 1 = γ 2 = 0 in Equations (14a)-(14c) to obtain:

d S d t = β S I (16a)

d I d t = β S I (16b)

with S + I = 1 . To solve this set of equations, this normalization condition is used to substitute for S into Equation (16b) as in:

d I d t = β ( 1 I ) I (17a)

or rearranging

d I I ( 1 I ) = β d t (17b)

Now since d d l ( ln ( I 1 I ) ) = 1 I ( 1 I ) direct integration of the left side from I0 to I and the right side from 0 to t may be performed to obtain, after some tedious rearrangement, the following expression for I:

I = I 0 e β t 1 I 0 ( 1 e β t ) . (18)

Thus, we see that the SI model has an analytic solution which yields an infectious fraction I which is a monotonically increasing function of t, as shown by the black curve in Figure 3(a) which assumed a β value of 0.1 days−1 and an initially infected fraction, I0, of 0.01. For any value of the single parameter β, the infectious fraction tends towards unity given enough time. The model simply implies that sooner or later all the susceptible fraction becomes infected, though there is no curve to flatten. The SI model effectively contains no further information as the only other variable of interest S, is simply 1 − I. There is one interesting feature of the SI model which is shared by the other KM models discussed below. This is the feature that if I 0 = 0 , as in no sick people, the infectious fraction I will remain 0 and there will be no spread of disease with time. This is a very sensible feature indeed for an infectious phenomenon and one not shared by the linear model where the increase of the infected fraction (g in Equation (1b)) occurs from the start no matter what. Turning to the slightly more complicated SIS model, the governing differential equations read:

d S d t = β S I + γ I (19a)

d I d t = β S I γ I (19b)

In this model, the rate γ represents the flux of sick people back to the susceptible pool, recovery without immunity. We still have the normalization condition for the two fractions S + I = 1 so that Equation (19b) may be written as:

Figure 3. (a) Kermack-McKendrick model SI and SIS curves for a fixed value of β but with increasing values of γ (inset) and as described in the text. In these simulations with β > γ, a steady state level of infection is achieved which decreases as γ, the proportionality constant of the rate of the infected fraction back to the susceptible fraction, increases. (b) Kermack-McKendrick model SI and SIS curves for a fixed value of β but with increasing values of γ (inset) and as described in the text. In these simulations with β< γ, the vanishing steady state level of infection is achieved faster as γ, the proportionality constant of the rate of the infected fraction back to the susceptible fraction, increases.

d I d t = β ( 1 I ) I γ I (20)

After Shabbir et al. (16) we seek the solution to this equation by making the substitution I = 1 y and with d d t ( 1 y ) being ( 1 y 2 ) d y / d t , Equation (20) becomes:

d y d t = y ( γ β ) + β (21a)

or equivalently

d y y ( γ β ) + β = d t (21b)

Integrating the left and right sides from y0 to y and 0 to t, respectively, may be readily accomplished by noting that d d y ( ln ( β + y ( γ β ) ) ) = γ β β + y ( γ β ) .

Performing the requisite integrations and rearranging leads to the solution for y as:

y = ( β + ( γ β ) y 0 ) e ( γ β ) t β γ β (22)

whose inverse finally yields the sought after I as:

I = I 0 ( γ β ) e ( γ β ) t γ β + β I 0 ( 1 e ( γ β ) t ) I 0 e ( γ β ) t 1 + Ψ I 0 ( 1 e ( γ β ) t ) (23)

where we have defined Ψ β γ β . It is easily shown that Equation (23) of the SIS model reduces to Equation (18) of the SI model when γ is set to 0. The primary difference between SI and SIS is that the endpoint of the infectious fraction is not unity but rather some fraction less than unity. That is, given enough time not all the population becomes infected, as in SI, but rather the steady state achieved involves some smaller portion of the full population. Figure 3(a) shows SIS model simulations for the same β and I0 used for the SI model, β = 0.1 days−1, I0 = 0.01, but with γ values of 0.025, 0.05 and 0.075 days−1, respectively. Figure 3(b) shows what happens when the γ values exceed the β value. Namely, the infectious fraction dies out as shown using the β of 0.1 days−1 and γ values of 0.1 (black curve), 0.12 (red curve), 0.14 (blue curve) and 0.16 (magenta curve) days−1, respectively.

As mentioned briefly above Mohazzabi et al. used the SI model to fit available CDC measurements of the 5-day rolling average of individuals infected with COVID-19 over the course of a year within the United States [3] and also for similar data reported on a worldwide basis [4]. The SI model is, of course, just the SIS model with γ in Equations (19a) and (19b) set to 0. We have applied the SI and the SIS model to the same CDC data for the United States used by Mohazzabi et al. The results are shown in Figures 4(a)-(c). Figure 4(a) shows the actual data (black curve) while the red and blue curves fit the data as performed with the SI and SIS models, respectively.

To convert the fractional population of infected people I to the numbers of infected people, multiplication by the estimated total population for people living in the USA was performed assuming this population was 3.31 × 108 individuals, the same estimate used by Mohazzabi et al. [3]. Also, as in that work, the total number of infected individuals at t = 0, n0, was incorporated as a free parameter in addition to the free parameters β and γ. Our fits yielded the parameters shown

Figure 4. (a) Black curve shows the actual number of cases in the USA over 1 year with fits to the SI and SIS models (red and blue curves respectively). (b) Residuals from both fits to the data and (c) model predictions over an extended period with the parameters from the SI and SIS model fits, as provided in Table 1.

Table 1. Parameters for the SI andSIS model fits as seen in Figure 4(a) to actual number of cases in the USA over 1 year.

in Table 1, with the parameters for the SI model very similar to those reported by Mohazzabi et al. for this model. The model fitting for SI model was based on Equation (18) and the fitting coefficients were n0 and β. For the SIS model, the estimated n0 from SI model fit was used as an input and β and γ were estimated by fitting the data to Equation (23).

Using the SIS model with one additional free parameter a slightly higher β is returned and of course a non-vanishing γ, but with equivalent correlation coefficients R2 of 0.993 for both models fits. Visualization of this equivalence is provided in Figure 4(b) which shows the residuals from both fits. Despite the equivalence of the SI and SIS model fits the data over this limited time range of 1 year, the ramifications of longer time periods for either fit may be appreciated by Figure 4(c) in which the SI model leads to the entire population being infected while the SIS model parameters lead to significantly fewer numbers of infected people at long times. Of course, neither of these models incorporates a recovered population so the “bell” curves occurring in local populations and reported in the media will not be generated from such models and so we now turn to the three-compartment models to generate such curves.

Thus, we examine the original KM equations, Equations (14a)-(14c), albeit with the additional γ1 term, and seek their solutions, as others have [10] - [17], to no avail. There is simply no obvious analytic solution and those who have offered “algebraic” solutions offer up approximations, as KM did, and/or infinite series of elementary functions. It is, however, quite feasible to perform numerical evaluations of the KM equations and we do so with the aid of a mathematical trick originally provided by Kermack and Mckendrick who wrestled the three differential equations into one. Namely, one divides Equation (14a) by Equation (14c) to obtain:

d S d R = γ 1 β S γ 2 (24a)


d S γ 1 β S = d R γ 2 (24b)

which may be directly integrated to obtain:

S = γ 1 β ( γ 2 β S 0 ) e β R γ 2 . (25)

Applying the normalization condition and substituting Equation (25) into Equation (14c) one obtains:

d R d t = γ 2 I = γ 2 ( 1 R S ) = γ 2 ( 1 R γ 1 β + ( γ 1 β S 0 ) e β R γ 2 ) (26)

The only difference between this equation and that provided by Kermack and McKendrick nearly 100 years ago is the γ1 term which, again, simply allows for sliding between SIS and SIR conditions depending on the value of γ1 which, when non-zero, provides a means for people in the infected pool to go back to the susceptible pool in addition to, perhaps, proceeding to the recovered/dead pool via the rate γ2. It is now straightforward to utilize Equation (26) to numerically propagate a solution for R iteratively as in:

R i + 1 = R i + ( d R i / d t ) Δ t (27)

where d R i / d t is given by the right side of Equation (26) at each integer time step i and ∆t is the length of each time step. Since numerical evaluation was a luxury Kermack and McKendrick could not really afford, at least not easily in their time, they may be excused for expanding the exponential in Equation (26) to 2n’d order and providing an approximate solution in terms of standard functions. The advent of modern computing machines allows us to directly implement Equation (27) into Matlab (The Mathworks, Needham, MA) as a recursive relation, allowing us to generate the time courses of R, and subsequently I, as exemplified by the curves in Figure 5 which are intended to show how the standard SIR type curves from the KM model, with vanishing γ1, morph into the analytically tractable SIS like curves as one introduces stronger γ1 values in relation to γ2. The curves shown utilized a β value 0.1 days−1, an I0 of 0.01, a time sampling ∆t of 0.25 days and a total of 800 time samples from 0 to 200 days. The red curve is a pure SIR curve with γ 1 = 0 and γ 2 = 0.04 days−1, the magenta curve is a pure SIS curve with γ 2 = 0 and γ 1 = 0.04 days−1. The dashed black line through the magenta curve was generated with the analytic solution to the SIS model embodied in Equation (23) and confirms the accuracy of the numerical solution (magenta curve) in this regime. The numerically generated black and blue curves are SIS/SIR mixtures with γ1 value of 0.04 days−1 and γ2 values of 0.005 and 0.01 days−1, respectively. The introduction of positive γ2 values into pure SIS models is seen to result in a “flattening” of the pure SIS model curves whose steady state values after many days are not representative of the curves typically reported in the media for COVID-19 epidemic outbreaks. The thick red curve in Figure 5, the pure SIR model with vanishing γ1, is more representative of the curves reported in the media for COVID-19 outbreaks.

An application of the linear model and the KM model with SIR conditions to some real data is provided in Figure 6(a) where the number of cases reported in Massachusetts for the first 100 days following the initial lockdown in March of 2020 are plotted along with fits to both the linear model (red curve) and the KM model (blue curve). The parameters for the linear and KM model fits are provided in Table 2, where SF is a Scaling Factor found from a three-parameter fit

Figure 5. Numerically generated SIR and SIRS curves. The red curve is a pure SIR curve with γ1 = 0 and γ2 = 0.04 days−1, the magenta curve is a pure SIS curve with γ2 = 0 and γ1 = 0.04 days−1. The dashed black line through the magenta curve was generated with the analytic solution to the SIS model embodied in Equation (23) and confirms the accuracy of the numerical solution (magenta) in this regime. The numerically generated black and blue curves are SIS/SIR mixtures with γ1 value of 0.04 days−1 and γ2 values of 0.005 and 0.01 days−1, respectively.

Figure 6. (a) Infected population reported in Massachusetts beginning on March 20, 2020, immediately following the initial lockdown in our state along with fits to the linear model of equation 8a with a scaling factor added as an additional free parameter (red curve) and to the Kermack and McKendrick (KM) model using manual adjustments of the free parameter and the same scaling factor found for the regression analysis to the linear model. Table 2 provides the fit parameters. The correlation coefficients for the linear model and the KM model were 0.83 and 0.87, respectively. (b) Residual plots evaluated from the linear and KM model fits to the data of Figure 6(a).

Table 2. Parameters for the linear model and KM model fits seen in Figure 5(a) to the first 100 days of COVID-19 cases as reported in Massachusetts following the initial “lockdown” of March 2020. SF is a scaling factor which was ascertained as a free parameter from the linear model fit and then applied to the KM fit.

(β, γ2 and SF) to the linear model and was used for the KM fit as well. Note that only for the linear model with its analytic solution could the standard non-linear regression analyses of Matlab be used for the three-parameter fit while for the numerical KM curve shown, manual adjustment of the free parameters β and γ2 was performed to minimize the r2 value with f0 and SF fixed at 0.95 and 4798, respectively. The r2 values of 0.83 and 0.87 for the linear and KM models, respectively, are quite similar, albeit slightly better for the KM model (Table 2) with the residuals for both fits shown in Figure 6(b).

3. Discussion

Summarizing the differential equations and solutions discussed, we first note that with only two compartments—the fractions S and I alone—both the linear model and the KM approach, in which one of the rates connecting the S to I boxes is proportional to the I fraction (Figure 1), support analytic solutions as derived above but do not yield the up and down curves typical of COVID-19 curves reported in the media. For the three-compartment model in which the recovery fraction R may or may not have waning immunity, only the linear model supports analytic solutions while the KM approach must be dealt with numerically. The benefits of having analytic solutions to readily understand the effects of parameters which reflect either social distancing or pharmaceutical interventions are obvious (Figure 2), as are the benefits of being able to provide analytic expressions for areas under the curves or their peak heights. It is also clear that the three compartment linear model can compete reasonably well with the KM approach in modeling typical “curves” as shown by the similar correlation coefficients found from the fits of Figure 5. Of course, only for the linear model with its analytic solution could standard regression analyses, which rely on derivatives of the fitting function, be fully brought to bear on this problem. We consider one last “curve flattening” exercise which further emphasizes the benefits of analytic solutions by returning to the original set of Kermack-McKendrick equations but with an exponent parameter α, not a Laplace transform variable in this case, that links the linear model to the KM approach. Namely as α varies from 0 to 1 in the following set of equations, one transition from the linear model to the KM model:

d S d t = β S I α (28a)

d I d t = β S I α γ I (28b)

d R d t = γ I . (28c)

No analytic solution appears possible except for the α = 0 linear model but, with modern computational capabilities, time courses for the fractions may be generated directly from these equations. Unfortunately, the trick of turning the three equations into one, as done by Kermack and McKendrick for α = 1 , is no longer viable. Nevertheless, a simultaneous numerical propagation approach for all three equations serves quite well to generate curves as a function of αfor fixed values of β and γ, as shown in Figure 7. The figure plots infected fraction curves for five values of α from 0 to 1.1 for β and γ values of 0.096, 0.02 days−1, respectively, and with starting conditions of S 0 = 0.99 and I 0 = 0.01 . Note that as αincreases the curves “flatten” and it might be of interest to evaluate the areas under the curves as a function of α to see if they are all the same. Unfortunately, only for the blue curve, the α = 0 linear model, is this possible. To evaluate areas under the other curves, numerical integration is required and of course must be performed individually for each curve as generated with a wide range of parameters. Though this can be done, it is rather onerous if not Herculean task. Similarly, Figure 7 shows how the peak heights diminish with α which would be considered beneficial from the hospital overload perspective though again, the inability to take an analytic derivative for all but α = 0 means numerical methods must be deployed to evaluate peak height dependencies on α, another onerous task when considering the wide range of β and γ values that need to be investigated to arrive at any firm conclusions. Finally, concluding that a parameter like α would be a valuable addition to fits like those shown in the data in Figure 6(a) would benefit from the ability to perform non-linear regression analyses, a task again made difficult if not impossible due to the numerical nature of the curve generation. In this regard, it is of interest to note that approximate analytic solutions to non-linear, coupled differential equations can be determined, an

Figure 7. Numerically simulated infection fraction curves for a Kermack and McKendrick (KM) model which considers the flux rate from the S to I fraction to be proportional to Iα where α ranges from 0 (linear model) through 1 (KM model) to 1.1, why not? It is just another numerical simulation as are all the curves except for the blue, α = 0 curve. Note that as α increases these curves “flatten”, our last flattening demonstration, though it is difficult to ascertain whether the areas under the curves remain the same for these particular fixed values of the other free parameters in the model described by Equations (28a)-(28c).

excellent example being provided by Varadharajan and Rajendran [18] who considered the non-linear differential equations associated with enzyme-substrate-product interactions. Using a “homotopy perturbation method”, discussed in [18] and references therein, analytic solutions which well-approximated numerical solutions of non-linear equations not dissimilar to those discussed above were produced which, of course, may be integrated or differentiated, a topic of interest indeed but which goes beyond the scope of this tutorial.

4. Conclusion

There remains a lot of uncertainty regarding the eventual time course of the coronavirus throughout individual countries and the world. Model predictions abound and disagreements among the experts and non-experts regarding the assumptions and ultimate validities of different models have provided daily entertainment for those of us who sat “working at home” in our armchairs doing our part to reduce the probability of individual viral infection and/or its spread. As the COVID-19 storm passes through us, valuable data become available in the form of infection vs. time curves, etc. Fits such curves with equations like those reviewed herein may provide insights into the different strategies deployed in different regions to mitigate the effects of the virus. Experienced mathematically oriented epidemiologists may find nothing new or surprising in the analyses we have provided and undoubtedly possess even more sophisticated models which will include nuances such as age dependencies on the rates of infection/recovery, effects of the initial viral load on disease severity and transmission, “vital statistics” [17] which account for births and deaths affecting the overall population and critically, accounting for data distortions due to insufficient or ineffective testing and subsequent knowledge of the overall infection and recovery counts. In any event, the virus and its measured effects will long provide a fruitful epidemiological gold mine for model testing and predictions regarding future pandemics, models that can help guide public health policies in times of confusion.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.


[1] Kermack, W.O. and McKendrick, A.G. (1927) A Contribution to the Mathematical Theory of Epidemics. Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, 115, 700-721.
[2] Deakin, M.A. (1975) A Standard Form for the Kermack-McKendrick Epidemic Equations. Bulletin of Mathematical Biology, 37, 91-95.
[3] Mohazzabi, P., Richardson, G. and Richardson, G. (2021) A Model for Coronavirus Pandemic. Journal of Infectious Diseases and Epidemiology, 7, 197.
[4] Mohazzabi, P., Richardson, G. and Richardson, G. (2021) A Mathematical Model for Spread of COVID-19 in the World. Journal of Applied Mathematics and Physics, 9, 1890-1895.
[5] Marsden, J.E. (1973) Basic Complex Analysis. W.H. Freeman and Company, San Francisco, 388-409.
[6] Ibarrondo, F.J., Fulcher, J.A., Goodman-Meza, D., Elliott, J., Hofmann, C., Hausner. M.A., et al. (2020) Rapid Decay of Anti–SARS-CoV-2 Antibodies in Persons with Mild COVID-19. The New England Journal o f Medicine, 383, 1085-1087.
[7] Jiang, X.L., Wang, G.L., Zhao, X.N., Yan, F.H., Yao, L., Kou, Z.Q., et al. (2021) Lasting Antibody and T Cell Responses to SARS-CoV-2 in COVID-19 Patients Three Months after Infection. Nature Communications, 12, Article No. 897.
[8] De Giorgi, V., West, K.A., Henning, A.N., Chen, L.N., Holbrook, M.R., Gross, R., et al. (2021) Naturally Acquired SARS-CoV-2 Immunity Persists for Up to 11 Months Following Infection. The Journal of Infectious Diseases, 224, 1294-1304.
[9] Barry, J.M. (2005) The Great Influenza: The Story of the Deadliest Pandemic in History. Penguin Books, London.
[10] Breda, D., Diekmann, O., de Graafb, W.F., Pugliese, A. and Vermiglio, R. (2012) On the Formulation of Epidemic Models (An Appraisal of Kermack and McKendrick). Journal of Biological Dynamics, 6, 103-117.
[11] Capasso, V. and Serio, G. (1978) A Generalization of the Kermack-McKendrick Deterministic Epidemic Model. Mathematical Biosciences, 42, 43-61.
[12] Pakes, A.G. (2015) Lambert’s W Meets Kermack-McKendrick Epidemics. IMA Journal of Applied Mathematics, 80, 1368-1386.
[13] Kaxiras, E. and Neofotistos, G. (2020) Multiple Epidemic Wave Model of the COVID-19 Pandemic: Modeling Study. Journal of Medical Internet Research, 22, e20912.
[14] Harko, T., Lobo, F.S.N. and Mak, M.K. (2014) Exact Analytical Solutions of the Susceptible-Infected-Recovered (SIR) Epidemic Model and of the SIR Model with Equal Death and Birth Rates. Applied Mathematics and Computation, 236, 184-194.
[15] Shabbir, G., Khan, H. and Sadiq, M.A. (2010) A Note on Exact Solution of SIR and SIS Epidemic Models. arXiv.
[16] Carvalho, A.M. and Gonçalves, S. (2016) An Algebraic Solution for the Kermack-McKendrick Model. arXiv.
[17] Hethcote, H.W. (2006) The Mathematics of Infectious Diseases. SIAM Review, 42, 599-653.
[18] Varadharajan, G. and Rajendran, L. (2011) Analytical Solutions of Non-Linear Differential Equations in the Single-Enzyme, Single-Substrate Reaction with Non-Mechanism-Based Enzyme Inactivation. Applied Mathematics, 2, 1140-1147.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.