Meaningful Contact Estimates among Children in a Childcare Centre with Applications to Contact Matrices in Infectious Disease Modelling

Abstract

We present a mathematical model of a day care center in a developed country (such as Canada), in order to use it for the estimation of individual-to-individual contact rates in young age groups and in an educational group setting. In our model, individuals in the population are children (ages 1.5 to 4 years) and staff, and their interactions are modelled explicitly: person-to-person and person-to-environment, with a very high time resolution. Their movement and meaningful contact patterns are simulated and then calibrated with collected data from a child care facility as a case study. We present these calibration results as a first part in the further development of our model for testing and estimating the spread of infectious diseases within child care centers.


Flynn-Primrose, D. , Hoover, N. , Mohammadi, Z. , Hung, A. , Lee, J. , Tomovici, M. , Thommes, E. , Neame, D. and Cojocaru, M. (2022) Meaningful Contact Estimates among Children in a Childcare Centre with Applications to Contact Matrices in Infectious Disease Modelling. Journal of Applied Mathematics and Physics, 10, 1525-1546. doi: 10.4236/jamp.2022.105107.

1. Introduction

Population health modelling within applied mathematics is a wide area of research dominated by several model types: deterministic compartmental models [1] - [7], individual based models [5] [8] [9] with extensive literature review in [10] [11], game-theoretic based models (see [12] [13] and references therein) and data analysis models [6] [14] - [21], all of them dedicated in general to large populations. The large population assumptions of compartmental models or games are needed for averaging the behaviour of individuals within compartments or population groups, respectively. In contrast, individual (or agent) based models (ABM) seek to capture emergent behaviour at population level by modelling individual interactions.

Agent based models have become an increasingly popular modelling framework amongst various scientific disciplines in recent years, including economics and engineering [22] [23] [24] [25] [26], sociology [27] [28] [29] [30], psychology [31], and population health (see the review paper [8]) as well as [5] [9] [10] [11]. Unlike differential equation models, ABMs are able to readily introduce heterogeneity into individual attributes and are tailored to reflect the emerging behaviour, at population level, resulting from the agent-to-agent and/or agent-to-environment interactions. ABMs of the spread of infectious diseases have received a fair amount of attention from various researchers (see the review paper [8] as well as the more recent [10] [11] and references therein). Extensive searches within the PubMed and Google Scholar databases reveal little mathematical modelling literature that specifically models concurrent infections in a child care setting with an agent-based model, together with its impact on the immediate community. The papers [15] [18] [20] [21] [32] are of most relevance to us, given their studies in pathogen transmission. Specific to influenza A in a day care center, we look at [15] [21] as they measure viral load on day care surfaces and the air load distributions of this virus. The paper [17] is widely cited in relation to influenza A in a day care setting and its impact on secondary infections in children’s households.

For current policies related to primary prevention guidelines in child care facilities in Canada we can consult Public Health Departments which set requirements for child care licenses (see [33] ). The study in [6] may be of use, as it relates the type of health care decisions/costs that may result from an infected child visiting a family physician. Risk factors for respiratory viruses in day care facilities in Europe are presented in [14].

In this paper, we build an ABM of a child care center modelled after the Child Care and Learning Center (CCLC) at the University of Guelph, in Guelph, ON, Canada, so that interactions between human entities (children, adults, etc.) can be described with a high level of detail. The number of agents is very small compared to a usual population level ABM, thus the time resolution of our simulations can be extremely high, and small number statistics effects can be uncovered. Given the combination of these traits, together with the already collected contact data from the CCLC, our proposed model is uniquely suited to estimating contact rates that could lead to pathogen transmission. We present below the model and its calibration on contact and movement patterns using the gathered agent-to-agent contact data. Empirical data detailing agent-to-agent contacts in an institutional setting is rare due to the numerous issues regarding ethics and parental consent. This motivates our goal of generating synthetic data that reliably resembles empirical results. The child care facility we have as a case study encompasses two types of child care rooms, toddler age and preschool age, each with a fixed number of children, a fixed number of teachers (both preassigned to each room for a school year period), and possibly a small number of teaching assistants (preassigned to each room for a 3 - 4 month period). The simulated environment of an abstract room is a 2-dimensional lattice of patches with subgroups of patches signifying specific parts of the room, such as toy boxes, lunch areas, play/activity areas and washrooms.

The agents in our model are further classified into two types: children and adults. Children are modeled as moving agents in the environment, occupying a patch at any given moment. They are given a chance to move, they have different activity levels and they follow directions of teachers/staff for lunch, room activities and visiting the washroom. In turn, teachers and/or staff move by following groups of two or more children, or directing children to activities. A simulation run in the model represents 15 minutes of real time, with the time resolution set to 1 second of physical time per simulated time step. The paper is structured as follows: in Section 2, we present the basic structure of the model and describe its parameters; in Section 3 we outline the statistical analysis done on the observed data and in Section 4 the calibration of the model against it; in Section 5 we offer a concluding discussion of our results and a few ideas for future work.

2. Model Description

Our overarching goal is to simulate the agent-to-agent contact patterns in a child care setting correctly, i.e., in a fashion where simulated meaningful contacts for transmission are similar to meaningful contacts observed/collected in the CCLC center. We operate under the paradigm that each individual room in the centre can be modelled separately with agents moving between rooms only according to a fixed schedule. With this approach, multi-room child care centers can be simulated by combining multiple single room simulations. In this paper we focus on modeling a single room and in particular we estimate values for each of the model’s free parameters described below.

2.1. Basic Structure of the Model

Each room is modeled as a rectangular graph with agents occupying one of the graph’s nodes. The model operates by allowing the agents to move between nodes according to rules that will be described below (see Definition 1). The resolution is one timestep per in-simulation second and a simulation covers classroom movement during a 15-minute period. Each agent has a so-called activity level which determines the agent’s probability of moving between nodes. For example, an activity level of 0.65 indicates that the agent will move during 65% of timesteps out of the total simulated time.

The model includes two varieties of agents: teachers and children. The distance between nodes is assumed to be equal to the mean stride length for the age range of children to be modeled. In this way, children are restricted to either staying on their current node or moving to a neighbouring node during a single timestep. Teachers are allowed to move up to two nodes from their starting node to account for their greater stride length.
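The movement rule described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function name `step_agent`, the room size, and the use of an 8-neighbourhood (diagonal moves allowed) are assumptions made for the example. The `reach` argument stands in for stride length (1 node for children, 2 for teachers).

```python
import random

def step_agent(pos, activity_level, width, height, rng, reach=1):
    """Advance one agent by one timestep on a width x height lattice.

    reach=1 mimics a child (neighbouring node only); reach=2 mimics a
    teacher's longer stride. Diagonal moves are an assumption of this sketch.
    """
    # With probability (1 - activity_level) the agent stays on its node.
    if rng.random() >= activity_level:
        return pos
    x, y = pos
    # Candidate destinations: nodes within `reach` steps that lie in the room.
    steps = range(-reach, reach + 1)
    candidates = [
        (x + dx, y + dy)
        for dx in steps
        for dy in steps
        if (dx, dy) != (0, 0)
        and 0 <= x + dx < width
        and 0 <= y + dy < height
    ]
    return rng.choice(candidates)

# One child with activity level 0.65 over 15 minutes (900 one-second steps).
rng = random.Random(0)
pos = (5, 5)
moves = 0
for _ in range(900):
    new_pos = step_agent(pos, 0.65, 20, 20, rng)
    moves += new_pos != pos
    pos = new_pos
print(moves / 900)  # fraction of timesteps with a move, close to 0.65
```

In this sketch an active timestep always produces a move; purposeful movement (following or gathering) would replace the random choice of destination.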

Each agent type has a variety of activities they can undertake and that can alter their movement patterns.

1) Children are able to choose between moving (or staying in place) at random and/or following other agents.

2) Teachers have the ability to select a group of children and gather them together; when that happens the selected children will move toward the teacher agent and will remain in their close proximity until dismissed.

3) Teachers are able to choose between following a particular child or gathering a group of children around themselves.

2.2. Agents Movement, Activities and Observed Contact Computation

Let us denote by active timesteps the specific timesteps during which an agent will move.

Definition 1. If two agents occupy adjacent nodes within one timestep, then we say that those two agents have had a neighbouring contact.

The neighbouring contacts between a pair of agents in a 5-minute simulation period are counted, and this count is called the pair contact time.

Definition 2. Two agents are said to have an observed contact if pair contact time > duration of observed contact doc, i.e., if they spend enough time as neighbouring contacts, in a given 5 minute period, for an observer to notice and record the interaction.

The parameter duration of observed contact, denoted by doc, comes from the fact that the contact patterns collected through visual observations depended on the human observer to “register” an interaction between two agents. This threshold parameter regulates how many neighbouring contacts make one observed contact and we know, from data collection, that it is between 1 and 2 minutes of physical time (see also Section 3). To compute the number of observed contacts between two agents over the course of 15 minutes, we add the results from all three 5-minute periods. We investigate the effect of different observed contact thresholds in Section 4.
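Definitions 1 and 2 can be illustrated with a short sketch. The function names, the representation of an agent's path as a per-second list of lattice coordinates, and the adjacency rule (two nodes at most one step apart in each direction, which also counts a shared node) are assumptions made for this example only.

```python
def neighbouring(p, q):
    """Assumed adjacency: nodes at most one lattice step apart."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1])) <= 1

def observed_contacts(track_a, track_b, doc, window=300):
    """Count observed contacts between two agents over a simulation.

    track_a, track_b: per-second (x, y) positions of equal length.
    doc: duration-of-observed-contact threshold, in timesteps (seconds).
    window: one observation period (5 minutes = 300 timesteps).
    """
    total = 0
    for start in range(0, len(track_a), window):
        # Pair contact time: neighbouring contacts inside this period.
        pair_contact_time = sum(
            neighbouring(p, q)
            for p, q in zip(track_a[start:start + window],
                            track_b[start:start + window])
        )
        # Definition 2: one observed contact if the threshold is exceeded.
        if pair_contact_time > doc:
            total += 1
    return total

# Two agents side by side for the first 100 s of each 5-minute period.
a = [(0, 0)] * 900
b = ([(1, 0)] * 100 + [(10, 10)] * 200) * 3
print(observed_contacts(a, b, doc=90))   # 3
print(observed_contacts(a, b, doc=120))  # 0
```

Under this reading, a pair of agents can accumulate between 0 and 3 observed contacts in a 15-minute simulation.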

We summarize the simulated movements in Table 1.

Table 1. Movement patterns for different agent types.

Agents choose what activity they will engage in according to their sociability level, which dictates their preference for one activity or the other. For example, a sociability level of 0.33 would indicate that, given the option, a child will choose to follow another agent 33% of the time. Likewise, a sociability level of 0.75 would indicate that a teacher will choose to follow a single child rather than gathering a group of children 3 out of every 4 times.
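As a minimal sketch of how a sociability level could drive the choice of activity, consider the following. The function name and the activity labels (`"follow_agent"`, `"random_walk"`, `"follow_child"`, `"gather_group"`) are hypothetical and chosen only for illustration.

```python
import random

def choose_activity(agent_type, sociability, rng):
    """Pick an agent's next activity from its sociability level.

    Activity labels are illustrative placeholders, not the model's own names.
    """
    social = rng.random() < sociability
    if agent_type == "child":
        # A child follows another agent with probability `sociability`.
        return "follow_agent" if social else "random_walk"
    # A teacher follows a single child with probability `sociability`,
    # and otherwise gathers a group of children.
    return "follow_child" if social else "gather_group"

rng = random.Random(0)
picks = [choose_activity("child", 0.33, rng) for _ in range(10_000)]
print(picks.count("follow_agent") / len(picks))  # close to 0.33
```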

To regulate how long each agent spends performing their chosen activity we model their “interest” in that activity as a decaying quantity with a fixed half-life. This allows us to use the laws of exponential decay to compute the probability of the agent changing activities on any given timestep.

Definition 3. On any given active timestep, the probability that an agent will stop their current activity is given by

$$1 - e^{-n/n_m} \qquad (1)$$

where $n$ is the total number of active timesteps the agent has spent performing their current activity and $n_m$ is the mean length of time an agent spends on that activity. We set $n_m = 15$ min throughout all simulations.
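The stopping rule of Definition 3 can be sketched numerically. The sketch assumes that Equation (1) reads 1 - exp(-n / n_m) and that the 15-minute mean interest time corresponds to 900 one-second timesteps; both the function name and this unit conversion are assumptions of the example.

```python
import math

def stop_probability(n, n_m=900):
    """Probability of abandoning the current activity on an active timestep.

    n: active timesteps already spent on the activity.
    n_m: mean interest time in timesteps (assumed 15 min = 900 s).
    """
    return 1.0 - math.exp(-n / n_m)

for n in (0, 60, 300, 900):
    print(n, round(stop_probability(n), 3))
```

The probability starts at zero for a freshly chosen activity and rises toward one as the agent's interest decays, reaching about 0.632 when n equals n_m.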

2.3. Model Parameters

The structure of our model includes a number of parameters for which we must find appropriate values. Two of these parameters: activity level, Al and sociability level, Sl, are chosen for each agent type from a known distribution, as in Table 2. It is our main goal in this paper to determine more specific distributions for these parameters for each agent type, in such a way that the number of simulated observed contacts agrees with the gathered data (see Section 4).

The remaining parameters are related to the temporal and spatial resolution of the model. A summary of the values as they appear in the current version of the model is given in Table 2.

The room dimensions were obtained from the CCLC building blueprints. Information about agent stride length and frequency was found in [35]. The model timescale was also informed by that paper, which suggests toddlers and preschoolers will spend a negligible amount of time at stride rates greater than 60 per minute. The mean interest time was chosen to ensure a meaningful variety of agent behaviour over a single 15-minute observation period.

Table 2. Additional model parameters.

3. Empirical Data

3.1. Data Collection

The data was collected in three intervals of 2 weeks in March-May of 2019 at the CCLC in Guelph. The observations were conducted according to the University’s Research Ethics Board protocols. These protocols limited the quantity of data that could be collected, as only students with parental consent could be included in the observations. The observer recorded meaningful contacts between a subset of children and a subset of teachers in a given room. We collected data from 2 toddler rooms (18 months to 3 years old) and 2 preschooler rooms (3 years to 4.5 years old) and recorded it in independent 15-minute tables. For every table, the occurrence of meaningful agent-agent and agent-surface interactions was recorded every 5 minutes based on the number of children and teachers observed. The observations were taken during three different periods of the day: 8:30 am-10:30 am, 10:30 am-12:30 pm, and 2:30 pm-4:30 pm. The observations from different age groups cannot be amalgamated, as the age groups have differing numbers of students and teachers in each of their rooms, and activities/daily schedules in each room differ. Moreover, the student/teacher ratios are different: by government mandate, each teacher can supervise a maximum of 5 toddlers or a maximum of 8 preschoolers per room. Last but not least, the rooms we collected the data from were structured as follows: 10 toddlers and 2 teachers (2 rooms), and 16 preschoolers and 2 teachers (2 rooms). Due to the teaching and training of ECE at Guelph, each room typically hosts, for parts of the day, a teaching assistant as a third staff member.

3.2. Data Analysis

For each age group we studied two aspects of the observed data and then re-organized it to inform our simulated environment. In what follows, we call a data configuration the number of children and teachers being observed in a 15-minute table. We note that although the observer tracked 4 or 5 children at a time in a toddler room, the maximum number of children in a toddler room is 10, and the maximum number of teachers is 3 (2 teachers and occasionally a teaching assistant). The same 2 rooms have been observed for toddlers, and the same 2 rooms for preschoolers. The preschool rooms had a maximum of 16 children and 2 or 3 teachers (the third being an occasional teaching assistant).

Our first consideration was whether the times of the day during which the 15-minute observations were collected had any statistical relevance, or if we could coalesce the data. To test their statistical relevance, we looked at the following configurations of toddlers and teachers: 4 children and 1 teacher (4C1T), 4 children and 2 teachers (4C2T), 4 children and 3 teachers (4C3T) and 5 children and 2 teachers (5C2T). For preschoolers groups we considered the following configurations: 4 children and 2 teachers (4C2T), 5 children and 2 teachers (5C2T), 5 children and 3 teachers (5C3T), 6 children and 3 teachers (6C3T) (for details of this analysis, please see the Appendix).

Our conclusions drawn from the statistical analysis tests employed were that times of the day were irrelevant and we could coalesce the data for the following age groups and configurations.

Following the analyses described above, we have further concentrated on using the data from age groups and configurations in Table 3. We have further assumed that in each configuration, each child of the 4C or 5C observed constitutes one child agent interacting with the other children and teacher agents. The sizes of the data sets that we obtained therefore are as in Table 4.

Table 3. Scenarios run in the classroom model.

Table 4. Number of observations used from the collected data for each simulated scenario in Table 3.

3.3. Further Insights into the Curated Observed Data

To gain further insight into our observed data, we tried to find known statistical patterns within the observed data. Our analysis began by using the Wilcoxon test to determine the similarity between results for toddlers and preschoolers. Following that we performed a battery of statistical tests to establish if the observed data corresponded to any known distribution.

We began by visually comparing the frequency distributions of agent-agent contacts between toddlers versus preschoolers. For example, Figure 1 shows the frequency of meaningful contacts in both age groups in the 4C2T case. The notable result from the histograms is that each age group has a distinctly different distribution.

Figure 1. Distribution of observed contacts in 4C2T age groups. (a) Toddler; (b) Preschool.

Further investigation into the toddler group revealed that child-child contacts and child-teacher contacts have distinctly different distributions as well. Figure 2 shows the difference between the distribution of child-child contacts and child-teacher contacts in the 4C2T toddler group. Different agent-agent distributions were also obtained when focusing on preschool groups. The Wilcoxon test confirmed this result for both age groups.

Figure 2. Distribution of observed contact in the 4C2T toddler group. (a) Child-Child Contact; (b) Child-Teacher Contact.

Having established that there are differences (as expected) between types of contacts in each age group and configuration, we asked whether the type of contacts we encountered, such as child-child or child-teacher etc., may have distributions of a well-known type. Visually, we can reject some of the most common distributions by looking at the histogram charts of both agent groups (Figure 1, Figure 2). The Shapiro-Wilk normality test confirmed the non-normality of the data by returning a very small p-value (<0.05).

To determine whether a continuous distribution fits our data, we used the descdist and fitDist functions in the fitdistrplus and GAMLSS packages in R, respectively [36] [37]. These functions did not find a specific distribution that conclusively matched all the groups in the observed data. As a consequence, in order to validate our simulated model we next use non-parametric statistical tests. A non-parametric test allows us to make comparisons without assumptions about the data distribution, such as independence of observations, normality of data, or homogeneity of variance.

There are numerous “goodness of fit” tests to analyze discrete data sets, such as the χ² test, the discrete Kolmogorov-Smirnov test, the multidimensional test, and the likelihood ratio test. In our case, the Mann-Whitney-Wilcoxon (MWW) test had several significant advantages over other goodness-of-fit tests, such as being sufficiently distribution-free, being suitable for use with small sample sizes, and having the ability to accommodate “ties” in the data. For large sample sizes the theoretical distribution of the MWW test statistic is known to be well approximated by a normal distribution with mean and standard deviation determined by the sample sizes [38]. This allows us to draw conclusions about the degree to which our simulated data matches the observed data in an easy and straightforward way.
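For illustration, the normal approximation mentioned above can be sketched with the textbook formulas for the Mann-Whitney U statistic: mean n1·n2/2 and standard deviation sqrt(n1·n2·(n1 + n2 + 1)/12). The function names are hypothetical, tied values receive average ranks, and no tie correction is applied to the variance, so this is a simplified sketch and not necessarily the exact procedure used for the results reported here.

```python
import math

def midranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def mww_z(sample1, sample2):
    """Return (U, z) for sample1 against sample2 (no tie correction)."""
    n1, n2 = len(sample1), len(sample2)
    ranks = midranks(list(sample1) + list(sample2))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u1, (u1 - mean_u) / sd_u

simulated = [0, 1, 1, 2, 3, 0, 2]  # made-up contact counts
observed = [1, 1, 2, 0, 3, 2, 1]   # made-up contact counts
print(mww_z(simulated, observed))
```

A z-score near zero indicates that neither sample tends to rank above the other.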

4. Model Calibration

The model described in Section 2 has a number of free parameters. We would like to determine value ranges these parameters could take so as to produce contact patterns that replicate the observed ones. In order to compare the simulated data with the observed data it is necessary to have multiple simulated outputs for each set of input parameters. We outline the process here:

· We first selected the agents’ activity (Al) and sociability (Sl) levels from a uniform distribution over [0, 1] and then generated the same number of simulated neighbouring contact matrices as observed data points in the given case (see the n1 = n2 columns in Table 5 and Table 6).

· Repeating the above procedure allowed us to produce databases of simulated outputs for the case of 10 toddlers and 2 teachers (n = 10,587) and for 16 preschoolers and 2 teachers (n = 7747), where each “output” consists of multiple simulated neighbouring contact matrices and the agents’ behavioural parameters remain constant within a single “output”.

· We then randomly selected three subsets of 4 children and three subsets of 5 children out of the children simulated (10 if toddlers, 16 if preschoolers).

· For each agent, in each of the six subsets, it was then possible to count the total number of neighbouring contacts with other agents in the same subset, as well as with the two teachers.

· We then converted the neighbouring contacts into observed contacts using the duration of observed contacts ranging from 5 seconds to 295 seconds. Figure 3 shows histograms of the resulting test statistics for the case of observing 5 preschoolers. We have similar ones for all other cases.
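The subsetting and counting steps in the list above can be sketched as follows. The data layout (a symmetric matrix of neighbouring-contact counts indexed by agent), the function names, and the toy matrix are all assumptions made for the example.

```python
import random

def draw_subsets(children, rng, sizes=(4, 5), per_size=3):
    """Three random subsets of 4 children and three of 5, as described above."""
    return [rng.sample(children, k) for k in sizes for _ in range(per_size)]

def subset_contacts(contact, teachers, subset):
    """Per-child neighbouring contacts within a subset and with the teachers.

    contact[i][j]: neighbouring contacts between agents i and j (symmetric).
    Returns {child: (contacts with other subset children, contacts with teachers)}.
    """
    totals = {}
    for child in subset:
        with_children = sum(contact[child][o] for o in subset if o != child)
        with_teachers = sum(contact[child][t] for t in teachers)
        totals[child] = (with_children, with_teachers)
    return totals

# Toy example: 10 toddlers (agents 0-9) and 2 teachers (agents 10-11),
# with one neighbouring contact between every pair of distinct agents.
children, teachers = list(range(10)), [10, 11]
contact = [[int(i != j) for j in range(12)] for i in range(12)]
rng = random.Random(0)
for subset in draw_subsets(children, rng):
    print(sorted(subset), subset_contacts(contact, teachers, subset))
```

The resulting neighbouring-contact counts would then be thresholded by doc to obtain simulated observed contacts.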

Note that the scores for child-child contacts increase as doc increases, whereas the scores for contacts involving a teacher display the opposite trend. As illustrated in Figure 4, increasing doc above 270 s has a negligible effect on the score of child-child contacts. Likewise, decreasing doc below 30 s did not improve the scores for contacts involving a teacher. For this reason we decided to set doc = 270 s if both agents are children and doc = 30 s if one of the agents is a teacher.

4.1. Toddlers

When investigating the effect of Al and Sl on contacts in toddler classes we used the database of simulated toddlers described above, as well as a second database that was generated in the same way except that it includes three teachers instead of two (n = 5705).

Counts of simulated observed contacts were produced using the doc found above and the result was compared to the observed data using the Mann-Whitney-Wilcoxon test (in Table 5 we give an overview of the results in each case). Based on this table, as well as Table 6, we decided to classify samples as good if their score fell to the right of the expected mean by at least 0.03 of the expected standard deviation (i.e., if at least 51.2% of the area under the expected normal distribution fell to the left of the sample’s score).
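The correspondence between the 0.03 standard deviation cut-off and the 51.2% figure can be checked with the standard normal distribution function. The helper name `is_good` is hypothetical and simply restates the classification rule above in terms of a standardized score.

```python
from statistics import NormalDist

def is_good(z_score, threshold=0.03):
    """A sample is 'good' if its standardized score is at least `threshold`."""
    return z_score >= threshold

# Area under the standard normal curve to the left of z = 0.03.
print(round(NormalDist().cdf(0.03), 3))  # 0.512
print(is_good(0.5), is_good(-0.2))       # True False
```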

Figure 5 shows histograms of Al and Sl values that produced good samples for all CC, CT, and TC contacts in the case of 4C3T. Figure 6 shows the same but in the case 5C2T. Notably, in the case 4C2T, there were no samples that were good for all three types of contacts.

4.2. Preschoolers

In the case of preschoolers we used the database of preschoolers described above as well as a second database of preschoolers (n = 6001) that included three teachers (as opposed to two). Table 6 gives an overview of the resulting test statistics and Figures 7-9 display histograms of Al and Sl that produced good samples (with respect to all three contact types) in the case of 4C2T, 5C2T and 5C3T respectively.

Figure 3. Histograms of the test statistic for preschoolers in the case of five children and two teachers being observed with doc ranging from 30 s to 270 s. Notice that child-child contacts score best when doc is small whereas the other contact types do best with long doc. (a) Child-Child Contact; (b) Child-Teacher Contact; (c) Teacher-Child Contact.

4.3. Discussion

Several notable observations can be made regarding the data presented above. When investigating the possible values of the duration of observed contacts, doc, there is a clear difference between contacts involving only children and contacts involving a teacher. Simulated contacts between children compared most favourably with the observed data when the duration of observed contacts was 270 seconds. Simulated contacts between children and teachers, on the other hand, matched the observed data best when the duration of observed contact was 30 seconds. This pattern persisted regardless of the number or type of agent being observed. We suspect (from the observational study design) that the much shorter duration required for contacts involving children and teachers to be observed and recorded by the data collector can be explained by the fact that teachers, being adults, are more deliberate in their actions than children. When a teacher approaches a child they likely have some specific intent which the observer can easily classify as either involving a meaningful contact or not. Conversely, contacts between children are more likely to be an accidental consequence of two children playing near each other, making it much harder for the observer to keep track of and notice them unless they are longer in duration.

Figure 4. Histograms of the test statistic for preschoolers in the case of five children and two teachers being observed with doc ranging from 5 s to 30 s for child-child contacts and from 270 s to 295 s for contacts involving a teacher. Notice that decreasing doc below 30 s does not increase the scores for child-child contacts. Similarly, increasing doc above 270 s resulted in lower scores for contact types involving teachers. (a) Child-Child Contact; (b) Child-Teacher Contact; (c) Teacher-Child Contact.

Table 5. Overview of the test statistic results for toddlers.

Figure 5. Good parameters in the case of 4 toddlers and 3 teachers.

Figure 6. Good parameters in the case of 5 toddlers and 2 teachers.

Table 6. Overview of the test statistic results for preschoolers.

Figure 7. Good parameters in the case of 4C2T.

Figure 8. Good parameters in the case of 5C2T.

Figure 9. Good parameters in the case of 5C3T.

There are also some interesting trends in the histogram of behavioral parameters Al and Sl that produced the largest test statistics.

In the case of 4 toddlers and 2 teachers, there is a clear tendency, among both agent types, toward small values for the activity level. When a toddler’s Al is below 0.2, their Sl did not have an apparent effect on contact numbers. In contrast, when a teacher has an activity level Al less than 0.02, smaller Sl values seem to produce better results. Interestingly, if a teacher’s activity level Al is above 0.2, then the better matches to the contact numbers are obtained for Sl above 0.6. If a child has an Al above 0.2, then the trend reverses, and the best samples appear when the child has a sociability level Sl below 0.2. The case of 5 toddlers and 2 teachers is less clear, with both agent types performing best when the activity and sociability levels fall near the midpoint of their range.

Simulations involving preschoolers also showed a clear, albeit different, pattern. In both 4C2T and 5C2T, the preschoolers show a trend toward activity and sociability levels Al and Sl above 0.4. The teachers in these cases also tend to have high sociability levels, although they do best when their activity level Al is between 0.4 and 0.6. The case of 5C3T is less clear, but suggests that the best values for the parameters Sl and Al are between 0.4 and 0.6. A summary of these results can be found in Table 7 and Table 8.

Table 7. Summary of optimal parameter ranges for toddlers.

Table 8. Summary of optimal parameter ranges for preschoolers.

5. Conclusions and Future Work

Given our work presented here, we conclude that toddlers and preschoolers must be modeled using different distributions of the behavioral parameters. Preschoolers are modeled best as having high activity and sociability levels whereas toddlers are best modeled as having low activity levels and only a weak dependence on sociability levels. Teachers appear to behave differently depending on the age of their pupils but tend more toward the mid range of the behavioral parameters.

In the next stages of this project we aim to introduce pathogens into the simulation as well as explore linking multiple classrooms to simulate a single child care facility. We also hope to gain access to the records at the CCLC so we can compare the simulated infections with historical data regarding child and teacher absence due to sickness. This would allow us to further validate the model described in this paper.

Disclaimer

Funding for this work has been provided by the Natural Sciences and Engineering Research Council of Canada (NSERC) through a Collaborative Research & Development Grant No. 401569 in partnership with Sanofi Pasteur Canada (in a 2:1 matching funding model, respectively). A. Hung, J. Lee, M. Tomovici are employees of Sanofi Pasteur Canada and E. W. Thommes is an employee of Sanofi Pasteur. They are collaborators on the research presented here and their salaried compensation for time spent on the current research is part of the funding.

Data collection was funded by a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant No. 400684 M. G. Cojocaru (PI). The Child Care and Learning Center (CCLC) at the University of Guelph has collaborated on data collection.

Appendix: Curating Collected Data

Counts for the number of child-child, child-teacher and teacher-teacher contacts were taken for each observation, and separated based on the time slots during which they were observed. Thus, for each configuration, we had three sets of count data. For every data set, we began testing for homogeneity of variance by applying Levene’s test. The two most notable causes of unequal variances between groups were differences in the frequency of observed 0’s and outliers. This was a primary consideration when choosing a test statistic for the permutation tests. One difficulty to address here is that there are relatively few observations per group for many of the configurations; for this reason, the results of Levene’s test served more as a guideline than as a hard rule for determining whether homogeneity of variance was a reasonable assumption. We found two resources helpful in determining how accurate Levene’s test was: firstly, [39], which provides an a priori power estimate based on the expected effect size and performs a calculation for the observed effect size; secondly, [40], which provides a figure for the a priori power of Levene’s test versus the total sample size. If the p-value resulting from the test was significant for α = 0.05, we concluded that homogeneity of variance was not a reasonable assumption. If the p-value was insignificant for α = 0.05 we proceeded to compare boxplots, density plots and ECDF plots before determining if homogeneity of variance was a reasonable assumption.

When the assumption of homogeneity of variance appeared to be met, a permutation test based on the standard F-statistic from a one-way ANOVA model proved to be one of the best approaches to testing whether or not the distributions of the populations from which we observe each group are the same. According to [41], the power of a permutation test based on the F-statistic is higher than that of a one-way ANOVA model for sample sizes of 10 or greater. If the p-value of this test was significant for α = 0.05, we concluded that the distributions were statistically different and that we cannot coalesce the data from all of these groups. The group(s) that was(were) causing issues was(were) then excluded and we proceeded with more 2-group analyses if possible. If the p-value was insignificant for α = 0.05 and the results of Levene’s test seemed reasonable, we conducted a Kruskal-Wallis test and a permutation test based on the H-statistic for more supporting evidence before making a conclusion. We also ran a permutation test based on Satterthwaite’s corrected F-statistic; this was particularly important in the cases where Levene’s test lacked power or accuracy (Figure A1).
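A permutation test based on the standard one-way ANOVA F-statistic can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' code: the function names, the number of permutations, the add-one p-value estimate, and the toy contact counts are all chosen for the example.

```python
import random

def f_statistic(groups):
    """Standard one-way ANOVA F-statistic for a list of groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        # Every group is constant: avoid dividing by zero.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def permutation_p_value(groups, n_perm=2000, seed=0):
    """Share of label shufflings whose F is at least the observed F."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    # Add-one estimate so that the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Made-up contact counts for three time slots of one configuration.
morning = [0, 1, 1, 2, 0, 3, 1, 2]
midday = [1, 0, 2, 2, 1, 1, 0, 3]
afternoon = [2, 1, 0, 1, 3, 0, 2, 1]
print(permutation_p_value([morning, midday, afternoon]))
```

A p-value above 0.05 in this sketch would be read, as in the procedure above, as no evidence against coalescing the three time slots.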

We used Satterthwaite’s observed F-statistic alongside the standard F-statistic as a metric in a permutation test for a difference in the means of the populations from which each group was observed. Given the power Levene’s test has at our sample sizes, the standard F-statistic alone cannot tell us whether the variability in the observed values is due to differences in the group means or in their variances. If the p-value was significant at α = 0.05, we concluded that the distributions were statistically different and that the data from these groups could not be coalesced. We excluded the group or groups responsible and proceeded with further 2-group analyses where possible. If the p-value was not significant at α = 0.05, we drew no conclusions from this test (Figure A2).

Figure A1. In the above test results, performed on child-to-child contact data from four children and three teachers, the p-value is significant at a significance level of α = 0.05. For that reason we reject the null hypothesis of homogeneity of variance and refrain from coalescing these data into a single set.

Figure A2. Above are the results from a permutation test using the H-statistic from a Kruskal-Wallis test as a metric. The test was performed on child-to-teacher contact data from the 4C2T set. The p-value is not significant at a significance level of α = 0.05; therefore it appears reasonable to coalesce the data.
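A variance-weighted F-statistic can be dropped into the same permutation scheme. The sketch below is in Python and assumes that "Satterthwaite’s corrected F-statistic" refers to the heteroscedastic one-way statistic commonly called Welch's F; the text does not give the formula, so this identification, the function names and the data are assumptions. Permutations in which a group has zero variance leave the statistic undefined and are skipped here, which is one possible convention among several.

```python
import numpy as np


def welch_f(groups):
    """Variance-weighted (Welch-type) F-statistic; NaN if any group variance is 0."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    means = np.array([np.mean(g) for g in groups])
    variances = np.array([np.var(g, ddof=1) for g in groups])
    if np.any(variances == 0):
        return np.nan
    w = n / variances                      # precision weights
    w_sum = w.sum()
    weighted_mean = np.sum(w * means) / w_sum
    numerator = np.sum(w * (means - weighted_mean) ** 2) / (k - 1)
    lam = np.sum((1 - w / w_sum) ** 2 / (n - 1))
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    return numerator / denominator


def permutation_welch_test(groups, n_perm=5000, seed=0):
    """Permutation p-value using welch_f as the test metric."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([np.asarray(g, dtype=float) for g in groups])
    cuts = np.cumsum([len(g) for g in groups])[:-1]
    observed = welch_f(groups)
    perm = np.array([welch_f(np.split(rng.permutation(pooled), cuts))
                     for _ in range(n_perm)])
    perm = perm[~np.isnan(perm)]           # drop undefined permutations
    return observed, (np.sum(perm >= observed) + 1) / (len(perm) + 1)


# Hypothetical counts: the third slot is more variable (not the study's data).
slots = [
    np.array([0, 2, 1, 0, 3, 1, 0, 2]),
    np.array([1, 0, 2, 1, 0, 1, 2, 1]),
    np.array([0, 9, 0, 14, 1, 0, 11, 0]),
]
f_w, p_w = permutation_welch_test(slots)
print(f"Welch-type F = {f_w:.3f}, permutation p = {p_w:.4f}")
```

For two groups this statistic reduces to the square of Welch's t, which gives a quick sanity check on the implementation.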

A Kruskal-Wallis (K-W) test proved useful when analyzing most of the data sets, even with many ties between groups. Because the K-W test assumes similarly shaped distributions across groups, we also conducted a permutation test using the observed H-statistic from a K-W test as the metric. The conclusions from both tests were the same for every data set. For agent configurations where certain groups either had no observations or had been excluded as a result of prior analyses or tests, we conducted permutation tests using the observed t-statistics from a pooled-variance t-test and from Welch’s t-test as metrics.
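The rank-based and two-group variants can be sketched with SciPy's built-in `permutation_test` (available in recent SciPy releases). This is a Python illustration rather than the authors' R code; the wrapper names, the number of resamples and the contact counts are hypothetical, and no random seed is fixed, so the permutation p-values vary slightly from run to run.

```python
import numpy as np
from scipy import stats


def h_statistic(*samples):
    """Kruskal-Wallis H for any number of groups."""
    return stats.kruskal(*samples).statistic


def pooled_t(a, b):
    """Two-sample t-statistic with a pooled variance estimate."""
    return stats.ttest_ind(a, b, equal_var=True).statistic


def welch_t(a, b):
    """Welch's t-statistic, which does not assume equal variances."""
    return stats.ttest_ind(a, b, equal_var=False).statistic


# Hypothetical contact counts with many ties (not the study's data).
slot_a = np.array([0, 2, 1, 0, 3, 1, 0, 2])
slot_b = np.array([1, 0, 2, 1, 0, 1, 2, 1])
slot_c = np.array([2, 4, 1, 5, 3, 2, 6, 4])

# Asymptotic K-W test next to its permutation counterpart.
kw = stats.kruskal(slot_a, slot_b, slot_c)
perm_h = stats.permutation_test(
    (slot_a, slot_b, slot_c), h_statistic,
    permutation_type="independent", alternative="greater", n_resamples=4999,
)
print(f"K-W: H = {kw.statistic:.3f}, p = {kw.pvalue:.4f}; "
      f"permutation p = {perm_h.pvalue:.4f}")

# Two-group comparison, e.g. after one slot has been excluded.
two_group = {}
for name, statistic in [("pooled t", pooled_t), ("Welch t", welch_t)]:
    res = stats.permutation_test(
        (slot_a, slot_b), statistic,
        permutation_type="independent", alternative="two-sided", n_resamples=4999,
    )
    two_group[name] = res
    print(f"{name}: t = {res.statistic:.3f}, permutation p = {res.pvalue:.4f}")
```

Comparing the asymptotic and permutation p-values for H mirrors the check described above: if both lead to the same decision, the shape assumption of the K-W test is less of a concern for that data set.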

NOTES

1. Our model is implemented in R v4.0.0 with RStudio v1.2.5042, and simulations were run on an AMD Ryzen 5 4600H at 3.00 GHz with 8 GB of RAM.

2. Note that the children are still allowed to move when gathered around a teacher; they are simply forced to return to a neighboring node if they move too far away.

3. The descdist function provides a plot of kurtosis against squared skewness by computing descriptive parameters of an empirical distribution for non-censored data. With this function, for instance, the fit of three common distributions could be considered: the Weibull, gamma and Poisson (as a discrete distribution) distributions. The fitDist function, by contrast, fits all relevant parametric gamlss.family distributions to our observed data sets. See our implementations in the Appendix.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Capasso, V. and Serio, G. (1978) A Generalization of the Kermack-McKendrick Deterministic Epidemic Model. Mathematical Biosciences, 42, 43-61.
https://doi.org/10.1016/0025-5564(78)90006-8
[2] Diekmann, O. and Heesterbeek, J.A.P. (2000) Mathematical Epidemiology of Infectious Diseases: Model Building, Analysis and Interpretation. Vol. 5, John Wiley & Sons, Hoboken.
[3] Hethcote, H.W. (2000) The Mathematics of Infectious Diseases. SIAM Review, 42, 599-653.
https://doi.org/10.1137/S0036144500371907
[4] Heesterbeek, J.A.P. (2002) A Brief History of R0 and a Recipe for Its Calculation. Acta Biotheoretica, 50, 189-204.
https://doi.org/10.1023/A:1016599411804
[5] Rahmandad, H. and Sterman, J. (2008) Heterogeneity and Network Structure in the Dynamics of Diffusion: Comparing Agent-Based and Differential Equation Models. Management Science, 54, 998-1014.
https://doi.org/10.1287/mnsc.1070.0787
[6] Thommes, E., Chit, A., Meier, G. and Bauch, C. (2014) Examining Ontario’s Universal Influenza Immunization Program with a Multi-Strain Dynamic Model. Vaccine, 32, 5098-5117.
https://doi.org/10.1016/j.vaccine.2014.06.005
[7] Thommes, E., Cojocaru, M. and Athar, S. (2016) Absenteeism Impact on Local Economy during a Pandemic via Hybrid SIR Dynamics. International Conference on Dynamics of Disasters, Kalamata, 29 June-2 July 2015, 309-328.
https://doi.org/10.1007/978-3-319-43709-5_15
[8] Auchincloss, A.H. and Diez Roux, A.V. (2008) A New Tool for Epidemiology: The Usefulness of Dynamic-Agent Models in Understanding Place Effects on Health. American Journal of Epidemiology, 168, 1-8.
https://doi.org/10.1093/aje/kwn118
[9] Longini Jr., I.M. (1988) A Mathematical Model for Predicting the Geographic Spread of New Infectious Agents. Mathematical Biosciences, 90, 367-383.
https://doi.org/10.1016/0025-5564(88)90075-2
[10] Tully, S., Cojocaru, M. and Bauch, C.T. (2013) Coevolution of Risk Perception, Sexual Behaviour, and HIV Transmission in an Agent-Based Model. Journal of Theoretical Biology, 337, 125-132.
https://doi.org/10.1016/j.jtbi.2013.08.014
[11] Tully, S., Cojocaru, M. and Bauch, C.T. (2015) Sexual Behavior, Risk Perception and HIV Transmission Can Respond to HIV Antiviral Drugs and Vaccines through Multiple Pathways. Scientific Reports, 5, Article No. 15411.
https://doi.org/10.1038/srep15411
[12] Cojocaru, M.-G., Athar, S. and Thommes, E. (2018) Adoption Costs of New Vaccines: A Stackelberg Dynamic Game with Risk-Perception Transition States. Infectious Disease Modelling, 3, 256-265.
https://doi.org/10.1016/j.idm.2018.09.002
[13] Cojocaru, M.-G., Bauch, C.T. and Johnston, M.D. (2007) Dynamics of Vaccination Strategies via Projected Dynamical Systems. Bulletin of Mathematical Biology, 69, 1453-1476.
https://doi.org/10.1007/s11538-006-9173-x
[14] Alexandrino, A.S., Santos, R., Melo, C. and Bastos, J.M. (2016) Risk Factors for Respiratory Infections among Children Attending Day Care Centres. Family Practice, 33, 161-166.
https://doi.org/10.1093/fampra/cmw002
[15] Boone, S.A. and Gerba, C.P. (2005) The Occurrence of Influenza A Virus on Household and Day Care Center Fomites. Journal of Infection, 51, 103-109.
https://doi.org/10.1016/j.jinf.2004.09.011
[16] Lucas, P.J., Cabral, C., Hay, A.D. and Horwood, J. (2015) A Systematic Review of Parent and Clinician Views and Perceptions That Influence Prescribing Decisions in Relation to Acute Childhood Infections in Primary Care. Scandinavian Journal of Primary Health Care, 33, 11-20.
https://doi.org/10.3109/02813432.2015.1001942
[17] Hurwitz, E.S., Haber, M., Chang, A., Shope, T., Teo, S., Ginsberg, M., Waecker, N. and Cox, N.J. (2000) Effectiveness of Influenza Vaccination of Day Care Children in Reducing Influenza-Related Morbidity among Household Contacts. JAMA, 284, 1677-1682.
https://doi.org/10.1001/jama.284.13.1677
[18] Li, Y., Fraser, A., Chen, X., Cates, S., Wohlgenant, K. and Jaykus, L.-A. (2014) Microbiological Analysis of Environmental Samples Collected from Child Care Facilities in North and South Carolina. American Journal of Infection Control, 42, 1049-1055.
https://doi.org/10.1016/j.ajic.2014.06.030
[19] Park, G.W., Lee, D., Treffiletti, A., Hrsak, M., Shugart, J. and Vinjé, J. (2015) Evaluation of a New Environmental Sampling Protocol for Detection of Human Norovirus on Inanimate Surfaces. Applied and Environmental Microbiology, 81, 5987-5992.
https://doi.org/10.1128/AEM.01657-15
[20] Payne, D.C., Parashar, U.D. and Lopman, B.A. (2015) Developments in Understanding Acquired Immunity and Innate Susceptibility to Norovirus and Rotavirus Gastroenteritis in Children. Current Opinion in Pediatrics, 27, 105-109.
https://doi.org/10.1097/MOP.0000000000000166
[21] Yang, W., Elankumaran, S. and Marr, L.C. (2011) Concentrations and Size Distributions of Airborne Influenza A Viruses Measured Indoors at a Health Centre, a Day-Care Centre and on Aeroplanes. Journal of the Royal Society Interface, 8, 1176-1184.
https://doi.org/10.1098/rsif.2010.0686
[22] Cojocaru, M., Hogg, C., Kuusela, C. and Thommes, E. (2015) Adoption of New Products with Global and Local Social Influence in a 2D Characteristics Space. In: Cojocaru, M., Kotsireas, I., Makarov, R., Melnik, R. and Shodiev, H., Eds., Interdisciplinary Topics in Applied Mathematics, Modeling and Computational Science, Springer, Cham, 155-160.
https://doi.org/10.1007/978-3-319-12307-3_22
[23] Nguyen, S., Cojocaru, M. and Thommes, E. (2014) Personal Efficiency in Highway Driving: An Agent-Based Model of Driving Behaviour from a System Design Viewpoint. 2014 IEEE 27th Canadian Conference on Electrical and Computer Engineering (CCECE), Toronto, 4-7 May 2014, 1-6.
https://doi.org/10.1109/CCECE.2014.6901101
[24] Tesfatsion, L. (2002) Agent-Based Computational Economics: Growing Economies from the Bottom up. Artificial Life, 8, 55-82.
https://doi.org/10.1162/106454602753694765
[25] Thille, H., Cojocaru, M., Thommes, E.W., Nelson, D. and Greenhalgh, S. (2013) A Dynamic Pricing Game in a Model of New Product Adoption with Social Influence. 2013 International Conference on Social Computing, Alexandria, 8-14 September 2013, 762-767.
https://doi.org/10.1109/SocialCom.2013.114
[26] Thommes, E., Thille, H., Cojocaru, M.-G. and Nelson, D. (2009) A Time-Dependent ABM Model of an Ecoproduct Market with Social Interactions and Dynamic Pricing Schemes. 2009 IEEE Toronto International Conference Science and Technology for Humanity (TIC-STH), Toronto, 26-27 September 2009, 218-221.
https://doi.org/10.1109/TIC-STH.2009.5444501
[27] Andrews, M., Thommes, E. and Cojocaru, M.G. (2015) Replicator Dynamics of Axelrod’s Norms Games. In: Cojocaru, M., Kotsireas, I., Makarov, R., Melnik, R. and Shodiev, H., Eds., Interdisciplinary Topics in Applied Mathematics, Modeling and Computational Science, Springer, Cham, 29-34.
https://doi.org/10.1007/978-3-319-12307-3_5
[28] Axelrod, R. (1997) The Complexity of Cooperation: Agent-Based Models of Competition and Collaboration. Vol. 3, Princeton University Press, Princeton.
https://doi.org/10.1515/9781400822300
[29] Gilbert, N. and Conte, R. (1995) Artificial Societies. Routledge, London.
[30] Wild, E. and Cojocaru, M.G. (2016) Runaway Competition: A Correction and Extension of Results for a Model of Competitive Helping. PLoS ONE, 11, Article ID: e0164188.
https://doi.org/10.1371/journal.pone.0164188
[31] Ezhov, A.A. and Terentyeva, S.S. (2014) Agent-Based Model Heuristics in Studying Memory Mechanisms. Psychology, 5, 369-379.
https://doi.org/10.4236/psych.2014.55048
[32] Enserink, R., Mughini-Gras, L., Duizer, E., Kortbeek, T. and Van Pelt, W. (2015) Risk Factors for Gastroenteritis in Child Day Care. Epidemiology & Infection, 143, 2707-2720.
https://doi.org/10.1017/S0950268814003367
[33] Provincial Infectious Diseases Advisory Committee (2012) Best Practices for Environmental Cleaning for Prevention and Control of Infections in All Health Care Settings. Queen’s Printer for Ontario, Toronto.
[34] Guffey, K., Regier, M., Mancinelli, C. and Pergami, P. (2016) Gait Parameters Associated with Balance in Healthy 2- to 4-Year-Old Children. Gait & Posture, 43, 165-169.
https://doi.org/10.1016/j.gaitpost.2015.09.017
[35] Bjornson, K.F., Song, K., Zhou, C., Coleman, K., Myaing, M. and Robinson, S.L. (2011) Walking Stride Rate Patterns in Children and Youth. Pediatric Physical Therapy, 23, 354-363.
https://doi.org/10.1097/PEP.0b013e3182352201
[36] Delignette-Muller, M.L. and Dutang, C. (2015) Fitdistrplus: An R Package for Fitting Distributions. Journal of Statistical Software, 64, 1-34.
https://doi.org/10.18637/jss.v064.i04
[37] Rigby, R.A. and Stasinopoulos, D.M. (2005) Generalized Additive Models for Location, Scale and Shape. Journal of the Royal Statistical Society: Series C (Applied Statistics), 54, 507-554.
https://doi.org/10.1111/j.1467-9876.2005.00510.x
[38] Bellera, C.A., Julien, M. and Hanley, J.A. (2010) Normal Approximations to the Distributions of the Wilcoxon Statistics: Accurate to What N? Graphical Insights. Journal of Statistics Education, 18.
https://doi.org/10.1080/10691898.2010.11889486
[39] Statistics Kingdom (2017) Levene’s Test Calculator.
https://www.statskingdom.com/230varlevenes.html
[40] Delacre, M., Lakens, D. and Leys, C. (2017) Why Psychologists Should by Default Use Welch’s T-Test Instead of Student’s T-Test. International Review of Social Psychology, 30, 92-101.
https://doi.org/10.31219/osf.io/sbp6k
[41] Önder, H. (2007) Using Permutation Tests to Reduce Type I and II Errors for Small Ruminant Research. Journal of Applied Animal Research, 32, 69-72.
https://doi.org/10.1080/09712119.2007.9706849

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.