A Co-Evolution Model for Dynamic Social Network and Behavior

Individual behaviors, such as drinking, smoking, screen time, and physical activity, can be strongly influenced by the behavior of friends. At the same time, the choice of friends can be influenced by shared behavioral preferences. The actor-based stochastic models (ABSM) are developed to study the interdependence of social networks and behavior. These methods are efficient and useful for analysis of discrete behaviors, such as drinking and smoking; however, since the behavior evolution function is in an exponential format, the ABSM can generate inconsistent and unrealistic results when the behavior variable is continuous or has a large range, such as hours of television watched or body mass index. To more realistically model continuous behavior variables, we propose a co-evolution process based on a linear model which is consistent over time and has an intuitive interpretation. In the simulation study, we applied the expectation maximization (EM) and Markov chain Monte Carlo (MCMC) algorithms to find the maximum likelihood estimate (MLE) of parameter values. Additionally, we show that our assumptions are reasonable using data from the National Longitudinal Study of Adolescent Health (Add Health).


Introduction
Numerous studies have examined the role friends play in influencing behavior. Researchers have made extensive use of data from the Framingham Heart Study-Network Study (FHS-Net) [1]- [3], the National Longitudinal Study of Adolescent Health (Add Health) [4] [5], and other datasets [6]- [9] to examine whether health behaviors such as smoking and becoming obese can spread between friends. However, the validity of analyses based on observational studies has been called into question by several authors [10] [11]. The main concern is the impossibility of identification of peer influence from peer selection using regression-based approaches [11].
In response to these concerns, the actor-based stochastic model (ABSM) was proposed by [12] [13]. This model employs Markov chain simulation and method-of-moments (MOM) to adjust estimates of peer influence and peer selection parameters using longitudinal data. The underlying model is a random utility function, where the utilities are not observed. This type of model is the most appropriate for scenarios where an actor must make a single choice from a given set of choices [14], although several researchers have applied the ABSM model to continuous behaviors [7] [8].
In ABSM, a continuous time finite-state-space Markov process was used to model the dynamic relationship between social network and behaviors. Three steps describe this process. The first step determines when the chance for the next change will occur. Let λ i n be the rate of change for actor i's network and λ i b be the rate of change for actor i's behavior.
Then the waiting time for the next chance of change is exponentially distributed with parameter λ = ∑ i λ i n + λ i b . Note that the chance of change does not necessarily results in successful change. The second step defines which actor has the opportunity to make a change (either a network change or a behavior change). The probability of a network change taken by a particular actor i is given by λ i n /λ and the probability that this is a behavior change taken by actor i is λ i b /λ. At the third step there is an opportunity to make a change in network or behavior. If actor i is making a network change, there are n possible outcomes, where n is the number of actors in the network. This condition holds because for network changes, at most one tie difference from the current network is allowed; no network change is also allowed. Say, y is the current network. The next network y′ must be either equal to y or deviate from y exactly one element in row i. To simplify notation, for adjacent matrix y and indicators i = 1, 2, ···, n and j = 1, 2, ···, n, we define a mapping function c(y, i, j) that maps y to a new matrix y′ whose (s, t)th element y st ′ equals the (s, t)th element of y, which is y st , when s ≠ i or t ≠ j. If s = i and t = j, there are two situations. If j = i, then y ij ′ = y ij . If j ≠ i, then y ij ′ = 1 − y ij . For actor i, let u ik be the k th effect for network evolution, which is a function of network variable y or behavior variable z, or both. Therefore, we can write u ik = u ik (y, z) to emphasize this relationship. Then the network objective function of actor i is f i n (y, z) = ∑ k ∈ K β k n u ik (y, z) where β k n are parameters that are either given (in a simulation) or estimated from data (in a analysis), and is a set of the effects of interest. The probability that actor i will make a network change and have a new network value y′ = c(y, i, j) is If actor i is going to make a behavior change, there are 3 possible outcomes: increase 1 unit, stay the same, or decrease 1 unit. Similarly, for i = 1, 2, ···, n and j = 1, 2, 3, define a mapping function d(z, i, j) that maps vector z to a new vector z′ whose sth element z s ′ equals to the s th element of z, which is z s , when s ≠ i. If s = i, then z i ′ = z i + j − 2 for j = 1, 2, 3. For actor i, let v ik (y, z) be the k th effect for behavior evolution. Then the behavior objective function of actor i is f i b (y, z) = ∑ k ∈ K β k b v ik y, z . The probability that actor i will make a behavior change and have a new behavior value z′ = d(z, i, j) is In summary, the probability to change to a new set of value (y′, z′) in the next step is if z′ = z and y′ = c(y, i, j), 1 ≤ i ≤ n and 1 ≤ j ≤ n, To use ABSM, the behavior variable must be bounded and discretized. For continuous behavior variables, such as body mass index (BMI), time spent watching television, etc., the process of discretizing can be arbitrary and tricky. In Section 3 (Results), we show that the effect of average BMI similarity can be very different for integer and categorical BMI.
Based on the above considerations, we were motivated to develop a linear-based behavior evolution model. In our model, the network evolution is similar to ABSM. However, the behavior evolution is defined by a continuous Markov process, which is completely different from [12] [13]. To simplify computation, we consider only a real network change as an "event" (instead of the opportunity of change). In addition, for behavior evolution, we assume normal residuals for values of change.

Complete and Observed Data
For illustration purpose, consider two waves of data that are collected at time 0 and T. The complete data during time period (0, T) include number of events k, time of events t 0 = 0, t 1 , ···, t k , y 0 , y 1 , ···, y k , y T = y k (or write as y 0 , (i (1) , j (1) ), ···, (i (k) , j (k) ), where (i (s) , j (s) ), 1 ≤ i (s) ≠ j (s) ≤ n, is the network edge changing at time t s , s = 1, ···, k), and behavior variable z 0 , z 1 , ···, z k , z T = z k+1 . The observed data include network variables y 0 , y T and behavior variables z 0 , z T . All the other variables occur between observations, and thus are considered missing in the complete data set. The joint evolution of network and behavior is shown in the following flow chart: Here the observed data are represented in black ovals, missing behavior data in blue ovals, and missing network data in red ovals. The network evolution process is represented by red arrows and behavior variable by blue arrows.

Occurrence of Events
The number of events k during time period (0, T) follows a Poisson distribution with rate λT. Conditional on k, the event times t 1 , ···, t k has the joint probability density function For now, we assume the chance of making a network change is the same for each actor. This assumption can be extended to be actor-specific if the data are informative enough.

Network Evolution
Let u(y, z) be an arbitrary vector of statistics of the graph and behavior, θ be the vector of coefficients, and ψ(θ) be the normalizing factor. Define δ ij y, z = u y ij If the current network is y and the behavior immediately before the next event is z, the probability to change edge (i, j) at next event is P change(i, j) = 1 n · exp 2y ij − 1 θ T δ ij y, z ∑ j′ ≠ i exp 2y ij′ − 1 θ T δ ij′ y, z .

Behavior Evolution
Define ΔZ(t 1 , t 2 )=Z t 2 − Z t 1 , the vector of behavior variable changes from time t 1 to t 2 . For any time t ∈ (t u , t u+1 ), we propose the following co-evolution model for behavior variables ΔZ t u , t = αW u ΔZ t u , t + t − t u Xγ + ε u (6) where W u = () n×n is a matrix of functions of y t u , X = () n×p is the matrix of p covariates, γ = (γ 1 , ···, γ p ) T is the vector of coefficients for general trend for BMI, ε u = ε 1 u , ⋯, ε n u T are all independent from each other or any other random variable, and ε u follows a multidimensional normal distribution with mean zero and variance matrix (t − t u )σ 2 I n . Note that W u represents individuals' friendship network variables. This is saying that the change of individuals' behavior is a function of friends' behavior change. The parameter α measures how strong this relationship is. In the next subsection we give an example choice of W and explain this function more intuitively.
Note that when t = t u , the variance of ε u is zero and therefore there is no change at all; when t increases, the variance of ε u increases, as one would expect. Equation (6) can also be written as Since behavior variables are accumulated over time, we would expect that when modeling behaviors, the distribution of change from time t u to t u+1 is consistent with a two-step process: first from t u to t, then from t to t u+1 . In our model, this condition is naturally satisfied because where ε u and ε t are independent and both follow multi-dimensional normal distributions with mean zero and variances (t − t u )σ 2 I n and (t u+1 − t)σ 2 I n respectively, which indicates that ε u + ε t follows a normal distribution with mean zero and variance (t u + 1 − t u )σ 2 I n . This is exactly what we expect. Note that in ABSM [7] [8], this condition is usually not satisfied for continuous behavior variables.

An Example Choice of W
As an example, assume that the i th individual's BMI change during time (t 1 , t 2 ), where 0 ≤ t 1 < t 2 ≤ T, is a linear function of the average change of BMI of his/her friends. That is, where ε i is independent of any other random variable and follows a normal distribution with mean zero and variance (t 2 − t 1 )σ 2 . When written in matrix format: where W = (diag(y 1+ , ···, y n+ )) −1 y. That is, (I n − αW) ΔZ = ε. We can then assume that the corrected behavior variables (I n − αW) ΔZ follow a multi-dimensional normal distribution with mean 0 and variance matrix (t 2 − t 1 )σ 2 I n . A network effect exists if α ≠ 0.

Complete Data Log-Likelihood Function
Exponential random graph models (ERGMs) are commonly employed to test whether the presence of network ties (edges) differs from what would be expected in a random graph, given some set of network statistics [15]. In the ERGM, the parameters are η = (λ, θ, γ, α, σ) with dimension = 3 + p +q, where p is the number of covariates and q is the number of network statistics in the ERGM. The complete data log-likelihood function is

Normal Distribution to Simulate Behavior Variable Z u
In the general multi-dimensional situation, assume that X 1 ~ N (0,Σ 1 ), X 2 ~ N(0,Σ 2 ), X 1 and X 2 are independent. Then X 1 + X 2 ~ N (0,Σ 1 + Σ 2 ). The distribution of X 1 conditional on X = X 1 + X 2 is normal with mean I n + ∑ 2 ∑ 1  Here Σ 2 is unknown since y h−1 , h = u + 1,⋯,k are not available at time t u . To solve this problem, we simply ignore the variations in W u−1 , ⋯,W k−1 and use W u−1 to replace all the other W s to generate an approximate distribution. Then we use Metropolis-Hastings algorithm to find the acceptance ratio and adjust samples to the right distribution. Let all W s equal to W u−1 , Σ 2 can be simplified as (T − t u )σ 2 ((I−αW u−1 )(I−αW u−1 ) T ) −1 , which is (T −t u )/(t u −t u−1 )·Σ 1 . Therefore, we propose to sample z u according to the normal distribution with mean (t u −t u−1 )/(T−t u−1 )·X and variance (T−t u )/(T−t u−1 )·Σ 1 .

Sample Hidden Variables Conditional on Observed Data
Remember that the observed data are y 0 , z 0 , y T , z T and the hidden variables are k, t 1 ,⋯t k , y 1 ,⋯, y k−1 , z 1 ,⋯, z k . With known y 0 , the network variables y 1 ,⋯, y k−1 can also be written as (i (1) , j (1) ), ⋯, (i (k−1) , j (k−1) ). With known z 0 , the behavior variables z 1 ,⋯, z k can also be written as Δz(t 0 , t 1 ),⋯, Δz (t k−1, t k ). The following sampling steps will sample the above hidden variables conditional on y 0 , z 0 , y T , z T .
• Sample k: let d be the number of edges (i, j) such that y i, j 0 ≠ y i, j T . Then it must follow that k = d + 2a for some a = 0,1, ⋯.
-If d is even, -If d is odd, • Sample t 1, ⋯ , t k conditional on k: ordered uniform (0,T).

1.
Sample Δz (t 0 ,t 1 ) from the following multinormal distribution, conditional on y 0 , z 0 , y T , z T , k, t 1, ⋯ , t k : and evaluate the density function of the above normal distribution at the realized value Δz(t 0 ,t 1 )′, which is denoted by q 1 (z′).

a.
Define the important list to be L = {(i 1 , j 1 ),⋯, (i d , j d )}, where y i u j u 0 ≠ y i u j u T for u = 1,⋯,d.
-If (i, j) ∉ , add (i, j) to , change d to d + 1, and change a to a−1.

d.
Denote the probability from the situlation of a > 0 or a = 0 by r 1 (y′).

4.
Use the Metropolis Hastings algorithm to decide whether to accept the generated sample (y′, z′) or not.

Results
We used the Add Health "saturation sample" data to check the reasonableness of our assumptions and to perform simulation studies. First, we show results based on the ABSM model; next we compare these results with our co-evolution model.
The Add Health saturation sample data are based on adolescents in 16 high schools where all students in a given school were asked to participate. There are two waves (1 year apart) of friendship network data, including environmental variables and self-reported height/ weight. We focus on one school called "Jefferson High" as in [16] [17], where over 99% students are white. In this data set, the sample size with complete data over two waves is 624, among which 52.7% are males. The grade levels range from 9 to 11, the average BMI is 23.1 with SD being 4.4 and the average outdegree (number of friends named) of the network is 4.0 with SD being 2.1.

Results for ABSM Models
The results based on ABSM are in Table 1 The probabilities for BMI evolution are shown in Table 2. The results indicate that for individuals whose BMI is greater than 17.5 there is a higher probability of an increase in BMI, which is consistent with the observed propensity for BMI to "track" over time [18] [19]. However, for individuals whose BMI is less than 17.5, the results indicate a higher probability of decrease in BMI; this may not be reasonable.

Validation of Assumptions in the Joint Evolution Model
Using the Add Health data for the school of Jefferson High, we can draw the histogram of BMI change and screen time change between these two waves ( Figure 1). From Figure 1, we see that the normality assumption is not perfectly satisfied due to larger amount of observations around zero. However, the distributions are approximately symmetric, which is usually sufficient in a linear model if sample size is moderately large (for example, greater than 30).
We also draw the scatter plot of individual's BMI change versus average friends' BMI change to check linearity assumption. The plot in Figure 2 suggest weak linear relationship between these two variables. Note that to draw this plot, we consider only friends who were nominated at both waves so that the BMI change comparison is valid. Therefore, the relationship shown reflect only part of the data, which contributes to the weakened linear relationship. These findings suggest that our assumptions are approximately satisfied.

Simulation Study
To simulate a realistic network with reasonable BMI values assigned to each individual, we randomly sampled 30 individuals (from the same school) in the Add Health data. The average BMI of selected individuals is 22.9 kg/m 2 . We then create an initial network using Bernulli graph with density = 0.3.
We specified that network and BMI would evolve for 60 days using the following parameters values: λ = 1, γ = 0.001, α = 0.1, σ = 0.1, and θ = (−3.42,2.33,0.50,0.39) with corresponding statistics of outdegree, reciprocity, transitive triplets and BMI total similarity. In the simulated data set, the number of network change events = 65, within which 21 edges change from 1 to 0 and 44 change from 0 to 1. The average BMI after 60 days is 23.6 and network density 0.33.
Apply the EM procedure described in Methods section, we obtained the following parameter estimations (Table 3). From the table, we see that some of the parameter estimates, such as event rate λ, network effect α, and coefficient of out degree θ 1 are relatively accurate. The variance σ 2 is underestimated. The other parameters are not significant comparing to zero.
This suggest that our algorithm can find reliable parameter estimates for those that are significantly different from zero.
The explanation of the above parameters are mostly straight forward. For example, the rate of events λ = 1.08 indicates that on average, there is 1.08 edge change during one unit time (one day in this example). The coefficient of trends γ is not distinguishable from zero, which means that there is no significant trend of BMI increase or decrease during these 60 days. The parameter that is of most interest is the network effect α, which is 0.12 in this example. This means that whenever an individual's friends' average BMI increases/ decreases by one unit, this individual's BMI is expected to increase/decrease by 0.12.

Application to Real Data
Since our model cannot deal with a network as large as 624 individuals, we include only students in grade 11 in this application. The sample size here is 110. We first run the ABSM model using RSiena (Table 4). Then we fit in our joint co-evolution model (Table 5).
Compare results from Table 4 and Table 5. We found out that the results for network evolution are similar from both models. This is because we are using the same network evolution models. The different behavior models have only limited effect on the network evolution process. Both model suggest that there is no effect of selection or influence. That is, similarity in BMI does not affect the process of making friends; an individual's friends' BMI change does not affect his/her own BMI. Note that when we use the complete data of 624 individuals, we got significant effect of selection (p = 0.0309) and influence (p = 0.0002). The insignificant results here are due to reduced sample size and lost information in missing values.

Discussion
We have developed a joint social network and behavior evolution model. In our model, behavior changes are consistent over time. That is, ΔZ and ΔZ 1 + ΔZ 2 have the same distributions. Our model is robust to scaling of behavior variables, and parameter values are easy to interpret. In addition, this framework may be readily expanded to study valued networks.
The field of social network analysis is a relatively young field. However, useful contributions are being made today. The range of applications is vast, from the contagion of health behaviors described in this paper [20], to the study of group formations in human societies [21]. Further advances will require improved statistical methods (to deal with different types of behaviors departing from the discrete choice model), as well as more extensive empirical data sets incorporating social networks. Many future studies will use continuous outcome measures; we hope the method presented here will be valuable in extending the ABSM to such outcomes.
Our model does require intensive computation. However, we are confident that more efficient algorithms can be developed. Though our model requires specific assumptions, we have demonstrated that these assumptions are reasonably easy to satisfy using real data. Sensitivity analysis will ultimately be required to determine if our model works well when some of the assumptions are violated.