Simulating the Effect of Social Network Structure on Workflow Efficiency Performance

The effect of social network structure on team performance is difficult to investigate using standard field observational studies. This is because social network structure is an endogenous variable, in that prior team performance can influence the values of structural measures such as centrality and connectedness. In this work we propose a novel simulation model based on agent-based modeling that allows social network structure to be treated as an exogenous variable while still allowing it to evolve over time. The simulation model consists of experiments with multiple runs in each experiment. The social network amongst the agents is allowed to evolve between runs based on past performance. However, within each run, the social network is treated as an exogenous variable that directly affects workflow performance. The simulation model we describe has several inputs and parameters that increase its validity, including a realistic workflow management depiction and real-world cognitive strategies by the agents.


Introduction
A social network (SN) is a structure whose nodes represent members in a social context and whose edges can represent interaction, collaboration or influence between the members [1]. SN analysis has attracted considerable interest from social and behavioral scientists over the last few decades [2,3]. Recently, management researchers have also recognized that organizations can benefit from the interactions within the informal social network amongst their members, which can often supplement the official hierarchy imposed by the organizational chart [4,5]. While social networks may be represented in several ways, in this work we utilize socio-matrices, where the sending members are the rows and the receiving members are the columns [2].
Several measures have been used in the SN literature to characterize a network, from the perspective of either a single actor or a group. Actor level measures include the centrality and the prestige of the actor in a SN, with finer definitions including degree centrality, closeness centrality and betweenness centrality [6]. Group or team level measures include the centrality of the leader of the team, overall team density (how interconnected are the members?), and a related construct: the overall team cohesiveness, defined as the "forces that act on members to stay in the group" [7].
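As an illustration of an actor level measure, the sketch below computes out-degree and in-degree centrality from a socio-matrix whose rows are senders and whose columns are receivers. The function name, the normalization by n - 1, and the sample matrix are illustrative assumptions rather than part of the model described here.

```python
def degree_centrality(socio):
    """Return (out_degree, in_degree) lists, each normalized by n - 1.

    socio[i][j] > 0 means member i sends a tie to member j
    (rows are senders, columns are receivers).
    """
    n = len(socio)
    out_deg = [
        sum(1 for j in range(n) if j != i and socio[i][j] > 0) / (n - 1)
        for i in range(n)
    ]
    in_deg = [
        sum(1 for i in range(n) if i != j and socio[i][j] > 0) / (n - 1)
        for j in range(n)
    ]
    return out_deg, in_deg


# Hypothetical four-member network used only for illustration.
socio = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]
out_deg, in_deg = degree_centrality(socio)
```

In this sample, member 2 receives ties from every other member, so its in-degree centrality is the highest in the network, which is one common reading of prestige.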
Workflow modeling is an area that attempts to model organizational tasks that can be executed by actors who require resources to accomplish discrete tasks [8]. [9] pointed out that most Workflow Management Systems (WFMSs) refer to underlying organizational role lists in order to allocate activities to machines accessible by agents who can perform these roles. [10] identified several shortcomings in the activity allocation methods of WFMSs, many of which can be attributed to a lack of organizational knowledge on the part of the WFMS. One of the pioneering attempts to overcome these limitations is presented in [10], with the use of Object Constraint Language (OCL) to model teams of agents and their relationships in an organization. A limitation of this approach is that OCL does not support concepts that are usually used to characterize organizational relationships. Similarly, [11] used an object-oriented language to model organizational constraints, with the same limitations.
Literature in the management area on task performance has focused on contingency theory [12], where task completion performance is based on a good fit between task complexity and resources allocated, which may be information resources or otherwise. There is a broad recognition in the management literature that organizational tasks are performed in a social context [13]. [14,15] indicated that formal organizational structures and informal social networks often influence each other, and both are important factors in completing organizational tasks. [16] pointed out how management efforts to inform and motivate employees can affect strategically aligned behavior that can lead to better task performance. Such efforts can be facilitated using social networks that allow diffusion of information amongst employees and teams. [17] proposed a constructural theory of group formation, where individuals exchange information and form their social network within the organization based on current knowledge, which in turn shapes their future knowledge. Thus a social network is an evolving variable, based on whom the actor has interacted with in the past. This dynamic view of evolving networks is also recognized in [18], though there the network is considered an endogenous variable changed by actors based on some objective function.
A basic assumption in all studies investigating the effects of social network metrics on task completion performance has been that social networks serve as conduits for the flow of resources [19]. However, previous work tying the effects of social network metrics to performance has been correlational, with the potential for confounds because of the difficulty of isolating the dynamic aspect of the social network as tasks are executed. Not surprisingly, the findings have been mixed. According to [20], "unresolved empirical questions and theoretical debates persist about whether or not some social network features yield improved task completion…" For example, [21] found no correlation between a team's informal social ties and team performance. However, [22] showed how increased cohesiveness results in lower employee absenteeism. In [23], low cohesiveness was found to negatively affect creative group work such as brainstorming, as well as more routine tasks.
Similarly, [24] proposed that having a well-connected leader correlates negatively with team performance because of the burden of maintaining social ties, while the traditional view (e.g., [25]) has been that better connected leaders have better performing teams.
A fundamental question in social network effects has been the causal direction of the impact of the network structure. Does the network structure cause better performance [26], or does better actor/team performance lead to more social network centrality of teams and actors [27]? According to [28], enhanced reputation of actors based on previous performance may positively impact an actor's centrality within the network, which could in turn reinforce future performance of the actors.
Based on the discussion above, data collection in real world organizations cannot address the direction of causality. This work takes an initial step in resolving this issue by proposing a simulation model that uses agents as actors over multiple runs of workflows. For each run, the social network is exogenous. However, it is allowed to evolve after each run, so that it is endogenous across runs. This can help our understanding of the direction of causality. Figure 1 captures the essence of the research model addressed by this simulation, using semantically rich agent behaviors to model organizational tasks.

Simulation Model
Our model of organizational tasks draws from the workflow and management literatures. Workflows are usually modeled as collections of tasks, connected using control flow operators and performed by actors requiring resources [29,30]. In [31], a canonical list of control flow operators connecting the tasks that make up a workflow is described.
In the management area, for example, the PCAN model proposed in [32] uses people, resources and tasks matrices to model organizations. This approach has been further extended into the meta-matrix model and used in [33,34] to study the evolution of terrorist networks. The meta-matrix model [35] recognizes several column vectors such as personnel, tasks, resources and knowledge. Matrices representing relations between these column vectors are used to model views of the organizations. In earlier works, metrics imposed on each of these relations have been used to produce an overall view of the operational risk in the organization. For example, risk increases if an employee has exclusive access to a resource, or if there is a mismatch between the people-task matrix and the people-resource matrix, using the task-resource matrix as a reference. We model the organizational tasks in a similar way, and our simulation model aims to extend current work in the area by using Agent Based Modeling Systems (ABMS) based experiments to study the effects of differing network structures on task performance.

Basket of Independent Tasks
As mentioned in [17,18], social networks evolve as tasks are completed. Our simulation plan consists of several experiments. Each experiment consists of a series of runs. The SN is modeled as an exogenous variable in each run that evolves for the next run. During each run, organizational actors will execute tasks using resources that they possess or that are garnered from their social network. Each experiment ends when an optimal social network has been reached, i.e., when there is no significant improvement in the task completion efficiency over previous runs.
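The stopping rule for an experiment can be sketched as a simple convergence test on the completion times of successive runs. The text does not define "significant improvement", so the relative `tolerance` and the `window` of recent runs below are hypothetical parameters chosen only for illustration, and completion times are assumed to be positive.

```python
def has_converged(completion_times, tolerance=0.01, window=3):
    """Return True when the most recent runs no longer improve noticeably.

    completion_times: task-basket completion time of each run, in order.
    tolerance: assumed relative improvement below which runs count as flat.
    window: assumed number of most recent runs compared with earlier ones.
    """
    if len(completion_times) <= window:
        return False  # not enough history to compare against
    previous_best = min(completion_times[:-window])
    recent_best = min(completion_times[-window:])
    improvement = (previous_best - recent_best) / previous_best
    return improvement < tolerance


# Hypothetical histories of completion times.
still_improving = [100, 90, 80, 70, 60, 50]
flattened = [100, 80, 70, 69.9, 69.8, 69.7]
```

A statistical test over repeated runs would be a more defensible criterion; the threshold form is used here only because it is the simplest to state.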
Each experiment has a task basket that is reset at the start of each run. The SN for a run is derived from the earlier runs, based on the members the actor came in contact with in order to complete their tasks and the cognitive abilities of the actors in the network, which is one of the variables that can be studied in our model. This extends traditional work in social networks, where a study consists of one run, typically in a real world setting, and where actors often change their social network as part of the experiment, thereby making it the dependent or endogenous variable. Because the experiments we propose have multiple runs each, we can allow the social network to evolve in our experiments, but to still be exogenous for each run. Using varying levels of cognitive abilities of actors also extends work in the area of analytical models of agent based economic systems, where each agent is assumed to know every other agent.
The summary of our notation is shown in Figure 2. The set T consists of task instances that make up the task basket for an experiment. We consider task instances as opposed to task types, in order to enable multiple instances of the same task type to be allocated to different members. For example, 500 instances of a task type such as handling a customer call are handled as 500 different task instances, allocated to different possible members.
The relations in Figure 2 are modeled as matrices in our implementation. The values of cells in the X matrix represent the levels of each of the resources available to a member. Of course, a member may have available resources that they do not need, or excess levels of resources that others may be able to utilize. The X matrix represents the resource power of each actor. More resource-powerful actors have access to more types of resources and/or higher levels of a resource which are consumed in tasks.
The cell values in the Y matrix reflect the levels of each resource that a task needs for completion. Sufficient resources should be made available to the system to perform the task instances that are in the task basket for each run.
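The requirement that the system hold sufficient resources for the task basket can be checked by comparing column totals of X and Y. This is a minimal sketch that assumes X is indexed as member by resource and Y as task by resource; the function name and the sample values are illustrative.

```python
def resource_shortfalls(X, Y):
    """Return {resource index: missing amount} for under-supplied resources.

    X[i][k]: level of resource k available to member i.
    Y[j][k]: level of resource k needed by task instance j.
    """
    n_resources = len(X[0])
    shortfalls = {}
    for k in range(n_resources):
        supply = sum(row[k] for row in X)
        demand = sum(row[k] for row in Y)
        if supply < demand:
            shortfalls[k] = demand - supply
    return shortfalls


# Hypothetical example: two members, three task instances, two resources.
X = [[5, 0], [2, 3]]
Y = [[3, 1], [3, 1], [1, 2]]
missing = resource_shortfalls(X, Y)
```

An empty result means the basket is feasible in aggregate; it says nothing about whether each resource is reachable through the social network.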
The cell values in the Z matrix reflect the expected value of the amount of time a member takes to complete a task. A positive value signifies that a member is assigned to a task instance. In the simple case of independent tasks, a task is only done by one member, and hence only one cell per column will have a positive value, all other cells in that column being 0. The actual time taken to complete a task in a simulation run is based on the expected value and a simulation parameter $D_T$ that represents the percentage interval for deviation in performance time for actors on tasks.
The cell values in the S matrix can be either binary or continuous between 0 and 1, depending on whether social links are assumed to be binary or weighted. For nondirectional links, only half the matrix is considered, since the matrix will be symmetric. For directional links, the entire matrix is used.
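One way to turn an expected time from Z into an actual time for a run is to draw from an interval around the expected value. The text does not name a distribution, so the uniform draw below is an assumption, as are the function name and the convention that `d_t` is given as a fraction (0.2 for 20 percent).

```python
import random


def actual_task_time(expected_time, d_t, rng=random):
    """Draw a completion time from expected_time * [1 - d_t, 1 + d_t].

    expected_time: the positive Z cell for the assigned member and task.
    d_t: deviation interval as a fraction; a uniform draw is assumed here.
    rng: any object with a uniform(a, b) method, e.g. random.Random(seed).
    """
    low = expected_time * (1.0 - d_t)
    high = expected_time * (1.0 + d_t)
    return rng.uniform(low, high)


# A seeded generator keeps a run reproducible.
rng = random.Random(0)
sample = actual_task_time(10.0, 0.2, rng)
```

Passing a seeded generator per experiment makes it possible to replay the same task times while varying only the social network.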
Each run in an experiment starts with the same task basket that needs to be completed and the same task assignment and mean times. This means that the Y and Z matrices are created at the start of an experiment and reset before each run. At the start of each run, the X matrix (linking actors to resource levels they control) is perturbed, the degree of perturbation P being a simulation parameter. The social network matrix S is created at the start of each experiment and will change at the start of each subsequent run in the experiment, based on the actors' cognitive abilities. This implies that the S matrix is generated at the start of each run of an experiment, based, broadly speaking, on whom the actor came into contact with in the earlier runs of the experiment.
During each run, actors try to locate resources to perform the next task in their queue. If an actor in their social network has the resource, it is given with a delay = R, where R is a simulation parameter representing a percentage of the expected time for the task. If a second degree interaction is required, it is with a time delay = $R^2$, and so on. All resources required for a task have to be reachable by an actor, though some only after many degrees of separation.
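The resource search can be sketched as a breadth-first search over the social network that stops at the nearest actor holding the resource. The delay rule is an assumption: the sketch reads the second degree delay as R squared and generalizes it to R to the power k at degree k, scaled by the expected task time, and it exposes that rule as a replaceable `delay_factor` argument. All names are illustrative.

```python
from collections import deque


def resource_delay(S, X, requester, resource, expected_time, R,
                   delay_factor=lambda R, k: R ** k):
    """Return the delay to obtain `resource`, or None if it is unreachable.

    S[a][b] > 0 means actor a has a social link to actor b.
    X[a][k] > 0 means actor a holds some of resource k.
    delay_factor(R, k) is the assumed fraction of expected_time charged
    when the holder is k degrees of separation away.
    """
    if X[requester][resource] > 0:
        return 0.0  # the requester already holds the resource
    visited = {requester}
    frontier = deque([(requester, 0)])
    while frontier:
        actor, degree = frontier.popleft()
        for other in range(len(S)):
            if S[actor][other] > 0 and other not in visited:
                if X[other][resource] > 0:
                    return delay_factor(R, degree + 1) * expected_time
                visited.add(other)
                frontier.append((other, degree + 1))
    return None


# Hypothetical chain 0 - 1 - 2 plus an isolated actor 3; only actor 2 holds resource 0.
S = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
X = [[0], [0], [1], [0]]
```

Because the search expands one degree of separation at a time, the first holder found is always a nearest one, so the returned delay is the smallest available under the assumed rule.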

Strategies for Adapting Social Network
The goal of agents is to quickly locate the most resourceful agents. Agents therefore strategically choose connections, i.e., adapt their social networks, based on past experience to locate agents who possess the resources needed by most arriving tasks.

Variable Number of Agent Connections
We propose several simple strategies for adapting the social network. In these strategies the number of connections each agent possesses is variable, but bounded by the simulation parameter, S, which represents the maximum number of connections an agent can have in its social network. The maximum size S of each actor's social network is part of the model of an agent's cognitive capacity. One simple example is that once an actor is contacted by another, they become part of their social network. If an actor is not used by another for N runs, they move out of that actor's network in the (N + 1)th run, where N is a simulation parameter. If a weighted network is used, where the links are not binary but instead represent the strength of ties, then the link weights may be reduced by a degree, D, also a simulation parameter. If the size of an actor's network approaches S, the social network link used least, or used in the most distant past, may be eliminated.
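The simple example strategy above can be sketched for a binary network by tracking, for each link, the run in which it was last used. The dictionary layout, the function name, and the choice to evict only when the limit is exceeded (rather than approached) are assumptions made for this illustration.

```python
def update_network(last_used, contacted, run, max_links, n_idle):
    """Update one actor's binary social network after a run.

    last_used: dict mapping a connected actor to the run it was last used.
    contacted: actors this actor obtained resources from in the current run.
    max_links: the limit S on network size (cognitive capacity).
    n_idle: the parameter N; links unused for N runs are dropped.
    """
    # Anyone contacted in this run joins (or stays in) the network.
    for other in contacted:
        last_used[other] = run
    # Drop links that have now gone unused for n_idle runs.
    for other in [o for o, r in last_used.items() if run - r >= n_idle]:
        del last_used[other]
    # Enforce the size limit by evicting the least recently used links.
    while len(last_used) > max_links:
        oldest = min(last_used, key=last_used.get)
        del last_used[oldest]
    return last_used


# Hypothetical history for one actor with S = 2 and N = 2.
network = {}
update_network(network, ["b", "c"], run=1, max_links=2, n_idle=2)
update_network(network, ["d"], run=2, max_links=2, n_idle=2)
size_after_run_2 = len(network)
```

A weighted variant would replace the deletion step with a reduction of the link weight by D, removing a link only once its weight reaches zero.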

Fixed Number of Agent Connections
We propose three strategies for adapting the social network of agents when all agents utilize the maximum number of connections allowed, S: random mixture, random selection, and rewiring with exploration [36,37]. Random mixture (RM) is the simplest strategy: at each iteration agents randomly reinitialize every connection. Note that the RM strategy does not require any long term memory for the agents and can be used by agents with very limited cognitive capabilities. When using random selection (RS), an agent first decides whether it should change any of its connections. It keeps an exponentially weighted moving average, V, of the utility gained in each iteration. The utility agent i expects to gain in the next iteration, t, is $V_t^i = (1 - \alpha) V_{t-1}^i + \alpha \, \Delta U_{t-1}^i$, where $\alpha$ is a utility learning parameter and $\Delta U_{t-1}^i$ is the change in utility for agent i in period t − 1. If the expected gain falls below a utility threshold parameter, $\Theta$, i.e., $V_t^i < \Theta$, then the agent chooses to rewire. If it chooses to change some of its connections, it still must choose which connections to rewire. That decision is also based on an exponentially weighted moving average of connection strengths represented by connection weights. If $\Delta U_{t-1}^{ij}$ is the change in utility that agent i could have received by contacting agent j on iteration t, agent i updates its connection weight $W_t^{ij}$ for the connection to agent j as follows: $W_t^{ij} = (1 - \beta) W_{t-1}^{ij} + \beta \, \Delta U_{t-1}^{ij}$, where $\beta$ is a weight learning parameter. The agent changes every connection for which the connection strength falls below a weight threshold parameter $\Phi$, i.e., $W_t^{ij} < \Phi$. New connection weights are initialized to the average of the current connection weights.
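A single RS update for one agent can be sketched as two exponentially weighted moving averages followed by two threshold tests. The parameter names `alpha` and `beta` for the utility and weight learning parameters are assumptions, as is the standard moving-average form used in `ewma`; `theta` and `phi` stand for the thresholds Θ and Φ.

```python
def ewma(previous, observation, rate):
    """Exponentially weighted moving average with learning rate `rate`."""
    return (1.0 - rate) * previous + rate * observation


def random_selection_step(v_prev, d_u, weights, d_u_by_link,
                          alpha, beta, theta, phi):
    """One RS update for a single agent.

    v_prev: the agent's previous expected utility V.
    d_u: change in the agent's utility in the last period.
    weights: dict mapping a connected agent j to the connection weight W.
    d_u_by_link: dict mapping j to the utility change attributable to j.
    Returns (new V, updated weights, list of connections to rewire).
    """
    v_new = ewma(v_prev, d_u, alpha)
    new_weights = {
        j: ewma(w, d_u_by_link.get(j, 0.0), beta) for j, w in weights.items()
    }
    to_rewire = []
    if v_new < theta:
        # The agent rewires every connection whose weight fell below phi.
        to_rewire = [j for j, w in new_weights.items() if w < phi]
    return v_new, new_weights, to_rewire


def initial_weight(weights):
    """New connections start at the average of the current weights."""
    return sum(weights.values()) / len(weights)


# Hypothetical agent with two connections.
v, w, drop = random_selection_step(
    1.0, 0.0, {1: 1.0, 2: 0.2}, {1: 1.0, 2: 0.0},
    alpha=0.5, beta=0.5, theta=0.6, phi=0.5,
)
```

Choosing the replacement partners is left out of the sketch, since the text only says that the strategy selects them at random.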
When using the decaying exploration (DE) strategy, each agent has an initial exploration rate $x_0 \in (0,1]$, and this exploration rate is reduced at a rate $\eta$ in every iteration, giving the exploration rate $x_t$ at iteration t. The rate of change of connections will be based on $x_t$ as well as $V_t^i$ as described above and the base expected utility, $V_0^i$. In the DE strategy the probability of an agent rewiring a connection is given by $x_t \cdot \max\left(0,\, 1 - V_t^i / V_0^i\right)$. The base expected utility is initialized as the average expected utilities for other connected agents. As in the RS strategy, agents keep track of the weight for each connection, $W_t^{ij}$. However, while the RS strategy can change multiple connections in one time step, an agent using the DE strategy is more cautious and changes only the connection with the lowest weight, and only if the corresponding weight satisfies the condition $W_t^{ij} < \Phi$. Variants of these strategies that assume varying degrees of agent cognitive abilities can be developed and evaluated as well, based on our experimental results with these three strategies for adapting the social network in an organization to more effectively process assigned tasks. Table 1 lists the simulation parameters for our model.
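The DE strategy can be sketched in the same style. Two pieces here are assumptions rather than statements of the model: the decay schedule is taken to be multiplicative, x_t = x_0 (1 − η)^t, and the rewiring probability is read as x_t · max(0, 1 − V_t / V_0) with a positive base utility V_0. Function and parameter names are illustrative.

```python
import random


def exploration_rate(x0, eta, t):
    """Assumed multiplicative decay of the exploration rate."""
    return x0 * (1.0 - eta) ** t


def rewire_probability(x_t, v_t, v_0):
    """Assumed form: rewiring is likelier when V_t falls below V_0 > 0."""
    return x_t * max(0.0, 1.0 - v_t / v_0)


def decaying_exploration_step(x_t, v_t, v_0, weights, phi, rng=random):
    """Return the single connection to rewire, or None.

    weights: dict mapping a connected agent j to the connection weight W.
    phi: weight threshold; only the weakest link is ever considered.
    """
    if not weights:
        return None
    if rng.random() >= rewire_probability(x_t, v_t, v_0):
        return None  # the agent does not explore in this iteration
    weakest = min(weights, key=weights.get)
    return weakest if weights[weakest] < phi else None


# Hypothetical agent whose current utility has collapsed relative to its base.
rng = random.Random(0)
choice = decaying_exploration_step(1.0, 0.0, 1.0, {1: 0.9, 2: 0.1}, 0.5, rng)
```

Compared with the RS sketch, at most one connection changes per iteration, which matches the description of DE as the more cautious strategy.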
While the time taken to complete the task basket is one measure of interest, we are also interested in the number of runs it takes to evolve to an optimal social network, where each actor can complete their task with minimal second or higher order interactions with members outside their network. The run with the optimal social network will provide the most efficient completion time for the task basket for an experiment.
Next, we highlight some changes to the model when the tasks are interdependent.

Interdependent Tasks and Teams
A basket of workflow instances (interdependent tasks) is generated at the start of each experiment, and the same basket would be used in each run of the experiment. We draw from the workflow modeling literature, where workflows are modeled as tasks (or activities) that are combined using a canonical set of control flow operators. These operators are precedence, AND-split, AND-join, OR-split, OR-join and XOR-split [31].
When modeling interdependent tasks, we use the same sets and matrices as in Figure 2. However, an additional set of workflow instances W is needed, where each task instance in T is linked to one instance in W, though each workflow instance can be linked to many task instances. The average number of tasks per workflow will be Q, an additional simulation parameter. An algorithm for creating the interdependence between the tasks for each workflow is shown in Figure 3.
The rest of the simulation model is similar to the one for independent tasks. The primary difference here is that some tasks have to wait for other tasks based on interdependence. As part of the simulation implementation, a workflow execution engine would be needed that will implement the control flow so that certain tasks are on the wait queue of the workflow engine until other tasks are finished.
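The generation of interdependence between tasks can be sketched as a loop over workflows and their tasks that attaches one randomly chosen control flow operator to each task. The operator labels, the `max_fanout` bound on how many other tasks a non-precedence operator links to, and the data layout are assumptions for this illustration; the sketch also makes no attempt to guarantee that the resulting workflow is acyclic or well formed.

```python
import random

OPERATORS = ["PREC", "AND-S", "AND-J", "OR-S", "OR-J", "XOR-S"]


def generate_dependencies(workflows, max_fanout, rng=random):
    """Attach one control flow relation to each task of each workflow.

    workflows: dict mapping a workflow id to its list of task ids.
    max_fanout: assumed upper bound (at least 1) on linked tasks.
    Returns a list of (task, operator, linked_tasks) tuples.
    """
    relations = []
    for tasks in workflows.values():
        for task in tasks:
            others = [t for t in tasks if t != task]
            if not others:
                continue  # a single-task workflow has no interdependence
            operator = rng.choice(OPERATORS)
            if operator == "PREC":
                linked = [rng.choice(others)]  # precedence links one task
            else:
                count = rng.randint(1, min(max_fanout, len(others)))
                linked = rng.sample(others, count)
            relations.append((task, operator, linked))
    return relations


# Hypothetical basket with one three-task workflow and one single-task workflow.
basket = {"w1": ["t1", "t2", "t3"], "w2": ["t4"]}
relations = generate_dependencies(basket, max_fanout=2, rng=random.Random(1))
```

A workflow engine would then hold a task on its wait queue until the tasks named in the relations that gate it have finished.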
Again, one can examine the effects of the beginning social network structure on workflow completion efficiency, as well as the evolution of the SN over multiple runs. Additionally, team level SN measures such as mean team cohesion and mean connectedness of the team with outside resources can be used to characterize different types of starting social networks for experiments.

Operationalization of Variables for Experiments
We now describe how independent variables shown in Figure 1 are operationalized for different experiments.
Figure 4 depicts a sample list of operationalizations for each of the independent variables whose effects can be tested as the goal of each experiment. The task characteristics variable has been described above. For the cognitive ability of actors or agents, the maximum degree of each actor will reflect their cognitive capacity and can be varied. We also propose two other levels: one with limited working memory and one with working as well as long term memory. With only short term memory, actors will only remember the history of the current run and the social network will be updated based on the degree of usage of each link, with the maximum number of links being a constraint. With long term memory, agents will remember the history from previous runs in an experiment and be able to optimize their social network based on longer histories as well as memories of optimal networks from the past.
The social network variable is the main variable of interest and several characterizations are possible for creating starting social networks. The average degree reflects the average number of links to which an actor is connected [38]. The small world topology [39] is one where all actors are connected to a few actors but long links go out across the network, so the degrees of separation are usually small.
The transitivity of a network is a measure of the likelihood of whether the friend of a friend is also your friend. A transitivity coefficient for each vertex is the ratio of the number of triangles connected to vertex v to the number of triples centered on v [39]. Measures like centrality and network density can be used to characterize networks, based on individual links between members (nodes) of that group. Centrality of each member can be characterized by the number of other nodes to which the member is linked [40]. Network density reflects how reachable a node is, on average, from any other node in the network.
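The per-vertex transitivity coefficient can be computed directly from the S matrix. This sketch assumes a binary, nondirectional network, so S is symmetric; the function name and the sample network are illustrative.

```python
from itertools import combinations


def local_transitivity(S, v):
    """Triangles through v divided by triples centered on v.

    S is assumed symmetric; S[a][b] > 0 means a and b are linked.
    Returns 0.0 when v has fewer than two neighbors (no triples).
    """
    neighbors = [u for u in range(len(S)) if u != v and S[v][u] > 0]
    triples = len(neighbors) * (len(neighbors) - 1) // 2
    if triples == 0:
        return 0.0
    # A triple centered on v closes into a triangle when its ends are linked.
    triangles = sum(1 for a, b in combinations(neighbors, 2) if S[a][b] > 0)
    return triangles / triples


# Hypothetical network with links 0-1, 0-2, 0-3 and 1-2.
S = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
```

Averaging this coefficient over all vertices gives one network level summary that can be used to characterize a starting social network.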
The level of mismatch between resource and social links can be used to test hypotheses drawn from the management literature [12,32,35] where resource allocations are not aligned with the conduits to harness resources (social networks).
The team for a workflow instance will be all the members assigned to execute the task instances for a workflow. We define the cohesiveness as the number of links between members of the team, divided by the theoretical maximum number of links possible between the members [41]. Similarly, team connectedness with outside members can be defined as the number of links going out from team members to those outside the team, divided by the expected number of such links. The expected number of links can be computed in several ways, including the average number of links between any two members in the organization.
The discussion above illustrates the different experiments that can be run, based on selecting one operationalization for each independent variable. In addition, the interaction effects between these variables can be studied.
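The two team level measures can be sketched as follows for a binary, nondirectional S matrix. The expected number of outside links is passed in as a positive argument because the text allows it to be computed in several ways; the function names and the sample network are illustrative.

```python
def team_cohesiveness(S, team):
    """Links inside the team divided by the maximum possible number."""
    members = sorted(team)
    n = len(members)
    if n < 2:
        return 0.0  # no pair of members exists
    links = sum(
        1
        for i, a in enumerate(members)
        for b in members[i + 1:]
        if S[a][b] > 0
    )
    return links / (n * (n - 1) / 2)


def team_connectedness(S, team, expected_links):
    """Links from team members to outsiders divided by an expected count."""
    outsiders = [u for u in range(len(S)) if u not in team]
    outgoing = sum(1 for a in team for b in outsiders if S[a][b] > 0)
    return outgoing / expected_links


# Hypothetical organization with links 0-1, 0-2, 0-3 and 1-2; team = {0, 1, 2}.
S = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
team = {0, 1, 2}
```

One possible choice for `expected_links` is the organization-wide link density multiplied by the number of team member and outsider pairs.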

Simulation Platform
Several multi-agent simulation toolkits are available for deploying our model. One example is MASON (Multi-Agent Simulator Of Networks), a multi-agent simulation environment available at http://cs.gmu.edu/~eclab/projects/mason/. MASON is a general purpose, domain agnostic, lightweight framework that provides easily modifiable objects, stochastic event ordering, inspection of simulation objects, visualization in 2D and 3D, graphs, and charts. Repast (http://repast.sourceforge.net/) and NetLogo (http://ccl.northwestern.edu/netlogo/) are general purpose object-oriented frameworks that make simulating natural and social phenomena relatively easy for inexperienced users. MASON is more flexible, faster, and more lightweight than Repast and NetLogo. JADE (Java Agent DEvelopment Framework), available at http://jade.tilab.com/, is a high-level software framework that enables the modeling of multi-agent systems. Two important features of JADE are: 1) it complies with the FIPA specifications (available at http://www.fipa.org/) and 2) its portability to many environments such as J2EE, J2SE, and J2ME. JADE allows each agent to dynamically discover other agents and to communicate with them according to the peer-to-peer paradigm. Though JADE is more sophisticated than MASON, the latter is more suitable for our need of a simulation environment that allows us to develop our own model.
When compared to other agent-based simulation platforms, MASON is fast, easily extensible, and efficient and can support up to a million agents over many iterations. Long-running simulations may be suspended and resumed later. MASON defines agents as computational entities that can be scheduled explicitly to perform some actions and change the environment. Steppable objects can be scheduled to occur at any given timestep. Various agents can be grouped together to perform in parallel. To enable visualization, MASON relates the objects with locations in 2D or 3D grids or network graphs.
In our experimental scenarios, a group of agents is assigned either individual or team-level organizational tasks at each timestep. To simulate such scenarios effectively, a teamwork generator agent is required that would generate tasks, determine groups, and assign these tasks to the agents in the group. If resource requirements for tasks are not met by local resources, the owners of tasks would search for resource provider agents in their environment to supply missing resources. Once agents with the required resources are located, the task owner agents will update their knowledge with the corresponding information. Agents can thus change their social networks after locating more resourceful agents in the organization. One can visualize the network of the agents and the changes in the topology of the network by using the visualization tools of MASON.

Discussion
The broad research questions we seek to address in this work are as follows. First, is it possible to create semantically rich agent based simulations of organizational task execution where simple behaviors lead to emergent patterns? Our approach differs from other studies in the social science area in that we model agent behaviors representing the area of organizational task completion, drawing from both the workflow modeling literature and the management literature. Second, how do different types of social networks in an organization affect task completion efficiency, and how do they evolve to converge to an "optimal" network over multiple runs of the simulation in different situations? While we include task types and agent cognitive makeup in our list of variables that affect task completion efficiency, we are primarily interested in the effect of different types of social networks. A unique feature of our proposal is that the social network is viewed as an exogenous variable in each run of an experiment, but at the same time dynamically evolves through multiple runs of an experiment.

Conclusions
Overall, we expect the primary contribution of this research to add significantly to the body of knowledge on the effects of social network characteristics on organizational work. The immediate impact of the ABMS infrastructure developed in this work will be to provide a realistic setting to test several social network based hypotheses related to organizational work processes that are difficult to measure using real world data collection. More importantly, in the long run, the semantically rich infrastructure developed in this project will allow the validation of analytical models that predict emergent behaviors in the network. It will allow the study of non-linear behavior amongst agents, using threshold level if-then decision making, which is intractable using differential equations.
The infrastructure will also allow the creation of semantically realistic experiments in other areas of sociology and management, where data collection has typically been onerous and often allowed for only correlational analysis, e.g., the effects of power and trust levels on task performance [42,43].

M: The set of |M| = m members, each element shown as $m_i$
T: The set of |T| = t task instances, each element shown as $t_j$
R: The set of |R| = r resources, each element shown as $r_k$
X: A relation between M and R, depicting the level of resource $r_k$ to which member $m_i$ has access
Y: A relation between T and R, depicting the relative amount of resource level expended by each task
Z: A relation between M and T, depicting the time taken by a member to complete a task
S: A relation between M and M, that may be binary or a continuous value between 0 and 1

Figure 2. Summary of notation used in the formulation.

For each workflow $w_l$
    For each task $t_j$ in $w_l$
        Generate a random element c from the set of workflow control flow operators {PREC, AND-S, AND-J, OR-S, OR-J, XOR-S}
        If c = PREC, pick one other task $t_m$ s.t. j <> m
            Create <$t_j$ PREC $t_m$>
            return
        If c <> PREC, pick y other tasks, s.t. y is random and 1 <= y <= M
            Create <$t_j$ c $t_m$>, m = 1..y, m <> j
            return

Figure 3. Algorithm to generate interdependent tasks or workflow instances.

Figure 4. Operationalization of independent variables for different experiments.

Table 1. List of simulation parameters.