Performance Analysis of an Enhanced Load Balancing Scheme for Wireless Sensor Networks

Research interest in sensor networks routing largely considers minimization of energy consumption as a major performance criterion to provide maximum sensors network lifetime. When considering energy conservation , routing protocols should also be designed to achieve fault tolerance in communications. Moreover, due to dynamic topology and random deployment, incorporating reliability into protocols for WSNs is very important. Hence, we propose an improved scalable clustering-based load balancing scheme (SCLB) in this paper. In SCLB scheme, scalability is achieved by dividing the network into overlapping multihop clusters each with its own cluster head node. Simulation results show that the proposed scheme achieves longer network lifetime with desirable reliability at the initial state compare with the existing multihop load balancing approach.


Introduction
Sensor network research evolves as a result of the advances in computing and communication that has taken place in the late 1990s and early 2000s which has resulted in a new generation of sensor network technology.These sensor networks represent a significant improvement over traditional sensors through a number of high-density technologies, including micro electro-mechanical system and nanoscale electro-mechanical systems [1][2][3].
Historical evidence shows that the essential function of a sensor network is to identify the state of an environment using multiple sensors.Other aspects of sensor network design, such as communication, power, and computational resources, are evaluated by their ability to facilitate this function.
Wireless sensor nodes are deeply embedded into the physical surroundings, gather and process information such as temperature, humidity, light characteristics, seismic activities or images and sound samples from the physical worlds.Networked systems of such sensors were expected to be used in a variety of applications including habitat monitoring, precision agriculture, disaster recovery operations, health care and supply chain management [2,4].
Many studies were carried out in WSNs such as energy efficiency, load balancing and reliability.However, these design goals are generally orthogonal to each other.For example, most of the loads balancing schemes are not robust to high link failure rate [5].
On the other hand, reliable routing in wireless sensor networks (WSNs) is a challenging issue firstly, because of the absence of global addressing schemes; secondly, because of the problem of data source from multiple paths to single source; and thirdly, because of data redundancy and of energy and computation constraints of the network.The performance of the existing routing algorithms, such as shortest path routing algorithm, for WSNs varies from application to application because of the diverse demands for different applications for which they were deployed.
In this study, an enhanced load balancing algorithm for reliable routing in WSNs is proposed.This algorithm aims to achieve both load balancing and reliability for large-scale WSN.The use of load balancing scheme can be expected to provide significant lifetime benefits: rather than always using the nodes with the best channel, traffic is redistributed over a larger number of relays.
The paper is organized as follows: Section 1 introduces the readers to WSN background.Section 2 discusses the related works on load balancing algorithm.Research method is discussed in Section 3. The simulation results and discussion are detailed in Section 4. Section 5 draws the conclusion and proposes future work.

Related Works
Clustering-based routing is an energy efficient routing model for achieving load balancing as compared with direct routing and multihop routing protocol in WSNs but there are some issues in clustering-based routing as well.Authors in [6] discussed the problem of load balancing in cluster-based routing and introduced a novel idea of rotation of cluster head role inside the cluster named Low-Energy Adaptive Clustering Hierarchy (LEACH).
The protocol assumes that all cluster heads can directly communicate with the central base station of the network; therefore it is not applicable in large regions.The major drawback is that the resultant set of cluster heads may be unevenly distributed, which causes variable cluster sizes and higher intra-cluster communication cost.Cluster heads far away from the base station have to transmit data over a long distance and suffer a high energy consumption rate.In a large network, such a disparity will cause nodes in the far corners of the sensing area to die quickly.
Authors in [7] proposed Power-Efficient Gathering in Sensor Information Systems (PEGASIS) which is an improvement on the LEACH approach.Instead of forming multiple clusters, PEGASIS forms chains of sensor nodes so each node communicates only with its closest neighbours and takes turns in transmitting data to the base station.The protocol assumes that all nodes are to communicate directly with the base station, so the role of the leader is rotated among all the nodes forming the chain in order to balance the energy consumption.
In PEGASIS, a rotation scheme is used to share the cost of communication with the base station, however uneven energy dissipation still exists due to the difference in cluster head positions.In addition, this approach assumes global data aggregation, that is, sensor data from all nodes can be aggregated into a single packet.This is an assumption that is not always true.When it is not true, the cost of passing each packet along the entire chain will cause a very short network lifetime.
These major drawbacks were addressed in a centralized optimization approach proposed by [8].In this paper, the authors presented an AntChain method where Ant Colony Optimization algorithm (ACO) was applied as a centralized optimization tool to form the near lowest cost chain for a particular area.Unlike the PEGASIS, sensors nodes do not have the prior global knowledge.The ACO approach was designed to deal with the dynamics of the sensor nodes which can ensure the algorithm response to any network changes in a time.
The ACO algorithm allows all the complicate computation, optimization and set-up overhead to be performed by base station which was assumed to have unlimited energy recourse, much stronger communication and computation ability.The drawback of this protocol is that the base station requires information about all the nodes in a network before the selection of cluster heads.In a larger network, this approach would not work well since it uses a centralized approach for the management of the clusters.
Authors in [9] proposed a Hybrid Energy-Efficient Distributed Clustering (HEED) that periodically selects cluster heads according to a hybrid of their residual energy and a secondary parameter, such as node proximity to its neighbours or node degree.In this approach, a probabilistic algorithm was employed to form a dominating set in a fixed number of rounds, with a penalty of slightly large dominating set size.This scheme builds a higher quality clusters than LEACH and PEGASIS that uses random selections and single chain, which results in a longer network lifetime.
In HEED, all cluster heads send aggregated data to the base station via the shortest path.This scheme minimizes the total energy consumption.However, the energy consumption is still unbalanced since neighbours of the base station are responsible to relay all packets to the base station and have higher load.A hot spot is formed in the area surrounding the base station, which is congested with data traffic and consumes energy much faster than other areas of the network.
A multilayer multi hop routing algorithm for inter cluster communication was presented by [10].The algorithm worked on the principle of divide and conquers and performed better in terms of load balancing and energy efficiency than LEACH.The algorithm was aimed at exploiting the redundancy property of the WSNs.It selected a small percent of nodes from the network and marked them as temporary cluster heads and used these nodes to make the inter cluster communication multi hop.The problem with the algorithm was that it was selecting the temporary cluster heads randomly thus compromising occasionally on the area coverage of the network which it is monitoring.
A data aggregation algorithm for WSNs was proposed in [11].The algorithm selected a cluster leader that can perform data aggregation in a partially connected sensor networks.The proposed scheme is only limited to a partially connected network within an intra-cluster network.
Authors in [12] proposed adaptive energy aware intra cluster routing (EAICR) in order to address the problem of load balancing associated with the existing single hop and multihop routing protocol.Some nodes were considered in close region and they performed direct routing and outside the region nodes adopted multihop routing.In this way, the closer nodes are not having extra load on them unlike HEED where the closer nodes to the base station exhaust energy very quickly because they perform the task of sensing their own data and routing the data of the other nodes.Thus closer nodes have extra load with them and there is no concept of load balancing as well.
The major drawback of this algorithm is that the scope is only limited to intra cluster routing algorithm and not adaptive to inter cluster routing algorithm.Also, the load is still unbalanced due to hot spots that exist at the cluster head close to the base station as a result of data relay through multiple hop.

Research Method
This study proposes an improved scalable clusteringbased load balancing routing algorithm (SCLB) that is a hybrid of the existing inter-cluster and intra-cluster routing approach.Scalability is achieved by dividing the network into overlapping multihop clusters each with its own cluster head node.Each cluster head is responsible for building a local relative map corresponding to its cluster using intra-cluster node's range measurements.

Cluster Communication
Three major issues within the intra-clusters region were considered: cluster forming, cluster head re-election and cluster head canceling.In cluster forming, when a node is powered on, it marks itself clusterless and sets up a random waiting timer Tt and starts to monitor the radio channel for signal request.If no request is heard within Tt, the node marks itself as a cluster head.Once a node is notified as cluster head, it sends a beacon immediately in form of a contention packet to the neighbouring nodes within the cluster set up.
Every node has their budget variables for cluster head selection which is usually in terms of residual battery power or elapsed time for acting as cluster head.When a node is set as cluster head, it starts a timer Tm for acting as cluster head.After the Tm has expired, it selects the node that has the highest budget variable as the next cluster head and sends it a packet for notification.
Cluster head canceling occurs when a cluster head hears a probe request from another cluster head and thereby set itself as a slave.Subsequently, the node sends a beacon message to the cluster head for association.
After the cluster head formation, the nodes transmit their packets to the cluster head in their scheduled time slot.The cluster head receives all packets from its cluster members, compress the data into one packet and send the packet to the closest cluster head in another cluster neighborhood in a peer-to-peer (P2P) communication.
Based on the respective outputs of the sensor readings, the inter-cluster communication is formed where each cluster head has to produce an output to be transmitted over a wireless communication link.The packet outputs are finally received by the base station.Thus, the scheme is model as a distributed function of the observations given the state since the sensors have chance to co-operate to some extent.

Simulation Platform
The simulation platform was developed using Java programming language within a 200 m × 200 m region.The network comprises of 250 sensor nodes and these sensor nodes are randomly deployed with different levels of density.The communication range of each sensor is set to 20 meters.Two nodes have a wireless link between them if they are within the communication range of each other.

Model Assumption
The model was assumed to be of homogenous sensor nodes where all the nodes are of equal sizes, have identical hardware capabilities and uniform energy consumption, uses the same IEEE 802.11 radio protocol.None of the sensor nodes is resource-rich than the others and they have equal chances of cluster head selection.Each of the cluster head is dedicated to both intra-cluster and inter-cluster traffic and a way to distribute the radio resources for the two types of traffic was considered.

Problem Formulation
In SCLB, WSN with n nodes and k number were considered which is denoted as n1, n2, n3, n4, •••, nk and a base station b as shown in the Figure 1.A network of sensors is considered to be connected only if there is at least one path between each pair of nodes in the network.Connectivity, therefore, depends primarily on the existence of paths.
A SCLB model is formulated as a deterministic directed graph.The nodes of the graph correspond to the transmitting and receiving units, and the edges of the graph correspond to the connections that link the nodes in the network and describe its topology.
A directed graph is a graph G = G(V, E) that consists of ordered pairs of distinct nodes with two sets V(G) and E(G) where, V = V (G) is the set of p > 1 nodes or vertices of the graph.E = E (G) is a set of q > 0 pairs of nodes or the edges or links of G, and Edge E = (u, v) is said to join nodes u and v in G.
This graph is said to be k-connected if there are at least k disjoint paths between every pair of nodes u, v  V.If we set d (u, v) as the physical distance between nodes u and v, and R c is the radius of communication, then E is defined as follows: The cluster head (CH) nodes n1, n2, n3, •••,nk were equally spaced over a distance d and each of the nearest CHs to the base station was 50 m away from the base station b.Each of the sensed data from sensor nodes i j S within the cluster region of n1, n4 and n7 were sent to their respective CHs.The aggregated data were compressed and the greedy search algorithm was used to select and transmit l1, l2 and l3 bits of data from these nodes to the next CHs n2, n5 and n7 respectively where they were compressed jointly with the data at that node.
Greedy search algorithm selects the closest node from the next tier of CH.When one node n1 is closest to multiple nodes in the current tier, the node that is farthest to b wins and marks itself as selected.The other competing nodes have to select from the remaining unmarked nodes.In this case, we assume idealized entropy coding that achieves the maximum compression possible.
The l2 bits of packet were transmitted from the second CHs to the third CHs and so on till the last node within the cluster.Then the jointly compressed lk bits of data from each cluster were routed to the base station b in a peer-to-peer approach.SCLB incorporates a cluster-based joint routing and compression strategies that span from one extreme where no compression is performed and each node routes its information in a single-hop routing to the CH and to the other extreme where data from every source was compressed sequentially before routing to the base station.

Algorithm Development
SCLB algorithm comprises of three phases, namely: the setup phase, the steady phase and the forwarding phase.In setup phase, the sensor nodes begin with initialization state where the nodes are at sleep.At a particular threshold function say 5 seconds, the sensor nodes wake up and enter listening state where the nodes receive protocol data unit from other neighboring nodes to determine its role in the topology.After listening state, the sensor node enters learning state where the node can transmit and receive but not forwarding packets.This state compares the medium access control (MAC) address of the sensor nodes within the same radio signal to its own MAC address in its routing table to determine whether the MAC address of the incoming signal is higher than its own MAC address or not and subsequently drop its attempt to participate in CH election, otherwise the node drops the MAC address of the incoming signal and elects itself as the CH.
The sensor nodes enter contention state where the node with the highest residual energy among the nodes within a particular cluster becomes the CH.The node broadcasts its status to the other sensor nodes in form of probe request and listens for beacons (probe responses) from the nodes.After the remaining nodes send the probe response, the CH initiates association and allocates a specified time slot for each of the sensor nodes to transmit to the CH.
The sensor nodes start transmitting in their allotted time slot using time division multiple access (TDMA) and this marks the beginning of the steady phase.When all nodes within an intra-cluster region finish the transmission of aggregated data to the CH, the forwarding phase begins.
The forwarding phase marks the beginning of the inter-cluster communication where CH performs computation on the received data.CH also transmits the data using multiple-chain multihop approach to the neighboring CH until the data is received at the base station.Once all the CHs finish the transmission, the control returns to the steady phase again.This is depicted in Figure 2.

Results and Discussion
In order to carry out the analysis of the simulated outputs, some performance parameters are taken into consideration for evaluating the proposed clustering scheme with the existing multihop load balancing approach.

Performance Evaluation
Performance metrics can be described as a measure to analyze or grade an activity or event.They act as yardsticks on which decisions about such activity can be made.For this study, energy consumption, network lifetime and reliability would be the metrics.

Energy Consumption
The study considered the amount of energy the radio dissipates to run the transmitter or receiver circuitry as Eelec = 50 nJ/bit and the amount of energy consumed by the transmit amplifier as Eamp = 100 pJ/bit/m 2 .The corresponding transmission power Etx and receiving power Erx by a l-bit packet over a distance d are: , The total energy E ij (l, d) for transmitting l-bit data from source node i to destination node j within a distance of d is: In HEED, the energy consumed for the shortest path routing from n7 till n1, that is, the energy consumed to deliver a packet to the neighbouring clusterhead is given as: In SCLB, a greedy search algorithm is used to form a straight-chain model.The study adopted the algorithm of the research conducted by [13] where the total energy consumed per round at node k is calculated thus:

Network Lifetime
Network lifetime is crucial to a large-scale sensor network since it is undesirable or infeasible to replace or recharge sensors once the network is deployed.In modeling the sensor network lifetime, we obtained formula applies to any definition of the network lifetime [14] that is derived based on Strong Law of Large Numbers (SLLN).This is given by: Applying equation (7) to the current research study with an assumption that  = 1, network lifetime becomes: where O = initial energy of a non-rechargeable battery, and = size of homogenous sensor network.

Reliability
Reliability implies the extent to which a measurement instrument is said to yield a consistent result.In other words, reliability points to the degree in which a system measures the same way each time it is used under the same condition with the subjects.The reliability of a system is not measured rather it is estimated.In wireless sensor networks, reliability is used as a measure to show how reliable the sensed event can be reported to the base station and the study uses the metrics RSCLB to measure the reliability of the system as,

Simulation Results and Discussion
The results of the simulation based on the equations discussed earlier in the study are presented for the existing model (HEED) and the proposed model (SCLB).

Reliability
The result from Figure 5 shows that RSCLB drops gradually from 100% to 2% as the operation time of the system increases.This is due to the fact that at first transmission of the aggregated data from clusterhead n7 which is the farthest single chain clusterhead load balancing assumed for the research, almost all the data were transmitted to the next clusterhead.The next clusterhead n6 accepts the data and aggregates it with those sent to it from the nodes within its cluster region.The total aggregated data were computed and sent to the next clusterhead.Clusterhead n5 also collects the data with the sensing data from the clustered nodes within its intra-communication domain.These processes continues until the last clusterhead node close to the base station.Optimal reliability was attained at 75.6% when the clusterhead node is at n5.The decrease in reliability was accounted for by the increase in failure rate of the aggregated data which resulted in packet loss.

Conclusions and Future Work
In this study, we presented a SCLB that is an extension of the existing studies on WSNs.The proposed scheme considered a balanced inter-cluster sensor network routing where all nodes are location aware and have the same initial energy capabilities.Simulation results showed that the network lifetime of the proposed scheme improves with a balanced energy consumption.
Furthermore, we also incorporated reliability into the protocol for WSNs due to dynamic topology and random deployment.Obviously, the reliability deteriorates as the operation hour approaches the clusterhead node close to base station due to increased number of failure rate in a single round of transmission of data.In order to improve the reliability of the clusterhead nodes, more than one retransmission will be required to ensure that the data sent from one clusterhead will reach the next clusterhead.However, this would pose a challenge of delay in delivery data to the base station which can be considered for further study.
energy (that is, total unused energy in the network when it dies); P c = Constant continuous power consumption over the whole network. = Average sensor reporting rate (defined as the number of data collections per unit time); energy consumed by all sensors in a randomly chosen data collection.
node to the base station is 0.475, 0.650, 0.825, 1.000, 1.175, 1.350 and 1.525 milliJoule while HEED are 0.650, 1.000, 1.350, 1.700, 2.050, 2.400 and 2.750 milliJoule respectively as shown in Figure3.This implies that SCLB achieves an average lower energy consumption of 39% as compared with HEED.Starting from 2 single node clusterhead to 8 single node clusterhead, SCLB decreases with values: 4.211, 3.077, 2.424, 2.000, 1.702, 1.481 and 1.311.On the other hand, HEED decreases with values: 3.076, 2.000, 1.481, 1.179, 0.976, 0.833 and 0.727.This implies that SCLB has an improved network lifetime with an average value of 65.29% as compared with HEED.The appreciable increases in network lifetime of SCLB is accounted for by the paradigm shift from the multihop load balancing scheme to cluster-based load balancing approach.
The energy consumed by SCLB at maximal single node starting from 2 to 8 clusterhead nodes, that is, each of the closest clusterhead4.2.2.Network LifetimeThe network lifetime is an important metric which the study consider for SCLB and HEED as shown in Figure4.