Survey of Clustering Schemes in Mobile Ad Hoc Networks

Mobile ad hoc networks (MANETs) are wireless networks that can be deployed quickly without pre-existing infrastructure. They are used in contexts such as collaborative, medical, military and embedded applications. However, MANETs raise new challenges in large-scale networks that contain many nodes, and many clustering algorithms have emerged in response. These algorithms structure the network into groups of entities called clusters, creating a hierarchical structure. Each cluster contains a particular node, the cluster head, elected according to a specific metric or a combination of metrics such as identity, degree, mobility, weight or density. MANETs have drawbacks due to both the characteristics of the transmission medium (medium sharing, low bandwidth, etc.) and the routing protocols (information diffusion, path finding, etc.). Clustering plays a vital role in improving resource management and network performance (routing delay, bandwidth consumption and throughput). In this paper, we study and analyze some existing clustering approaches for MANETs that recently appeared in the literature, which we classify as: Identifier Neighbor based clustering, Topology based clustering, Mobility based clustering, Energy based clustering, and Weight based clustering. We also define clustering, review existing clustering approaches, evaluate their performance and cost, discuss their advantages, disadvantages and features, and suggest a best clustering approach.


Introduction
A Mobile Ad hoc NETwork (MANET) consists of a group of mobile nodes that self-configure to form a temporary network without the aid of a preset infrastructure or centralized management. Such networks are characterized by dynamic topologies, bandwidth-constrained and variable-capacity links, and energy-constrained operation, and they are highly prone to security threats. Because of these features, routing is a major issue in mobile ad hoc networks [1,2].
Routing in a network is the process of selecting paths along which to send network traffic. Routing can take place either in a flat structure or in a hierarchical structure [3]. In a flat structure [4,5], all nodes in the network are at the same hierarchy level and thus have the same role. Although this approach is efficient for small networks, it does not scale as the number of nodes in the network increases. In large networks, the flat routing structure produces excessive information flow which can saturate the network [6,7]. Hierarchical routing protocols [8] have been proposed to solve this problem, among others. This approach consists of dividing the network into groups called clusters, which results in a network with a hierarchical structure. Different routing schemes are used between clusters (inter-cluster) and within clusters (intra-cluster). Each node maintains complete knowledge of local information (within its cluster) but only partial knowledge about the other clusters. Hierarchical routing is a solution for handling scalability in a network where only selected nodes take responsibility for data routing [9,10]. However, hierarchical approaches undergo continual topology changes. Thus, topology management plays a vital role prior to the actual routing in a MANET. A cluster-based (hierarchical) structure in the network topology has been used to improve routing efficiency in a dynamic network [11,12].
Structuring a network is an important step to simplify the routing operation in MANETs. Several algorithms based on clustering techniques have been proposed in the literature [4,5,8,13]. Clustering consists of dividing the network into sets of nodes that are geographically close. It is an efficient way to simplify and optimize network functions. In particular, it allows the routing protocol to operate more efficiently by reducing the control traffic in the network and simplifying data routing. The proposed clustering schemes have different characteristics and are designed to meet certain goals depending on the context in which the clustering is used (routing, security, energy conservation, etc.) [11,14,16].
The rest of the paper is organized as follows: we start by introducing different clustering approaches. Then, we present their advantages and disadvantages. In section 3 we present some existing surveys of clustering in MANETs. In section 4, we review some clustering schemes for MANETs. Then we compare the clustering schemes presented. Finally, in section 5, we conclude the paper.

Definition
Clustering is the process that divides the network into interconnected substructures, called clusters. Each cluster has a particular node elected as cluster head (CH) based on a specific metric or a combination of metrics such as identity, degree, mobility, weight or density. The cluster head plays the role of coordinator within its substructure. Each CH acts as a temporary base station within its cluster and communicates with other CHs [17,18]. A cluster is therefore composed of a cluster head, gateways and member nodes.
Cluster Head (CH): the coordinator of the cluster.
Gateway: a node common to two or more clusters.
Member Node (ordinary node): a node that is neither a CH nor a gateway.
Each node belongs exclusively to a cluster independently of its neighbors that might reside in a different cluster.

Related Work
Jane Y. Yu and Peter H. J. Chong [11] A. Abbasi and M. F. Younis [12] grouped a taxonomy of relevant attributes into three types: cluster properties, cluster head capabilities and clustering process. They categorized the different schemes based on the objectives, the desired cluster properties and the clustering process. They highlighted the objectives, features and complexity of the presented schemes and the effect of the network model on them, and summarized a number of schemes, stating their strengths and limitations. Finally, they compared these clustering algorithms based on metrics such as convergence rate, cluster stability, cluster overlapping, location awareness and support for node mobility. B. A. Correa et al. [3] discussed the concepts related to network topology, routing schemes, graph partitioning and mobility algorithms. The authors described the lowest-ID heuristic, the highest-degree heuristic, DMAC (distributed mobility-adaptive clustering) and WCA (weighted clustering algorithm). R. Agarwal and M. Motwani [10] examined the important issues related to cluster-based MANETs, such as the cluster structure stability, the control overhead of cluster construction and maintenance, the energy consumption of mobile nodes with different cluster-related status, the traffic load distribution in clusters, and the fairness of serving as cluster head for a mobile node.
M. Anupama and B. Sathyanarayana [28] analyzed, compared and classified some clustering algorithms as location based, neighbor based, power based, artificial intelligence based, mobility based and weight based. They also presented the advantages and disadvantages of these techniques and suggested a best clustering approach based on their observations and comparison.

Clustering Schemes in Mobile Ad hoc Network
We classify the clustering algorithms, based on their objectives, their cluster head election criteria and a review of the literature [10,12,28,29], into the following categories:

Identifier Neighbor Based Clustering
In identifier neighbor based clustering, a unique ID is assigned to each node. Each node in the network knows the IDs of its neighbors. The cluster head is selected based on criteria involving these IDs, such as the lowest ID or the highest ID. Ephremides et al. [20] proposed a clustering algorithm called Linked Cluster Algorithm (LCA) where each node is either a cluster head, an ordinary node or a gateway node. Initially, all nodes have the status of ordinary node; periodically, each node in the network broadcasts its ID and its neighbors' IDs. Subsequently, the node with the smallest ID is selected as cluster head. A node which can hear two or more cluster heads is a gateway. The process repeats until every node belongs to at least one cluster. Nodes with a small ID are more likely to be selected as cluster heads, so they consume their energy quickly.
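The lowest-ID election described above can be illustrated with a short Python sketch. It is a simplified, centralized approximation rather than the distributed LCA protocol itself: the function name `lowest_id_clustering`, the adjacency-dictionary input and the role labels are assumptions made for illustration.

```python
from typing import Dict, Set


def lowest_id_clustering(adjacency: Dict[int, Set[int]]) -> Dict[int, str]:
    """Assign a role to every node using a lowest-ID heuristic.

    adjacency maps a node ID to the set of its one-hop neighbor IDs
    (hypothetical input format; links are assumed symmetric).
    """
    heads: Set[int] = set()
    covered: Set[int] = set()
    # Visit nodes in increasing ID order: an uncovered node has the
    # smallest ID among the nodes that still need a cluster head.
    for node in sorted(adjacency):
        if node not in covered:
            heads.add(node)
            covered.add(node)
            covered |= adjacency[node]
    roles: Dict[int, str] = {}
    for node, neighbors in adjacency.items():
        if node in heads:
            roles[node] = "CH"
        elif len(neighbors & heads) >= 2:
            # A node that hears two or more cluster heads is a gateway.
            roles[node] = "gateway"
        else:
            roles[node] = "ordinary"
    return roles


# Chain topology 2 - 1 - 3 - 4 - 5
topology = {1: {2, 3}, 2: {1}, 3: {1, 4}, 4: {3, 5}, 5: {4}}
roles = lowest_id_clustering(topology)
```

In this example nodes 1 and 4 become cluster heads and node 3, which hears both, becomes a gateway.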
Chiang et al. [30] proposed Least Cluster Change (LCC), an improved version of the LCA algorithm which adds a maintenance step to minimize the cost of re-clustering. The reconstruction of clusters is invoked only in the following two cases:
• If two cluster heads are neighbors, then the one with the highest ID gives up the role of cluster head.
• If a non-CH node moves outside its cluster and does not join an existing cluster, then it becomes a cluster head, forming a new cluster.
LCC improves the stability of clusters, but it has some disadvantages; for example, re-clustering remains somewhat expensive.
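As a rough illustration of the two LCC maintenance cases, the following Python sketch applies them to a snapshot of the topology taken after nodes have moved. The function name `lcc_maintain` and the data layout are hypothetical, and the sketch ignores message exchange and timing.

```python
from typing import Dict, Iterable, Set


def lcc_maintain(adjacency: Dict[int, Set[int]], heads: Iterable[int]) -> Set[int]:
    """Return the cluster head set after applying the two LCC cases.

    adjacency is the current one-hop neighbor map (assumed symmetric);
    heads is the cluster head set before maintenance.
    """
    current = set(heads)
    # Case 1: when two cluster heads are neighbors, the one with the
    # highest ID gives up its role. Highest IDs are examined first.
    for node in sorted(current, reverse=True):
        if any(n in current and n < node for n in adjacency[node]):
            current.discard(node)
    # Case 2: a non-CH node that cannot reach any cluster head
    # becomes a cluster head and forms a new cluster.
    for node in sorted(adjacency):
        if node not in current and not (adjacency[node] & current):
            current.add(node)
    return current


# Heads 1 and 3 have moved into range of each other; node 4 is isolated.
after_move = {1: {2, 3}, 2: {1}, 3: {1}, 4: set()}
new_heads = lcc_maintain(after_move, heads={1, 3})
```

Here node 3 resigns in favor of node 1, and node 4 declares itself a cluster head; all other nodes keep their status, which is the point of limiting re-clustering to these two cases.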
Lin and Gerla [31] proposed another protocol called Adaptive Clustering Algorithm (ACA). In this algorithm, once the clusters are formed, the concept of cluster head disappears and all nodes play the same role in the network. The authors' motivation is that cluster heads can become bottlenecks and consume their resources faster than other nodes. The same metric as in LCA (the lowest ID) is used for CH selection. For cluster maintenance, each node must know its two-hop neighbors. If the distance between two nodes in the same cluster becomes three hops, then cluster maintenance is invoked.
A heuristic-based algorithm [13] called Max-Min D-cluster builds non-overlapping D-hop clusters. The node ID is used for CH election. The algorithm is divided into three phases. In the first phase, each node broadcasts its ID to its neighbors within D hops, collects their IDs and finds the highest ID, which it will broadcast in the second phase. In the second phase, on receiving the highest IDs, each node keeps the lowest ID among them. During the third phase, the cluster head is chosen based on the IDs saved in the two previous phases. This algorithm produces a robust cluster structure. However, cluster formation takes a significant amount of time, and more information is exchanged before a CH is elected.
Chen et al. proposed an algorithm [32] that constructs k-hop clusters by generalizing the scheme of [31]. Nodes initiate the clustering process by flooding clustering requests to all the other nodes. Each node has to know its k-hop neighbors. All nodes whose ID is the lowest among all their k-hop neighbors broadcast their decision to create clusters to all their k-hop neighbors and become CHs. The maintenance phase is similar to the one used in [31], but it takes the cluster radius into account. However, the disadvantages of [31] are still present.
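The first two phases of Max-Min D-cluster can be sketched as D rounds of neighbor-to-neighbor flooding that keep the maximum ID, followed by D rounds that keep the minimum of the values obtained. The sketch below assumes a synchronous, loss-free exchange and uses hypothetical names (`flood`, `max_min_phases`); the third-phase election rules are not detailed in this survey and are therefore left out.

```python
from typing import Callable, Dict, Iterable, Set, Tuple


def flood(adjacency: Dict[int, Set[int]], values: Dict[int, int],
          rounds: int, pick: Callable[[Iterable[int]], int]) -> Dict[int, int]:
    """Run synchronous rounds in which every node keeps pick() of its
    own value and the values currently held by its one-hop neighbors."""
    for _ in range(rounds):
        values = {
            node: pick([values[node]] + [values[n] for n in adjacency[node]])
            for node in adjacency
        }
    return values


def max_min_phases(adjacency: Dict[int, Set[int]],
                   d: int) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Return (highest ID seen within d hops, lowest of those highest IDs)."""
    own_ids = {node: node for node in adjacency}
    highest = flood(adjacency, own_ids, d, max)            # phase 1
    lowest_of_highest = flood(adjacency, highest, d, min)  # phase 2
    return highest, lowest_of_highest


# Chain topology 1 - 5 - 2 - 3 with D = 1
chain = {1: {5}, 5: {1, 2}, 2: {5, 3}, 3: {2}}
highest, lowest_of_highest = max_min_phases(chain, d=1)
```

The two dictionaries are the per-node values that the third phase would consult when choosing a cluster head.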

Topology Based Clustering
In topology based clustering, the cluster head is chosen based on a metric computed from the network topology, such as node connectivity. We present below some of the existing topology based clustering algorithms.
Gerla and Tsai proposed a protocol called High-Connectivity Clustering (HCC) [21], which uses the degree of connectivity to construct clusters. In this protocol the node with the highest number of neighbors is selected as the cluster head. If two or more nodes have the same degree of connectivity, the node with the lowest ID is elected as cluster head. HCC generates a limited number of clusters. In a mobile environment, this algorithm increases the number of re-affiliations of CHs because their degree changes very frequently.
In [34], Yu and Chong proposed 3-hop Between Adjacent Cluster-heads (3hBAC), which creates a structure of 1-hop non-overlapping clusters with three hops between neighboring cluster heads by introducing a new node status, named cluster guest. A cluster guest is a mobile node that cannot directly connect to any cluster head, but can access some clusters with the help of a cluster member. During cluster formation, the nodes having the highest degree are declared CHs. All one-hop neighbors join as member nodes. The neighbors of these member nodes that cannot directly join any cluster are declared cluster guests. Cluster maintenance is performed the same way as in the LCC algorithm. This algorithm reduces the number of CHs in the network. CHs and member nodes keep their status for a long period. However, this algorithm requires that each node maintains two tables: a neighbor table and a member table that contains all member nodes of the network.
Guizani et al. [35] proposed a new clustering algorithm named α-Stability Structure Clustering (α-SSCA) that has three phases. The first phase consists of collecting the neighbor information necessary for CH election by exchanging HELLO messages. During the second phase, a score function is used as the metric for CH election. The score function is based on the number of neighbors whose status has not been decided yet. The node with the highest score is elected as cluster head. This technique has the advantage of keeping neighboring CHs far away from each other, which leads to minimal invocation of the maintenance procedure. This algorithm moderately increases the number of clusters with the aim of improving cluster stability and reducing overhead.
Associativity-based Cluster Formation and Cluster Management [36] uses a new metric called associativity, which represents the relative stability of nodes in their neighborhood. Every time a node u checks its current neighbors, it increments by one the associativity value of the nodes that were already present in the previous period. When a neighbor moves away, its associativity value is reset to zero. The associativity value is set to one when a neighbor is detected for the first time or redetected. At each instant of time, the associativity of u is the sum of the values associated with its neighbors. During the cluster formation phase, each node considers the nodes in its k-neighborhood, and the node with the highest associativity is chosen as cluster head. When more than one node has the highest associativity value, the node with the highest degree is chosen. This algorithm produces overlapping k-clusters that remain stable over a long period of time.
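A minimal Python sketch of the highest-connectivity rule with the lowest-ID tie-break follows. It is a centralized simplification with a hypothetical function name (`highest_degree_heads`) and input format, not the distributed HCC protocol.

```python
from typing import Dict, List, Set


def highest_degree_heads(adjacency: Dict[int, Set[int]]) -> List[int]:
    """Pick cluster heads by decreasing degree, breaking ties by lowest ID."""
    heads: List[int] = []
    covered: Set[int] = set()
    # Sort key: larger degree first, then smaller ID first.
    ranking = sorted(adjacency, key=lambda n: (-len(adjacency[n]), n))
    for node in ranking:
        if node not in covered:
            heads.append(node)
            covered.add(node)
            covered |= adjacency[node]
    return heads


# Nodes 2 and 4 both have degree 3; node 2 wins the tie on ID.
mesh = {1: {2}, 2: {1, 3, 4}, 3: {2, 4}, 4: {2, 3, 5}, 5: {4}}
heads = highest_degree_heads(mesh)
```

Because the ranking depends on node degree, any link change can reorder it, which is one way to see why this rule is sensitive to mobility.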
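The associativity bookkeeping described above can be sketched as a per-period update of one counter per neighbor. The names `update_ticks` and `associativity` and the dictionary representation are assumptions for illustration only.

```python
from typing import Dict, Set


def update_ticks(ticks: Dict[int, int], current_neighbors: Set[int]) -> Dict[int, int]:
    """Update the per-neighbor associativity values for one period."""
    updated: Dict[int, int] = {}
    for neighbor in current_neighbors:
        # A new or redetected neighbor starts at 1 (its stored value is
        # missing or 0); a neighbor that stayed is incremented by one.
        updated[neighbor] = ticks.get(neighbor, 0) + 1
    for neighbor in ticks:
        if neighbor not in current_neighbors:
            # The neighbor moved away: reset its value to zero.
            updated[neighbor] = 0
    return updated


def associativity(ticks: Dict[int, int]) -> int:
    """Associativity of a node: the sum of the values of its neighbors."""
    return sum(ticks.values())


ticks: Dict[int, int] = {}
for neighbors in ({10, 11}, {10, 11, 12}, {10, 12}):
    ticks = update_ticks(ticks, neighbors)
total = associativity(ticks)
```

After the three periods, neighbor 10 has value 3, neighbor 12 has value 2, and neighbor 11 has been reset to 0, so the node's associativity is 5.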

Mobility Based Clustering
Lowest Relative Mobility Clustering Algorithm (MOBIC) [38] is based on the LCA algorithm but uses the relative mobility of nodes as a criterion in cluster head selection. The idea is to choose nodes with low mobility as cluster heads because they provide more stability. MOBIC uses a cluster maintenance procedure similar to that of LCC [30], with an additional rule to minimize the cost of cluster maintenance. MOBIC uses a Cluster Contention Interval (CCI) to avoid unnecessary cluster head relinquishing. If two CHs are still neighbors after the CCI time period has expired, then the one with the highest ID gives up the role of CH. This mechanism reduces CH maintenance. However, the limitations of the LCC algorithm are not completely eliminated.
A novel clustering scheme [39] guarantees a longer lifetime of the clustering structure. The main idea is to estimate the future mobility of mobile nodes so that the ones that will exhibit the lowest estimated mobility will be chosen as CHs. Combining the mobility prediction scheme with the highest degree clustering technique, the authors proposed a distributed algorithm that builds a small and stable virtual backbone over the whole network. This algorithm creates clusters that are highly resistant to node mobility. The node with the highest weight among its neighbors is declared the CH. This algorithm eliminates the problem of frequent CH changes due to node mobility by allowing a node to become a CH or to join a new cluster without starting a re-clustering phase.
Ni et al. proposed a mobility prediction-based clustering (MPBC) scheme [40] for MANETs with high mobility nodes. The basic information in MPBC is the relative speed estimate for each node in the whole network. During the clustering stage, all nodes broadcast Hello packets periodically to build their neighbor lists. Each node estimates its average relative speed with respect to its neighbors based on the Hello packet exchanges. Nodes with the lowest relative mobility are selected as CHs. During the cluster maintenance stage, a prediction-based method is used to solve the problems caused by relative node movements, including the case when a node moves out of the coverage area of its current CH and the case when two CHs move within reach of each other and one is required to give up its CH role. This approach extends the connection lifetime, which results in stable clusters.
Mobility-based d-hop clustering algorithm (MobDHop) [41] divides the network into d-hop clusters based on a relative mobility metric. The objective of creating d-hop clusters is to support clusters with a radius larger than one hop, which reduces the number of cluster heads. The relative mobility is estimated from the signal strengths of received packets, which are also used to estimate the distance between two nodes. The cluster formation process is divided into two stages: the Discovery Stage and the Merging Stage. During the discovery stage, mobile nodes with similar speed and direction are grouped into the same cluster. The merging stage is invoked in order to either merge clusters together or join individual nodes to a cluster. The cluster maintenance process is invoked when a node switches on and joins the network or a node switches off and leaves the network.
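This survey does not state how relative mobility is computed. One common formulation, assumed here rather than taken from the text above, compares the received power of two successive packets from the same neighbor on a logarithmic scale and aggregates the per-neighbor values as a mean of squares (a variance about zero). The function names and the sample values below are hypothetical.

```python
import math
from typing import Iterable, Tuple


def relative_mobility(power_old: float, power_new: float) -> float:
    """Relative mobility with respect to one neighbor, in dB.

    Assumed formulation: 10 * log10(new received power / old received
    power). Negative values suggest the neighbor is moving away,
    positive values that it is approaching.
    """
    return 10.0 * math.log10(power_new / power_old)


def aggregate_mobility(samples: Iterable[Tuple[float, float]]) -> float:
    """Aggregate metric of a node: mean of the squared per-neighbor
    relative mobility values. Lower values indicate a node that is
    more stable relative to its neighborhood."""
    values = [relative_mobility(old, new) for old, new in samples]
    return sum(v * v for v in values) / len(values)


# (old power, new power) pairs for two neighbors, in arbitrary linear units
steady_node = aggregate_mobility([(1.0, 1.0), (2.0, 2.0)])
moving_node = aggregate_mobility([(1.0, 10.0), (10.0, 1.0)])
```

Under this assumption the node with the lowest aggregate value among its neighbors would be the preferred cluster head candidate.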

Energy Based Clustering
The battery power of a node is a constraint that directly affects the lifetime of the network; hence the energy limitation poses a severe challenge for network performance. A CH performs special tasks such as routing, which cause excessive energy consumption. Next, we discuss some existing energy based clustering algorithms.
Multicast power greedy clustering (MPGC) [15] is based on a heuristic to reduce energy consumption. This algorithm runs in three consecutive phases: beacon phase, greedy phase and recruiting phase. During the beacon phase, each node sends a beacon signal at the highest power in order to inform its neighbors of its presence, and collects information about its neighbors from the beacons received. During the greedy phase, each node sends a cluster head declaration at the power level required to reach its nearest neighbor, and then increases its power level step by step until it reaches all its neighbors. During the last phase, each node has the value of the residual power of its neighbors. If a node u has the highest residual power among all its neighbors, then u is elected as cluster head. MPGC prolongs network lifetime, but it requires several steps to construct the cluster structure, which increases network traffic and bandwidth consumption.
A Flexible Weighted Clustering Algorithm based on Battery Power (FWCABP) for MANETs [42] is proposed to maintain stable clusters by preventing nodes with low battery power from being elected as cluster head, minimizing the number of clusters, and minimizing the clustering overhead. During the cluster formation phase, each node broadcasts a beacon message to inform its neighbors of its status and builds its neighbor list. The CH election is based on the weighted values of the node degree, the sum of distances to neighboring nodes, node mobility and remaining battery power. The node with the smallest value is selected as CH. FWCABP invokes the maintenance procedure when a node moves outside its cluster boundary and/or the CH battery power decreases to a predefined threshold value. FWCABP increases network traffic during the cluster head election process, which degrades network performance.
Enhance Cluster based Energy Conservation (ECEC) algorithm [43] is an enhancement of the Cluster based Energy Conservation algorithm (CEC) [44]. The authors presented a new topology control protocol that extends the lifetime of large ad hoc networks while ensuring minimum connectivity of nodes in the network (the ability for nodes to reach each other) and conserves energy by identifying redundant nodes and turning their radios off. During the cluster formation phase, nodes with the highest estimated energy values in their own neighborhoods are elected as CHs. After the CH election process, ECEC elects gateways to connect clusters. It is shown in [43] that ECEC reduces power consumption, which leads to a longer network lifetime. However, this scheme incurs more overhead to elect the CHs and gateways.

Weight Based Clustering
Weight based clustering techniques use a combination of weighted metrics such as transmission power, node degree, distance difference, mobility and battery power of mobile nodes. The weighting factors for each metric may be adjusted for different scenarios. Some of these algorithms are presented next.
A Flexible Weight Based Clustering Algorithm (FWCA) uses a combination of metrics (with different weights) to build clusters. Node degree, remaining battery power, transmission power and node mobility are used in the CH election process. The cluster size does not exceed a predefined threshold value. During the cluster maintenance phase, FWCA uses the cluster capacity and the link lifetime instead of node mobility, because the link stability metric affects the election of a CH with the same weight as the node mobility metric.
Adabi et al. proposed a Score based clustering algorithm (Sbca) [46] for MANETs which aims to minimize the number of clusters and maximize the lifespan of mobile nodes. It uses a combination of the following four metrics to calculate the node score: remaining battery, node degree, number of members and node stability. During cluster formation, each node calculates its score and broadcasts it to its neighbors. The node with the highest score is elected as cluster head. Sbca generates fewer clusters than WCA but has the same limitations.
An efficient weight-based clustering algorithm (EWBCA) for MANETs, proposed in [47], aims to improve the usage of scarce resources such as bandwidth and energy by producing stable clusters, minimizing routing overhead, and increasing end-to-end throughput. Each node has a combined weight (number of neighbors, residual battery power, stability and variance of distance to all neighbors) that indicates its suitability. Each node is in one of four states: NUL, CH, member node or gateway node. Initially all nodes are in the NUL state. Each node calculates its combined weight and broadcasts it to its neighbors. The node with the highest combined weight is elected as CH. Cluster maintenance is invoked when a node moves outside the boundaries of its cluster and/or when the cluster head has consumed most of its battery energy.
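The election step shared by the weight based schemes above can be sketched as a weighted sum of per-node metrics followed by a comparison among neighbors. The metric names, the weighting factors, the normalization to [0, 1] and the function name `combined_weight` are illustrative assumptions and do not reproduce the exact formula of any cited algorithm.

```python
from typing import Dict


def combined_weight(metrics: Dict[str, float], factors: Dict[str, float]) -> float:
    """Weighted sum of a node's metrics.

    metrics holds the node's values, assumed normalized to [0, 1];
    factors holds one adjustable weighting factor per metric.
    """
    return sum(factor * metrics[name] for name, factor in factors.items())


# Hypothetical weighting factors, tunable per deployment scenario.
factors = {"degree": 0.3, "battery": 0.4, "stability": 0.3}

# Metrics advertised by two neighboring nodes (hypothetical values).
advertised = {
    1: {"degree": 0.5, "battery": 0.9, "stability": 0.6},
    2: {"degree": 0.8, "battery": 0.4, "stability": 0.5},
}

weights = {node: combined_weight(m, factors) for node, m in advertised.items()}
# Here the highest combined weight wins the election.
cluster_head = max(weights, key=weights.get)
```

Node 1 wins here because the battery factor dominates. Whether the largest or the smallest value is preferred depends on the scheme: Sbca and EWBCA elect the node with the highest value, whereas FWCABP selects the node with the smallest.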

Comparison of Clustering Schemes
There are many clustering schemes for MANETs available in the literature. To evaluate these schemes, we first have to decide which metrics to use for the evaluation. Based on our review and the work presented in [11,29], we summarize the comparison in Table 1. We can observe in Table 1 that the total overhead increases when the number of clusters is high and CHs change frequently. The weight based clustering scheme performs better than ID-Neighbor based, topology based, mobility based and energy based clustering. It is the most widely used technique for CH election and relies on combined weight metrics such as node degree, remaining battery power, transmission power and node mobility. It achieves several goals of clustering: minimizing the number of clusters, maximizing the lifespan of mobile nodes in the network, decreasing the total overhead, minimizing CH changes, decreasing the number of re-affiliations, improving the stability of the cluster structure and ensuring good resource management (minimizing bandwidth consumption).

Conclusions
In this survey, we first presented fundamental concepts about clustering, including the definition of clustering, design goals and objectives of clustering schemes, advantages and disadvantages of clustering, and the cost of network clustering. Then we classified clustering schemes into five categories based on their distinguishing features and their objectives: Identifier Neighbor based clustering, Topology based clustering, Mobility based clustering, Energy based clustering, and Weight based clustering. We reviewed several clustering schemes which help organize MANETs in a hierarchical manner and presented some of their main characteristics, objectives, mechanisms and performance. We also identified the most relevant metrics for evaluating the performance of existing clustering schemes. Most of the presented clustering schemes focus on important issues such as cluster structure stability, the