Range-Based Localization in Wireless Networks Using Density-Based Outlier Detection



Introduction
The process of finding the spatial location of nodes in a wireless network has been called localization, positioning, geolocation, and self-organizing in the literature. The term localization is the most popular and so is used here. In a wireless network, we can classify the nodes into three categories: anchor, unlocalized, and localized. The first group of nodes know their position or coordinates and are called anchors. Nodes in the second group do not know their position and are called unlocalized. The third group contains those nodes which were in the second group but subsequently had their positions estimated, and thus are called localized.
The location information of nodes in a wireless network can be used for many useful purposes such as tracking mobile nodes, determining the coverage area, load and traffic management, node lifetime control, cluster formation, and routing enhancement. There are many different aspects to the localization problem, such as when localization should be performed and how frequently. Upon network start up, all nodes should be initially localized. However, this may have to be repeated periodically, for example if there are mobile nodes in the network.
The quality or resolution of the localization is an important consideration. Sometimes, node locations are required within meters of their actual positions, and other times within a few centimetres. Some applications only require relative localization, such as node A is in region 1 and node B is in region 2, or node A is close to node B. An example is monitoring people in a building, where we only need to know how many have entered a given room during the day.
The computations associated with localization can be distributed (done at each node), centralized (done at a central unit), or both (done at cluster heads in the network).
When the number of anchor nodes is low, they typically cannot cover the entire wireless network. This means that some unlocalized nodes may not be within range of a sufficient number of anchor nodes. In this case, localized nodes can participate in the localization process by acting as anchors. This is called cooperative localization.
Localization algorithms can be divided into two categories: range-based and range-free. Range-free algorithms depend on proximity sensing or connectivity information to estimate the node locations. These include CPE [1], centroid [2], APIT [3], and the distributed algorithm in [4]. Range-based algorithms estimate the distance between nodes using measurements such as time of arrival (ToA) [5], time difference of arrival (TDoA) [6], received signal strength (RSS) [7], or angle of arrival (AoA) [8].
Position accuracy is not constant across the area of coverage, and poor geometry of the unlocalized nodes relative to the anchor nodes can lead to high geometric dilution of precision (GDOP). GDOP is commonly used to describe localization accuracy. Generalized GDOP (GGDOP) is a similar measure used to compare the performance of localization algorithms.
The proposed approach differs from conventional solutions to the localization problem in wireless networks. Typically, the locations of the anchors within range and the estimated distances between the unlocalized node and these anchors are used to directly estimate the location of the unlocalized node. Instead, we use a multi-step process. An approach from data mining called density-based outlier detection (DBOD) [9] is employed, which uses the distance to the K-nearest neighbours (KNN) to select the best (candidate) points, and these are averaged to get the estimated location of the unlocalized node.
The remainder of the paper is organized as follows. Dilution of precision is explained in Section 2, and the density-based outlier detection technique is explained in Section 3. The proposed algorithm is presented in Section 4. Some performance results are given in Section 5, and finally some conclusions are given in Section 6.

Dilution of Precision
Dilution of precision is a metric which describes how good an anchor node geometry is for localization. The distance measurements used to compute the node coordinates always contain some error. These measurement errors result in errors in the computed node coordinates. The magnitude of the final error depends on both the measurement errors and the geometry of the structure induced by the nodes. The contribution due to geometry is called the geometric dilution of precision (GDOP). GDOP is used extensively in the GPS community as a measure of localization performance [15].
The distribution of the anchors around an unlocalized node can have a good or poor GDOP, as shown in Figure 1. Another version of GDOP is the generalized geometric dilution of precision (GGDOP). GGDOP depends on the geometry of the anchors around an unlocalized node and the accuracy of the range measurements. GGDOP is defined in [16] in terms of the variances of the distance errors and the sines of the angles between pairs of nodes, measured at the node whose location is being estimated, as shown in Figure 2. The distance error for node i has a Gaussian distribution with variance $\sigma_i^2$, and m is the number of anchors and localized nodes involved in the estimation. As the GGDOP increases, the localization error decreases. In [16], it was shown that for all

Density-Based Outlier Detection
The density-based outlier detection algorithm is commonly used in anomaly detection. The outlier score is just the inverse of the density score of a point. The density is the inverse of the mean distance to the K-nearest neighbours of point p [9] and is given by
$$\mathrm{density}(p, K) = \left( \frac{\sum_{y \in N(p, K)} \mathrm{distance}(p, y)}{|N(p, K)|} \right)^{-1} \quad (4)$$
where $N(p, K)$ is the set containing the K-nearest neighbours of point p, $|N(p, K)|$ is the size of this set, and y is a nearest neighbour. The density-based outlier detection algorithm is given in Algorithm 1 [9].
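The density and outlier score described above can be illustrated with a minimal Python sketch. It assumes NumPy and 2-D points; the function names `knn_density` and `outlier_scores` and the small floor on the mean distance (to avoid division by zero for coincident points) are illustrative choices, not part of [9].

```python
import numpy as np


def knn_density(points, k):
    """Density of each point: inverse of the mean distance to its k nearest neighbours."""
    pts = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances between all points.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # A point is not its own neighbour.
    np.fill_diagonal(dist, np.inf)
    # Mean distance to the k nearest neighbours of each point.
    mean_dist = np.sort(dist, axis=1)[:, :k].mean(axis=1)
    # Assumption: floor the mean distance so coincident points do not divide by zero.
    return 1.0 / np.maximum(mean_dist, 1e-12)


def outlier_scores(points, k):
    """Outlier score of each point: inverse of its density."""
    return 1.0 / knn_density(points, k)
```

A point far from all others has a large mean neighbour distance, hence a low density and a high outlier score.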

The Proposed Algorithm
The first step in localization is to obtain distance estimates for the unlocalized nodes from the anchor and localized nodes that are within range. These estimates provide the radii for circles around the nodes. The intersection of these circles for an unlocalized node forms a set of points to be used in the remainder of the algorithm. The key is to choose candidate intersection points which are closest to each other. In the ideal case the circles intersect on the unlocalized node. For example, when we have three anchors, three intersection points lie on the node, while the other three do not. However, in practical situations where noise and other sources of error exist, this event is unlikely and the circles intersect as in Figure 1.
In Figure 3, the intersection points of the circles around anchor nodes $p_1 = (x_1, y_1)$ and $p_2 = (x_2, y_2)$, with radii $r_1$ and $r_2$ given by the distance estimates, are denoted as $p_{12}$ and $p_{21}$, and their coordinates are given by [17]
$$x = \frac{x_1 + x_2}{2} + \frac{(x_2 - x_1)(r_1^2 - r_2^2)}{2d^2} \pm \frac{y_2 - y_1}{2d^2} \sqrt{\left((r_1 + r_2)^2 - d^2\right)\left(d^2 - (r_1 - r_2)^2\right)} \quad (5)$$
and
$$y = \frac{y_1 + y_2}{2} + \frac{(y_2 - y_1)(r_1^2 - r_2^2)}{2d^2} \mp \frac{x_2 - x_1}{2d^2} \sqrt{\left((r_1 + r_2)^2 - d^2\right)\left(d^2 - (r_1 - r_2)^2\right)} \quad (6)$$
where the $p_{12}$ x-coordinate corresponds to the plus sign in (5), and the corresponding y-coordinate corresponds to the minus sign in (6). The distance between the anchor nodes is
$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$
Each unlocalized node estimates its distance from each anchor or localized node that it can receive a signal from. This node can estimate its position only if it is in range of three or more of these nodes. The intersections of the circles formed from all estimates of the unlocalized node provide a set of points. If we have m anchor and/or localized nodes, then they form g groups where
$$g = \binom{m}{2} = \frac{m(m - 1)}{2}$$
Figure 3. Intersection of the distance estimates for two anchors.
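The intersection step can be sketched in Python as follows. This is a minimal illustration that assumes the standard two-circle intersection expressions; the function names `circle_intersections` and `all_intersections` are hypothetical. A complex square root is used so that non-intersecting circles still yield points, of which only the real parts are kept, as the text describes for that case.

```python
import cmath
from itertools import combinations


def circle_intersections(p1, r1, p2, r2):
    """Two intersection points of circles (p1, r1) and (p2, r2), real parts only."""
    (x1, y1), (x2, y2) = p1, p2
    d2 = (x2 - x1) ** 2 + (y2 - y1) ** 2  # squared distance between the centres
    if d2 == 0:
        raise ValueError("circle centres coincide")
    a = (r1 ** 2 - r2 ** 2) / (2 * d2)
    # The radicand is negative when the circles do not intersect, so the
    # square root (and hence the coordinates) becomes complex.
    s = cmath.sqrt(((r1 + r2) ** 2 - d2) * (d2 - (r1 - r2) ** 2)) / (2 * d2)
    xm, ym = (x1 + x2) / 2, (y1 + y2) / 2
    p12 = (xm + (x2 - x1) * a + (y2 - y1) * s, ym + (y2 - y1) * a - (x2 - x1) * s)
    p21 = (xm + (x2 - x1) * a - (y2 - y1) * s, ym + (y2 - y1) * a + (x2 - x1) * s)
    # Keep only the real part of each coordinate.
    return tuple((p[0].real, p[1].real) for p in (p12, p21))


def all_intersections(nodes, radii):
    """Intersection points for every pair of nodes: 2g points for g pairs."""
    points = []
    for i, j in combinations(range(len(nodes)), 2):
        points.extend(circle_intersections(nodes[i], radii[i], nodes[j], radii[j]))
    return points
```

For circles that do not intersect, both returned points coincide on the line joining the two centres, because the two solutions differ only in their imaginary parts.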
Each group consists of two points as a result of the intersection between two anchor and/or localized node estimates, as shown in Figure 3. The total number of points if all estimates intersect is 2g. The goal is to average a subset of these points to obtain the location estimate.
The third step is to calculate the density of each intersection point. To do this, the K-nearest neighbours of each intersection point, with K tied to the number of groups g, are used to calculate the density according to (4). The points with a density higher than the mean density are selected as candidates.
In some cases, the estimates are too small, resulting in circles that do not intersect. These intersection points can still be calculated, but they will be complex numbers. If this occurs, we consider the real part as the intersection point in subsequent calculations. Figure 4 illustrates the proposed algorithm for four anchor nodes. After the four distance estimates are determined, the 2g = 12 intersection points are found. Note that because two pairs of circles do not intersect, there are only w = 10 points in Figure 4(a) (since the real parts of the complex numbers are the same).
Then the average density for all intersection points is calculated as
$$\bar{D} = \frac{1}{w} \sum_{i=1}^{w} \mathrm{density}(v_i, K)$$
where $v_i$ is an intersection point and w is the number of intersection points. Finally, the points with density given by (4) greater than the average $\bar{D}$ are selected as candidate points.
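The selection of candidate points and their averaging into a location estimate can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `ldbod_estimate` is hypothetical, K is left as a parameter, the mean neighbour distance is floored to avoid division by zero, and all points are used as candidates if no density exceeds the average.

```python
import numpy as np


def ldbod_estimate(points, k):
    """Average the intersection points whose KNN density exceeds the mean density."""
    pts = np.asarray(points, dtype=float)
    # Pairwise distances, excluding each point from its own neighbour set.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.hypot(diff[..., 0], diff[..., 1])
    np.fill_diagonal(dist, np.inf)
    mean_knn = np.sort(dist, axis=1)[:, :k].mean(axis=1)
    # Assumption: floor the mean distance so coincident points stay finite.
    density = 1.0 / np.maximum(mean_knn, 1e-12)
    # Candidates are the points denser than the average density.
    candidates = pts[density > density.mean()]
    if len(candidates) == 0:
        # Assumption: if all densities are equal, fall back to all points.
        candidates = pts
    return candidates.mean(axis=0), candidates
```

For example, given three intersection points clustered near the true position and three scattered far away, only the clustered points exceed the mean density, so the estimate is their centroid.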
If the candidate points are $(x_j, y_j)$, $j = 1, \ldots, c$, then the estimated location of the unlocalized node is the average of these points
$$(\hat{x}, \hat{y}) = \frac{1}{c} \sum_{j=1}^{c} (x_j, y_j)$$
We next consider the effect of employing localized nodes to help in the localization of nodes which do not have a sufficient number of anchors around them. If nodes with transmission range r are randomly deployed in an area A, then the probability of an unlocalized node being within transmission range of a given node is
$$P_R = \frac{\pi r^2}{A}$$
and N is the number of anchor and localized nodes. Then the probability of a node having n or more anchor and localized nodes within its range is
$$P(d \ge n) = 1 - \sum_{j=0}^{n-1} \binom{N}{j} P_R^{j} (1 - P_R)^{N - j}$$
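The coverage probability can be evaluated numerically with a short Python sketch. It assumes the usual binomial model in which each of the N nodes falls within range independently with probability πr²/A, ignoring border effects; the function names `p_in_range` and `p_at_least` are illustrative.

```python
from math import comb, pi


def p_in_range(r, area):
    """Probability that a given node lies within range r (border effects ignored)."""
    return min(1.0, pi * r * r / area)


def p_at_least(n, num_nodes, r, area):
    """Probability that n or more of num_nodes nodes are within range r."""
    p = p_in_range(r, area)
    # Complement of the probability of having fewer than n nodes in range.
    fewer = sum(comb(num_nodes, j) * p ** j * (1 - p) ** (num_nodes - j)
                for j in range(n))
    return 1.0 - fewer
```

Calling `p_at_least(3, N, r, A)` with a larger N, as happens when localized nodes are counted alongside anchors, gives a higher probability that a node can be localized.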

Performance Results
In this section, the proposed algorithm LDBOD is compared with the WLS-SVD [11] and LLS [10] algorithms via simulation. We first consider the distance error to measure the accuracy of the techniques. 100 nodes are deployed, 50% of which are anchors which are chosen randomly. The deployment area is A = 100 × 100 m², and the range is r = 10 m. The distance error has a Gaussian distribution with variance $\sigma_d^2$ which is a percentage of the actual distance. As a performance measure, we use the mean error, which is defined as
$$\text{mean error} = \frac{1}{t\,u} \sum_{j=1}^{t} \sum_{i=1}^{u} \sqrt{(\hat{x}_i - x_i)^2 + (\hat{y}_i - y_i)^2}$$
where u is the number of unlocalized nodes, t is the number of trials, $(\hat{x}, \hat{y})$ is the estimated unlocalized node position, and (x, y) is the actual position. The results were averaged over $10^4$ trials. Localized nodes were used with the anchors to localize those unlocalized nodes which were not within range of a sufficient number of anchor and localized nodes in the previous iterations.
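As a small worked illustration of the mean error measure, the following Python sketch averages the Euclidean position error over all trials and unlocalized nodes. The array layout (trials × nodes × 2) and the function name `mean_error` are assumptions made for this example.

```python
import numpy as np


def mean_error(estimated, actual):
    """Mean Euclidean error over arrays of shape (trials, nodes, 2)."""
    est = np.asarray(estimated, dtype=float)
    act = np.asarray(actual, dtype=float)
    # Per-node distance between estimated and actual positions.
    errors = np.linalg.norm(est - act, axis=-1)
    # Averaging over every entry divides by (trials * nodes).
    return float(errors.mean())
```

With one trial and two nodes whose errors are 0 m and 5 m, the mean error is 2.5 m.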
The localization process ends when all nodes are localized or all remaining unlocalized nodes are isolated, i.e., not within range of three or more anchor or localized nodes. Figure 5 shows that the mean error with the proposed algorithm is lower than that with the LLS and WLS-SVD algorithms. Note that the rate of change of the error is also lower with LDBOD.
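The iterative use of localized nodes and the stopping rule can be sketched as a simulation loop in Python. This is a simplified illustration with assumed names: `cooperative_localize` is hypothetical, `estimate` is any position estimator taking a list of (reference position, measured distance) pairs, and ranging is noise-free and computed from the true positions, which a real node would not know.

```python
import math


def cooperative_localize(anchors, unlocalized, r, estimate, min_refs=3):
    """Iteratively localize nodes, letting localized nodes act as anchors.

    anchors, unlocalized: dicts mapping node id to true (x, y) position.
    Returns (estimated positions of localized nodes, set of isolated node ids).
    """
    true_pos = {**anchors, **unlocalized}  # used only to simulate ranging
    est_pos = dict(anchors)                # positions usable as references
    pending = set(unlocalized)
    while pending:
        newly = {}
        for nid in pending:
            refs = []
            for rid, rpos in est_pos.items():
                d = math.dist(true_pos[nid], true_pos[rid])
                if d <= r:
                    refs.append((rpos, d))
            if len(refs) >= min_refs:
                newly[nid] = estimate(refs)
        if not newly:
            break  # all remaining nodes are isolated
        est_pos.update(newly)
        pending -= set(newly)
    localized = {nid: est_pos[nid] for nid in unlocalized if nid in est_pos}
    return localized, pending
```

Nodes localized in one iteration only become references in the next, so a node with two anchors in range can still be localized once a neighbour has been localized.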
Next all algorithms are compared considering the transmission range of the wireless nodes. The deployment area, the number of nodes, and the number of anchors are the same as before but the transmission range varies from 10 m to 50 m. The distance error variance is fixed at 10% of the actual distance between nodes. The results were again averaged over $10^4$ trials. Figure 6 shows that the proposed algorithm performs better than the LLS and WLS-SVD algorithms at low transmission ranges, in which case the unlocalized nodes are typically within range of a small number of nodes. However, at high transmission ranges all algorithms have similar performance.
The probability that an unlocalized node has 3 or more anchor or localized nodes around it based on a given transmission range is illustrated in Figure 7. This clearly shows that using localized nodes in the localization process improves the probability of localizing nodes. When the range exceeds 25 m, all unlocalized nodes can be localized.
The anchor ratio is one of the most important factors affecting localization accuracy. Thus we next compare the algorithms with a varying percentage of anchor nodes. In this case we are more interested in the performance when the anchor ratio is small because in a practical system the number of anchor nodes will be much lower than the number of unlocalized nodes. The deployment area and the number of nodes are the same as before but the anchor ratio varies from 20% to 80%. The transmission range is fixed at 10 m and the distance error variance is fixed at 10% of the actual distance. The results are again averaged over $10^4$ trials. Figure 8 shows that the proposed algorithm again outperforms both the LLS and WLS-SVD algorithms, particularly at low anchor ratios. The probability that an unlocalized node has 3 or more anchor or localized nodes around it based on a given anchor ratio is shown in Figure 9. Clearly the use of localized nodes in the localization process improves the probability of node localization. This probability reaches a maximum of only 0.6 due to the small range of 10 m.
Next we consider the effect of the node geometry on performance, with GGDOP used as the geometry measure. Three anchor nodes are deployed on a circle with a fourth, unlocalized node in the center. The transmission range is set to 30 m to ensure that the unlocalized node is within range of the three anchors, and the distance error variance is set to 10%. The anchors $a_1$, $a_2$, and $a_3$ are distributed around the unlocalized node u with the angles between adjacent anchors, measured at u, ranging from 1° to 101° (with both angles the same). The results are averaged over $10^4$ trials, and are shown in Figure 10 for GGDOP and in Figure 11 for the angle between anchors. Both figures show that the proposed algorithm performs better, particularly when the geometry is poor, i.e., low GGDOP or small angles. The LLS and WLS-SVD algorithms perform similarly at all angles and GGDOP values because only four nodes are deployed. At small angles, the geometry is poor, and as the angles increase the geometry approaches the ideal case where the anchors are distributed uniformly on the circle. Thus at angles of 101° the performance is very good, and all three algorithms perform similarly.
Finally, we evaluated the algorithms with different anchor ratios and distance error variances. Figure 12 shows the resulting mean error surfaces. Clearly LDBOD outperforms the other algorithms at high distance error variances (90%) and low anchor ratios (10%), which is the most typical, but also the most challenging environment. The performance is similar at high anchor ratios (90%) and low distance error variances (10%), which is close to the ideal case, and therefore not likely to occur in practice.

Conclusions
A new range-based localization algorithm (LDBOD) has been presented which is based on the density-based outlier detection (DBOD) algorithm, a concept from data mining. The proposed algorithm is used to select the best points (candidates) from a set of distance estimate intersection points. The proposed algorithm was shown to outperform the LLS and recently proposed WLS-SVD algorithms.

Algorithm 1. Density-based outlier detection
K is the number of nearest neighbours
for all points p do
    determine N(p, K) for p
    determine the density of p
    the outlier score is the inverse of the density
end for

Figure 2. An unlocalized node with multiple anchors within its range.

Figure 4. The proposed algorithm with four anchor nodes. (a) Step 1: Distance estimates for an unlocalized node from four anchors; (b) Step 2: The intersection points; (c) Step 3: The candidate intersection points.


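The probability of a node having degree $j$, i.e., $j$ anchor or localized nodes within its range, is given by $P(d = j) = \binom{N}{j} P_R^{j} (1 - P_R)^{N - j}$, where $P_R$ is the probability of being within transmission range of a given node and $N$ is the number of anchor and localized nodes.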

Figure 7. Probability of node localization based on transmission range.

Figure 9. Probability of node localization based on the anchor ratio.

Figure 12. Mean error surfaces. (a) Mean error surface for the LLS algorithm; (b) Mean error surface for the WLS-SVD algorithm; (c) Mean error surface for the LDBOD algorithm.