Correction of inertial navigation system errors with a video-based navigator using a Digital Terrain Map

This paper deals with the error analysis of a novel navigation algorithm that uses as input the sequence of images acquired by a moving camera and a Digital Terrain (or Elevation) Map (DTM/DEM). More specifically, it has been shown that the optical flow derived from two consecutive camera frames can be used, in combination with a DTM, to estimate the position, orientation and ego-motion parameters of the moving camera. As opposed to previous works, the proposed approach does not require an intermediate explicit reconstruction of the 3D world. In the present work the sensitivity of the algorithm outlined above is studied. The main error sources are identified as the optical-flow computation, the quality of the information about the terrain, the structure of the observed terrain, and the trajectory of the camera. By assuming an appropriate characterization of these error sources, a closed-form expression for the uncertainty of the pose and motion of the camera is first developed, and then the influence of these factors is confirmed using extensive numerical simulations. The main conclusion of this paper is that the proposed navigation algorithm generates accurate estimates for reasonable scenarios and error sources, and thus can be effectively used as part of the navigation system of an autonomous vehicle.


Introduction
Vision-based navigation algorithms have been a major research topic during the past decades. Two common approaches to the navigation problem are landmark matching and ego-motion integration. In the landmark approach, several features are located on the image plane and matched to their known 3D locations; using the 2D and 3D data, the camera's pose can be derived. A few examples of such algorithms are [2], [3]. Once the landmarks have been found, the pose derivation is simple and can achieve quite accurate estimates. The main difficulty is the detection of the features and their correct matching to the landmark set.
In the ego-motion integration approach, the motion of the camera with respect to itself is estimated. The ego-motion can be derived from the optical-flow field, or from instruments such as accelerometers and gyroscopes. Once the ego-motion has been obtained, one can integrate this motion to derive the camera's path. One of the factors that make this approach attractive is that, unlike the previous approach, no specific features need to be detected. Several ego-motion estimation algorithms can be found in [4], [5], [6], [7]. The weakness of ego-motion integration comes from the fact that small errors accumulate during the integration process. Hence, the estimated camera path drifts and the pose estimation accuracy decreases over time. If such an approach is used, it is desirable to reduce the drift by activating, once in a while, an additional algorithm that estimates the pose directly. In [8] such a navigation system is suggested. In that work, as in the present one, the drift is corrected using a Digital Terrain Map (DTM). The DTM is a discrete representation of the observed ground's topography: it contains the altitude above sea level of the terrain at each geographical location. In [8] a patch of the ground was reconstructed using a structure-from-motion (SFM) algorithm and matched to the DTM in order to derive the camera's pose. Using an SFM algorithm, which makes no use of the information available in the DTM but bases its estimate on the flow field alone, places their technique under the same critique that applies to SFM algorithms in general [1].
The algorithm presented in this work does not require an intermediate explicit reconstruction of the 3D world. By combining the DTM information directly with the image information, it is claimed that the algorithm is well-conditioned and generates accurate estimates for reasonable scenarios and error sources. In the present work this claim is explored by performing an error analysis of the algorithm outlined above. By assuming an appropriate characterization of the error sources, a closed-form expression for the uncertainty of the pose and motion of the camera is first developed, and then the influence of different factors is studied using extensive numerical simulations.
Comparing the corrected position of the object with the position measured by an independent navigation system (DGPS) allows the real effectiveness of the navigation corrections to be estimated. Such an investigation of the described method was carried out during a flight over the Galilee in Israel [9]; the position error was about 25 meters and the angular error was about 1.5 degrees.

Problem Definition and Notations
The problem can be briefly described as follows. At any given time instant t, a coordinate system C(t) is fixed to the camera in such a way that the Z-axis coincides with the optical axis and the origin coincides with the camera's projection center. At that instant the camera is located at some geographical position. Consider now two sequential time instants t1 and t2: the transformation from C(t1) to C(t2) is given by a translation vector p12 and a rotation matrix R12.

The Navigation Algorithm
The following section describes a navigation algorithm that estimates the above-mentioned parameters. The pose and ego-motion of the camera are derived using a DTM and the optical-flow field of two consecutive frames. Unlike the landmark approach, no specific features need to be detected and matched; only the correspondence between the two consecutive images must be found in order to derive the optical-flow field. As mentioned in the previous section, a rough estimate of the required parameters is supplied as an input. Nevertheless, since the algorithm uses this input only as an initial guess and re-calculates the pose and ego-motion directly, no integration of previous errors takes place and accuracy is preserved.
The new approach is founded on the following observation. Since the DTM supplies information about the structure of the observed terrain, the depth of each observed feature is dictated by the camera's pose. Hence, given the pose and ego-motion of the camera, the optical-flow field can be uniquely determined. The objective of the algorithm is to find the pose and ego-motion that lead to an optical-flow field as close as possible to the given flow field.
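To make this observation concrete, the following Python sketch predicts the image location of a ground feature in the second frame from an assumed pose, ego-motion and terrain model. The names (dtm_height, predict_second_image), the analytic toy terrain and the brute-force ray marching are illustrative assumptions, not the paper's implementation, which interpolates a real DTM grid.

```python
import numpy as np

def dtm_height(x, y):
    # Toy analytic "terrain" standing in for the interpolated DTM grid.
    return 50.0 * np.sin(0.001 * x) * np.cos(0.001 * y)

def ray_terrain_intersection(origin, direction, t_max=5000.0, steps=5000):
    # Brute-force ray marching: first point where the ray dips below the terrain.
    ts = np.linspace(1.0, t_max, steps)
    pts = origin[None, :] + ts[:, None] * direction[None, :]
    below = pts[:, 2] <= dtm_height(pts[:, 0], pts[:, 1])
    return pts[np.argmax(below)]

def predict_second_image(q1, p1, R1, p12, R12):
    # Cast the ray from camera 1 through pixel q1 (homogeneous, unit focal
    # length), intersect it with the terrain, reproject into camera 2.
    d = R1 @ q1
    d = d / np.linalg.norm(d)
    G = ray_terrain_intersection(p1, d)
    R2 = R1 @ R12                     # pose of camera 2 in the world frame
    p2 = p1 + R1 @ p12
    g2 = R2.T @ (G - p2)              # feature in camera-2 coordinates
    return g2[:2] / g2[2]             # perspective projection
```

Comparing such predicted locations with the measured optical flow yields the residual that the algorithm minimizes over pose and ego-motion.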
A single vector from the optical-flow field is used to define a constraint on the camera's pose and ego-motion. Let q1 and q2 be the homogeneous representations of the feature's locations in the two image planes; assuming a pinhole model for the camera, one can think of these vectors, as is standard, as the vectors from the optical center of the camera to the projection points on the image plane. Using an initial guess of the pose of the camera at the first time instant, an estimated ground feature G^E is obtained (1). The true ground feature G^W can be described using the true pose parameters (2), where λ denotes the depth of the feature point (i.e., the distance of the point to the image plane, projected on the optical axis). Replacing (2) in (1) gives (3). From this expression, the depth of the true feature can be computed using the estimated feature location (4); plugging (4) back into (2) one gets (5). From here on, the time indices of p and q are omitted where no confusion can arise, as is the superscript describing the coordinate frame in which a vector is given, except where special attention needs to be drawn to the frames; normally p12 and the q's are given in the camera's frame while the rest of the vectors are given in the world frame. Using the simplified notation, (5) can be rewritten as (6). In order to obtain simpler expressions, define the projection operator (7), which projects a vector onto the subspace normal to s along the direction of u. By adding and subtracting G^E in (6) and reordering, one obtains (8); using the projection operator, (8) becomes (9). This expression has a clear geometric interpretation (see Fig.1): the vector from G^E to p1 is projected onto the DTM tangent plane, the projection being taken along the direction of the viewing ray. The next step is to transfer G from the global coordinate frame W into the first camera frame C1 and then into the second camera frame C2, yielding (10)-(12). The operator appearing in (11) has the inverse characteristic to the projection operator: it projects vectors onto the ray through q.
By reorganizing (13) and using the projection operator, we obtain (14): the vector is projected onto the orthogonal complement of q2. Since G^C2 and q2 should coincide, this projection should yield the zero vector. Plugging (11) into (14) yields our final constraint (15). This constraint involves the position, orientation and ego-motion defining the two camera frames. Although it involves 3D vectors, its rank cannot exceed two, because the projection operator maps R^3 onto a two-dimensional subspace. Such a constraint can be established for each vector in the optical-flow field until a nonsingular system is obtained. Since twelve parameters need to be estimated (six for the pose and six for the ego-motion), at least six optical-flow vectors appear to be required; this count, however, is valid only for the nonlinear problem. When Gauss-Newton iterations are used, the problem is linearized near the approximate solution, and numerical simulations demonstrate that the resulting matrix is always singular (zero determinant) for six points. It is therefore necessary to use at least seven points to obtain a nonsingular linear approximation. Usually, more vectors are used in order to define an over-determined system, which leads to a more robust solution. The reader's attention is drawn to the fact that the obtained constraint is non-linear; thus, an iterative scheme is used to solve the system. A robust algorithm that uses Gauss-Newton iterations and an M-estimator is described in [10]. In our implementation, if the Gauss-Newton method stops converging after several iterations, we switch to the Levenberg-Marquardt method; both algorithms are implemented in Matlab's lsqnonlin() function. The applicability, accuracy and robustness of the algorithm were verified through simulations and lab experiments.
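The iteration strategy described above, Gauss-Newton steps with a Levenberg-Marquardt damping fallback as in Matlab's lsqnonlin(), can be sketched as follows. The damping schedule and the toy residual are illustrative assumptions, not the paper's twelve-parameter system.

```python
import numpy as np

def gauss_newton_lm(residual, jac, x0, iters=50, tol=1e-10):
    # lam = 0 gives a pure Gauss-Newton step; lam > 0 adds LM damping.
    x, lam = np.asarray(x0, float), 0.0
    cost = np.sum(residual(x) ** 2)
    for _ in range(iters):
        r, J = residual(x), jac(x)
        A = J.T @ J + lam * np.eye(len(x))
        step = np.linalg.solve(A, -J.T @ r)
        new_cost = np.sum(residual(x + step) ** 2)
        if new_cost < cost:                      # accept step, relax damping
            x, cost, lam = x + step, new_cost, lam / 10 if lam else 0.0
        else:                                    # reject step, increase damping
            lam = 1e-3 if lam == 0.0 else lam * 10
        if cost < tol:
            break
    return x

# Toy nonlinear system: find (a, b) with a*b = 2 and a + b = 3.
res = lambda x: np.array([x[0] * x[1] - 2.0, x[0] + x[1] - 3.0])
jac = lambda x: np.array([[x[1], x[0]], [1.0, 1.0]])
sol = gauss_newton_lm(res, jac, [2.5, 0.2])
```

The switch-to-damping rule plays the role described in the text: when a pure Gauss-Newton step stops reducing the cost, the damped (Levenberg-Marquardt) step takes over.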
For the iterations it is more convenient to use a more robust normalized form of (15), given in (16). Using this normalized form of the equations avoids obtaining the incorrect trivial solution in which the two camera positions coincide at a single point on the ground.

Multiple Features
Suppose next that n feature points are tracked in two frames, so that an estimated location is obtained for each. Repeating the constraint for each feature point yields (18). Note that A_n and B_n depend on known quantities: the estimated features, the normals of the DTM tangent planes, and the images of the features at the two time instants, together with the unknown orientation R1 and the relative rotation R12. At this point in our discussion, several remarks are in order. Remark 1: the constraint (18) involves twelve "unknowns", namely the pose and ego-motion of the camera. From the remark at the end of the previous section, the equation involves at most 2n linearly independent constraints, so at least six features at different locations Q_i appear to be required for a determinate system of equations. This count, however, holds only for the nonlinear problem: when Gauss-Newton iterations are used, the problem is linearized near the approximate solution, and numerical simulations demonstrate that the resulting matrix is always singular (zero determinant) for six points, so at least seven points are necessary to obtain a nonsingular linear approximation. Usually, more vectors are used in order to define an over-determined system and hence reduce the effect of noise. Clearly, there are degenerate scenarios in which the obtained system is singular no matter how many features are available; examples include flying above completely planar or spherical terrain. However, in the general case where the terrain has "interesting" structure, the system is non-singular and the twelve parameters can be obtained. Remark 2: the constraint (18) is non-linear and, therefore, no analytic solution to it is readily available. Thus, an iterative scheme is used to solve the system. A robust algorithm using Newton iterations and an M-estimator is described in the following sections.
Remark 3: Notwithstanding Remark 2, one observes that the location and translation appear linearly in the constraint. Using the pseudo-inverse, these two vectors can be solved for explicitly, giving (20), so that, after resubstituting in (19), one obtains (21). This remark leads to two conclusions: 1. If the rotation is known to good accuracy and the measurement noise is relatively low, then the position and translation can be determined by solving a linear equation. This fact may be relevant when "fusing" the procedure described here with other measurements, e.g., with inertial navigation.
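Conclusion 1 can be illustrated with a minimal sketch: once the rotations are fixed, the stacked constraints are linear in the 6-vector of position and translation, so a single least-squares (pseudo-inverse) solve recovers them. The matrices A_n and B_n below are synthetic stand-ins for those of (18), generated from a known ground truth, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x_true = np.array([100.0, -40.0, 500.0, 5.0, 1.0, -0.5])  # stacked (p1, p12)

A_n = rng.standard_normal((30, 6))   # e.g. 15 features, 2 independent rows each
B_n = A_n @ x_true                   # noise-free synthetic "measurements"

# Pseudo-inverse solution of the over-determined linear system A_n x = B_n.
x_hat, *_ = np.linalg.lstsq(A_n, B_n, rcond=None)
```

With noise added to B_n the same solve gives the least-squares estimate, which is the setting relevant to fusion with inertial measurements.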
2. Equation (21) shows that the estimation of the rotations (both absolute and relative) can be separated from that of the location/translation. A similar property is found when estimating pose from a set of visible landmarks, as shown in [11]. In that work, similarly to the present one, the estimate is obtained by minimizing an objective function that measures the errors in object space rather than on the image plane (as in most other works); this property enables the decoupling of the estimation problem. Note, however, that [11] addresses only the decoupling of the pose rotation and translation, while here the 6 parameters of the absolute and relative rotations are separated from the 6 parameters of the camera location and translation.

The Epipolar Constraint Connection
Before proceeding any further, it is interesting to look at (15) in the light of previous work on SFM and, in particular, epipolar geometry. To this end, it is worth deriving the basic epipolar constraint in the present framework and notation. Write (22): the ray from the second camera center in the direction of q2 and the ray from p12 in the direction of R12 q1 are projections of the same feature, and the epipolar constraint requires that these two rays intersect. It follows that (23), and hence (24). For a vector x, let [x]x denote the skew-symmetric matrix built from its components; then, as is well known, the vector product of two vectors x and y can be expressed as x × y = [x]x y.
Using this notation, the epipolar constraint (24) can be written as (25), and symmetrically as (26). The important observation here is that if the vector p12 satisfies the above constraint, then the vector λ·p12 also satisfies it, for any scalar λ. This is an expression of the scale ambiguity built into the SFM problem. The constraint (15), on the other hand, is non-homogeneous and hence does not suffer from the same ambiguity. In terms of the translation alone (and for only one feature point!), if p12 satisfies (15) for given R1 and R12, then p12 + λ·q2 also satisfies the constraint, and hence the ego-motion translation is defined only up to a one-dimensional family. However, (27) holds trivially, and hence the epipolar constraint does not provide an additional equation that would allow the translation to be solved for uniquely. Moreover, observe that (15) can be written using a vector product instead of the projection operator, as in (28). Taking into account the identity relating the two forms, it is possible to conclude that (28) implies (26), and hence the new constraint "contains" the classical epipolar geometry. Indeed, one can think of the constraint derived in (15) as strengthening the epipolar constraint by requiring not only that the two rays (in the directions of q1 and q2) intersect, but, in addition, that this intersection point lie on the DTM's linearization plane. Observe, moreover, that taking more than one feature point allows the translation to be computed completely (at least for given rotation matrices).
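The skew-symmetric form of the epipolar constraint and its scale ambiguity can be checked numerically with a small sketch; the geometry below (identity rotation, unit baseline) is an arbitrary illustrative example.

```python
import numpy as np

def skew(x):
    # [x]_x such that skew(x) @ y == np.cross(x, y).
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])

def epipolar(q2, p12, R12, q1):
    # q2^T [p12]_x R12 q1, which must vanish for intersecting rays.
    return float(q2 @ skew(p12) @ R12 @ q1)

R12 = np.eye(3)
p12 = np.array([1.0, 0.0, 0.0])
G = np.array([3.0, 2.0, 10.0])        # a 3D point in the first camera frame
q1 = G / G[2]                          # its projection in frame 1
q2 = (G - p12) / (G - p12)[2]          # its projection in frame 2 (R12 = I)
```

Because the constraint is homogeneous in p12, scaling the translation leaves it satisfied, which is exactly the SFM ambiguity that the non-homogeneous constraint (15) removes.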

Correcting inertial navigation errors with the vision-based algorithm using a Kalman filter
Vision-based navigation algorithms have been a major research topic during the past decades. The algorithm used in this paper is based on the foundations of multiple-view geometry and a terrain map. This method yields the position and orientation of the observer camera; the same data are obtained independently by inertial navigation methods. A Kalman filter is used to fuse these two results; since the equations are nonlinear, the extended Kalman filter is employed [12].
The inertial navigation computations used the Inertial Navigation System Toolbox for Matlab [13].
The input of the Kalman filter consists of two parts. The first part is the state vector X of the equations of motion, in our case the inertial navigation error equations. The vector X consists of fifteen components: the errors in coordinates, velocities and attitude angles, together with the accelerometer and gyro biases. Perturbations to the chosen attitude angles can be added linearly, so for small errors these angles can be used directly in the Kalman filter. This choice of angles is made because the formulas describing their evolution are much simpler than those describing the evolution of Euler-angle differences. The diagonal elements of the process noise covariance Q_k corresponding to the attitude angles are determined by the gyro noise and are proportional to dt. We assume that the errors between the values obtained by the vision-based navigation algorithm and the true values depend linearly on the noise; the corresponding measurement noise covariance matrix is denoted R_k, and the error analysis yielding this matrix is described in [14]. The elements of the measurement matrix H_k relating the measured coordinates and angles to the state, such as H_k(5,8) and H_k(6,9) for the angles, are equal to one; the rest of the elements are equal to zero.
The remaining elements of the transition matrix Φ are equal to zero.
The Kalman filter time-update equations are as follows: they project the state and covariance estimates from the previous time step k-1 to the current time step k. The Kalman filter measurement-update equations are as follows: they correct the state and covariance estimates with the measurement Z_k. The resulting vector X_k is used to update the coordinates, velocities, Euler angles, and accelerometer and gyro biases for the inertial navigation calculation at the next step.
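The time-update and measurement-update steps can be sketched as follows. This is the standard Kalman cycle only; the matrices in the example are a one-dimensional toy, not the paper's fifteen-state INS error model.

```python
import numpy as np

def kf_predict(x, P, Phi, Q):
    # Time update: project state and covariance from step k-1 to step k.
    return Phi @ x, Phi @ P @ Phi.T + Q

def kf_update(x, P, z, H, R):
    # Measurement update: correct the prediction with the measurement z.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)            # Kalman gain
    x_new = x + K @ (z - H @ x)
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new

# One predict/update cycle on a scalar random-walk example.
x, P = np.array([0.0]), np.array([[1.0]])
x, P = kf_predict(x, P, np.eye(1), np.array([[0.1]]))
x, P = kf_update(x, P, np.array([2.0]), np.eye(1), np.array([[0.5]]))
```

In the fusion described above, the prediction comes from the INS error dynamics (Phi, Q_k) and the measurement z is the pose difference supplied by the vision-based algorithm with covariance R_k.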
Numerical simulations were performed to examine the effectiveness of the Kalman filter in combining these two navigation algorithms. Figs. 3-5 show that the coordinate error of the corrected path is much smaller than the inertial navigation coordinate error without the Kalman filter. Improved results are also obtained for the velocity, despite the fact that velocity is not measured by the vision-based navigation algorithm (Fig. 6).

Error analysis
The rest of this work deals with the error analysis of the proposed algorithm. In order to evaluate the algorithm's performance, the objective function of the minimization process needs to be defined first. For each of the n optical-flow vectors, a function f_i : R^12 -> R^3 is defined as the left-hand side of the constraint described in (16); in this expression, R12 and the depth λ_i are functions of the estimated parameters. The function F is defined as the concatenation of the f_i's. With these notations, the goal of the algorithm is to find the twelve parameters that minimize ||F(θ, D)||^2, where θ represents the 12-vector of parameters to be estimated and D is the concatenation of all the data obtained from the optical flow and the DTM. If D were free of errors, the true parameters would be obtained; since D contains error perturbations, the estimated parameters drift to erroneous values. It has been shown in [15] that the connection between the uncertainty of the data and the uncertainty of the estimated parameters can be described by the first-order approximation (46), where Σ_θ and Σ_D represent the covariance matrices of the parameters and the data, respectively, and g is defined through the (3n x 12) Jacobian matrix of F with respect to the twelve parameters. By ignoring second-order terms, the derivatives of g can be approximated by (48)-(49), where the (3n x m) Jacobian matrix of F with respect to the m data components is defined in a similar way. Substituting (48) and (49) back into (46) yields expression (50). The quantity appearing in (53) and (56) is the ground feature G in the second camera frame, as defined in (11).

Calculation of Σ_D
As mentioned above, the data vector D is constructed by concatenating all the q2's followed by all the G^E's. Thus Σ_D should represent the uncertainty of these elements.
Since the q2's and the G^E's are obtained from two different and uncorrelated processes, the covariance relating them is zero, which leads to a block-diagonal matrix with two blocks. In this work the errors of the image locations and the DTM heights are assumed to be additive, zero-mean and Gaussian, with standard deviations σ_I and σ_h respectively. Each q2 vector is a projection on the image plane, where unit focal length is assumed; hence there is no uncertainty about its z-component. Since an isotropic normal distribution is assumed for the sake of simplicity, the covariance matrix of the image measurements is defined accordingly. In [16] the accuracy of heights obtained by interpolating the neighboring DTM grid points is studied; the dependence of this accuracy on the specific location whose height is interpolated was found to be negligible. This finding is adopted here, and a constant standard deviation is assigned to all DTM height measurements. Although the uncertainties of nearby G^E's are correlated, this dependence is ignored in the following derivations for the sake of simplicity. Thus a block-diagonal matrix is obtained for Σ_G, containing the 3x3 covariance matrices Σ_{G_i} along its diagonal, which are derived as follows: consider the ray sent from p1 along the direction R1 q1. This ray should have intersected the terrain at the true feature; let H be the 3D point on the terrain above that location.
Using the fact that H belongs to the true terrain plane, one obtains (60). Extracting λ from (60) and substituting it back into the expression for G^E yields (61); this result is obtained using the fact that the z-component of N is 1 (62). Finally, the uncertainty of G^E is expressed by the covariance matrix (63). The algorithm presented in this work estimates the pose of the first camera frame and the ego-motion. Usually, the most interesting parameters for navigation purposes are those of the second camera frame, since it reflects the most up-to-date information about the platform location. The second pose can be obtained in a straightforward manner as the composition of the first-frame pose with the camera ego-motion (65). The uncertainty of the second-pose estimates is described by a 6x6 covariance matrix, which can be derived from the already obtained 12x12 covariance matrix Σ_θ by multiplying it on both sides by J_C2, the Jacobian of the six C2 parameters with respect to the twelve estimated parameters. For this purpose, the three Euler angles of the second frame need to be extracted from (65); simple derivations and concatenation of the resulting expressions yield the required Jacobian, which is used to propagate the uncertainty from the estimated parameters to the second pose.
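The final propagation step is the standard first-order sandwich rule. The sketch below uses a toy Jacobian in which the second position is simply the first position plus the translation (identity rotations); the real J_C2 additionally carries the Euler-angle derivatives extracted from (65).

```python
import numpy as np

def propagate(Sigma_theta, J_C2):
    # First-order propagation: Sigma_C2 = J_C2 Sigma_theta J_C2^T.
    return J_C2 @ Sigma_theta @ J_C2.T

# Toy Jacobian: pos2 = pos1 + p12, angles2 = angles1 (identity rotations).
J = np.zeros((6, 12))
J[0:3, 0:3] = np.eye(3)      # d(pos2)/d(pos1)
J[0:3, 6:9] = np.eye(3)      # d(pos2)/d(p12)
J[3:6, 3:6] = np.eye(3)      # d(angles2)/d(angles1), toy identity
Sigma = np.eye(12)           # illustrative 12x12 parameter covariance
S2 = propagate(Sigma, J)
```

With independent unit-variance errors, the position variances double (errors of pose and translation add), while the angle variances pass through unchanged, as the toy Jacobian predicts.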

Divergence of the method and thresholds necessary for convergence
In the previous section we presented an error analysis of the vision-based navigation method. That analysis is valid only if the found solution is close to the true one; otherwise, nonlinear effects can appear, or an incorrect local minimum may even be found, and the method can begin to diverge. Such a result can be obtained:
1) if a large number of outlier features appears;
2) if the situation is close to a degenerate case, in which the position or orientation errors become too large. This can happen, for example, with a small number of features, flat ground, or a small camera field of view;
3) if the initial position and orientation for the iteration process are too far from the true values.
In the following subsections we consider threshold conditions that allow such situations to be avoided. If, for a given case, even one of these threshold conditions is violated, we do not apply the vision-based correction for that case and use only the ordinary INS result. If this situation repeats three times, we stop using the vision-based navigation method altogether, including for the last valid case. Let us discuss these three factors in detail.

Dealing with Outliers
In order to handle real data, a procedure for dealing with outliers must be included in the implementation. The objective of the present section is to describe the current implementation, which seems to work satisfactorily in practice. Three kinds of outliers should be considered: 1. outliers present in the correspondence solution (i.e., "wrong matches"); 2. outliers caused by the terrain shape; and 3. outliers caused by relatively large discrepancies between the DTM and the observed terrain.
The latter two kinds of outliers are illustrated in Fig.7. Outliers caused by the terrain shape appear for terrain features located close to large depth variations. For example, consider two hills, one closer to the camera, the other farther away, and a terrain feature Q located on the closer hill. The ray-tracing algorithm using the erroneous pose may "miss" the proximal hill and erroneously place the feature on the distal one. Needless to say, the error between the true and estimated locations is not covered by the linearization. To visualize the errors introduced by a relatively large DTM-terrain mismatch, suppose a building was present on the terrain when the DTM was acquired but is no longer there when the experiment takes place. The ray-tracing algorithm will locate the feature on the building, although the true terrain feature belongs to the background that is now visible. As discussed above, the multi-feature constraint is solved in a least-squares sense for the pose and motion variables. Given the sensitivity of least squares to incorrect data, the inclusion of one or more outliers may result in convergence to a wrong solution. A possible way to circumvent this difficulty is to use an M-estimator, in which the original solution is replaced by a weighted version: a small weight is assigned to the constraints involving outliers, thereby minimizing their effect on the solution. More specifically, consider the function f_i defined in (45), resulting from the i-th correspondence pair. In the absence of noise, this function should be equal to zero at the true pose and motion values; hence, following standard notation, define the residual r_i as its norm. Using an M-estimator, the solution for θ (the twelve parameters to be estimated) is obtained using an iteratively re-weighted least-squares scheme, in which the weights w_i are recomputed after each iteration according to their corresponding updated residuals.
In our implementation we used the so-called Geman-McClure function. The calculated weights are then used to construct a weighted pseudo-inverse matrix that replaces the regular pseudo-inverse appearing in (50); see [17] for further details on M-estimation techniques. Let us define the weights matrix W of size (3n x 3n), which decreases the influence of outliers, where n is the number of features; its diagonal elements are given by the weights above. A separate threshold gives the maximum camera-position shift for which the photo-feature error remains smaller than the pixel size; the threshold value dist_threshold is chosen to be 40. The degenerate case of a small baseline (the distance between the two camera positions used in the vision-based navigation method) gives the following threshold conditions:
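The re-weighting step can be sketched as follows. A common form of the Geman-McClure weight is w_i = σ² / (σ² + r_i²)² up to normalization; the exact constant and scale σ used in the paper's implementation are not given here, so this is an illustrative assumption.

```python
import numpy as np

def geman_mcclure_weights(residuals, sigma=1.0):
    # Large residuals (likely outliers) receive vanishing weight, so they
    # barely influence the weighted least-squares solution.
    r2 = np.asarray(residuals, float) ** 2
    return sigma**2 / (sigma**2 + r2) ** 2

w = geman_mcclure_weights([0.0, 1.0, 10.0])
```

In the iteratively re-weighted scheme these weights fill the diagonal of W and are recomputed from the updated residuals after every iteration.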

Dependence of the error analysis on different factors
The purpose of the following section is to study the influence of different factors on the accuracy of the proposed algorithm's estimates. The closed-form expression developed in the previous section is used to determine the uncertainty of these estimates under a variety of simulated scenarios. Each tested scenario is characterized by the following parameters: the number of optical-flow features used by the algorithm, the image resolution, the grid spacing of the DTM (also referred to as the DTM resolution), the amplitude of the hills/mountains on the observed terrain, and the magnitude of the ego-motion components. In each simulation, all parameters except the examined one are set according to a predefined parameter set. In this default scenario, a camera with 400x400 image resolution flies at an altitude of 500m above the terrain. The terrain model dimensions are 3x3 km with 300m elevation differences (Fig.13(b)). A DTM with 30m grid spacing is used to model the terrain (Fig.10(c)); this DTM resolution leads to a standard deviation of 2.34m for the height measurements. A predefined ego-motion separates the two images used for the optical-flow computation. Each of the simulations described below studies the influence of a different parameter. A variety of values are examined, and 150 random tests are performed for each tested value. For each test the camera position and orientation were randomly selected, except for the camera's height, which was dictated by the scenario's parameters. Additionally, the directions of the ego-motion translation and rotation components were first chosen at random and then normalized to the required magnitude.
In Fig.8, the first simulation results are presented. In this simulation the number of optical-flow features used by the algorithm is varied, and its influence on the obtained accuracy of C2 and the ego-motion is studied. All parameters were set to their default values except for the number of features. Fig.8(a) presents the standard deviations of the second camera frame, while the deviations of the ego-motion are shown in Fig.8(b). As expected, the accuracy improves as the number of features increases, although the improvement becomes negligible once the number of features reaches about 150. In the second simulation the influence of the image resolution was studied (Fig.9). It was assumed that the image measurements contain an uncertainty of half a pixel, where the size of the pixels is dictated by the image resolution. Obviously, the accuracy improves as the image resolution increases, since the quality of the optical-flow data depends directly on this parameter. The influence of the DTM grid spacing is the objective of the next simulation. Different DTM resolutions were tested, varying from 10m up to an extremely rough resolution of 190m between adjacent grid points (see Fig.10). The reader's attention is drawn to the fact that the obtained uncertainty appears to grow linearly with the DTM grid spacing (see Fig.11). This phenomenon can be understood since, as explained in the previous section, the DTM resolution does not affect the accuracy directly but rather influences the height uncertainty, which enters the accuracy calculation. As can be seen in Fig.12, the standard deviation of the DTM heights increases linearly with the DTM grid spacing, which explains the obtained results. Another simulation demonstrates the importance of the terrain structure to the accuracy of the estimates.
In the extreme scenario of flying above a planar terrain, the observed ground features do not contain the information required for deriving the camera pose, and a singular system is obtained. As the height differences and the variability of the terrain increase, the features become more informative and better estimates can be derived. For this simulation, the DTM elevation differences were scaled to vary from 50m to 450m (Fig.13). It is emphasized that while the terrain structure plays a crucial role in the estimation of the camera pose and of the translational component of the ego-motion, it has no direct effect on the rotational component of the ego-motion. Since the optical flow is a composition of two vector fields, translational and rotational, the information for deriving the ego-motion rotation is embedded only in the rotational component of the flow field. The feature depths influence only the translational component of the flow, so varying the height differences, or any other structural change in the terrain, is expected to have no effect on the estimation of the ego-motion rotation. These characteristics are well demonstrated in Fig.14. The influence of the magnitude of the ego-motion between the two frames was also studied (Fig.15); the conclusion is that the time gap between the two camera frames should be as long as the optical-flow derivation algorithm can tolerate. Inertial navigation systems (INS) are commonly used to determine the position and orientation of a platform; their problem is that the error grows without bound over time. We propose to use the new method (a navigation algorithm based on optical flow and a digital terrain map) [18] to correct the INS results and keep the error bounded and essentially constant. A Kalman filter is used to combine the INS results with the results of the new method [12], and an error analysis based on a linear first-order approximation is used to find the error covariance matrix of the new method [14].
Typical results of the numerical simulations can be seen in Figs.3 and 4.

1) If the situation is close to a degenerate case (for example, a small camera field of view, an almost flat terrain, a small baseline, and so on), the described method cannot be used, because it is impossible to recover the camera states from such data. Even in this case, however, the found feature correspondences can still be used to improve the INS results by means of a Kalman filter: the correspondences themselves, rather than the position and orientation computed from them, are treated directly as the measurements for the Kalman filter. An example of such an improvement can be found in [19]. In this case, however, the errors of the method grow with time, just as for the INS alone; after some time the measured position is too far from the true position, and the DTM constraints can no longer be used for error correction, only the epipolar constraints. With the method described in this paper the error stops growing and remains bounded, so the DTM constraints can be used at all times.
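The fusion idea can be reduced to a minimal one-dimensional sketch (the random-walk drift model and all noise levels below are illustrative assumptions, not the paper's actual filter): the INS position error grows as a random walk, while a drift-free but noisy absolute fix, standing in here for the vision/DTM measurement, keeps the filtered error bounded.

```python
# Scalar Kalman filter sketch: bounding INS drift with drift-free fixes.
# All models and noise variances are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
true_pos = 0.0
ins_pos = 0.0            # dead-reckoned INS position; its error random-walks
x_err, P = 0.0, 1.0      # filter state: estimated INS error and its variance
Q, R = 0.04, 25.0        # drift and measurement noise variances (assumed)

for _ in range(500):
    ins_pos += rng.normal(0.0, np.sqrt(Q))     # INS error accumulates
    P += Q                                     # predict: error is a random walk
    z = true_pos + rng.normal(0.0, np.sqrt(R)) # noisy, drift-free absolute fix
    K = P / (P + R)                            # Kalman gain
    x_err += K * ((ins_pos - z) - x_err)       # update estimated INS error
    P *= (1.0 - K)

corrected = ins_pos - x_err
print(f"filter variance settled at P = {P:.2f}")  # bounded, ~1 m^2 here
```

The variance P converges to a finite steady-state value instead of growing with time, which is the qualitative behaviour claimed for the DTM-aided method.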
2) It is possible to consider faster and more nearly optimal methods for finding the minimum of the function that yields the position and orientation of the camera. For example, the initial state for the described method can be improved by using the epipolar equations (25) to determine R12, and p12 up to a constant scale factor.
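As an illustration of the epipolar relation underlying such an initialization (equation (25) itself is not reproduced here), the sketch below builds an essential matrix E = [p12]x R12 from an assumed relative pose and verifies that perfect correspondences satisfy the constraint x2' E x1 = 0; from this constraint R12, and p12 up to scale, can in principle be recovered. All numeric values are hypothetical, and the pose convention X2 = R12 X1 + p12 is one common choice among several:

```python
import numpy as np

def skew(v):
    """Cross-product matrix: skew(v) @ u == np.cross(v, u)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# Assumed relative pose between frames 1 and 2 (X2 = R12 @ X1 + p12).
th = 0.05                                   # rotation about the optical axis [rad]
R12 = np.array([[np.cos(th), -np.sin(th), 0.0],
                [np.sin(th),  np.cos(th), 0.0],
                [0.0,         0.0,        1.0]])
p12 = np.array([5.0, 1.0, 0.5])             # baseline [m]

E = skew(p12) @ R12                         # essential matrix

# Synthetic 3-D points in front of camera 1, expressed in both frames.
rng = np.random.default_rng(0)
X1 = rng.uniform([-50.0, -50.0, 80.0], [50.0, 50.0, 200.0], size=(100, 3))
X2 = X1 @ R12.T + p12
x1 = X1 / X1[:, 2:3]                        # normalized homogeneous image points
x2 = X2 / X2[:, 2:3]

# Epipolar residuals x2^T E x1: zero for perfect correspondences.
residuals = np.einsum('ni,ij,nj->n', x2, E, x1)
print(np.abs(residuals).max())              # ~0 up to floating-point noise
```

With noisy correspondences the residuals are no longer zero, and E (hence the initial R12 and the direction of p12) would be estimated by a least-squares method such as the eight-point algorithm.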
The next step can then use equation (21).

3) Instead of relying only on randomly selected features, distinctive terrain features such as hill tops, valleys, and hill occluding boundaries can be used to specify the position and orientation.

4) Distributed (rather than point) features, as well as recognition of characteristic objects, can be employed.

5) The described methods can be applied in other practical settings, such as orientation in indoor rooms or inside the human body.

Conclusions
An algorithm for pose and motion estimation using corresponding features in images and a DTM, combined with a Kalman filter, was presented. The DTM served as a global reference, and its data were used to recover the absolute position and orientation of the camera. In numerical simulations, the position and velocity estimates were found to be sufficiently accurate to bound the accumulated errors and prevent trajectory drifts.
An error analysis has been performed for a novel algorithm that uses as input the optical flow derived from two consecutive frames and a DTM. The position, orientation and ego-motion parameters of the camera can be estimated by the proposed algorithm. The main sources of error were identified to be the optical-flow computation, the quality of the information about the terrain, the structure of the observed terrain, and the trajectory of the camera. A closed-form expression for the uncertainty of the pose and motion was developed, and extensive numerical simulations were performed to study the influence of the above factors.
Tested under reasonable and common scenarios, the algorithm behaved robustly even when confronted with relatively noisy and challenging environments. Following the analysis, it is concluded that the proposed algorithm can be effectively used as part of a navigation system for autonomous vehicles.
On the basis of the numerical simulation results for realistic flight and camera parameters, we can also conclude the following: 1) The most important simulation parameter is the field of view (FOV): for a small FOV the method diverges, while for a FOV of 60 degrees the results are very good. The reason is that for a small FOV (12 or 6 degrees) the situation is close to a degenerate state: a small baseline must be chosen, and the observed ground patch is too small and almost flat.
2) The camera resolution is also a very important parameter: a higher resolution yields considerably better results, because the features are detected with much better precision.
3) The precision of the new method depends on the flight height. Initially the precision improves as the height increases, because a larger baseline can be used and a larger patch of ground is observed. For greater heights, however, the precision begins to decrease because of the diminishing parallax effect.