Point Selection for Triangular 2-D Mesh Design Using Adaptive Forward Tracking Algorithm

Two-dimensional mesh-based motion tracking preserves neighboring relations (through connectivity of the mesh) and also allows warping transformations between pairs of frames; thus, it effectively eliminates the blocking artifacts that are common in motion compensation by block matching. However, the available uniform 2-D mesh model enforces connectivity everywhere within a frame, which is clearly not suitable across occlusion boundaries. To overcome this limitation, BTBC (background to be covered) detection and MF (model failure) detection algorithms are used. In this algorithm, the connectivity of the mesh elements (patches) across covered and uncovered region boundaries is broken. This is achieved by allowing no node points within the background to be covered and by refining the mesh structure within the model failure region at each frame. We modify the occlusion-adaptive, content-based mesh design and forward tracking algorithm used by Yucel Altunbasak for the selection of points for triangular 2-D mesh design. Then, we propose a new triangulation procedure for the mesh structure and a new algorithm to ensure the connectivity of the mesh structure after motion vector estimation of the mesh points. The modified content-based mesh is adaptive, which eliminates the need to transmit all node locations at each frame.


Introduction
Motion estimation is an important part of any video processing system and is divided into 2-D motion estimation and 3-D motion estimation.
2-D motion estimation has a wide range of applications, including video compression, motion forward tracking, sampling rate conversion, filtering and so on. Depending on the intended application of the resulting 2-D motion vectors, motion estimation methods can be very different. For example, in computer vision applications, the 2-D motion vectors are used to deduce 3-D structure and motion parameters [1].
On the other hand, in video compression applications, the estimated motion vectors are used to produce a motion-compensated prediction of a frame to be coded from a previously coded reference frame. The ultimate goal is to minimize the total number of bits used for coding the motion vectors and the prediction errors.
In motion tracking, the estimated motion vectors are used to predict the position of an object in the next frame and to track it through a sequence of video images [2,3].
In this paper, we are concerned with methods of 2-D motion estimation and with the implementation of one of them in motion tracking.
All 2-D motion estimation algorithms are based on temporal changes in image intensities (or colors). In fact, the observed 2-D motion based on intensity changes may not be the same as the actual 2-D motion. To be more precise, the velocity of the observed or apparent 2-D motion is referred to as optical flow. Optical flow can be caused not only by object motion, but also by camera movement or changes in illumination conditions. In this paper, we define optical flow and its equation, which imposes a constraint between image gradients and flow vectors [4]. This is a fundamental equality on which many motion estimation algorithms are based.
A key problem in motion estimation is how to parameterize the motion field. A 2-D motion field resulting from camera or object motion can usually be described by a few parameters. However, there are typically multiple objects in the imaged scene that move differently. Therefore, a global parameterized model, which assumes that all objects in the scene have the same motion and estimates one motion vector for the whole scene or frame, is usually inadequate. It is suitable only if the camera alone is moving or the imaged scene contains a single moving object with a planar surface. This method is feature-based and uses correspondences between pairs of selected feature points in two video frames.
The most direct and unconstrained approach is to specify the motion vector at every pixel. This is called the pixel-based representation. Such a representation is universally applicable, but it requires the estimation of a large number of unknowns (twice the number of pixels). In addition, it requires special physical constraints to obtain a correct estimate [5][6][7].
For scenes containing multiple moving objects, it is more appropriate to divide an image frame into multiple regions, so that the motion within each region can be characterized well by a parameterized model. This is known as region-based motion representation. It consists of a region segmentation map and several sets of motion parameters (one for each region). The difficulty with such an approach is that one does not know in advance which pixels have similar motions. Therefore, segmentation and estimation must be accomplished iteratively, which requires intensive computations that may not be feasible in practice [8].
One way to reduce the complexity associated with region-based motion representation is to use a fixed partition of the image domain into many small blocks. As long as each block is small enough, the motion variation within each block can be characterized well by a simple model and the motion parameters for each block can be estimated independently. This brings us to the popular block-based representation.
The simplest version models the motion in each block by a constant translation, so that the estimation problem becomes that of finding one motion vector for each block. This method provides a good compromise between accuracy and complexity and has found great success in practical video coding systems. One main problem with the block-based approach is that it does not impose any constraint on the motion transition between adjacent blocks. The resulting motion is often discontinuous across block boundaries, even when the true motion field changes smoothly from block to block [4,9].
One way to overcome this problem is to use a mesh-based representation, in which the underlying image frame is partitioned into non-overlapping polygonal elements. The motion field over the entire frame is described by the motion vectors at the nodes (corners of the polygonal elements) only, and the motion vectors at the interior points of an element are interpolated from the nodal motion vectors.
This representation induces a motion field that is continuous everywhere. It is more appropriate than the block-based representation over the interior regions of an object, which usually undergo continuous motion, but it fails to capture motion discontinuities at object boundaries, for example when occlusion or overlapping of objects occurs.
Adaptive schemes such as adaptive block-matching [10] or adaptive mesh [11] that allow discontinuities when necessary are needed for more accurate motion estimation.
In this paper, we present the general methodologies for 2-D motion estimation and evaluate their advantages and drawbacks. We then focus on the mesh algorithm and present a new adaptive mesh algorithm for forward motion tracking.
The 2-D mesh model is a method for motion tracking that allows spatial transformations and preserves neighboring relations between elements. This method tracks motion by translating the points of the elements according to an estimated motion field [11,12].
2-D meshes can be classified as uniform (regular) meshes, whose elements have equal size [13], and nonuniform (hierarchical or content-based) meshes, in which the size of the elements is adapted to the particular scene content [14][15][16][17][18]. The elements of a regular mesh may overlap in subsequent frames, while a nonuniform mesh is adapted to the boundaries of the objects. Furthermore, a uniform mesh is not suitable when there is more than one type of motion in the scene [19,20].
Object tracking means monitoring the variations in an object's position and following the object through a video sequence for a particular purpose. The success of forward tracking is closely related to how well we can detect occlusion and model failure regions and estimate the motion field in the vicinity of their boundaries. Motion compensation is necessary to determine the points to be killed and the locations of the newly born points.
In this paper, we present an adaptive forward-tracking mesh procedure in which no points are allowed to be located in the background regions that will be covered in the next frame (BTBC regions), and the mesh within the model failure region(s) is redefined for subsequent tracking of these regions [21].
In addition, we present the basics of the adaptive mesh algorithm, motion field estimation and BTBC region detection. Then, polygon approximation is reviewed. Next, we present a new, efficient, adaptive forward-tracking mesh design procedure that is employed to design the initial mesh.
Subsequently, a new triangulation method is proposed. Point motion estimation for the purposes of motion compensation and mesh tracking is implemented as well, followed by a forward node tracking and mesh refinement algorithm.
Experimental results on simple synthetic images show that the proposed adaptive forward-tracking mesh algorithm is successful in motion tracking. In other words, the results shown in this paper are synthetic. As a next step, the new algorithm must be tested on real scenes containing deformation or multiple objects, and its results must be compared with those of other object tracking algorithms, especially in the presence of occlusion.

Optical Flow Equation
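Combining Equations (1) and (2) yields:

$$\frac{\partial \psi}{\partial x} v_x + \frac{\partial \psi}{\partial y} v_y + \frac{\partial \psi}{\partial t} = 0 \qquad (3)$$

where $(v_x, v_y)$ represents the velocity vector, or flow vector, and $\psi(x, y, t)$ is the image intensity. Equation (3) is known as the "optical flow equation" [22].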
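As a brief sketch of where this constraint comes from, the standard textbook derivation is reproduced below. It is presumably what Equations (1) and (2) express, but that is an assumption; the symbol $\psi(x, y, t)$ for image intensity and the time step $\Delta t$ are notational choices made here for illustration.

```latex
% Constant intensity assumption: a point keeps its intensity while it moves.
\begin{align*}
\psi(x + v_x \Delta t,\; y + v_y \Delta t,\; t + \Delta t) &= \psi(x, y, t) \\
% First-order Taylor expansion of the left-hand side around (x, y, t).
\psi(x + v_x \Delta t,\; y + v_y \Delta t,\; t + \Delta t) &\approx \psi(x, y, t)
  + \frac{\partial \psi}{\partial x} v_x \Delta t
  + \frac{\partial \psi}{\partial y} v_y \Delta t
  + \frac{\partial \psi}{\partial t} \Delta t \\
% Equating the two right-hand sides and dividing by \Delta t.
\frac{\partial \psi}{\partial x} v_x + \frac{\partial \psi}{\partial y} v_y
  + \frac{\partial \psi}{\partial t} &= 0
\end{align*}
```

The result is a single scalar equation in the two unknowns $v_x$ and $v_y$, which is why an additional constraint is always needed to recover the full flow vector.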

Pixel-Based Motion Estimation
In pixel-based motion estimation, one tries to estimate a motion vector for every pixel. Obviously, this problem is ill-defined. If one applies the constant intensity assumption to two consecutive frames k and k + 1, then for every pixel in frame k there will be many pixels in frame k + 1 that have exactly the same image intensity. If one uses the optical flow equation instead, the problem remains, because there is only one equation for two unknowns.
To circumvent this problem, there are in general four approaches. First, one can use regularization techniques to enforce smoothness constraints on the motion field, so that the motion vector of a pixel is constrained by those found for the surrounding pixels. Second, one can assume that the motion vectors in a neighborhood surrounding each pixel are the same and apply the constant intensity assumption or the optical flow equation over the entire neighborhood.
Third, one can make use of additional invariance constraints, in addition to the intensity invariance that leads to the optical flow equation. For example, one can assume that the intensity gradient is invariant under motion, as proposed in [5][6][7].
Finally, one can also make use of the relation between the phase functions of the frame before and after motion [23].

Block-Based Motion Estimation
As we have seen, a problem with pixel-based motion estimation is that one must impose smoothness constraints to regularize the problem.
One way of imposing smoothness constraints on the estimated motion field is to divide the image domain into small non-overlapping regions, called "blocks", and to assume that the motion within each block can be characterized by a simple parametric model, for example constant, affine or bilinear.
If the block is sufficiently small, then this model can be quite accurate.
Let $B_m$ represent image block m, let M be the number of blocks and let K = {1, 2, …, M}. The partition into blocks should satisfy:

$$\bigcup_{m \in K} B_m = \Lambda, \qquad B_m \cap B_n = \emptyset \quad (m \neq n)$$

where $\Lambda$ denotes the image domain. Theoretically, a block can have any polygonal shape. In practice, the square shape is commonly used, although the triangular shape can also be used and is more appropriate when the motion in each block is described by an affine model.
Figure 1 illustrates the effect of several motion representations for a head-and-shoulder scene.

Block Matching Algorithm
In the simplest case, the motion in each block is assumed to be constant, that is, the entire block undergoes a translation. Therefore, the motion estimation problem is to find a single motion vector for each block. This type of algorithm is referred to as the block matching algorithm [24].
Given an image block $B_m$ in the anchor frame, the motion estimation problem is to determine a matching block $B'_m$ in the target frame such that the error between these two blocks is minimized. The displacement vector $\mathbf{d}_m$ between the spatial positions of these two blocks (in terms of the center or a selected corner) is the motion vector of this block. The most popular criterion for motion estimation is an error function based on the sum of the differences between the luminance values of every pair of corresponding points in the anchor frame $\psi_1$ and the target frame $\psi_2$.
Under the block-wise translational model, the error can be written as:

$$E = \sum_{m \in K} \sum_{\mathbf{x} \in B_m} \left| \psi_2(\mathbf{x} + \mathbf{d}_m) - \psi_1(\mathbf{x}) \right|^p$$

where p is a positive number. When p = 1, the error function (for any motion estimation algorithm) is called the "mean absolute difference", and when p = 2, it is called the "mean squared error". Because the estimated motion vector of a block affects the prediction error in that block only, one can estimate the motion vector of each block individually by minimizing the prediction error accumulated over that block, which is:

$$E_m(\mathbf{d}_m) = \sum_{\mathbf{x} \in B_m} \left| \psi_2(\mathbf{x} + \mathbf{d}_m) - \psi_1(\mathbf{x}) \right|^p$$

One way to determine the $\mathbf{d}_m$ that minimizes this error is exhaustive search. As illustrated in Figure 2, this scheme determines the optimal $\mathbf{d}_m$ for a given block $B_m$ in the anchor frame by comparing it with all candidate blocks $B'_m$ in the target frame within a predefined search region and finding the one with the minimum error. The displacement between the two blocks is the estimated motion vector.
However, exhaustive search requires a very large amount of computation. For a search range of ±R and a step size of 1 pixel, the total number of candidates is $(2R + 1)^2$ with this algorithm. To speed up the search, various fast algorithms for block matching have been developed. The key to reducing the computation is reducing the number of search candidates. The fast algorithms differ in the ways they skip candidates that are unlikely to have a small error. The most important fast search algorithms are the 2-D log search method [25] and the three-step search method [26].
In the block matching algorithm, one assumes that the motion in each block is constant. If it is necessary to characterize the motion in each block by a more complex model, a deformable block matching algorithm can be used.
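The exhaustive search described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the paper: frames are plain nested lists of luminance values, and the function names `block_error` and `exhaustive_search` as well as the toy frames are hypothetical.

```python
def block_error(anchor, target, top, left, size, dy, dx, p=1):
    """Sum of |target(x + d) - anchor(x)|^p over one square block."""
    total = 0
    for r in range(top, top + size):
        for c in range(left, left + size):
            total += abs(target[r + dy][c + dx] - anchor[r][c]) ** p
    return total


def exhaustive_search(anchor, target, top, left, size, R, p=1):
    """Try all (2R + 1)^2 displacements and keep the one with minimum error."""
    rows, cols = len(target), len(target[0])
    best, best_err = (0, 0), None
    for dy in range(-R, R + 1):
        for dx in range(-R, R + 1):
            # Skip candidate blocks that fall outside the target frame.
            if top + dy < 0 or top + dy + size > rows:
                continue
            if left + dx < 0 or left + dx + size > cols:
                continue
            err = block_error(anchor, target, top, left, size, dy, dx, p)
            if best_err is None or err < best_err:
                best, best_err = (dy, dx), err
    return best, best_err


# Toy frames (assumed data): a bright 2x2 square moves one pixel down and right.
anchor = [[0] * 8 for _ in range(8)]
target = [[0] * 8 for _ in range(8)]
for r in range(2, 4):
    for c in range(2, 4):
        anchor[r][c] = 255
        target[r + 1][c + 1] = 255

motion, err = exhaustive_search(anchor, target, top=2, left=2, size=2, R=2)
```

The double loop over `dy` and `dx` makes the cost of the method explicit: every extra pixel of search range adds a full ring of candidates, which is what the fast search methods try to avoid.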

Deformable Block Matching Algorithm
In the block matching algorithm introduced previously, each block is assumed to undergo a pure translation. This model is not appropriate for blocks undergoing rotation, zooming and so on. In general, a more sophisticated model such as the affine, bilinear or projective mapping can be used to describe the motion of each block. (Obviously, this still covers the translational model as a special case.) With such a model, a block in the anchor frame is in general mapped to a non-square quadrangle, as shown in Figure 3. The class of block-based motion estimation methods using higher-order models is therefore referred to as deformable block matching algorithms [27][28][29].
In this algorithm, the motion vector at any point in a block is interpolated by using only motion vectors at the block corners (called nodes) [30].
Assume that a selected number of control nodes in a block can move freely and that the displacement of any interior point can be interpolated from the nodal displacements. Let the number of control nodes be denoted by K and the motion vectors of the control nodes in $B_m$ by $\mathbf{d}_{m,k}$. Then, the motion function over the block is described by:

$$\mathbf{d}_m(\mathbf{x}) = \sum_{k=1}^{K} \phi_{m,k}(\mathbf{x})\, \mathbf{d}_{m,k}, \qquad \mathbf{x} \in B_m \qquad (6)$$

Equation (6) expresses the displacement at any point in a block as an interpolation of the nodal displacements, as shown in Figure 4.
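With this model, the motion parameters of any block are the nodal motion vectors, which can be estimated by minimizing the prediction error over this block, which is:

$$E(\mathbf{d}_1, \dots, \mathbf{d}_K) = \sum_{\mathbf{x} \in B} \left| \psi_2\!\left(\mathbf{x} + \sum_{k=1}^{K} \phi_k(\mathbf{x})\, \mathbf{d}_k\right) - \psi_1(\mathbf{x}) \right|^p$$

Because the estimation of nodal movements is independent from block to block, we omit the subscript m. The interpolation kernel $\phi_{m,k}(\mathbf{x})$ depends on the desired contribution of control point k in $B_m$ to the motion vector at $\mathbf{x}$. One way to design the interpolation kernels is to use the shape functions associated with the corresponding nodal structure [31]. To guarantee continuity across element boundaries, the interpolation kernels should satisfy:

$$0 \le \phi_k(\mathbf{x}) \le 1, \qquad \sum_{k=1}^{K} \phi_k(\mathbf{x}) = 1, \qquad \phi_k(\mathbf{x}_l) = \delta_{k,l}$$

where $\mathbf{x}_l$ is the position of node l. For the standard quadrilateral element of Figure 5, whose corners lie at $(\pm 1, \pm 1)$, the kernels are the bilinear shape functions of the form $(1 \pm x)(1 \pm y)/4$, one for each corner.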
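To make the interpolation concrete, the following sketch evaluates bilinear shape functions on a standard square element with corners at (±1, ±1) and interpolates a motion vector from four nodal vectors. The corner ordering, the function names `shape_functions` and `interpolate_motion`, and the sample nodal vectors are assumptions made for illustration; the paper's own node numbering is not recoverable from the text.

```python
def shape_functions(x, y):
    """Bilinear kernels for the corners (1,-1), (1,1), (-1,1), (-1,-1).

    The corner ordering is an assumption of this sketch.
    """
    return [
        (1 + x) * (1 - y) / 4,
        (1 + x) * (1 + y) / 4,
        (1 - x) * (1 + y) / 4,
        (1 - x) * (1 - y) / 4,
    ]


def interpolate_motion(x, y, nodal_vectors):
    """Motion vector at (x, y) as a kernel-weighted sum of nodal vectors."""
    phi = shape_functions(x, y)
    u = sum(w * d[0] for w, d in zip(phi, nodal_vectors))
    v = sum(w * d[1] for w, d in zip(phi, nodal_vectors))
    return u, v


# Assumed nodal motion vectors, one per corner in the order given above.
nodal = [(4.0, 0.0), (0.0, 4.0), (0.0, 0.0), (0.0, 0.0)]
center = interpolate_motion(0.0, 0.0, nodal)   # average of the four nodes
corner = interpolate_motion(1.0, -1.0, nodal)  # reproduces the first node
```

Each kernel equals one at its own corner and zero at the other three, and the four kernels sum to one everywhere in the element, so the interpolated field passes through the nodal vectors and varies continuously in between.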

Mesh-Based Motion Estimation
With the block-based model used in either block matching or deformable block matching, the motion parameters of individual blocks are specified independently. Unless the motion parameters of adjacent blocks are constrained to vary smoothly, the estimated motion field is often discontinuous and sometimes chaotic, as illustrated in Figure 6.
One way to overcome this problem is to use mesh-based motion estimation. As illustrated in Figure 7, the anchor frame is covered by a mesh, and the motion estimation problem is to find the motion of each node so that the image pattern within each element in the anchor frame matches well with that in the corresponding deformed element in the target frame. The motion within each element is interpolated from the nodal motion vectors, as long as the nodes in the target frame still form a feasible mesh. The mesh-based motion representation is guaranteed to be continuous and is thus free from the blocking artifacts associated with the block-based representation. Another benefit of the mesh-based representation is that it enables continuous tracking of the same set of nodes over consecutive frames, which is desirable in applications requiring object tracking.
As shown in Figure 7, one can generate a mesh for the initial frame and then estimate the nodal motions between every two frames. At each new frame, the mesh generated in the previous step is used, so that the same set of nodes is tracked over all frames. This is not possible with the block-based representation, because it requires that each new frame be reset to a partition consisting of regular blocks.
As said above, in the mesh-based motion representation, the underlying image domain in the anchor frame is partitioned into non-overlapping polygonal elements. Each element is defined by a few nodes and the links between the nodes, as shown in Figure 8. The motion field over the entire frame is described by the motion vectors at the nodes only. The motion vectors at the interior points of an element are interpolated from the motion vectors at the nodes of this element. In addition, the nodal motion vectors are constrained so that the nodes in the target frame still form a feasible mesh with no overlapping elements.
Let the number of elements be denoted by M and the motion vector of node k in $B_m$ by $\mathbf{d}_{m,k}$. The number of nodes defining each element is K. Then, the motion function over the element is described by:

$$\mathbf{d}_m(\mathbf{x}) = \sum_{k=1}^{K} \phi_{m,k}(\mathbf{x})\, \mathbf{d}_{m,k}, \qquad \mathbf{x} \in B_m$$

With the mesh-based motion representation, the motion parameters are the nodal motion vectors. To estimate them, we can again use an error-minimization approach such as:

$$E(\mathbf{d}_n, n \in N) = \sum_{m=1}^{M} \sum_{\mathbf{x} \in B_m} \left| \psi_2(\mathbf{x} + \mathbf{d}_m(\mathbf{x})) - \psi_1(\mathbf{x}) \right|^p$$

where $\mathbf{d}_n$ is the nodal motion vector at node n and N is the set of all nodes. The accuracy of the mesh representation depends on the number of nodes. A very complex motion field can be reproduced when a sufficient number of nodes is used. To minimize the number of nodes required, the mesh should be adapted to the image scene so that the actual motion within each element is smooth. In real-world video sequences, there are often motion discontinuities, such as occlusion at object boundaries. A more accurate representation would use separate meshes for different objects or adapt the mesh between every two consecutive frames. This algorithm is referred to as the "adaptive mesh algorithm".

Adaptive Mesh
In the following, we present the basics of the adaptive mesh algorithm.

Adaptive Mesh Concept
Standard mesh models enforce continuity of motion across the whole frame, which is not desired when multiple motions and occlusion regions are present [11,12]. We introduce the adaptive mesh concept to overcome this fundamental limitation. Consider the moving elliptical object shown in Figure 9. Assume that the elliptical object is translating to the right; thus there are two types of motion: motion to the right, which is assigned to the BTBC region, and motion to the left, which is assigned to the uncovered background (UB) region. The BTBC region in frame k should be covered completely in frame k + 1.

Motion Field Estimation
The motion field estimation algorithm estimates the motion vector at each pixel independently from an optical flow field, using the method of Horn and Schunck [32]. An optical flow-based method is better than, for example, a block matching method, because the former yields smoother motion fields, which are more suitable for parameterization [33]. In [34], Lucas and Kanade present another method for motion estimation based on a split-and-merge scheme.
Figure 9. Adaptive mesh concept.
In this paper, we use a point scheme to solve the optical flow equation [35]. Assume a block such as that in Figure 10 around every pixel and compute the partial derivatives according to Equation (13).
An extra condition is required to solve the optical flow equation. Minimizing the gradient or the Laplacian of the motion field according to Equation (14), which smooths the motion field, can be used for this purpose.
If all objects in the scene move together, only one u and one v will exist for the whole scene. In that case, u and v can be obtained by minimizing the sum of the squared optical flow equations over all pixels.
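For the special case in which the whole scene shares a single (u, v), minimizing the sum of squared optical flow equations reduces to a 2 × 2 linear system. The sketch below solves it in closed form. It is an illustration under stated assumptions: the partial derivatives are taken as given (for example, computed as in Equation (13)), and the function name `global_flow` and the sample gradients are hypothetical.

```python
def global_flow(gradients):
    """Least-squares (u, v) minimizing sum of (gx*u + gy*v + gt)^2.

    gradients: iterable of (gx, gy, gt) tuples, one per pixel, holding the
    spatial and temporal intensity derivatives at that pixel.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for gx, gy, gt in gradients:
        sxx += gx * gx
        sxy += gx * gy
        syy += gy * gy
        sxt += gx * gt
        syt += gy * gt
    # Normal equations: [[sxx, sxy], [sxy, syy]] [u, v]^T = [-sxt, -syt]^T
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("gradients do not constrain both flow components")
    u = (-sxt * syy + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


# Assumed samples that satisfy gx*u + gy*v + gt = 0 exactly for u = 2, v = -1.
samples = [(1.0, 0.0, -2.0), (0.0, 1.0, 1.0), (1.0, 1.0, -1.0), (2.0, -1.0, -5.0)]
u, v = global_flow(samples)
```

The determinant check reflects the underlying ambiguity: if all gradients point in the same direction, only the flow component along that direction can be recovered.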

BTBC Region Detection
We use the BTBC detection algorithm that is used in [11,12]. In this algorithm, if the difference between frame k and its motion-compensated estimate $\tilde{I}_k$ at a pixel (x, y) is greater than a predefined threshold T1, then the pixel (x, y) is labeled as an occlusion pixel. The value of the threshold T1 depends on the scene and can be defined in several ways. For example, half of the largest difference between the pixels of frame k and the pixels of frame k + 1, or the average of the differences between all pixels of frame k and frame k + 1, can be used. However, T1 should be determined experimentally.
Figure 11 illustrates the BTBC region detection for two scenes.
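The thresholding rule and the two example choices of T1 mentioned above can be sketched as follows. This is only an illustration: it assumes that the quantity compared against T1 is the absolute difference between frame k and its motion-compensated estimate, that frames are nested lists of luminance values, and the function names and toy frames are hypothetical.

```python
def frame_differences(frame_a, frame_b):
    """Absolute pixel-wise differences between two equally sized frames."""
    return [[abs(a - b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def threshold_half_max(frame_k, frame_k1):
    """T1 as half of the largest difference between frames k and k + 1."""
    diffs = frame_differences(frame_k, frame_k1)
    return 0.5 * max(max(row) for row in diffs)


def threshold_mean(frame_k, frame_k1):
    """T1 as the average difference between frames k and k + 1."""
    flat = [d for row in frame_differences(frame_k, frame_k1) for d in row]
    return sum(flat) / len(flat)


def occlusion_mask(frame_k, compensated_k, t1):
    """True where |frame_k - compensated_k| exceeds T1 (assumed criterion)."""
    diffs = frame_differences(frame_k, compensated_k)
    return [[d > t1 for d in row] for row in diffs]


# Toy data (assumed): one pixel is poorly predicted by motion compensation.
frame_k = [[10, 10, 10], [10, 200, 10]]
frame_k1 = [[10, 10, 10], [10, 10, 200]]
compensated_k = [[10, 10, 10], [10, 200, 120]]

t1 = threshold_half_max(frame_k, frame_k1)
mask = occlusion_mask(frame_k, compensated_k, t1)
```

Both threshold rules are heuristics derived from the frame difference, so in practice they serve as starting points for the experimental tuning of T1 rather than as fixed values.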

Polygon Approximation
Polygon approximation can be used when the boundary of a region must be approximated by a set of points, so that it provides the closest approximation to the region with few parameters. This simple approximation naturally fits the boundaries of the region. See Figure 12 for more details. The polygon approximation algorithm used here is similar to that proposed in [11,36], and is as follows:
1) Find the two pixels on the boundary of the region (1 and 2) that have the maximum distance from each other. The line drawn between these two points is called the main axis.
2) Find the two points on the boundary (3 and 4), on opposite sides of the main axis, that have the largest perpendicular distance from the main axis. These four points are named the initial points.
3) Now there are four segments: 1-3, 3-2, 2-4 and 4-1. Consider segment 1-3. Draw a straight line from point 1 to point 3. Detect every point on the boundary between 1 and 3 and every point in the area between the boundary and the straight line. If the maximum perpendicular distance between the straight line and every point on the boundary is below a certain threshold, and the number of pixels in the area between the boundary and the straight line is less than 5% of the total pixels of the region, then no new points need to be inserted on this segment of the boundary. If, however, the two criteria are not satisfied simultaneously, a new point must be inserted on the boundary at the pixel with the maximum distance from the straight line, and this procedure is repeated until no new points are needed within each new segment.
4) Repeat step 3 for the three remaining segments.
The results of the algorithm for several regions are shown in Figure 13. The stopping criterion of the algorithm can be a predetermined number of points on the boundary. If the number of points inserted on the boundary exceeds this upper limit (say, 64), the algorithm is stopped. In practice, however, this many points are not needed. The algorithm can easily be extended to 128 or 256 points.
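The recursive splitting at the heart of steps 1 and 3 can be sketched as below. This is a simplified illustration, not the full procedure: it applies only the perpendicular-distance criterion (the 5% area criterion and step 2 are omitted), it treats the boundary as an ordered list of (x, y) points, and the names `main_axis`, `point_line_distance` and `refine_segment` are hypothetical.

```python
import math
from itertools import combinations


def main_axis(boundary):
    """Step 1: indices of the two boundary points farthest from each other."""
    def separation(pair):
        (ax, ay), (bx, by) = boundary[pair[0]], boundary[pair[1]]
        return math.hypot(bx - ax, by - ay)
    return max(combinations(range(len(boundary)), 2), key=separation)


def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length


def refine_segment(boundary, i, j, max_dist):
    """Step 3 (distance criterion only): indices to insert between i and j."""
    if j - i < 2:
        return []
    a, b = boundary[i], boundary[j]
    k = max(range(i + 1, j),
            key=lambda n: point_line_distance(boundary[n], a, b))
    if point_line_distance(boundary[k], a, b) < max_dist:
        return []  # the straight line already approximates this piece
    return (refine_segment(boundary, i, k, max_dist) + [k]
            + refine_segment(boundary, k, j, max_dist))


# Assumed open boundary piece with a single bump at index 4.
boundary = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 3), (5, 0), (6, 0)]
axis = main_axis(boundary)
inserted = refine_segment(boundary, axis[0], axis[1], max_dist=1.0)
```

A cap on the number of inserted points, such as the limit of 64 mentioned above, could be added by passing a remaining-budget counter through the recursion.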

Selection of the Points for Triangular 2-D Mesh Design
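In the following, we propose a new, computationally efficient algorithm to select the points for 2-D triangular mesh design. This algorithm produces a nonuniform mesh, in which the density of points is proportional to the local motion activity and the mesh boundaries match the object boundaries. Thus, the points are located on spatial edges, or pixels with high spatial gradient. This is shown in Figure 14. Then, a triangulation procedure is performed using the selected points. The algorithm is as follows [37]:
1) Approximate the region of the object by polygon approximation.
2) Label all pixels except those of the BTBC region as "unmarked". Compute the DFD average for all unmarked pixels by the following expression:

$$DFD_{avg} = \frac{1}{k} \sum_{(x, y)} \left| DFD(x, y) \right|^p$$

where the sum runs over the unmarked pixels, k is the number of unmarked pixels and p can be chosen as 1 or 2.
3) Grow a circle about every point obtained from the polygon approximation until the sum of $|DFD(x, y)|^p$ in this circle is greater than the DFD average. Label all pixels within the circle as "marked".
4) Label the BTBC region.
5) [...]
6) [...] and consider a circular region with a predefined radius about it. If there is no other marked pixel in the circle, label the new pixel.
7) Grow a circle about it as in step 3. Stop growing when the marked region merges with another marked region, even if the condition in step 3 is not satisfied.
8) Repeat step 6 until the desired number of points has been selected.
Figure 15 shows the result of the procedure. It consists of grown regions for a square that has moved to the right.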

Triangulation
We propose a fast and simple algorithm for triangulation, called the star-polygon method. The algorithm is as follows:
1) Consider point M of the set of mesh points.
2) Find the nearest neighbor point above the point M.
3) Find the nearest neighbor point below the point M. See Figure 16.
The triangles should not overlap. Thus, we use the Delaunay triangulation condition to build triangles with no overlaps. Delaunay triangulation is a kind of triangulation with the following property: if a circle is circumscribed about a triangle, no other points of the mesh lie within the circle. This is shown in Figure 17.
The result of the algorithm, drawing a circle that is circumscribed about the desired triangle, is shown in Figure 18.
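The empty-circumcircle property can be checked directly. The sketch below computes the circumscribed circle of a candidate triangle and tests whether any other mesh point falls strictly inside it. It illustrates the Delaunay condition only, not the star-polygon method itself; the function names `circumcircle` and `is_delaunay` and the sample points are assumptions.

```python
import math


def circumcircle(a, b, c):
    """Center and radius of the circle through the three triangle corners."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:
        raise ValueError("corners are collinear; no circumcircle exists")
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)


def is_delaunay(triangle, points, eps=1e-9):
    """True if no mesh point other than the corners lies inside the circle."""
    (ux, uy), radius = circumcircle(*triangle)
    for p in points:
        if p in triangle:
            continue
        if math.hypot(p[0] - ux, p[1] - uy) < radius - eps:
            return False
    return True


# Assumed sample: a right triangle whose circumcircle is centered at (2, 2).
triangle = ((0, 0), (4, 0), (0, 4))
accepted = is_delaunay(triangle, [(0, 0), (4, 0), (0, 4), (5, 5)])
rejected = is_delaunay(triangle, [(0, 0), (4, 0), (0, 4), (1, 1)])
```

The small tolerance `eps` keeps points that lie exactly on the circle from being rejected because of rounding, a case that arises for regular point layouts such as squares.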

Motion Vector Estimation for Selected Points
Motion vector estimation for the selected points can be accomplished using the motion field computed in the BTBC detection part. However, the estimated motion vectors may be inconsistent, in the sense that they do not preserve the connectivity of the mesh structure. This is shown in Figure 20.
The following steps ensure that the connectivity of the mesh structure is preserved:
1) Consider point $N_0$ and all points connected to it.
2) Find the points mapped from the previous frame to the next and name them $N'_i$.
3) Compute the virtual point $N'_0$ by: [...] where $(u, v)$ is the motion vector of the virtual point.
[...]
1) [...] parameters for all triangular elements [...] and (22).
2) Compensate all points in this element by the computed parameters.
Similar to T1, the threshold T2 depends on the scene and is determined experimentally too. For example, T2 = T1 can be chosen.

Refinement of the Mesh Structure
All $(x', y')$ points refine the mesh structure. However, if a point is located in a model failure region or BTBC region between frames k + 1 and k + 2, that point has to be canceled, because such a point does not have a true motion vector. Also, a set of new points is located in the model failure region.
The region recognized as the object area is used for the next frame.

Figure 1. Different motion representations: (a) global; (b) pixel-based; (c) block-based; (d) region-based.
In forward motion estimation, the anchor frame is frame k and the target frame is frame k + 1; in backward motion estimation, the anchor frame is frame k + 1 and the target frame is frame k.

Figure 2. The search procedure for block-matching algorithm.

Figure 4. Interpolation of motion in a block from nodal motion vectors.

Figure 5. A standard triangular element and a standard quadrilateral element.

Figure 6. The blocking artifacts caused by block-based motion estimation.

Figure 8. Illustration of mesh-based motion representation using a quadrilateral mesh.

Figure 10. The block used to compute the partial derivatives at every pixel.

The steps of the BTBC detection algorithm are:
1) Calculate the frame difference and determine the change region by thresholding and processing as follows:
a) Filter with a median filter of size 5 × 5.
b) Apply three morphological closing operations with a 3 × 3 size, and then three morphological opening operations with the same size.
c) Eliminate small regions, that is, regions smaller than a predetermined size, for example 25 pixels.
2) Estimate the motion field from frame k to frame k + 1.
3) Compensate the motion for frame k from frame k + 1 to compute the figurative frame $\tilde{I}_k$ using the motion field.

Figure 14. Selection of the points for mesh design.



Figure 15. Result of the point selection algorithm.

Figures 18 and 19 also illustrate the result of the triangulation algorithm for several sets of points.

Figure 18. Drawing the circle circumscribed about the desired triangle.

Figure 20 .Figure 22 .
Figure 20.Non connectivity of the mesh structure.