A Personnel Detection Algorithm for an Intermodal Maritime Application of ITS Technology for Security at Port Facilities

doi:10.4236/jtts.2011.14016

Paper Menu >>

Journal Menu >>

Journal of Transportation Technologies, 2011, 1, 123-131

doi:10.4236/jtts.2011.14016 Published Online October 2011 (http://www.SciRP.org/journal/jtts)

A Personnel Detection Algorithm for an Intermodal

Maritime Application of ITS Technology for

Security at Port Facilities

Mouhammad K. Al Akkoumi, Robert C. Huck, James J. Sluss

The University of Oklahoma, Tulsa

E-mail: makkoumi@ou.edu, rchuck@ou.edu, sluss@ou.edu

Received July 26, 201 1; revised August 23, 2011; accepted September 4, 2011

Abstract

With an overwhelming number of containers entering the United States on a daily basis, ports of entry are

causing major concerns for homeland security. The disruption to commerce to inspect all containers would

be prohibitive. Currently, fences and port security patrols protect these container storage yards. To improve

security system performance, the authors propose a low cost fully distributed Intelligent Transportation Sys-

tem based implementation. Based on prior work accomplished in the design and fielding of a similar system

in the United States, current technologies can be assembled, mixed and matched, and scaled to provide a

comprehensive security system. We also propose the incorporation of a human detection algorithm to en-

hance standard security measures. The human detector is based on the histogram of oriented gradients detec-

tion approach and the Haar-like feature detection approach. According to the conducted experimental re-

sults, merging the two detectors, results in a human detector with a high detection rate and lower false posi-

tive rate. This system allows authorized operators on any console to control any device within the facility

and monitor restricted areas at any given time.

Keywords: Human Detection, Port Security, Smart Surveillance System

1. Introduction

In 2005, over 20 million sea, truck, and rail containers

entered the United States [1]. This increasing number of

containers entering the country poses higher risks for

security breaches and malicious attacks. Physical in-

spect- tion of each and every container on a daily basis

would shut down the entire economy [1]. Furthermore,

many containers coming into the country are stored at the

port for a period of time before being shipped by road,

rail, or barge to their final destination. Storing these con-

tainers in a staging area raises concerns about the secu-

rity of the containers. Thus leading to a need to have a

more efficient system to monitor and protect the port

facility and the cargo. Currently, un-queued video sur-

veillance, vehicle detection, fences and gates, and foot

patrols are the common means for port security. Using

other available technologies, a more efficient security

system can be implemented to allow uninterrupted

freight-flow operations at the port.

Human detection is a fast growing and promising

technique used in various applications to find humans in

given images. Researchers are trying to accomplish this

type of detection using methods that result in high accu-

racy and fast computation. The next sections are written

in the following sequence: sectio n II includes a literature

review of related works. Section III discusses the per-

sonnel detection algorithm and section IV covers the

experimental results of our algorithm. Section VI is a

discussion of futu re wo rk and the conclusion.

2. Background

The benefits of ITS deployments are well known: Im-

proving transportation network efficiency, enhancing

safety and security, reducing cong estion and travel delay,

reducing incident response times, and increasing the ef-

ficiency of both transportation and emergency response

agencies. Today’s typical ITS deployment is a assort-

ment of vehicle detectors, closed circuit television (CCTV)

cameras, fixed and portable message signs, highway ad-

visory radio systems, a web based traveler information

M. K. A. AKKOUMI ET AL.

124

system, weather information, and an integrated commu-

nications network that links the field hardware to system

operators, transportation managers, and emergency man-

agement agencies. In most cases, system control is im-

plemented in a centralized traffic management center that

co-locates the system operators, transportation managers,

response agencies, and their dispatchers [2,3]. Our de-

sign of a distributed, hierarchical, peer-to-peer ITS sys-

tem [4] results in a virtual centralized management cen-

ter where the various system operators, transportation

managers, and incident management agencies can remain

geographically separated throughout the State but still

enjoy most of the benefits provided by a this centralized

management center environment. Another feature of this

system is the use of off-the-shelf equipment and open-

source software to reduce development costs. Addition-

ally, by using standards based network architectures and

protocol converters to communicate with the remotely

deployed sensor devices; the software integration effort

was reduced thereby greatly reducing risks.

In this research, two human detection approaches were

used to create a joint human detector. The two ap-

proaches are the histogram of oriented gradients [5] and

the Viola and Jones approach using a cascade of weak

classifiers [6]. In [6], Viola and Jones proposed the first

approach for detecting objects in images based on Haar-

like features in 2001. This approach has been used pre-

viously to perform face detection, upper and lower body

detection, and full body detection with moderate detec-

tion results [7-9]. While face detection was introduced

first and showed very promising results; Haar-like fea-

ture detection has not shied away from being used in

many other human and object detection algorithms. The

Viola and Jones detector has been used in different ap-

plications to perform fast object recognition. One of the

drawbacks of this detector is its detection inconsistency

with an object's rotation in images.

In [10], Kolsch and Turk proposed a Viola and Jones

detector that performed hand detection with a degree of

rotation. The detector was trained using a dataset that

contained images of hands with different angles of rota-

tion. The results showed an increase of one order of

magnitude in the detection rate of the hand in input im-

age frames. A more advanced version of the Viola and

Jones approach was proposed in [11] by Mita et al. The

authors introduced a new approach for face detection

using joint Haar-like features. The joint features are lo-

cated through the co-occurrence of face features in an

image. The classifiers were then trained using these fea-

tures under adaptive boosting (Adaboost). The results

shown in the paper proved achieving faster detection

time, 2.6 times faster, with similar face detection accu-

racy. The joint Haar-like features also played into re-

duceing the overall detection error by 37% compared to

the traditional Viola and Jones approach.

3. Personnel Detection and Image Processing

Techniques

Establishing exceptionally accurate pedestrian detection

and tracking are two major hurdles facing computer vi-

sion today. Overcoming these challenges can result in

providing more secure surveillance systems to monitor

indoor and outdoor spaces. These smart systems can be

used to enhance security at ports of entry worldwide.

3.1. Haar-Like Feature Pedestrian Detector

The use of Haar-like algorithms simplifies locating all

the desired features. A feature is selected if the differ-

ence between the average dark region pixel value and the

average of the light region is higher than a preset thresh-

old. An example of HAAR features is shown in Figure 1.

As shown in the figure, the features can be used to detect

different pixel orientations throughout a defined region

of interest. A combination of a certain arrangement of

edges can then be identified as the desired object or not.

The features presented in the figure are either 2-rectangle

or 3-rectangle features. Another type of features is the

4-rectangle features that are used in other implementa-

tions of Haar-like features. The feature can be computed

quickly using integral images which are defined as

two-dimensional lookup tables and have the same size as

the input image.

The next step in the algorithm is training the machine

to be able to make decisions whether a pedestrian is pre-

sent in the image region. Adaboost is a machine learn ing

method that uses many weak classifiers to create a strong

classifier. Each weak classifier is assigned a weight to

help strengthen the overall classifier. The weak classify-

ers filter the image region as it passes through th em. If, at

any point, the region is filtered out, then the region is

considered not to have the desired object. The heavily

weighted filters come in first to make the process much

quicker and annihilate negative regions. Figure 2 shows

the overall Viola and Jones detection system.

Figure 1. Haar-like features.

125

M. K. A. AKKOUMI ET AL.

Figure 2. Viola & Jones object detection algorit hm.

The training process is a key stage to formulate strong

classifiers for the Haar-like features pedestrian detector

(HFPD). A combination of training samples is used to

formulate a cascade of classifiers to be used in the detec-

tion process. The complete process goes through 4 main

stages: data preparation, object marking and creating

object samples, training and then finally testing. The

trained detector was used for detecting a pedestrian

lower body region (mainly legs) in a given image. To

train the detector, a set of positive and negative samples

was collected. The positive samples contain on e or more

instances of the human lower body. The negative sam-

ples are the ones that contain no instances of the human

lower body and even no humans. The negative samples

were obtained from an online dataset [12]. The dataset

includes 2977 negative samples of various grey scale

backgrounds with no human or human like objects.

These images are used to train the detector to what is not

the object of interest and ultimately improves th e overall

detection ra te. The wider the range of b ackgroun ds being

used the lower the false positive rates are and the

stronger the classifier would be.

The positive samples were taken in a lab environment

with different backgrounds. Three detectors were trained

using 890, 1890 and 2890 p ositive samples, respectively.

The goal is to try various numbers of positive images and

compare the results. One might think that increasing the

number of positive samples would result in a stronger

cascade of classifiers but that’s not always the case.

There are several factors that determine the strength of

the cascade and these include but are not limited to: the

type of object being detected, the backgrounds of the

positive samples, Object rotation, and Object scaling.

The lower body samples are taken from different view-

points and appear in different poses. The illumination is

kept almost the same with minor differences. The posi-

tive samples were taken using a high definition camcor-

der with a 1280 × 720 p ixel resolution. The resolutio n for

these images is not a factor since all the images are re-

scaled during the training process. These positives will

later be used to specify where the location of the object

of interest is precisely. Various poses of the lower body

were captured to strengthen the cascade to overcome the

rotation drawback of Haar-like features. The images used

in the training process are converted to grey scale, thus

no color constraints are taken into consideration. The

next step prior to starting the training the detector is to

mark the legs in a bounding box in every positive sample

and save its coordinates. Then a vector file for the posi-

tive samples is created. This vector file is an output file

that contains information regarding the generated sam-

ples. The training process time varies according to sev-

eral factors, among these are: the number of training

samples being used, the number of stages the cascade

needs to cover, the memory allocation for the process,

and the processor speed. On average, it took between 2

to 4 hours to train the lower body detectors. Three cas-

cades were trained with 890, 1890 and 2890 positives

respectively and 2977 negatives. 50 images from the

INRIA online dataset were chosen at random for testing

[13].

3.2. HOG Pedestrian Detector

The Histogram of Oriented Gradients (HOG) detection

approach was first introduced in 2005 and focused on

detecting objects based on their edge orientations. The

HOG approach can be compared to the Scale-Invariant

Feature Transform (SIFT) approach proposed by David

Lowe in 1999 [14]. The two approaches share the same

concept of extracting unique features to help in the deci-

sion-making process of whether the target object is pre-

sent in an image. However, the HOG method segments

the image in a different way and makes use of local con-

trast normalization to improve the overall performance of

the system. Now, HOG is being used in multiple object

detection applications resulting in fast and accurate de-

M. K. A. AKKOUMI ET AL.

126



pixel is calculated based on the direction of the gradient

element at its center. According to [18], a fast way to

calculate the histograms of regions of interest is achieved

by using integral histograms.

tection [15-17]. The first step in the HOG algorithm is

gradient computation. The simplest and most efficient

way to accomplish that, as tested by Dalal and Triggs, is

to apply a 1-D, centered point, discrete derivative mask.

Applying other types of masks such as the 3 × 3 Sobel

mask doesn’t lead to better overall system performance.

The derivative mask system is defined as follows:

In order to pass the computed histograms of gradients

into a classifier, cells are organized in a 3 × 3 arrange-

ment called a block. Creating blocks helps make the al-

gorithm less susceptible to changes in illumination and

contrast. The blocks overlap in an image producing more

correlated spatial information to be used in th e descriptor,

which also improves the overall detection performance.

Figure 3 shows an example of blocks containing 9 cells

inside the detection window. The 3 × 3 and 6 × 6 blocks

worked best for Dalal and Triggs in their experimental

results and believe that varying the block size has less

effect on the detection as does overlapping the blocks.

 

 

Y i,j=X i,j+1X i,j1

Y i,j=X i+1,jX(i1,j)











The equation system contains vertical and horizontal

1D derivative masks that can be applied pixel wise to an

input image X. Y is the output image with the calculated

pixel derivatives on row i and column j. The whole im-

age is scanned to calculate each pixel orientation to be

used in computing the later histograms. The derivative

masks used can be expressed as: Also in some cases, increasing the number of cells

present in the block decreases the overall performance of

the detection system. The rectangular HOG, also known

as R-HOG, can be set with different block dimensions

but are best used in square arrangements. The R-HOG is

adopted in the tested HOG human detector presented in

this chapter. A block is represented by a multi-dimen-

sional feature vector that is used in the classification step.

Block normalization is needed to decrease the required

computation, thus L-2 normalization on the block is done

followed by a renormalization step.

























1 0 1

After calculating the gradients, the algo rithm defines a

detection window of fixed size (64 x 128 pixels) to scan

the image. The detection window is then divided into a

number of 8 × 8 pixel groups called cells, Figure 3. A

cell can be rectangular or radial in shape and can vary in

size although 6 × 6 pixel group is considered an optimal

solution for human detection. For the purpose of this Each block is normalized and used in the collected

feature vector. Using 2 × 2 cells results in having a 36

dimensional normalized feature vector, since 4-9-bin

histograms were used for the HOG detector. The final

step for the HOG algorithm is to use the feature vector as

input to a Support Vector Machine (SVM) classifier to

perform the decision making. SVM has been used by

many researchers in object detection and segmentation to

deliver a classification method for various objects in

study, the selected cells are rectangular. The next step in

this system finds a 9-bin histogram of pixel orientations

for each cell. The number of orientation bins selected

sug- gests looking at 20 degrees for each pixel orienta-

tion. The range from 0 - 180 degrees, for unsigned gra-

dients, is divided by the 9 bin orientation in which linear

gradient voting is represented. A weighted vote for each

Figure 3. HOG detection window with cells and blocks.

127

M. K. A. AKKOUMI ET AL.

input images [19-21]. Linear SVM is one of the most

common methods used for forming different classes of a

dataset. The HOG algorithm feeds the descriptor vector

to a trained linear SVM to determine human presence in

a given test image. The HOG scheme was tested and

performed extremely well on two datasets: the MIT pe-

destrian database and then on a new dataset created by

Dalal and Triggs called the INRIA dataset. A flow dia-

gram of the HOG method is shown in Figure 4.

4. System Analysis and Results

In all the conducted experiments, three rates were ob-

served: false negative rate, false positive rate, and detec-

tion rate. In this paper, these terms are defined as follows:

a false negative rate is calculated by summing the num-

ber of events where the detector missed a human present

in the image and divide it by the total number of events,

a false positive rate is the number of events where the

detector had found something that it thinks is a human

but it is not divided by the total number of events, and

the detection rate is the number of events where the de-

tector had found a human in the image divided by the

total number of events. In add itio n, in th is pap er an even t

is defined as one of three things: not detecting a human

present in the image, falsely detecting a human, and de-

tecting a human. These rates are determined subject-

tively and through a predetermined number of test im-

ages. The background in the videos for the different sce-

narios was static (i.e., fixed camera positions) to help

overcome any background noise that might affect the

detection rate.

The collected experimental results show the perform-

ance of the combined human detector compared with the

two separated detectors. The feedback system maintained

a high detection rate and decreased the false positive rate

which results in a more robust detector. Indoor and out-

door scenarios with different image resoluti ons are test ed.

4.1. Merged HOG and Haar Detectors Results in

an Indoor Scenario

The first scenario tested for the two detectors was in-

doors, as shown in Figure 5. This scenario was used

previously to test the HOG full body and the Haar leg

detector separately. The collected results showed high

detection rates in both cases and very low false positive

and negative rates. The detection rate for the Haar leg

detector was 93.8% for 210 test images and the false

positives rate was 9.5%. The HOG detector was able to

locate the human in every frame with an insignificant

false positive rate.

Figure 4. Static histogram of oriented gradients approach.

Figure 5. HOG and Haar used in an Indoor scenario.

M. K. A. AKKOUMI ET AL.

128

The test results for the indoor scenario were taken to

show both detectors activities and how the algorithm

works in different cases. For example, the first and sec-

ond frames in the above figure show complete detectio n.

The third, sixth and eighth frames show a detected hu-

man by the HOG detector and missed detection by the

Haar detector as explained in Subsection 5.3.2. The

fourth and fifth frames show two cases of HOG detection

and Haar false detection. Note that in the fourth frame

the false detected leg is the upper body and within the

region of the human. In the fifth frame, a second false

positive is shown by the Haar detecto r behind th e human.

This false positive is discarded during th e feedback mes-

saging algorithm while the other one, which is in the

human detection region, is not. The seventh frame shows

one HOG detection box and three Haar detection circles,

two of which are true detection and one false positive

that falls within the HOG box.

4.2. Merged Detectors Tested on Two Humans in

an Outdoor Scenario

The second scenario used to test the two merged detec-

tors was of two humans in an outdoor scenario . Figure 6

shows the detected false positive and negative results

for both detectors. The first frame shows two HOG

boxes for the two humans and that the Haar detector has

missed both. The second and eightth frames are the

only ones where both detectors agree on spotting both

pedestrians. In the third frame, the HOG detector finds

both humans whereas the Haar finds none and adds a

false positive.

The fifth frame shows both humans detected as one

using the HOG full body detector. In this case, only one

alert is sent. In the fourth, sixth, and seventh frames, the

HOG finds the two humans whereas the Haar detector

only finds one. Two alerts are sent out to the authorized

personnel. When tested separately using 300 test images,

the detectors showed different detection, false positive

and negative rates. The HOG outperformed the Haar

detector in the detection and negative rates by almost

20% for each. Both detectors had approximately the

same false positive rate of 6%. Using the feedback mes-

saging system, a more accurate human detector can be

established by merging the two full and part-based de-

tectors. The feedback system helps decrease the false

positive rate for the combined detector. Table 1 shows

the statistics for all three cases.

Figure 6. Results of applying both detectors in an outdoor scenario.

Table 1. Detection statistics for separated and merged detectors.

Detector Types Resolution

(in pixels) Total Number of

Test Images Detection Rate False Positive

Rate False Negative

Rate Average Detection

Time (in ms)

HOG Full Body

Detector 640 × 480 300 97.0% 5.3% 3% 790

Haar Leg Detector 640 × 480 300 77.33% 6% 22.67% 50

Merged Detector 640 × 480 300 97.0% 0.67% 3% 880

129

M. K. A. AKKOUMI ET AL.

Each of the 300 test images must ideally produce two

alerts, one for each human in the captured frame. Thus,

the expected total number of true alerts sent is 600. The

false positive rate can be decreased using information

from both detectors where the human is expected to be.

Therefore, a huge reduction in the false positive rate can

be observed. On the other hand, the negative rate stays

the same as the one for the more accurate detector, which

in this case is the HOG full body detector. The final de-

tection rate for the merged detector is 97%. The detec-

tion time for the final detector is approximately the sum

of the detection time of both detectors in addition to a

small margin taken for the feedba ck messaging system.

4.3. Merged Detectors Tested on Multiple

Humans in an Outdoor Scenario

The last scenario investig ated has multiple humans walk-

ing in an outdoor scene. Again, the two detectors are

applied on several test frames to determine subjectively

the false positive, false negative and detection rates.

Figure 7 shows the results of merging the two detectors.

As expected, the HOG detector produced a detection rate

higher than that of the Haar leg detector. The HOG de-

tection rate was 93.5% while the Haar had a detection

rate of 62.8% for 300 test images. The false positive rate

in both cases was less than 3%. Note that the Haar leg

detector was not able to find all four pedestrians in the

test images. This is due to the training dataset that only

included one instance of the target object for each image.

In this scenario, four humans are walking around and

at times partially or fully occluding one another. Table 2

shows the detection, false positive and negative rates in

addition to the average detection time for each detector.

The detection time is higher than the previous scenario

due to an increase in the video resolution from 640 × 480

to 848 × 480 pixels. The system requires just over a sec-

ond to determine whether one or more humans are pre-

sent in frames of size 848x480 pixels. Ideally, the num-

ber of produced alerts should be 1200, but in this case,

the 300 test images contained 1, 2, 3 or 4 humans per

frame. The total number of expected alerts is 838 alerts.

Note that the detection rate for the merged detector is not

much higher than that of the fu ll body detector du e to the

high negative rate that was not decreased. On the other

hand, the false positive rate was taken out by the feed-

back messaging system. The false positives from both

detectors were not in the same location and also did not

correspond with the location of the moving object given

by the tracker.

Table 2. Detection statistics for multiple human separate and merged detectors.

Detector Resolution

(in pixels) Total Number of

Test Frames Detection

Rate False Positive

Rate False Negative

Rate Average Detection

Time (in ms)

HOG Full Body Detector 848 × 480 300 93.5% 1.1% 6.5% 1010

Haar Leg Detector 848 × 480 300 62.8% 2.7% 37.2% 70

Merged Detector 848 × 480 300 93.5% 0% 6.5% 1150

Figure 7. Results of applying both detectors for multiple human detection.

M. K. A. AKKOUMI ET AL.

130

5. Future Work and Conclusions

reason for the

ents

nded by a grant from

the Federal Highway Administration and the program is

administered by the Oklahoma Department of Transpor-

ategy for maritime security,” 2010.

s.gov/xlibrary/assets/HSPD13_MaritimeSe

dson

04.13

Th size of the input image is the main e

slower detection time, and is due to a bigger number of

visited detection windows required for detection. The

authors believe that downsampling the images can help

decrease the detection time to fit in a model for real-time

or near real-time pedestrian detection. Additionally,

work accomplished with General Purpose Graphical

Processing Units (GPGPUs) indicates processing speed

increases with this kind of application. Based on the ex-

perimental results collected thus far, the authors believe

that combining the two detectors in addition to pre-

processing with an object tracker would result in a robust

personnel detection and tracking system. The first stage

in the system is the tracking stage. The object tracker

identifies moving silhouettes in the video capture and

alerts the user of a potential threat. The second stage in-

cludes the HOG full bod y detector that looks at the loca-

tion of the moving object and determines whether it is a

human or not. The third stage introduces the Haar-like

feature pedestrian detector that tries to find upper and

lower human body regions. The fourth stage starts the

feedback messaging between the detectors to decide

whether the detected region actually contains a pedes-

trian or it’s a false positive. After several iterations, the

system converges and the detection results are collected.

The results collected in this paper are based on several

training and testing data sets. This helps establish a more

generalized solution to the presented challeng es. The two

stages complement one another in such a way that the

detection system is much stronger than the current sys-

tems. The Viola and Jones approach is not a computa-

tionally heavy approach and provides object detection at

different scales and backgrounds. Thus, the feedback

stage will help improve the detection rate without slow-

ing down the overall system. In this paper, we proposed

a low cost distributed ITS-based smart sur- veillance

security system for port security. This system is very

scalable and provides improvements to a major inter-

modal maritime application. Using image processing

techniques security can be enhanced to capture unau-

thorized personnel in restricted areas. Port security op-

erators can rely on alerts produced by the pedestrian de-

tection and tracking system as well as the container

tracking devices to assess port security. These systems

complement the overall security system and integrate

well as building blocks. This security approach can be

used in various applications and sites to improve overall

security nationwide.

6. Acknowledgem

This research program is partially fu

tation. The project is federally sponsored under the

SAFETEA-LU transportation authorization act. This

program is a five year program started in October 2005.

This specific research is part of the Phase III tasks and

deliverables.

7. References

[1] “The national str http://www.dh

curityStrategy.p

[2] R. H. Brakman and J. J. Limarzi, “ITS at the Hu

Valley Transportation Management Center,” IEEE Intel-

ligent Systems, Vol. 19, No. 3, 2004, pp. 8-12.

doi:10.1109/MIS.20

onference,

9/NTC.1993.293012

[3] M. J. Kelly, D. J. Folds and N. Sobhi, “ATMS 2000:

Hybrid Automation or a Lights Out Traffic Management

Center?” Proceedings National Telesystems C

1993, pp. 37-42. doi:10.110

nce In-

p. 886-893.

-511-I-518.

72-76.

[4] R. Huck, J. Havlicek, J. Sluss, Jr. and A. Stevenson, “A

Low-Cost Distributed Control Architecture for Intelligent

Transportation Systems Deployment in the State of Okla-

homa,” Proceedings IEEE International Confere

telligent Transportation Systems, Vienna, Austria, 2005,

pp. 919-924.

[5] N. Dalal, and B. Triggs, “Histograms of Oriented Gradi-

ents for Human Detection,” IEEE Computer Society Con-

ference on Computer Vision and Pattern Recognition,

Vol. 1, 2005, p

[6] P. Viola and M. Jones, “Rapid Object Detection Using a

Boosted Cascade of Simple Features,” IEEE Computer

Society Conference on Computer Vision and Pattern Rec-

ognition, Vol. 1, 2001, pp. I

[7] L. Yun and Z. Peng, “An Automatic Hand Gesture Rec-

ognition System Based on Viola-Jones Method and

SVMs,” IEEE International Workshop on Computer Sci-

ence and Engineering, Vol. 2, 2009, pp.

doi:10.1109/WCSE.2009.769

[8] T. Ephraim, T. Himmelman and K. Siddiqi, “Real-Time

Viola-Jones Face Detection in a Web Browser,” IEEE

Canadian Conference on Computer and Robo

2009, pp. 321-328. t Vision,

/CRV.2009.48doi:10.1109

es, 2010,

[9] D. Hefenbrock, J. Oberg, N. T. N. Thanh and R. Kastner,

“Accelerating Viola-Jones Face Detection to FPGA-

Level Using GPUs,” IEEE International Symposium on

Field-Programming Custom Computing Machin

pp. 11-18. doi:10.1109/CRV.2009.48

[10] M. Kolsch and M. Turk, “Analysis of Rotational Robust-

ness of Hand Detection with a Viola-Jones Detector,”

Proceedings International Conference on Pattern Recog-

nition, Santa Barbara, Vol. 3, 2004, pp. 107-110.

[11] T. Mita, T. Kaneko and O. Hori, “Joint Haar-like Fea-

tures for Face Detection,” IEEE International Conference

on Computer Vision, Beijing, Vol. 2, 2005, pp.

M. K. A. AKKOUMI ET AL. 131

inrialpes.fr/data/human/

150-1157.

1619-1626.

[12] N. Seo, “Tutorial: OpenCV Haartraining,” 2008.

http://note.sonots.com/SciSoftware/haartraining.html

[13] N. Dalal, “INRIA Person Dataset,”

http://pascal.

[14] D. Lowe, “Object Recognition from Local Scale-In-

variant Features,” IEEE International Conference

Computer Vision, Vol. 2, 1999, pp. 1

doi:10.1109/ICCV.1999.790410

[15] H. X. Jia and Y. J. Zhang, “Fast Human Detection by

Boosting Histograms of Oriented Gradients,” IEEE In-

ternational Conference on Image and Graphics, 2

683-688. 007, pp.

ter Vision and Pattern Recognition, Vol. 2,

[16] Q. Zhu, M. C. Yeh, K. T. Cheng and S. Avidan, “Fast

Human Detection Using a Cascade of Histograms of

Oriented Gradients,” IEEE Computer Society Conference

on Compu

2006, pp. 1491-1498.

[17] C. Zhou, L. Tang, S. Wang and X. Ding, “Human Detec-

tion Based on Fusion of Histograms of Oriented Gradi-

ents and Main Partial Features,” International Congress

on Image and Signal Processing, 2009, pp. 1-5.

doi:10.1109/CISP.2009.5304536

[18] F. Porikli, “Integral Histogram: A Fast Way to Extract

Histograms in Cartesian Spaces,” IEEE Compute

Conference on Computer Vision ar Society

nd Pattern Recognition,

on Information Processing,

ages Using Genetic Algorithms and

IEEE

Vol. 1, 2005, pp. 829-836.

[19] H. Li and J. Cao, “Detection and Segmentation of Mov-

ing Objects Based on Support Vector Machine,” IEEE

International Symposium

2010, pp. 193-197.

[20] E. Pasolli, F. Melgani, M. Donelli, R. Attoui and M. De

Vos, “Automatic Detection and Classification of Buried

Objects in GPR Im

Support Vector Machines,” IEEE Geoscience and Remote

Sensing Symposium, Vol. 2, 2008, pp. II-525-II528.

[21] G. Zhu, C. Xu, Q. Huang and W. Gao, “Automatic Multi-

Player Detection and Tracking in Broadcast Sports Video

using Support Vector Machine and Particle Filter,”

International Conference on Multimedia and Expo, 2006,

pp. 1629-1632. doi:10.1109/ICME.2006.262859