Hand Gestures Recognition Based on One-Channel Surface EMG Signal
1. Introduction
Hand gestures are involved in many aspects of human life, including games, medical care, education and shopping. Hand gesture recognition refers to the process by which a computer automatically detects and analyzes measured bioelectrical signals in order to identify the hand gestures of an individual. It is popular worldwide and has been studied extensively. Recently, hand gesture recognition based on electromyography (EMG) has been put forward because of its convenience and low cost.
EMG is the superposition of the action potentials of the muscle tissue occurring during a voluntary contraction. It provides information on the flexion and extension of the muscles as well as the shape and position of the limbs during the completion of the movement [1]. Hand gesture recognition based on EMG has several advantages over optical detection: the design of the EMG sensor is relatively simple, the sensor is not sensitive to the environment, and the signal processing for EMG is not very complicated. However, EMG-based hand gesture recognition suffers from limited accuracy, and addressing this problem is the motivation of this paper.
This paper focuses on an experiment in which EMG signals are used to recognize two hand gestures. On that basis, we aim to improve the recognition accuracy by optimizing the algorithm. We use a surface electromyography sensor (OpenBCI Ganglion) to capture muscle activity on the skin surface over the corresponding muscle group. We perform experiments to find the best placement of the electrodes and to obtain the most reliable data. MATLAB is used to analyze and process the signals and to distinguish the EMG generated by different gestures. The accuracy is evaluated by performing additional tests.
Starting from the methods adopted throughout the research, including signal processing and classification, the rest of the paper explains the experiment in detail, analyzes and discusses the results, and draws conclusions based on the findings. Several novel ideas are also demonstrated in the following sections.
2. Methodology
2.1. Data Acquisition
Only one channel of the OpenBCI device was used to collect data, and three electrodes were placed on the right arm of the subject. Two of them were attached to the 1+/1− pins and the third was attached to the bottom pin, working as a common ground. During the process, we found that the results varied with the placement of the electrodes. The EMG is produced by changes in electric potential generated by muscle cells. As a result, gestures produced by different muscles generate distinguishable potentials. The optimal position of the electrodes was determined by performing several trials of the acquisition experiment (shown in Figure 1). In addition, we found that the testing environment is another factor that needs to be controlled to obtain high-quality raw data.
2.2. sEMG Signal Processing
The recognition of the gesture sEMG signals consists of three parts: data pre-processing, feature extraction and classification. For data pre-processing, a Butterworth filter and a segmentation algorithm are used to obtain cleaner data and to reduce the amount of data. The feature extraction algorithm compresses the sEMG signal segments into feature vectors. The features are designed to emphasize the gesture-class-specific characteristics of the sEMG signal. The classifier is trained with the feature vectors to distinguish the different gestures from each other with high accuracy.
1) Data Pre-Processing
a) Filtering
We use a Butterworth filter to process the original signal. A Butterworth filter is a signal processing filter with a flat frequency response curve in the passband (shown in Figure 2) [2]. In this experiment, the passband was chosen from 20 Hz to 90 Hz, with a 3 dB passband ripple and 40 dB stopband attenuation.
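As a reminder of what "flat in the passband" means, the standard magnitude response of a low-pass Butterworth prototype can be written as below. The symbols $n$ (filter order) and $\omega_c$ (cutoff frequency) are introduced here for illustration; the order used in the experiment is not stated in the text.

```latex
|H(j\omega)|^{2} = \frac{1}{1 + \left(\dfrac{\omega}{\omega_c}\right)^{2n}},
\qquad
|H(j\omega_c)| = \frac{1}{\sqrt{2}} \approx -3\ \mathrm{dB}
```

The gain falls monotonically with frequency, and the attenuation at the cutoff is 3 dB for any order $n$. A band-pass design such as the 20 Hz to 90 Hz filter above is obtained from this prototype by a frequency transformation.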
As the amplitude-frequency plots obtained with the FFT show (see Figure 3 and Figure 4), the low-frequency content, which is unrelated to the EMG signal, has been filtered out.
Figure 1. The placement of the four-channel surface EMG electrodes is shown at the bottom. The gestures used for detection are shown at the top.
Figure 2. The amplitude-frequency figure of a low-pass Butterworth filter.
Figure 3. The amplitude-frequency figure of the original signal.
Figure 4. The amplitude-frequency figure of the filtered signal.
b) Detection of Gesture Action Segment
In the beginning, we planned to use the sliding window method based on the calculation of energy in order to identify the start and the end of the segments.
We have:
$$E(t) = \sum_{i=t-W}^{t+W} x(i)^{2} \tag{1}$$
where $W$ is half of the window length, $x(t)$ refers to the amplitude of the EMG signal at time $t$, and $E(t)$ represents the energy value of the signal at time $t$. If, from a moment $t_0$ onward, a series of $E(t)$ values is consistently larger than $A$ (a certain threshold value), then $t_0$ is considered to be the start of an action segment. The end of the action can be recognized using the same method.
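The fixed-threshold scheme described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: the function names, the `min_run` parameter (how many consecutive samples must exceed the threshold) and the toy signal are assumptions made for the example.

```python
def window_energy(x, half_width):
    """Sum of squared samples in a window of half-width `half_width` around each sample."""
    n = len(x)
    energy = []
    for t in range(n):
        lo = max(0, t - half_width)          # clip the window at the signal edges
        hi = min(n, t + half_width + 1)
        energy.append(sum(v * v for v in x[lo:hi]))
    return energy


def find_segments(energy, threshold, min_run):
    """Return (start, end) pairs where energy exceeds `threshold` for at least
    `min_run` consecutive samples. `end` is exclusive."""
    segments = []
    start = None
    for t, e in enumerate(energy):
        if e > threshold:
            if start is None:
                start = t                    # candidate start of an action segment
        else:
            if start is not None and t - start >= min_run:
                segments.append((start, t))
            start = None
    if start is not None and len(energy) - start >= min_run:
        segments.append((start, len(energy)))
    return segments


# Toy signal: silence, a 20-sample burst of unit amplitude, silence.
signal = [0.0] * 20 + [1.0, -1.0] * 10 + [0.0] * 20
energy = window_energy(signal, half_width=2)
segments = find_segments(energy, threshold=2.5, min_run=5)
```

On this toy signal the detected segment coincides with the burst. With real recordings the result depends strongly on the chosen threshold.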
However, we found that it was difficult to determine the value of the threshold energy for each action segment, since the signals of different actions vary widely. No single threshold could be found that recognizes all pulses. Therefore, we use algorithms based on the MATLAB toolbox function hilbert to find a dynamic threshold and solve this problem.
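The Hilbert transform is a specific linear operator that is given by the convolution with the function $1/(\pi t)$ (see (2)) [3]. It is related to the actual data by a 90-degree phase shift.
$$H(u)(t) = \frac{1}{\pi}\,\mathrm{p.v.}\!\int_{-\infty}^{\infty} \frac{u(\tau)}{t-\tau}\,d\tau \tag{2}$$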
The toolbox function hilbert in MATLAB computes the Hilbert transform for a real input sequence x and returns an analytic signal of the same length, y = hilbert(x), where the real part of y is the original real data and the imaginary part is the actual Hilbert transform [4]. The magnitude of the analytic signal is the envelope of the original signal.
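To illustrate how an envelope follows from the analytic signal, the sketch below builds the analytic signal with a direct discrete Fourier transform. It is a slow, standard-library-only stand-in for a library routine such as MATLAB's hilbert, written for clarity; the function name `analytic_signal` and the test tone are assumptions made for the example.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence, computed with a direct O(n^2) DFT."""
    n = len(x)
    # Forward DFT of the real input.
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and the Nyquist bin for even n), double the positive
    # frequencies and zero the negative ones.
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0
        elif k < (n + 1) // 2:
            gain = 2.0
        else:
            gain = 0.0
        spectrum[k] *= gain
    # Inverse DFT gives the complex analytic signal.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


# A unit-amplitude cosine with exactly 4 cycles in 64 samples.
n = 64
tone = [math.cos(2 * math.pi * 4 * t / n) for t in range(n)]
analytic = analytic_signal(tone)
envelope = [abs(z) for z in analytic]   # magnitude = envelope
```

For this pure tone the envelope is constant and equal to the amplitude, while the real part of the analytic signal reproduces the input.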
However, after determining the dynamic threshold and the beginning and end of the actions, we discovered that, because some signals are unstable, one action can be recognized as two when the signal suddenly attenuates (see Figure 5). Therefore, we improved the envelope extraction with an algorithm of “dilation” and “erosion” (see Figure 6).
Figure 5. The mistake in recognition when detecting the action segment
Figure 6. The detection of activity after the improvement with an algorithm of “dilation” and “erosion”.
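One common way to combine dilation and erosion is a morphological closing, which fills short gaps inside an active region. The sketch below applies it to a binary activity mask. Whether the authors operate on a mask or directly on the envelope, and which window size they use, is not stated, so the `radius` parameter and the helper names are assumptions.

```python
def dilate(mask, radius):
    """A sample becomes active if any sample within `radius` of it is active."""
    n = len(mask)
    return [any(mask[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def erode(mask, radius):
    """A sample stays active only if all samples within `radius` of it are active."""
    n = len(mask)
    return [all(mask[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def close_gaps(mask, radius):
    """Dilation followed by erosion: fills gaps of up to 2 * radius samples."""
    return erode(dilate(mask, radius), radius)


def count_runs(mask):
    """Number of separate active segments in the mask."""
    return sum(1 for i, v in enumerate(mask) if v and (i == 0 or not mask[i - 1]))


# One action whose activity mask is split by a 2-sample dropout.
mask = [False] * 5 + [True] * 6 + [False] * 2 + [True] * 6 + [False] * 5
closed = close_gaps(mask, radius=1)
```

Here the original mask contains two segments and the closed mask contains one, with the same outer boundaries as before.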
2) Feature Extraction
Feature extraction is carried out on the sEMG signals to preserve the signal patterns that distinguish different gestures. In our experiment, the Mean Absolute Value (MAV), the Root Mean Square (RMS), the variance and the Auto-Regressive (AR) model coefficients are used to model the different sEMG signals and are then input to the classifier. A short description of the calculation is given below.
Mean Absolute Value (MAV) is defined as:
$$\mathrm{MAV} = \frac{1}{N}\sum_{i=1}^{N} |x_i| \tag{3}$$
where $N$ denotes the length of the signal and $x_i$ represents the EMG signal in a segment [5].
Variance: the general equation used to find the variance is:
$$\mathrm{VAR} = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^{2} \tag{4}$$
where $\bar{x}$ is the mean value of the EMG signal, $N$ denotes the length of the signal and $x_i$ represents the EMG signal in a segment.
Root Mean Square (RMS) can be expressed as:
$$\mathrm{RMS} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^{2}} \tag{5}$$
where $N$ denotes the length of the signal and $x_i$ represents the EMG signal in a segment.
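The three time-domain features above can be computed directly from a segment. This is a minimal sketch: the function names are illustrative, and the variance is assumed to use the sample normalization $1/(N-1)$.

```python
import math


def mean_absolute_value(x):
    """MAV: average of the absolute sample values."""
    return sum(abs(v) for v in x) / len(x)


def variance(x):
    """Sample variance with 1/(N-1) normalization (an assumption of this sketch)."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / (len(x) - 1)


def root_mean_square(x):
    """RMS: square root of the mean squared sample value."""
    return math.sqrt(sum(v * v for v in x) / len(x))


# Toy segment, not measured data.
segment = [1.0, -2.0, 3.0, -4.0]
mav = mean_absolute_value(segment)
var = variance(segment)
rms = root_mean_square(segment)
```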
Auto Regressive (AR) model coefficients: the AR model of a signal can be expressed as:
$$x(n) = \sum_{k=1}^{p} a_k\, x(n-k) + e(n) \tag{6}$$
where $a_k$ refers to the coefficient of the $k$th order of the signal, $p$ refers to the order of the model, and $e(n)$ is the white noise.
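In this experiment, we use the Burg method to fit a fourth-order AR model and take its four coefficients as a feature of the signal. Together with the MAV, the variance and the RMS, this gives a feature vector with seven elements.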
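The Burg recursion can be sketched in a few lines. The code below is a textbook-style implementation written for illustration, not the MATLAB routine used in the experiment. It assumes the sign convention $x(n) = \sum_k a_k x(n-k) + e(n)$, and the synthetic AR(2) test signal and the name `burg_ar` are assumptions of the example.

```python
import random


def burg_ar(x, order):
    """Estimate AR coefficients a_1..a_p with Burg's method.

    Returned coefficients follow x[n] = sum_k a_k * x[n-k] + e[n].
    """
    ef = list(x[1:])     # forward prediction errors
    eb = list(x[:-1])    # backward prediction errors, delayed by one sample
    a = [1.0]            # prediction-error filter 1 + c_1 z^-1 + ...
    for m in range(order):
        # Reflection coefficient minimizing forward + backward error power.
        num = -2.0 * sum(f * b for f, b in zip(ef, eb))
        den = sum(f * f for f in ef) + sum(b * b for b in eb)
        k = num / den
        # Levinson update of the prediction-error filter.
        padded = a + [0.0]
        a = [padded[i] + k * padded[m + 1 - i] for i in range(m + 2)]
        # Update the error sequences and drop one sample at each end.
        new_ef = [f + k * b for f, b in zip(ef, eb)]
        new_eb = [b + k * f for f, b in zip(ef, eb)]
        ef = new_ef[1:]
        eb = new_eb[:-1]
    # Flip the sign to match the convention stated in the docstring.
    return [-c for c in a[1:]]


# Synthetic AR(2) process: x[n] = 0.75 x[n-1] - 0.5 x[n-2] + noise.
rng = random.Random(0)
x = [0.0, 0.0]
for _ in range(4000):
    x.append(0.75 * x[-1] - 0.5 * x[-2] + rng.gauss(0.0, 1.0))
coeffs = burg_ar(x, order=4)
```

With a fourth-order fit to this second-order process, the first two estimates should lie close to 0.75 and -0.5, and the remaining two close to zero.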
3) Classification
In sEMG pattern recognition, different algorithms have been used to classify the feature vectors. Neural-network classifiers are widely used because of their expandability and their ability to handle both simple and complex cases. However, the choice of features and the time constraints make this kind of classifier excessively complex. Others prefer the fuzzy logic approach, which allows user experience to be incorporated into the system and tolerates contradictions as patterns change [6]. In our experiment, after extracting a 7-dimensional feature vector of the static hand gesture, we first selected the k-nearest neighbors (KNN) algorithm because of its simplicity and practicability. KNN classifies an object by a plurality vote of its neighbors, assigning the object to the class most common among its k nearest neighbors. However, the classification usually requires a large amount of computation, because the distance between every testing point and all other points must be calculated to find its k nearest neighbors.
For that reason a Support Vector Machine (SVM) was chosen, which provides high accuracy in calibration and classification. SVM is a discriminative classifier formally defined by a separating hyperplane. It is widely used in binary classification, since it benefits from the structural risk minimization principle and avoids overfitting. When only limited training data are available, SVM usually outperforms traditional parameter estimation methods, and it effectively avoids the overfitting problem when there is only a small number of data samples [7].
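The KNN voting rule described above can be sketched as follows. The two-dimensional feature values are made up for illustration (the experiment uses seven features), and the labels GTR1 and GTR2 simply name the two gesture classes.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Assign `query` to the most common label among its k nearest training points."""
    # Distance from the query to every training sample: the costly step.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors for the two gestures.
train_x = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
           (0.90, 0.80), (0.80, 0.95), (0.85, 0.90)]
train_y = ["GTR1", "GTR1", "GTR1", "GTR2", "GTR2", "GTR2"]

label_a = knn_predict(train_x, train_y, (0.2, 0.2), k=3)
label_b = knn_predict(train_x, train_y, (0.8, 0.8), k=3)
```

Every prediction scans the whole training set, which is the computational cost mentioned above.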
The objective of the SVM algorithm is to find a decision boundary in an N-dimensional space that distinctly classifies the data points (Figure 7). Maximizing the margin between the two classes provides some reinforcement, so that future data points can be classified with more confidence. The ultimate goal is therefore to find the hyperplane that has the maximum margin.
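Suppose there are two gestures, GTR1 and GTR2, to be classified. We denote the training set with $n$ samples as:
$$T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\} \tag{7}$$
where $x_i$ represents a feature vector of a gesture and $y_i$ is its class label:
$$x_i \in \mathbb{R}^{7}, \quad y_i \in \{+1, -1\} \tag{8}$$
A separating plane can be written as:
$$w \cdot x + b = 0 \tag{9}$$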
The remaining problem is to find the maximum margin, which is equivalent to minimizing the norm of w; this can be solved with the help of a kernel function [8].
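To make the margin idea concrete, the sketch below trains a linear soft-margin SVM by subgradient descent on the regularized hinge loss. This is a simplified stand-in chosen for readability, not the kernel SVM used in the experiment; the hyperparameters `lam`, `lr` and `epochs`, the function names and the toy data are all assumptions.

```python
def train_linear_svm(xs, ys, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * |w|^2 + mean(max(0, 1 - y * (w.x + b))) by subgradient descent."""
    dim = len(xs[0])
    n = len(xs)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]      # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(xs, ys):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:                 # sample inside the margin or misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_predict(w, b, x):
    """Side of the hyperplane w.x + b = 0 on which x lies, as +1 or -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable data: +1 for one gesture, -1 for the other.
xs = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.0),
      (-1.0, -2.0), (-2.0, -1.0), (-2.0, -3.0), (-3.0, -2.0)]
ys = [1, 1, 1, 1, -1, -1, -1, -1]
w, b = train_linear_svm(xs, ys)
predictions = [svm_predict(w, b, x) for x in xs]
```

The regularization term keeps the norm of w small, which corresponds to a wide margin, while the hinge term penalizes samples that fall inside the margin.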
The first hundred repetitions of each gesture were used as training samples to train the classifier, and the rest of the gesture samples were used for testing. Recognition results for the two classifiers are given in Table 1 and Table 2. In the first column, the figures in parentheses are the total numbers of test samples and the figures on their left are the numbers of correct recognitions. Accuracy is defined as
$$\mathrm{Accuracy} = \frac{N_c}{N_c + N_w} \times 100\%$$
where $N_c$ is the number of correct recognitions and $N_w$ is the number of wrong recognitions. Time refers to the total time used to train the model.
Table 1. Results of using SVM algorithm in classification.
Table 2. Results of using KNN algorithm in classification.
Figure 7. Illustration of SVM algorithm. Solid line is the hyperplane to distinguish two groups.
From the tables, we can see that the average recognition rate of SVM is higher than that of the KNN algorithm. The time needed to train the SVM model is also shorter than for KNN. Our results show that, for a relatively small amount of data [5], SVM is the better choice of classification algorithm.
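The accuracy measure, correct recognitions divided by all recognitions, can be checked with a short calculation. The counts below are not read from the tables: they are inferred from the 97.5% and 100% accuracies with 40 test samples per gesture reported in the Conclusions.

```python
def accuracy(n_correct, n_wrong):
    """Recognition accuracy in percent."""
    return 100.0 * n_correct / (n_correct + n_wrong)


# Inferred counts: 39 of 40 correct for one gesture, 40 of 40 for the other.
acc_first = accuracy(39, 1)
acc_second = accuracy(40, 0)
```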
3. Experiment
1) Experiment Subject
One of our group members volunteered to be the subject of the experiment. He is a healthy 20-year-old male, 188 cm tall and weighing 90 kg. He has no history of neuromuscular or joint disease and is right-handed.
2) Experiment Device
OpenBCI is an open-source brain-computer interface platform. The OpenBCI Ganglion board has 4 channels and can be used to measure and record electrical activity produced by the brain (EEG), muscles (EMG), and heart (EKG).
3) Task
The subject sat comfortably on a chair and was relaxed. He then performed two hand gestures, each repeated more than 60 times. Each gesture lasted about 1 second, with an interval of 1 to 2 seconds between gestures.
4. Conclusions
We carried out this experiment to explore plausible ways to identify hand gestures with sEMG signals and to compare different classification algorithms. After using one channel of the OpenBCI board to collect sEMG signals of two different hand gestures, we processed the signal and extracted four features to generate feature vectors for classification. We then classified the signals with two different algorithms, KNN and SVM. The overall result of gesture recognition met our expectations, reaching an accuracy of 97.5% for one gesture and 100% for the other with 40 test samples for each gesture. We also compared the two selected algorithms. From the results, it can be concluded that in this case SVM achieves a higher recognition accuracy than KNN, along with a shorter training time.
However, our experiment still needs improvement in some aspects. To begin with, since we did not have a deep understanding of the KNN and SVM classification methods, we have not optimized the algorithms by adjusting their parameters. The results can be further improved in the future by tuning the two algorithms.
In addition, we did not design the hardware. The whole device is not wearable and is inconvenient to move around, which is a limitation for people in need of help.
All in all, recognition of hand gestures based on sEMG signals is, though complicated and challenging, an efficient way to identify gestures and help people in need. Further study should include optimizing the classification algorithm by adjusting its parameters and developing the hardware.
Acknowledgements
We would like to express our sincere gratitude to Professor Jan Van der Spiegel and our teaching assistant Susan Chen for their guidance. Many thanks also go to Yubin Qin and his group for their kind help during the experiment.