Detection of Learner's Concentration in Distance Learning System with Multiple Biological Information

Distance learning has grown year by year thanks to the rapid advancement of information and communication technologies. A distance learning system can be regarded as a ubiquitous computing application, since learners can study anywhere, even in mobile environments. However, because each learner is physically isolated, the instructor cannot know whether the learners comprehend the lecture. Therefore, a framework that detects the learners' concentration condition is required. If a distance learning system finds that many learners are not concentrating on the class because of an incomprehensible lecture style, the instructor can perceive this through the system and change the presentation strategy. This is a context-aware technology of the kind widely used in ubiquitous computing services. In this paper, an efficient distance learning system that accurately detects learners' concentration condition during a class is proposed. The proposed system uses multiple kinds of biological information, namely the learners' eye movement metrics (fixation counts, fixation rate, fixation duration and average saccade length) obtained by an eye tracking system. The learners' concentration condition is classified using machine learning techniques. The proposed system achieved a detection accuracy of 90.7% when a Multilayer Perceptron was used as the classifier. In addition, the effectiveness of the proposed eye metrics was confirmed. Furthermore, the evaluation experiment showed that fixation duration is the most important of the four eye metrics.


Introduction
A distance learning system can be regarded as a ubiquitous computing application because it frees learners from fixed learning places and makes education available anywhere. Assessment of learners' condition in distance learning systems has gained a lot of attention in Human Computer Interaction (HCI) research. Basically, an instructor needs to ensure the learners' participation in the learning process and assist them based on their condition during the class. Recently, various studies have investigated learners' emotions as one of the factors that affect their learning process. For instance, if a learner is stressed or bored, he/she may not learn well. One of the promising approaches to detecting learners' emotions is the use of biological signals [1]. In fact, biological signals such as eye movements, heart rate and skin temperature are controlled without conscious effort [2]. Thus, this technique is considered one of the best methods to detect learners' emotions [3]. If a learner takes a class with such a negative emotion, he/she may not concentrate well on the ongoing lecture.
In the current literature, evaluation of learners' concentration with real time feedback to the instructor in a distance learning system is missing. Generally, in distance learning environments, it is difficult for an instructor to know whether the learners understand the lecture. If the learners have not understood the lecture by the end of the session, the instructor's effort and instructional ideas are wasted. Hence, this is considered a crucial issue in distance learning systems.
To overcome this issue, different approaches such as online questionnaires [4] and facial recognition techniques [5] have been introduced in the past few decades. However, with these approaches, learners can hide information that might be useful for the instructor. For instance, Jaffar et al. [4] claimed that, with the questionnaire approach, the majority of learners provided positive information in order to satisfy their instructors. Moreover, the learners believed that evaluating their instructors based on their performance might affect them academically, especially in the final examination assessment. Thus, it is necessary to design a system that can recognize learners' condition and notify the instructor when the learners are not concentrating on the lecture.
The main goal of this work is to accurately detect learners' concentration in distance learning systems based on the learners' condition (for instance, if an instructor can obtain information about the learners' concentration, he/she may adjust the lecture skills). In the context of this study, "concentration" is defined as the situation where the brain can process information effectively. On the other hand, "not concentrated" is the situation where the brain cannot process information effectively. Concentration may be influenced by different factors such as content style, the instructor's lecture skills, health condition, and so on. Of these, this study focuses on two factors: the content style and the instructor's lecture skills. In our previous study [6], the influence of the content style on the learners' concentration was investigated (e.g. some learners prefer content with only text while others prefer text and figures). That study found that the majority of learners prefer a content style that includes both pictures/figures and text. Moreover, they declared that content with only text makes them bored and causes them to lose concentration easily.
Hence, to achieve our goal, this paper proposes a system that detects learners' concentration in distance learning systems using multiple biological information. The proposed system employs an eye tracking system. First, during a class session, the learners' eye metrics are recorded by the eye tracking system. Secondly, the eye tracking data are analyzed by considering two states, namely "concentrated" and "not concentrated". Finally, if the number of learners who are not concentrated exceeds a threshold, a notification is sent to the instructor's computer display. Based on the notification, the instructor can immediately adjust the content style and instructional strategy for the learners. This technique is expected to enhance the learners' concentration and make the session more interesting.
The rest of this paper is organized as follows. Section II reviews the works related to this study. Section III explains the proposed system in detail. Section IV presents the evaluation of the proposed approach through experimental results, which are discussed in Section V. Finally, Section VI concludes the paper and provides directions for future work.

Related Works
In the past few years, there has been a debate about the best approach to assessing learners' behaviors during learning. Some studies pointed out emotion as the main factor that affects learners' performance in the learning process [7] [8]. Until now, few studies have addressed reliable approaches to detecting such emotions. Among the various proposed approaches, facial recognition techniques [9], biological signals and machine learning techniques [10] have been studied.
Hwang and Yang [11] deployed an image recognition approach to capture face images of learners during learning, and analyzed their facial features to evaluate their effective states. They detected an effective state when both eyes of a learner were detected, and inferred an ineffective state when both eyes were closed. However, a learner whose eyes are closed is not always inattentive.
Charoenpit and Ohkura [1] proposed a new e-learning system focusing on an affective aspect, which integrates three different biological sensors: an Electroencephalogram (EEG) sensor, an Electrocardiogram (ECG) sensor and an eye tracking system. Their study deployed the EEG sensor to record the electrical activity of the learners' brains and the ECG sensor to record the electrical activity of the learners' hearts during a class session. Using the proposed system, they analyzed learners' emotions such as boredom, anxiety and anger during learning, and designed the environment to avoid such emotions. However, their study did not discuss the instructor's lecture skills as a factor that may raise learners' negative emotions in the lecture.
Krithika and Priya [12] proposed a system to detect learners' concentration level by continuously monitoring their head rotation and eyelid status (whether the eyelid is closed or open). Moreover, they considered facial features to detect whether the learners are focusing on the visual content, and used head movement to determine the learners' concentration level. Their method was intended to detect the learners' concentration level only for visual content. Moreover, when facial features and eyelid status are used, the detection precision is questionable because some learners tend to close their eyes when they are thinking something through. In that case, it is hard to judge that their concentration level is low.
Hwang et al. [13] proposed a formative assessment approach using a data mining technique that integrates six computational intelligence schemes in order to determine learners' learning behaviors and performance. However, their proposed system focused on asynchronous e-learning, where there is no interaction between learners and an instructor. Therefore, the detected results cannot be used to improve the lecture in real time.
Chen and Sun [14] proposed a system to evaluate learners' emotions and performance by providing different types of multimedia materials such as text, image and video. Their goal was to examine how learning materials may affect learners' performance as well as their emotions. They concluded that video based multimedia materials bring the best performance and positive emotions.
Their study focused on the content style as a factor that affects learners' performance during learning. However, they did not consider the instructor's lecture skills, even though these may affect learners' performance as well.
Recently, many studies on eye movements [15] [16] [17] and machine learning techniques [18] [19] have been carried out to detect different kinds of learner behaviors during learning. However, detection of learners' concentration with feedback to an instructor has not been given much attention in distance learning environments. This is because most of the studies have focused on assessing learners in asynchronous learning and assisting them based on their condition [20]. For instance, Calvi et al. [20] discussed a system that detects a learner's high mental workload and proposes links to additional materials. Their method assists learners by providing additional materials when the high mental workload condition is detected. However, real time feedback to an instructor about the learners' condition is not discussed.
Considering the current literature in assessment of learners, it is believed that our proposed system is one of the promising solutions to improve the quality of distance learning education.

Proposed Distance Learning System
This paper proposes an efficient distance learning system that accurately detects learners' concentration condition during a lecture, using multiple biological information with the support of machine learning techniques. When the number of students who are not concentrated on the lecture exceeds a threshold, the system notifies the instructor. To detect such a condition, an eye tracking system, the OGAMA software and machine learning techniques are utilized. Moreover, several eye metrics, including fixation counts, fixation rate, fixation duration and average saccade length, are introduced to obtain the learners' concentration condition. The following subsections describe the device and software used to obtain the learners' eye metrics, the proposed system design and the classifiers of the concentration condition.
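The notification rule stated above (notify the instructor when the number of learners who are not concentrated exceeds a threshold) can be sketched as follows. This is only an illustration: the names `LearnerStatus` and `should_notify`, the learner identifiers and the threshold values are hypothetical, and the paper does not specify how the threshold is chosen.

```python
from dataclasses import dataclass


@dataclass
class LearnerStatus:
    learner_id: str     # hypothetical identifier
    concentrated: bool  # latest classifier output for this learner


def should_notify(statuses, threshold):
    """Return True when more than `threshold` learners are not concentrated."""
    not_concentrated = sum(1 for s in statuses if not s.concentrated)
    return not_concentrated > threshold


# Made-up class state: two of three learners are not concentrated.
statuses = [
    LearnerStatus("learner-1", True),
    LearnerStatus("learner-2", False),
    LearnerStatus("learner-3", False),
]

if should_notify(statuses, threshold=1):
    print("Notify instructor: learners are losing concentration")
```

Whether the threshold should be an absolute count, as here, or a fraction of the class is a design choice that the paper leaves open.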

Eye Tracking System
An eye tracking system is a common device that tracks learners' eye movements and records them. THE EYE TRIBE was selected as the eye tracking system in this study because it is the cheapest among similar products. Therefore, it is feasible to deploy the proposed system in large-scale distance learning classes.
The sampling rate of the device is 30 Hz, with an accuracy of 0.5° of visual angle, i.e. the average angular distance from the actual gaze point to the point measured by the device. The device uses the corneal reflection technique to record eye movements. In this technique, near infrared light (Figure 1) directed towards the center of the pupil causes a visible reflection on the cornea. The camera tracks this reflection, and the eye movement data are recorded [21].

OGAMA Software
OGAMA is open source software designed to analyze eye and computer mouse movements in slide show studies [22]. In order to record and analyze the eye mo-

Proposed System Design
Figure 2 shows the proposed distance learning system design. "Learner" is an individual who has registered for the distance learning system. Each learner takes a class remotely through a web browser, from his/her own location, at a specific time. "Instructor" gives a lecture in a distance learning class and has the authority to create lecture content, presentation materials, quizzes, assignments and examinations, and to interact with each "Learner" in the LMS. In addition, "Instructor" can provide immediate feedback on learners' work, discussion and evaluation.
"Concentration Detection System" is composed of THE EYE TRIBE and a computer. THE EYE TRIBE installed on the computer tracks the learner's eye movements during a class session. The eye movement data, including fixation duration, fixation rate, fixation counts and average saccade length, are recorded in this system in real time. The computer analyzes the data with machine learning techniques to determine whether the learner is in the concentration condition. If the result indicates that the learner is not concentrated on the lecture, this is reported to the "Notification" module in the LMS through "Notification System". If the result indicates that the learner is concentrated on the lecture, the system simply continues tracking the learner's eye movements.
"LMS (Learning Management System)" is software that manages the documentation, administration, reporting and delivery of electronic educational technology courses [24]. In addition, it manages the assessment of classes and can serve as an interface between instructors and learners.
"Notification" is one of the components of the LMS. When a learner is not concentrated on the lecture, that learner's "Concentration Detection System" reports it to the "Notification" module in the LMS. When the number of learners who are not concentrated on the lecture exceeds the threshold, the "Notification" module sends a real time message to the instructor's computer display. Then, "Instructor" adjusts his/her content style and instructional strategy for the learners.
Figure 2. Proposed distance learning system design.

Classifiers of Concentration Detection
In the proposed system, four types of eye metrics obtained by THE EYE TRIBE are used to detect the learners' concentration condition. To determine whether a learner is concentrated or not concentrated on the lecture, machine learning techniques are introduced. Three classifiers, namely Multilayer Perceptron (MLP), Sequential Minimal Optimization (SMO) and J48, were selected and compared in this study.
Multilayer Perceptron (MLP) [25] is a feed forward artificial neural network model that maps sets of input data onto a set of appropriate outputs. It consists of input, hidden and output layers. MLP constructs a multidimensional space through the activations of the hidden nodes and separates the classes (here, "concentrated" and "not concentrated") as much as possible. First, the input is passed forward through the weighted connections of each layer, and the resulting output is compared with the desired output. The error signal is then propagated backward and the connection weights are adjusted accordingly.
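The forward pass and weight adjustment described above can be illustrated with a very small network. The sketch below is not the classifier used in the experiments: the class name `TinyMLP`, the network size, the learning rate, the number of epochs and the four-value feature vectors are all made up for illustration, and it assumes sigmoid activations and a squared-error loss.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """One hidden layer, sigmoid activations, trained by backpropagation."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each row: weights of one hidden node, with its bias as the last entry.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        # Output node weights, with the bias as the last entry.
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
                  for row in self.w_hidden]
        out = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden))
                      + self.w_out[-1])
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        hidden, out = self.forward(x)
        # Error signal at the output node (squared error, sigmoid derivative).
        delta_out = (out - target) * out * (1.0 - out)
        # Propagate the error signal back to each hidden node.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        # Adjust the connection weights.
        for j, h in enumerate(hidden):
            self.w_out[j] -= lr * delta_out * h
        self.w_out[-1] -= lr * delta_out
        for j, row in enumerate(self.w_hidden):
            for i, v in enumerate(x):
                row[i] -= lr * delta_hidden[j] * v
            row[-1] -= lr * delta_hidden[j]

    def loss(self, data):
        return sum((self.forward(x)[1] - t) ** 2 for x, t in data) / len(data)


# Made-up, pre-scaled feature vectors:
# (fixation count, fixation rate, fixation duration, average saccade length)
# with label 1 = concentrated, 0 = not concentrated.
data = [
    ([0.8, 0.7, 0.3, 0.2], 1),
    ([0.9, 0.8, 0.2, 0.3], 1),
    ([0.2, 0.3, 0.8, 0.7], 0),
    ([0.3, 0.2, 0.9, 0.8], 0),
]

mlp = TinyMLP(n_in=4, n_hidden=3)
loss_before = mlp.loss(data)
for _ in range(1000):
    for x, target in data:
        mlp.train_step(x, target)
loss_after = mlp.loss(data)
print(f"mean squared error: {loss_before:.3f} -> {loss_after:.3f}")
```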
On the other hand, Sequential Minimal Optimization (SMO) [26] is an algorithm for solving the quadratic programming problem that arises during the training of a Support Vector Machine (SVM). Training an SVM requires the solution of a very large Quadratic Programming (QP) optimization problem. SMO breaks this large QP problem into a series of the smallest possible QP problems. These small QP problems are solved analytically, which avoids a time consuming numerical QP optimization as an inner loop.
J48 [27] is a decision tree classifier based on the concept of information entropy. A decision is made by splitting the data on each attribute into smaller subsets and examining the difference in entropy; the attribute with the highest normalized information gain is chosen. Splitting stops when all instances in a subset belong to the same class, at which point a leaf node is created.
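As an illustration of the entropy-based split criterion described above, the following sketch computes the entropy of a label set and the normalized information gain (gain ratio) of a binary split on one numeric attribute. The function names, the fixation duration values and the candidate thresholds are made up; the sketch shows the criterion only and is not the J48 implementation itself.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Information gain of the split `value <= threshold`, divided by split info."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # the split does not separate anything
    total = len(labels)
    weights = [len(left) / total, len(right) / total]
    info_gain = entropy(labels) - (weights[0] * entropy(left)
                                   + weights[1] * entropy(right))
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


# Made-up fixation durations (ms) and labels, for illustration only.
durations = [200, 220, 400, 450]
labels = ["concentrated", "concentrated", "not concentrated", "not concentrated"]

for threshold in (210, 300):
    print(threshold, round(gain_ratio(durations, labels, threshold), 3))
```

In this toy data, the threshold of 300 ms separates the two classes completely and therefore scores higher than the threshold of 210 ms.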
Here, the main reasons for using machine learning techniques are individual differences and environmental changes. In general, it is almost impossible to set a threshold on each eye metric to determine whether a learner is concentrated on the lecture. This is because the absolute value of biological information depends on the person. In addition, each kind of biological information is influenced independently by the situation; hence, not every eye metric always shows the same tendency with respect to the concentration condition. However, when multiple kinds of biological information are taken into account, an average tendency of the concentration condition can be extracted, since the combination can absorb individual differences and environmental changes.
Machine learning techniques are effective for extracting such an average tendency from several eye metrics and deciding whether a learner is concentrated.

Performance Evaluation
A set of experiments was conducted in order to evaluate the proposed approach. Specifically, four kinds of biological information from learners' eye movements, namely fixation duration, fixation counts, fixation rate and average saccade length, were analyzed. The experimental setup employed a standard computer with a 27-inch monitor and THE EYE TRIBE. THE EYE TRIBE was connected to the computer via a Universal Serial Bus 3.0 (USB 3.0) interface and placed beneath the computer display, where the distance from the device to the tracked person was less than 60 cm. The computer screen resolution was 1024 × 768 and the light intensity was set to 60 lux.
The experiment involved 27 subjects, all students at Shibaura Institute of Technology aged between 21 and 35 years. Each subject was asked whether he/she had any visual problem, and those who had none were selected for the experiment. As part of the experimental procedure, each subject was properly instructed on how to use THE EYE TRIBE. In order to detect the learners' concentration condition, their eye movements were recorded while they viewed two contents, namely an easy content and a boring content.
The easy content (Content 1) was composed of simple mathematical equations, while the boring content (Content 2) was blank and contained no questions. At the beginning of the experiment, each subject was told that both contents were composed of mathematical equations. In addition, the subjects were informed that the questions would appear on the screen at random times, so they would have to wait for the questions to be displayed. The trick here was to trigger a negative emotion that might distract their concentration on the boring content. Therefore, it can be assumed that the subjects concentrate on the easy content, because they can solve the equations without any difficulty, while they cannot concentrate on the boring content, since they cannot do anything except wait for the content to appear. Subsequently, the calibration was performed successfully and the experiment was carried out one subject at a time, as shown in Figure 3.
In this experiment, each subject was asked to think about the questions and give the answers within the experimental time. The duration given for each content was 90 seconds. After the experiment finished, each subject was given a questionnaire to evaluate both contents. According to the questionnaire responses, Content 1 was easy, while for Content 2 most of the subjects felt bored and declared that they could not concentrate on the content.
The fixation counts, fixation rate, fixation duration and average saccade length metrics were investigated to determine the learners' concentration on both contents. The study conducted by Kodappully et al. [28] mentioned that biological information is correlated with brain activity. They also claimed that a large number of fixations on the area of interest implies interest in the content, while a high fixation rate (fixation count per second) implies effective information processing. Moreover, they stated that a long fixation duration in the area of interest implies high mental workload, which contributes to negative emotions such as stress and anxiety. On the other hand, a longer average saccade length implies less concentration, because the content is skipped several times. Based on this previous investigation, the hypotheses of this study were formulated as shown in Table 1.
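To make the four metrics concrete, the sketch below computes them from a list of detected fixations. It rests on assumptions not stated in the paper: fixation duration is taken as the mean duration over all fixations, and saccade length as the straight-line distance in pixels between consecutive fixation centres. The `Fixation` type, the `eye_metrics` function and the sample values are hypothetical, and the definitions used by OGAMA may differ.

```python
import math
from dataclasses import dataclass


@dataclass
class Fixation:
    x: float            # fixation centre in pixels (assumed unit)
    y: float
    duration_ms: float  # fixation duration in milliseconds


def eye_metrics(fixations, recording_seconds):
    """Compute the four eye metrics for one recording."""
    count = len(fixations)
    rate = count / recording_seconds  # fixation count per second
    mean_duration = (sum(f.duration_ms for f in fixations) / count
                     if count else 0.0)
    # Saccade length approximated as the distance between consecutive fixations.
    saccades = [math.hypot(b.x - a.x, b.y - a.y)
                for a, b in zip(fixations, fixations[1:])]
    mean_saccade = sum(saccades) / len(saccades) if saccades else 0.0
    return {
        "fixation_count": count,
        "fixation_rate": rate,
        "fixation_duration_ms": mean_duration,
        "average_saccade_length_px": mean_saccade,
    }


# Made-up fixations for one 90-second content, for illustration only.
fixations = [Fixation(0, 0, 200), Fixation(3, 4, 300), Fixation(6, 8, 400)]
metrics = eye_metrics(fixations, recording_seconds=90)
print(metrics)
```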
Subsequently, the experimental results were analyzed, and each metric was investigated by comparing the average values obtained from the 27 subjects for Content 1 and Content 2. Figures 4-7 show the results for fixation count, fixation rate, fixation duration and average saccade length, respectively. From these four figures, it seems that all the hypotheses are confirmed on average. However, it is difficult to decide each subject's concentration condition using these four eye metrics directly, since the deviation is quite large. In addition, not every eye metric of every subject follows the corresponding hypothesis stated above.
Therefore, machine learning techniques were used to classify the concentration condition using the four types of eye metrics. The performance of three classification algorithms, namely Multilayer Perceptron (MLP), Sequential Minimal Optimization (SMO) and the J48 decision tree, was compared in terms of accuracy and execution time in different scenarios.
Figure 8 reveals that MLP achieved the highest classification accuracy, 90.7%. However, the execution time of MLP is somewhat longer because of the back propagation technique. Also, the accuracy of MLP decreases when some of the eye metrics are excluded from the training process. Moreover, fixation duration is considered the most significant metric, since excluding it from the training process reduces the classification accuracy from 90.7% to 66.7%, the lowest accuracy obtained with MLP.
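The size of this drop can be checked with simple arithmetic. The paper does not state how many samples were classified or how the evaluation was split, so the sketch below assumes one feature vector per subject per content (27 × 2 = 54 samples) and uses hypothetical counts of correctly classified samples that happen to reproduce the two reported percentages.

```python
SUBJECTS = 27
CONTENTS = 2
SAMPLES = SUBJECTS * CONTENTS  # 54, assumed: one sample per subject per content


def accuracy_percent(correct, total):
    """Classification accuracy in percent, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


# Hypothetical correct counts, chosen to match the reported accuracies.
full = accuracy_percent(49, SAMPLES)         # all four eye metrics
no_duration = accuracy_percent(36, SAMPLES)  # fixation duration excluded
drop = round(full - no_duration, 1)          # difference in percentage points

print(full, no_duration, drop)
```

Under this assumption, excluding fixation duration would correspond to 13 more misclassified samples out of 54.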
In Figure 8, the vertical axis shows the classification accuracy, while the horizontal axis shows which metrics were included in or excluded from the training process, as described in Table 2.
Finally, learners' concentration was measured for Content 1 and Content 2. For Content 1, all subjects gave the right answers. However, for Content 2, each subject was waiting for the questions to appear on the screen, as instructed. At the end of the experiment, each subject realized that there were no questions in Content 2. Moreover, all the subjects admitted that they were not able to concentrate on the content because the questions did not appear as they expected. The experimental results reveal that, once the data was analyzed with machine learning, the proposed biological metrics were able to detect the learners' concentration condition properly regardless of individual differences in eye movements. This is evidence that biological information from eye movements is correlated with brain activity.

Discussion
This paper proposed a novel automatic approach to detecting learners' concentration condition in distance learning systems through their eye movements. In this study, the four eye metrics were selected because Spering and Carrasco [29] claimed that eye movements are associated with brain activity.
Furthermore, the results for the four eye metrics analyzed in OGAMA reflected the learners' concentration condition. However, the tendency of each eye metric depended on the subject because of individual differences in eye movements. Thus, the average value of each metric over all the subjects was analyzed, as shown in Figures 4-7.
From the results, the concentration condition was clearly discriminated when the 27 subjects focused on the easy and boring contents. The main goal of this study was to detect learners' concentration condition using multiple biological information. However, each subject shows a different tendency in the four eye metrics for the concentration condition, as seen from the deviation in Figures 4-7. Therefore, a threshold based discrimination of learners' concentration condition cannot work. Hence, machine learning techniques were used for classification.
Regarding the importance of each eye metric, when all the eye metrics were included in the training process, the accuracy was 90.7% for MLP. However, when any one metric was excluded from the training process, the accuracy of MLP decreased drastically to less than 80%. This result reveals that the proposed four eye metrics are suitable for detecting learners' concentration condition in distance learning systems. Moreover, when the fixation duration metric was excluded from the training process, the classification accuracy of MLP decreased to 66.7%, which was the lowest accuracy. This shows that fixation duration was the most significant of the proposed eye metrics.
Additionally, MLP achieved the highest classification accuracy. However, the classification accuracy of MLP could not reach 100%. This is attributed to the low sampling frequency of the eye tracking device. According to Sam et al. [30], an eye tracker with a low sampling frequency affects the quality of the eye movement data. This could be addressed by using a higher performance eye tracking device.
It is believed that the proposed system to detect learners' concentration condition will contribute a lot to improve the quality of distance learning education.

Conclusions
The goal of this work was to detect learners' concentration condition in distance learning systems through their eye movements. To achieve this goal, an efficient distance learning system using multiple biological information with the support of machine learning techniques was proposed. The proposed approach achieved a classification accuracy of 90.7% when MLP was selected as the classifier. In addition, the effectiveness of the proposed four eye metrics, namely fixation counts, fixation rate, fixation duration and average saccade length, was confirmed. Furthermore, the influence of each eye metric on the classification accuracy was investigated. The result shows that fixation duration is the most important metric among the four.
Currently, the effectiveness of the proposed method has been evaluated with only twenty-seven subjects. The number of subjects was small because of a limitation of THE EYE TRIBE tracking system: the device cannot effectively detect the eye movements of subjects wearing spectacles or contact lenses. Nonetheless, with the rapid advancement of eye tracking systems, it is believed that devices able to overcome this limitation may become available in the near future.
In future work, a mechanism to send real time feedback to an instructor will be studied and implemented. After that, the proposed system will be tested in a real distance learning environment and the overall performance of the system will be evaluated.
Figure 1. THE EYE TRIBE installed on a computer.

Finally, learners' concentration was measured for Content 1 and Content 2. For Content 1, all subjects gave the right answers. However, for Content 2, each subject

Figure 4. Average fixation counts for easy content and boring content.

Figure 5. Average fixation rates for easy content and boring content.

Figure 6. Average fixation durations for easy content and boring content.

Figure 7. Average saccade lengths for easy content and boring content.

Table 1. Hypotheses on content evaluation based on eye metrics.

Table 2. Description of x-axis label from Figure 8.
However, the execution time of MLP was longer than that of the other algorithms. This is due to the back propagation technique, in which learning occurs in the perceptron by changing the connection weights after each piece of data is processed. In this study, the classification accuracy is more important than the execution time. Moreover, the execution time of MLP is negligible for this purpose. Therefore, it can be concluded that MLP is the best classifier among these three.