Research on the Correlation between Learning Effectiveness and Online Learning Behavior Based on Online Education Scene

With the development of electronic equipment and communication technology, online learning came into being. However, online learning separates teaching in time and space, so teachers cannot accurately monitor learners' learning state or evaluate their learning progress. In this paper, statistical analysis is used to analyze the online learning behavior and achievements in three courses of a university network education institute, and the correlation between indicators of learning effect and learning behavior in online education is studied. The results also reflect the problems that online learning does not give full play to its advantages and that the learning process is neglected by learners.


Introduction
In general, online learning refers to learning or teaching activities carried out over the network, including MOOC platforms, adult online education, WeChat groups, mobile phone apps and other forms of learning (Wei, 2012). In recent years, research on online learning both at home and abroad has been increasing year by year. For example, Riqin Bao, an associate professor at Quzhou Radio and Television University, studied the factors influencing learners' intention to use mobile learning in open education (Bao, 2017); Yang Tian made an empirical study of the effects of online learning social behaviors on learning outcomes (Tian et al., 2017); Ying Wang studied a performance evaluation model for online learning based on the mobile Internet (Wang et al., 2017); and there is much other research on online learning. The main reason for the rapid increase in the number of studies is the rapid development of society, which requires education to break away from the existing uniformly planned model and provide an absolute, interactive, flexible teaching system that offers a convenient and practical way for on-the-job education, on-the-job training and individual education (Wang & Hu, 2013).
We can clearly see that online learning methods continue to spread and deepen in colleges and society. They provide a new way of learning for working adults and students, and bring fresh blood to the development of education. In recent years, studies on the factors that affect the results of online learning have also been increasing year by year, indicating that online learning in the country is receiving more and more attention and is developing very fast.
However, because online learning lacks the face-to-face communication of traditional education, teachers and learners use the network as the path for knowledge transfer and learning (Gorobtsov et al., 2016). As a result, the learning process is largely one-way: it is difficult for teaching staff to control the teaching process, and they cannot communicate with learners in a timely and effective manner, so they cannot teach in accordance with students' aptitude. At the same time, because of the lack of communication between teachers and students, teachers cannot accurately assess students' learning state and the teaching effect, which hinders the further development of online learning (Zhang et al., 2016). To maximize the advantages of online learning, it has become more and more important to explore the factors that affect its results. Analyzing learners' online learning behavior and establishing a performance evaluation model to explore the factors that affect academic performance is therefore useful in two ways. On the one hand, it helps teaching staff adjust teaching methods and curriculum scientifically and improve the teaching effect, which helps learners understand the learning process, optimizes the learning environment, and compensates for the defects caused by the lack of communication. On the other hand, it provides guidance for learners to improve their learning efficiency in autonomous learning and helps them find a good way to accumulate knowledge and improve themselves through more efficient use of online learning.

Research Ideas
In this paper, we study the correlation between online learning behavior and other factors on the one hand and online learning grades on the other. Based on data on online learning behavior and on the grades of examination papers, we analyze the direct correlation between behavior and achievement. To ensure the rationality and validity of the online learning indicators and the examination grade data, we first pre-process all indicators, and then analyze the correlation between the influencing factors and examination performance in order to study the behavioral characteristics of students with different examination grades.

Data Preprocessing
The case studied is the network education college of a university. To take into account the differences between liberal arts and science and between courses of different difficulty, and to avoid extreme situations and obtain a more reasonable analysis, we selected three courses offered by the network education institute: "Management Principles", "Advanced Mathematics" and "Data Structure". The data obtained have the following dimensions: gender, birthday, course, major, the level of major, actual number of courseware on demand (V_Frequency), actual learning time (minutes), actual quantity of postings, homework score of EQS (H_Score), the grades of examination paper (Score) and final grade. (Note: The selection of the above indicators was permitted by the network education college and the students concerned, and it does not involve privacy issues.)
First, the data samples were preprocessed in Python.
The original data contain 1615 samples for the "Advanced Mathematics" course, 765 samples for the "Data Structure" course and 1208 samples for the "Management Principles" course. Since some learners have no examination grade, we treat these samples as having missing values and remove them. After excluding them, 1076 samples remain for "Advanced Mathematics", 515 for "Data Structure" and 1109 for "Management Principles". In addition, the original data include the task performance of EQS as one dimension, which comprises the EQS homework completion times (H_Frequency) and the average homework score (H_Score). During preprocessing, we therefore converted the task performance of EQS into two learning behavior indicators: H_Frequency and H_Score.
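The two preprocessing steps described above (removing samples without an examination grade and deriving H_Frequency and H_Score) can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the script used in the study: the field names `score` and `eqs_scores`, the idea that the raw EQS task performance is stored as a list of per-assignment scores, and the choice of 0.0 for learners with no homework are all hypothetical, since the raw data layout is not described.

```python
from statistics import mean


def preprocess(records):
    """Drop learners without an exam grade and derive the two EQS indicators.

    Each record is a dict. The keys "score" and "eqs_scores" are hypothetical
    names: "score" is the examination paper grade (None when missing) and
    "eqs_scores" is ASSUMED to hold one score per completed EQS homework.
    """
    cleaned = []
    for rec in records:
        if rec.get("score") is None:
            continue  # no examination grade: treated as missing and removed
        eqs = rec.get("eqs_scores") or []
        row = {k: v for k, v in rec.items() if k != "eqs_scores"}
        row["H_Frequency"] = len(eqs)  # homework completion times
        # Average homework score; 0.0 for no homework is an assumption.
        row["H_Score"] = mean(eqs) if eqs else 0.0
        cleaned.append(row)
    return cleaned


# Made-up records for illustration only (not data from the study).
raw = [
    {"course": "Data Structure", "score": 72, "eqs_scores": [60, 80]},
    {"course": "Data Structure", "score": None, "eqs_scores": [90]},
    {"course": "Data Structure", "score": 65, "eqs_scores": []},
]
samples = preprocess(raw)
print(len(samples), samples[0]["H_Frequency"], samples[0]["H_Score"])
```

In this toy input the second learner is removed because the examination grade is missing, in the same way that the "Data Structure" data set shrinks from 765 to 515 samples in the text.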

Data Processing and Analysis
After preprocessing, the data samples have been transformed from their initial disordered state into normalized data that can be analyzed. The samples are then analyzed statistically. The software used for the statistical analysis is SPSS, and the software version we use is [...]
[...] the "Management Principles" course is viewed somewhat less than the other two courses, but it has the largest maximum number of viewings and a very large standard deviation. Most learners probably believe that the "Management Principles" course is easier to grasp than the other two courses, so most either do not watch the videos or watch them only a few times, considering this enough to grasp the relevant content. For the "Data Structure" and "Advanced Mathematics" courses, by contrast, learners watch the teaching videos many times, probably because the two courses are more difficult and learners must watch the related videos repeatedly. Students need to go over the knowledge repeatedly before they can master it, take part in the learning interaction between teachers and students, complete other learning activities such as homework, and finally pass the examination.
c) The average watching time for the three courses is 263.71 minutes, 161.63 minutes and 212.52 minutes respectively; the minimum is 0 and the maximums are 5833 minutes, 7830 minutes and 15520 minutes respectively. The standard deviations are 565.980, 598.392 and 815.811. Combined with the average number of times learners watched the teaching videos, we can see that the "Management Principles" course is viewed fewer times but for a relatively longer time, the "Data Structure" course is viewed more times but for a relatively shorter time, and the "Advanced Mathematics" course is viewed more frequently and for a long time. This indirectly shows that the "Management Principles" course is inherently less difficult, so learners can obtain high returns and gain a lot of knowledge with less input. The "Data Structure" and "Advanced Mathematics" courses are more difficult, so learners need to study repeatedly to grasp the knowledge gradually.
d) From the perspective of learners' interaction, the average number of postings for the three courses is 2.59, 0.71 and 0.39, and the standard deviations are between 1.1 and 1.6, which means that in every course there is very little interaction among learners during learning. The largest number of postings in the "Management Principles" course is 14 and the smallest is 0, with a larger standard deviation, indicating that only a very small number of learners are willing to participate in interactive activities frequently, while most learners prefer to stay at the stage of self-learning. In the "Data Structure" and "Advanced Mathematics" courses, the maximum number of postings is only 7 and 9 respectively.
The difference in the number of postings between courses with different attributes reflects differences between the learners; in other words, the more specialized a course is, the less easily it is accepted by most learners. This is perhaps because online learning depends more on a learner's ability to study independently, even though teaching videos and communication between teachers and students are available. Highly specialized courses in particular are felt to be more difficult by most learners.
e) Whether in traditional learning or online learning, after-school assignments are an indispensable means of practice and testing for learners. After-school homework reflects the learning situation of learners after class and their mastery of knowledge after a short period of learning, and it can quickly reveal how well learners have grasped a certain part of the knowledge. As can be seen from the table, the average H_Frequency for the three courses is 1.65, 1.43 and 1.31 times respectively, and the standard deviations are 0.816, 0.875 and 0.932 respectively. Because each course is set up so that students can complete two after-school assignments in a study cycle, these averages indicate that homework completion in the three courses is good and that only a very small number of learners missed both assignments. Most learners are able to complete the homework on time.
The average EQS scores (H_Score) for the three courses are 70.4205, 36.5388 and 40.3812 points, and the standard deviations are 35.18986, 31.94978 and 34.14496. This shows that although learners complete their after-school homework on time and in full, the quality of homework differs clearly between courses: the homework quality in the "Management Principles" course is much higher than in "Data Structure" and "Advanced Mathematics", which also reflects that students' academic performance and mastery of knowledge are related to the difficulty of the course.
f) The examination paper is also an important means of testing in both traditional and online learning, and it reflects the learner's mastery of the course content in a study cycle. As can be seen from the analysis results, the average scores of the three courses are 75.2347, 70.6596 and 70.2004 points respectively, and the standard deviations are 19.08920, 13.14293 and 18.18375 respectively. The average score of the "Management Principles" course is clearly higher than those of "Data Structure" and "Advanced Mathematics", which again suggests that the "Management Principles" course is easier. In addition, the standard deviation of the "Data Structure" course is smaller than that of the other two courses, indicating that its scores are more concentrated and that most students have a similar level of mastery, whereas the scores in the other two courses are much more uneven.
All in all, through the descriptive statistical analysis of the "Management Principles", "Data Structure" and "Advanced Mathematics" courses, we can see that learners have low enthusiasm for interactive activities, and that their attitude toward completing homework is good but the quality is uneven. Knowledge of the relatively simple course is mastered well; the liberal arts course is more readily accepted by learners, while the science courses are more difficult for them to accept. We can therefore tentatively conclude that the difficulty of a course affects both learners' behavior in online learning and their academic performance.
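The descriptive statistics discussed in this section (mean, minimum, maximum and standard deviation per course) can also be computed outside SPSS. The following is a minimal Python sketch on made-up viewing times, not the study's data; the function name `describe` and the toy values are introduced here for illustration, and the sample standard deviation (n − 1 denominator) is an assumption about how the reported values were computed.

```python
from statistics import mean, stdev


def describe(values):
    """Return the descriptive statistics discussed in the text for one indicator."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        # Sample standard deviation (n - 1 denominator); assumed, not confirmed.
        "std": stdev(values),
    }


# Made-up watching times in minutes, for illustration only.
watch_minutes = {
    "Management Principles": [0, 30, 45, 600],
    "Data Structure": [0, 10, 20, 50],
}
summary = {course: describe(v) for course, v in watch_minutes.items()}
for course, stats in summary.items():
    print(course, stats)
```

A minimum of 0 combined with a standard deviation larger than the mean, as in the toy "Management Principles" list, is the pattern that the text reads as a few learners watching a great deal while most watch very little.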

Correlation Analysis
After understanding the characteristics and attributes of the data samples through descriptive statistical analysis, we further analyze the correlations. Correlation analysis examines two or more elements or behavioral indicators in order to measure the degree of correlation among them. In this step, SPSS is used to analyze the data samples.
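For reference, the Pearson correlation coefficient used in the analyses that follow has the standard textbook form below. The symbols $x_i$, $y_i$, $\bar{x}$, $\bar{y}$ and $n$ are introduced here for illustration and do not appear in the original tables.

```latex
\[
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^{2}}\;
          \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^{2}}}
\]
```

Here $x_i$ is the value of one behavioral indicator (for example H_Score) for learner $i$, $y_i$ is that learner's Score, and $\bar{x}$ and $\bar{y}$ are the sample means over the $n$ learners. The coefficient lies between −1 and 1; values near 0 indicate little linear association.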

1) Correlation analysis (independent variable indicators do not include curriculum)
In this analysis, the "curriculum" indicator is not included as an independent variable.
First, the correlation analysis of the "Management Principles" course is carried out. Pearson correlation analysis is carried out in SPSS with the six independent variables and the Score as the dependent variable. The results are shown in Table 4.
As can be seen from Table 4, there was no significant correlation between the independent variables and the dependent variable Score at the significance level of 0.05. H_Score has the highest correlation with the Score, which suggests that the higher the quality of the homework, the higher the learner's academic achievement.
Next, the correlation analysis of the "Data Structure" course is carried out. Pearson correlation analysis is carried out in SPSS with the six independent variables and the Score as the dependent variable. The results are shown in Table 5.
It can be seen from Table 5 that Age (correlation 0.118), Quantity of postings (correlation 0.139) and H_Score (correlation 0.161) were significantly correlated with the Score at the 0.01 significance level. H_Frequency (correlation 0.090) was also significantly correlated with the Score. The correlation of H_Score is the highest and is positive, indicating that the higher the quality of the completed homework, the greater the probability that the learner's academic performance is excellent.
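The significance statements for the "Data Structure" course can be checked roughly from the reported coefficients alone. The sketch below is not the SPSS procedure: it uses the usual statistic t = r·sqrt((n − 2)/(1 − r²)) and, as a large-sample simplification, the standard normal distribution instead of the t distribution. The function name `correlation_p_value` is hypothetical, and n = 515 (the number of retained samples) is assumed to be the number of cases that entered the correlation.

```python
import math
from statistics import NormalDist


def correlation_p_value(r, n):
    """Approximate two-sided p-value for H0: no linear correlation.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and, as a large-sample
    simplification, the standard normal distribution in place of the
    t distribution with n - 2 degrees of freedom.
    """
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p


# Coefficients reported for "Data Structure"; n = 515 is an ASSUMPTION
# (retained sample count taken as the number of cases in the analysis).
n = 515
reported = [
    ("Age", 0.118),
    ("Quantity of postings", 0.139),
    ("H_Score", 0.161),
    ("H_Frequency", 0.090),
]
for name, r in reported:
    t, p = correlation_p_value(r, n)
    print(f"{name}: t = {t:.2f}, p = {p:.4f}")
```

Under these assumptions the first three coefficients give p below 0.01, while H_Frequency (r = 0.090) gives a p-value between 0.01 and 0.05. This shows that coefficients of this small size can still be statistically significant with several hundred samples, so significance here says little about the strength of the relationship.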
Then the correlation analysis of the "Advanced Mathematics" course is carried out. Pearson correlation analysis is performed in SPSS with the six independent variables and the dependent variable (Score). The results are shown in Table 6. As can be seen from Table 6, Age (correlation 0.189), V_Frequency (correlation 0.165), Quantity of postings (correlation −0.092) and H_Score (correlation 0.209) were significantly correlated with the Score at the significance level of 0.01; at the significance level of 0.05, no other independent variable was significantly correlated with the Score. H_Score has the highest correlation and the correlation is positive, indicating that the higher the quality of the completed homework, the higher the possibility that the learner's academic performance is excellent.
Finally, we carried out the correlation analysis on the three courses together, without distinguishing among them. Pearson correlation analysis was performed in SPSS with the six independent variables and the dependent variable (Score). The results are shown in Table 7.
It can be seen from Table 7 that Age (correlation 0.172), V_Frequency (correlation 0.057), Time of watching teaching video (correlation 0.087), Quantity of postings (correlation 0.126), H_Frequency (correlation 0.086) and H_Score (correlation 0.303) were significantly correlated with the Score. At the significance level of 0.05, no other independent variable was significantly correlated with the Score. The correlation of H_Score is higher than that of the other indicators and is positive, indicating that the higher the quality of the homework, the higher the learning achievement.
2) Correlation analysis (independent variables include curriculum)
We assign each of the three courses a value according to its difficulty, with a larger value for a more difficult course: "Management Principles" is coded as 1, "Data Structure" as 2 and "Advanced Mathematics" as 3. We take the same six independent variables as in the analysis without the "curriculum" variable, add the "curriculum" variable, and carry out a Pearson correlation analysis between the seven independent variables and the Score. The results are shown in Table 8.
It can be seen from Table 8 that Age (correlation 0.162), Course (correlation −0.127), Time of watching teaching video (correlation 0.087), Quantity of postings (correlation 0.144), H_Frequency (correlation 0.336) and H_Score (correlation 0.306) were significantly correlated with the Score; at a significance level of 0.05, V_Frequency (correlation 0.043) is significantly correlated with the Score. The correlation of H_Score is clearly higher than that of the other indicators and is positive, indicating that the higher the quality of the completed homework, the higher the learner's examination score is likely to be. It can also be seen that the grades of science courses are lower than those of liberal arts courses.
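The course coding described above can be illustrated with a short Python sketch. The codes 1, 2 and 3 follow the text; everything else (the helper `pearson_r`, the dictionary name `COURSE_CODE` and the six made-up score pairs) is introduced only for illustration and is not the study's data or its SPSS procedure.

```python
import math

# Ordinal difficulty codes as defined in the text.
COURSE_CODE = {
    "Management Principles": 1,
    "Data Structure": 2,
    "Advanced Mathematics": 3,
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up (course, Score) pairs for illustration only.
rows = [
    ("Management Principles", 85), ("Management Principles", 78),
    ("Data Structure", 74), ("Data Structure", 69),
    ("Advanced Mathematics", 72), ("Advanced Mathematics", 60),
]
codes = [COURSE_CODE[course] for course, _ in rows]
scores = [score for _, score in rows]
r_course = pearson_r(codes, scores)
print(f"r(Course, Score) = {r_course:.3f}")
```

A negative coefficient for the coded course variable, like the −0.127 reported in the text, means that scores tend to fall as the coded difficulty rises. Treating the 1-2-3 code as a numeric variable assumes equal spacing between the difficulty levels, which is a modelling choice rather than a property of the data.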

Conclusion
The study found that each of the learning behaviors is correlated with the learning effect: Time of watching teaching video (correlation 0.087), Quantity of postings (correlation 0.144), H_Frequency (correlation 0.106), H_Score (correlation 0.306) and V_Frequency (correlation 0.043). This indicates that academic performance and learning behavior are closely linked, and that H_Score and H_Frequency are important factors affecting the final grades. This requires us to pay more attention to the completion of after-school homework both in theory and in practice: homework should not only be completed on time and in full but, more importantly, students must ensure its quality.
On the other hand, teachers should pay attention to the diversity and rationality of homework design, and the homework should have an appropriate degree of difficulty and be scientifically designed. In addition, teachers need to strengthen supervision of homework completion and review homework online; for example, learners who do not pass must resubmit the assignment. Teachers can also carry out random checks on a certain proportion of learners in the interval between the two assignments: learners who are selected must take part in an ad hoc online evaluation, whose content must be highly relevant to the content of the previous assignment. Each selected learner has two opportunities to be assessed; if a learner fails the evaluation twice, the score of the previous assignment is recorded as 0 points. Based on the correlation analysis of each factor, this research can, on the one hand, guide the future design of teaching curricula and improve the teaching effect; on the other hand, it can help set the assessment indicators for students' grades more scientifically.
As a new way of learning, online learning not only requires learners to achieve better academic performance; more importantly, it should cultivate students' interest in learning by focusing on the process of and participation in learning. From the case we can see that learners are very passive about watching the teaching videos and discussing in the forum. Most learners think that they can achieve excellent results as long as they complete the homework carefully, which does not help them master the knowledge. We should therefore give full play to the advantages of online learning and adjust the effectiveness evaluation system of online learning, focusing on the learning process so as to achieve higher examination performance and helping learners find interest in the learning process, for example:
1) Increase the weight of the number of times teaching videos are watched and the weight of the number of postings in the performance evaluation, so that teachers guide learners to focus gradually on the learning process and help them enjoy the process of acquiring knowledge.
2) Set up an additional incentive system. For example, the students with the highest number of postings or messages discussing the course in the forum may skip the related homework and be credited with full marks for it; students are encouraged to record their own videos about the learning content, for which they receive extra credit on the related homework and need not do that homework.
3) When learners accumulate more bonuses, add points to their examination results or exempt them from examinations.
Through this series of initiatives, we believe that learners will no longer focus only on the homework score, but will place more emphasis on the learning process and gain more interest in learning.