Design and Implementation of Book Recommendation Management System Based on Improved Apriori Algorithm

Abstract

Applying the traditional Apriori algorithm in a book management system slows the system down, because the database is scanned repeatedly and too many candidate item-sets are generated. To address this, a book management system for information recommendation is designed on the basis of an improved Apriori data mining algorithm. The system integrates the C/S (client/server) and B/S (browser/server) architectures so that book information is open to both library staff and borrowers. The data preprocessing sub-module in the system function module extracts the relevant borrower and book data from the book-lending database. After the data are cleaned, converted and integrated, the association rule mining sub-module applies the improved Apriori algorithm to the processed data, mines the strong association rules whose support degree is greater than the minimum support threshold and whose confidence coefficient is greater than the minimum confidence threshold, and generates an association rule database. The personalized recommendation sub-module then matches each borrower and his or her selected books against the association rule database and recommends the book information associated with the books the borrower has read, thereby realizing personalized recommendation of book information. The experimental results show that the system recommends book-related information effectively and that its CPU utilization is only 6.47% when 50 clients run it at the same time, which indicates good performance.

Share and Cite:

Zhou, Y. (2020) Design and Implementation of Book Recommendation Management System Based on Improved Apriori Algorithm. Intelligent Information Management, 12, 75-87. doi: 10.4236/iim.2020.123006.

1. Introduction

Data mining is the process of discovering hidden patterns in large amounts of data for effective analysis; it can efficiently handle statistical computation, pattern processing and related problems. With the rapid development of science and technology in China, knowledge is renewed and changed quickly, and a large amount of new knowledge is continuously integrated into people’s production and life. Books are an important channel through which people acquire knowledge, especially students at major universities. The demand for knowledge is increasing, and libraries are the main source. However, as books continue to circulate and accumulate, the growing number of borrowers and the growing volume of book information have made book management more difficult. An efficient book management system for information recommendation plays a significant role in improving both the learning efficiency of borrowers and the management of books [1].

Research on book recommendation systems has been a growing trend in recent years. Research on personalized library recommendation services started with the Digital Library Program launched in the United States in 1991. The application of data mining technology to library management has since become an important subject of academic research, which has effectively promoted the rapid development of data mining in library management applications in Europe and the United States. Taking readers’ usage as the object of study, Michael Cooper adopted methods such as clustering analysis and association rules to analyze and process readers’ access records and browsing information and to predict readers’ usage habits and behavior trends [2]. Takanori Kuroiwa et al. [3] built a knowledge base (KB) from book information selected through a web service and developed a book utilization system (BUS) so that users can search for books through the web service. Furthermore, an infrastructure was created to visualize the user preferences extracted from the KB, so that existing books can be shared among users. Shuntaro Yada [4] proposed a system named Serendy, which presents book information referenced by friends on Twitter to readers who seldom read books. Serendy does not depend on the interests of the user or the content of the book, but on the social capital of users in a social networking service (SNS). Through closed beta testing, the algorithm of Serendy was enhanced to accurately identify the book information mentioned in tweets. With the continuous improvement of data mining theory and its application in library information management, personalized information recommendation systems have attracted the attention of libraries in various countries. Libraries outside China have developed a variety of book recommendation systems. Typical personalized recommendation systems include the “Tapestry”, “Fab”, “SiteSeer” and “CiteSeer” systems [5].

Compared with research abroad, research in China started late but has also made breakthrough progress. At present, the representative systems in China are the personalized recommendation system of the digital library of Renmin University of China, the “My Digital Library” of the National Science Digital Library of the Chinese Academy of Sciences, the “MyLibrary” system of the library of Zhejiang University, and the “Library Automation Integration System” developed by Shenzhen Library. Other examples include LUKA, the search engine of the Institute of Intelligence studied by Shanghai JiaoTong University, Open Bookmark of Tsinghua University, Web Semantic Analysis of the Chinese Academy of Sciences, and ECMiner proposed by Fudan University [6].

In China, data mining technology is used to mine reader information and borrowing information of university libraries, and the mining results are applied to personalized service of libraries. Some achievements have also been made in this research and application. Chen et al. [7] built an acupuncture books knowledge platform to provide users with retrieval service. The platform uses a variety of data mining techniques to achieve automatic text extraction. Finally, through association mining and decision analysis, a comprehensive intelligent analysis of diseases and symptoms, meridians, acupuncture points, and acupuncture rules in ancient acupuncture books was realized, and retrieval service was provided to users through browser/server structure. Guo et al. [8] established a decorrelated principal component analysis model based on correlation theory to obtain the main interfering factors of book user evaluation. Secondly, they have established a predictive scoring system based on linear regression theory, which can predict the score of books. Finally, they built a collaborative filtering model for book recommendation. These studies are of great significance to the development of domestic recommendation systems [9] .

However, there are currently many kinds of recommendation systems, and their recommendation algorithms differ considerably. Some algorithms make use of the content that users usually consume, and others make use of knowledge about the users. Although these recommendation methods have their own advantages, they still match the actual needs of users poorly. At the same time, recommendation systems also face the problem of performance optimization. To solve these problems, this paper designs a book management system for information recommendation based on an improved Apriori data mining algorithm. Using the efficient mining function of the improved algorithm, the strong association rules in the reader borrowing database are mined. The development direction and degree of correlation of various subjects can be mined from the books borrowed by borrowers, association matching can be carried out between the mined strong association rules and the books selected by a borrower, and the book information associated with the books already read can be pushed to the borrower. The resulting personalized recommendation service makes it convenient for librarians to purchase, catalogue and classify books, and provides students with the book resources they need. Finally, using the borrowing statistics of the Shanghai Maritime University Library as the data source, the experimental results show that the scheme can accurately and effectively recommend related books to borrowers. In addition, compared with existing schemes, this scheme consumes less memory.

The rest of the paper is arranged as follows. The second section describes the overall structure and functional modules of the book management system. The third section introduces the traditional Apriori data mining algorithm and improves it. Based on the contents of the second and third sections, the fourth section uses the improved Apriori algorithm to realize the personalized information recommendation sub-module. The fifth section presents an experimental analysis of the book recommendation system based on the improved Apriori algorithm, which verifies that the system can accurately mine book-related information and effectively reduce the memory it occupies on the computer. Finally, the sixth section concludes the paper.

The main contributions of this paper are as follows:

1) This paper designs a book recommendation system based on the improved Apriori algorithm, which can mine the strong association rules in reader’s borrowing statistics data set, and match the strong association rules with the books borrowed by the readers, and recommend the related books to the readers. The algorithm can effectively provide personalized recommendation services, and has a higher performance compared with the traditional systems.

2) The book recommendation system designed in this paper can not only recommend books related to those that readers have borrowed, but also provide a teaching reference for college teachers. Using the mining results, college teachers can add relevant knowledge to the teaching process so that students can better understand the course content. The scheme of this paper is therefore of great significance for the development of intelligent book management.

2. The Overall Structure of Book Management System for Information Recommendation

2.1. Overall Structure

The book management system for information recommendation designed in this article combines the C/S (client/server) architecture and the B/S (browser/server) architecture. Modules implemented by C/S architecture, such as system management module and data mining management module, are open to the library staff. Modules implemented by B/S architecture, such as book searching module, borrowing record module and personalized recommendation module, are open to readers in the library. The overall structure of the system is shown in Figure 1.

2.2. Functional Modules

As shown in Figure 2, the functional modules of the system include the system management module, the book borrowing management module and the information recommendation module. The system management module and the book borrowing management module handle the overall management of the system and the management of book borrowing, respectively, while the information recommendation module is the most crucial functional module of the system.

Figure 1. Overall structure of book management system for information recommendation.

Figure 2. Structure of overall function of system.

Since our system is designed specifically with information recommendation in mind, this article focuses on the information recommendation module, which mainly includes three sub-modules: data preprocessing [10], association rule mining [11] and information recommendation. The data preprocessing sub-module extracts relevant information about books and borrowers from the book-borrowing database and then cleans, converts and integrates the collected data. The association rule mining sub-module uses an improved Apriori data mining algorithm, takes the processed data as item sets, and discovers the strong association rules whose support degree is greater than the minimum support threshold and whose confidence coefficient is greater than the minimum confidence threshold. The discovered rules are stored in the association rules database. The information recommendation sub-module matches the books selected by readers against this rule database and sends borrowers a notification containing information associated with their choice of books, thereby achieving personalized information recommendation [12] [13] [14] [15].

3. Improved Apriori Data Mining Algorithm

Major data mining techniques include association rule mining, data classification mining, data clustering mining, etc. Apriori algorithm is a highly efficient data mining algorithm based on association rules. In consideration of subsequent development on Apriori and analysis of association rules, relevant definitions are provided as follows:

Let the set of items to be mined be $I = \{i_1, i_2, \ldots, i_m\}$, and let $T$ be the collection of all transactions.

Confidence coefficient: the confidence coefficient refers to the “credibility” of an association rule. If $A$ and $B$ are both item sets that have the following relations for a transaction set $S$: $A \subset S$, $B \subset S$, $A \Rightarrow B$, then the confidence of $A \Rightarrow B$ equals the number of tuples containing both item sets $A$ and $B$ divided by the number of tuples containing item set $A$.

Support degree: the support degree of an association rule is the probability that item set $A$ and item set $B$ appear in the same transaction. The support degree of $A \Rightarrow B$ is the number of tuples containing both item set $A$ and item set $B$ divided by the total number of tuples.

Strong association rule: mining strong association rules in the collection of all transactions is a type of association rule mining in data mining. Let the minimum support threshold be $min\_sup$ and the minimum confidence threshold be $min\_conf$. If, in a collection of transactions $T$, both $\text{Support degree}(A \Rightarrow B) \geq min\_sup$ and $\text{Confidence coefficient}(A \Rightarrow B) \geq min\_conf$ are satisfied, then $A \Rightarrow B$ is a strong association rule in $T$.
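The three definitions above can be checked on a small example. The following is a minimal sketch rather than the paper's implementation: the function names and the five toy transactions are assumptions made for illustration, and the thresholds are compared with `>=`.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    target = frozenset(itemset)
    hits = sum(1 for t in transactions if target <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Count of transactions with A and B divided by count of those with A."""
    a = frozenset(antecedent)
    ab = a | frozenset(consequent)
    count_a = sum(1 for t in transactions if a <= t)
    count_ab = sum(1 for t in transactions if ab <= t)
    return count_ab / count_a if count_a else 0.0


def is_strong_rule(antecedent, consequent, transactions, min_sup, min_conf):
    """A => B is strong when both thresholds are met."""
    both = frozenset(antecedent) | frozenset(consequent)
    return (support(both, transactions) >= min_sup
            and confidence(antecedent, consequent, transactions) >= min_conf)


# Hypothetical borrowing transactions (not taken from the paper's tables).
transactions = [
    frozenset({"A1", "A2", "A3"}),
    frozenset({"A1", "A2"}),
    frozenset({"A2", "A3"}),
    frozenset({"A1", "A2", "A3"}),
    frozenset({"A3"}),
]

print(support({"A1", "A2"}, transactions))         # 3 of 5 transactions
print(confidence({"A1"}, {"A2"}, transactions))    # 3 of the 3 with A1
print(is_strong_rule({"A2"}, {"A3"}, transactions, 0.5, 0.8))
```

With these toy data, A1 ⇒ A2 passes both thresholds, while A2 ⇒ A3 has enough support but a confidence of only 3/4 and is therefore rejected.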

Although the traditional Apriori algorithm [16] is capable of discovering strong association rules in a database, it has several performance limitations, such as: too many database scans and too many candidate item-sets which results in slow operation. The essence of Apriori data mining algorithm in association rules is mining strong association rules from the database with support degree greater than the minimum support threshold and confidence greater than the minimum confidence threshold. Building upon this essence, it is possible to improve the algorithm. Letting frequent item sets be sets of data that satisfy the aforementioned double thresholds, the improvement on Apriori is described as follows:

Data obtained by scanning the database of books is rendered in a Boolean matrix S, in which a row represents an item and a column represents a transaction. The obtained Boolean matrix S is as follows:

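$$S = \begin{bmatrix} T_{11} & \cdots & T_{1n} \\ \vdots & T_{ij} & \vdots \\ T_{m1} & \cdots & T_{mn} \end{bmatrix} \tag{1}$$

where $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$.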

Table 1 contains an example of a transaction database. The improvement on Apriori will be described using this example.

After Equation (1) is initialized, the scanned values can be assigned to the initialized matrix. If transaction $T_1$ contains item “1”, then $T_{11} = 1$; otherwise $T_{11} = 0$. The assigned Boolean matrix is as follows:

$$S = \begin{bmatrix} 1 & 0 & 0 & 1 & 1 & 1 & 0 & 1 & 1 \\ 1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 \end{bmatrix} \tag{2}$$

The numbers in each row are added to obtain the frequent item sets $L_1$: (“1” 6) (“2” 6) (“3” 6) (“4” 3) (“5” 4). Next, the numbers in each column are added and the columns with a sum less than 2 are removed. Then, the “AND” operation is performed on $T_{1j}$ and $T_{2j}$, $T_{1j}$ and $T_{3j}$, $T_{1j}$ and $T_{4j}$, and so on. The result of the “AND” operation on the first two rows is the support degree of (1 2), which is 4. In the same manner, the minimum support degree can be calculated to be 2. Since a transaction must contain at least three items in order to contribute to 3-item frequent item sets, the sum of each column is calculated and the columns with a sum less than 3 are removed. The result can be described in a Boolean matrix whose last row holds the column sums:

$$S = \begin{bmatrix} 1 & 0 & 0 & 1 & 1 & 1 & 0 & 1 & 1 \\ 1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 \\ 3 & 2 & 2 & 3 & 3 & 3 & 2 & 4 & 3 \end{bmatrix} \tag{3}$$

From Equation (3) it can be seen that the column sums of $T_2$, $T_3$ and $T_7$ are all 2. It is impossible to obtain 3-item item sets from these transactions, so they should be removed.
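The matrix steps described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the paper's code: the nine transactions are read off the columns of Equation (2), the function names are hypothetical, and a minimum support count of 2 is assumed for the 2-item sets.

```python
from itertools import combinations

# Transactions T1..T9 as read off the columns of Equation (2).
transactions = [
    {1, 2, 5}, {2, 4}, {2, 3}, {1, 2, 4}, {1, 3, 5},
    {1, 3, 5}, {3, 4}, {1, 2, 3, 5}, {1, 2, 3},
]
items = [1, 2, 3, 4, 5]


def build_boolean_matrix(transactions, items):
    """One database scan: rows are items, columns are transactions."""
    return [[1 if item in t else 0 for t in transactions] for item in items]


def row_sums(matrix):
    """Support count of each single item (the row totals)."""
    return [sum(row) for row in matrix]


def pair_support(matrix, a, b):
    """Support count of the pair (a b): bitwise AND of two rows, then sum."""
    return sum(x & y for x, y in zip(matrix[a - 1], matrix[b - 1]))


S = build_boolean_matrix(transactions, items)
L1 = dict(zip(items, row_sums(S)))

# Keep the pairs whose support count reaches the assumed minimum of 2.
L2 = {}
for a, b in combinations(items, 2):
    count = pair_support(S, a, b)
    if count >= 2:
        L2[(a, b)] = count

print(L1)
print(L2)
```

Because every count is obtained from the in-memory matrix, the database itself is scanned only once, which is the point of the improvement.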

Table 1. Database of transactions.

The self-join rule is applied to the frequent item sets $L_2$ to obtain $C_3$: (123) (124) (135) (125) (245) (235). When the “AND” operation is performed on the first three rows for the candidate (123) in $C_3$, the result is 2; that is, the number of occurrences of (123) is 2. Similarly, the result of the “AND” operation on rows 1, 2 and 5 is also 2. In the same vein, the occurrences of (125), (135) and (235) can be determined to be 2, 3 and 1, respectively. $L_3$ is thus obtained. From here it can be inferred that only one 4-item item set exists, with a support degree of 1, and the algorithm terminates.
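The counts quoted above can be reproduced by removing the short columns and then applying “AND” across the rows of each candidate. The sketch below is only an illustration of that step: the matrix is the one in Equation (2), and `prune_columns` and `itemset_count` are hypothetical helper names.

```python
# Boolean matrix of Equation (2): rows are items 1..5, columns are T1..T9.
S = [
    [1, 0, 0, 1, 1, 1, 0, 1, 1],
    [1, 1, 1, 1, 0, 0, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 1, 0, 1, 0],
]


def prune_columns(matrix, min_items):
    """Drop transactions (columns) that contain fewer than `min_items` items."""
    width = len(matrix[0])
    keep = [j for j in range(width)
            if sum(row[j] for row in matrix) >= min_items]
    return [[row[j] for j in keep] for row in matrix]


def itemset_count(matrix, itemset):
    """AND the rows of the given items column by column and count the 1s."""
    rows = [matrix[i - 1] for i in itemset]
    return sum(1 for column in zip(*rows) if all(column))


# T2, T3 and T7 hold only two items each, so they cannot support a 3-item set.
pruned = prune_columns(S, 3)

candidates = [(1, 2, 3), (1, 2, 5), (1, 3, 5), (2, 3, 5)]
counts = {c: itemset_count(pruned, c) for c in candidates}
print(counts)
print(itemset_count(pruned, (1, 2, 3, 5)))
```

Pruning does not change any 3-item count, since a removed column could never satisfy a three-way “AND”; it only shortens the rows that later passes have to touch.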

4. Personalized Information Recommendation Sub-Module

In Section 3, strong association rules are obtained with an improved Apriori algorithm and stored in the association rules database. When a reader logs in to the library system, identifies books of interest and starts reading, the personalized information recommendation sub-module uses the association rules database to find matches for the selected books. It then sends the reader a notification containing information on the matched books, giving the reader more choices.

The personalized information recommendation sub-module works with book identifiers. When a reader chooses a book, the system matches the identifier of the selected book against the identifiers of the books associated with it in the association rules database, and then sends a notification about the associated books to the reading page of the reader. When a reader finishes selecting books, the sub-module adds the selection to the book-borrowing database, reruns the association rule calculations and then updates the database.
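The identifier matching described above can be pictured as a lookup over stored rules. The following is a minimal sketch under assumptions: the rule database is modelled as an in-memory list of (antecedent, consequent) pairs, the identifiers A1 to A6 and the rules themselves are hypothetical, and `recommend` is an illustrative name rather than part of the system.

```python
# Hypothetical association rules database: antecedent identifiers => consequent.
rules = [
    (frozenset({"A1", "A2"}), frozenset({"A3"})),
    (frozenset({"A1"}), frozenset({"A4"})),
    (frozenset({"A5"}), frozenset({"A6"})),
]


def recommend(selected, rules):
    """Return books associated with the reader's selection.

    A rule fires when all of its antecedent identifiers are among the
    selected books; books the reader already selected are not repeated.
    """
    chosen = set(selected)
    suggestions = set()
    for antecedent, consequent in rules:
        if antecedent <= chosen:
            suggestions |= consequent - chosen
    return sorted(suggestions)


print(recommend(["A1"], rules))
print(recommend(["A1", "A2"], rules))
print(recommend(["A7"], rules))
```

A production system would read the rules from the database and rank them, for example by confidence; the sketch keeps only the matching logic.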

The personalized information recommendation sub-module achieves its eponymous purpose through notification about books, which makes the system more intelligent and more personalized. The structure of the sub-module is shown in Figure 3.

Today, libraries are usually digitally managed. Personalized information recommendation systems are therefore paramount for the efficient management of digitally powered libraries. Such systems not only allow information about books to circulate efficiently, they can also increase reader satisfaction. They constitute a big improvement on the traditional passive information providing method, rendering the management of libraries more digitalized and more intelligent.

Figure 3. Structure diagram of personalized information recommendation submodule.

5. Case Analysis

To test the book recommendation functionality of the system, an experiment was carried out on a server with the Windows 7 operating system, a Ryzen 7 2700X CPU and 8 GB of memory.

Ten borrowing records were randomly selected from all book-borrowing records of the library of Shanghai Maritime University (SMU). Details are shown in Table 2.

For the sake of convenience, book identifiers are hereafter mapped to and expressed as A1, A2, and so on. The records of book-borrowing transactions can thus be obtained, as shown in Table 3.
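The mapping step can be illustrated as follows. This is a sketch under assumptions, not the system's preprocessing code: the borrower ids and the pairing of borrowers with books are invented for the example (the contents of Table 2 are not reproduced here), and `encode_transactions` is a hypothetical name.

```python
def encode_transactions(records):
    """Map each book identifier to a short code and group codes by borrower.

    `records` is an iterable of (borrower_id, book_identifier) pairs.
    Codes A1, A2, ... are assigned in order of first appearance.
    """
    code_of = {}
    baskets = {}
    for borrower, book_id in records:
        if book_id not in code_of:
            code_of[book_id] = f"A{len(code_of) + 1}"
        baskets.setdefault(borrower, set()).add(code_of[book_id])
    transactions = {b: sorted(codes) for b, codes in baskets.items()}
    return code_of, transactions


# Hypothetical borrowing records; duplicates collapse into one item.
records = [
    ("reader-1", "TP301.6"),
    ("reader-1", "TP337"),
    ("reader-2", "TP301.6"),
    ("reader-2", "TN911.73"),
    ("reader-1", "TP301.6"),
]

code_of, transactions = encode_transactions(records)
print(code_of)
print(transactions)
```

Each borrower's sorted code list is then one transaction, ready to be turned into a column of the Boolean matrix.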

The system calculated association rules through an improved Apriori algorithm and obtained a collection of frequent item-sets as shown in Table 4.

Table 2. Records of borrowing.

Table 3. Recorded transactions of borrowing.

Table 4. Frequent item-sets.

Table 5. Association rules results.

Strong association rules satisfying the minimum confidence (set to 80%) were found based on the frequent item-sets; the results are shown in Table 5. From the strong association rules in Table 5 it can be inferred that books in the TP337 class can be recommended to readers who have borrowed books in the TP301.6 class and books in the TN911.73 class; books in the TN911.73 class can be recommended to readers who have borrowed books belonging to TP301.6 and books belonging to TP337; books in the TP301.6 class can be recommended to readers who have borrowed from either the TN911.73 class or the T392 class; and books in the TP319 class can be recommended to readers who have borrowed from the TP374 class. From the above experiment it can be seen that the system is capable of accurately and effectively recommending relevant books to readers.
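The confidence filter used in this step can be sketched as below. The example is an assumption-laden illustration: it reuses the support counts of the worked example in Section 3 instead of the values in Table 4 and Table 5, and `generate_rules` is a hypothetical name.

```python
from itertools import combinations


def generate_rules(itemset, counts, min_conf):
    """Split a frequent item set into rules A => B and keep the confident ones.

    `counts` maps frozensets to support counts; the confidence of A => B is
    count(A and B together) divided by count(A).
    """
    items = frozenset(itemset)
    rules = []
    for size in range(1, len(items)):
        for antecedent in combinations(sorted(items), size):
            a = frozenset(antecedent)
            conf = counts[items] / counts[a]
            if conf >= min_conf:
                consequent = tuple(sorted(items - a))
                rules.append((tuple(sorted(a)), consequent, conf))
    return rules


# Support counts from the Section 3 example (items 1, 3 and 5).
counts = {
    frozenset({1}): 6,
    frozenset({3}): 6,
    frozenset({5}): 4,
    frozenset({1, 3}): 4,
    frozenset({1, 5}): 4,
    frozenset({3, 5}): 3,
    frozenset({1, 3, 5}): 3,
}

rules = generate_rules({1, 3, 5}, counts, 0.8)
print(rules)
```

At the 80% threshold only (3 5) ⇒ 1 survives, with a confidence of 3/3; the other five splits of (1 3 5) reach at most 3/4.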

In order to further evaluate the performance of our system, its time performance and mining performance were recorded during the above procedure and then compared to the performance of a hybrid cooperative system [17] and a K-means system. Results of the comparison are shown in Figure 4 and Figure 5.

From the results on time performance in Figure 4, it can be seen that the run time of our system was the lowest when the support degree was between 20% and 60%, whereas the other two systems took significantly longer to run whatever the minimum support degree was set to. The time performance of our system is thus shown to be satisfactory and superior.

From the results on mining performance in Figure 5, it can be seen that the number of frequent item-sets of our system was the lowest when the support degree was between 20% and 60%, while the numbers of the other two systems were significantly higher. Mining performance is a direct indicator of information recommendation accuracy. The recommendation accuracy of our system is thus shown to be satisfactory and superior.

Figure 4. Comparison of time performance between three systems.

Figure 5. Comparison of mining performance between three systems.

The system performance of our system was measured by running a series of tests, each with a different number of clients and consisting of 10 trials. It was then compared with the system performance of the other two systems. The average performance over the 10 trials of each test is shown in Table 6.

From the results in Table 6, our system can be observed to have low CPU and memory usage. With 50 clients running simultaneously, the CPU utilization rate was only 6.47% and only 25.59% of memory was used, which is clearly better than the other two systems. This shows that our system has good system performance and can fulfil its role of book information recommender with ease even when a large number of clients are running concurrently.

Table 6. Comparison of system performance of three systems.

6. Conclusion

The book recommendation management system described in this paper mines strong association rules based on an improved Apriori algorithm. The improvement on Apriori effectively helps the mining of strong associations between books, and the mining results are used to recommend relevant books to readers, thus offering them a personalized service. The information mined can also serve as a reference for university professors in their teaching: they can incorporate associated knowledge in their classes to help students better absorb and understand the course materials. The system not only accurately recommends relevant book information, but also has low CPU and memory usage, guaranteeing a smoother experience in its personalized recommendation service.

Acknowledgements

The author thanks all the people who helped her during the study.

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

References

[1] Kurmashov, N., Latuta, K. and Nussipbekov, A. (2015) Online Book Recommendation System. Proceedings of the 2015 Twelve International Conference on Electronics Computer and Computation, Almaty, 27-30 September 2015, 1-4.
https://doi.org/10.1109/ICECCO.2015.7416895
[2] Cooper, M.D. (2001) Usage Patterns of a Web-Based Library Catalog. Journal of the American Society for Information Science & Technology, 52, 137-148.
https://doi.org/10.1002/1097-4571(2000)9999:9999<::AID-ASI1547>3.0.CO;2-E
[3] Kuroiwa, T. and Bhalla, S. (2007) Dynamic Personalization for Book Recommendation System Using Web Services and Virtual Library Enhancements. Proceedings of the 7th IEEE International Conference on Computer and Information Technology (CIT 2007), Aizu-Wakamatsu, 16-19 October 2007, 212-217.
https://doi.org/10.1109/CIT.2007.72
[4] Yada, S. (2014) Development of a Book Recommendation System to Inspire “Infrequent Readers”. In: Proceedings of the 16th International Conference on Asia-Pacific Digital Libraries, Springer, Cham, 399-404.
https://doi.org/10.1007/978-3-319-12823-8_43
[5] Chen, C.C. and Chen, A.P. (2012) Using Data Mining Technology to Provide a Recommendation Service in the Digital Library. The Electronic Library, 25, 711-724.
https://doi.org/10.1108/02640470710837137
[6] Gao, B.Z., Du, S.Y., Li, X.Z. and Liu, F.G. (2016) Research on the Application of Persona in Book Recommendation System. Journal of Physics: Conference Series, 910, 18-20.
https://doi.org/10.1088/1742-6596/910/1/012023
[7] Chen, C.Y., Hong, J.M., Zhou, W.L., Lin, G.H., Wang, Z.F., Zhang, Q.F., et al. (2017) The Method and Application to Construct Experience Recommendation Platform of Acupuncture Ancient Books Based on Data Mining Technology. Chinese Acupuncture & Moxibustion, 37, 768-772.
[8] Guo, X.Q., Feng, L.C., Liu, Y.I. and Han, X.I. (2016) Collaborative Filtering Model of Book Recommendation System. International Journal of Advanced Media and Communication, 6, 283-294.
https://doi.org/10.1504/IJAMC.2016.080974
[9] Paruchuri, V. and Granville, B. (2020) A Case-Based Reasoning System for Aiding Physicians in Decision Making. Intelligent Information Management, 12, 63-74.
https://doi.org/10.4236/iim.2020.122005
[10] Cordon, I., Luengo, J., Garcia, S., Herrera, F. and Charte, F. (2019) Smartdata: Data Preprocessing to Achieve Smart Data in R. Neurocomputing, 360, 1-13.
https://doi.org/10.1016/j.neucom.2019.06.006
[11] Bhavithra, J. and Saradha, A. (2019) Personalized Web Page Recommendation Using Case-Based Clustering and Weighted Association Rule Mining. Cluster Computing, 22, 6991-7002.
https://doi.org/10.1007/s10586-018-2053-y
[12] Gupta, M. and Kumar, P. (2020) Recommendation Generation Using Personalized Weight of Meta-Paths in Heterogeneous Information Networks. European Journal of Operational Research, 284, 660-674.
https://doi.org/10.1016/j.ejor.2020.01.010
[13] Gao, R., Li, J., Li, X.F., Song, C.F. and Zhou, Y.F. (2018) A Personalized Point-of-Interest Recommendation Model via Fusion of Geo-Social Information. Neurocomputing, 273, 159-170.
https://doi.org/10.1016/j.neucom.2017.08.020
[14] He, B. and Zhang, H. (2016) Library Personalized Information Recommendation of Big Data. Proceedings of the 2016 IEEE International Conference of Online Analysis and Computing Science, Chongqing, 28-29 May 2016, 289-292.
https://doi.org/10.1109/ICOACS.2016.7563099
[15] Tang, Y. and Wang, W. (2018) A Literature Review of Personalized Learning Algorithm. Open Journal of Social Sciences, 6, 119-127.
https://doi.org/10.4236/jss.2018.61009
[16] Zhu, S. (2019) Research on Data Mining of Education Technical Ability Training for Physical Education Students Based on Apriori Algorithm. Cluster Computing, 22, 14811-14818.
https://doi.org/10.1007/s10586-018-2420-8
[17] Zhang, S. and Ge, Y. (2015) Personalized Tag Recommendation Based on Transfer Matrix and Collaborative Filtering. Journal of Computer and Communications, 3, 9-17.
https://doi.org/10.4236/jcc.2015.39002

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.