
Recommender systems (RS) have become an important component of many e-commerce sites. In daily life, we rely on recommendations from other people through word of mouth, recommendation letters, and reviews of movies, items, and books printed in newspapers. Recommender systems are software tools and techniques that support people by identifying interesting products and services in online stores, and they provide recommendations to users who search for them. The most important open challenge in collaborative filtering recommender systems is the cold-start problem: if sufficient information is not available for a new item or a new user, the recommender system runs into the cold-start problem. To increase the usefulness of collaborative recommender systems, it is desirable to eliminate challenges such as the cold-start problem. Revealing community structure is crucial to understanding networks and has become more important with the increasing popularity of online social networks. Community detection is a key issue in social network analysis; the nodes of a community are tightly connected to each other and loosely connected to the nodes of other communities. Many algorithms, such as the Girvan-Newman algorithm, modularity maximization, leading eigenvector, and Walktrap, are used to detect communities in networks. To test whether a community division is meaningful, we define a quality function called modularity. Modularity measures how far the number of links within a community exceeds the number of links expected in that community. In this paper, we propose a solution to the cold-start problem based on a community detection algorithm that extracts the communities from a social network and identifies the similar users in that network. Within the proposed work, several intrinsic details are used as rules of thumb to improve the results. A simulation experiment was carried out to address the cold-start problem.

The goal of a Recommender System [

Collaborative filtering (CF) faces several challenges when collecting and analyzing information from the database: 1) Sparsity: there are many items to be recommended, and even with many users the user/rating matrix is sparse, so it is very difficult to find users who have rated the same items; 2) Cold start: it is difficult to make recommendations to new users who have not yet rated any items, and difficult to deal with items that have not yet been rated or bought. Weike Pan, Manos Papagelis et al. [

Asela Gunawardana et al. [

In a social network, each entity is represented as a node, and the communication between two nodes is represented as an edge. The analysis of social networks is important in many fields such as recommender systems [®, flipkart®, jabong®, etc., which enhance the business opportunities for retailers [

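To evaluate a community division, we define a quality function called modularity [

In Equation (1), A_{ij} represents the link between nodes i and j: A_{ij} is 1 if a link exists between i and j and 0 otherwise; k_{i}k_{j}/2m is the expected number of links between i and j, where k_{i} is the degree of node i and m is the total number of links in the network; and δ(c_{i}, c_{j}) is 1 when nodes i and j belong to the same community and 0 otherwise. In Equation (2), e_{ii} is the actual fraction of links inside community i and a_{i}^{2} is the expected fraction of links for that community. In Equation (3), e_{ij} is the fraction of edges between communities i and j. The modularity (Q) functions are therefore defined as follows:

$$Q = \frac{1}{2m}\sum_{i,j}\left[A_{ij} - \frac{k_{i}k_{j}}{2m}\right]\delta(c_{i}, c_{j}) \qquad (1)$$

$$Q = \sum_{i}\left(e_{ii} - a_{i}^{2}\right) \qquad (2)$$

$$a_{i} = \sum_{j} e_{ij} \qquad (3)$$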

To explain the calculation of the modularity of a community in

For the community division of the example graph with 6 vertices into the clusters C_{1} = {v_{1}, v_{2}, v_{3}} and C_{2} = {v_{4}, v_{5}, v_{6}}, each value e_{ij} is the sum of the adjacency-matrix elements belonging to the pair of communities C_{i} and C_{j}, divided by the total sum of all matrix elements. The modularity of the partition of the example graph into the two communities is Q = 0.357. The overall flow of the modularity maximization algorithm is shown in

Input: Undirected network G

Output: Community assignment for each node with maximum modularity (Q)

Repeat

・ Initially, each node belongs to its own community.

・ Compute the actual number of links in the first community.

・ Divide the actual number of links by the total number of links in the network.

・ Compute the expected number of links for the same community.

・ Subtract the expected number of links for the community from the actual number of links in the same community.

・ Calculate the increase or decrease of the modularity measure Q for all possible pairs of communities.

・ Merge the pair with the greatest increase (or smallest decrease) in Q.

・ Repeat the process from step 2 for the next community.

・ Update the modularity value by adding the modularity values of the two communities.

・ Until the community division with the highest modularity is reached; return that division.
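The steps above can be sketched in Python as a naive greedy agglomeration. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names `modularity` and `greedy_modularity` are hypothetical, and the six-vertex graph (two triangles joined by one link) is an assumed topology, chosen because it reproduces Q = 0.357 for the division C_{1} = {v_{1}, v_{2}, v_{3}}, C_{2} = {v_{4}, v_{5}, v_{6}}.

```python
from itertools import combinations


def modularity(edges, communities):
    """Return Q = sum_i (e_ii - a_i^2) for an undirected, unweighted graph."""
    m = len(edges)
    label = {v: c for c, members in enumerate(communities) for v in members}
    inside = [0] * len(communities)  # links with both ends in community c
    ends = [0] * len(communities)    # link ends attached to community c
    for u, v in edges:
        ends[label[u]] += 1
        ends[label[v]] += 1
        if label[u] == label[v]:
            inside[label[u]] += 1
    # e_ii = inside / m (actual fraction), a_i = ends / 2m (expected share)
    return sum(inside[c] / m - (ends[c] / (2 * m)) ** 2
               for c in range(len(communities)))


def greedy_modularity(edges):
    """Merge communities greedily and return the best division seen."""
    nodes = sorted({v for edge in edges for v in edge})
    parts = [frozenset([v]) for v in nodes]  # each node starts alone
    best_parts, best_q = list(parts), modularity(edges, parts)
    while len(parts) > 1:
        candidates = []
        for i, j in combinations(range(len(parts)), 2):
            rest = [p for k, p in enumerate(parts) if k not in (i, j)]
            merged = rest + [parts[i] | parts[j]]
            candidates.append((modularity(edges, merged), merged))
        # Keep the merge with the greatest increase (or smallest decrease) in Q.
        q, parts = max(candidates, key=lambda item: item[0])
        if q > best_q:
            best_q, best_parts = q, list(parts)
    return best_parts, best_q


# Assumed example graph: triangles {1,2,3} and {4,5,6} joined by link (3,4).
EDGES = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]
parts, q = greedy_modularity(EDGES)
print(sorted(sorted(p) for p in parts), round(q, 3))
# -> [[1, 2, 3], [4, 5, 6]] 0.357
```

Recomputing Q from scratch for every candidate merge keeps the sketch short but is only practical for small graphs; production implementations update the change in Q incrementally.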

The cold-start problem is one of the special problems in recommender systems. It occurs in recommender systems (RS) because of the lack of information about users and items [

In a traditional collaborative recommender system, one of the most complicated tasks is determining the similarity among users. The correlation coefficient is used to measure this similarity. A correlation coefficient is a statistical measure of the degree to which changes in the value of one variable predict changes in the value of another. The Pearson correlation coefficient, named after Karl Pearson (1857-1936), measures the linear correlation between the active user and each other user who rated the target item, and takes a value between +1 and −1 inclusive: 1 is total positive correlation, 0 is no correlation, and −1 is total negative correlation. The method assumes that users who had similar tastes in the past will have similar tastes in the future, and that user preferences remain constant and steady over time. The Pearson correlation coefficient can be easily implemented and can achieve high accuracy [

$$sim(a,u) = \frac{\sum_{p=1}^{n}\left(r_{a,p} - \bar{r}_{a}\right)\left(r_{u,p} - \bar{r}_{u}\right)}{\sqrt{\sum_{p=1}^{n}\left(r_{a,p} - \bar{r}_{a}\right)^{2}}\,\sqrt{\sum_{p=1}^{n}\left(r_{u,p} - \bar{r}_{u}\right)^{2}}}$$

where r_{a,p} is the rating of the active user a for product p, r_{u,p} is the rating of the neighbor u for product p, n is the total number of items, and \bar{r}_{a} and \bar{r}_{u} are the mean ratings of users a and u.
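As a minimal sketch of the similarity computation, the following Python snippet evaluates the Pearson correlation between the active user and each other user of the example user-movie rating matrix used later in this paper. The function name `pearson` is hypothetical, and two choices are assumptions rather than details confirmed by the text: means are taken over the co-rated movies only, and the similarity is set to 0 when it is undefined (fewer than two co-rated movies, or constant ratings).

```python
from math import sqrt

# Ratings from the example user-movie matrix; None marks an unrated movie.
RATINGS = {
    "Active User": [5, 3, 1, 2, None],
    "User1": [4, 3, 1, 5, 3],
    "User2": [None, None, 4, 4, 5],
    "User3": [5, 3, 4, 1, 4],
    "User5": [1, 3, None, 4, None],
}


def pearson(a, u):
    """Pearson correlation over the movies both users have rated."""
    pairs = [(x, y) for x, y in zip(a, u) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0  # assumption: too few co-rated movies means no evidence
    mean_a = sum(x for x, _ in pairs) / len(pairs)
    mean_u = sum(y for _, y in pairs) / len(pairs)
    cov = sum((x - mean_a) * (y - mean_u) for x, y in pairs)
    var_a = sum((x - mean_a) ** 2 for x, _ in pairs)
    var_u = sum((y - mean_u) ** 2 for _, y in pairs)
    if var_a == 0 or var_u == 0:
        return 0.0  # assumption: undefined correlation is treated as 0
    return cov / sqrt(var_a * var_u)


active = RATINGS["Active User"]
for name, row in RATINGS.items():
    if name != "Active User":
        print(name, round(pearson(active, row), 3))
```

On this data, User1 and User3 are both moderately similar to the active user (about 0.49), while User5 is perfectly anti-correlated (−1) over the three movies they share.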

By applying community detection techniques to an online e-commerce network, we can 1) obtain a group of users similar to the target user; 2) predict ratings for the active user based on the users of a similar community; 3) recommend a product or service to a new user; and 4) recommend a new product, such as a new movie, to the users of the RS. For the item cold-start problem, we detect communities based on the user-movie matrix and then create a short description of the new movie. This description can be used as a seed around which communities grow, and the communities and their descriptions serve as the recommendation for the new movie. For the user cold-start problem, the user is new to recommender system X, but his or her profile is available in recommender system Y. The modularity maximization algorithm detects communities based on the user-movie matrix available in recommender system Y. We can use the external community profile available in recommender system Y, such as amazon®, to recommend items to the new user in recommender system X, such as ebay®. Community detection is used to find a group of users similar to the new target user in the system. We can assume that two users who are similar in terms of book ratings also have similar taste in movies, or that the two of them could be friends. This information about the new user's interests in another dimension can be used as a solution to the user cold-start problem.
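One way to picture the user cold-start step is the sketch below: once the new user has been matched to a community in system Y, the movies that this community rates most highly are recommended in system X. This is an illustrative assumption about how the community profile could be used, not the authors' procedure; the function `recommend_for_new_user`, the community membership, and the ranking by mean rating are all hypothetical.

```python
def recommend_for_new_user(community_ratings, top_n=2):
    """Rank items by their mean rating among the community's members."""
    totals, counts = {}, {}
    for ratings in community_ratings.values():
        for item, rating in ratings.items():
            totals[item] = totals.get(item, 0) + rating
            counts[item] = counts.get(item, 0) + 1
    means = {item: totals[item] / counts[item] for item in totals}
    # Highest mean first; ties are broken alphabetically for determinism.
    ranked = sorted(means, key=lambda item: (-means[item], item))
    return ranked[:top_n]


# Hypothetical community found in system Y that matches the new user of X.
community = {
    "User1": {"Movie 1": 4, "Movie 2": 3, "Movie 3": 1,
              "Movie 4": 5, "Movie 5": 3},
    "User3": {"Movie 1": 5, "Movie 2": 3, "Movie 3": 4,
              "Movie 4": 1, "Movie 5": 4},
}
print(recommend_for_new_user(community))  # ['Movie 1', 'Movie 5']
```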

Input: User-Movie Rating Database

Output: Extracted communities

・ Initially, extract the user-movie rating information from the movie rating database.

・ Create a matrix from the given set of values.

Users/Movies | Movie 1 | Movie 2 | Movie 3 | Movie 4 | Movie 5 |
---|---|---|---|---|---|
Active User | 5 | 3 | 1 | 2 | ? |
User1 | 4 | 3 | 1 | 5 | 3 |
User2 | ? | ? | 4 | 4 | 5 |
User3 | 5 | 3 | 4 | 1 | 4 |
User5 | 1 | 3 | ? | 4 | ? |

Recommender System | User Profile | Contextual Parameters | Community Data | Product Features | Knowledge Model |
---|---|---|---|---|---|
Collaborative | Yes | Yes | Yes | No | No |
Content-based | Yes | Yes | No | Yes | No |
Knowledge-based | Yes | Yes | No | Yes | Yes |

Source from [

・ If a user has not rated a movie, assign a zero rating for that movie.

・ Find the adjacency matrix for the user-movie matrix.

・ Create a graph from the adjacency matrix.

・ Reduce it to a simple graph that contains no loops and no multiple edges.

・ Extract the undirected graph from the simple graph.

・ Apply the community detection algorithm.

・ Return the communities with high modularity.
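The graph-construction steps above can be sketched as follows. The text does not say how the adjacency matrix is derived from the user-movie matrix, so this sketch assumes a bipartite construction in which a user node is linked to a movie node whenever the rating is non-zero; the function names `bipartite_adjacency` and `simple_undirected_edges` are hypothetical. The resulting edge list can be passed to any community detection routine.

```python
USERS = ["Active User", "User1", "User2", "User3", "User5"]
MOVIES = ["Movie 1", "Movie 2", "Movie 3", "Movie 4", "Movie 5"]

# Example user-movie matrix with unrated entries already set to zero.
MATRIX = [
    [5, 3, 1, 2, 0],
    [4, 3, 1, 5, 3],
    [0, 0, 4, 4, 5],
    [5, 3, 4, 1, 4],
    [1, 3, 0, 4, 0],
]


def bipartite_adjacency(matrix):
    """Square 0/1 adjacency matrix over users followed by movies."""
    n_users, n_movies = len(matrix), len(matrix[0])
    size = n_users + n_movies
    adj = [[0] * size for _ in range(size)]
    for u, row in enumerate(matrix):
        for m, rating in enumerate(row):
            if rating > 0:  # a rating creates a user-movie link
                adj[u][n_users + m] = 1
                adj[n_users + m][u] = 1  # symmetric, hence undirected
    return adj


def simple_undirected_edges(adj):
    """Edge list with no loops (i < j) and no repeated edges (set)."""
    return sorted({(i, j) for i, row in enumerate(adj)
                   for j, linked in enumerate(row) if linked and i < j})


adj = bipartite_adjacency(MATRIX)
edges = simple_undirected_edges(adj)
names = USERS + MOVIES
print(len(edges), "edges, e.g.", [(names[i], names[j]) for i, j in edges[:2]])
```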

The MovieLens rating data set was collected from the website movielens.umn.edu. The data set contains about 100,000 ratings (1-5) from 943 users on 1664 movies. To simulate the user cold-start problem, we removed all other information from the rating data set, such as the movie descriptions and timestamps. We performed community detection on the user-movie rating matrix. After detecting the communities, we can find the movies relevant to the users; this can be used to recommend movies to a new user on another e-commerce site. Similarly, for the item cold-start problem, we detect communities based on the user-movie matrix and create a short description of the new movie. This description is used as a seed around which communities grow, and the communities and their descriptions serve as the recommendation for the new movie. Based on this observation, the unknown rating for the active user can be predicted by applying the prediction formula to the community in the network. The developed algorithm is applied to the network extracted from the movie database. The output of the community detection is shown in

Community Id | Names of the Movies |
---|---|
1 | Butch Cassidy and the Sundance Kid, Glory, Lawrence of Arabia, Casablanca, Cool Hand Luke, Harold and Maude |
2 | Return of the Jedi, Silence of the Lambs, Raising Arizona, Lawrence of Arabia, Princess Bride, Raiders of the Lost Ark |
3 | Fargo, Star Wars, Titanic, Silence of the Lambs, In the Company of Men, Leaving Las Vegas |
4 | Fargo, Return of the Jedi, Godfather, Shawshank Redemption, Leaving Las Vegas, Empire Strikes Back |
5 | This Is Spinal Tap, Lone Star, Postino, Cinema Paradiso, Three Colors: Red, Fifth Element |
6 | Schindler’s List, Babe, Princess Bride, Killing Fields, Godfather: Part II, This Is Spinal Tap, Cool Hand Luke |
7 | Shawshank Redemption, Return of the Jedi, Graduate, Empire Strikes Back, Fargo, Godfather: Part II |

The following is the formal definition of the dynamic network. Given a dynamic network _{t}} and a set of edges E_{t}. This is much more appropriate when the movies relevant to a community vary over time. Using them, we can easily find the similar users in the network. The proposed algorithm is compared with other community detection algorithms. The comparison result is shown in

r_{u,p} is the rating of the neighbor u for the product p and
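A minimal sketch of how an unknown rating could be predicted from the members of the active user's community is given below. It assumes the common mean-centred weighted-average formula from collaborative filtering, which may differ from the exact formula used in this paper; the function `predict_rating`, the choice of neighbors, and the similarity weights are hypothetical illustration values.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def predict_rating(active, neighbours, item, similarity):
    """Mean-centred weighted average over neighbours who rated `item`."""
    base = mean(list(active.values()))  # active user's own mean rating
    num = den = 0.0
    for name, ratings in neighbours.items():
        if item in ratings:
            weight = similarity[name]
            num += weight * (ratings[item] - mean(list(ratings.values())))
            den += abs(weight)
    # Fall back to the active user's mean when no neighbour rated the item.
    return base if den == 0 else base + num / den


active_user = {"Movie 1": 5, "Movie 2": 3, "Movie 3": 1, "Movie 4": 2}
community = {
    "User1": {"Movie 1": 4, "Movie 2": 3, "Movie 3": 1,
              "Movie 4": 5, "Movie 5": 3},
    "User3": {"Movie 1": 5, "Movie 2": 3, "Movie 3": 4,
              "Movie 4": 1, "Movie 5": 4},
}
weights = {"User1": 0.49, "User3": 0.49}  # illustrative similarity values

print(round(predict_rating(active_user, community, "Movie 5", weights), 2))
# -> 2.95
```

Centring each neighbor's rating on that neighbor's own mean compensates for users who rate systematically high or low.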

S.No | Algorithm | Total number of communities | Single community structure |
---|---|---|---|
1 | Edge betweenness | 4 | 2 |
2 | Modularity maximization | 2 | 0 |
3 | Walktrap | 4 | 1 |
4 | Leading eigenvector | 2 | 0 |
5 | Label propagation | 1 | 0 |
6 | Spin glass | 3 | 1 |

In this paper, we have analyzed the cold-start problem in recommender systems. One of the most important properties of complex networks is community structure: the nodes of a community are tightly connected within the community and loosely connected to other communities. The community detection algorithm is used to detect the communities in the complex network. The cold-start community-based algorithm alleviates the cold-start problem in collaborative filtering recommender systems. Among all the community detection algorithms compared, the cold-start community-based algorithm gives a better result. This algorithm extracts the communities from the network and finds the members of each community. It extracts communities based on the current community structure, and it reflects the dynamic behavior of users being added to or deleted from the network. Real-world networks such as social networks are dynamic in nature, as nodes and edges keep being added to and deleted from the network. This method is also used to compute the similarity value and to predict the rating of an unrated item. In future, the present work may be extended to study the dynamic nature of community structure by analyzing a social media network as a dynamic network, and to find a solution to the sparsity problem in collaborative filtering based on communities.

S. Vairachilai, M. K. Kavithadevi, M. Raja (2016) Alleviating the Cold Start Problem in Recommender Systems Based on Modularity Maximization Community Detection Algorithm. Circuits and Systems, 07, 1268-1279. doi: 10.4236/cs.2016.78111