Searching Experience Sharing Based on Ant Colony Model

Search engines are an important tool for all Internet users, helping them find useful content in cyberspace. However, search experience is difficult to share and accumulate across users. In this paper, a concept called search-trail is proposed. Based on the ant colony model, search-trails are created from the search steps that lead to the target contents. The search-trails built by various users closely resemble the trails generated in an ant colony. Simulations of the proposed solution demonstrate that even when few users are experienced searchers, the generated search-trails still reach 96.29% similarity to the expected ones within 60 days. This shows that search-trails can help users accumulate, share, and reuse their search experience.


Introduction
Using the contents and services published on the web to support research work has become an efficient and useful practice. In fact, today's Internet can be considered the largest knowledge base that has ever existed. To identify the required contents in this huge knowledge base, using search engines is the most practical way [1][2][3][4].
Using a search engine is an interactive process. Web users first submit their keywords and obtain a set of result URLs; by following these results, they retrieve the contents behind the URLs. These search steps are repeated until the users are either satisfied with what they have found or become frustrated and give up. For people who know what they are searching for, i.e., who know the exact keywords related to their topic, current web search solutions are good enough [5,6]. However, when a research study is still in the brainstorming stage, questions such as "which keywords are the right ones" and "how to derive the next search from the current one" are sometimes very difficult to answer. In particular, for users without enough knowledge of the research topic, search engines are still not helpful enough.
From a human point of view, one of the best ways to tackle the above issues is to reuse someone else's experience. Of course, this "someone" should be a knowledgeable person, or a group of knowledgeable people, in the corresponding field. Experience is a kind of implicit knowledge that is difficult to express explicitly. This raises a new question: "Can we provide a mechanism that records knowledgeable users' search experiences and lets others share them later?" Recording someone's search steps is easy. The difficult part is keeping the "useful" search experiences rather than all the search steps. To solve this issue, a new solution called search-trail, which mimics the behavior of ants, is proposed. Search-trail treats the Internet as an ant colony [3,7] and each web user as an ant inside the colony that can follow and spread pheromone on the routes to the food sources. Here, these routes are the search steps leading to the target contents. As many ants search for food sources over time, they collaborate with each other through the pheromone spread in the area. Some routes finally become useful search-trails leading to the food sources, while others simply disappear. The surviving search-trails with strong pheromone can be treated as an associated network of related keywords and contents. The search-trail map, a visual interface, is proposed to organize search-trails so that they can be easily reused and shared by both experienced and inexperienced users. Unlike traditional search technology, the search-trail map records how users search through the web.
To see how the search-trail map influences search behavior, a set of simulations is constructed. In the simulation scenarios, the expected search-trails, i.e., the right trails to the food sources, are assumed to be known. The results show that even in the worst case, with 2% experienced, 8% standard, and 90% inexperienced users, the search-trails generated over a two-month period are 96.29% similar to the expected ones. In all other cases, the search-trails generated in less than 30 days are 100% similar to the expected ones. These simulations show that the search-trail map can help users accumulate and share their search experiences. Furthermore, these experiences are refined day after day, just as ants refine the trails in their colony.
This paper is organized as follows. Section 2 introduces the search-trail model. The design of the mechanism is given in Section 3. To show the effectiveness of the search-trail map, a set of simulations is discussed in Section 4. Section 5 concludes this research.

Ant Colony Model for Search-Trail
In this section, the ant colony model for search-trails is introduced. The ant colony model was originally inspired by the foraging behavior of ants. In the real world, ants communicate with each other indirectly, a form of communication called stigmergy [8]. Ants achieve stigmergic communication by releasing pheromone. Initially, ants wander randomly until some of them find food. These ants release pheromone on the trails back to their colony. The trails may then be found by other ants, which may either continue wandering randomly or follow the trails they have found. In nature, pheromone evaporates over time, and as the pheromone on a trail evaporates, its attractive strength is reduced. An ant that follows a trail may release more pheromone and thereby reinforce the pheromone density. The more ants travel along the same trail, the more pheromone is spread. Therefore, important trails retain a higher pheromone density, whereas the evaporation mechanism causes other trails to be discarded. Finally, only good trails remain.
The ant colony model has been applied to many combinatorial optimization problems, such as the traveling salesman problem, assignment and scheduling problems, and routing problems [9][10][11][12]. Algorithms based on the ant colony model treat ants as separate agents [13][14][15]. These ants walk around a graph representing the problem to be solved. The pheromone released on the edges influences how an ant selects its next step. Algorithms of this kind require a mechanism that acts as a pheromone controller. The pheromone controller reduces the pheromone level on the edges to simulate the evaporation process. Decreasing the pheromone density reduces the influence of past experience and encourages the exploration of new paths.
The idea of applying the ant colony model to search-trails is as follows. In a search-trail map, an Internet user is treated as an ant. During the search process, the keywords used and the web pages visited are recorded as the nodes of a trail. For instance, a user first chooses keyword A to perform a search, and the search engine returns a set of URLs. The user opens a page B from the returned set and then follows a hyperlink in B to another page C. The user might then reformulate the search keyword as D and perform another search, after which a page E from the new result set is visited. This search process can be modeled as the search-trail shown in Figure 1(a), where keywords are drawn as rectangles and pages as ovals. When multiple searches are performed, more than one search-trail may appear, and a map similar to the trails inside an ant colony is generated. A web user can choose to follow any existing trail or to create a new one in the map. Pheromone is spread on an edge of a search-trail whenever a user moves from one node to another through that edge. If an edge is not accessed during a period of time, a fixed amount of its pheromone evaporates.
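The example session above can be sketched in a few lines of Python. This is only an illustration: the `Node` class, the `build_trail` helper, and the assumption that every pair of consecutive steps (including the reformulation from page C to keyword D) becomes one directed edge are hypothetical choices, not taken from the paper's figures.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    label: str
    kind: str  # "keyword" (rectangle) or "page" (oval)

def build_trail(steps):
    """Link each visited node to the next one, giving the trail's directed edges."""
    return [(steps[i], steps[i + 1]) for i in range(len(steps) - 1)]

# The session described in the text: keyword A, pages B and C, keyword D, page E.
session = [
    Node("A", "keyword"),
    Node("B", "page"),
    Node("C", "page"),
    Node("D", "keyword"),
    Node("E", "page"),
]
trail = build_trail(session)
for src, dst in trail:
    print(f"{src.label} ({src.kind}) -> {dst.label} ({dst.kind})")
```

Merging the edge lists of many such sessions would give the map on which pheromone is later spread.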
Under the above model, users will create many search-trails over a period of time. Of course, there may be many target contents, analogous to the food sources in the ant colony, that no search-trail reaches. In the map, some search-trails attract more users and become more established, while others are eventually discarded. The detailed mechanism for building the search-trail map is described in the next section.

Search-Trail Map
To develop the search-trail map based on the ant colony model, pheromone level control is the key issue. Instead of assigning a value to represent the pheromone level at the very beginning, the exact time at which the pheromone will have completely evaporated is used to represent the initial pheromone level. When an ant travels through an edge, the edge's pheromone level is raised by adding a fixed time interval to its previous complete-evaporation time. Since pheromone is spread on the edges of a search-trail map, maintaining each edge's pheromone at the right level is important. When the pheromone levels of all of a node's edges drop to zero, the node is removed from the map.
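A minimal sketch of this time-based representation is shown below. The constant values and the `Edge` class are assumptions made for illustration; the paper names `initial_pheromone_value` and `pheromone_spread_value` but does not give concrete numbers here.

```python
# Hypothetical values; the paper does not state concrete numbers here.
INITIAL_PHEROMONE_VALUE = 10.0  # lifetime granted to a new edge, in time units
PHEROMONE_SPREAD_VALUE = 2.0    # lifetime added by each traversal

class Edge:
    """Stores pheromone as the time at which it will be fully evaporated."""

    def __init__(self, created_at):
        self.completely_evaporated_time = created_at + INITIAL_PHEROMONE_VALUE

    def spread_pheromone(self):
        # An ant passing through pushes the evaporation time further out.
        self.completely_evaporated_time += PHEROMONE_SPREAD_VALUE

    def is_evaporated(self, current_time):
        return self.completely_evaporated_time < current_time

edge = Edge(created_at=0.0)
edge.spread_pheromone()
edge.spread_pheromone()
print(edge.completely_evaporated_time)  # 14.0
```

Storing a timestamp instead of a decaying counter means that an edge nobody touches needs no periodic bookkeeping: its level is implied by the clock.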
The remaining pheromone level is calculated simply by subtracting the current time from the complete-evaporation time. However, the evaporation rate is not the same constant for every edge in a search-trail map. The rates differ because, otherwise, some unpopular but important nodes could be removed from the search-trail map after a period of time. A popular topic is one that many users are interested in; for example, US President Obama is a topic attracting many searches. An important but unpopular topic attracts few users; for example, Kenyan President Kibaki is important to some web users but is definitely not as popular as President Obama. The search-trails of both presidents should remain readable in the search-trail map for as long as possible, even though President Obama has far more search-trails than President Kibaki. Therefore, the evaporation rate in a search-trail map is adjusted by popularity. If a node has few out edges to other nodes, i.e., its degree is small, the node is said to be unpopular and the evaporation rate of its out edges is decreased. Conversely, the evaporation rate of a popular node's out edges is increased. One may wonder whether, under this mechanism, popular nodes would be removed more quickly than unpopular ones. In practice, an edge of a popular node usually has more ants traveling through it, so the pheromone released on the edge is usually large. Popular nodes are therefore not easily removed from a search-trail map.
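The two quantities in this paragraph can be illustrated as follows. The subtraction for the remaining level follows the text directly; the rule `edge_count / preferred_degree` for the evaporation rate is purely an assumption chosen so that small-degree nodes get a rate below one and large-degree nodes a rate above one. The paper's actual adjustment may differ.

```python
def remaining_pheromone(completely_evaporated_time, current_time):
    """Remaining level = evaporation time minus current time (never negative)."""
    return max(0.0, completely_evaporated_time - current_time)

def evaporation_rate(edge_count, preferred_degree=5):
    """Hypothetical popularity rule: more out edges means faster evaporation."""
    return edge_count / preferred_degree

print(remaining_pheromone(100.0, 70.0))  # 30.0
print(evaporation_rate(20))              # popular node, rate above one
print(evaporation_rate(2))               # unpopular node, rate below one
```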
Clearly, the complexity of a search-trail map is closely related to the number of edges in the map. To reduce the complexity, it is reasonable to limit each node's degree to a preferred bound, since too many edges do not help searching.
Before introducing the solution in detail, we first summarize all variables in Table 1.
To guarantee that a newly created edge can be noticed by other users initially, initial_pheromone_value should usually be given a larger value. For existing edges, on the other hand, if they lead to important nodes, their pheromone is accumulated by passing users, and therefore pheromone_spread_value need not be as large as initial_pheromone_value.
The data structures of Node, Edge, and Ant are given in Figures 2-4 respectively.
Figure 5 gives the major algorithm of the search-trail solution. When an ant moves from its current node to another node, named nextNode, the method Ant_Movement(ant, nextNode) is invoked. The algorithm starts by assigning the ant's current location to firstNode and then checks whether the edge e = (firstNode, nextNode) already exists in the search-trail map. If it does, the number of ants traveling through e during the current time interval is increased by one. If e does not exist in the search-trail map, it is created by executing the Edge_Creation(firstNode, nextNode) method and added to the map.
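The steps described for Figure 5 might be sketched as below. Figure 5 itself is not reproduced here, so this is an interpretation: the `TrailMap` container, the extra `now` argument, the value of `INITIAL_PHEROMONE_VALUE`, the choice to start a new edge with `number_of_ants = 0`, and the final update of the ant's position are all assumptions.

```python
from dataclasses import dataclass, field

INITIAL_PHEROMONE_VALUE = 10.0  # hypothetical value

@dataclass
class Edge:
    first_node: str
    next_node: str
    completely_evaporated_time: float
    number_of_ants: int = 0  # ants seen during the current update interval

@dataclass
class Ant:
    current: str  # the ant's current node

@dataclass
class TrailMap:
    edges: dict = field(default_factory=dict)       # (first, next) -> Edge
    edge_count: dict = field(default_factory=dict)  # node -> number of out edges

def edge_creation(trail_map, first_node, next_node, now):
    """Create a new edge and register it as an out edge of first_node."""
    edge = Edge(first_node, next_node, now + INITIAL_PHEROMONE_VALUE)
    trail_map.edges[(first_node, next_node)] = edge
    trail_map.edge_count[first_node] = trail_map.edge_count.get(first_node, 0) + 1
    return edge

def ant_movement(trail_map, ant, next_node, now):
    """Record one move of the ant from its current node to next_node."""
    first_node = ant.current
    edge = trail_map.edges.get((first_node, next_node))
    if edge is not None:
        edge.number_of_ants += 1  # existing edge: count this traversal
    else:
        edge = edge_creation(trail_map, first_node, next_node, now)
    ant.current = next_node  # assumption: the ant's position is updated here
    return edge
```

For example, two ants moving from "A" to "B" create the edge once and then increment its counter once.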
After each short period of time, pheromone_update_time_interval, the search-trail map needs to update the pheromone level of each edge. The algorithm is listed in Figure 6. In this algorithm, each edge's evaporation_rate is updated according to its popularity, and its completely_evaporated_time is then recomputed. If the completely_evaporated_time is smaller than the current_time, which means the edge has evaporated, the edge is removed from the search-trail map. When an edge is removed, its firstNode's edge_count is also reduced by one. When a node's edge_count becomes zero, the node is removed from the map.
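A possible shape of this periodic update is sketched below. The update formula of Figure 6 is not available in the text, so the one used here is an assumption: each traversal adds `PHEROMONE_SPREAD_VALUE`, and a rate above or below one shortens or extends the edge's lifetime by `(rate - 1)` times the interval. The rate rule (`edge_count / PREFERRED_DEGREE`), the constants, and the dictionary layout are likewise hypothetical.

```python
PHEROMONE_SPREAD_VALUE = 2.0          # hypothetical
PHEROMONE_UPDATE_TIME_INTERVAL = 1.0  # hypothetical
PREFERRED_DEGREE = 5                  # hypothetical

def update_pheromone(edges, edge_count, current_time):
    """One update pass. edges maps (first_node, next_node) -> dict of edge fields."""
    degrees = dict(edge_count)  # snapshot so removals do not change rates mid-pass
    for key in list(edges):
        edge = edges[key]
        first_node = key[0]
        # Assumed popularity rule: larger out-degree gives a larger rate.
        edge["evaporation_rate"] = degrees[first_node] / PREFERRED_DEGREE
        # Assumed formula: reinforcement minus rate-dependent extra evaporation.
        edge["completely_evaporated_time"] += (
            edge["number_of_ants"] * PHEROMONE_SPREAD_VALUE
            - (edge["evaporation_rate"] - 1.0) * PHEROMONE_UPDATE_TIME_INTERVAL
        )
        edge["number_of_ants"] = 0  # start counting for the next interval
        if edge["completely_evaporated_time"] < current_time:
            del edges[key]                 # the edge has evaporated
            edge_count[first_node] -= 1
            if edge_count[first_node] == 0:
                del edge_count[first_node]  # node without out edges is removed
```

In a simulation, a scheduler would call `update_pheromone` once per `PHEROMONE_UPDATE_TIME_INTERVAL`.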

Search-Trail Simulation
To see the impact of adopting search-trails on web search, an experiment is proposed. The input to this experiment is as follows. There are 900 nodes and 3417 edges that have been visited and traversed by many ants over a sufficiently long period of time. Inside this colony, a total of 83 keywords and web contents are considered "good" targets. Figure 7 visualizes the input map. The trails in dark grey are the expected good search-trails to be generated by the simulation.
The simulation mechanism is given in Figure 8. It starts by generating the above input map, named the "Reference Map". Then, an Ant Dispatcher creates ants every dispatch_ant_time_interval. As many ants join the search, some search-trails on the map are strengthened. Meanwhile, following the algorithm stated in the previous section, the Pheromone Controller controls the pheromone level on the edges of the simulated map. In the simulation, there are three categories of ants in the colony, namely experienced users, standard users, and inexperienced users. Each ant's search experience is determined by experience_level, which indicates the probability that the ant chooses good nodes on its own. The average time an ant stays on a node is defined by sleep_time. The probability of following existing search-trails is stored in follow_trail_percentage. An ant finishes its search when either the time it has spent searching exceeds search_time or the number of good nodes it has found exceeds satisfied_node_number. Finally, the whole simulation is driven by the Time Controller, in which simulation_time represents the duration of the simulation in days. Table 2 lists all the key attributes of the classes.
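The behavior of a single simulated ant could look roughly like the sketch below. It uses the attribute names from the text, but the order of the two random decisions, the use of a constant `sleep_time` per step, and the `neighbors`, `trail_edges`, and `good_nodes` inputs are assumptions made for illustration; Figure 8 may organize this differently.

```python
import random

def run_ant(neighbors, trail_edges, good_nodes, start, experience_level,
            follow_trail_percentage, search_time, satisfied_node_number,
            sleep_time=1.0, rng=None):
    """Walk one ant over the map until it runs out of time or is satisfied."""
    rng = rng or random.Random()
    current, elapsed = start, 0.0
    path, found = [start], set()
    while elapsed <= search_time and len(found) <= satisfied_node_number:
        options = neighbors.get(current, [])
        if not options:
            break  # dead end: nothing left to visit
        on_trail = [n for n in options if (current, n) in trail_edges]
        good = [n for n in options if n in good_nodes]
        if on_trail and rng.random() < follow_trail_percentage:
            current = rng.choice(on_trail)   # follow an existing search-trail
        elif good and rng.random() < experience_level:
            current = rng.choice(good)       # experience leads to a good node
        else:
            current = rng.choice(options)    # wander randomly
        path.append(current)
        elapsed += sleep_time                # simplification: constant stay time
        if current in good_nodes:
            found.add(current)
    return path, found

# A tiny hypothetical map: keyword "k", good pages g1..g3, and a poor page "x".
neighbors = {"k": ["g1", "x"], "g1": ["g2", "x"], "g2": ["g3", "x"],
             "g3": ["x"], "x": ["k"]}
path, found = run_ant(neighbors, trail_edges=set(), good_nodes={"g1", "g2", "g3"},
                      start="k", experience_level=1.0, follow_trail_percentage=0.0,
                      search_time=10.0, satisfied_node_number=2)
print(path, sorted(found))
```

With `experience_level=1.0` and no existing trails, the ant in this example always picks the good neighbor, which makes the run deterministic.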
Tables 3 and 4 contain the attribute values used in the simulations in this section. To evaluate how good the search-trails generated by the ants are after 30 days in each simulation, a similarity formula is defined to measure the difference between the expected and the generated search-trails. In the formula, the numerator represents the generated search-trails that are expected, and the denominator represents the expected search-trails plus the wrong ones generated by the ants. If all the search-trails generated by the ants are expected, the similarity is 100%.
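One way to read the verbal description of the similarity measure is sketched below. Treating search-trails as set elements and counting "wrong" trails as those generated but not expected are assumptions; under this reading the measure coincides with the Jaccard index of the two sets and reaches 100% only when the generated set equals the expected set.

```python
def similarity(expected, generated):
    """Generated-and-expected trails over expected trails plus wrong ones.

    Assumes `expected` is non-empty; trails may be any hashable objects.
    """
    expected, generated = set(expected), set(generated)
    correct = generated & expected  # generated trails that are expected
    wrong = generated - expected    # generated trails that are not expected
    return len(correct) / (len(expected) + len(wrong))

expected_trails = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")}
generated_trails = {("A", "B"), ("B", "C"), ("C", "D"), ("A", "X")}
print(f"{similarity(expected_trails, generated_trails):.2%}")  # 60.00%
```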
Across the four simulations above, each new simulation increases the share of inexperienced users while correspondingly decreasing the shares of standard and experienced users. The search-trails generated after day 1 are essentially random in all four cases. However, as more and more ants join the search and pheromone is spread and evaporates, the search-trails generated in all four cases gradually converge to the expected ones. In fact, the first two cases generate exactly the expected search-trails within 30 days. For the fourth simulation, if the simulation period is extended to 60 days, the similarity of the generated search-trails reaches 96.29%.

Conclusions
In this research, a concept called search-trail is proposed. Instead of the search targets, search-trail focuses on the search process. Based on the ant colony model, search-trails are created from the search steps that lead to the target contents. The search-trails built by various users closely resemble the trails generated in an ant colony. To see the impact of adopting the proposed solution in a search community, four different simulations are performed. The results show that search-trails can help users accumulate and share search experiences. Even when very few users are experienced, the generated search-trails still reach 96.29% similarity to the expected ones within 60 days.
Future work is to integrate search-trails into currently available search engines. To achieve this goal, the complexity of the solution needs to be reduced further.

Figure 1(b) gives another example of a search-trail.

Figure 7. Scenario for the experiments.

Table 1. Variable definitions.
Variable: Description
evaporation_rate: An edge variable. It indicates the pheromone evaporation rate of a given edge.
edge_count: A node variable. It indicates the current number of out edges of a given node.
pheromone_spread_value: A global variable. It indicates the amount of pheromone spread on an edge when a user travels through it.
number_of_ants: An edge variable. It indicates the number of users that travelled through the edge during a pheromone_update_time_interval.
Current: An ant variable. It indicates the current position of the ant.

Figure 3. Data structure for edge.
Figure 4. Data structure for ant (field: Current).
In the algorithm of Figure 6, each edge's evaporation_rate is updated according to its popularity. When evaporation_rate is larger than one, the edge is said to be popular; otherwise, it is unpopular. The value of completely_evaporated_time is updated according to the formula given in Figure 6. After completely_evaporated_time is updated, the algorithm verifies whether the edge has already evaporated by comparing the updated value with current_time.

Table 2. Component attributes.
Component, attribute: Description
Ant Dispatcher, dispatch_ant_time_interval: It indicates the frequency of an ant's creation.
Ant, satisfied_node_value: It indicates the number of nodes the ant visits before it finishes the search.
Ant, follow_trail_percentage: It indicates the probability of the ant choosing existing search-trails.
Ant, search_time: It indicates the time period the ant stays in searching. It is a normally distributed random variable.