TITLE:
An Approach for Content Retrieval from Web Pages Using Clustering Techniques
AUTHORS:
R. Manjula, A. Chilambuchelvan
KEYWORDS:
Collaborative Filter, Automated Wrapper, Clustering, Information Retrieval, Data Repository
JOURNAL NAME:
Circuits and Systems, Vol.7 No.9, July 28, 2016
ABSTRACT: Mining content from an information database poses challenging
problems for industry experts and researchers because of the sheer volume of
information in large data sets. In web searching, the information retrieved is
often inappropriate: it gives ambiguous answers to the user query, and the user
cannot get relevant information within the stipulated time. To overcome these
issues, we propose a new information retrieval methodology, EPCRR, that
provides the user with the most exact, top-ranked information. It uses a
collaborative clustered automated filter, which draws on a collaborative data
set; the filter works by prediction, assigning the highest rank to the exact
data retrieved. Retrieval is based on recommendation: the relevant data set
with the highest priority is drawn from the cluster of data that is in high
usage. In this work, we use an automated wrapper that works similarly to a
meta crawler and obtains content in a semantic usage data format. The
information passed from the user to the agent is ranked using the Enabled Pile
clustered data, with respect to the metadata information from the agent and
the end-user. The top-ranked data is delivered to the end-user within the
stipulated time, and the remaining top information is moved to the data
repository for future use.
The collected data remains stable with respect to user preference, and the
system follows an intelligent-system approach in which the user can choose any
information in any instance and be provided with a suitably high range of
exact content. We find that the proposed algorithm produces better results
than existing work and requires less online computation time.
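
The ranking step described in the abstract (top-ranked results go to the user, the rest go to a repository) can be illustrated with a minimal sketch. This is not the authors' EPCRR algorithm, which the abstract does not specify. The `Item` type, the `rank_and_split` function, the usage-weighted score, and the sample data are all hypothetical, chosen only to show the idea of favouring items from heavily used clusters.

```python
from dataclasses import dataclass


@dataclass
class Item:
    doc_id: str
    cluster: str      # hypothetical cluster label assigned to the item
    relevance: float  # hypothetical relevance to the query, in [0, 1]


def rank_and_split(items, cluster_usage, top_k=2):
    """Rank items by relevance weighted by their cluster's share of usage.

    Returns (delivered, repository): the top_k items for the end-user and
    the remaining ranked items kept for future use. The scoring rule is an
    assumption made for illustration, not the rule used in the paper.
    """
    total = sum(cluster_usage.values()) or 1  # avoid division by zero

    def score(item):
        return item.relevance * cluster_usage.get(item.cluster, 0) / total

    ranked = sorted(items, key=score, reverse=True)
    return ranked[:top_k], ranked[top_k:]


# Illustrative data only: two clusters with made-up usage counts.
items = [
    Item("a", "c1", 0.9),
    Item("b", "c2", 0.9),
    Item("c", "c1", 0.5),
    Item("d", "c2", 0.2),
]
usage = {"c1": 30, "c2": 10}
delivered, repository = rank_and_split(items, usage, top_k=2)
print([i.doc_id for i in delivered], [i.doc_id for i in repository])
```

Weighting by cluster usage share is one simple way to express "highest priority from the cluster of data which is on high usage". A real system would need a defined relevance measure and clustering method, neither of which the abstract gives.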