Focused web crawler with revisit policy
Proceedings of the International Conference & Workshop on Emerging Trends in Technology - ICWET '11, 2011
Focused crawlers aim to search only the subset of the web related to a specific topic, and offer a potential solution to the problem. The major problem is how to retrieve the maximal set of relevant and quality pages. In this paper, we propose an architecture that concentrates more on page selection policy and page revisit policy. The three-step ...
S. Mali, B. B. Meshram
Focused web crawler for Indonesian recipes
2017 International Conference on Sustainable Information Engineering and Technology (SIET), 2017
Crawlers are commonly used to traverse and collect all public web pages that are connected through links. General crawlers cannot be used for crawling or collecting web pages on a particular topic such as food recipes. This paper proposes a focused web crawler for Indonesian food recipes using simple classification based on the analysis of Indonesian ...
Gusti Ahmad Fanshuri Alfarisy +1 more
Finding seeds to bootstrap focused crawlers
World Wide Web, 2015
Focused crawlers are effective tools for applications requiring a high number of pages belonging to a specific topic. Several strategies for implementing these crawlers have been proposed in the literature, which aim to improve crawling efficiency by increasing the number of relevant pages retrieved while avoiding non-relevant pages.
Karane Vieira +4 more
The BINGO! focused crawler: from bookmarks to archetypes
Proceedings 18th International Conference on Data Engineering, 2003
The BINGO! system implements an approach to focused crawling that aims to overcome the limitations of the initial training data. To this end, BINGO! identifies, among the crawled and positively classified documents of a topic, characteristic "archetypes" and uses them for periodically re-training the classifier; this way the crawler is dynamically ...
Sizov, S. +3 more
Towards a Keyword-Focused Web Crawler
2013
This paper concerns predicting the content of textual web documents based on features extracted from web pages that link to them. It may be applied in an intelligent, keyword-focused web crawler. The experiments made on publicly available real data obtained from Open Directory Project with the use of several classification models are promising and ...
Tomasz Kuśmierczyk, Marcin Sydow
Research on Tunnel Phenomenon in Focused Crawler
2010 International Conference on Internet Technology and Applications, 2010
With the rapid development of search engine technology, focused crawler technology has attracted wide attention and extensive research. In this research field, a "tunnel phenomenon" seriously affects the crawler's efficiency and coverage.
Chun Yan, Hang Zhang
FCHC: A Social Semantic Focused Crawler
2011
The World Wide Web is a huge collection of web pages to which a new piece of information is added every second. Searching and retrieving relevant web resources is a protracted task, and finding relevant resources w.r.t. some topic without any explicit or implicit feedback adds more intricacy to the process.
Anjali Thukral +4 more
CRAWLER-LD: A Multilevel Metadata Focused Crawler Framework for Linked Data
2015
The Linked Data best practices recommend publishing a new tripleset using well-known ontologies and interlinking the new tripleset with other triplesets. However, both are difficult tasks. This paper describes CRAWLER-LD, a metadata crawler that helps select ontologies and triplesets to be used, respectively, in the publication and the interlinking ...
Raphael do Vale A. Gomes +3 more
A Focused Crawler Based on Correlation Analysis
International Journal of Future Generation Communication and Networking, 2014
With the rapid development of network and information technology, a huge amount of data is available on the internet. However, a major problem faced by most researchers is how to effectively filter information on a particular subject or field out of this data.
Qiuli Qin, Xin Peng
A Focused Crawler with Document Segmentation
2005
The focused crawler is a topic-driven document-collecting crawler that was suggested as a promising alternative for maintaining up-to-date Web document indices in search engines. A major problem inherent in previous focused crawlers is the liability of missing highly relevant documents that are linked from off-topic documents.
Jaeyoung Yang +2 more

