Some of the following articles may not be open access.

Learning-Based Focused Web Crawler

IETE Journal of Research, 2021
As the number of pages being published every day increases enormously, there is a consistent need to design an efficient crawler mechanism that can result in appropriate and efficient search result...
Naresh Kumar, Dhruv Aggarwal

Smart distributed web crawler

2016 International Conference on Information Communication and Embedded Systems (ICICES), 2016
Centralized crawlers are not adequate to spider meaningful and relevant portions of the Web. A crawler with good scalability and load balancing can improve performance. As the Web grows, downloading pages in less time and increasing crawler coverage make it necessary to distribute the ...
Sawroop Kaur Bal, G. Geetha

Advanced Web Crawlers

2020
In this chapter, we will discuss a crawling framework called Scrapy and go through the steps necessary to crawl and upload the web crawl data to an S3 bucket.

Implementation of Web Crawler

2009 Second International Conference on Emerging Trends in Engineering & Technology, 2009
The World Wide Web is an interlinked collection of billions of documents formatted using HTML. Ironically, the very size of this collection has become an obstacle for information retrieval. The user has to sift through scores of pages to find the information he/she desires. Web crawlers are the heart of search engines.
Pooja Gupta, Kalpana Johari

Web Crawler for Searching Deep Web Sites

2017 International Conference on Computing, Communication, Control and Automation (ICCUBEA), 2017
On the World Wide Web, deep web searching remains an important issue to date. Searching for relevant information on the web requires different techniques. A crawler is a technique that helps find relevant information on the web. Nowadays, people search for data with the help of search engines such as Google and Yahoo, but these search engines will not ...
Tejaswini Arun Patil, Santosh Chobe

Web crawler research methodology [PDF]

open access: possible, 2011
In economic and social sciences it is crucial to test theoretical models against reliable and sufficiently large databases. The general research challenge is to build a well-structured database that suits the given research question and is cost-efficient at the same time. In this paper we focus on crawler programs that proved to be an effective
Nemeslaki, András   +1 more

Evaluating topic-driven web crawlers

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, 2001
Due to limited bandwidth, storage, and computational resources, and to the dynamic nature of the Web, search engines cannot index every Web page, and even the covered portion of the Web cannot be monitored continuously for changes. Therefore it is essential to develop effective crawling strategies to prioritize the pages to be indexed.
Filippo Menczer   +3 more

Real-time web crawler detection

2011 18th International Conference on Telecommunications, 2011
In this paper we present a methodology for detecting web crawlers in real time. We use decision trees to classify requests in real time, as originating from a crawler or human, while their session is ongoing. For this purpose we used machine learning techniques to identify the most important features that differentiate humans from crawlers.
Balla, A.   +5 more

Learnable topic-specific web crawler

Journal of Network and Computer Applications, 2005
A topic-specific web crawler collects web pages relevant to topics of interest from the Internet. Many previous studies focus on algorithms for web page crawling. The main purpose of those algorithms is to gather as many relevant web pages as possible, and most of them detail only the approach for the first crawl.
A. Rungsawang, N. Angkawattanawit
