
An Enhanced Semantic Focused Web Crawler Based on Hybrid String Matching Algorithm [PDF]

open access: diamond | Cybernetics and Information Technologies, 2021
Topic precise crawler is a special purpose web crawler, which downloads appropriate web pages analogous to a particular topic by measuring cosine similarity or semantic similarity score.
Sakunthala Prabha K. S.   +2 more
doaj   +2 more sources

Nuevos retos de la tecnología web crawler para la recuperación de información [New challenges for web crawler technology in information retrieval]

open access: diamond | Métodos de Información, 2014
The web crawler is an important part of the document chain in information retrieval, since it generates the document corpus to which the various retrieval algorithms are applied.
Blázquez Ochando, Manuel
doaj   +3 more sources

An MLLM-Assisted Web Crawler Approach for Web Application Fuzzing

open access: gold | Applied Sciences
Web application fuzzing faces significant challenges in achieving comprehensive test interface (attack surface) coverage, primarily due to the complexity of user interactions and dynamic website architectures.
Wantong Yang   +5 more
doaj   +2 more sources

Analysis on the Judicial Interpretation of the Crawler Technology Infringing on the Intellectual Property Rights of Enterprise Data [PDF]

open access: yes | E3S Web of Conferences, 2021
In the actual process of web crawler infringement and criminal identification, there is a theory of “weakening the infringement typology and strengthening the presumption of legal interest”. This is also the basic method for the subsequent identification
Yang Juan
doaj   +1 more source

IHWC: intelligent hidden web crawler for harvesting data in urban domains

open access: yes | Complex & Intelligent Systems, 2021
Due to the massive size of the hidden web, searching, retrieving and mining rich and high-quality data can be a daunting task. Moreover, with the presence of forms, data cannot be accessed easily.
Sawroop Kaur   +3 more
doaj   +1 more source

An Enhanced Focused Web Crawler for Biomedical Topics Using Attention Enhanced Siamese Long Short Term Memory Networks

open access: yes | Brazilian Archives of Biology and Technology, 2022
The Internet is chosen to be one among the primary source of biomedical information. To retrieve necessary biomedical information, the search engine needs an efficient, focused crawler mechanism.
Joe Dhanith Pal Nesamony Rose Mary   +2 more
doaj   +1 more source

Effective Web Page Crawler [PDF]

open access: yes | Engineering and Technology Journal, 2011
The World Wide Web (WWW) has grown from a few thousand pages in 1993 to more than eight billion pages at present. Due to this explosion in size, web search engines are becoming increasingly important as the primary means of locating relevant information.
Hilal Hadi Saleh, Israa Ali
doaj   +1 more source

SIMHAR - Smart Distributed Web Crawler for the Hidden Web Using SIM+Hash and Redis Server

open access: yes | IEEE Access, 2020
Developing a distributed web crawler obliges major engineering challenges, all of which are eventually associated to scale. To retain corpus of search engine and a reasonable state of freshness, the crawler must be distributed over multiple computers. In
Sawroop Kaur, G. Geetha
doaj   +1 more source

A Critique Empirical Evaluation of Relevance Computation for Focused Web Crawlers

open access: yes | Brazilian Archives of Biology and Technology, 2022
Analogous to the spectacular growth of information-superhighway, The Internet, demands for coherent and economical crawling methods are translucent to shoot up. Consequently, many innovative techniques have been put forth for efficient crawling.
Joe Dhanith Pal Nesamony Rose Mary   +2 more
doaj   +1 more source

Public opinion data collection of power network using topic crawler

open access: yes | Xi'an Gongcheng Daxue xuebao, 2022
The traditional public opinion data collection methods of power network have some problems, such as low recall rate, low calculation accuracy and being time-consuming. Therefore, the topic crawler technology was used to improve the data collection method.
XI Zenghui   +3 more
doaj   +1 more source
