Focused Crawl of Web Archives to Build Event Collections [PDF]
Event collections are frequently built by crawling the live web on the basis of seed URIs nominated by human experts. Focused web crawling is a technique where the crawler is guided by reference content pertaining to the event. Given the dynamic nature of the web and the pace with which topics evolve, the timing of the crawl is a concern for both ...
arxiv
The Robot Crawler Model on Complete k-Partite and Erdős-Rényi Random Graphs [PDF]
Web crawlers are used by internet search engines to gather information about the web graph. In this paper we investigate a simple process which models such software by walking around the vertices of a graph. Once initial random vertex weights have been assigned, the robot crawler traverses the graph deterministically following a greedy algorithm ...
arxiv
A Brief History of Web Crawlers [PDF]
Web crawlers visit internet applications, collect data, and learn about new web pages from visited pages. Web crawlers have a long and interesting history. Early web crawlers collected statistics about the web. In addition to collecting statistics about the web and indexing the applications for search engines, modern crawlers can be used to perform ...
arxiv
Intangible cultural heritage is an invaluable treasure for all human beings worldwide. In 2009, the United Nations Educational, Scientific and Cultural Organization (UNESCO) inscribed the Chinese Cantonese Opera, also known as Yueju in Chinese Pinyin, on
Chenghong Cen+7 more
doaj +1 more source
The Crawler: Three Equivalence Results for Object (Re)allocation Problems when Preferences Are Single-peaked [PDF]
For object reallocation problems, if preferences are strict but otherwise unrestricted, the Top Trading Cycles rule (TTC) is the leading rule: It is the only rule satisfying efficiency, individual rationality, and strategy-proofness. However, on the subdomain of single-peaked preferences, Bade (2019) defines a new rule, the "crawler", which also ...
arxiv
ORCA: a Benchmark for Data Web Crawlers [PDF]
The number of RDF knowledge graphs available on the Web grows constantly. Gathering these graphs at large scale for downstream applications hence requires the use of crawlers. Although Data Web crawlers exist, and general Web crawlers could be adapted to focus on the Data Web, there is currently no benchmark to fairly evaluate their performance.
arxiv
The UK Online Gender Audit 2018: A comprehensive audit of gender within the UK's online environment
Gender inequality has exploded as a recent issue within mainstream media across US and UK cultural commentary. High-profile scandals of sexual harassment and gender pay differences have focused attention on the on-going disparity between sexes and ...
Ana-Maria Huluba+2 more
doaj
Full publication of preprint articles in prevention research: an analysis of publication proportions and results consistency. [PDF]
Sommer I+6 more
europepmc +1 more source
Online content availability, commercial viability, and technological advancements for English and European languages direct mainstream search engines to prioritize the search results of these high-resource languages.
Muhammad Amir Mehmood, Bilal Tahir
doaj +1 more source
Multilingual Context Ontology Rule Enhanced Focused Web Crawler
Mukesh Kumar, Renu Vig
openalex +1 more source