Focused Crawl of Web Archives to Build Event Collections [PDF]

open access: yes; arXiv, 2018
Event collections are frequently built by crawling the live web on the basis of seed URIs nominated by human experts. Focused web crawling is a technique where the crawler is guided by reference content pertaining to the event. Given the dynamic nature of the web and the pace with which topics evolve, the timing of the crawl is a concern for both ...
arxiv  

The Robot Crawler Model on Complete k-Partite and Erdős-Rényi Random Graphs [PDF]

open access: yes; arXiv, 2017
Web crawlers are used by internet search engines to gather information about the web graph. In this paper we investigate a simple process which models such software by walking around the vertices of a graph. Once initial random vertex weights have been assigned, the robot crawler traverses the graph deterministically following a greedy algorithm ...
arxiv  

A Brief History of Web Crawlers [PDF]

open access: yes; Proc. of CASCON 2013, Toronto, Nov. 2013, 2014
Web crawlers visit internet applications, collect data, and learn about new web pages from visited pages. Web crawlers have a long and interesting history. Early web crawlers collected statistics about the web. In addition to collecting statistics about the web and indexing the applications for search engines, modern crawlers can be used to perform ...
arxiv  

Enhancing the dissemination of Cantonese Opera among youth via Bilibili: a study on intangible cultural heritage transmission

open access: yes; Humanities & Social Sciences Communications
Intangible cultural heritage is an invaluable treasure for all human beings worldwide. In 2009, the United Nations Educational, Scientific and Cultural Organization (UNESCO) inscribed the Chinese Cantonese Opera, also known as Yueju in Chinese Pinyin, on
Chenghong Cen   +7 more
doaj   +1 more source

The Crawler: Three Equivalence Results for Object (Re)allocation Problems when Preferences Are Single-peaked [PDF]

open access: yes; arXiv, 2019
For object reallocation problems, if preferences are strict but otherwise unrestricted, the Top Trading Cycles rule (TTC) is the leading rule: It is the only rule satisfying efficiency, individual rationality, and strategy-proofness. However, on the subdomain of single-peaked preferences, Bade (2019) defines a new rule, the "crawler", which also ...
arxiv  

ORCA: a Benchmark for Data Web Crawlers [PDF]

open access: yes; arXiv, 2019
The number of RDF knowledge graphs available on the Web grows constantly. Gathering these graphs at large scale for downstream applications hence requires the use of crawlers. Although Data Web crawlers exist, and general Web crawlers could be adapted to focus on the Data Web, there is currently no benchmark to fairly evaluate their performance.
arxiv  

The UK Online Gender Audit 2018: A comprehensive audit of gender within the UK's online environment

open access: yes; Heliyon, 2018
Gender inequality has exploded as a recent issue within mainstream media across US and UK cultural commentary. High-profile scandals of sexual harassment and gender pay differences have focused attention on the on-going disparity between sexes and ...
Ana-Maria Huluba   +2 more
doaj  

Full publication of preprint articles in prevention research: an analysis of publication proportions and results consistency. [PDF]

open access: yes; Sci Rep, 2023
Sommer I   +6 more
europepmc   +1 more source

Humkinar: Construction of a Large Scale Web Repository and Information System for Low Resource Urdu Language

open access: yes; IEEE Access
Online content availability, commercial viability, and technological advancements for English and European languages direct mainstream search engines to prioritize the search results of these high-resource languages.
Muhammad Amir Mehmood, Bilal Tahir
doaj   +1 more source
