
A Novel Term Weighing Scheme Towards Efficient Crawl of Textual Databases [PDF]

open access: yes, arXiv, 2013
The Hidden Web is the vast repository of informational databases available only through search form interfaces, accessed by typing a set of keywords into the search forms. Typically, a Hidden Web crawler is employed to autonomously discover and download pages from the Hidden Web. Traditional Hidden Web crawlers do not provide the search engines ...
arxiv  

WebParF: A Web partitioning framework for Parallel Crawlers [PDF]

open access: yes, arXiv, 2014
With the ever-proliferating size and scale of the WWW [1], efficient ways of exploring content are of increasing importance. How can we efficiently retrieve information from it through crawling? In this era of tera and multi-core processors, we ought to think of multi-threaded processes as a serving solution.
arxiv  

Sistem Peringkas Berita Otomatis berbasis Text Mining menggunakan Generalized Vector Space Model: Studi Kasus Berita diambil dari Media Massa Online [Automatic News Summarization System Based on Text Mining Using the Generalized Vector Space Model: A Case Study of News Taken from Online Mass Media]

open access: yes, Techne, 2014
This paper discusses a system whose main function is to generate document summaries automatically using text mining methods.
Budhi Kurniawan Wangsa   +2 more
doaj  

Effective Focused Crawling Based on Content and Link Structure Analysis [PDF]

open access: yes, IJCSIS June 2009 Issue, Vol. 2, No. 1, 2009
A focused crawler traverses the web, selecting pages relevant to a predefined topic and ignoring those that are off-topic. While surfing the internet, it is difficult to deal with irrelevant pages and to predict which links lead to quality pages. In this paper, a technique for effective focused crawling is implemented to improve the quality of web ...
arxiv  

Search Sounds: An Audio Crawler Focused On Weblogs.

open access: yes, 2006
Òscar Celma   +2 more
openaire   +2 more sources

Navigating the Small World Web by Textual Cues [PDF]

open access: yes, arXiv, 2002
Can a Web crawler efficiently locate an unknown relevant page? While this question is receiving much empirical attention due to its considerable commercial value in the search engine community [Cho98,Chakrabarti99,Menczer00,Menczer01], theoretical efforts to bound the performance of focused navigation have only exploited the link structure of the Web ...
arxiv  

Research overview of microblog analysis

open access: yes, Journal of Hebei University of Science and Technology
Microblogs are an important platform for social information communication. Because they are easy to use and spread quickly, people can directly and quickly express their attitudes toward emergencies, public figures, hot products and daily ...
Bin LIU   +5 more
doaj   +1 more source

Swap Dynamics in Single-Peaked Housing Markets [PDF]

open access: yes, arXiv, 2019
This paper focuses on the problem of fairly and efficiently allocating resources to agents. We consider a specific setting, usually referred to as a housing market, where each agent must receive exactly one resource (and initially owns one). In this framework, in the domain of linear preferences, the Top Trading Cycle (TTC) algorithm is the only ...
arxiv  

Malicious and Benign Webpages Dataset

open access: yes, Data in Brief, 2020
Web security is a challenging task amid ever-rising threats on the Internet. With billions of websites active on the Internet, and hackers evolving newer techniques to trap web users, machine learning offers promising techniques to detect malicious ...
A.K. Singh
doaj  

PVSS: A Progressive Vehicle Search System for Video Surveillance Networks [PDF]

open access: yes, arXiv, 2019
This paper is focused on the task of searching for a specific vehicle that appeared in the surveillance networks. Existing methods usually assume the vehicle images are well cropped from the surveillance videos, then use visual attributes, like colors and types, or license plate numbers to match the target vehicle in the image set.
arxiv  
