A Novel Term Weighing Scheme Towards Efficient Crawl of Textual Databases [PDF]
The Hidden Web is the vast repository of informational databases available only through search form interfaces, accessed by typing a set of keywords into the search forms. Typically, a Hidden Web crawler is employed to autonomously discover and download pages from the Hidden Web. Traditional hidden web crawlers do not provide the search engines ...
arxiv
WebParF: A Web partitioning framework for Parallel Crawlers [PDF]
With the ever-proliferating size and scale of the WWW [1], efficient ways of exploring content are of increasing importance. How can we efficiently retrieve information from it through crawling? And in this era of tera and multi-core processors, we ought to think of multi-threaded processes as a serving solution.
arxiv
This paper discusses a system whose main function is to automatically generate summaries of documents using text mining methods.
Budhi Kurniawan Wangsa+2 more
doaj
Effective Focused Crawling Based on Content and Link Structure Analysis [PDF]
A focused crawler traverses the web, selecting pages relevant to a predefined topic and ignoring those that are not. While surfing the internet, it is difficult to deal with irrelevant pages and to predict which links lead to quality pages. In this paper, a technique of effective focused crawling is implemented to improve the quality of web ...
arxiv
Search Sounds: An Audio Crawler Focused On Weblogs.
Òscar Celma+2 more
openaire +2 more sources
Navigating the Small World Web by Textual Cues [PDF]
Can a Web crawler efficiently locate an unknown relevant page? While this question is receiving much empirical attention due to its considerable commercial value in the search engine community [Cho98,Chakrabarti99,Menczer00,Menczer01], theoretical efforts to bound the performance of focused navigation have only exploited the link structure of the Web ...
arxiv
Research overview of microblog analysis
Microblogs are an important platform for social information communication. Because they are easy to use and spread information quickly, people can directly and quickly express their attitudes toward emergencies, public figures, hot products and daily ...
Bin LIU+5 more
doaj +1 more source
Swap Dynamics in Single-Peaked Housing Markets [PDF]
This paper focuses on the problem of fairly and efficiently allocating resources to agents. We consider a specific setting, usually referred to as a housing market, where each agent must receive exactly one resource (and initially owns one). In this framework, in the domain of linear preferences, the Top Trading Cycle (TTC) algorithm is the only ...
arxiv
Malicious and Benign Webpages Dataset
Web security is a challenging task amidst ever-rising threats on the Internet. With billions of websites active on the Internet, and hackers evolving newer techniques to trap web users, machine learning offers promising techniques to detect malicious ...
A.K. Singh
doaj
PVSS: A Progressive Vehicle Search System for Video Surveillance Networks [PDF]
This paper is focused on the task of searching for a specific vehicle that appeared in the surveillance networks. Existing methods usually assume the vehicle images are well cropped from the surveillance videos, then use visual attributes, like colors and types, or license plate numbers to match the target vehicle in the image set.
arxiv