
ARCOMEM Crawling Architecture

open access: yes; Future Internet, 2014
The World Wide Web is the largest information repository available today. However, this information is very volatile and Web archiving is essential to preserve it for the future. Existing approaches to Web archiving are based on simple definitions of the
Vassilis Plachouras   +7 more
doaj   +1 more source

RCrawler: An R package for parallel web crawling and scraping

open access: yes; SoftwareX, 2017
RCrawler is a contributed R package for domain-based web crawling and content scraping. As the first implementation of a parallel web crawler in the R environment, RCrawler can crawl, parse, store pages, extract contents, and produce data that can be ...
Salim Khalil, Mohamed Fakir
doaj  

Research and design of distributed high-performance network reptiles based on cloud platform

open access: yes; Dianxin kexue, 2017
With the arrival of the big data age, data has become the most valuable resource. Web crawler technology, as an important means of external data collection, has become a standard tool for data analysis. A high-performance, convenient cloud-based crawler ...
Enming SHI, Xiaojun XIAO, Yu LU
doaj   +2 more sources

The ARCOMEM Architecture for Social- and Semantic-Driven Web Archiving

open access: yes; Future Internet, 2014
The constantly growing amount of Web content and the success of the Social Web lead to increasing needs for Web archiving. These needs go beyond the pure preservation of Web pages.
Thomas Risse   +13 more
doaj   +1 more source

Quantitative evaluation of recall and precision of CAT Crawler, a search engine specialized on retrieval of Critically Appraised Topics

open access: yes; BMC Medical Informatics and Decision Making, 2004
Background: Critically Appraised Topics (CATs) are a useful tool that helps physicians make clinical decisions as healthcare moves towards the practice of Evidence-Based Medicine (EBM).
Loh Marie   +4 more
doaj   +1 more source

Technostress: A Technology-Enhanced Literature Review

open access: yes; Applied Medical Informatics, 2021
Background: Technostress is defined as stress or psychosomatic illness caused by working with computer technology on a daily basis. The worldwide COVID-19 pandemic restrictions and measures such as computer-based working and computer-assisted education ...
Ariana-Anamaria CORDOŞ   +2 more
doaj  

PathMarker: protecting web contents against inside crawlers

open access: yes; Cybersecurity, 2019
Web crawlers have been misused for several malicious purposes such as downloading server data without permission from the website administrator. Moreover, armoured crawlers are evolving against new anti-crawler mechanisms in the arms race between crawler
Shengye Wan, Yue Li, Kun Sun
doaj   +1 more source

Multiple-Feature Extracting Modules Based Leak Mining System Design

open access: yes; The Scientific World Journal, 2013
Over the years, human dependence on the Internet has increased dramatically. A large amount of information is placed on the Internet and retrieved from it daily, which makes web security in terms of online information a major concern.
Ying-Chiang Cho, Jen-Yi Pan
doaj   +1 more source

Algorithms and software solutions for SQL injection vulnerability testing in web applications

open access: yes; Вісник Національного технічного університету "ХПІ": Системний аналіз, управління та інформаційні технології, 2018
Software security gains importance day by day, and developers try to secure web applications as much as possible to protect confidentiality, integrity and availability, which are described in the fundamental security model known as the CIA triad. SQL injection
Arslan Berk   +3 more
doaj   +1 more source
