
Development of Focused Crawlers for Building Large Punjabi News Corpus

open access: yes, Journal of ICT Research and Applications, 2021
Web crawlers are as old as the Internet and are most commonly used by search engines to visit websites and index them into repositories. They are not limited to search engines but are also widely utilized to build corpora in different domains and ...
Gurjot Singh Mahi, Amandeep Verma
doaj   +1 more source

Improving the performance of focused web crawlers

open access: yes, Data & Knowledge Engineering, 2009
This work addresses issues related to the design and implementation of focused crawlers. Several variants of state-of-the-art crawlers relying on web page content and link information for estimating the relevance of web pages to a given topic are proposed.
Πετρακης Ευριπιδης (http://users.isc.tuc.gr/~epetrakis)   +3 more
openaire   +3 more sources

Using Web Crawler Technology for Text Analysis of Geo-Events: A Case Study of the Huangyan Island Incident [PDF]

open access: yes, The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2013
As social networking and network socialisation have brought more text information and social relationships into our daily lives, the question of whether big data can be fully used to study the phenomenon and discipline of natural sciences has ...
H. Hu, Y. J. Ge
doaj   +1 more source

Construction and Application of the Attention Analysis Model of Brand Management Policies of Agricultural Products with Geographical Indications [PDF]

open access: yes, Nongye tushu qingbao xuebao, 2023
[Purpose/Significance] Geographical indications (GIs) are an important tool for local governments in China to carry out brand building of agricultural products. Brand management is a continuous systematic project involving multiple subjects.
HUO Mengjia, LIU Juan, Huang Jie
doaj   +1 more source

Research on post occupancy evaluation of Oze National Park in Japan based on online reviews

open access: yes, Journal of Asian Architecture and Building Engineering, 2023
With the development of the internet, online reviews have become a typical form of user-generated content in the web-media era, and meaning can be extracted from these comments through data-mining technology.
Shouni Tang   +3 more
doaj   +1 more source

Scaling-laws of human broadcast communication enable distinction between human, corporate and robot Twitter users. [PDF]

open access: yes, 2013
Human behaviour is highly individual by nature, yet statistical structures are emerging which seem to govern the actions of human beings collectively. Here we search for universal statistical laws dictating the timing of human actions in communication ...
Faisal, A, Tavares, G
core   +2 more sources

Crawling the German Health Web: Exploratory Study and Graph Analysis

open access: yes, Journal of Medical Internet Research, 2020
Background: The internet has become an increasingly important resource for health information. However, with a growing amount of web pages, it is nearly impossible for humans to manually keep track of evolving and continuously changing content in the ...
Zowalla, Richard   +2 more
doaj   +1 more source

TwitterMancer: predicting interactions on Twitter accurately [PDF]

open access: yes, 2019
This paper investigates the interplay between different types of user interactions on Twitter, with respect to predicting missing or unseen interactions.
Byers, John W.   +3 more
core   +1 more source

iCrawl: Improving the Freshness of Web Collections by Integrating Social Web and Focused Web Crawling

open access: yes, 2016
Researchers in the Digital Humanities and journalists need to monitor, collect and analyze fresh online content regarding current events such as the Ebola outbreak or the Ukraine crisis on demand.
Diligenti M.   +4 more
core   +1 more source

Digital Availability of Product Information for Collaborative Engineering of Spacecraft [PDF]

open access: yes, 2019
In this paper, we introduce a system to collect product information from manufacturers and make it available in tools that are used for concurrent design of spacecraft.
A. L. Ramos   +7 more
core   +3 more sources
