Results 111 to 120 of about 11,273 (224)
Using Hadoop to implement a semantic method for assessing the quality of medical data
Recent technological advances in modern healthcare have lead to a vast wealth of patient data being collected. This data is not only utilised for diagnosis but also has the potential to be used for medical research.
Bonner, Stephen Arthur Robert
core
FP-Hadoop makes the reduce side of Hadoop MapReduce more parallel and efficiently deals with the problem of data skew in the reduce side. In FP-Hadoop, there is a new phase, called intermediate reduce (IR), in which blocks of intermediate values ...
Akbarinia, Reza +2 more
core
Búsquedas optimizadas en la página web de la ESPOL [PDF]
El presente documento ilustra el análisis y la implementación de una aplicación que se enfoca en realizar búsquedas optimizadas en la página de la ESPOL. Actualmente el sitio web de la universidad no cuenta con un proceso de búsqueda propio que permita
Rodríguez Rivera, Carlos +2 more
core
Data Lakes: A Survey of Concepts and Architectures
This paper presents a comprehensive literature review on the evolution of data-lake technology, with a particular focus on data-lake architectures. By systematically examining the existing body of research, we identify and classify the major types of ...
Sarah Azzabi +2 more
doaj +1 more source
Everyday quintillion bytes of data are created. About 90% of this data which are posts to social media sites, digital pictures, videos etc. are unstructured. These data is BigData and should be formatted to make it suitable for data mining and its subsequent analysis.
Daniel, Suman Elizabeth, A, Binu
openaire +2 more sources
Hadoop: The Definitive Guide helps you harness the power of your data. Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire.
White, Tom
core
Improving Performance of Hadoop Clusters [PDF]
The MapReduce model has become an important parallel processing model for large- scale data-intensive applications like data mining and web indexing. Hadoop, an open-source implementation of MapReduce, is widely applied to support cluster computing jobs ...
Xie, Jiong
core
Hadoop mapreduce scheduling paradigms [PDF]
Apache Hadoop is one of the most prominent and early technologies for handling big data. Different scheduling algorithms within the framework of Apache Hadoop were developed in the last decade.
Yazidi, Anis +2 more
core
Hadoop es actualmente uno de los principales frameworks para analizar Big Data, uno de sus inconvenientes es la poca cantidad de métricas de rendimiento que se conocen sobre el.
Montero Rivero, Alejandro
core +1 more source
Using software-defined networking to improve the performance of Hadoop clusters
碩士由於現今網路發展迅速,巨量資料的時代已經來臨。全球各大企業與組織紛紛改採雲端運算模型來解決他們所面臨的諸多問題。雲端運算的龐大計算與儲存能力來自於大型資料中心,其中運行的巨量資料處理核心工具則多數為Hadoop MapReduce。相關研究指出資料中心透過MapReduce運算框架所產生的中介資料交互傳遞(Shuffling)行為造成網路壅塞現象,直接影響到運算工作的執行效能。針對這個問題,已有一些研究初步驗證了結合軟體定義網路技術將是一個可行的解決方案。換言之 ...
廖振宇;Liao, Jhen-Yu
core

