Some of the following articles may not be open access.

Data Deduplication Techniques

Journal of Software, 2010
With the rapid development of information and network technology, data centers are growing quickly and energy consumption accounts for a rising proportion of IT spending. In the current push for greener computing, many companies are looking to green storage, hoping thereby to reduce the energy consumption of storage systems.
Li AO, Ji-Wu SHU, Ming-Qiang LI

Demystifying data deduplication

Proceedings of the ACM/IFIP/USENIX Middleware '08 Conference Companion, 2008
Effectiveness and tradeoffs of deduplication technologies are not well understood -- vendors tout Deduplication as a "silver bullet" that can help any enterprise optimize its deployed storage capacity. This paper aims to provide a comprehensive taxonomy and experimental evaluation using real-world data.
Nagapramod Mandagere   +3 more

Distributed data deduplication

Proceedings of the VLDB Endowment, 2016
Data deduplication refers to the process of identifying tuples in a relation that refer to the same real world entity. The complexity of the problem is inherently quadratic with respect to the number of tuples, since a similarity value must be computed for every pair of tuples.
Xu Chu, Ihab F. Ilyas, Paraschos Koutris
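The quadratic cost mentioned in the abstract above can be seen in a minimal sketch of naive pairwise matching. This is not the paper's distributed method; the `similarity` function, the 0.85 threshold, and the sample records are assumptions chosen only for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Hypothetical similarity measure: case-insensitive string ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_duplicate_pairs(records, threshold=0.85):
    """Compare every pair of records; returns (matching index pairs, comparison count)."""
    pairs = []
    comparisons = 0
    # combinations() yields n * (n - 1) / 2 pairs, hence the quadratic cost.
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        comparisons += 1
        if similarity(a, b) >= threshold:
            pairs.append((i, j))
    return pairs, comparisons


# Illustrative data only (assumed, not taken from the paper).
records = ["John Smith", "Jon Smith", "Alice Jones", "Bob Lee"]
pairs, comparisons = find_duplicate_pairs(records)
print(pairs, comparisons)
```

With n records the loop performs n(n-1)/2 similarity computations, which is why larger inputs usually call for partitioning or distributing the comparisons.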

Transparent Data Deduplication in the Cloud

Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, 2015
Cloud storage providers such as Dropbox and Google Drive heavily rely on data deduplication to save storage costs by only storing one copy of each uploaded file. Although recent studies report that whole-file deduplication can achieve up to 50% storage reduction, users do not directly benefit from these savings, as there is no transparent relation ...
Armknecht, Frederik   +3 more
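Storing only one copy of each uploaded file, as the abstract above describes, can be sketched with a content-addressed store keyed by a hash of the file bytes. This is a minimal in-memory illustration under assumed names (`DedupStore`, `upload`, `download`, `stored_bytes`); it is not the scheme proposed in the paper and says nothing about how real providers implement deduplication.

```python
import hashlib


class DedupStore:
    """Hypothetical whole-file deduplicating store (in memory, illustration only)."""

    def __init__(self):
        self._blobs = {}  # SHA-256 hex digest -> file bytes (one copy per content)
        self._files = {}  # (user, filename) -> digest of that file's content

    def upload(self, user: str, name: str, data: bytes) -> bool:
        """Record the file; returns True only if the content was not stored before."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self._blobs
        if is_new:
            self._blobs[digest] = data
        self._files[(user, name)] = digest
        return is_new

    def download(self, user: str, name: str) -> bytes:
        """Return the bytes a user uploaded under the given name."""
        return self._blobs[self._files[(user, name)]]

    def stored_bytes(self) -> int:
        """Total bytes physically kept, after deduplication."""
        return sum(len(blob) for blob in self._blobs.values())


store = DedupStore()
first = store.upload("alice", "notes.txt", b"hello")
second = store.upload("bob", "copy.txt", b"hello")  # same content, different user
print(first, second, store.stored_bytes())
```

Both users can retrieve their file, but the provider keeps the content once; the saving is visible only on the provider's side, which is the transparency gap the abstract points to.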

FINDING DATA DEDUPLICATION USING CLOUD

YMER Digital, 2022
Data grows at the striking rate of 50% per year, and 75% of the digital world is a copy. Although keeping multiple copies of data is necessary to guarantee availability and continuity, the amount of data redundancy is excessive. By keeping a single copy of repeated data, data deduplication is one of the most promising solutions to reduce ...
Mr. M. A. R KUMAR, Mrs. SRILATHA PULI

Deduplication Over Big Data Integration

2021
Entity resolution is the process of matching records from more than one database that refer to the same entity. In the case of a single database, the process is called deduplication. This article proposes a method to solve the deduplication problem using Scala on the Spark framework.
M. El Abassi, Med. Amnai, A. Choukri

Data Deduplication Techniques and Analysis

2010 3rd International Conference on Emerging Trends in Engineering and Technology, 2010
Data warehouses are the repositories of data collected from several data sources, which form the backbone of most of the decision support applications. As the data sources are independent, they may adopt independent and potentially inconsistent conventions.
Srivatsa Maddodi   +2 more

Data Deduplication Based on Hadoop

2017 Fifth International Conference on Advanced Cloud and Big Data (CBD), 2017
Efficient and scalable deduplication techniques are required to serve the need of removing duplicated data in big data processing platforms such as Hadoop. In this paper, an integrated deduplication approach is proposed by taking the features of Hadoop into account and leveraging parallelism based on MapReduce and HBase so as to speed up the ...
Dongzhan Zhang   +4 more
