Double Sliding Window Chunking Algorithm for Data Deduplication in Ocean Observation
As an essential means to eliminate redundant data, data deduplication technology significantly affects today’s era of massive data growth. In recent years, due to the rapid development of a series of related industries, such as marine monitoring ...
Shuai Guo +3 more
doaj +1 more source
Distributed exact deduplication for primary storage infrastructures
Lecture Notes in Computer Science, Volume 8460, 2014. Deduplication of primary storage volumes in a cloud computing environment is increasingly desirable, as the resulting space savings contribute to the cost effectiveness of a large scale multi-tenant ...
Paulo, João, Pereira, José
core +3 more sources
Revocable Attribute-Based Encryption Scheme With Efficient Deduplication for Ehealth Systems
The deduplication based on attribute-based encryption can be well used in eHealth systems to save storage space and share medical records. However, the excessive computation costs of existing schemes lead to inefficient deduplication.
Hua Ma +4 more
doaj +1 more source
FASR: An Efficient Feature-Aware Deduplication Method in Distributed Storage Systems
Deduplication technology can obtain higher space utilization by keeping only one duplicate. But in a distributed storage system, the overall deduplication ratio will be limited due to redundancy elimination across nodes.
Wenbin Yao +3 more
doaj +1 more source
Parallel Weighted Random Sampling
Data structures for efficient sampling from a set of weighted items are an important building block of many applications. However, few parallel solutions are known. We close many of these gaps both for shared-memory and distributed-memory machines.
, Sanders, Peter
core +1 more source
Deduplication algorithm based on condensed nearest neighbor rule for deduplication metadata
Building effective deduplication index in the memory could reduce disk access times and enhance chunk fingerprint lookup speed, which was a big challenge for deduplication algorithms in massive data environments. As deduplication data set had many samples ...
Wen-bin YAO +3 more
doaj +2 more sources
Research on Routing Strategy in Cluster Deduplication System
A cluster deduplication system can coordinate the work of multiple nodes, which can better alleviate the disk index bottleneck existing in the large-scale data backup system.
Qinlu He +5 more
doaj +1 more source
Generalized Deduplication: Bounds, Convergence, and Asymptotic Properties
We study a generalization of deduplication, which enables lossless deduplication of highly similar data and show that standard deduplication with fixed chunk length is a special case.
Lucani, Daniel E. +2 more
core +1 more source
A Robust Fault-Tolerant and Scalable Cluster-wide Deduplication for Shared-Nothing Storage Systems
Deduplication has been largely employed in distributed storage systems to improve space efficiency. Traditional deduplication research ignores the design specifications of shared-nothing distributed storage systems such as no central metadata bottleneck,
Hamandawana, Prince +4 more
core +1 more source
Accelerating Catalyst Materials Discovery With Large Artificial Intelligence Models
AI‐empowered catalysis research via integrated database platform, universal machine learning interatomic potentials (MLIPs), and large language models (LLMs). The integration of artificial intelligence (AI) into catalysis is fundamentally reshaping the research paradigm of catalyst discovery.
Di Zhang +7 more
wiley +2 more sources

