A survey on deduplication systems
by Amdewar Godavari; Chapram Sudhakar
International Journal of Grid and Utility Computing (IJGUC), Vol. 15, No. 2, 2024

Abstract: With the arrival of new technological trends such as Big Data and the Internet of Things, a tremendous amount of duplicate data is being generated. Duplicate data wastes storage capacity and degrades the performance of storage systems. Data deduplication is a storage optimisation technique that eliminates duplicate data. Deploying a deduplication system for primary or secondary storage is challenging because of the extra latency incurred by deduplication processing. In addition, because duplicates are eliminated, deduplication disrupts the contiguous placement of data on disk, which is known as the disk fragmentation problem. This paper gives an overview of the issues and the solutions proposed for deploying a deduplication component in primary and/or secondary storage systems using centralised or distributed approaches. Experiments are conducted with the Destor tool on different data sets, and the results are used to study the effect of different chunking algorithms on the deduplication phases.
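
To illustrate the idea of chunk-level deduplication mentioned in the abstract, the following is a minimal sketch and not the paper's or Destor's implementation. It assumes fixed-size chunking and SHA-256 fingerprints; the function names `deduplicate` and `restore` and the 4096-byte chunk size are hypothetical choices made only for this example.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and store each distinct chunk once.

    Assumption: fixed-size chunking; real systems often use
    content-defined chunking instead.
    """
    store = {}   # fingerprint -> chunk bytes (unique chunks only)
    recipe = []  # ordered fingerprints needed to rebuild the data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fingerprint, chunk)  # keep only the first copy
        recipe.append(fingerprint)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data by following the recipe."""
    return b"".join(store[fp] for fp in recipe)


if __name__ == "__main__":
    # Four 4 KiB chunks, of which only two are distinct.
    data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
    store, recipe = deduplicate(data)
    print(len(recipe), "chunks referenced,", len(store), "chunks stored")
    assert restore(store, recipe) == data
```

In this sketch the recipe refers to the same stored chunk several times, so a restore reads chunks in an order that differs from the order in which they were first written. That is a simplified view of why deduplication can break contiguous placement on disk.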

Online publication date: Mon, 08-Apr-2024
