Designing an Efficient Deduplication Algorithm for Audio Files in Cloud Storage
Keywords:
Deduplication, Hash Table, MD6, Audio Files, Cloud Storage
Abstract
Duplicate data poses a significant challenge in big data storage systems because it consumes storage space and affects data organization, management, and processing. To solve this problem, hash algorithms are used to generate hash keys for files. However, as the amount of data stored in the cloud increases, the search and matching process takes longer. Additionally, different files can produce the same hash key, which is known as a collision; the likelihood of collisions is related to the length of the hash key. The longer the key, the less likely collisions are to occur.
In this paper, we present a technique for eliminating duplicate data at the file level to reduce the storage of duplicate audio data in cloud storage systems. The proposed technique aims to reduce the search time for hash values by creating a reduction table with multiple indexes. These indexes are designed based on the audio file format, so the hash table includes multiple indexes, one for each format. To minimize the probability of collisions, the MD6 algorithm is used, which produces a key with a length of 512 bits.
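The format-indexed lookup described above can be illustrated with a minimal sketch. It is not the paper's implementation: the names `FormatIndexedStore` and `digest_512` are hypothetical, the format is assumed to be taken from the file extension, and SHA-512 stands in for MD6 because Python's standard `hashlib` does not provide MD6 (both produce a 512-bit key).

```python
import hashlib
from pathlib import PurePath


def digest_512(data: bytes) -> str:
    # Assumption: SHA-512 is a stand-in for MD6-512, which hashlib lacks.
    return hashlib.sha512(data).hexdigest()


class FormatIndexedStore:
    """File-level deduplication with one hash index per audio format."""

    def __init__(self):
        # Maps format -> {hash key -> name of the first file stored}.
        self.indexes = {}

    def add(self, filename: str, data: bytes) -> bool:
        """Return True if the file is new, False if it is a duplicate."""
        # Assumption: the format is read from the extension ("mp3", "wav").
        fmt = PurePath(filename).suffix.lower().lstrip(".")
        index = self.indexes.setdefault(fmt, {})
        key = digest_512(data)
        if key in index:
            return False  # same key already present in this format's index
        index[key] = filename
        return True


store = FormatIndexedStore()
store.add("song.mp3", b"audio-bytes")  # True: first copy is stored
store.add("copy.mp3", b"audio-bytes")  # False: duplicate in the mp3 index
store.add("song.wav", b"audio-bytes")  # True: the wav index is searched separately
```

Each lookup searches only the index for the file's own format, which is the mechanism the abstract credits with reducing search time. Python dictionaries already give near-constant-time lookup, so the sketch shows the structure rather than a measurable speedup.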
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.