Open Source Deduplication For Linux With Opendedup
tazzbit writes "The storage vendors have been crowing about data deduplication technology for some time now, but a new open source project, Opendedup, brings it to Linux and its hypervisors — KVM, Xen and VMware. The new deduplication-based file system called SDFS (GPL v2) is scalable to eight petabytes of capacity with 256 storage engines, which can each store up to 32TB of deduplicated data. Each volume can be up to 8 exabytes and the number of files is limited by the underlying file system. Opendedup runs in user space, making it platform independent, easier to scale and cluster, and it can integrate with other user space services like Amazon S3."
Re:How useful is this in realistic scenarios? (Score:1, Insightful)
Let's put it like this:
Imagine a CentOS box connected to a SAN, with the data mounts on NFS. Now imagine that all of those NFS mounts are deduped at the block level. How does a 3:1 savings sound?
So you can store 2.4TB on 800GB! Now imagine replicating that across a WAN circuit to another SAN for DR.
So not only does dedupe save you on storage costs, thin provisioning, etc., it saves you on WAN costs as well. I'll gladly pay for a little more processor/memory up front in order to save those more expensive WAN/storage dollars.
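For anyone wondering where a 3:1 figure could come from, here is a rough Python sketch of fixed-size block-level dedup. It is only an illustration, not how SDFS actually does it: the 4 KiB block size, the SHA-256 fingerprint, and the names `dedupe`, `restore` and `dedupe_ratio` are all my own assumptions. The sample data is three identical copies of one "image", which is the best case for this kind of scheme.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for illustration, not an SDFS setting


def dedupe(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep each unique block once."""
    store = {}    # fingerprint -> block bytes (the deduplicated storage)
    recipe = []   # ordered fingerprints needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).digest()
        store.setdefault(digest, block)  # only the first copy is stored
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the block store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


def dedupe_ratio(data: bytes, store) -> float:
    """Logical size divided by the bytes actually stored."""
    stored = sum(len(block) for block in store.values())
    return len(data) / stored


# One hypothetical 64-block "image", each block distinct from the others.
image = b"".join(i.to_bytes(4, "big") * (BLOCK_SIZE // 4) for i in range(64))
data = image * 3  # three identical copies, e.g. three cloned guests

store, recipe = dedupe(data)
assert restore(store, recipe) == data
print(f"{dedupe_ratio(data, store):.1f}:1")
```

The same idea is why replication over the WAN gets cheaper: in a scheme like this, only blocks whose fingerprints the remote side does not already hold would need to be sent. Real-world ratios depend entirely on how much of your data actually repeats.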