5 Data Deduplication Best Practices

Studies suggest that organizations that have multiple copies of data buy, administer and use two to fifty times the amount of storage space they'd need with data deduplication. It's no wonder that data redundancy is a major contributor to explosive data growth.

At the outset, data deduplication reduced data redundancy in only specific circumstances, such as full backups, VMware images and email attachments. However, duplicate data would still persist. This is mainly because of the multiplication of test and development data across an organization over time. Backup, archiving, and replication create numerous data copies that can be found throughout an organization. Add to that the fact that users often copy data to locations for their own convenience.

Organizations are now realizing these facts, and are seeing data deduplication as a mandatory and integrated element of their overall IT strategy.

Essentially there are two methods of reducing the cost of data storage. First, you can use a lower-cost storage platform, but that opens numerous additional problems that I won't go into here. Second, you can leverage a sound data deduplication strategy designed to reduce required storage and data growth.

Data deduplication can reduce your data storage costs by lowering the amount of disk space required to store data, whether that be data backups or primary production data. This article highlights 5 best practices to help you select and implement the best data deduplication solution for your environment.

1. Consider the broad implications of deduplication. You'll want to consider how a deduplication strategy fits within your entire data management and storage strategy, accounting for tradeoffs in things like computational time, accuracy, index size, the level of deduplication detected and the scalability of the solution.

2. Learn what data does not dedupe well. Human-created data dedupes differently than data created by computers, so you'll want to consider which types of data to exclude from deduplication efforts.

3. Don't obsess over space reduction ratios. The length of time that data is retained affects your space reduction ratios, but rather than increasing the number of full backups, consider increasing your backup retention period.

4. Don't use multiplexing if you're backing up to a VTL. Multiplexing data in a virtual tape library (VTL) wastes computing cycles.

5. Pilot multiple systems before you select your system. This helps ensure that the deduplication solution you choose integrates best within your IT environment and the data currently in-house.

If you'd like to learn more about these 5 data deduplication best practices you can visit the original blog series.
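
To see why a space reduction ratio says as much about how many similar copies you keep as about the deduplication product, here is a minimal Python sketch. It assumes a simple fixed-size chunking scheme with SHA-256 fingerprints; the chunk size, function names and sample data are all hypothetical and are not taken from any particular deduplication product.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, in bytes


def dedupe(streams, chunk_size=CHUNK_SIZE):
    """Keep one copy of each unique chunk; return (logical_bytes, stored_bytes)."""
    index = {}   # chunk fingerprint -> chunk length (this is the dedupe index)
    logical = 0  # bytes written by the backup application
    for data in streams:
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            logical += len(chunk)
            fingerprint = hashlib.sha256(chunk).digest()
            index.setdefault(fingerprint, len(chunk))
    stored = sum(index.values())  # bytes actually kept on disk
    return logical, stored


def reduction_ratio(streams):
    """Space reduction ratio: logical bytes divided by stored bytes."""
    logical, stored = dedupe(streams)
    return logical / stored


# Hypothetical data: one "full backup" made of 8 distinct chunks.
full_backup = b"".join(bytes([i]) * CHUNK_SIZE for i in range(8))

one_copy = reduction_ratio([full_backup])
ten_copies = reduction_ratio([full_backup] * 10)
print(one_copy, ten_copies)
```

In this toy setup, keeping ten identical full backups raises the ratio tenfold without reducing the unique data that has to be stored, which is one reason not to treat the ratio as a goal in itself. The `index` dictionary also hints at the index-size tradeoff from practice 1: it grows with the number of unique chunks.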