Many people confuse compression with deduplication because the two are so similar. Both are designed to reduce the amount of space your data takes up in storage. In short, compression shrinks each piece of data, while deduplication stores identical pieces of data only once. Let me explain the difference in plain English.
1. This is what your data looks like originally (assuming only one unique file):
2. This is what your data looks like after being stored in a ZFS pool with compression enabled.
3. This is what your data looks like after being stored in a ZFS pool with deduplication enabled.
4. Now let's say we are storing three identical files:
5. ZFS: Compression Only
6. ZFS: Deduplication Only
7. ZFS: Compression + Deduplication
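To make the three cases above concrete, here is a small Python sketch of a toy storage model. It is not how ZFS is implemented. The 4KB block size, the use of zlib and SHA-256, and the names `split_blocks` and `stored_size` are all assumptions picked for illustration. It simply counts how many bytes would be kept when three identical files are stored with compression only, deduplication only, and both.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # hypothetical block size for this toy model


def split_blocks(data, size=BLOCK_SIZE):
    """Cut a file into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def stored_size(files, compress, dedup):
    """Return the number of bytes this toy model would keep for the files."""
    seen = set()  # fingerprints of blocks that are already stored
    total = 0
    for data in files:
        for block in split_blocks(data):
            if dedup:
                digest = hashlib.sha256(block).digest()
                if digest in seen:
                    continue  # identical block already stored, keep a reference only
                seen.add(digest)
            # Compression shrinks each block; otherwise the block is kept as-is.
            total += len(zlib.compress(block)) if compress else len(block)
    return total


# Three identical files made of repetitive text.
one_file = b"The quick brown fox jumps over the lazy dog.\n" * 2000
files = [one_file] * 3

raw = stored_size(files, compress=False, dedup=False)
compress_only = stored_size(files, compress=True, dedup=False)
dedup_only = stored_size(files, compress=False, dedup=True)
both = stored_size(files, compress=True, dedup=True)

print("No savings:           ", raw)
print("Compression only:     ", compress_only)
print("Deduplication only:   ", dedup_only)
print("Compression + dedup:  ", both)
```

In this model, compression makes each of the three copies smaller but still stores all three, deduplication stores one full-size copy, and enabling both stores one compressed copy.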
Of course, enabling both compression and deduplication saves the most space. However, deduplication comes at a very high price. ZFS has to look up every block in its deduplication table, and that table needs to fit in memory to stay fast. If you want to enable deduplication, make sure that you have at least 2GB of memory per 1TB of storage. For example, if your ZFS pool is 10TB, you need to have 20GB of memory installed in your system. Otherwise, you will experience a huge performance hit.
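If you want to check the numbers for your own pool, the rule of thumb above is easy to turn into a quick calculation. This is only a sketch of the 2GB per 1TB guideline from this article. The function name `min_dedup_memory_gb` is made up for the example, and the real memory requirement depends on your data.

```python
GB_PER_TB = 2  # this article's rule of thumb: 2GB of memory per 1TB of storage


def min_dedup_memory_gb(pool_size_tb, gb_per_tb=GB_PER_TB):
    """Estimate the minimum memory (in GB) for a pool of the given size (in TB)."""
    return pool_size_tb * gb_per_tb


print(min_dedup_memory_gb(10))  # 20
```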
I hope this article helps you understand the difference between compression and deduplication.