Background & Introduction
Data deduplication is a specialized data compression technique that eliminates duplicate copies of repeating data. During deduplication, unique chunks of data (byte patterns) are identified and stored; subsequent chunks are compared against the stored copies, and duplicates are replaced with references to them. In many situations, data deduplication achieves a much higher compression ratio and better read performance than conventional compression. This technology has been used for many years in areas such as disk storage and backup systems. With the rapid evolution of cloud computing, its importance in OS memory management has become apparent.
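The chunk-identify-and-store process described above can be sketched in a few lines. This is a minimal illustration only, not the kernel-level mechanism these projects implement: the chunk size, function names, and the use of SHA-256 digests are all assumptions made for the example, and real systems add reference counting and, often, content-defined chunking.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size, analogous to a memory page


def deduplicate(data: bytes):
    """Split data into fixed-size chunks and store each unique chunk once.

    Returns (store, recipe): store maps a chunk's SHA-256 digest to its
    bytes; recipe is the ordered list of digests needed to rebuild data.
    """
    store, recipe = {}, []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # keep only the first copy seen
        recipe.append(digest)            # later copies are just references
    return store, recipe


def rebuild(store, recipe):
    """Reassemble the original data from the chunk store and recipe."""
    return b"".join(store[d] for d in recipe)


# Example: 8 identical pages deduplicate down to a single stored chunk.
data = b"\x00" * CHUNK_SIZE * 8
store, recipe = deduplicate(data)
assert rebuild(store, recipe) == data
print(len(recipe), len(store))  # 8 logical chunks, 1 unique chunk stored
```

The same idea, applied to memory pages instead of file chunks, is what page-level deduplication in the kernel does: pages with identical contents are detected and collapsed into a single copy-on-write shared page.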
This website is dedicated to the research, design, and implementation of data deduplication mechanisms at the OS kernel level.
For more about why you may need the technologies from this website, please go to the Q&A.
We currently focus on two open source projects:
- UKSM: The basic data structures come from the Linux kernel module KSM, but we have completely rewritten KSM's core algorithms. UKSM can now transparently scan all applications' memory (including KVM virtual machines) in a highly CPU-efficient way. UKSM has been adopted by many forked kernel branches on desktop, server, and mobile devices (search for "linux uksm" or "android uksm" to see how people are using UKSM now).
- Xen-dedup: We apply the design and algorithms of UKSM to the well-known Xen virtual machine monitor. Xen-dedup is the first real data deduplication solution for the Xen platform.
For more details and benchmarks of these two projects, please go to their home pages.