Rethink the Sync



[ 2007-February-13 20:48 ]

This is one of the two best paper winners from OSDI'06. The idea it presents is brilliant: get rid of synchronous disk writes, where the program blocks until the data has actually been written to disk. Instead, allow the program to continue, and buffer all externally visible output (such as network communication or screen updates) until the disk write completes. This combines the benefits of synchronous writes (durability) and asynchronous writes (performance). I am skeptical about how useful their specific implementation is, but I can definitely see how alternative implementations of this idea would be useful. Evan Martin has a good discussion, and there are also a few questions and answers on the OSDI'06 discussion page.
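To make the idea concrete, here is a toy model in Python. It is only a sketch of the buffering rule described above, not the paper's kernel implementation: the class name ExternallySyncLog and its methods are invented for illustration, and "the disk" is just a list.

```python
class ExternallySyncLog:
    """Toy model: writes return immediately, output is held until commit."""

    def __init__(self):
        self.pending_writes = []  # writes issued but not yet durable
        self.durable = []         # writes that have "reached the disk"
        self.held_output = []     # external output waiting on pending writes
        self.released = []        # output the outside world can observe

    def write(self, data):
        # Unlike a synchronous write, this does not block the caller.
        self.pending_writes.append(data)

    def emit(self, message):
        # Output that could reveal an uncommitted write is buffered.
        if self.pending_writes:
            self.held_output.append(message)
        else:
            self.released.append(message)

    def commit(self):
        # The disk write completes: data becomes durable, then the
        # buffered output is released in its original order.
        self.durable.extend(self.pending_writes)
        self.pending_writes.clear()
        self.released.extend(self.held_output)
        self.held_output.clear()


log = ExternallySyncLog()
log.write("record 1")
log.emit("saved record 1")   # held: the write is not durable yet
print(log.released)          # nothing visible to the outside world
log.commit()
print(log.released)          # the message appears only after the commit
```

The point of the model is that an outside observer never sees "saved record 1" unless the record is actually durable, yet the program itself never waited for the disk.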

There are many interesting things in this paper, but for me one of the most interesting parts is the description of how many IDE disks "lie" about synchronous writes. They report to the operating system that a write has completed, even though the data has only reached the drive's cache. This has been reported before, but I didn't realize that getting this right requires the use of "write barriers." By default, Linux does not use them, so a power failure could corrupt an ext3 file system, even though journaling is supposed to prevent that. There is a mount option, barrier=1, that will prevent this, but I found a remarkable lack of documentation about this option on the Internet. I'm currently investigating this issue further.
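As a starting point for checking a machine, here is a small Python sketch that scans mount-table text in the /proc/mounts layout (device, mount point, file system type, options) and lists ext3 mounts whose options do not include barrier=1. The function name and the sample devices are made up for illustration, and it is an assumption that the kernel in use reports the barrier option in its mount table at all, so an empty or non-empty result should be treated as a hint rather than proof.

```python
def ext3_mounts_missing_barrier(mounts_text):
    """Return mount points of ext3 entries whose options lack barrier=1."""
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        if fs_type == "ext3" and "barrier=1" not in options.split(","):
            missing.append(mount_point)
    return missing


# Hypothetical sample in /proc/mounts format; the devices are invented.
SAMPLE = """\
/dev/hda1 / ext3 rw,data=ordered 0 0
/dev/hda2 /home ext3 rw,barrier=1,data=ordered 0 0
proc /proc proc rw 0 0
"""

print(ext3_mounts_missing_barrier(SAMPLE))  # prints ['/']
```

On a real system the same function could be fed the contents of /proc/mounts instead of the sample string.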

E. B. Nightingale, K. Veeraraghavan, P. M. Chen, J. Flinn. Rethink the Sync. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI'06), Seattle, USA, November 2006.