Trouble with transactions (evanjones.ca)

[ 2013-May-13 07:23 ]

I spent my PhD at MIT researching high-performance distributed databases that support transactions. I am interested in this subject because transactions make it easy to write correct programs, thanks to two nice properties. First, either an entire set of related changes is applied or none of them are, which simplifies error handling. Second, they allow you to pretend that concurrent updates never happen, and the database sorts it out for you. My personal theory is that most web applications are riddled with concurrency bugs that transactions would prevent, but because concurrent conflicting updates are so rare in the real world, no one notices (I would love to see some concrete evidence to support or disprove this theory, if anyone wants a research project).

However, since transitioning from building databases to using databases, I've learned that transactions can actually cause problems, even if you ignore potential performance and scalability problems. I gave a lightning talk at Ricon East 2013 with my rough thoughts on this subject (PDF slides) (video, but there were technical difficulties with the slides). I would love to hear opinions about using transactions in real applications (both problems and advantages), so I can flesh this out into a full-length, intelligent article. In brief, the problems caused by transactions that we have run into are:

Weak consistency defaults (See Peter Bailis's article for details)
Indirectly calling functions that abort/commit a transaction in the middle, losing atomicity
Database APIs implicitly start a new transaction, hiding rollbacks or commits that are in the wrong place
Communicating with external systems needs to happen after the transaction commits, complicating program structure
Concurrency errors can happen at any point, so most programs need a top-level error handler
You may want to retry on concurrency errors, but then you need to worry about retry limits and/or backoff
Transactions that are accidentally left open (e.g. due to bugs) cause concurrency problems for other tasks
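
To make a few of these concrete, here is a minimal sketch of one way to handle them, assuming Python's standard sqlite3 module and a connection opened with isolation_level=None so nothing starts a transaction implicitly. The names run_in_transaction, RetryLimitExceeded and after_commit are hypothetical, made up for this illustration, and treating every sqlite3.OperationalError as a retryable concurrency error is a simplifying assumption, not a rule.

```python
import random
import sqlite3
import time


class RetryLimitExceeded(Exception):
    """Hypothetical error: the transaction kept hitting concurrency errors."""


def run_in_transaction(conn, work, max_attempts=5, base_delay=0.01,
                       sleep=time.sleep):
    """Run work(conn, after_commit) atomically, retrying on concurrency errors.

    work may append callables to after_commit. They run only after a
    successful commit, so external systems never see aborted changes.
    """
    for attempt in range(max_attempts):
        after_commit = []
        try:
            # Explicit start: BEGIN IMMEDIATE takes the write lock up front.
            conn.execute("BEGIN IMMEDIATE")
            result = work(conn, after_commit)
            conn.execute("COMMIT")
        except sqlite3.OperationalError:
            # Assumption: OperationalError (e.g. "database is locked") is
            # treated as a retryable concurrency error.
            if conn.in_transaction:
                conn.execute("ROLLBACK")
            # Exponential backoff with jitter before the next attempt.
            sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))
            continue
        except BaseException:
            # Any other failure: never leave the transaction open.
            if conn.in_transaction:
                conn.execute("ROLLBACK")
            raise
        # The commit succeeded; now it is safe to talk to the outside world.
        for action in after_commit:
            action()
        return result
    raise RetryLimitExceeded(f"gave up after {max_attempts} attempts")
```

A caller passes a function that does all of its database work through the connection it is given and queues external side effects (sending email, calling another service) on after_commit instead of performing them directly. This only helps if nothing inside work commits or rolls back on its own, which is exactly the kind of mistake that is hard to rule out with most database APIs.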

My rough conclusion: Transactions are useful and do simplify programs. However, they don't completely eliminate the need to think, and you need to be careful about how you use them. Perhaps most importantly, I'm starting to think that most database APIs could be improved to avoid these pitfalls, and make transactions easier to use, or at least harder to use incorrectly. Any feedback is welcome.