Thursday, May 7, 2009

Directory::Transactional

I recently remembered miyagawa's 20 modules talk from YAPC::Asia::2008. At the end he asked that other CPAN authors give a similar talk about some of their modules. I think that instead of giving a talk I will try to write a series of posts.

Directory::Transactional is a module I wrote for KiokuDB's plain files backend. It provides full ACID guarantees on an arbitrary set of files, as long as you also go through it for all of your read operations.

The interface revolves around a handle through which you create transactions (txn_do) and open all file and directory handles. For example, if you call my $fh = $h->openw("file.txt"), then $fh will be a filehandle open for writing to a copy of file.txt in a shadow directory created for the current transaction.
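To make the shadow directory idea concrete, here is a minimal sketch in core Perl. It is not Directory::Transactional's actual implementation: the directory layout, the helper names (spew, slurp) and the rename-on-commit step are assumptions made for illustration, and it ignores locking and crash recovery entirely.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical layout: live files under root/, one shadow directory per
# transaction under work/.
my $base   = tempdir( CLEANUP => 1 );
my $root   = "$base/root";
my $shadow = "$base/work/txn-1";
make_path( $root, $shadow );

# Small helpers (names are illustrative only).
sub spew {
    my ( $path, $text ) = @_;
    open my $out, '>', $path or die "open $path: $!";
    print {$out} $text;
    close $out or die "close $path: $!";
}

sub slurp {
    my ($path) = @_;
    open my $in, '<', $path or die "open $path: $!";
    local $/;    # read the whole file at once
    return scalar <$in>;
}

spew( "$root/file.txt", "old contents\n" );

# "Open for writing" hands back a handle inside the shadow directory,
# not a handle on the live file.
open my $fh, '>', "$shadow/file.txt" or die "open shadow: $!";
print {$fh} "new contents\n";
close $fh or die "close shadow: $!";

# Until commit, readers of the live file still see the old data.
my $before_commit = slurp("$root/file.txt");

# Commit: move the shadow file over the live one. rename is atomic on
# POSIX systems when both paths are on the same filesystem.
rename "$shadow/file.txt", "$root/file.txt" or die "rename: $!";
my $after_commit = slurp("$root/file.txt");

print "before commit: $before_commit";
print "after commit:  $after_commit";
```

A rollback in this sketch would simply delete the shadow directory without touching anything under root/.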

One cool feature is the auto commit implementation. By using Hash::Util::FieldHash::Compat we can track the lifetime of all returned resources. The first resource created outside of a transaction causes one to be opened. When the last resource goes out of scope the transaction is committed. Perl's reference counting can be a pain sometimes, but it also enables some really cool hacks.
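The lifetime tracking trick can be sketched roughly as follows. This uses the core Hash::Util::FieldHash module rather than the Compat wrapper, and the Sketch::TxnGuard class, open_resource function and @log array are hypothetical names for illustration, not the module's real internals. Each resource keeps a guard object alive through a fieldhash entry, and the guard "commits" in its destructor once the last entry is gone.

```perl
use strict;
use warnings;
use Hash::Util::FieldHash qw(fieldhash);
use Scalar::Util qw(weaken);

{
    # Hypothetical guard class: constructing it "begins" a transaction,
    # destroying it "commits".
    package Sketch::TxnGuard;

    sub new {
        my ( $class, $log ) = @_;
        push @$log, "begin";
        return bless { log => $log }, $class;
    }

    sub DESTROY {
        my ($self) = @_;
        push @{ $self->{log} }, "commit";
    }
}

my @log;
fieldhash my %guard_for;    # resource => guard; entry is dropped when the resource is freed
my $current_guard;          # weak reference to the open transaction, if any

sub open_resource {
    my ($name) = @_;

    my $guard = $current_guard;    # copying a weak reference yields a strong one
    if ( !$guard ) {
        # First resource outside a transaction: open one implicitly.
        $guard = Sketch::TxnGuard->new( \@log );
        weaken( $current_guard = $guard );
    }

    my $resource = { name => $name };    # stands in for a file handle
    $guard_for{$resource} = $guard;      # the resource keeps the guard alive
    push @log, "open $name";
    return $resource;
}

{
    my $first  = open_resource("a.txt");
    my $second = open_resource("b.txt");
    undef $first;                        # one resource is still alive
    push @log, "first released";
}    # $second is freed here, dropping the last reference to the guard

push @log, "after scope";
print "$_\n" for @log;
```

In this sketch "commit" is logged after "first released" and before "after scope", because the guard is only destroyed when the last tracked resource goes away.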

The most fun I had writing this module was the test suite. On UNIX platforms the crash recovery stress test forks off a bunch of concurrent workers and then randomly issues a kill -9 every once in a while. Meanwhile a fixture loop continually checks that the values it reads are always consistent. The test itself updates several "bank account" text files (each contains a number), and the fixture ensures that the accounts are always balanced. The actual update has additional delays to make sure that the files are not updated in the same OS time slice, so that there truly is lock contention.
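A much reduced sketch of that kind of test, in core Perl, might look like the following. It is not the module's test suite: it leaves out the kill -9 and crash recovery part entirely, uses a single flock'd lock file instead of Directory::Transactional, and all names (read_balance, write_balance, with_lock, the account file names) are made up for illustration. It only shows the balanced accounts invariant being checked while forked workers move money around.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);
use List::Util qw(shuffle sum);
use POSIX qw(WNOHANG);
use Time::HiRes qw(sleep);

my $dir      = tempdir( CLEANUP => 1 );
my @accounts = map { $dir . "/account-$_.txt" } 1 .. 3;
my $lockfile = "$dir/lock";

sub read_balance {
    my ($path) = @_;
    open my $in, '<', $path or die "open $path: $!";
    my $n = <$in>;
    chomp $n;
    return $n;
}

sub write_balance {
    my ( $path, $n ) = @_;
    open my $out, '>', $path or die "open $path: $!";
    print {$out} "$n\n";
    close $out or die "close $path: $!";
}

# Run $code while holding the lock; closing the handle releases it.
sub with_lock {
    my ( $mode, $code ) = @_;
    open my $lock, '>>', $lockfile or die "open $lockfile: $!";
    flock( $lock, $mode ) or die "flock: $!";
    my $result = $code->();
    close $lock;
    return $result;
}

write_balance( $_, 100 ) for @accounts;
my $expected = 100 * @accounts;

my %running;
for my $worker ( 1 .. 4 ) {
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ( $pid == 0 ) {
        srand($$);
        for my $round ( 1 .. 25 ) {
            with_lock( LOCK_EX, sub {
                my ( $from, $to ) = ( shuffle @accounts )[ 0, 1 ];
                my $amount = 1 + int rand 10;
                write_balance( $from, read_balance($from) - $amount );
                sleep 0.002;    # leave the books unbalanced for a moment
                write_balance( $to, read_balance($to) + $amount );
            } );
        }
        POSIX::_exit(0);    # skip END blocks so the child never cleans up $dir
    }
    $running{$pid} = 1;
}

# Fixture loop: while workers run, the total must never change.
my ( $checks, $violations ) = ( 0, 0 );
while (%running) {
    my $total = with_lock( LOCK_SH, sub { sum map { read_balance($_) } @accounts } );
    $violations++ if $total != $expected;
    $checks++;
    for my $pid ( keys %running ) {
        delete $running{$pid} if waitpid( $pid, WNOHANG ) > 0;
    }
    sleep 0.001;    # give the writers a chance to take the lock
}

my $final_total = sum map { read_balance($_) } @accounts;
print "checks: $checks, violations: $violations, final total: $final_total\n";
```

The deliberate sleep between the two writes plays the same role as the delays mentioned above: without it the window in which the accounts are unbalanced is so short that a missing lock might never be noticed.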

One major limitation is that it doesn't detect deadlocks, which can occur if transactions access the same files in different orders. I've been toying with maintaining a lock table, but that seems like a lot of work. If you are running on HP-UX the OS will detect flock deadlocks and return EDEADLK, causing the transaction to roll back. Another option is to use the global_lock option for deadlock-prone code. This creates a single top-level lock.
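Short of real deadlock detection, the usual workaround is to make sure every piece of code takes its locks in the same order. Here is a small core Perl sketch of that discipline using plain flock. The lock_in_order helper is hypothetical and not part of Directory::Transactional, and it assumes you know up front which files you are going to touch.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

# Take an exclusive lock on every path, always in sorted order, so that two
# callers asking for the same files can never end up waiting on each other.
# Returns the order used and the open handles; closing a handle drops its lock.
sub lock_in_order {
    my @paths = @_;

    my %seen;
    my @unique = grep { !$seen{$_}++ } @paths;
    my @order  = sort @unique;

    my @handles;
    for my $path (@order) {
        open my $fh, '>>', $path or die "open $path: $!";
        flock( $fh, LOCK_EX ) or die "flock $path: $!";
        push @handles, $fh;
    }
    return ( \@order, \@handles );
}

my $dir = tempdir( CLEANUP => 1 );

# Two callers name the same files in opposite orders...
my ( $order_a, $locks_a ) = lock_in_order( "$dir/b.txt", "$dir/a.txt" );
close $_ for @$locks_a;

my ( $order_b, $locks_b ) = lock_in_order( "$dir/a.txt", "$dir/b.txt" );
close $_ for @$locks_b;

# ...but both end up locking a.txt before b.txt.
print "caller A locked: @$order_a\n";
print "caller B locked: @$order_b\n";
```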

In the future I hope to steal File::Transaction::Atomic's atomic symlink swapping hack. This will allow readers to safely work with the files without using a lock (though they won't benefit from the isolation part of ACID).
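The general idea behind a symlink swap can be sketched in core Perl like this. It is a generic illustration, not File::Transaction::Atomic's code, and the swap_symlink helper, the v1/v2 directories and the "current" link name are all assumptions. It relies on POSIX rename semantics, so it is only meant for UNIX-like systems.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Point $link at $target by creating a temporary symlink and renaming it over
# the old one. rename replaces the link itself in one step, so a reader
# resolving $link sees either the old target or the new one.
sub swap_symlink {
    my ( $link, $target ) = @_;
    my $tmp = "$link.tmp.$$";
    unlink $tmp;    # clear leftovers from an earlier crash, if any
    symlink( $target, $tmp ) or die "symlink $tmp: $!";
    rename( $tmp, $link )    or die "rename $tmp: $!";
    return;
}

my $dir = tempdir( CLEANUP => 1 );

# Two complete versions of the data live side by side.
for my $version (qw(v1 v2)) {
    mkdir "$dir/$version" or die "mkdir $version: $!";
    open my $out, '>', "$dir/$version/data.txt" or die "open: $!";
    print {$out} "data from $version\n";
    close $out or die "close: $!";
}

# Readers always go through the "current" link and need no lock.
sub read_current {
    open my $in, '<', "$dir/current/data.txt" or die "open: $!";
    return scalar <$in>;
}

swap_symlink( "$dir/current", "v1" );    # relative target, resolved next to the link
my $before = read_current();

swap_symlink( "$dir/current", "v2" );    # publish the new version
my $after = read_current();

print "before swap: $before";
print "after swap:  $after";
```

A reader that opens several files one after another can still see a mix of versions if the swap happens in between, which is why this gives lock-free reads but not isolation.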
