Friday, May 1, 2009

KiokuDB's First Year (give or take)

So, for lack of a better topic to talk about as my first post, I will orate about my latest large-ish project, KiokuDB.

There is already a fair amount of information about KiokuDB on the intertubes. Most of it can be found from its project homepage (take a look at the talks or the architectural overview). Instead of explaining what it is, I will try to tell why and how this project came to be.

KiokuDB had very humble beginnings as a toy project by my coworker Jonathan Rockway, called MooseX::Storage::Directory. The idea was to use MooseX::Storage to serialize objects into YAML files and then fetch them back easily. Jon worked out a very cute API, but MooseX::Storage was just not powerful enough to really be useful for storing complex data.

I have been very interested in writing a "proper" object database ever since I started programming in Perl, but never tried because it's such a difficult task. In fact, Perl already has two similar projects, Pixie and Tangram, both of which try to provide an OO-focused approach to persistence (as opposed to, say, DBIx::Class, which is truer to the relational model). Unfortunately neither of those was very popular, and I suspect the reason was skepticism; people just didn't believe the transparency would work reliably in a language as rich and crazy as Perl.

Given the way things were going with Moose in the last few years, I felt like it was a good time to reinvent that wheel once more, leveraging Moose for the transparency while keeping very conservative defaults elsewhere. Persistence is a hairy problem, but since Moose-based classes have so much metadata, it's much easier to do the right thing for objects of those classes.

In May of 2008 I started sketching out an initial design, re-reviewing the MooseX::Storage code, talking at length with Sam Vilain, and googling for similar projects. I wasn't doing any coding at all but ideas were materializing in my brain. In July Stevan, my boss, told me that he'd like to use this for our next $work app, leading to the first commit on KiokuDB.

By September we had a KiokuDB-backed website running on the Berkeley DB backend. This site was doing simple queries and navigational presentation of the data. Our first impressions were that object databases are indeed much more natural to use for that kind of data. This project begat many KiokuDB-related features, like the initial version of what would become Catalyst::Model::KiokuDB.

Since then we've developed 4 more applications. One of these makes heavy use of relational data using Fey for "real" SQL (no OO inflation involved, lots of aggregate operations, etc), and KiokuDB for everything else. Two other apps include a CAS versioned schema, closely inspired by Git's versioning model.

Lately the project is also beginning to gather a community. KiokuDB powers Thumb-Rate.com and the Thumb-Rate app in the Apple iPhone Store. The #kiokudb IRC channel is also quite lively. In short, it's not just II that's using it for fun and profit ;-)

At least for us, KiokuDB has been very successful so far. The amount of effort involved in prototyping apps has gone down, and the prototyping code very easily evolves into production-quality code later as features are needed. Schema changes amount to simply refactoring the object model. It has also reduced the number of ad hoc non-relational data stores. Using an ORM for simple configuration data is overkill, but with KiokuDB it's very natural.

That said, the project is still quite new, with many ideas left to explore. The biggest missing piece is probably Search::GIN, which has very ambitious goals but currently realizes only a handful.

Quite a long first post, I suppose. I guess I must be motivated ;-)

2 comments:

Dave Cardwell said...

I’ve been playing with KiokuDB recently and have found it great for prototyping apps and getting up-and-running quickly. I look forward to using it in some larger projects I have coming up.

One thing I would love to see is some Search::GIN documentation as so far I’ve been poking and prodding, and piecing together bits from the KiokuDB tutorials.

Is this on the cards? Is there more information available outside CPAN already?

Keep up the good work :)

nothingmuch said...

Hi,

Short answer: it's planned but still a bit far.

Search::GIN's TODO list is unfortunately very long, which is why I'm avoiding documenting it for now.

Basically it's up on CPAN so people can use it if they're willing to live with the current limitations/caveats. Drop by #kiokudb on IRC and I will explain in more detail =)