really, nothing here

software geek

13.8.07

Berkeley DB

I confess that I strayed. Perhaps it's been spending too much time in fake grad school that has confused me, or perhaps I'm just not as savvy a developer as I thought I was.

I've been working on a certain 100,000,000-row dataset problem using hand-built data structures and mmap-ing them out of the file system. This morning, on a lark, I loaded the data into a Berkeley DB instance, and I haven't looked back since. Even with a Btree index, the database builds faster than my own code, or that of other good Netflix libraries, and then you get the benefit of the index on retrieval.
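For anyone who hasn't tried the hand-built route, here is a minimal sketch of what "mmap-ing a data structure out of the file system" can look like. It is not my actual code: the row layout (movie_id, user_id, rating), the file name and the helper functions are all made up for illustration, and it uses only the Python standard library.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-width row: (movie_id, user_id, rating).
# "<IIB" = two little-endian 32-bit ints plus one byte, no padding.
ROW = struct.Struct("<IIB")


def write_rows(path, rows):
    """Pack rows back to back into a flat binary file."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(ROW.pack(*row))


def read_row(mm, index):
    """Decode row number `index` straight out of the mapped file."""
    return ROW.unpack_from(mm, index * ROW.size)


rows = [(1, 100, 5), (1, 101, 3), (2, 100, 4)]
path = os.path.join(tempfile.mkdtemp(), "ratings.bin")
write_rows(path, rows)

with open(path, "rb") as f:
    # Map the whole file read-only; the OS pages data in on demand.
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    try:
        third = read_row(mm, 2)
        row_count = len(mm) // ROW.size
    finally:
        mm.close()
```

Lookup by row number is just arithmetic on the offset, which is the appeal. Lookup by anything else (say, all rows for one user) means building and maintaining your own index, and that is the part a Btree gives you for free.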

I haven't evangelized for Berkeley DB here yet. Not everyone will agree with me that it's a good solution for most of your data access needs (certainly my alma mater didn't). But if you have a performance-critical problem and want a quick, simple and very low-overhead mechanism for managing your data, give Berkeley DB a look. It even has Ruby bindings. It's free as in beer and code. [Updated: See comment below for an actual description of the licensing requirements.]

So, for all two of you who read this blog and know and care about these things: why are we still using relational databases for our websites? I understand the need for flexible systems for our reporting tools and data mining, but when a website has maybe 30 queries that need to be executed, and the database is almost always the first piece of an architecture to fail, why not use a custom database without the overhead of query management? Why are we still parsing SQL for simple CRUD calls when it could be done 10 times faster?
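To make "CRUD without SQL" concrete, here is a rough sketch of the four operations against a plain key-value store. It uses Python's standard-library `dbm.dumb` module as a stand-in, not Berkeley DB itself (the Berkeley DB bindings are separate third-party packages), and the key naming scheme and values are invented for the example. It shows the shape of the API only and says nothing about speed.

```python
import dbm.dumb
import os
import tempfile

# Hypothetical store location; dbm.dumb creates its own files next to it.
path = os.path.join(tempfile.mkdtemp(), "users")

with dbm.dumb.open(path, "c") as db:
    key = b"user:42"                   # made-up key scheme

    db[key] = b"alice"                 # create
    created = db[key]                  # read
    db[key] = b"alice@example.com"     # update
    updated = db[key]
    del db[key]                        # delete
    still_there = key in db
```

Every call is a direct lookup or write by key: there is no query string to parse and no planner to consult, which is the overhead I'm complaining about above.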


1 Comment:

Blogger Gregory Burd said...

Jesper,

I'm happy to see that you found Berkeley DB and used it in exactly the way it was intended to be used. Thanks for the excellent writeup, and for praising our speed. Berkeley DB is a fun product to work on because it can help with this exact set of specialized database requirements.

One quick note: Berkeley DB is distributed under the Sleepycat License, which is like the GPLv2 but a bit less restrictive. Sleepycat is now part of Oracle. Sleepycat was a commercial company selling licenses to use Berkeley DB in non-open-source products (an arrangement called dual licensing, just like MySQL). So, it is FOSS as long as your product is OSS; otherwise you need to speak to Oracle and purchase a commercial license. Your use case is covered by the Sleepycat License, so enjoy!

regards,

-greg

_____________________________________________________________________

Gregory Burd greg.burd@oracle.com
Product Manager, Berkeley DB/JE/XML Oracle Corporation

14/8/07 09:48  
