[ANN] Lafcadio 0.7.0, 0.6.1: Excessively Clever Query Caching

Francis Hwang

1/20/2005 3:08:00 PM

Hi everybody,

I've just released the newest dev release of Lafcadio, 0.7.0, and the
bugfix release 0.6.1 for the stable branch.

== What's Lafcadio? ==
An object-relational mapping library for use with MySQL. It supports a
lot of advanced features, including in-Ruby field value checking,
extensive aid in mapping to legacy databases, and an advanced query
engine that lets you form queries in Ruby that can be run against
either the live database or an in-memory mock store for testing purposes.

Lafcadio is more than a year old and is currently in use on production
websites, most notably http://rh..., an online community that
has a 6-year-old legacy database and gets more than 3 million hits a
month.

== What's new in 0.7.0? ==
Excessively Clever Query Caching goes like this: Every time you run a
select against the DB, Lafcadio caches the results in memory. Then, if
you later run a second select whose results are a subset of the first,
Lafcadio detects it, figures out what it's a subset of, filters the
cached results in memory, and returns them to you. This all happens
transparently.

What does this mean? It means a significantly faster app, because if
you run these three queries:

select * from users where lname = 'Smith'
select * from users where lname = 'Smith' and fname like '%john%'
select * from users where lname = 'Smith' and email like '%hotmail%'

Lafcadio will only ask MySQL for the results for the first select
statement, and do the rest for you without using the DB connection.
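
The subset detection described above can be sketched roughly as follows. This is only an illustration of the idea, not Lafcadio's actual API or implementation: the Condition struct, the SubsetCache class, and the db_hits counter are all hypothetical names, a plain array stands in for the MySQL table, and "like" is assumed to be a case-insensitive substring match.

```ruby
require 'set'

# Hypothetical condition: a field, an operator, and a value, checked in Ruby.
Condition = Struct.new(:field, :op, :value) do
  def match?(row)
    actual = row[field]
    case op
    when :eq   then actual == value
    when :like then actual.to_s.downcase.include?(value.downcase)
    else raise ArgumentError, "unknown op #{op}"
    end
  end
end

# Hypothetical cache: a query is a set of ANDed conditions. If an earlier
# query used a subset of the new query's conditions, its cached rows already
# contain every row the new query can match, so we filter them in memory.
class SubsetCache
  attr_reader :db_hits

  def initialize(rows)
    @rows = rows    # stands in for the database table
    @cache = {}     # Set of conditions => rows returned for that query
    @db_hits = 0    # how many times we had to go to the "database"
  end

  def select(*conditions)
    wanted = conditions.to_set
    cached_conditions, cached_rows =
      @cache.find { |conds, _rows| conds.subset?(wanted) }
    if cached_conditions
      extra = wanted - cached_conditions
      return cached_rows.select { |row| extra.all? { |c| c.match?(row) } }
    end
    @db_hits += 1
    result = @rows.select { |row| wanted.all? { |c| c.match?(row) } }
    @cache[wanted] = result
    result
  end
end

users = [
  { lname: 'Smith', fname: 'John',  email: 'jsmith@hotmail.com' },
  { lname: 'Smith', fname: 'Alice', email: 'alice@example.com' },
  { lname: 'Jones', fname: 'John',  email: 'jj@hotmail.com' }
]
store = SubsetCache.new(users)
smith = Condition.new(:lname, :eq, 'Smith')
smiths   = store.select(smith)
johns    = store.select(smith, Condition.new(:fname, :like, 'john'))
hotmails = store.select(smith, Condition.new(:email, :like, 'hotmail'))
```

In this sketch only the first select touches the backing array; the other two are answered by filtering the cached Smith rows. A real implementation would also have to invalidate the cache on writes, which is left out here.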

Francis Hwang
http://f...

6 Answers

Shashank Date

1/20/2005 3:24:00 PM

Hey Francis,

--- Francis Hwang <sera@fhwang.net> wrote:

> I've just released the newest dev release of
> Lafcadio, 0.7.0, and the
> bugfix release 0.6.1 for the stable branch.

Link?

http://rubyforge.org/project...

> == What's Lafcadio? ==
>

<snip>

> == What's new in 0.7.0? ==
> Excessively Clever Query Caching goes like this:

<snip>

Awesome ! This is of great interest to me.

Have you thought about parallel query dispatch over
horizontally partitioned data? I have done something
like this for MS SQL 2000. Interested?

>
> Francis Hwang
> http://f...

-- shanko

Francis Hwang

1/20/2005 3:33:00 PM

On Jan 20, 2005, at 10:24 AM, Shashank Date wrote:

>> == What's new in 0.7.0? ==
>> Excessively Clever Query Caching goes like this:
>
> <snip>
>
> Awesome ! This is of great interest to me.
>
> Have you thought about parallel query dispatch over
> horizontally partitioned data? I have done something
> like this for MS SQL 2000. Interested?
>

Quite. But seeing as I'm pretty unschooled in DB theory in general,
I've never heard of "parallel query dispatch". Care to explain, or
offer a link?

Also, if you're interested in seeing this feature ported over to a DB
you use (such as MS SQL 2000) I'm open to extending Lafcadio to work
with any other DB as long as I've got people actively testing them on
other DBs. (I always use MySQL, hence Lafcadio's MySQL focus up 'til
now.)

Francis Hwang
http://f...

Shashank Date

1/20/2005 4:01:00 PM

> Quite. But seeing as I'm pretty unschooled in DB
> theory in general, I've never heard of "parallel
> query dispatch".

Well, of course! My bad: I am using our internal
terminology while talking to the outside world ;-)

The correct term is "Federated Databases". And even
that term is context dependent. Google it in the
context of SQL Server 2K and you will get what I mean.

> Care to explain, or offer a link?

http://www.sql-server-performance.com/federated_dat...

> Also, if you're interested in seeing this feature
> ported over to a DB
> you use (such as MS SQL 2000) I'm open to extending
> Lafcadio to work
> with any other DB as long as I've got people
> actively testing them on other DBs.

I can surely help with testing, especially if it involves
running test cases in the background. I won't be able
to devote too much time in the foreground though.

> (I always use MySQL, hence Lafcadio's
> MySQL focus up 'til
> now.)

No problem. Let me know how I can get started.

> Francis Hwang
> http://f...

-- shanko

Francis Hwang

1/22/2005 7:01:00 PM

On Jan 20, 2005, at 11:00 AM, Shashank Date wrote:

> http://www.sql-server-performance.com/federated_dat...

Intriguing stuff. Once you've set this up in MS SQL 2k, what
requirements are there for a client to manage them? I mean, besides
what the database takes care of for you automatically.

And by the way, if you're working with federated databases, how big are
these tables you're dealing with? I'm just wondering how much bigger
the tables at my work can get before I need to look into something like
this.

>> Also, if you're interested in seeing this feature
>> ported over to a DB
>> you use (such as MS SQL 2000) I'm open to extending
>> Lafcadio to work
>> with any other DB as long as I've got people
>> actively testing them on other DBs.
>
> I can surely help testing. Especially if involves
> running test cases in the background. I won't be able
> to devote too much time on the foreground though.

Well, I'll put "port to MS SQL" on my to-do list and let you know when
a new beta release has MS SQL support ... then I just need a steady
supply of specific bug reports to chase down, after that.

Francis Hwang
http://f...

Shashank Date

1/22/2005 10:12:00 PM

Hi Francis,

> Intriguing stuff. Once you've set this up in MS SQL
> 2k, what requirements are there for a client to
> manage them? I mean, besides what the database
> takes care of for you automatically.

Umm ... maintaining the indexes comes to mind. I don't
know the details, since we never actually used it as it
comes out of the box. We found out that the queries
were not being executed in parallel. Hence we wrote
our own version (in Ruby of course) and called it
"parallel query dispatcher" :-)

> And by the way, if you're working with federated
> databases, how big are
> these tables you're dealing with? I'm just wondering
> how much bigger
> the tables at my work can get before I need to look
> into something like
> this.

It is not only the size that matters (in this case
;-)) but the nature of the application. Our data is
being collected at various data centers and then
coalesced at the central server. So it comes naturally
partitioned. Further, our queries are rarely (almost
never) across the partitions. This is a very important
aspect which lends itself to federation.

Add to that the fact that our combined database is
about 100GB and tables are typically over 5 million
rows. So when we did not have the budget to scale up
we decided to scale out and were reasonably
successful. We were in production for almost a year on
four 3-server clusters, throwing hundreds of queries
at them every day. We did dynamic load balancing and were
working on query caching (like the one you have
provided in Lafcadio) when the project got the
attention of higher-ups and a more generous budget to
scale up ... which almost always is a better
alternative.
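
As a rough illustration of dispatching queries over naturally partitioned data, here is a minimal Ruby sketch. It is not the dispatcher described in this thread, whose code was never posted: the ParallelQueryDispatcher class, its partition_key argument, and the sample rows are all hypothetical, and plain arrays stand in for the per-partition database servers. A query that names its partition goes to that partition alone; one that does not is fanned out to every partition on its own thread and the results are merged.

```ruby
# Hypothetical dispatcher: each partition is an array of rows standing in
# for one database server holding one data center's rows.
class ParallelQueryDispatcher
  def initialize(partitions)
    @partitions = partitions  # partition name => rows
  end

  # Runs the filter block against one partition if partition_key is given,
  # otherwise against all partitions in parallel, then merges the results.
  def select(partition_key: nil, &filter)
    targets =
      if partition_key
        [@partitions.fetch(partition_key)]
      else
        @partitions.values
      end
    threads = targets.map { |rows| Thread.new { rows.select(&filter) } }
    threads.flat_map(&:value)  # Thread#value waits for each thread to finish
  end
end

dispatcher = ParallelQueryDispatcher.new({
  'east' => [{ id: 1, status: 'ok' }, { id: 2, status: 'error' }],
  'west' => [{ id: 3, status: 'error' }]
})
east_errors = dispatcher.select(partition_key: 'east') { |r| r[:status] == 'error' }
all_errors  = dispatcher.select { |r| r[:status] == 'error' }
```

Because queries here rarely cross partitions, most calls would take the single-partition path; the fan-out path is the exception rather than the rule.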

> Well, I'll put "port to MS SQL" on my to-do list and
> let you know when
> a new beta release has MS SQL support ... then I
> just need a steady supply of specific bug reports
> to chase down, after that.

Great ! Let me know ...

> Francis Hwang
> http://f...

-- shanko

Francis Hwang

1/23/2005 11:20:00 PM

On Jan 22, 2005, at 5:11 PM, Shashank Date wrote:

> Hi Francis,
>
>> Intriguing stuff. Once you've set this up in MS SQL
>> 2k, what requirements are there for a client to
>> manage them? I mean, besides what the database
>> takes care of for you automatically.
>
> Umm .... mantaining the indexes comes to mind. I don't
> know the details since we never actually used it as it
> comes out of the box. We found out that the queries
> were not being executed in parallel. Hence we wrote
> our own version (in Ruby of course) and called it
> "parallel query dispatcher" :-)

So are you saying that the data came naturally partitioned, and you
left it partitioned, and then used Ruby to analyze queries and dispatch
them to the right database transparently? I suppose I could use a
concrete example to help me grok this.

Francis Hwang
http://f...

comp.lang.ruby
