Asp Forum - OO database concepts...

Hal E. Fulton

3/25/2005 4:50:00 AM

I've been thinking about OO databases -- never having really
used such a beast. In fact, I am not sure that OODBs even exist
in the sense that I would like.

Here's one thing I'm thinking about...

When one class inherits from another, we take it for granted
that they are "type compatible." Anywhere a Mammal can be
used, we can specify a Dog. (Insert standard discussion of
duck typing here.)

I'm thinking of the usual paradigm where a table is an ordered
sequence of fields. Neglecting the methods (which is another
issue entirely), we can think of the fields as instance vars.

But if I have a database table of Mammals... I can't store a
Dog in it. Hmm. I think I wish I could?

And don't get me started on singletons. Basically, the traditional
relational model assumes a fixed set of fields; but Ruby makes no
such assumption. Even objects of the same class may have different
attributes and methods.
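
A minimal Ruby sketch of that point, assuming hypothetical Mammal and Dog classes (nothing here comes from a real library). Two objects of the same class end up with different sets of instance variables and methods, which a fixed-column table cannot describe directly:

```ruby
# Hypothetical classes, for illustration only.
class Mammal
  attr_reader :legs

  def initialize(legs)
    @legs = legs
  end
end

class Dog < Mammal
  def initialize(fur)
    super(4)
    @fur = fur
  end
end

rex  = Dog.new("silky")
fido = Dog.new("long and dirty")

# Give one particular object an extra attribute and a singleton method.
fido.instance_variable_set(:@nickname, "Fi")
def fido.fetch
  "fetching"
end

rex_fields  = rex.instance_variables   # @legs and @fur
fido_fields = fido.instance_variables  # @legs, @fur and @nickname
```

Both objects are Dogs (and therefore Mammals), yet only fido has @nickname and responds to fetch.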

Would it make sense to create a database that modeled Ruby's
paradigm more closely? Am I talking nonsense?

Thanks,
Hal

23 Answers

Aredridel

3/25/2005 5:11:00 AM

An ORDBMS like that would be very nice.

Postgresql has some features like that: each class gets its own table,
essentially, but its members show up in parent tables:

CREATE TABLE mammals (
legs integer not null default 2
);

CREATE TABLE dogs (
fur varchar(255) default 'Long and dirty'
) INHERITS (mammals);

insert into dogs(legs, fur) values (4, 'Silky smooth');

select * from mammals;

# => 4

There are quirks (bugs, I'd say) with IDs and such, but it's workable in
some cases.

Ari

Asfand Yar Qazi

3/25/2005 6:28:00 AM

Aredridel wrote:
> An ORDBMS like that would be very nice.
>
> Postgresql has some features like that: each class gets its own table,
> essentially, but its members show up in parent tables:
>
> CREATE TABLE mammals (
> legs integer not null default 2
> );
>
> CREATE TABLE dogs (
> fur varchar(255) default 'Long and dirty'
> ) INHERITS (mammals);
>
> insert into dogs(legs, fur) values (4, 'Silky smooth');
>
> select * from mammals;
>
> # => 4
>
> There are quirks (bugs, I'd say) with IDs and such, but it's workable in
> some cases.
>
> Ari
>
>

There are loads of OO databases for that bloody 'J' language - but
none for any scripting languages! Not even for Python! I find that
utterly surprising. E.g. http://www.garret.ru/~knizhnik/...

Perhaps Ruby needs to be the innovator in this field, shipping a full
file storage based OODBMS system with the standard library in the
future? :-)

vruz

3/25/2005 7:09:00 AM

> Perhaps Ruby needs to be the innovator in this field, shipping a full
> file storage based OODBMS system with the standard library in the
> future? :-)

There has been Aruna DB around for a while.
http://www.a...

Not sure how widely used it is; I don't know about its performance or
the quality of the codebase either.

It seems like it hasn't been actively developed lately. (but that may
be a good reason for someone to pick the project up if you happen to
like it)

If it's your current subject of research, maybe you can have a look at
it and tell us what you find.

cheers,
vruz

gabriele renzi

3/25/2005 9:14:00 AM

Asfand Yar Qazi wrote:

> There are loads of OO databases for that bloody 'J' language - but none
> for any scripting languages! Not even for Python! I find that utterly
> surprising. E.g. http://www.garret.ru/~knizhnik/...

In Pythonland they have a concept of OODB which includes "atop", "zodb"
and something else. It does not seem to fit with what the ODMG says is an
OODB, anyway.

> Perhaps Ruby needs to be the innovator in this field, shipping a full
> file storage based OODBMS system with the standard library in the
> future? :-)

maybe just a binding to GOODS :)

Avi Bryant

3/25/2005 10:32:00 AM

Hal Fulton wrote:
> I've been thinking about OO databases -- never having really
> used such a beast. In fact, I am not sure that OODBs even exist
> in the sense that I would like.
>
> Here's one thing I'm thinking about...
>
> When one class inherits from another, we take it for granted
> that they are "type compatible." Anywhere a Mammal can be
> used, we can specify a Dog. (Insert standard discussion of
> duck typing here.)
>
> I'm thinking of the usual paradigm where a table is an ordered
> sequence of fields. Neglecting the methods (which is another
> issue entirely), we can think of the fields as instance vars.
>
> But if I have a database table of Mammals... I can't store a
> Dog in it. Hmm. I think I wish I could?

Hal, you're thinking too much in the relational model still. Real
OODBs don't have tables, they have graphs of objects, just like in
memory. They tend to use persistence by reachability, which means that
they have a defined root object, and any object that you can get to
from that root object - regardless of type - gets stored in the
database. The only "table" of Mammals would be if you hung an array of
them off of the root, and you could stick whatever you wanted in that
array. This means the database has to be strongly, dynamically typed:
each object record knows what its own class is, and thus which fields
it has.
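
A rough Ruby sketch of persistence by reachability. It assumes hypothetical Mammal and Dog classes and uses Ruby's built-in Marshal as a stand-in for a real OODB store, so it only illustrates the idea (one root, everything reachable gets saved, each record remembers its class):

```ruby
# Hypothetical classes; Marshal (Ruby core) stands in for a real OODB store.
class Mammal
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

class Dog < Mammal
  attr_reader :fur

  def initialize(name, fur)
    super(name)
    @fur = fur
  end
end

# The root object: anything reachable from it gets stored.
root = { mammals: [Mammal.new("generic"), Dog.new("Rex", "silky")] }

image    = Marshal.dump(root)   # serializes the whole graph under the root
restored = Marshal.load(image)  # rebuilds the graph, classes included

plain = restored[:mammals].first
rex   = restored[:mammals].last
```

After the round trip, rex is still a Dog with its fur field, sitting in the same array as a plain Mammal that has no such field.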

Note that this makes it much harder for the database to do automatic
indexing the way RDBMS do. Depending on the OODB, you may well have to
take care of this yourself, making sure that you have appropriate data
structures so that you can access any object by traversing a minimum of
pointer references and using as few large objects as possible (it's
usually a good idea to use trees of some kind for lookup rather than
massive hashtables, for example).
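
As a sketch of what "taking care of it yourself" can look like, here is a deliberately tiny, unbalanced binary search tree used as a hand-built index. All names are hypothetical, and a real OODB application would more likely use a balanced tree or B-tree; the point is only that a lookup touches a handful of small node objects rather than one huge one:

```ruby
# A hand-built index: small linked nodes instead of one massive hashtable.
# Hypothetical and unbalanced; shown only to illustrate the shape of the idea.
class TreeIndex
  Node = Struct.new(:key, :value, :left, :right)

  def initialize
    @root = nil
  end

  def insert(key, value)
    @root = insert_at(@root, key, value)
    self
  end

  # Follows one pointer per level until the key is found or the path ends.
  def find(key)
    node = @root
    while node
      return node.value if key == node.key
      node = key < node.key ? node.left : node.right
    end
    nil
  end

  private

  def insert_at(node, key, value)
    return Node.new(key, value, nil, nil) if node.nil?

    if key == node.key
      node.value = value
    elsif key < node.key
      node.left = insert_at(node.left, key, value)
    else
      node.right = insert_at(node.right, key, value)
    end
    node
  end
end

Dog = Struct.new(:name, :fur)

by_name = TreeIndex.new
[Dog.new("Rex", "silky"), Dog.new("Fido", "long"), Dog.new("Spot", "short")].each do |dog|
  by_name.insert(dog.name, dog)
end

found   = by_name.find("Fido")
missing = by_name.find("Lassie")
```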

Gabriele mentioned GOODS. That's a reasonable open source choice if
you want a large-scale networked OODB, although you have to be very
careful about performance because every object access goes over the
network. I've written client libraries for both Squeak and Python (on
contract) for this, and porting the same design to Ruby would be very
doable - although the lack of a real WeakReference implementation will
make performance suffer. If you're careful, you can achieve
cross-language data sharing with this (I've done simple demos of having
Java, Python, and Squeak all accessing the same object data).

The author of GOODS also has a simpler on-disk OODB called DyBase, which
already has Ruby bindings; you might want to take a look at it for
smaller-scale projects.

If you have more questions about OODBs, my general advice is to find a
Smalltalker - for whatever reason, in the Smalltalk world they're still
the default industrial choice (often in the form of GemStone, but
that's an expensive package targeted at big business; some of the heavy
users are JP Morgan, Washington Mutual, and the OOCL container lines).

Cheers,
Avi

Austin Ziegler

3/25/2005 4:41:00 PM

On Fri, 25 Mar 2005 15:24:47 +0900, Asfand Yar Qazi <ay1204@qazi.f2s.com> wrote:
> There are loads of OO databases for that bloody 'J' language - but
> none for any scripting languages! Not even for Python! I find that
> utterly surprising. E.g. http://www.garret.ru/~knizhnik/...
>
> Perhaps Ruby needs to be the innovator in this field, shipping a full
> file storage based OODBMS system with the standard library in the
> future? :-)

There's no such thing as a good OODBMS.

Period.

-austin
--
Austin Ziegler * halostatue@gmail.com
* Alternate: austin@halostatue.ca

Glenn Parker

3/25/2005 4:52:00 PM

Austin Ziegler wrote:
>
> There's no such thing as a good OODBMS.
>
> Period.

+1

--
Glenn Parker | glenn.parker-AT-comcast.net | <http://www.tetrafoi...

Austin Ziegler

3/25/2005 4:57:00 PM

On Fri, 25 Mar 2005 19:34:48 +0900, Avi Bryant <avi.bryant@gmail.com> wrote:
> Hal Fulton wrote:
>> I've been thinking about OO databases -- never having really used
>> such a beast. In fact, I am not sure that OODBs even exist in the
>> sense that I would like.
>>
>> Here's one thing I'm thinking about...
>>
>> When one class inherits from another, we take it for granted that
>> they are "type compatible." Anywhere a Mammal can be used, we can
>> specify a Dog. (Insert standard discussion of duck typing here.)
>>
>> I'm thinking of the usual paradigm where a table is an ordered
>> sequence of fields. Neglecting the methods (which is another
>> issue entirely), we can think of the fields as instance vars.
>>
>> But if I have a database table of Mammals... I can't store a Dog
>> in it. Hmm. I think I wish I could?
> Hal, you're thinking too much in the relational model still. Real
> OODBs don't have tables, they have graphs of objects, just like in
> memory. They tend to use persistence by reachability, which means
> that they have a defined root object, and any object that you can
> get to from that root object - regardless of type - gets stored in
> the database. The only "table" of Mammals would be if you hung an
> array of them off of the root, and you could stick whatever you
> wanted in that array. This means the database has to be strongly,
> dynamically typed: each object record knows what its own class is,
> and thus which fields it has.

> Note that this makes it much harder for the database to do
> automatic indexing the way RDBMS do. Depending on the OODB, you
> may well have to take care of this yourself, making sure that you
> have appropriate data structures so that you can access any object
> by traversing a minimum of pointer references and using as few
> large objects as possible (it's usually a good idea to use trees
> of some kind for lookup rather than massive hashtables, for
> example).

There are three fundamental problems with OODBMSs -- and they stem
from the theory of OODBMS, not just the implementation. Avi just
pointed out the first -- the indexing on OODBMSs utterly sucks,
which means that the performance has to come from your object model,
whereas an RDBMS or ORDBMS can have additional indexing applied or
derived from how the application uses the data. The performance for
an OODBMS will therefore suffer far quicker than the performance for
an (O)RDBMS; this is due to the second fundamental problem.

Avi also suggests the second, but it's not clearly stated as a
fundamental problem: persistence by reachability. This specifically
means that they are very Pythonic: there's only *ONE* way to get at
a particular set of data. In a traditional RDBMS, you can access
data from many directions. Consider the typical example of "owned"
data, an order and its itemised details (e.g., the items on the
order). In an OODBMS, the only way to determine how many Widgets you
sold last month is to either (1) traverse the object graph for all
orders last month and visit the order details or (2) store the data
in both details and elsewhere, doubling your data storage for that
information. In an (O)RDBMS, you can simply query the order details
table without having to visit the orders themselves. This makes your
data much more accessible, and usable for ways that may not have
been envisioned by the original architects or designers. An OODBMS
locks you into a particular object model to access the data, leading
directly to the third problem.
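
The two options can be sketched in Ruby with a hypothetical in-memory object model (Order and Detail are made-up names, and the data is invented purely for illustration). Option (1) walks the graph; option (2) keeps a redundant tally that has to be maintained alongside it:

```ruby
require "date"

# Hypothetical object model: details are reachable only through their order.
Detail = Struct.new(:item, :quantity)
Order  = Struct.new(:date, :details)

orders = [
  Order.new(Date.new(2005, 2, 10), [Detail.new("Widget", 3), Detail.new("Gadget", 1)]),
  Order.new(Date.new(2005, 2, 20), [Detail.new("Widget", 2)]),
  Order.new(Date.new(2005, 3, 1),  [Detail.new("Widget", 7)])
]

# Option (1): visit every order in the period, then visit its details.
def widgets_sold(orders, year, month)
  orders
    .select { |o| o.date.year == year && o.date.month == month }
    .flat_map(&:details)
    .select { |d| d.item == "Widget" }
    .sum(&:quantity)
end

# Option (2): a second, redundant structure kept in step with the graph.
sales_by_item_month = Hash.new(0)
orders.each do |o|
  o.details.each do |d|
    sales_by_item_month[[d.item, o.date.year, o.date.month]] += d.quantity
  end
end

traversed = widgets_sold(orders, 2005, 2)
tallied   = sales_by_item_month[["Widget", 2005, 2]]
```

Both give the same answer for February 2005; the difference is whether you pay at query time (traversal) or in storage and upkeep (the extra tally).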

Portability. If you need to change the object model at all in an
object database, you have to go through a full-scale migration of
the entire data store. In an (O)RDBMS, you only have to migrate
the tables affected (and sometimes not even that).
If you want to change your OODBMS implementation, it's infinitely
harder than the already difficult problem of migrating (O)RDBMS
implementations.

OODBs are crap -- and always will be.

-austin
--
Austin Ziegler * halostatue@gmail.com
* Alternate: austin@halostatue.ca

Doug Beaver

3/25/2005 6:18:00 PM

On Sat, Mar 26, 2005 at 01:56:53AM +0900, Austin Ziegler wrote:

[ snip good summary of three fundamental problems of OODBMSs (hard to
index, rigid access patterns, poor portability) ]

> OODBs are crap -- and always will be.

the nice thing is that if you want to enforce a type hierarchy or
support different numbers of columns for a given class of objects, an
object-relational mapping layer will give you 80% of what the OODBMS is
giving you, without all the OODBMS baggage. "dynamic" columns are
pretty easy to set up with an RDBMS, as long as you don't mind reading in
multiple rows in order to recreate an object, along with a cheap join
against a domain table holding the column definitions. with modern
databases, it ends up being pretty fast.
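
a rough sketch of the "dynamic columns" layout, using plain Ruby arrays as stand-ins for the two tables. every table and column name here is made up for illustration; in a real RDBMS the select-and-lookup below would be a query with a join:

```ruby
# In-memory stand-ins for two tables (all names hypothetical):
#   column_defs: the "domain table" holding the column definitions
#   attr_rows:   one row per (object, column) pair
column_defs = [
  { id: 1, name: "legs" },
  { id: 2, name: "fur" }
]

attr_rows = [
  { object_id: 10, column_id: 1, value: "4" },
  { object_id: 10, column_id: 2, value: "silky" },
  { object_id: 11, column_id: 1, value: "2" }
]

# Rebuild one object: read its rows and join each against column_defs.
def load_object(object_id, attr_rows, column_defs)
  names = column_defs.to_h { |c| [c[:id], c[:name]] }
  attr_rows
    .select { |r| r[:object_id] == object_id }
    .to_h { |r| [names.fetch(r[:column_id]), r[:value]] }
end

dog    = load_object(10, attr_rows, column_defs)  # has both legs and fur
mammal = load_object(11, attr_rows, column_defs)  # has legs only
```

object 10 comes back with two attributes and object 11 with one, so objects of the same kind can carry different numbers of "columns".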

i think ORM is already a de facto third class of database (as weird as
that sounds at first glance), and the ORM domain of solutions is only
going to get better at caching, prediction, and performance. i've been
working with hibernate lately and am pretty impressed with all the
optimizations it can make, my hand-coded database layers almost matched
its performance, and took literally dozens of hours longer to implement.

doug

p.s. i think the OODBMS manifesto is a good read, if anyone wants to
learn the motivations that led to creating them in the first place.

http://www-2.cs.cmu.edu/People/clamen/OODBMS/Manifesto/htManifesto/Mani...

--
"Contrary to what most people say, the most dangerous animal in the
world is not the lion or the tiger or even the elephant. It's a shark
riding on an elephant's back, just trampling and eating everything they
see." -- Jack Handey

gabriele renzi

3/25/2005 6:30:00 PM

> Avi also suggests the second, but it's not clearly stated as a
> fundamental problem: persistence by reachability. This specifically
> means that they are very Pythonic: there's only *ONE* way to get at
> a particular set of data. In a traditional RDBMS, you can access
> data from many directions. Consider the typical example of "owned"
> data, an order and its itemised details (e.g., the items on the
> order). In an OODBMS, the only way to determine how many Widgets you
> sold last month is to either (1) traverse the object graph for all
> orders last month and visit the order details or (2) store the data
> in both details and elsewhere, doubling your data storage for that
> information. In an (O)RDBMS, you can simply query the order details
> table without having to visit the orders themselves. This makes your
> data much more accessible, and usable for ways that may not have
> been envisioned by the original architects or designers. An OODBMS
> locks you into a particular object model to access the data, leading
> directly to the third problem.

I think there is a need to clarify (at least for me, since I'm dumb and
I generally don't understand things).
There is one kind of thing going under the name of OODB which seems to
meet your description, but there is another one, the one advocated by the
ODMG, which seems to conflict with it.
For example, the Object Query Language seems to allow the same kind of
access to objects that you could get via SQL.

Also I'm not sure I understand: once you have access to the root object,
if you want to have different views of the data, couldn't you just create
an object which actually is this view?
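
Something like this hypothetical Ruby sketch is what I have in mind (DetailView, Row, Order and Detail are all invented names, not any OODB's API): the view is just another object that computes a flat, table-like listing from the root on demand.

```ruby
# Hypothetical model: a view object that flattens the details reachable from the root.
Detail = Struct.new(:item, :quantity)
Order  = Struct.new(:id, :details)
Row    = Struct.new(:order_id, :item, :quantity)

class DetailView
  include Enumerable

  def initialize(root)
    @root = root
  end

  # Yields one Row per order detail, computed from the graph on each call.
  def each
    return enum_for(:each) unless block_given?

    @root[:orders].each do |order|
      order.details.each { |d| yield Row.new(order.id, d.item, d.quantity) }
    end
  end

  def total_for(item)
    select { |row| row.item == item }.sum(&:quantity)
  end
end

root = {
  orders: [
    Order.new(1, [Detail.new("Widget", 3), Detail.new("Gadget", 1)]),
    Order.new(2, [Detail.new("Widget", 2)])
  ]
}

view         = DetailView.new(root)
widget_total = view.total_for("Widget")
row_count    = view.count
```

Of course this still traverses the orders underneath, so it changes how the data looks to the caller, not how much work the lookup costs.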

comp.lang.ruby
