Brian Candler
10/7/2004 4:37:00 PM
On Thu, Oct 07, 2004 at 11:31:37PM +0900, Kirk Haines wrote:
> That's an interesting suggestion, as it gives one a defense against some
> external process coming in and changing things unexpectedly. Putting this
> on my todo for Kansas!
That's another O<->R mapping engine? Looks like there's a big ecosystem to
choose from :-)
> > These 'old' properties might also be useful for object-level
> > transactions, without serialisation as
> > ActiveRecord/Transaction::Simple does.
>
> In Kansas, it keeps each old value within the context of a transaction, but
> after the transaction is committed, the old values go away. I don't provide
> any interface, however, for actually accessing the old values at any time. Can
> you elaborate more on ways that you would see the old values being used?
> This is quite educational.
I think that once they're committed to the database, the old values are no
longer needed (and indeed, should be reset to the known current values in
the DB). The 'old' values represent a database snapshot at the last instant
that the DB was touched, and I think this is useful even before a
transaction starts.
My main worry is users A and B both pulling up the same record onto a
screen, making changes, and then writing back both; the one who gets there
first risks having their changes overwritten. Checking that the old values
have not changed, as part of an atomic update, is simple and robust, and
doesn't require record locking.
I wrote a system once which incorporated this concept, but since it had a
web interface, I think I ended up putting the 'old' values in hidden form
variables anyway.
> As for object-level transactions without serialization, I'm not sure about
> AR, but with Kansas one can set a read-only attribute on data objects, and
> can direct a query to return the data with the objects already in a read-
> only state. While read-only is on, changes to the object will not get
> serialized back to the db.
OK, I think perhaps we're talking about something else. With AR (which I've
only looked at briefly), you can make changes to an object within a
transaction, and if it fails, the objects themselves are rolled back to
their state at the start of the transaction. This is done using Austin
Ziegler's Transaction::Simple library, which just keeps a copy of the object
in marshalled form in an instance variable. It rolls it back using, in
outline:
    r = Marshal.restore(@__foo__)
    self.replace(r) if respond_to?(:replace)
    r.instance_variables.each do |i|
      instance_variable_set(i, r.instance_variable_get(i))
    end
I was just thinking that if you're keeping the properties in a hash, and
have a separate hash for their snapshot values, then you get this capability
for free:
@props = @oldprops.dup
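In outline, the two-hash layout might look like this (the class and method
names are illustrative, not Kansas's actual API): @props is the working copy,
@oldprops the snapshot as last read from the DB.

```ruby
class DataObject
  def initialize(props)
    @oldprops = props.dup   # snapshot as read from the DB
    @props    = props.dup   # working copy
  end

  def [](key)
    @props[key]
  end

  def []=(key, val)
    @props[key] = val
  end

  # rollback for free: throw away the working copy
  def rollback
    @props = @oldprops.dup
  end

  # the columns an atomic UPDATE would need to SET (and check in WHERE)
  def dirty
    @props.reject { |k, v| @oldprops[k] == v }
  end
end
```

The same snapshot then serves double duty: rollback within a transaction,
and the WHERE-clause values for the atomic update described above.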
But it's not clear to me whether the best approach is to have 'obj'
containing both old and new values, or whether you should just have two
separate objects representing then and now:
    obj1 = $db.get(n)
    obj2 = obj1.dup
    ... make changes to obj2
    obj2.save(obj1)  # => generates SQL to update obj1 to obj2 atomically
Incidentally, talking about the OR ecosystem, I have some proof-of-concept
code which takes XML and stores it in SQL. It takes a stream of tag_start(),
element(), text(), tag_end() method calls (as generated by an
REXML::StreamParser), and writes rows in a DB. <foo>text</foo> ends up as a
single row. It can then spool out any subtree by doing a single 'select' and
converting this back into method calls; there are helper methods to turn a
method call stream into either XML text or a tree of REXML::Element objects.
The nice thing is that no intermediate representation is built in RAM, so
in principle you could spool gigabytes of XML in and out of SQL.
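A rough sketch of the stream-to-rows idea: rows are collected in an array
here, where the real code would INSERT into the global 'elements' table;
mixed content is ignored for brevity, and all the names are illustrative.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Each element becomes one row: (id, parent_id, name, text).
# A stack of open-element ids supplies parent_id, so no tree is
# ever built in RAM.
class RowWriter
  include REXML::StreamListener
  attr_reader :rows

  def initialize
    @rows = []
    @stack = []
    @next_id = 0
  end

  def tag_start(name, _attrs)
    @next_id += 1
    @rows << { id: @next_id, parent_id: @stack.last, name: name, text: nil }
    @stack.push(@next_id)
  end

  def text(s)
    # naive: attaches text to the most recent element -- enough for
    # the <foo>text</foo> single-row case
    @rows.last[:text] = s unless s.strip.empty?
  end

  def tag_end(_name)
    @stack.pop
  end
end

listener = RowWriter.new
REXML::Document.parse_stream("<foo><bar>hello</bar></foo>", listener)
```

Spooling back out is the reverse: select the subtree's rows ordered by id
and replay them as tag_start/text/tag_end calls.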
Dunno if anyone is interested in this... I was going to add a mechanism to
convert XPATH queries into SQL queries, and was starting to think about how
different element types could be mapped to different tables, at which point
it starts to look like an OR mapping solution. At the moment there are
just global 'elements' and 'attributes' tables. I will look at Kansas, AR,
NDB and others and see what good ideas I can steal from them :-)
But in principle an object with attributes should map quite nicely to
<class attr1=val1 attr2=val2...>
Having a hierarchy can be useful too, e.g. for access control, where a
user can only "see" objects which are below them in the tree. And XML is
still useful as an export/import tool, even if most people on this list use
YAML anyway :-)
Regards,
Brian.