Asp Forum - Modified Single Table Inheritance

Ryan Glover

12/26/2006 10:51:00 PM

Hello,

I am developing an application that will have 100's of model classes
each derived from a single source class. Each model will have 4 similar
attributes but then have about a dozen unique ones each. I looked into
using STI but balked at creating tables with a 1000 columns. I then
looked at using inherits_from and creating an new table for each class
that holds the new class columns. Problem is, I'll end up with 100's of
tables, which also does not appeal to me.

My third solution, the one I am asking advice on, is this.

I create a single table with the 4 common columns and a dozen columns
with names such as float1, float2, float3, ... , string1, string2, ..
etc. Creating enough columns of each type to cover my largest subclass.
Then I use STI and for each class I map the generic columns to the nicer
names inside the subclasses. So in one class float1 may be miles/hour
while in another class it might be turkeys/hectare.

Does this sound like a reasonable approach?

Also, how would I map the generic names to nice names in my subclasses?
(I am a RoR noob)

And of course, does anyone see a much better way to do what I have
proposed?

Thank you,
Ryan Glover

--
Posted via http://www.ruby-....

9 Answers

Eric Hodel

12/26/2006 11:02:00 PM

On Dec 26, 2006, at 14:51, Ryan Glover wrote:

> I am developing an application that will have 100's of model classes
> each derived from a single source class. Each model will have 4
> similar
> attributes but then have about a dozen unique ones each. I looked
> into
> using STI but balked at creating tables with a 1000 columns. I then
> looked at using inherits_from and creating an new table for each class
> that holds the new class columns. Problem is, I'll end up with
> 100's of
> tables, which also does not appeal to me.

I think you want the Rails mailing list.

--
Eric Hodel - drbrain@segment7.net - http://blog.se...

I LIT YOUR GEM ON FIRE!

Mat Schaffer

12/27/2006 4:41:00 AM

On Dec 26, 2006, at 5:51 PM, Ryan Glover wrote:
> My third solution, the one I am asking advice on, is this.
>
> I create a single table with the 4 common columns and a dozen columns
> with names such as float1, float2, float3, ... , string1, string2, ..
> etc. Creating enough columns of each type to cover my largest
> subclass.
> Then I use STI and for each class I map the generic columns to the
> nicer
> names inside the subclasses. So in one class float1 may be miles/hour
> while in another class it might be turkeys/hectare.
>
> Does this sound like a reasonable approach?
>
> Also, how would I map the generic names to nice names in my
> subclasses?
> (I am a RoR noob)
>
> And of course, does anyone see a much better way to do what I have
> proposed?

Even though we're not technically on the RoR list, I thought this was
still an interesting question for ORM, STI, OOP and probably some
other acronyms too :)

I'd be curious to hear just what it is that you're doing that needs
100s of classes, first. Sounds like you might be taking OOP a bit
too far for your purposes.

But to answer your question, I like the concept but it brings up fear
of bugs. Dropping meaningful column names will wreak havoc on any
cases where you have to write a query by hand. But everything is a
trade off. Personally, I'd try the 100s of tables route first and
see how it panned out.

Best of luck. Sounds like a tricky one.
-Mat

David Vallner

12/28/2006 12:47:00 AM

Ryan Glover wrote:
> Hello,
>
> I am developing an application that will have 100's of model classes
> each derived from a single source class. Each model will have 4 similar
> attributes but then have about a dozen unique ones each.

Why are they deriving from a single class anyway?

If it's only for a few identical attributes without significant
functionality to go along, I'd say it's a bad idea to use inheritance.
And with latent typing in Ruby, you Rarely If Ever need inheritance
anyway. Saves you the bother that is inheritance mapping at any rate.

> Problem is, I'll end up with 100's of
> tables, which also does not appeal to me.
>

I can't understand this. It's rather common that an ORM database schema
has a number of tables in the same order of magnitude as there are model
classes. My biased opinion is that data should be modelled on the
database level where you have some theorethical foundations about what
are "good shapes" for the data. It also doesn't hurt to at least make
the DB schema somewhat document the data model - the data an application
gathers / stores is the most likely part of it to persist in the long
term. Of course, you're probably in a better position to judge this.

> My third solution, the one I am asking advice on, is this.
>
> I create a single table with the 4 common columns and a dozen columns
> with names such as float1, float2, float3, ... , string1, string2, ..
> etc. Creating enough columns of each type to cover my largest subclass.
> Then I use STI and for each class I map the generic columns to the nicer
> names inside the subclasses. So in one class float1 may be miles/hour
> while in another class it might be turkeys/hectare.
>
> Does this sound like a reasonable approach?
>

Database schemas like this tend to make it into The Daily WTF with some
regularity.

(Unfortunately, they also make it into production systems.)

If there's a chance someday, someone, somewhy might want to access your
data at a DB level (using a reporting tool maybe), there'd be no end to
the grief.

Or, if a bug in your code gets the database into an inconsistent state,
and you have to drop down to SQL to patch things up.

Avoid.

> And of course, does anyone see a much better way to do what I have
> proposed?
>

Use migrations to create the schema from scratch if you just want to
avoid writing a lot of SQL up front?

David Vallner

Ryan Glover

12/28/2006 1:19:00 AM

Hello David,

> Why are they deriving from a single class anyway?

All the objects are the same thing, so to speak. As a silly example of
what I am trying to do, let's say that each user an array named fruit.
There are many types of fruit and each have their own unique attributes
(# of grapes, fuzziness index of peach, banana radius) but they are all
fruit. My program is just like this, except I have hundreds of fruit.
Perhaps even a thousand. Ideally I would like to be able to add more
fruit as the program evolves just by adding a new fruit subclass and a
new fruit view.

>
> I can't understand this. It's rather common that an ORM database schema
> has a number of tables in the same order of magnitude as there are model
> classes. My biased opinion is that data should be modelled on the
> database level where you have some theorethical foundations about what
> are "good shapes" for the data. It also doesn't hurt to at least make
> the DB schema somewhat document the data model - the data an application
> gathers / stores is the most likely part of it to persist in the long
> term. Of course, you're probably in a better position to judge this.
>

I guess my problem with 100's perhaps a 1000 tables is db performance.
I don't have oracle, only mysql or postgres. Performance wise, wouldnt
it be better to have a few thousand rows in one table as opposed to a
few dozen spread across 100's of tables?

> Database schemas like this tend to make it into The Daily WTF with some
> regularity.
>
> (Unfortunately, they also make it into production systems.)
>
> If there's a chance someday, someone, somewhy might want to access your
> data at a DB level (using a reporting tool maybe), there'd be no end to
> the grief.
>
> Or, if a bug in your code gets the database into an inconsistent state,
> and you have to drop down to SQL to patch things up.
>
> Avoid.

I appreciate all these warnings and I understand what you are saying.
In this case I am client for the project so I'll take my lumps as they
come.

Thanks for your response,
Ryan Glover

--
Posted via http://www.ruby-....

David Vallner

12/28/2006 1:46:00 AM

Ryan Glover wrote:
>> I can't understand this. It's rather common that an ORM database schema
>> has a number of tables in the same order of magnitude as there are model
>> classes. My biased opinion is that data should be modelled on the
>> database level where you have some theorethical foundations about what
>> are "good shapes" for the data. It also doesn't hurt to at least make
>> the DB schema somewhat document the data model - the data an application
>> gathers / stores is the most likely part of it to persist in the long
>> term. Of course, you're probably in a better position to judge this.
>>
>
> I guess my problem with 100's perhaps a 1000 tables is db performance.
> I don't have oracle, only mysql or postgres. Performance wise, wouldnt
> it be better to have a few thousand rows in one table as opposed to a
> few dozen spread across 100's of tables?
>

The history of computing knows only one answer to that question:
benchmark. (And not even that's much of an answer.)

However, I have very, very few doubts that separate tables would be
faster in a read-mostly scenario. The records for that model are
filtered at insert time instead of query time, and deciding between
tables is probably way faster than on any ad hoc type index column.

Generally, a well-designed schema will probably buy you more performance
in the long run than any sort of peephole hand-optimizations. You want
to minimize the number of round-trip hits on the database and the volume
of the datasets retrieved for that (often conflicting goals), and a
clear schema helps do just that.

David Vallner

Mat Schaffer

12/28/2006 4:24:00 AM

On Dec 27, 2006, at 8:19 PM, Ryan Glover wrote:
> All the objects are the same thing, so to speak. As a silly
> example of
> what I am trying to do, let's say that each user an array named fruit.
> There are many types of fruit and each have their own unique
> attributes
> (# of grapes, fuzziness index of peach, banana radius) but they are
> all
> fruit. My program is just like this, except I have hundreds of fruit.
> Perhaps even a thousand. Ideally I would like to be able to add more
> fruit as the program evolves just by adding a new fruit subclass and a
> new fruit view.

Well, I'm guessing fruit is just an example. But I think you could
still group the fruits into subcategories or something to that effect
and end up with reasonable inheritance. For example, berries
(strawberry, raspberry) , pitted fruits (peach, concord grape), not
sure what you'd call a banana but I'd bet an encyclopedia would have
some info. But I think you get the idea. To get the specific kind
of fruit, you could just use a type field and get single table
inheritance on the subcategory.

It'll depend on your data model. But I'm inclined to agree with the
rest of group that there are either relationships or duck-typing
advantages that you're not utilizing.

And David has a point of schemas that include things like float1,
etc... do often show up on the daily WTF. That's might be reason
enough to stay away.

Thinking another way, what about having a fruit table and a trait
table. The trait table could reference a large number of
trait_types. The trait table would reference a fruit, a trait_type
and a value. You could probably get away with just 3 or 4 value
columns in the trait table. Just a thought.
-Mat

Paulo Köch

12/28/2006 4:38:00 AM

Stephen Richards

12/31/2006 5:09:00 PM

Ryan Glover wrote:
> Hello David,
>
>> Why are they deriving from a single class anyway?
>
> All the objects are the same thing, so to speak. As a silly example of
> what I am trying to do, let's say that each user an array named fruit.
> There are many types of fruit and each have their own unique attributes
> (# of grapes, fuzziness index of peach, banana radius) but they are all
> fruit. My program is just like this, except I have hundreds of fruit.
> Perhaps even a thousand. Ideally I would like to be able to add more
> fruit as the program evolves just by adding a new fruit subclass and a
> new fruit view.
>

Deriving a subclass for each type of different fruit isn't a scalable or
viable solution.

I presume your app will use a database, so why not use it to do all the
work.

A table for fruit containing all the attributes that remain constant
accross all fruit, (name, origin perhaps).

A table of attributes: diameter, colour, fuzziness index, length,
quantity_in_bunch. There could also be a field on this table to
determine the type: numeric, string, boolean, etc.

A join table to determine which fruit get which attribute.

then when you get an order, the app generates an ordered fruit attribute
for each attribute for that particular fruit and prompts the user to
fill in the data.

Any new fruit that come along in the future just get added to the
database with the correct attributes, and there is no extra work to do
in the app - it's all there.

Does that sort of architecture meet the requirement?

sr

uNmaivirumbi

12/14/2011 11:14:00 PM

On Dec 14, 3:25 pm, use...@mantra.com and/or www.mantra.com/jai (Dr.
Jai Maharaj) wrote:
> In article <85a6fbbf-474f-4603-a259-ea38bef02...@e2g2000vbb.googlegroups.com>,

> Is it that Hindus are easily fooled by Congress, Christians, Muslims
> and communists, or that being so fooled is an appearance created by
> the fact that there just aren't many true Hindus left to oppose the
> anti-Hindu forces? After centuries of invasions, terrorism massacres,
> forced conversion and illegal occupation by outsiders the numbers of
> true Hindus have been reduced significantly. But the good news is
> that the seed of Sanatan Dharm remains intact to carry us Hindus
> through the great ages to the next Satyug.

Good thoughts. Hope there is a real resurgence of sanathana dharma

Thanks

comp.lang.ruby

Modified Single Table Inheritance

Ryan Glover

Eric Hodel

Mat Schaffer

David Vallner

Ryan Glover

David Vallner

Mat Schaffer

Paulo Köch

Stephen Richards

uNmaivirumbi

x Login to ForumsZone