Austin Ziegler
2/18/2005 2:15:00 PM
On Fri, 18 Feb 2005 22:09:47 +0900, Robert Klemme <bob.news@gmx.net>
wrote:
> "Austin Ziegler" <halostatue@gmail.com > schrieb im Newsbeitrag
> news:9e7db91105021804385d91a8ac@mail.gmail.com...
>> On Fri, 18 Feb 2005 18:34:51 +0900, Phil Tomson
>> <ptkwt@aracnet.com> wrote:
>> [...]
>>> "[...]This is a far different approach than UML's general,
>>> wide-purpose models. For example, UML class diagrams can be used
>>> for conceptual modeling, object-oriented analysis modeling,
>>> object-oriented design modeling, logical data modeling and
>>> physical data modeling.[...]"
>> FWIW, the author of that statement is wrong. UML cannot be
>> effectively used for either logical or physical data modeling.
>> It's too based in object modeling. People who use UML for ER/data
>> modeling are making a huge mistake. There's far better modeling
>> methodologies out there than UML.
> Just curious: what do you think are the major shortcomings of UML
> that make it inappropriate for ER modeling?
I laid out much of this in a response to Markus in the first Ilias
thread, but I'll repeat it here.
Everything. UML has a bunch of problems, stemming mostly from the
fact that it tries to be capable of modeling everything all the
time. That means that to attempt ER/data modeling, you have to
repurpose the class diagram. Except that tables don't have methods,
so you have to ignore that part entirely. And UML uses C/C++/Java
types explicitly, which by and large aren't as rich as SQL types or
don't map well to them.
Object oriented mechanisms are generally good at what I call
"one-way browsing." That is to say that there's generally only one
way to access certain pieces of information in a given object
model. A
perfect example would be the case of a cellular phone company's
billing policies. Typically, a customer (C) will have one or more
rate plans (P) that can also be assigned to other customers; a
standard many-to-many relationship.
The object model for this will typically design C and P as separate
classes, but C will contain a "list" of P objects. The reverse
relationship (where P contains a "list" of C objects) will generally
not be implemented (indeed, it is usually seen as a BAD THING for a
component to have to know about its parents), which means that an
object model (or an OO database, because its object model *is* its
data model) will only be able to browse P objects one way -- through
C objects. To find which C objects contain P[35], you have to browse
through every C object.
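
To make the one-way browsing concrete, here is a minimal sketch in
Python. The Customer and RatePlan classes, the customers_on_plan
helper, and the sample data are all hypothetical names made up for
illustration; nothing here comes from a real billing system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RatePlan:
    """A rate plan (P). It holds no reference back to its customers."""
    code: str


@dataclass
class Customer:
    """A customer (C) that owns a list of rate plans."""
    name: str
    plans: List[RatePlan] = field(default_factory=list)


def customers_on_plan(customers: List[Customer], code: str) -> List[Customer]:
    """Find the customers holding a plan by scanning every customer."""
    return [c for c in customers if any(p.code == code for p in c.plans)]


# Hypothetical sample data, for illustration only.
p35 = RatePlan("P35")
p36 = RatePlan("P36")
customers = [
    Customer("alice", [p35]),
    Customer("bob", [p35, p36]),
    Customer("carol", [p36]),
]

affected = customers_on_plan(customers, "P35")
print([c.name for c in affected])
```

Going from a customer to its plans is a single attribute access, but
the reverse question has to walk the whole customer list, because a
RatePlan knows nothing about who holds it.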
A relational model, however, will do the logical model as an
explicit many-to-many case, which generally forces the physical
model to have C, P, and CP tables. Now, because the relationship is
stored externally to both C and P, it's very easy to say:
  SELECT COUNT(*)
    FROM cp
   WHERE p = 'P35';
And you know immediately how many customers would be affected if you
change the definition of P35.
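
The C, P, and CP layout described above can be tried out with
Python's built-in sqlite3 module. This is only a sketch: the column
names (id, c, p), the sample rows, and the two helper functions are
assumptions chosen for illustration, not a schema from any real
system.

```python
import sqlite3

# In-memory database; the schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE c (id TEXT PRIMARY KEY);
    CREATE TABLE p (id TEXT PRIMARY KEY);
    -- The many-to-many relationship lives in its own table.
    CREATE TABLE cp (
        c TEXT NOT NULL REFERENCES c(id),
        p TEXT NOT NULL REFERENCES p(id),
        PRIMARY KEY (c, p)
    );
    INSERT INTO c (id) VALUES ('C1'), ('C2'), ('C3');
    INSERT INTO p (id) VALUES ('P35'), ('P36');
    INSERT INTO cp (c, p) VALUES
        ('C1', 'P35'), ('C2', 'P35'), ('C2', 'P36'), ('C3', 'P36');
    """
)


def customers_affected(plan_id):
    """Count the customers holding a plan, browsing from the P side."""
    row = conn.execute(
        "SELECT COUNT(*) FROM cp WHERE p = ?", (plan_id,)
    ).fetchone()
    return row[0]


def plans_for(customer_id):
    """List the plans of a customer, browsing from the C side."""
    rows = conn.execute(
        "SELECT p FROM cp WHERE c = ? ORDER BY p", (customer_id,)
    )
    return [r[0] for r in rows]


print(customers_affected("P35"))
print(plans_for("C2"))
```

Both directions are a single query against cp, which is the point:
neither C nor P owns the relationship, so neither direction is
privileged.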
UML is based on Jacobson, Booch and Rumbaugh; they are all object
oriented theorists. And they're wrong when it comes to the supremacy
of OO design over proper ER design. But UML encodes that bias and
forces you to make choices that you should not have to make in your
ER modeling if you're making the mistake of using UML for that
instead of the established models for ER. (Indeed, for some types of
database design -- typically data warehousing -- ER isn't appropriate
either; you need what is called Dimensional Modeling, and I know
very little about that.)
Note that I'd *never* suggest that ERD should be used for object
modeling. UML Class Diagrams are perfectly good for that. But that's
about *all* they're good for, and they're not always good for that
in all languages because they make assumptions about the language
that will be used. Products like ArcStyler might be able to help with
that, but they'll still never produce a good database schema from an
object model. Never.
-austin
--
Austin Ziegler * halostatue@gmail.com
* Alternate: austin@halostatue.ca