Asp Forum - Representing Undirected Edges (advice, please

Gavin Kistner

5/8/2005 4:24:00 PM

Summary
==================================
How do you represent and handle bidirectional links between items,
given that you will necessarily need two distinct variables for each
end? How then do you handle a directed traversal of those links?

(Sorry for the length of this email; the concept is simple but I
wanted to be clear about the trouble I'm having.)

Background
==================================
I got all caught up in the quiz I gave last week, researching graph/
network theory. As a result, I'm writing a generic library for
finding paths. More on this library when I get closer to finishing.

One thing that's causing me a bit of trouble is how to represent non-
directed edges between nodes. Let me explain the concept for those
not familiar with the terminology, and then the problem.

A non-directed edge between nodes is a fancy way of saying "ObjectA
links to ObjectB, and vice versa". So while a directed edge would be:
ObjectA -------------> ObjectB or ObjectA
<------------- ObjectB
an undirected edge is simply:
ObjectA <------------> ObjectB

Edges also may have other information associated with them (such as a
'weight' for the edge), so I currently have them pulled out into
their own class:

class Edge
attr_accessor :start_node, :end_node, :weight

def initialize( start_node, end_node, directed=false,
weight=nil )
@start_node, @end_node, @weight, @directed = start_node,
end_node, weight, directed
end

def directed?
@directed
end
end

To store both ends, I need two variables, and here's where the
trouble starts creeping in. In a directed edge it's necessary to
distinguish which node is the start and which is the end, but in an
undirected edge I want the class to be agnostic about this. So, for
example, I now have to redefine the == method to account for non-
directionality:

def ==( other_edge )
( !@directed == !other_edge.directed? )
and
( @weight == other_edge.weight )
and
(
( @start_node == other_edge.start_node && @end_node
== edge.end_node )
or
( !@directed && @start_node == other_edge.end_node
&& @end_node == edge.start_node )
)
end

So far I'm still staying afloat.

The Problem
==================================
Any path that links a few nodes together should be able to be
represented by its edges. So, for example, the path from 'a' to 'd'
a <-----> b <-----> c <-----> d
is represented by the ordered list of three edges:
<Edge @start_node=a @end_node=b>
<Edge @start_node=b @end_node=c>
<Edge @start_node=c @end_node=d>

If I want to recreate the list of nodes visited, I can just loop
through the edges and pick out the nodes. ... Except I can't, because
if these are non-directional edges, the list might look like:
<Edge @start_node=b @end_node=a>
<Edge @start_node=b @end_node=c>
<Edge @start_node=d @end_node=c>

* I can't just 'flip' the edges while creating the path, because the
same graph might be crawled multiple times to create multiple paths.

* I could use code to determine the correct node order (figuring out
which ends touch which) but that will be slightly messy and slower.
(Premature optimization is the root of all evil, but you gotta strive
for a good design at least.)

* I could store both the edge list AND the node list for a path, but
that's not very DRY or normalized.

* I could create a subclass of Edge which is a Link (since
essentially the problem is that I'm using undirected information to
represent directed information), but then I'd be duplicating
information. (Later changes to the weight of an edge would not be
reflected in the Link used in a Path.)

* Perhaps I could create a Link class which references edges and
expresses directionality. (This thought just occurred to me.)

I know I've hit this problem a few times before in programming, and
have never been pleased with any of my solutions.

Thanks for any insights/opinions! :)

--
"When I am working on a problem I never think about beauty. I only
think about how to solve the problem. But when I have finished, if
the solution is not beautiful, I know it is wrong."
- R. Buckminster Fuller

5 Answers

Gavin Kistner

5/8/2005 6:06:00 PM

Thanks for the prompt reply! Response inline.

On May 8, 2005, at 11:32 AM, Aleksi wrote:
> You never explicitly write what is the problem here. It's very hard
> to suggest solutions around it.

Sorry, you're right. I sort of stated it in the summary. It's not so
much a "how can I accomplish this?" question as a "what is the 'best'
way to solve it?"

> Then list [a, b, c, d] would be right. One can deduct nodes_on_path
> list from non_directional_edges_on_path if there are no loops,
> nodes are not visited twice or other fancy issues.

I think even if there are loops and revisitations it can be deduces,
as long as non_directional_edges_on_path is ordered.

> If your definition of path is "a list of successive nodes connected
> by non-directional edges, so that one can visit all nodes or edges
> or both in order they appear in path" then your list could be of form
>
> [a, <Edge b a>, b, <Edge b c>, c, <Edge d c>, d]
>
> or
>
> [[a,b,c,d], [<Edge b a>, <Edge b c>, <Edge d c>]
>
> or something else.
>
> So by stating problem clearly you probably have your answer in it
> already.
>
>
>> * I could store both the edge list AND the node list for a path,
>> but that's not very DRY or normalized.
>>
>
> The two latter forms I've written are not repeating oneself. But if
> you're strongly against that you can verify you _must repeat
> yourself_ if the solution requires it.

Correct, they're not really a violation of DRY, but if the node list
can be deduced from the ordered edge list (which it can) then it _is_
non-normalized; if my code somehow screws up and changes either the
node list or the edge list without properly updating the other, I
have an inconsistent state stored.

> 1) If you store just nodes_on_path, then you can find which edge
> has both ends and recover edges_on_path. Except it doesn't work
> with aforementioned special cases or even with multiple edges
> between same two nodes.

Aye - the multiple-edges-between-the-same-nodes case is one that I
allow and that would mess this case up.

> Where's the repetition?

I suppose it's trivial, but it's the extra data structure and two
more pointers to the 'real' start and end. I think it's becoming
apparent that I *am* succumbing to premature optimization. :)

> If there's beef in here, feel free to forward to the list. If not,
> I don't mind :).

Thanks, I will just for the shared knowledge.

(Original email follows in full for the record.)

Begin forwarded message:

> From: Aleksi <foobar@fuzzball.org.net>
> Date: May 8, 2005 11:32:45 AM MDT
> To: Gavin Kistner <gavin@refinery.com>
> Subject: Re: Representing Undirected Edges (advice, please)
>
>
> Gavin Kistner wrote:
>
>> The Problem
>> ==================================
>> Any path that links a few nodes together should be able to be
>> represented by its edges. So, for example, the path from 'a' to 'd'
>> a <-----> b <-----> c <-----> d
>> is represented by the ordered list of three edges:
>> <Edge @start_node=a @end_node=b>
>> <Edge @start_node=b @end_node=c>
>> <Edge @start_node=c @end_node=d>
>> If I want to recreate the list of nodes visited, I can just loop
>> through the edges and pick out the nodes. ... Except I can't,
>> because if these are non-directional edges, the list might look
>> like:
>> <Edge @start_node=b @end_node=a>
>> <Edge @start_node=b @end_node=c>
>> <Edge @start_node=d @end_node=c>
>>
>
> You never explicitly write what is the problem here. It's very hard
> to suggest solutions around it.
>
> For example, the latter list _is_ correct path through the graph,
> that is a list of (non-directional) edges that you progress
> through. That means that edge[1].node1 or node2 must be same as
> either edge[0].node1 or node2. And this condition is fulfilled with
> each step.
>
> If you want to use the path to state which were the nodes that were
> on the path, then you're using _different definition_ what is the
> path. Then list [a, b, c, d] would be right. One can deduct
> nodes_on_path list from non_directional_edges_on_path if there are
> no loops, nodes are not visited twice or other fancy issues.
>
> If your definition of path is "a list of successive nodes connected
> by non-directional edges, so that one can visit all nodes or edges
> or both in order they appear in path" then your list could be of form
>
> [a, <Edge b a>, b, <Edge b c>, c, <Edge d c>, d]
>
> or
>
> [[a,b,c,d], [<Edge b a>, <Edge b c>, <Edge d c>]
>
> or something else.
>
> So by stating problem clearly you probably have your answer in it
> already.
>
>
>> * I could store both the edge list AND the node list for a path,
>> but that's not very DRY or normalized.
>>
>
> The two latter forms I've written are not repeating oneself. But if
> you're strongly against that you can verify you _must repeat
> yourself_ if the solution requires it.
>
> 1) If you store just nodes_on_path, then you can find which edge
> has both ends and recover edges_on_path. Except it doesn't work
> with aforementioned special cases or even with multiple edges
> between same two nodes.
>
> 2) If you store just edges_on_path, you can't find correct order in
> case of loop (is it a->b->a->b->c or b->a->b->a->c).
>
> If you store both it might be easier to see why it's not repeating
> information by giving all things on graph an index. All nodes run
> 1..n (a=1, b=2, c=3, d=4) and all edges n+1..m (b-a=5, b-c=6, d-
> c=7). Now you can make path as an indexing array:
>
> path = [1, 6, 2, 6, 3, 7, 4]
>
> and edges_on_path = path.find_all? {|i| i <= 4}, and
> nodes_on_path = path.find_all? {|i| i => 5}.
>
> Where's the repetition?
>
> If there's beef in here, feel free to forward to the list. If not,
> I don't mind :).
>
> - Aleksi

--
(-, /\ \/ / /\/

ptkwt

5/8/2005 6:26:00 PM

In article <18E98197-BD8D-4B3A-BDF3-01D68B60DE51@refinery.com>,
Gavin Kistner <gavin@refinery.com> wrote:
>--Apple-Mail-2--176720861
>Content-Transfer-Encoding: 7bit
>Content-Type: text/plain;
> charset=US-ASCII;
> delsp=yes;
> format=flowed
>
>Summary
>==================================
>How do you represent and handle bidirectional links between items,
>given that you will necessarily need two distinct variables for each
>end? How then do you handle a directed traversal of those links?
>
>(Sorry for the length of this email; the concept is simple but I
>wanted to be clear about the trouble I'm having.)
>
>
>Background
>==================================
>I got all caught up in the quiz I gave last week, researching graph/
>network theory. As a result, I'm writing a generic library for
>finding paths. More on this library when I get closer to finishing.
>
>One thing that's causing me a bit of trouble is how to represent non-
>directed edges between nodes. Let me explain the concept for those
>not familiar with the terminology, and then the problem.
>
>A non-directed edge between nodes is a fancy way of saying "ObjectA
>links to ObjectB, and vice versa". So while a directed edge would be:
> ObjectA -------------> ObjectB or ObjectA
><------------- ObjectB
>an undirected edge is simply:
> ObjectA <------------> ObjectB
>
>Edges also may have other information associated with them (such as a
>'weight' for the edge), so I currently have them pulled out into
>their own class:
>
> class Edge
> attr_accessor :start_node, :end_node, :weight
>
> def initialize( start_node, end_node, directed=false,
>weight=nil )
> @start_node, @end_node, @weight, @directed = start_node,
>end_node, weight, directed
> end
>
> def directed?
> @directed
> end
> end
>
>To store both ends, I need two variables, and here's where the
>trouble starts creeping in. In a directed edge it's necessary to
>distinguish which node is the start and which is the end, but in an
>undirected edge I want the class to be agnostic about this. So, for
>example, I now have to redefine the == method to account for non-
>directionality:
>
> def ==( other_edge )
> ( !@directed == !other_edge.directed? )
> and
> ( @weight == other_edge.weight )
> and
> (
> ( @start_node == other_edge.start_node && @end_node
>== edge.end_node )
> or
> ( !@directed && @start_node == other_edge.end_node
>&& @end_node == edge.start_node )
> )
> end
>

The thing that strikes me here is that you probably want to have an
base edge class (or maybe actually a Mixin module to put it in more
Ruby-ish terms) that would define the basic functionality of an edge and
then have undirected and directed edge classes that derive from it.

In my experience, you either have a directed graph or an undirected graph
- I've never run into a situation where a graph can have both directed
and undirected edges (I suppose it's possible, and maybe that's what
you're trying to plan for, but in my experience in working with graphs
they're either one or another).

Seperating your directed graphs and undirected graphs could greatly
simplify things.

>So far I'm still staying afloat.
>
>
>The Problem
>==================================
>Any path that links a few nodes together should be able to be
>represented by its edges. So, for example, the path from 'a' to 'd'
> a <-----> b <-----> c <-----> d
>is represented by the ordered list of three edges:
> <Edge @start_node=a @end_node=b>
> <Edge @start_node=b @end_node=c>
> <Edge @start_node=c @end_node=d>
>
>If I want to recreate the list of nodes visited, I can just loop
>through the edges and pick out the nodes. ... Except I can't, because
>if these are non-directional edges, the list might look like:
> <Edge @start_node=b @end_node=a>
> <Edge @start_node=b @end_node=c>
> <Edge @start_node=d @end_node=c>
>
>* I can't just 'flip' the edges while creating the path, because the
>same graph might be crawled multiple times to create multiple paths.
>
>* I could use code to determine the correct node order (figuring out
>which ends touch which) but that will be slightly messy and slower.
>(Premature optimization is the root of all evil, but you gotta strive
>for a good design at least.)
>
>* I could store both the edge list AND the node list for a path, but
>that's not very DRY or normalized.
>
>* I could create a subclass of Edge which is a Link (since
>essentially the problem is that I'm using undirected information to
>represent directed information), but then I'd be duplicating
>information. (Later changes to the weight of an edge would not be
>reflected in the Link used in a Path.)
>
>* Perhaps I could create a Link class which references edges and
>expresses directionality. (This thought just occurred to me.)
>
>
> I know I've hit this problem a few times before in programming, and
>have never been pleased with any of my solutions.
>
>Thanks for any insights/opinions! :)
>

One idea:
Graphs are often represented by hashes of lists, like so:

graph = {
a => [b,c,d],
b => [a,e],
c => [a,e],
d => [a],
e => [b,c]
}

So, for example, a connects with nodes b,c and d. You can find all the
nodes that a connects to by iterating through the list of graph[a]. You
can then traverse further down in the graph by taking every node in a's
list and recursively iterating through all of their children.

If you use a Hash for representing the graph, you really don't need an
Edge class. You can then have a DirectedGraph class and an
UndirectedGraph class each of which contains the connection hash - how
you traverse through the hash will be a bit different depending on
whether it's directed or not. The hash of lists representation is simple
and compact.

It can, however be useful to have an Edge class (for weighted edges, as
you point out).

Take a look at some other OO graph class libraries out there such as the
Boost Graph library. There are lots of them out there and different ones
emphasize different things: some emphasize the nodes (as in the hash
representation) while others emphasize the edges (each has it's
advantages/disadvantages).

Phil

Ara.T.Howard

5/8/2005 6:28:00 PM

Horst Duchene

5/9/2005 8:05:00 AM

Gavin Kistner schrieb:
> Summary
> ==================================
> How do you represent and handle bidirectional links between items,
> given that you will necessarily need two distinct variables for each

> end? How then do you handle a directed traversal of those links?
>
Did you already take a look at RGL (http://rgl.rubyfor...)? The
edge concept is treated here:
http://rgl.rubyfor.../classes/RGL/Edge.html.

Cheers
Horst

Gavin Kistner

5/9/2005 1:27:00 PM

On May 9, 2005, at 2:09 AM, Horst Duchene wrote:
> Did you already take a look at RGL (http://rgl.rubyfor...)? The
> edge concept is treated here:
> http://rgl.rubyfor.../classes/RGL/Edge.html.

I hadn't, thanks :)

Though RGL sublcasses directed versus undirected edges, it mostly
does what I do. The problem (as this thread has helped me realize) is
that I was trying to re-use undirected edges directly to represent a
directed path.

comp.lang.ruby

Representing Undirected Edges (advice, please

Gavin Kistner

Gavin Kistner

ptkwt

Ara.T.Howard

Horst Duchene

Gavin Kistner

x Login to ForumsZone