Asp Forum - Re: Representing Undirected Edges (advice, please

Nuralanur

5/11/2005 10:28:00 PM

Dear Jacob, Dear pathfinders,

I've now understood your point about multiple connections and I have to
confess that
I start liking it more (maybe as much as you are starting to like it
less...as I understand
from your last email). I would particularly like to have such a
multiply-connected graph
model when flying from one city to another by one of several air carrier
companies..

I can't follow your third argument about the amount of computation, though.
I want to prepare my point by saying one thing first.
I think it is important to introduce a loss function which maps each path
to a value how nice/suitable/etc. it is subject to one criterion or a
collection
of criteria which are somehow weighted ( I know I am repeating myself).
I find it really important not to associate loss to a node or a connection,
but to a path.
This may actually remove the obfuscation you were right to complain about.
If a solution is to be constructed, the
value of this function needs to be calculated iteratively. To do this, all
the connections
from the last node of the part-path obtained so far to the next node need to
be evaluated -
and that's regardless of whether you choose letters as nodes and multiple
connections
between them or songs and single connections.
So I don't think you can save on computation using a graph with a smaller,
fixed number
of nodes, because it's the number of connections that's critical.

With respect to the starting and ending question, one does indeed have to
work
a lot - start the algorithm anew from each song that starts by an 'A'.
Dijkstra's algorithm
would find the loss function value of the connection to any song ending in
'Z' , in 'one go', though.
But I want to come back to the multi-connected graph, because it would
really be nice
to know something more about what's happening when one works with such
graphs.
Do you have a proposal for an algorithm that could deal with such a graph?
I'd really like to know about that.

Ah, regarding Steven's post, I recall an authoritative three-volume book
about
combinatorial problems, where one may certainly find anything of the sort:

Alexander Schrijver, Combinatorial Optimization - Polyhedra and Efficiency,
Springer-Verlag, 2003, ISBN 3-540-44389-4.

I also found out that they sell a CD with C code with it ...
So a quick and dirty solution may be to use code from it in Ruby with SWIG,
but I agree
that only a program in Ruby is really a good program...

Best regards,

Axel

1 Answer

Jacob Fugal

5/11/2005 11:28:00 PM

On 5/11/05, Nuralanur@aol.com <Nuralanur@aol.com> wrote:
> Dear Jacob, Dear pathfinders,
>
> I've now understood your point about multiple connections and I have to
> confess that I start liking it more (maybe as much as you are starting to
> like it less...as I understand from your last email).

Welcome aboard :) Don't take my last email wrong, I still like the
idea of a multiply-connected graph for conceptual reasons, I'm just no
longer asserting that it's impossible to convert a multiply-connected
graph to a singly-connected graph with comparable performance.

> I would particularly like to have such a multiply-connected graph model
> when flying from one city to another by one of several air carrier
> companies..

Very good example.

> I can't follow your third argument about the amount of computation, though.

Not knowing which computation arguments you're refering too, do you
want to reiterate the question? I'll see if I can make myself more
clear then.

> I want to prepare my point by saying one thing first. I think it is important
> to introduce a loss function which maps each path to a value how
> nice/suitable/etc. it is subject to one criterion or a collection of criteria
> which are somehow weighted ( I know I am repeating myself).
> I find it really important not to associate loss to a node or a connection,
> but to a path. This may actually remove the obfuscation you were right to
> complain about.

I am totally in agreement here, that weights (or losses, as you term
them) need to be on the edge. The obfuscation point I brought up is
due to that though, rather than being removed from it. If you make the
songs nodes and connections between songs the edges, then the weight
must belong on the connection. But, the value used for the weight is
derived from the song, so its misleading to have information from the
song (node) labeled to the connection (edge).

> If a solution is to be constructed, the value of this function needs to be
> calculated iteratively. To do this, all the connections from the last node of
> the part-path obtained so far to the next node need to be evaluated -
> and that's regardless of whether you choose letters as nodes and multiple
> connections between them or songs and single connections.

Ok so far.

> So I don't think you can save on computation using a graph with a smaller,
> fixed number of nodes, because it's the number of connections that's critical.

Also agreed. Where I think I may have been unclear is in two points:

1) Although the number of nodes doesn't really affect running time of
the path finder to a significant degree, it does impact memory
consumption, especially if the number of nodes is increased by an
order of magnitude. If we're simply trading nodes for edges, that's
not a problem because the number of total objects remains the same,
but in this case not only does the number of nodes increase
dramatically, but so does the number of edges, which brings us to
point two...

2) The crux of my argument on the inefficiency of converting the
multiply-connected graph to a singly-connected graph by swapping nodes
and edges is that the number of edges also increases dramatically. As
I said, with N songs and M letters and uniform distribution of letters
at the beginning and end of songs, there are N edges in the
multiply-connected graph, but (N^2)/M edges in the converted
singly-connected graph. That's an increase by a factor of N/M, which
is significant if N >> M (which in our case it is).

So no, the increase in nodes from M to N doesn't really matter to
running time, just memory consumption. But the increase from N to
(N^2)/M edges is significant to both memory consumption *and* running
time.

> With respect to the starting and ending question, one does indeed have to
> work a lot - start the algorithm anew from each song that starts by an 'A'.
> Dijkstra's algorithm would find the loss function value of the connection
> to any song ending in 'Z' , in 'one go', though.

You can set it up that way, but it also introduces gotchas. Most
importantly, consider the following:

Assume connect from 'A' to 'Z' using 5 songs. How many edges are
concerned? Only 4, since songs are nodes. The path is
[a]->[b]->[c]->[d]->[e], for some 5 songs. This obviously generalizes
to K-1 connections if the path involves K songs.
But what is our weighting criterion? The weighting criterion is length
of song -- so, obviously we should assume we need to consider the sum
of K weights along this path. But the weights are connected to the
edges, of which we only have K-1. So where's the last weight come
from? If you want to use a general solving algorithm, you need to add
a node with appropriate edges to mark either the start or goal state.
You don't need both, but you do need at least one (if you use both,
you can weight the edges out of the start or into the goal as 0 to
essentially ignore their weight).

So we're still stuck with needing to modify the graph for each
different set of start/goal states.

> But I want to come back to the multi-connected graph, because it would
> really be nice to know something more about what's happening when one
> works with such graphs.
>
> Do you have a proposal for an algorithm that could deal with such a graph?
> I'd really like to know about that.

Well, I don't know of any algorithms with efficiency comparable to
Djikstra's that could handle the multiply connected graph, but a
standard best-first search should work fine.

For those that haven't heard of best-first search, it's similar to
depth-first or breadth-first except that it involves (and takes
advantage of) weights. The *-first family of searches can be
generalized by the following algorithm:

def search( graph, start_node )
# initialize queue with start state (start node and empty path)
queue = Queue.new
queue << [ start_node, [], 0.0 ]

# initialize set of seen nodes, so we don't go in circles
seen = {}

until queue.empty?
# remove next state from queue for evaluation
node, path, score = *(queue.shift)

# check current node against goal state
return [ path, score ] if yield node

# mark node visited so it doesn't get pushed on queue again
seen[ node ] = true

# expand state to all successors
successors = node.outbound_edges.collect do |edge|
[ edge.other_node, [ *path, edge ], score + edge.cost ]
end

# filter out seen nodes
successors.reject! { |state| seen[ state[0] ] }

# add successors to queue and continue with next state
successors.each { |state| queue << state }
end

# nothing left in the queue, no possible solution
return nil
end

The only difference between depth-first, breadth-first and best-first
searches is the implementation of Queue. In depth-first, the queue is
a stack and the most recently added state is the next to be
considered. In breadth-first, the queue is a stand FIFO, and the least
recently added is next to be considered. In best-first, the queue is a
priority queue, and the next state considered is that with the "best"
score (best can mean smallest, highest or closest to some number, or
anything else you want).

Wow, look at the time! I've got to get going, but if you have any
questions on the above, go ahead and shoot me an email and I'll be
happy to explain later tonight or tomorrow.

Jacob Fugal

comp.lang.ruby

Re: Representing Undirected Edges (advice, please

Nuralanur

Jacob Fugal

x Login to ForumsZone