[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.lang.ruby

Collections of structured-data objects: what approach?

Graham Wideman

8/25/2006 5:59:00 AM

Folks:

I'm new to Ruby, though not to programming in general. So I'm looking around
for some of the mechanisms I'm used to finding, and one of them is an
apparatus for Collections (as for example used in C++, OP, VB etc).

I'm hoping that someone can point me in the right direction as to where this
functionality can be found, or failing that suggest the most advantageous
starting point to create it based on more primitive classes.

Main features:

1. The contained objects are user-defined types having multiple fields (data
members). This in itself seems no problem for Ruby.

2. Order/Retrieve by index. The collection should stay ordered by insertion
order and support retrieval by integer index. (Sorted collection is a
separate issue, of course.)

3. Insert/Append/Remove/Delete. (List behavior) Allow appending (at end) or
inserting at arbitrary location, and removal or deletion of arbitrary
members.

4. Find by key: (Dictionary behavior etc) Ability to retrieve member objects
by supplying a value to match against one of the object's fields. Often this
is simple a Name field on each object, which functions as a key, but at
other times one might want lookup based on some other field or fields.

So far I've read plenty that seems related: arrays, hashes, Enumerable, and
several related chapters in PickAxe and Ruby for Rails. Although
"collections" are mentioned in many of these sources, I've not yet hit on a
treatment of the ruby way for the Collection basics described above (eg: R
on R's Collection chapter's description of "insert" is simply replacing the
Nth element of an array).

Needless to say, I'd much appreciate comments illuminating this area!

Thanks

Graham

---------------------------------------------------
Graham Wideman
Microsoft Visio MVP
---------------------------------------------------
Book/Tools:
Visio 2003 Developer's Survival Pack
Resources for programmable diagramming at:
http://www.diagram...


13 Answers

Alex Young

8/25/2006 7:40:00 AM

0

Graham Wideman wrote:
> Folks:
>
> I'm new to Ruby, though not to programming in general. So I'm looking around
> for some of the mechanisms I'm used to finding, and one of them is an
> apparatus for Collections (as for example used in C++, OP, VB etc).
>
> I'm hoping that someone can point me in the right direction as to where this
> functionality can be found, or failing that suggest the most advantageous
> starting point to create it based on more primitive classes.
You can get away with just using an Array for almost all of this.

> Main features:
>
> 1. The contained objects are user-defined types having multiple fields (data
> members). This in itself seems no problem for Ruby.
> 2. Order/Retrieve by index. The collection should stay ordered by insertion
> order and support retrieval by integer index. (Sorted collection is a
> separate issue, of course.)
> 3. Insert/Append/Remove/Delete. (List behavior) Allow appending (at end) or
> inserting at arbitrary location, and removal or deletion of arbitrary
> members.
> 4. Find by key: (Dictionary behavior etc) Ability to retrieve member objects
> by supplying a value to match against one of the object's fields. Often this
> is simple a Name field on each object, which functions as a key, but at
> other times one might want lookup based on some other field or fields.
None of these are a problem for Array:

a = [{:a => 1, :b => 2}, {:c => 4, :d => 5}]
a[1] # => {:c=>4, :d=>5}
a << {:e => 6, :f => 7}
# => [{:b=>2, :a=>1}, {:c=>4, :d=>5}, {:e=>6, :f=>7}]
a.select {|x| x[:c] == 4} # => [{:c=>4, :d=>5}]

b = [1,2,3,4]
b.insert(2,5) # => [1, 2, 5, 3, 4]



> So far I've read plenty that seems related: arrays, hashes, Enumerable, and
> several related chapters in PickAxe and Ruby for Rails. Although
> "collections" are mentioned in many of these sources, I've not yet hit on a
> treatment of the ruby way for the Collection basics described above (eg: R
> on R's Collection chapter's description of "insert" is simply replacing the
> Nth element of an array).
If so, that's simply wrong. Array#insert(n, element) makes the inserted
element the n'th and pushes everything after it along one.

--
Alex

Robert Klemme

8/25/2006 8:39:00 AM

0

On 25.08.2006 07:59, Graham Wideman wrote:
> Folks:
>
> I'm new to Ruby, though not to programming in general. So I'm looking around
> for some of the mechanisms I'm used to finding, and one of them is an
> apparatus for Collections (as for example used in C++, OP, VB etc).
>
> I'm hoping that someone can point me in the right direction as to where this
> functionality can be found, or failing that suggest the most advantageous
> starting point to create it based on more primitive classes.
>
> Main features:
>
> 1. The contained objects are user-defined types having multiple fields (data
> members). This in itself seems no problem for Ruby.
>
> 2. Order/Retrieve by index. The collection should stay ordered by insertion
> order and support retrieval by integer index. (Sorted collection is a
> separate issue, of course.)
>
> 3. Insert/Append/Remove/Delete. (List behavior) Allow appending (at end) or
> inserting at arbitrary location, and removal or deletion of arbitrary
> members.
>
> 4. Find by key: (Dictionary behavior etc) Ability to retrieve member objects
> by supplying a value to match against one of the object's fields. Often this
> is simple a Name field on each object, which functions as a key, but at
> other times one might want lookup based on some other field or fields.
>
> So far I've read plenty that seems related: arrays, hashes, Enumerable, and
> several related chapters in PickAxe and Ruby for Rails. Although
> "collections" are mentioned in many of these sources, I've not yet hit on a
> treatment of the ruby way for the Collection basics described above (eg: R
> on R's Collection chapter's description of "insert" is simply replacing the
> Nth element of an array).
>
> Needless to say, I'd much appreciate comments illuminating this area!

For the sake of discussion I present a different approach. As long as
you don't need 4 (lookup by key) an Array is perfect. However, with the
lookup IMHO a new data structure is preferred because then consistency
can be handled internally. I'll attach a half grown quick hack to
demonstrate what I mean.

Kind regards

robert

# untested and incomplete sample
require 'set'

class IndexedCollection
include Enumerable

def initialize(enum=[])
@data = []
@indexes = {}
add_all enum
end

def add_index(field)
field = field.to_sym
idx = @indexes[field] = Hash.new {|h,k| h[k] = Set.new}

@data.each do |e|
idx[e.send(field)] << e
end

self
end

def lookup(field, value)
field = field.to_sym
idx = @indexes[field] or
raise ArgumentError, "No such index: #{field}"

idx[value].to_a
end

def add_all(enum)
enum.each {|e| self << e}
self
end

def <<(e)
@data << e

@indexes.each do |field, index|
index[e.send(field)] << e
end

self
end

def each(&b)
@data.each(&b)
self
end

def size() @data.size end

end

Graham Wideman

8/25/2006 8:46:00 AM

0

Alex:

Ah-ha! Thanks for getting me on the right track.

Couple of points (for others following along...)

> You can get away with just using an Array for almost all of this.
>
> a = [{:a => 1, :b => 2}, {:c => 4, :d => 5}]
> a[1] # => {:c=>4, :d=>5}
> a << {:e => 6, :f => 7}
> # => [{:b=>2, :a=>1}, {:c=>4, :d=>5}, {:e=>6, :f=>7}]
> a.select {|x| x[:c] == 4} # => [{:c=>4, :d=>5}]
>
> b = [1,2,3,4]
> b.insert(2,5) # => [1, 2, 5, 3, 4]

Now THAT's the set of examples I needed, and that should be in the major
Ruby docs.

And I'd add Array#delete

(... though I'm not yet clear on the sematics. Does this simply remove the
object from the array, or also delete it's data? Ie: if there's another
reference to this object, is the data still there? To be tested...)

>> [...Ruby for Rail's Collection chapter's description of "insert"...]
> If so, that's simply wrong.

Yep, so it seems!

PickAxe's "Containers" etc chapter manages to get us all the way to lambda
functions and closures, but its "SongList" example is too simple to be as
useful as your examples above.

Meanwhile, Ruby for Rails' "Collections, containers" etc chapter says the
following (emphasis mine):

-----------------------------
11.2.2 Inserting, retrieving and removing array elements
Because an array is an ordered collection, any object you add to the array
goes either at the beginning, at the end, or somewhere in the middle. The
most general technique for INSERTING one or more items into an array is the
setter method []= (square brackets...
-----------------------------

.... and goes on to give "insert" examples which do not insert, they simply
assign to an array slot. (And there's no mention of the insert method). So
I had written off array as being the complete answer, since its supposed
insert syntax "obviously" didn't really know how to insert.

Now that you've pointed out that there's an insert method called, ahem,
"insert", I see that it actually IS the array class that handles Collection
duty, and I can focus there for some more serious RTFM-ing.

Thanks again for the quick clarification!

Graham

--
---------------------------------------------------
Graham Wideman
Microsoft Visio MVP
---------------------------------------------------
Book/Tools:
Visio 2003 Developer's Survival Pack
Resources for programmable diagramming at:
http://www.diagram...


Alex Young

8/25/2006 9:09:00 AM

0

Graham Wideman wrote:
<snip>
> And I'd add Array#delete
>
> (... though I'm not yet clear on the sematics. Does this simply remove the
> object from the array, or also delete it's data? Ie: if there's another
> reference to this object, is the data still there? To be tested...)
a = "foo"
b = [a, "bar", "qux", a]
b.delete(a)
b # => ["bar", "qux"]
a # => "foo"

So no, Array#delete doesn't delete the data itself, only the reference
to it. You may also want:

b.delete_at(0)
b # => ["qux"]

<snip>
> Now that you've pointed out that there's an insert method called, ahem,
> "insert", I see that it actually IS the array class that handles Collection
> duty, and I can focus there for some more serious RTFM-ing.
Array is surprisingly powerful, especially when you take the inject
method into account. It's a real workhorse.

> Thanks again for the quick clarification!
No worries :-)

--
Alex

Graham Wideman

8/25/2006 7:23:00 PM

0

Robert:

Question and then a comment:

Question: What does the keyword "self" do when sitting on a line by itself,
as in several of your methods?

Comment:

> For the sake of discussion I present a different approach. As long as
> you don't need 4 (lookup by key) an Array is perfect. However, with the
> lookup IMHO a new data structure is preferred because then consistency
> can be handled internally. I'll attach a half grown quick hack to
> demonstrate what I mean.

If I'm reading you correctly, you are saying a bit more than this:

1. For lookup (or for that matter if one want to implement a
SortedCollection), then it's desirable to have an index for the array on the
key field(s), which can be a ruby hash. (Ie: one *could* use Enumerable#find
or find_all, but on larger collections that's slow?)

2. But if you are going to implement an index using a helper hash, then you
want a way to keep the hash up-to-date as items are inserted or deleted from
the array.

3. So you advocate a class to wrap these together so that "consistency can
be handled internally".

Thanks for your reply,

Graham
--
---------------------------------------------------
Graham Wideman
Microsoft Visio MVP
---------------------------------------------------
Book/Tools:
Visio 2003 Developer's Survival Pack
Resources for programmable diagramming at:
http://www.diagram...


Graham Wideman

8/25/2006 7:50:00 PM

0

Alex:

> So no, Array#delete doesn't delete the data itself, only the reference to
> it. You may also want:
>
> b.delete_at(0)
> b # => ["qux"]
[...]
> Array is surprisingly powerful, especially when you take the inject method
> into account. It's a real workhorse.

Thanks,

Graham


Simen

8/25/2006 8:07:00 PM

0

On 8/25/06, Graham Wideman <checkforrealaddress@diagramantics.com> wrote:
> Question: What does the keyword "self" do when sitting on a line by itself,
> as in several of your methods?
>

Ruby returns the last value of a method. self on a line by itself returns self.

Graham Wideman wrote:
>> You can get away with just using an Array for almost all of this.
>>
>> a = [{:a => 1, :b => 2}, {:c => 4, :d => 5}]
>> a[1] # => {:c=>4, :d=>5}
>> a << {:e => 6, :f => 7}
>> # => [{:b=>2, :a=>1}, {:c=>4, :d=>5}, {:e=>6, :f=>7}]
>> a.select {|x| x[:c] == 4} # => [{:c=>4, :d=>5}]
>>
>> b = [1,2,3,4]
>> b.insert(2,5) # => [1, 2, 5, 3, 4]
>
>Now THAT's the set of examples I needed, and that should be in the major
>Ruby docs.

It is in the main Ruby docs. It's spread between the methods in use,
because we couldn't anticipate every single combination of methods you
might need to use and document them all in one place.

--
- Simen

Robert Klemme

8/25/2006 10:41:00 PM

0

Graham Wideman wrote:
> Robert:
>
> Question and then a comment:
>
> Question: What does the keyword "self" do when sitting on a line by itself,
> as in several of your methods?

Simen answered that one. As an additional note, traditionally #each
returns self. Also, by doing this you can chain methods.

> Comment:
>
>> For the sake of discussion I present a different approach. As long as
>> you don't need 4 (lookup by key) an Array is perfect. However, with the
>> lookup IMHO a new data structure is preferred because then consistency
>> can be handled internally. I'll attach a half grown quick hack to
>> demonstrate what I mean.
>
> If I'm reading you correctly, you are saying a bit more than this:
>
> 1. For lookup (or for that matter if one want to implement a
> SortedCollection), then it's desirable to have an index for the array on the
> key field(s), which can be a ruby hash. (Ie: one *could* use Enumerable#find
> or find_all, but on larger collections that's slow?)

find is O(n) which can seriously hurt your applications performance if
the Array grows beyond a certain limit. Hash lookup on the other hand
is O(1).

> 2. But if you are going to implement an index using a helper hash, then you
> want a way to keep the hash up-to-date as items are inserted or deleted from
> the array.
>
> 3. So you advocate a class to wrap these together so that "consistency can
> be handled internally".

Right. You got it. That's what OO is about.

Kind regards

robert

Graham Wideman

8/26/2006 6:41:00 PM

0

Simen:

Thanks for your reply....

>>Now THAT's the set of examples I needed, and that should be in the major
>>Ruby docs.
>
> It is in the main Ruby docs. It's spread between the methods in use,
> because we couldn't anticipate every single combination of methods you
> might need to use and document them all in one place.

I was more commenting on the major ruby docs like pickaxe, Ruby for Rails,
FAQs and so on. While some mention collections, this mostly just introduces
basic features of hashes and arrays. None of the discussion that I've run
across so far actually puts it all together into a comprehensive discussion
of how the collections functionality familiar from other languages
(Java, C#, C++, OP, VB etc) can be acquired from array and hash plus
whatever.

Yes, I agree that all the features and methods are documented in the
language guide (eg: in pickaxe), but one needs to know which set of those
features are to be assembled to create the familiar collection
functionality, in the absence of a big flashing sign on an actual Collection
class. (Especially bearing in mind that other languages/libraries go to some
pains to distinguish the concepts of Array and Collection -- not to say that
any particular other language is the most righteous).

Anyhow, for my part I'm a great deal further along than I was a couple of
days ago, thanks in part to helpful responses here and elsewhere.

Thanks all!

Graham


Graham Wideman

8/26/2006 6:50:00 PM

0

Robert:

Thanks, and...

> Right. You got it. That's what OO is about.

Well, I already had the OO part, just needed the ruby version of it :-) plus
connecting the dots might help others stumbling along this way, and you
never know it might turn out that your "consistency" comment was referring
to something else I didn't see.

Anyhow, this discussion has been most helpful to alert me to all the
capabilities that Array has, and calibrating my sense that a little bit of
assembly required to implement collections with various features, they're
not just in some library that "of course" everyone knows about and uses :-)

Graham