comp.lang.ruby

How to do this complicated logic in ruby

Valentino Lun

2/16/2009 10:00:00 AM

Dear all

I have an array of around 1000 elements, and I want to perform some data
checking and correction on it.

For instance, the first record of this array is a hash, as follows:
my_array[0] = {"server"=>"AHN", "hosp"=>"AHN", "loc"=>"PC1",
"pspec"=>"ANA", "number"=>"1", "pcat"=>"1"}

server  hosp  loc  pspec  pcat
AHN     AHN   PC1  ANA    1
PWH     AHN   PC1  ANA    1
NDH     AHN   PC1  ANA    2    <= this pcat value needs updating in array1
TMH     AHN   PC1  ANA    2    <= this pcat value needs updating in array1
...
(around 1000 records)

When the keys hosp, loc and pspec have the same values, their pcat must be
identical. So there is a problem in the last two records: their pcat
should be 1, because the pcat is correct when a record's "server" equals
its "hosp".

I cannot figure out the logic for doing this in Ruby (or even in another
language). Can someone give me some hints on this? Thanks

Many thanks
Valentino
--
Posted via http://www.ruby-....

9 Answers

Julian Leviston

2/16/2009 10:55:00 AM

Loop the array, changing the values of the hash as you go based on
some conditional. It's not complex at all. What are you finding
difficult?

Blog: http://random8.ze...
Learn rails: http://sensei.ze...

Martin DeMello

2/16/2009 10:57:00 AM

Simple way:

1. Have a 'signature' for each row, composed of the hosp, loc and
pspec. Could be as simple as

  def signature(row)
    %w(hosp loc pspec).map {|k| row[k]}.join(",")
  end

2. Collect all the rows with the same signature

  verify = Hash.new {|h,k| h[k] = []}
  ary.each_with_index {|row, i|
    verify[signature(row)] << [i, row['pcat']]
  }

3. See if there are any problems

  verify.each_pair {|k, v|
    if v.length > 1
      fix_array_for(v)
    end
  }

4. Write fix_array_for(v)

Note that v is an array of [index, pcat] pairs. So for your
example, it would be
  [[0,"1"], [1,"1"], [2,"2"], [3,"2"]]

You basically need to iterate over that array, see which pcat is
right, then iterate over it once more and set all the pcats to the
right value.

There are probably more efficient ways to do all this, but this has
the advantage of being straightforward.
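
Step 4 could be sketched roughly as follows. This is only one reading of the problem: it assumes the "right" pcat is the one on the row whose server equals its hosp (the rule stated in the question), and it passes the whole array into fix_array_for as an extra argument, which is an assumption added here because the pairs alone do not carry the server and hosp columns. The fix_pcats wrapper is a hypothetical name.

```ruby
# Signature over the identifying columns of one row (a Hash).
def signature(row)
  %w(hosp loc pspec).map { |k| row[k] }.join(",")
end

# pairs is an array of [index, pcat]. Assumed rule: the row whose
# server equals its hosp carries the authoritative pcat.
def fix_array_for(ary, pairs)
  good = pairs.find { |i, _pcat| ary[i]["server"] == ary[i]["hosp"] }
  return if good.nil? # no authoritative row: leave the group untouched
  pairs.each { |i, _pcat| ary[i]["pcat"] = good[1] }
end

# Hypothetical wrapper: group rows by signature, then fix each group.
def fix_pcats(ary)
  verify = Hash.new { |h, k| h[k] = [] }
  ary.each_with_index { |row, i| verify[signature(row)] << [i, row["pcat"]] }
  verify.each_value { |pairs| fix_array_for(ary, pairs) if pairs.length > 1 }
  ary
end

rows = [
  {"server"=>"AHN", "hosp"=>"AHN", "loc"=>"PC1", "pspec"=>"ANA", "pcat"=>"1"},
  {"server"=>"PWH", "hosp"=>"AHN", "loc"=>"PC1", "pspec"=>"ANA", "pcat"=>"1"},
  {"server"=>"NDH", "hosp"=>"AHN", "loc"=>"PC1", "pspec"=>"ANA", "pcat"=>"2"},
  {"server"=>"TMH", "hosp"=>"AHN", "loc"=>"PC1", "pspec"=>"ANA", "pcat"=>"2"},
]
fix_pcats(rows)
p rows.map { |r| r["pcat"] } # => ["1", "1", "1", "1"]
```

Groups with no row where server equals hosp are left alone in this sketch; what to do with them depends on rules the question does not spell out.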

martin

Luiz Vitor Martinez Cardoso

2/16/2009 11:13:00 AM

Using Symbols here makes a lot of sense. Try to structure your array like:

my_array[0] = {:server => "AHN", :hosp => "AHN", :loc => "PC1",
:pspec => "ANA", :number => "1", :pcat => "1"}

Use Symbols for all the values that are repeated frequently as well.
Every time you use a Symbol with the same name you get a reference to
one and the same object, NOT a new object, so you save memory.
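
The identity claim above can be checked with a small sketch like the one below. The same_object? helper is a hypothetical name introduced only for illustration, and how much memory Symbols actually save will depend on the Ruby version and on the data.

```ruby
# equal? compares object identity, not content.
def same_object?(a, b)
  a.equal?(b)
end

# The same Symbol is always one shared object.
p same_object?(:hosp, :hosp)                            # => true

# Two Strings built separately have equal content but are distinct objects.
p same_object?(String.new("hosp"), String.new("hosp"))  # => false
p String.new("hosp") == String.new("hosp")              # => true

# Symbol keys are looked up the same way String keys were.
record = {:server => "AHN", :hosp => "AHN", :pcat => "1"}
p record[:server] == record[:hosp]                      # => true
```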

Regards,
Luiz Vitor.

--
Regards,

Luiz Vitor Martinez Cardoso
cel.: (11) 8187-8662
blog: rubz.org
engineer student at maua.br

"Posso nunca chegar a ser o melhor engenheiro do mundo, mas tenha certeza d=
e
que eu vou lutar com todas as minhas for=C3=A7as para ser o melhor engenhei=
ro que
eu puder ser"

Martin DeMello

2/16/2009 11:41:00 AM

On Mon, Feb 16, 2009 at 4:43 PM, Luiz Vitor Martinez Cardoso
<grabber@gmail.com> wrote:
> Using Symbols here makes a lot of sense. Try to structure your array like:
>
> my_array[0] = {:server => "AHN", :hosp => "AHN", :loc => "PC1",
> :pspec => "ANA", :number => "1", :pcat => "1"}
>
> Use Symbols for all the values that are repeated frequently as well.
> Every time you use a Symbol with the same name you get a reference to
> one and the same object, NOT a new object, so you save memory.

Even better: http://www.codeforpeople.com/lib/ruby/ar...

martin

Robert Klemme

2/16/2009 12:08:00 PM

IMHO this is plainly the wrong data structure for the task. Since you
identify entries by their hosp, loc, pspec you should *index* the
whole thing by these columns. Also, since your Hashes seem to be
uniform I would rather define a particular type for this, e.g.

Entry = Struct.new :server, :hosp, :loc, :pspec, :pcat

# key on the identifying columns (hosp, loc, pspec)
EntryKey = Struct.new :hosp, :loc, :pspec do
  def self.create(entry)
    new(*members.map {|m| entry[m]})
  end
end

index = Hash.new {|h,k| h[k] = []}
# loop reading input
entry = ...
index[EntryKey.create(entry)] << entry

# now you can process them or do it while reading

See also Martin's reply which goes into the same direction just with a
different approach.

Cheers

robert


--
remember.guy do |as, often| as.you_can - without end

Martin DeMello

2/16/2009 12:21:00 PM

On Mon, Feb 16, 2009 at 5:37 PM, Robert Klemme
<shortcutter@googlemail.com> wrote:
> EntryKey = Struct.new :hosp, :loc, :pspec do
>   def self.create(entry)
>     new(*members.map {|m| entry[m]})
>   end
> end
>
> index = Hash.new {|h,k| h[k] = []}
> # loop reading input
> entry = ...
> index[EntryKey.create(entry)] << entry
>
> # now you can process them or do it while reading
>
> See also Martin's reply which goes into the same direction just with a
> different approach.

The different approach is mostly due to the fact that I'm
uncomfortable using objects with mutable fields as hash keys. I prefer
to explicitly map them to a string, and then use that string as a hash
key.

martin

Robert Klemme

2/16/2009 4:40:00 PM

2009/2/16 Martin DeMello <martindemello@gmail.com>:
> The different approach is mostly due to the fact that I'm
> uncomfortable using objects with mutable fields as hash keys. I prefer
> to explicitly map them to a string, and then use that string as a hash
> key.

Hehe, that would be something *I* would be uncomfortable with. :-) It
is interesting that you advertise this approach as the more robust one,
because IMHO it is more on the hackish side of things: instead of
using a structured type you lump everything into a single
unstructured object. This can break awfully (e.g. in your example, if
fields contain "," in different places).

The nice thing about Struct is that it defines #==, #eql? and #hash
properly making generated classes suitable as Hash keys. If you are
afraid of mutations you can always freeze keys.
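
Both points can be illustrated with a short sketch. PcatKey and joined_key are hypothetical names used only here; the comma collision is the failure mode described above, and the Struct behaviour shown is standard Ruby.

```ruby
# Structured key over the three identifying columns (hypothetical name).
PcatKey = Struct.new(:hosp, :loc, :pspec)

# The string-signature alternative: join the fields with commas.
def joined_key(hosp, loc, pspec)
  [hosp, loc, pspec].join(",")
end

# Two different field sets collapse to the same joined string...
p joined_key("A,B", "C", "D") == joined_key("A", "B,C", "D")      # => true
# ...while the Struct keys stay distinct.
p PcatKey.new("A,B", "C", "D") == PcatKey.new("A", "B,C", "D")    # => false

# Struct keys with equal fields land in the same Hash entry,
# because Struct defines ==, eql? and hash field by field.
index = Hash.new { |h, k| h[k] = [] }
index[PcatKey.new("AHN", "PC1", "ANA").freeze] << "row 0"
index[PcatKey.new("AHN", "PC1", "ANA").freeze] << "row 1"
p index.size                                                      # => 1

# A frozen key refuses later mutation by raising an error.
key = PcatKey.new("AHN", "PC1", "ANA").freeze
begin
  key.loc = "PC2"
rescue => e
  puts "mutation rejected: #{e.class}"
end
```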

Kind regards

robert


--
remember.guy do |as, often| as.you_can - without end

Pit Capitain

2/16/2009 6:44:00 PM

2009/2/16 Valentino Lun <sumwo@yahoo.com>:
> I cannot figure out the logic to doing this in ruby (even in other
> language). Can someone give me some hints on this? Thanks

While I agree on what the others have said, that you should create a
better data structure, here's a way to do what you wanted with your
array of hashes. But look at the other posts. It's easy to build good
data structures in Ruby.

# create a key for the given record to be used in the pcat hash
def pcat_key(record)
  [record["hosp"], record["loc"], record["pspec"]]
end

# build hash with valid pcat values
pcat = {}
my_array.each do |record|
  next unless record["server"] == record["hosp"]
  pcat[pcat_key(record)] = record["pcat"]
end

# look for invalid records
my_array.each do |record|
  next if record["pcat"] == pcat[pcat_key(record)]
  # do something with the invalid record
  p record
end
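
If "do something with the invalid record" means overwriting its pcat, the two passes might be wrapped up as in the sketch below. correct_pcats! is a hypothetical helper, not part of the code above, and unlike the loop above it skips records whose group has no row with server equal to hosp, since there is no known-good value to copy in that case.

```ruby
# Key for the lookup table of valid pcat values.
def pcat_key(record)
  [record["hosp"], record["loc"], record["pspec"]]
end

# Hypothetical helper: fixes records in place, returns how many changed.
def correct_pcats!(records)
  # Pass 1: remember the pcat of every row where server equals hosp.
  valid = {}
  records.each do |r|
    valid[pcat_key(r)] = r["pcat"] if r["server"] == r["hosp"]
  end

  # Pass 2: overwrite any pcat that disagrees with the remembered value.
  changed = 0
  records.each do |r|
    expected = valid[pcat_key(r)]
    next if expected.nil? || r["pcat"] == expected
    r["pcat"] = expected
    changed += 1
  end
  changed
end

records = [
  {"server"=>"AHN", "hosp"=>"AHN", "loc"=>"PC1", "pspec"=>"ANA", "pcat"=>"1"},
  {"server"=>"PWH", "hosp"=>"AHN", "loc"=>"PC1", "pspec"=>"ANA", "pcat"=>"1"},
  {"server"=>"NDH", "hosp"=>"AHN", "loc"=>"PC1", "pspec"=>"ANA", "pcat"=>"2"},
  {"server"=>"TMH", "hosp"=>"AHN", "loc"=>"PC1", "pspec"=>"ANA", "pcat"=>"2"},
]
p correct_pcats!(records)            # => 2
p records.map { |r| r["pcat"] }      # => ["1", "1", "1", "1"]
```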

Regards,
Pit

Valentino Lun

2/17/2009 9:01:00 AM

Dear all

Thank you for your help. In the end I spent about 5 hours (>_<) figuring
out my solution, and it works. But it takes a long time to execute.

Below is my code to share with you all. I would welcome your expert
advice on whether any optimization can be done. Thank you.


# data collection: about 5000 records for each variable (lis, gcrs)
lis = ActiveRecord::Base.connection.execute(
  "select * from lis_requests order by hosp, spec, loc, pspec")
gcrs = ActiveRecord::Base.connection.execute(
  "select * from gcrs_requests order by hosp, spec, loc, pspec")

def find_correct_pcat(arr)
  server_ref = {"AHN" => "AHN", "TPH" => "AHN",
                "NDH" => "NDH", "BBH" => "NDH", "CHS" => "NDH",
                "PWH" => "PWH", "SH" => "PWH"}

  arr.each do |x|
    return x["pcat"] if x["server"] == server_ref[x["hosp"]]
  end

  # if not found, take the pcat of the record with the largest "number"
  arr.sort_by {|y| y["number"].to_i}.last["pcat"]
end

# The result will be put in this hash
result = {}

# loop over every index key and collect the result
lis.collect {|x| [x["hosp"], x["spec"], x["loc"], x["pspec"]]}.uniq.each do |index_key|

  lis_record = lis.select {|x|
    x["hosp"] == index_key[0] and x["spec"] == index_key[1] and
    x["loc"] == index_key[2] and x["pspec"] == index_key[3]}
  gcrs_record = gcrs.select {|x|
    x["hosp"] == index_key[0] and x["spec"] == index_key[1] and
    x["loc"] == index_key[2] and x["pspec"] == index_key[3]}
  lis_req_count = lis_record.inject(0) {|sum, n| sum + n["number"].to_i}
  gcrs_req_count = gcrs_record.inject(0) {|sum, n| sum + n["number"].to_i}

  if lis_record.collect {|x| x["pcat"]}.uniq.size == 1
    pcat = lis_record.first["pcat"]
  else
    pcat = find_correct_pcat(lis_record)
  end

  result[index_key] = [pcat, gcrs_req_count, lis_req_count]

end
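
One likely reason the loop above is slow is that it rescans all of lis and gcrs with select once per unique key. A possible optimization, sketched below, is to group each collection once with group_by and then look groups up by key. The sketch assumes lis and gcrs behave like Arrays of Hashes with String keys (what connection.execute really returns depends on the database adapter), and key_for, total_number and summarize are hypothetical names. The block passed to summarize stands in for find_correct_pcat.

```ruby
KEY_FIELDS = %w(hosp spec loc pspec)

# Grouping key for one row (assumed to be a Hash with String keys).
def key_for(row)
  KEY_FIELDS.map { |f| row[f] }
end

# Sum of the "number" column over a list of rows.
def total_number(rows)
  rows.inject(0) { |sum, r| sum + r["number"].to_i }
end

# Groups both collections once, then builds
# key => [pcat, gcrs_req_count, lis_req_count].
# The block is called with a group's rows when their pcats disagree.
def summarize(lis, gcrs)
  lis_groups  = lis.group_by { |r| key_for(r) }
  gcrs_groups = gcrs.group_by { |r| key_for(r) }

  result = {}
  lis_groups.each do |key, lis_rows|
    pcats = lis_rows.map { |r| r["pcat"] }.uniq
    pcat  = pcats.size == 1 ? pcats.first : yield(lis_rows)
    gcrs_rows = gcrs_groups.fetch(key, [])
    result[key] = [pcat, total_number(gcrs_rows), total_number(lis_rows)]
  end
  result
end

lis = [
  {"hosp"=>"AHN", "spec"=>"S", "loc"=>"PC1", "pspec"=>"ANA", "number"=>"3", "pcat"=>"1"},
  {"hosp"=>"AHN", "spec"=>"S", "loc"=>"PC1", "pspec"=>"ANA", "number"=>"5", "pcat"=>"2"},
]
gcrs = [
  {"hosp"=>"AHN", "spec"=>"S", "loc"=>"PC1", "pspec"=>"ANA", "number"=>"4", "pcat"=>"1"},
]

# Stand-in conflict rule: take the pcat of the row with the largest number.
result = summarize(lis, gcrs) { |rows| rows.max_by { |r| r["number"].to_i }["pcat"] }
p result[["AHN", "S", "PC1", "ANA"]] # => ["2", 4, 8]
```

With the real data the call would presumably look like summarize(lis, gcrs) { |rows| find_correct_pcat(rows) }. Each collection is then traversed once for grouping instead of once per key.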

Thanks again
Valentino
--
Posted via http://www.ruby-....