MonkeeSage
12/27/2007 2:45:00 PM
On Dec 27, 7:17 am, Esmail <ebonak_de...@hotmail.com> wrote:
> Hi Jordan,
>
> I didn't know about each_with_index until after I posted my last
> message and read more on Ruby .. clearly I have to do more reading,
> but I have found one of the best ways to learn is to do :-)
>
> > There's no built-in way that I'm aware of. You have to iterate over
> > the array yourself. If you want all the indices you could do something
> > like...
>
> > indices = []
> > ['aaaa', '>bbbb', '>cccc'].each_with_index { | e, i |
> > indices << i if e =~ /^>/
> > }
> > p indices # => [1, 2]
>
> > But given the description of what you're trying to do in the other
> > thread, you probably just want to use Array#reject...
>
> > a = ['aaaa', '>bbbb', 'cccc'].reject { | e | e =~ /^>/ }
> > p a # => ["aaaa", "cccc"]
>
> This would delete only the one element, but I am trying to delete a range
> of data (a record). I may have duplicate records, so I am trying to get
> rid of them. They have different identifiers, each starting with a '>'.
> Here's a test file that mimics this:
>
> >88888/Bla08/the/rest8
> 888888888888888
> 888888888888888
> 888888888888888
> 888888888888888
> 888888888888888
> 88888 -- last line --
> >77777/Bla07/the/rest7
> 777777777777777
> 777777777777777
> 777777777777777
> 777777777777777
> 777777777777777
> 77777 -- last line --
> >66666/Bla06/the/rest6
> 666666666666666
> 666666666666666
> 666666666666666
> 666666666666666
> 666666666666666
> 66666 -- last line --
> >77777/Bla07/the/rest7
> 777777777777777
> 777777777777777
> 777777777777777
> 777777777777777
> 777777777777777
> 77777 -- last line --
> >
>
> (I add the last > and later remove it)
>
> So, this is what I came up with (with suggestions from you):
>
> ######################################
> # delete duplicate records
> ######################################
> def deleteDuplicates(data, dups)
>
>   dups.each do |name|
>     puts "\n****deleting duplicate \"#{name}\"...\n"
>     s = data.index(name)
>     e = 0
>     data[s+1..-1].each_with_index{ |v, i|
>       if v =~ /^>/
>         e = i
>         break
>       end
>     }
>
>     puts "deleting ... ", data[s..s+e], "..done"
>     data.slice!(s..s+e)
>   end
>
>   data
> end
> ######################################
>
> What do you think? It seems to work, but I'm always interested in
> learning to do things better.
>
> Thanks again!
>
> Esmail
Hi Esmail,
A couple points:
- It's not very efficient to do all that iteration and slicing.
- The regexp won't work since #each and #each_with_index iterate over
lines and not characters (so v == "\n>...", so /^\n>/ would be needed).
- #index returns nil if there is no matching element, so s+1 raises a
NoMethodError in that case.
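
To make the last point concrete, here is a minimal sketch of a guarded loop. It assumes data is an Array of lines and that each entry in dups is a header line exactly as it is stored in data; the name delete_duplicates is only illustrative, and Array#index with a block needs Ruby 1.8.7 or later.

```ruby
# Hypothetical rewrite of the quoted loop; assumes data is an Array of
# lines and every entry in dups is a header line exactly as stored in data.
def delete_duplicates(data, dups)
  dups.each do |name|
    s = data.index(name)
    next if s.nil? # header not present, nothing to delete

    # Offset of the next header after s, or the remaining length if none.
    rest = data[(s + 1)..-1]
    e = rest.index { |line| line =~ /^>/ } || rest.length
    data.slice!(s..(s + e))
  end
  data
end

data = [">a", "1", "2", ">b", "3", ">a", "1", "2"]
delete_duplicates(data, [">a", ">missing"])
p data # => [">b", "3", ">a", "1", "2"]
```

Falling back to rest.length also means the input no longer needs a trailing ">" sentinel for the last record.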
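
How about using Array#uniq, as in:

def no_dups(path)
  # Split the file into records at each "\n>" boundary, drop repeated
  # records, then rejoin with the same separator.
  IO.read(path).split("\n>").uniq.join("\n>")
end

fixed = no_dups("testfile")
puts fixed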
# =>
>88888/Bla08/the/rest8
888888888888888
888888888888888
888888888888888
888888888888888
888888888888888
88888 -- last line --
>77777/Bla07/the/rest7
777777777777777
777777777777777
777777777777777
777777777777777
777777777777777
77777 -- last line --
>66666/Bla06/the/rest6
666666666666666
666666666666666
666666666666666
666666666666666
666666666666666
66666 -- last line --
>
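
One caveat with splitting on a separator string: only the first record keeps its leading ">", so a repeat of the very first record would not compare equal to it. A rough sketch that groups lines by header instead, assuming duplicate records are identical line for line and Ruby 1.9.2 or later (for Enumerable#slice_before); dedup_records is a hypothetical name:

```ruby
# Sketch only: dedup_records is a hypothetical helper, not code from this
# thread. It assumes duplicate records are identical line for line.
def dedup_records(text)
  # Start a new record at every line beginning with ">".
  records = text.lines.slice_before { |line| line.start_with?(">") }
  # Join each record's lines back together and keep the first copy of each.
  records.map(&:join).uniq.join
end

sample = ">a/1\nxx\nyy\n>b/2\nzz\n>a/1\nxx\nyy\n"
puts dedup_records(sample)
```

This version works without the trailing ">" sentinel, as long as the last line of the input ends with a newline.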
Regards,
Jordan