comp.lang.ruby

Stripping columns / puts'ing columns

Marc Hoeppner

7/12/2007 8:39:00 AM

Hi again,

still on my quest to learn ruby ;) And yet another question - more a
general "how-to-approach" thing, though.

Let's say I have a text file with a number of rows (it's a molecular
sequence alignment, so each row = one sequence).
I am only interested in those columns in which a certain character
appears - no matter whether in all rows or only one. The whole column
should then be puts'ed.

After thinking about it for a moment, my idea was to store each row in
an array, check each array position for the presence of that particular
character and, if it is found, to puts this array field [n] and the
corresponding field from all other arrays. Now, there are a couple of
potential problems such as hitting the same column multiple times
because I search multiple arrays... But anyways, since I am rather new
to programming and ruby in particular, that will take me a while - so I
was wondering if there is a more elegant/efficient approach to this,
before I waste my time on this only to find out it doesn't work or could
be done more efficiently (wasted quite some time on another program the
other day only to find out that it can be done with grep...yuck).

/Marc

--
Posted via http://www.ruby-....

6 Answers

Harry Kakueki

7/12/2007 8:55:00 AM

On 7/12/07, Marc Hoeppner <marc.hoeppner@molbio.su.se> wrote:
>
> Let's say I have a text file with a number of rows (it's a molecular
> sequence alignment, so each row = one sequence).
> I am only interested in those columns in which a certain character
> appears - no matter whether in all rows or only one. The whole column
> should then be puts'ed.
>
> After thinking about it for a moment, my idea was to store each row in
> an array, check each array position for the presence of that particular
> character and, if it is found, to puts this array field [n] and the
> corresponding field from all other arrays. Now, there are a couple of
> potential problems such as hitting the same column multiple times
> because I search multiple arrays... But anyways, since I am rather new
> to programming and ruby in particular, that will take me a while - so I
> was wondering if there is a more elegant/efficient approach to this,
> before I waste my time on this only to find out it doesn't work or could
> be done more efficiently (wasted quite some time on another program the
> other day only to find out that it can be done with grep...yuck).
>
> /Marc
>
Does this do what you want?

arr = [[1,7,3,4],[5,6,7,8],[9,7,3,4]]
new_arr = arr.transpose

p new_arr.select {|x| x.include?(7)}

Harry


--
A Look into Japanese Ruby List in English
http://www.ka...

Stefano Crocco

7/12/2007 9:03:00 AM

On Thursday 12 July 2007, Marc Hoeppner wrote:
> Hi again,
>
> still on my quest to learn ruby ;) And yet another question - more a
> general "how-to-approach" thing, though.
>
> Let's say I have a text file with a number of rows (it's a molecular
> sequence alignment, so each row = one sequence).
> I am only interested in those columns in which a certain character
> appears - no matter whether in all rows or only one. The whole column
> should then be puts'ed.
>
> After thinking about it for a moment, my idea was to store each row in
> an array, check each array position for the presence of that particular
> character and, if it is found, to puts this array field [n] and the
> corresponding field from all other arrays. Now, there are a couple of
> potential problems such as hitting the same column multiple times
> because I search multiple arrays... But anyways, since I am rather new
> to programming and ruby in particular, that will take me a while - so I
> was wondering if there is a more elegant/efficient approach to this,
> before I waste my time on this only to find out it doesn't work or could
> be done more efficiently (wasted quite some time on another program the
> other day only to find out that it can be done with grep...yuck).
>
> /Marc

If I understand your problem correctly, you can try this:

max_length = rows.map{|r| r.length}.max
cols = rows.map{|r| r.split('') + Array.new(max_length - r.length, '') }.transpose
interesting_cols = cols.select{ |c| c.include?(interesting_character) }

Your idea is correct: transform each row from a string to an array of
characters. So, you have a nested array, with column elements inside rows.
Since you need to select characters based on columns, we need to transpose
the outer array. This will leave you with a nested array where each element
of the outer array is a column, on which you can use select. The only problem
is that to use this approach, your rows need to be of the same length. If
they aren't, then you need to append to the array obtained from split another
array, made of empty strings, whose size is the difference between the length
of the longest line and the length of the line. At the end, you may need to
remove those elements (you can do this with

interesting_cols.each{|c| c.delete('')}

). If all your lines have the same length, then you don't need to do this and
the first two lines of my code become

cols = rows.map{|r| r.split('')}.transpose
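
For the equal-length case, the whole thing might be put together roughly as follows. This is only a sketch: the sample rows, the `-` character being searched for, and the helper name `columns_with` are made up for illustration and do not come from the thread.

```ruby
# Hypothetical helper: given equal-length rows (strings), return every
# column (joined back into a string) that contains `char` in at least one row.
def columns_with(rows, char)
  rows.map { |r| r.split('') }            # each row -> array of characters
      .transpose                          # rows become columns
      .select { |col| col.include?(char) } # keep columns containing char
      .map { |col| col.join }             # column -> string for printing
end

# Made-up sample alignment; a real script might instead use something like
# File.readlines(path).map { |l| l.chomp } (path being an assumption here).
rows = ["AC-GT", "ACTG-", "A-TGT"]

columns_with(rows, "-").each { |col| puts col }
```

Because select runs over the transposed array, each column is examined exactly once, so a column cannot be reported twice even when the character occurs in several rows.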

I hope this helps

Stefano

Marc Hoeppner

7/12/2007 9:07:00 AM

Stefano Crocco wrote:

> I hope this helps
>
> Stefano

Hi and thanks a lot - yes I guess that helps. The rows are of equal
length, so that won't be an issue.

Cheers,
Marc

--
Posted via http://www.ruby-....

Harry Kakueki

7/12/2007 9:11:00 AM

> Does this do what you want?
>
> arr = [[1,7,3,4],[5,6,7,8],[9,7,3,4]]
> new_arr = arr.transpose
>
> p new_arr.select {|x| x.include?(7)}
>
The rows need to have the same number of elements.
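
To illustrate the restriction: Array#transpose raises an IndexError when the inner arrays differ in size. The sketch below uses a made-up ragged array and pads with nil as one possible workaround; the names `ragged`, `width` and `padded` are only for this example.

```ruby
# Made-up data: the middle row is one element short.
ragged = [[1, 7, 3, 4], [5, 6, 7], [9, 7, 3, 4]]

begin
  ragged.transpose
rescue IndexError => e
  puts "transpose failed: #{e.message}"
end

# One possible workaround: pad every row with nil up to the longest row.
width  = ragged.map { |r| r.length }.max
padded = ragged.map { |r| r + Array.new(width - r.length) }

p padded.transpose.select { |col| col.include?(7) }
```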

Harry

--
A Look into Japanese Ruby List in English
http://www.ka...

Robert Klemme

7/12/2007 9:30:00 AM

2007/7/12, Marc Hoeppner <marc.hoeppner@molbio.su.se>:
> Hi again,
>
> still on my quest to learn ruby ;) And yet another question - more a
> general "how-to-approach" thing, though.
>
> Let's say I have a text file with a number of rows (it's a molecular
> sequence alignment, so each row = one sequence).
> I am only interested in those columns in which a certain character
> appears - no matter whether in all rows or only one. The whole column
> should then be puts'ed.
>
> After thinking about it for a moment, my idea was to store each row in
> an array, check each array position for the presence of that particular
> character and, if it is found, to puts this array field [n] and the
> corresponding field from all other arrays. Now, there are a couple of
> potential problems such as hitting the same column multiple times
> because I search multiple arrays... But anyways, since I am rather new
> to programming and ruby in particular, that will take me a while - so I
> was wondering if there is a more elegant/efficient approach to this,
> before I waste my time on this only to find out it doesn't work or could
> be done more efficiently (wasted quite some time on another program the
> other day only to find out that it can be done with grep...yuck).

I'm missing one crucial bit of information: can your files grow so
large that they do not fit into memory? If that's the case you need an
approach with two or more phases.
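
One way such a multi-phase approach could look is sketched below. It is just one reading of the idea: the first pass only remembers which column indices contain the character, and the second pass emits those positions from each row, so the whole file is never held in memory. The method name `each_stripped_row` and the sample data are assumptions for illustration, and the result stays in row layout rather than being printed column by column.

```ruby
require "set"
require "stringio"

# Hypothetical helper. `io` must be rewindable (a File or a StringIO).
def each_stripped_row(io, char)
  # Pass 1: collect the indices of all columns that contain char.
  wanted = Set.new
  io.each_line do |line|
    line.chomp.each_char.with_index { |c, i| wanted << i if c == char }
  end

  # Pass 2: re-read the input and yield only the wanted positions per row.
  keep = wanted.sort
  io.rewind
  io.each_line do |line|
    row = line.chomp
    yield keep.map { |i| row[i] || "" }.join
  end
end

# Made-up sample; with a real file one might use File.open(path) { |f| ... }.
io = StringIO.new("AC-GT\nACTG-\nA-TGT\n")
result = []
each_stripped_row(io, "-") do |row|
  result << row
  puts row
end
```

Storing the indices in a Set also means a column found in several rows is only recorded once.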

Kind regards

robert

Marc Hoeppner

7/12/2007 9:48:00 AM

Robert Klemme wrote:
> 2007/7/12, Marc Hoeppner <marc.hoeppner@molbio.su.se>:
>>
>> other day only to find out that it can be done with grep...yuck).
> I'm missing one crucial bit of information: can your files grow so
> large that they do not fit into memory? If that's the case you need an
> approach with two or more phases.
>
> Kind regards
>
> robert

I am using single protein alignments, so any single row should
never exceed ~2000 characters (in other words, my sequences are
rarely longer than a couple hundred characters), with 20 rows tops. So
that should not be an issue, I guess.

--
Posted via http://www.ruby-....