F. Senault
7/19/2007 10:57:00 AM
On 19 July at 12:31, Marc Hoeppner wrote:
> Hi,
>
> I am not quite sure about how to approach the following problem:
>
> I have a long (long long long) string of letters, a genomic sequence
> (600k characters+).
> Now, what I want to do is to extract certain parts of this string, based
> on the position.
> So, for example, let's say I want all characters from position 2340 to
> 5436.
For example:
>> str = "abcdefghijklmnopqrstuvwxyz"
=> "abcdefghijklmnopqrstuvwxyz"
The simplest way to answer your question is to index the string with a range:
>> str[5..11]
=> "fghijkl"
You may also want to try the other variants (start and length, regular
expression, literal substring):
>> str[5, 7]
=> "fghijkl"
>> str[/f.*l/]
=> "fghijkl"
>> str['fghijkl']
=> "fghijkl"
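Applied to the original question, the range form might look like the sketch below. The sequence is a made-up stand-in, `extract` is a hypothetical helper, and I am assuming the positions 2340 and 5436 are 1-based and inclusive (Ruby indices start at 0), which the question does not actually say:

```ruby
# Hypothetical stand-in for the genomic sequence: 600,000 letters.
sequence = "ACGT" * 150_000

# Assumption: positions are 1-based and inclusive, so they are shifted
# down by one to get Ruby's 0-based string indices.
def extract(seq, first, last)
  seq[(first - 1)..(last - 1)]
end

fragment = extract(sequence, 2340, 5436)
puts fragment.length   # prints 3097
```

If the positions are already 0-based, plain `sequence[2340..5436]` does the same job without the helper.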
If you need to process it character by character, you can use any of a
number of methods:
>> str[5..10].each_byte { |b| puts b.chr }
f
g
h
i
j
k
=> "fghijk"
>> str[5..10].split(//)
=> ["f", "g", "h", "i", "j", "k"]
>> str[5..10].split(//).each { |c| puts c }
f
g
h
i
j
k
=> ["f", "g", "h", "i", "j", "k"]
Etc.
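On more recent Ruby versions there is also `String#each_char`, which avoids both the `b.chr` conversion and the intermediate array from `split`. A small sketch, assuming a Ruby that provides `each_char` and `chars` (the variable name `letters` is just for illustration):

```ruby
str = "abcdefghijklmnopqrstuvwxyz"

# Iterate over the slice one character at a time.
str[5..10].each_char { |c| puts c }

# Or collect the characters; to_a covers versions where chars
# returns an enumerator rather than an array.
letters = str[5..10].chars.to_a
```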
I haven't tried this with very long strings, but I don't see why
range-based access wouldn't be acceptable. (Of course, the regular
expression will be slower.)
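If you want to check this on your own data rather than take my word for it, something along these lines should do. It is only a rough sketch: the sequence is a made-up stand-in, the repetition count `n` is arbitrary, and the regular expression is just one possible way of expressing "skip 2339 characters, capture the next 3097":

```ruby
require 'benchmark'

sequence = "ACGT" * 150_000   # hypothetical 600,000-letter stand-in
n = 1_000                     # arbitrary repetition count

# Both forms should return the same 3097-character fragment.
by_range = sequence[2339..5435]
by_regex = sequence[/\A.{2339}(.{3097})/m, 1]

range_time = Benchmark.realtime { n.times { sequence[2339..5435] } }
regex_time = Benchmark.realtime { n.times { sequence[/\A.{2339}(.{3097})/m, 1] } }

puts "same result: #{by_range == by_regex}"
puts format("range: %.4fs  regex: %.4fs", range_time, regex_time)
```

The timings will depend on the machine and the Ruby version, so treat them as a rough comparison only.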
Fred
--
I can try to get away but i've strapped myself in
I can try to scratch away the sound in my ears
I can see it killing away all my bad parts
I don't want to listen but it's all too clear
(Nine Inch Nails, The Becoming)