Jean-Michel Pichavant
2/12/2010 4:49:00 PM
PeroMHC wrote:
> Hi All, I have a simple problem that I hope somebody can help with. I
> have an input file (a fasta file) that I need to edit.
>
> Input file format:
>
> >name 1
> tactcatacatac
> >name 2
> acggtggcat
> >name 3
> gggtaccacgtt
>
> I need to concatenate the sequences, so that they look like this:
>
> >concatenated
> tactcatacatacacggtggcatgggtaccacgtt
>
> thanks. Matt
>
A solution using regexp:

    import re

    found = []
    with open('seqfile.txt') as seqfile:
        for line in seqfile:
            # keep only lines made up entirely of nucleotide letters
            found += re.findall('^[acgtACGT]+$', line)
    print(found)
    # ['tactcatacatac', 'acggtggcat', 'gggtaccacgtt']
    print(''.join(found))
    # tactcatacatacacggtggcatgggtaccacgtt
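
Note that the regexp only keeps lines consisting entirely of a, c, g, t, so a sequence line containing anything else (an N, a gap character) is silently dropped. If that matters, a rough alternative is to skip the header lines instead and keep everything else. The sketch below assumes headers start with '>' and that the output should be a single fasta record; the names concatenate_fasta and sample are made up for illustration and the sample data is just the input from the question.

```python
def concatenate_fasta(lines, name="concatenated"):
    """Join all sequence lines of a fasta file into one record (hypothetical helper)."""
    parts = []
    for line in lines:
        line = line.strip()
        # skip blank lines and header lines, which start with '>'
        if not line or line.startswith(">"):
            continue
        parts.append(line)
    # one header line followed by the concatenated sequence
    return ">%s\n%s\n" % (name, "".join(parts))


# sample input taken from the question; a real file object works the same way
sample = [
    ">name 1\n", "tactcatacatac\n",
    ">name 2\n", "acggtggcat\n",
    ">name 3\n", "gggtaccacgtt\n",
]
result = concatenate_fasta(sample)
print(result)
```

To run it on the real file, pass an open file instead of the list, e.g. concatenate_fasta(open('seqfile.txt')), and write the returned string to the output file.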
JM