comp.lang.python

Re: Sort Big File Help

MRAB

3/3/2010 7:59:00 PM

mk wrote:
> John Filben wrote:
>> I am new to Python but have used many other (mostly dead) languages in
>> the past. I want to be able to process *.txt and *.csv files. I can
>> now read that and then change them as needed - mostly just take a
>> column and do some if-then to create a new variable. My problem is
>> sorting these files:
>>
>> 1.) How do I sort file1.txt by position and write out
>> file1_sorted.txt; for example, if all the records are 100 bytes long
>> and there is a three digit id in the position 0-2; here would be some
>> sample data:
>>
>> a. 001JohnFilben...
>>
>> b. 002Joe Smith...
>
> Use a dictionary:
>
> linedict = {}
> for line in f:
>     key = line[:3]
>     # or alternatively 'line' if you want to include key in the line anyway
>     linedict[key] = line[3:]
>
> sortedlines = []
> for key in sorted(linedict):  # list.sort() returns None, so use sorted()
>     sortedlines.append(linedict[key])
>
> (untested)
>
> This is the simplest, and probably inefficient approach. But it should
> work.
>
[snip]
Simpler would be:

lines = f.readlines()
lines.sort(key=lambda line: line[ : 3])

or even:

lines = sorted(f.readlines(), key=lambda line: line[ : 3])
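
The question also asks to write the result out to file1_sorted.txt, which the snippets above leave out. A minimal sketch of the whole round trip, assuming the file fits in memory and each record sits on its own line. The function name sort_by_id and the third sample record are made up for illustration; the file names come from the question.

```python
import os
import tempfile


def sort_by_id(src_path, dst_path, key_width=3):
    """Sort the lines of src_path by their first key_width characters."""
    with open(src_path) as src:
        lines = src.readlines()
    # If the last line has no trailing newline it would get glued to the
    # next record once the lines are reordered, so add one.
    if lines and not lines[-1].endswith("\n"):
        lines[-1] += "\n"
    lines.sort(key=lambda line: line[:key_width])
    with open(dst_path, "w") as dst:
        dst.writelines(lines)


# Small demonstration in a temporary directory (sample records are invented).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "file1.txt")
dst = os.path.join(workdir, "file1_sorted.txt")
with open(src, "w") as f:
    f.write("002Joe Smith\n001JohnFilben\n003Ann Jones")

sort_by_id(src, dst)

with open(dst) as f:
    result = f.read()
print(result)
```

For files too large to hold in memory this would not do; some kind of external sort (sort chunks, then merge them) would be needed instead.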

Arnaud Delobelle

3/3/2010 8:59:00 PM

MRAB <python@mrabarnett.plus.com> writes:

> mk wrote:
>> John Filben wrote:
>>> I am new to Python but have used many other (mostly dead) languages
>>> in the past. I want to be able to process *.txt and *.csv files.
>>> I can now read that and then change them as needed - mostly just
>>> take a column and do some if-then to create a new variable. My
>>> problem is sorting these files:
>>>
>>> 1.) How do I sort file1.txt by position and write out
>>> file1_sorted.txt; for example, if all the records are 100 bytes
>>> long and there is a three digit id in the position 0-2; here would
>>> be some sample data:
>>>
>>> a. 001JohnFilben...
>>>
>>> b. 002Joe Smith...
>>
>> Use a dictionary:
>>
>> linedict = {}
>> for line in f:
>>     key = line[:3]
>>     # or alternatively 'line' if you want to include key in the line anyway
>>     linedict[key] = line[3:]
>>
>> sortedlines = []
>> for key in sorted(linedict):  # list.sort() returns None, so use sorted()
>>     sortedlines.append(linedict[key])
>>
>> (untested)
>>
>> This is the simplest, and probably inefficient approach. But it
>> should work.
>>
> [snip]
> Simpler would be:
>
> lines = f.readlines()
> lines.sort(key=lambda line: line[ : 3])
>
> or even:
>
> lines = sorted(f.readlines(), key=lambda line: line[ : 3]))

Or even:

lines = sorted(f)

--
Arnaud
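
One difference between the variants in this thread is worth seeing on a small case. Sorting whole lines works here because the id is at the very start of each record, but records that share an id come out in a different order than with the key-based sort. A short sketch with invented records, not data from the original question:

```python
# Invented sample records; two of them share the id "002".
lines = ["002Zed\n", "001Bob\n", "002Amy\n"]

# Key-based sort: Python's sort is stable, so records with the same id
# keep the order they had in the input.
by_key = sorted(lines, key=lambda line: line[:3])

# Whole-line sort: ties on the id are broken by the rest of the record.
by_line = sorted(lines)

print(by_key)   # ['001Bob\n', '002Zed\n', '002Amy\n']
print(by_line)  # ['001Bob\n', '002Amy\n', '002Zed\n']
```

If the ids are unique, as in the sample data from the question, both give the same result.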