comp.lang.python

linecache and glob

Joe Chiang

1/4/2008 3:03:00 AM

hi everyone, happy new year!
i'm a newbie to python and i have a question.
using linecache and glob, how do i read a specific line from each file
in a batch and then insert it into a database?

i can't get the glob wildcard to work with linecache:

>>> import linecache, glob
>>> linecache.getline(glob.glob('/etc/*'), 4)

this doesn't work.

are there any better methods? thank you very much in advance

jo3c
5 Answers

Jeremy Dillworth

1/4/2008 3:39:00 AM


Hello,

Welcome to Python!

glob returns a list of filenames, but getline is made to work on just
one filename.
So you'll need to iterate over the list returned by glob.

>>> import linecache, glob
>>> for filename in glob.glob('/etc/*'):
...     print(linecache.getline(filename, 4))

Maybe you could explain more about what you are trying to do and we
could help more?

Hope this helps,

Jeremy




Joe Chiang

1/4/2008 6:11:00 AM


i have 2000 files, each with a header and data.
i need to get the date information from the header
and then insert it into my database.
i am doing it in a batch, so i use glob.glob('/mydata/*/*/*.txt').
to get the date on line 4 of a txt file i use
linecache.getline('/mydata/myfile.txt', 4)

but if i use
linecache.getline(glob.glob('/mydata/*/*/*.txt'), 4)
it won't work.

i am running out of ideas

thanks in advance for any help

jo3c

Shane Geiger

1/4/2008 7:05:00 AM


import linecache
import glob

# reading from one file
print(linecache.getline('notes/python.txt', 4))
# returned string: 'http://www.python.org/doc/cu...\n'

# reading from many files
for filename in glob.glob('/etc/*'):
    print(linecache.getline(filename, 4))







Fredrik Lundh

1/4/2008 9:26:00 AM



glob.glob returns a list of filenames, so you need to call getline once
for each file in the list.

but using linecache is absolutely the wrong tool for this; it's designed
for *repeated* access to arbitrary lines in a file, so it keeps all the
data in memory. that is, all the lines, for all 2000 files.
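to make the memory point concrete, here is a minimal sketch (written for Python 3; the helper name fourth_lines and the throwaway demo files are made up for illustration, not taken from the thread). it relies on two documented behaviours: linecache.getline returns an empty string when the file or line is missing, and linecache.clearcache() drops everything cached so far. clearing after each file keeps the cache from growing across the batch, although every file is still read in full.

```python
import glob
import linecache
import os
import tempfile


def fourth_lines(pattern, lineno=4):
    """Return {filename: line} for every file matching pattern.

    Hypothetical helper: clears the linecache after each file so the
    cache never holds more than one file's lines at a time.
    """
    result = {}
    for filename in sorted(glob.glob(pattern)):
        # getline returns '' if the file is unreadable or too short
        result[filename] = linecache.getline(filename, lineno)
        linecache.clearcache()
    return result


# Demo on throwaway files (stand-ins for the real data files)
demo_dir = tempfile.mkdtemp()
for name in ("a.txt", "b.txt"):
    with open(os.path.join(demo_dir, name), "w") as f:
        f.write("id\nsource\ncomment\ndate: 2008-01-04\ndata\n")

lines = fourth_lines(os.path.join(demo_dir, "*.txt"))
```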

if the files are small, and you want to keep the code short, it's easier
to just grab the file's content and use indexing on the resulting list:

    for filename in glob.glob('/mydata/*/*/*.txt'):
        line = list(open(filename))[4-1]
        # ... do something with line ...

(note that line numbers usually start with 1, but Python's list indexing
starts at 0).

if the files might be large, use something like this instead:

    for filename in glob.glob('/mydata/*/*/*.txt'):
        f = open(filename)
        # skip first three lines
        f.readline(); f.readline(); f.readline()
        # grab the line we want
        line = f.readline()
        f.close()
        # ... do something with line ...

</F>
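the "skip three lines, read the fourth" idea above can also be written for any line number. the sketch below is one possible variant, not the code from the post: it assumes Python 3, swaps the repeated readline() calls for itertools.islice, and uses a made-up helper name (read_line) and a throwaway demo file. like the readline version, it stops reading as soon as the wanted line has been seen.

```python
import itertools
import os
import tempfile


def read_line(path, lineno):
    """Return line `lineno` (1-based) of `path`, or '' if the file is shorter."""
    with open(path) as f:
        # islice skips lineno-1 lines, then yields at most one line
        for line in itertools.islice(f, lineno - 1, lineno):
            return line
    return ""


# Demo on a throwaway file (a stand-in for one of the real data files)
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("one\ntwo\nthree\nfour\nfive\n")

fourth = read_line(demo_path, 4)
missing = read_line(demo_path, 10)
os.remove(demo_path)
```

the with block closes each file as soon as its line has been read, which matters when looping over a couple of thousand files.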

Joe Chiang

1/8/2008 3:27:00 AM



thank you guys. i did hit a wall using linecache, because the large files
were being loaded into memory. i think this last solution works well for me.
thanks
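
the thread never got to the "insert it into a database" half of the question. here is a rough sketch of how the two halves might fit together, under loud assumptions: the database in the thread is never named, so sqlite3 from the standard library stands in for it; the table name headers, its columns, the helper load_header_dates and the demo files are all hypothetical; and the code targets Python 3.

```python
import glob
import itertools
import os
import sqlite3
import tempfile


def load_header_dates(pattern, conn, lineno=4):
    """Insert (filename, header line) rows; return how many were inserted."""
    rows = []
    for filename in sorted(glob.glob(pattern)):
        with open(filename) as f:
            # read only up to the wanted line; None if the file is too short
            line = next(itertools.islice(f, lineno - 1, lineno), None)
        if line is not None:
            rows.append((filename, line.strip()))
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO headers (filename, header_date) VALUES (?, ?)", rows
        )
    return len(rows)


# Demo data: two complete files and one that is too short (all made up)
demo_dir = tempfile.mkdtemp()
samples = {
    "a.txt": "h1\nh2\nh3\n2008-01-04\ndata\n",
    "b.txt": "h1\nh2\nh3\n2008-01-08\ndata\n",
    "short.txt": "h1\n",
}
for name, text in samples.items():
    with open(os.path.join(demo_dir, name), "w") as f:
        f.write(text)

conn = sqlite3.connect(":memory:")  # hypothetical stand-in database
conn.execute("CREATE TABLE headers (filename TEXT, header_date TEXT)")
inserted = load_header_dates(os.path.join(demo_dir, "*.txt"), conn)
stored = [row[0] for row in
          conn.execute("SELECT header_date FROM headers ORDER BY filename")]
```

collecting the rows first and inserting them with one executemany call keeps the whole batch in a single transaction; with a different database only the connection and the parameter placeholder style should need to change.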