
comp.lang.ruby

How/Where is a Stream Stored?

Randy Kramer

3/2/2005 8:27:00 PM

Background: To do the parsing I've talked about in another thread, I often
need to know the number of spaces before and after the current token. I'm
trying to think of efficient ways to do that. One might be a preprocessing
pass through the text that works out how many spaces separate the tokens and
stores the tokens and the spaces between them in a temporary in-memory data
structure. Otherwise, I'll need a way to backtrack from the found position of
a token to find how many spaces separate it from the previous token.

I'm thinking that maybe a stream ("on" the input file?) might be a way to do
the backtracking (by moving the pos back from the current position, either
one character at a time or several (and then read forward to the first
space)).

I'm wondering how efficient an operation that is. Are such stream operations
performed on the disk file itself, or is the stream somehow buffered in
memory and the operations performed there? (Or am I hopelessly
confused? ;-)

Randy Kramer

Aside: In another thread I'm going to ask about efficient storage for the
other alternative.


3 Answers

Robert Klemme

3/2/2005 10:41:00 PM


"Randy Kramer" <rhkramer@gmail.com> wrote in message
news:200503021525.43694.rhkramer@gmail.com...
> Background: In order to do the parsing I've talked about in another thread, in
> many circumstances I need to know the number of spaces before and after the
> current token. I'm trying to think about efficient ways to do that--one
> might be to do a preprocess pass through the text to figure out how many
> spaces separate various tokens then store the tokens and spaces between them
> in a temporary in memory data structure, or I'll need a way to backtrack from
> the found position of some token to find how many spaces separate it from the
> previous token.
>
> I'm thinking that maybe a stream ("on" the input file?) might be a way to do
> the backtracking (by moving the pos back from the current position, either
> one character at a time or several (and then read forward to the first
> space)).
>
> I'm wondering how efficient an operation that is--are such stream operations
> performed on the disk file itself, or is the stream somehow buffered in
> memory and the operations performed there. (Or, am I hopelessly
> confused? ;-)
>
> Randy Kramer
>
> Aside: In another thread I'm going to ask about efficient storage for the
> other alternative.

:-)

If your files aren't big, then I guess the most efficient way is to slurp
them into memory and use String#scan on that, especially since the
substrings share the same string buffer underneath, so you get essentially
just one copy of the file in memory AFAIK.
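
As a rough illustration of the slurp-and-scan idea, here is a minimal sketch. The helper name tokens_with_spacing and the sample string are made up for this example, and it assumes a token is any run of non-whitespace characters and that only plain space characters (not tabs or newlines) should be counted. In a real program the text would come from something like File.read.

```ruby
# Hypothetical helper: returns one hash per token with the number of plain
# spaces immediately before and after it.
def tokens_with_spacing(text)
  # ( *)       captures the spaces in front of the token
  # (\S+)      captures the token itself
  # (?=( *))   peeks at the spaces after the token without consuming them,
  #            so they are still available as the "before" of the next token
  text.scan(/( *)(\S+)(?=( *))/).map do |before, token, after|
    { token: token, before: before.length, after: after.length }
  end
end

sample = "  foo bar   baz "
tokens_with_spacing(sample).each do |t|
  puts "#{t[:token]}: #{t[:before]} before, #{t[:after]} after"
end
```

Because the whole pass works on one in-memory string, no seeking on the file is needed at all.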

Kind regards

robert

Guillaume Marcais

3/3/2005 10:56:00 PM

On Thu, 2005-03-03 at 05:26 +0900, Randy Kramer wrote:
> I'm thinking that maybe a stream ("on" the input file?) might be a way to do
> the backtracking (by moving the pos back from the current position, either
> one character at a time or several (and then read forward to the first
> space)).

It depends on the type of stream. You can backtrack easily with a file,
but you can't with a non-seekable stream (like stdin fed from a pipe, a
network socket, etc.). So using IO#seek would prevent your program from
working as a UNIX filter (reading from stdin, writing to stdout). That may
or may not be a big deal for you; your call...
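
To make the seekable/non-seekable distinction concrete, here is a small sketch. The helper name seekable? is invented for this example; it simply attempts a zero-byte relative seek and treats any system-call error as "not seekable", which is an assumption about how you would want to probe a stream rather than an established idiom.

```ruby
require "tempfile"

# Hypothetical helper: true if the stream can be repositioned with IO#seek.
def seekable?(io)
  io.seek(0, IO::SEEK_CUR) # moves nowhere, but still fails on pipes and sockets
  true
rescue SystemCallError     # e.g. Errno::ESPIPE ("Illegal seek") on a pipe
  false
end

# A regular file can be repositioned.
Tempfile.create("seek_demo") do |file|
  file.write("hello   world")
  puts "file seekable? #{seekable?(file)}"
end

# A pipe, which is what stdin is when the program runs as a filter, cannot.
reader, writer = IO.pipe
puts "pipe seekable? #{seekable?(reader)}"
reader.close
writer.close
```

A program that wants to keep working as a filter could run a check like this once and fall back to reading all of its input into memory when seeking is not available.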

> I'm wondering how efficient an operation that is--are such stream operations
> performed on the disk file itself, or is the stream somehow buffered in
> memory and the operations performed there. (Or, am I hopelessly
> confused? ;-)

Disk reads on any recent OS are cached in memory. Backtracking by a small
amount is unlikely to generate any disk activity.
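
For what the backtracking itself might look like on a seekable file, here is a minimal sketch. The helper name spaces_before is made up for this example, and it assumes byte offsets, a single-byte encoding for the space character, and that token_pos lies within the file.

```ruby
require "tempfile"

# Hypothetical helper: counts the plain spaces immediately before the byte
# offset token_pos by stepping backwards one byte at a time, then restores
# the stream's original read position.
def spaces_before(io, token_pos)
  saved = io.pos
  count = 0
  pos = token_pos
  while pos > 0
    io.seek(pos - 1, IO::SEEK_SET)
    break unless io.readbyte == 0x20 # 0x20 is an ASCII space
    count += 1
    pos -= 1
  end
  count
ensure
  io.seek(saved, IO::SEEK_SET) if saved
end

Tempfile.create("backtrack_demo") do |file|
  file.write("foo   bar")
  file.seek(6, IO::SEEK_SET)   # pretend the parser just found "bar" at offset 6
  puts spaces_before(file, 6)  # spaces between "foo" and "bar"
  puts file.pos                # read position is back where the parser left it
end
```

Each step back is a seek plus a one-byte read, so the cost is proportional to the number of spaces; for short runs that stays within data the system has already cached.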

Guillaume.




Randy Kramer

3/4/2005 2:22:00 AM

On Thursday 03 March 2005 05:56 pm, Guillaume Marcais wrote:
> On Thu, 2005-03-03 at 05:26 +0900, Randy Kramer wrote:
> It depends on the type of stream. You can backtrack easily with a file,
> but you can't with a non-seekable stream (like stdin fed from a pipe, a
> network socket, etc.). So using IO#seek would prevent your program from
> working as a UNIX filter (reading from stdin, writing to stdout). That may
> or may not be a big deal for you; your call...

...

> Disk reads on any recent OS are cached in memory. Backtracking by a small
> amount is unlikely to generate any disk activity.

Guillaume,

Thanks!

Randy Kramer