comp.lang.ruby

Strip is not stripping trailing whitespace

Taylor Strait

12/28/2006 6:29:00 AM

I have files with city names which have one or two trailing whitespace characters:

Adelanto <-
Agoura Hills <-
Alameda <-
Albany <-
Alhambra <-
Aliso Viejo <-

My method just iterates and strips!

def trim(state)
  diskfile = File.new(state + "-cleaned.txt", "w")
  $stdout = diskfile

  IO.foreach(state + ".txt") do |line|
    line.strip!
    puts line
  end

  diskfile.close
  $stdout = STDOUT
end

The output successfully removes leading whitespace but not trailing
whitespace. What am I doing wrong? I would chop! but the number of
trailing whitespace characters varies and my attempt at a while loop to
check and chop! was unsuccessful.

--
Posted via http://www.ruby-....

13 Answers

William James

12/28/2006 6:53:00 AM

Perhaps there are some control characters at the lines' ends.

The way that you're writing to a file seems roundabout
and peculiar to me.

def trim(state)
  open(state + "-cleaned.txt", "w") do |out|
    IO.foreach(state + ".txt") do |line|
      # The next line will show control characters.
      p line
      out.puts line.strip
    end
  end
end
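
To make that check a bit more concrete, here is a minimal sketch that lists the byte values trailing a line. `trailing_bytes` is a hypothetical helper name, not something from this thread, and it assumes that any byte outside printable ASCII (0x21..0x7E) at the end of a line is worth reporting.

```ruby
# Hypothetical helper: return the byte values that follow the last
# printable, non-space ASCII character of a line.
def trailing_bytes(line)
  bytes = line.bytes
  tail = []
  # Move bytes from the end into tail until a printable ASCII byte is reached.
  tail.unshift(bytes.pop) while !bytes.empty? && !(0x21..0x7E).cover?(bytes.last)
  tail
end

# .b yields a binary string, so the bytes are taken as-is.
p trailing_bytes("Yorba Linda\xA0\xA0\n".b)  # [160, 160, 10]
```

Byte 160 is octal 240, and 10 is the newline.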

Morton Goldberg

12/28/2006 7:38:00 AM

Strip is working as it should. Your input lines don't end in spaces,
but with a line-ending character. The easy way to do what you want is
something like the following:

<code>
#! /usr/bin/env ruby -w

PREFIX = "/Users/mg/Desktop/test"
SUFFIX = ".txt"
File.open(PREFIX + "-cleaned" + SUFFIX, "w") do |out_file|
  File.open(PREFIX + SUFFIX) do |in_file|
    in_file.each { |line| out_file.puts line.chomp.strip }
  end
end
</code>

Note the use of String#chomp. Also, I'm recommending that you use
File.open and that you don't mess with $stdout. When given a block,
File.open closes the file automatically once the block finishes.

Regards, Morton
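
For anyone who wants to try the block form without touching real data, here is a small sketch under a few assumptions: `clean_file` and the temp file names are hypothetical, and the sample contents are made up for illustration.

```ruby
require "tmpdir"

# Hypothetical helper (name and arguments are illustrative, not from the thread).
def clean_file(in_path, out_path)
  # The block form closes out_file when the block exits, even if an error is raised.
  File.open(out_path, "w") do |out_file|
    File.foreach(in_path) { |line| out_file.puts line.strip }
  end
end

Dir.mktmpdir do |dir|
  src = File.join(dir, "CA.txt")
  dst = File.join(dir, "CA-cleaned.txt")
  File.write(src, "  Adelanto \nAlbany  \n")
  clean_file(src, dst)
  p File.read(dst)   # "Adelanto\nAlbany\n"

  # The handle passed to the block is already closed afterwards.
  handle = nil
  File.open(dst) { |f| handle = f }
  p handle.closed?   # true
end
```

Nothing here reassigns $stdout, so ordinary puts calls elsewhere in the program keep going to the terminal.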

Taylor Strait

12/28/2006 7:39:00 AM

def trim(state)
  open(state + "-cleaned.txt", "w") do |out|
    IO.foreach(state + ".txt") do |line|
      # The next line will show control characters.
      p line
      out.puts line.strip
    end
  end
end

outputs:

"Yorba Linda\240\240\n"
"Yountville \n"
"Yreka\240\240\n"
"Yuba City\240\240\n"
"Yucaipa\240\240\n"
"Yucca Valley "
=> nil

but alas the generated file still has trailing whitespace. What should
I do to remove the \240 and \n? Is that not what strip! does?


Taylor Strait

12/28/2006 7:45:00 AM

> Note the use of String#chomp

Thanks, Morton. That was the key. I just appended two .chomp methods
and it fixed it right up. I appreciate everyone's help!


William James

12/28/2006 8:18:00 AM

Taylor Strait wrote:
> > Note the use of String#chomp
>
> Thanks, Morton. That was the key. I just appended two .chomp methods
> and it fixed it right up.

It couldn't have.

strip removes newlines, because they are whitespace.
On the other hand, neither chomp nor strip removes "\240".

>> "foo \n".strip
=> "foo"
>> ("foo " + 0240.chr).chomp
=> "foo \240"
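
The same comparison can be run as a script on current Ruby versions. This is only a sketch: `NBSP_BYTE` is a name made up here, and `.b` is used so the byte is treated as raw binary data rather than as text in the source file's encoding.

```ruby
# Byte 0xA0 is what 0240.chr produces (octal 240 = decimal 160).
NBSP_BYTE = "\xA0".b

with_newline = "foo \n"
with_byte    = "foo ".b + NBSP_BYTE

p with_newline.strip  # "foo"
p with_byte.chomp     # "foo \xA0"
p with_byte.strip     # "foo \xA0"
```

strip removes the space and newline, but the \240 byte survives both chomp and strip, and it also shields the space in front of it.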

Taylor Strait

12/28/2006 8:31:00 AM

> It couldn't have.
>
> strip removes newlines, because they are whitespace.
> On the other hand, neither chomp nor strip removes "\240".
>
>>> "foo \n".strip
> => "foo"
>>> ("foo " + 0240.chr).chomp
> => "foo \240"

This is why I shouldn't code at 3:30am! I had used chop instead, which
of course truncated the text in rare cases. How can I remove \240?


Eric Hodel

12/28/2006 8:52:00 AM

On Dec 28, 2006, at 24:31, Taylor Strait wrote:
> How can I remove \240?

String#gsub

line.gsub(/[\240]/, '')
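
On Ruby 1.9 and later, where strings carry an encoding, one possible way to apply the same gsub idea is to transcode the line first. This sketch assumes the files are ISO-8859-1, in which byte \240 is the no-break space; the thread does not say what encoding the files use, and `strip_with_nbsp` is a hypothetical helper name.

```ruby
# Assumption: raw_line holds ISO-8859-1 bytes, so byte 0xA0 (octal 240)
# becomes the no-break space U+00A0 after transcoding.
def strip_with_nbsp(raw_line)
  text = raw_line.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
  # [[:space:]] is Unicode-aware and matches U+00A0; \s does not.
  text.gsub(/\A[[:space:]]+|[[:space:]]+\z/, "")
end

p strip_with_nbsp("Yorba Linda\xA0\xA0\n".b)  # "Yorba Linda"
```

Unlike a plain strip, this removes ordinary whitespace and no-break spaces at both ends in one pass.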

--
Eric Hodel - drbrain@segment7.net - http://blog.se...

I LIT YOUR GEM ON FIRE!


Taylor Strait

12/28/2006 8:53:00 AM

> How can I remove \240?

Time for a break. The answer is: delete(0240.chr)


Carlos

12/28/2006 8:56:00 AM

[Taylor Strait <taylorstrait@gmail.com>, 2006-12-28 09.31 CET]
> How can I remove \240?

line.delete! "\240"

or, if you want to delete them at the end only,

line.sub!(/\240+$/, "")

Good luck.
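
The two options above differ when \240 also shows up in the middle of a line. A rough sketch for current Ruby versions follows; the helper names and the sample string are made up for illustration, and the strings are kept binary so the byte is matched as-is.

```ruby
# Byte 0xA0 is octal 240. .b returns a binary (ASCII-8BIT) copy, so the
# byte is handled as-is regardless of the source file's encoding.
BYTE_240 = "\xA0".b

# Hypothetical helper: remove every \240 byte, wherever it appears.
def delete_all_240(line)
  line.b.delete(BYTE_240)
end

# Hypothetical helper: remove only a run of \240 bytes at the end of a line.
# The /n flag makes the regexp binary so the \xA0 escape is accepted.
def delete_trailing_240(line)
  line.b.sub(/\xA0+$/n, "")
end

sample = "San\xA0Jose\xA0\xA0\n".b
p delete_all_240(sample)       # "SanJose\n"
p delete_trailing_240(sample)  # "San\xA0Jose\n"
```

Both helpers leave the newline in place, so a strip or chomp is still needed afterwards.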

William James

12/28/2006 8:57:00 AM

Taylor Strait wrote:
> How can I remove \240?

Remove bytes 128-255 (everything outside 7-bit ASCII) at the end of the line:

irb(main):016:0> "foo\240\250 \n".strip.
irb(main):017:0* sub(/[#{128.chr}-#{255.chr}]+$/,"")
=> "foo"
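
On current Ruby versions the interpolated range above may not build a valid regexp, so here is a possible equivalent written as a binary regexp. It is a sketch only: `strip_high_bytes_at_end` is a hypothetical name, and it assumes the input is single-byte data.

```ruby
# Hypothetical helper: strip ordinary whitespace, then drop any run of
# bytes in the range 128..255 left at the end.
def strip_high_bytes_at_end(line)
  # .b gives a binary (ASCII-8BIT) copy; /n makes the regexp binary too,
  # which allows the \x80-\xFF byte range.
  line.b.strip.sub(/[\x80-\xFF]+\z/n, "")
end

p strip_high_bytes_at_end("foo\xA0\xA8 \n".b)  # "foo"
```

On UTF-8 text this would also cut into any multibyte character at the end of a line, such as an accented letter, so it only suits data known to be single-byte.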