
comp.lang.ruby

Problem removing new line characters on Mac OS X

Singeo

5/17/2007 7:36:00 AM

Hi, I'm pretty new to Ruby. I've got a text file where I need to
remove some new line characters. I've tried everything I can think of
to do this with no success, including:

line.gsub!("/r","")
line.gsub!("/n","")
line=line.chomp

I can't seem to get the new line character to be recognised and dealt
with. Any advice appreciated.

Thanks

16 Answers

Harry Kakueki

5/17/2007 7:51:00 AM


On 5/17/07, Singeo <singeo.sg@gmail.com> wrote:
> Hi, I'm pretty new to Ruby. I've got a text file where I need to
> remove some new line characters. I've tried everything I can think of
> to do this with no success, including:
>
> line.gsub!("/r","")
> line.gsub!("/n","")
> line=line.chomp
>
> I can't seem to get the new line character to be recognised and dealt
> with. Any advice appreciated.
>
> Thanks
>
>
>

line.chomp! doesn't work?
Would you show some code?

Harry

--

A Look into Japanese Ruby List in English
http://www.ka...

Dan Zwell

5/17/2007 8:38:00 AM


Singeo wrote:
> Hi, I'm pretty new to Ruby. I've got a text file where I need to
> remove some new line characters. I've tried everything I can think of
> to do this with no success, including:
>
> line.gsub!("/r","")
> line.gsub!("/n","")
> line=line.chomp
>
> I can't seem to get the new line character to be recognised and dealt
> with. Any advice appreciated.
>
> Thanks
>
>
>

It looks like you should be using backslashes. If you want to match both
newlines and carriage returns, you can use:

line.gsub!(/[\n\r]/, "")
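
A minimal sketch of the difference, using a made-up sample string (the variable names are only for illustration): the forward-slash version matches nothing, while the character class removes both kinds of line-ending character.

```ruby
# Hypothetical sample line containing a carriage return and two newlines.
line = "first\r\nsecond\n"

# "/n" is a forward slash followed by the letter n, so nothing matches.
unchanged = line.gsub("/n", "")

# The character class removes every newline and carriage return.
cleaned = line.gsub(/[\n\r]/, "")

# String#delete with a set of characters does the same without a regexp.
deleted = line.delete("\r\n")

p unchanged == line  # the string is untouched
p cleaned
p deleted
```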

-Dan

Hermann Martinelli

5/17/2007 9:02:00 AM


Singeo wrote:
> Hi, I'm pretty new to Ruby. I've got a text file where I need to
> remove some new line characters. I've tried everything I can think of
> to do this with no success, including:
>
> line.gsub!("/r","")
> line.gsub!("/n","")
> line=line.chomp

In case your problem is just your Ruby syntax:

1. Replace the forward slashes (like in "/r") with
backslashes ("\r") in your above-mentioned
solution.

2. Make the first parameter to the gsub! method
a Regexp instead of a string. The API docs say:
"... if it is a String then no regular expression
metacharacters will be interpreted ...".

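This is why "/r" (1) will not work. Note that a
double-quoted "\r" (2) does work as a plain String
pattern, because the string literal itself already
contains the carriage return character; only a
single-quoted '\r' would fail to match.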
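
It may be worth checking the String-versus-Regexp point in irb. The sketch below uses an invented sample string; as far as I can tell, what decides the outcome is how the string literal is quoted, not whether the pattern is a String or a Regexp.

```ruby
# Hypothetical sample text with a bare carriage return and a CR+LF pair.
text = "a\rb\r\nc"

# Double quotes: "\r" is already the single carriage return character,
# so a plain String pattern matches it.
double_quoted = text.gsub("\r", "")

# Single quotes: '\r' is two characters (backslash and r), which do not
# occur in the text, so nothing is replaced.
single_quoted = text.gsub('\r', "")

# A Regexp interprets \r itself and also matches the carriage return.
with_regexp = text.gsub(/\r/, "")

p double_quoted
p single_quoted == text
p with_regexp
```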



If chomp does not work, you may be using a Mac
file under Linux or Windows. In that case you may
want to try something like

line.gsub!(/\015/, '')
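
For reference, \015 is the octal escape for the carriage return, the character classic (pre-OS X) Mac files used as their line ending. A small sketch with an invented mixed-ending string shows one way to normalize all three conventions to a plain newline before any further processing:

```ruby
# "\015" (octal) and "\r" name the same single character.
same_character = ("\015" == "\r")

# Hypothetical text mixing Unix, Windows and classic Mac line endings.
mixed = "unix\nwindows\r\nold mac\r"

# Turn CR+LF and bare CR into LF; existing LF is left alone.
normalized = mixed.gsub(/\r\n?/, "\n")

p same_character
p normalized
```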


Hermann

Singeo

5/17/2007 9:17:00 AM


Apologies for the misunderstanding; I have been using backslashes.
Here's my code. As you'll see from the resulting file, there are a
bunch of newline characters in it that I'd like to get rid of.
Thanks for the help so far.

require("rubygems")
require("scrubyt")
require("open-uri")
require 'time'
require 'date'

psi = Scrubyt::Extractor.define do
  fetch("http://app.nea.gov.sg/...)

  record("/html/body/div/table/tr/td/table/tbody/tr/td/div",
         { :generalize => true }) do
    title("/strong[1]/font[1]")
    item("/table/tbody/tr/td/table/tbody/tr", { :generalize => true }) do
      region("/td[1]")
      psi("/td[7]")
      aqd("/td[8]")
    end
  end
end

f = open("psiregions.xml", File::CREAT|File::TRUNC|File::RDWR) {|f|
  psi.to_xml.write(f, 1)
}

# Create the RSS file.
rssfile = File.new("sgpsi.xml", "w")
rssfile.puts('<?xml version="1.0" encoding="UTF-8"?>')
rssfile.puts('<rss version="2.0">')
rssfile.puts(' <channel>')
rssfile.puts(' <link>http://app.nea.gov.sg/psi/</lin...)
rssfile.puts(' <description>Singapore PSI Readings</description>')
#rssfile.puts(' <title>Singapore PSI Readings' + Time.now.rfc2822 + '</title>')
rssfile.puts(' <lastBuildDate>' + Time.now.rfc2822 + '</lastBuildDate>')
rssfile.puts(' <webMaster>singeo@singeo.com.sg</webMaster>')

File.open('psiregions.xml', 'r') do |f1|
  while line = f1.gets
    line=line.strip
    line=line.chomp
    line.gsub!(/[\n]/, "")
    line.gsub!(/<root>/, "")
    line.gsub!(/<\/root>/, "")
    line.gsub!(/<record>/, "")
    line.gsub!(/<\/record>/, "")
    line.gsub!("24-hr", "Singapore 24-hr")
    line.gsub!("<region>Region</region>", "")
    line.gsub!("<region>Sulphur Dioxide</region>", "")
    line.gsub!(/<region>/, "<title>")
    line.gsub!(/<\/region>/,":")
    line.gsub!(/<psi>/, " PSI Level ")
    line.gsub!(/<\/psi>/, "")
    line.gsub!(/<aqd>/, " - ")
    line.gsub!(/<\/aqd>/, "</title>")
    line.gsub!(/<item>/, "<item><pubDate>" + Time.now.rfc2822 + "</pubDate>")
    rssfile.puts line
  end
end

rssfile.puts('</channel>')
rssfile.puts('</rss>')
rssfile.close




Singeo

5/17/2007 9:22:00 AM


Hi Hermann, just tried your suggestion of:

line.gsub!(/\015/, '')

still no success. I'm creating and running the file on a Mac.



Robert Dober

5/17/2007 9:40:00 AM


<snip>

Sebastian Hungerecker

5/17/2007 9:53:00 AM


Singeo wrote:
> Here's my code, as you'll see from the resulting file there are a
> bunch of new line characters in the file I'd like to get rid of.
> [...]
> line=line.chomp
> [...]
> rssfile.puts line

puts adds a newline to the end of the string it writes. If you don't want that
behaviour (which you obviously don't), use print instead.
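
A short sketch of that behaviour, writing to a StringIO instead of a real file so the result can be inspected (the strings are arbitrary examples): puts appends a newline unless the string already ends with one, while print writes exactly what it is given.

```ruby
require 'stringio'

out = StringIO.new

out.puts "a"     # no trailing newline, so puts appends one
out.puts "b\n"   # already ends in a newline, so nothing extra is added
out.print "c"    # print never appends anything
out.print "d"

written = out.string
p written
```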


--
Ist so, weil ist so
Bleibt so, weil war so

Hermann Martinelli

5/17/2007 10:20:00 AM


Singeo wrote:
> Hi Hermann, just tried your suggestion of:
>
> line.gsub!(/\015/, '')
>
> still no success. I'm creating and running the file on a Mac.

Are you sure that it is not successful?

It would be good to know how you read the lines,
how you (not) remove the carriage returns,
and how you perhaps put the lines together
(adding again \r characters by mistake?).

Are you removing the carriage returns line
by line (in which case the chomp should be perfect)
or are you trying it as a whole, i.e. do you have
not only one line but a whole file in 'line'?
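
The distinction matters because chomp only removes a line ending at the very end of a string. A sketch with a made-up three-line string, standing in for a whole file held in one variable:

```ruby
# Hypothetical contents of a whole file read into a single string.
whole = "one\ntwo\nthree\n"

# chomp strips only the final line ending; the inner newlines remain.
chomped_once = whole.chomp

# Removing every newline needs delete (or gsub) on the whole string ...
all_removed = whole.delete("\n")

# ... or chomp applied to each line before joining them again.
joined = whole.each_line.map { |l| l.chomp }.join

p chomped_once
p all_removed
p joined
```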

Rather than an answer to these questions I would
prefer to see some more code of the whole part
from opening the file to writing back or putting
out the strings.

Hermann

Singeo

5/17/2007 10:33:00 AM


Hermann, I followed Sebastian's advice to use print instead of puts and
that solved my problem. But I would still like to understand how to
remove the newline characters. Here's my code as it currently stands
(with "rssfile.print line" in place of "rssfile.puts line"), hopefully
it will help you see how I was trying to tackle the problem.

require("rubygems")
require("scrubyt")
require("open-uri")
require 'time'
require 'date'

psi = Scrubyt::Extractor.define do
  fetch("http://app.nea.gov.sg/...)

  record("/html/body/div/table/tr/td/table/tbody/tr/td/div",
         { :generalize => true }) do
    title("/strong[1]/font[1]")
    item("/table/tbody/tr/td/table/tbody/tr", { :generalize => true }) do
      region("/td[1]")
      psi("/td[7]")
      aqd("/td[8]")
    end
  end
end

f = open("psiregions.xml", File::CREAT|File::TRUNC|File::RDWR) {|f|
  psi.to_xml.write(f, 1)
}

# Create the RSS file.
rssfile = File.new("sgpsi.xml", "w")
rssfile.puts('<?xml version="1.0" encoding="UTF-8"?>')
rssfile.puts('<rss version="2.0">')
rssfile.puts(' <channel>')
rssfile.puts(' <link>http://app.nea.gov.sg/psi/</lin...)
rssfile.puts(' <description>Singapore PSI Readings</description>')
#rssfile.puts(' <title>Singapore PSI Readings' + Time.now.rfc2822 + '</title>')
rssfile.puts(' <lastBuildDate>' + Time.now.rfc2822 + '</lastBuildDate>')
rssfile.puts(' <webMaster>singeo@singeo.com.sg</webMaster>')

File.open('psiregions.xml', 'r') do |f1|
  while line = f1.gets
    line=line.strip
    line.gsub!(/<root>/, "")
    line.gsub!(/<\/root>/, "")
    line.gsub!(/<record>/, "")
    line.gsub!(/<\/record>/, "")
    line.gsub!("24-hr", "Singapore 24-hr")
    line.gsub!("<region>Region</region>", "")
    line.gsub!("<region>Sulphur Dioxide</region>", "")
    line.gsub!(/<region>/, "<title>")
    line.gsub!(/<\/region>/,":")
    line.gsub!(/<psi>/, " PSI Level ")
    line.gsub!(/<\/psi>/, "")
    line.gsub!(/<aqd>/, " - ")
    line.gsub!(/<\/aqd>/, "</title>")
    line.gsub!(/<item>/, "<item><pubDate>" + Time.now.rfc2822 + "</pubDate>")
    line.gsub!(/<\/item>/, "</item>\n")
    rssfile.print line
  end
end

rssfile.puts('')
rssfile.puts('</channel>')
rssfile.puts('</rss>')

rssfile.close




Hermann Martinelli

5/17/2007 10:44:00 AM


Hi Singeo (btw: is that your first name?),

Singeo wrote:
> ... I followed Sebastian's advice to use print instead of puts and
> that solved my problem.

Running out of time now, but I see that Sebastian
has given the correct answer already. I was on the
same track, which is why I was asking for the code
to see how you read and write the lines.

> But I would still like to understand how to
> remove the newline characters. Here's my code as it currently stands
> (with "rssfile.print line" in place of "rssfile.puts line"), hopefully
> it will help you see how I was trying to tackle the problem.

You may have done well removing the newlines (to be
more precise: the carriage returns or "\r" characters),
but then, when writing the lines with puts, you add a
newline back again.
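
The effect can be reproduced with a cut-down version of the read-strip-write loop. This is only a sketch: StringIO objects stand in for psiregions.xml and sgpsi.xml, and the XML fragment is invented for illustration.

```ruby
require 'stringio'

# Hypothetical stand-in for the contents of psiregions.xml.
input = StringIO.new("<item>\n  <region>North</region>\n</item>\n")

with_puts  = StringIO.new  # stands in for the file written with puts
with_print = StringIO.new  # stands in for the file written with print

while line = input.gets
  line = line.strip        # removes the newline (and the indentation)
  with_puts.puts line      # ... but puts writes a newline after each line
  with_print.print line    # print writes only the stripped text
end

p with_puts.string
p with_print.string
```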

Hermann