comp.lang.ruby

text wrap

ishamid

12/9/2006 11:07:00 PM

Hi,

I have a ruby script that's doing what I need, and I would like to add
one feature. The script outputs a text file, and I would like Ruby to
take the final text file and wrap it at 67 characters, instead of doing
it manually in the editor.

Is there an available method or something that I can invoke? If so, can
you give me an example of its use (I'm still very much a novice!)

Best
Idris

5 Answers

Paul Lutus

12/9/2006 11:31:00 PM


ishamid wrote:

> Hi,
>
> I have a ruby script that's doing what I need, and I would like to add
> one feature. The script outputs a text file, and I would like Ruby to
> take the final text file and wrap it at 67 characters, instead of doing
> it manually in the editor.
>
> Is there an available method or something that I can invoke? If so, can
> you give me an example of its use (I'm still very much a novice!)

Idris, I can offer various methods to break text up into individual lines,
but I want to ask you to think this over before doing it.

If you take a text document containing paragraphs (each a continuous string
with no embedded linefeeds) and break the paragraphs into lines of specific
lengths, you are throwing away information that is very difficult -- almost
impossible in some cases -- to recover.

Most environments can produce nicely formatted paragraphs for you, while you
are editing the text and while printing it. It is generally preferred that
the paragraph-to-line conversion take place at the time of display or
printing, not before, and that the text be retained in its original form.

I am not unaware of the irony of posting this advice in a medium (Usenet)
that breaks paragraphs into lines right away, before transmitting the
message, and this trait is shared by the standard e-mail protocol. Both
these behaviors (i.e. Usenet and e-mail breaking up paragraphs) are now
widely recognized as mistakes; unfortunately, they cannot really be
corrected at this late date. But newer protocols, and all decent word
processing document formats, do not do this to their content, for excellent
reasons.

Consider this. Let's say you break a document up into 67-character lines,
permanently, and save it in that form. Later on, you discover you need to
print the document with 80-character lines. You are out of luck -- the
damage has been done.

There are various schemes to rejoin broken lines into paragraphs once again,
but all of them have corner cases where they fail. It is generally agreed
to be better not to have broken the paragraphs in the first place.
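
To make the point above concrete, here is a minimal sketch of how the paragraph structure gets lost. The helpers `hard_wrap` and `naive_unwrap` are hypothetical names invented for this illustration, not anything from a library, and the 20-character width is chosen only to keep the sample short.

```ruby
# Hypothetical helpers, for illustration only.

# Greedy word wrap: break one paragraph into lines of at most `width`
# characters (a single word longer than `width` is left on its own line).
def hard_wrap(paragraph, width)
  lines = []
  paragraph.split.each do |word|
    if lines.empty? || lines.last.length + 1 + word.length > width
      lines << word.dup                 # start a new line
    else
      lines.last << " " << word         # word still fits on this line
    end
  end
  lines.join("\n")
end

# Naive rejoin: assume every newline was inserted by the wrapper.
def naive_unwrap(text)
  text.gsub("\n", " ")
end

# Two paragraphs, each one continuous line, separated by a single linefeed.
original = "The quick brown fox jumps over the lazy dog.\n" \
           "A second paragraph follows here."

wrapped  = original.split("\n").map { |para| hard_wrap(para, 20) }.join("\n")
restored = naive_unwrap(wrapped)

puts wrapped
p restored == original   # prints false: the paragraph break is gone
```

After wrapping, the linefeed that separated the two paragraphs looks exactly like the linefeeds the wrapper inserted, so the naive rejoin merges everything into one paragraph.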

If, after reading this, you still want to break paragraphs into lines, post
again, and someone will offer suggestions about how to proceed.

--
Paul Lutus
http://www.ara...

ishamid

12/10/2006 12:15:00 AM


Hi Paul,

On Dec 9, 4:31 pm, Paul Lutus <nos...@nosite.zzz> wrote:
> ishamid wrote:
> > Hi,
>
> > I have a ruby script that's doing what I need, and I would like to add
> > one feature. The script outputs a text file, and I would like Ruby to
> > take the final text file and wrap it at 67 characters, instead of doing
> > it manually in the editor.
>
> > Is there an available method or something that I can invoke? If so, can
> > you give me an example of its use (I'm still very much a novice!)

> Idris, I can offer various methods to break text up into individual lines,
> but I want to ask you to think this over before doing it.

Ok, see below...

> If you take a text document containing paragraphs (each a continuous string
> with no embedded linefeeds) and break the paragraphs into lines of specific
> lengths, you are throwing away information that is very difficult -- almost
> impossible in some cases -- to recover.

> Most environments can produce nicely formatted paragraphs for you, while you
> are editing the text and while printing it. It is generally preferred that
> the paragraph-to-line conversion take place at the time of display or
> printing, not before, and that the text be retained in its original form.

Perhaps my case is different. I am converting OOo xml to TeX for
further processing. Since TeX input is one-dimensional, TeX-the-engine
does not, afaik, care if the input is wrapped or not. I generally edit
my TeX-documents wrapped and editing a mile-long paragraph is a pain.
If I run the script and wrap from the editor then, when I run the
script again, I have to wrap from the editor again.

Does this case seem like an exception to your point or no?

>
> I am not unaware of the irony of posting this advice in a medium (Usenet)
> that breaks paragraphs into lines right away, before transmitting the
> message, and this trait is shared by the standard e-mail protocol. Both
> these behaviors (i.e. Usenet and e-mail breaking up paragraphs) are now
> widely recognized as mistakes; unfortunately, they cannot really be
> corrected at this late date. But newer protocols, and all decent word
> processing document formats, do not do this to their content, for excellent
> reasons.
>
> Consider this. Let's say you break a document up into 67-character lines,
> permanently, and save it in that form. Later on, you discover you need to
> print the document with 80-character lines. You are out of luck -- the
> damage has been done.

But TeX does not care about line lengths in an editor; it just
processes a single paragraph as it's told, ignoring wrapping
completely.

I notice that David Kastrup posts here; he is an expert on TeX
text-editing and can correct me if I'm wrong.

> There are various schemes to rejoin broken lines into paragraphs once again,
> but all of them have corner cases where they fail. It is generally agreed
> to be better not to have broken the paragraphs in the first place.
>
> If, after reading this, you still want to break paragraphs into lines, post
> again, and someone will offer suggestions about how to proceed.

Do my reasons make more sense now, or should I consider something else?

Thank you so much for the care you put into answering my question; it
is appreciated!

Best
Idris

Paul Lutus

12/10/2006 12:37:00 AM


ishamid wrote:

/ ...

> Perhaps my case is different. I am converting OOo xml to TeX for
> further processing.

Do you mean an OpenOffice content.xml file, unpacked from a .odt document?
In that case, why not just insert some conveniently placed linefeeds to put
the tags on separate lines?

It happens I do this regularly, because I am constantly playing with
OpenOffice XML content. Here is my script to beautify content.xml files:

-------------------------------------------

#!/usr/bin/ruby -w

# beautifies an XML file, usually only for analysis,
# because splitting one up and indenting its lines may
# make it unusable to the originating program.

def beautifyXML(data)
  tab = 0
  xml = ""
  data.gsub!(%r{<},"\n<")
  data.gsub!(%r{>},">\n")
  data.gsub!(%r{\n+},"\n")
  data.split("\n").each { |record|
    record.strip!
    outc = record.scan(%r{(</|/>)}).length
    inc = record.scan(%r{<\w}).length
    net = inc - outc
    tab += (net < 0)?net:0
    xml += (" " * tab) + record + "\n"
    tab += (net > 0)?net:0
  }
  if(tab != 0)
    $stderr.puts "Error: tag mismatch: #{tab}"
  end
  xml
end

# stream in/stream out

print beautifyXML(readlines.join("\n"))

-------------------------------------------

I want to emphasize this is a quickie script, not a thoroughly tested
application, and its sole purpose is to aid in understanding the syntax of
an XML file. I should also say that it may not work if the output is put
back into an OpenOffice document after beautification.

To use the script:

$ (script name) < content.xml > output.xml

This idea of beautifying an XML file so it is easier to understand and
process is quite different than your original inquiry. In many cases, XML
files are perfectly fine after they have been indented and made readable,
and nothing is lost.

But ... between some of the XML tag pairs emitted by this script may be
long, long lines of text. Those lines should not be broken up without first
considering the implications (the essence of my first post).

Also, one defect in the above script is that the first line of the XML file,
the XML identifying header, does not end up on the first line of the output.
This is something I don't care about, because I don't try to use the output
of this script for anything but analysis. This error should be easy to fix
if you plan to use the output as legitimate XML.
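
One possible way to address that defect, as a sketch only: drop the empty records that the linefeed insertion creates, so that the XML header stays on the first line. The name `beautify_xml` and the choice to discard blank records are assumptions made for this illustration, not the author's fix. Like the script above, it strips whitespace and is meant for analysis rather than round-tripping.

```ruby
# Sketch of a variant of the script above. `beautify_xml` is a hypothetical
# name, and skipping empty records is an assumed fix, not the author's.
def beautify_xml(data)
  tab = 0
  xml = +""
  # Put every tag on its own line (without modifying the caller's string).
  records = data.gsub("<", "\n<").gsub(">", ">\n").split("\n")
  # Drop blank records so the XML declaration stays on line one.
  records.map(&:strip).reject(&:empty?).each do |record|
    outc = record.scan(%r{</|/>}).length   # closing or self-closing tags
    inc  = record.scan(/<\w/).length       # opening tags
    net  = inc - outc
    tab += net if net < 0                  # dedent before a closing tag
    xml << (" " * tab) << record << "\n"
    tab += net if net > 0                  # indent after an opening tag
  end
  warn "Error: tag mismatch: #{tab}" unless tab.zero?
  xml
end

sample = %(<?xml version="1.0"?><a><b>text</b><c/></a>)
puts beautify_xml(sample)
```

As with the original, tag counting is purely textual, so attribute values that themselves contain "<" or "/>" would confuse the indentation.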

--
Paul Lutus
http://www.ara...

Daniel Finnie

12/10/2006 1:05:00 AM


text = whatever you need to wrap

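To print it:
# String#[] takes (start, length), so the second argument is just 67.
# Note that this cuts every 67 characters, even in the middle of a word.
0.step(text.length - 1, 67) {|x| puts text[x, 67]}

To put it in another variable:
wrapped = ""
0.step(text.length - 1, 67) {|x| wrapped << text[x, 67] << "\n"}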

Dan


William James

12/10/2006 1:42:00 AM



X = 10 # Width is 10 characters.
str =
"This\nis a test of the emergency broadcasting servicings I
asseverate"
p str.gsub(/\n/," ").scan(/\S.{0,#{X-2}}\S(?=\s|$)|\S+/)
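
The one-liner above can be unpacked into a small helper that wraps on word boundaries and keeps paragraph breaks. This is only a sketch: the method name `wrap`, the default width of 67, and the assumption that paragraphs are separated by blank lines are all choices made for this illustration, not part of the post.

```ruby
# Hypothetical helper built around the regex from the post above.
# Assumes width >= 2 and that blank lines separate paragraphs.
def wrap(text, width = 67)
  # First alternative: a run of at most `width` characters that starts and
  # ends on a non-space and is followed by whitespace or the end of a line.
  # Second alternative: a single word too long to fit, kept whole.
  pattern = /\S.{0,#{width - 2}}\S(?=\s|$)|\S+/
  text.split(/\n{2,}/).map { |para|
    # Join any existing line breaks, then re-break at the new width.
    para.gsub(/\n/, " ").scan(pattern).join("\n")
  }.join("\n\n")
end

str = "This\nis a test of the emergency broadcasting servicings I\nasseverate"
puts wrap(str, 10)
```

To apply it to the script's output, something along the lines of `File.write(name, wrap(File.read(name)))` should do, where `name` stands for whatever file the script produces. Because single linefeeds are joined before re-breaking, running it again on already wrapped text gives the same result, which fits the run-the-script-repeatedly workflow described earlier in the thread.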