comp.lang.ruby

Re: uniforma-0.0.1 - converter for text formats

Victor 'Zverok' Shepelev

9/17/2007 5:05:00 PM

From: micathom [mailto:micathom@gmail.com]
Sent: Monday, September 17, 2007 10:50 AM
>
>> Library for parsing "simple text" formats (RD, Textile, Markdown, etc.) and
>> generating output in various formats (including simple text, html/xml and
>> more complex ones).
>
>I wrote deplate[1], which has similar goals (well, with the exception of
>source quality maybe ;-).
>
>The point here is of course that simple formats are easy to parse, so
>the question is how simple do you mean with "simple".

I meant those whose parsers can be defined with some simple common DSL :)

>
>If "simple" includes cross references, footnotes, endnotes, headers,
>footers, table of contents/tables/figures etc., I think you'll probably
>need:
>
> - a general way to define counters and lists

Yep, some of this already exists; more will follow.

> - some notion of metadata (like index, footnotes, labels, section
> names etc.)

yep, planned.

> - make it possible to locate text at some random position in the
> output document (eg for headers & footers), e.g. move text to the
> top of the document, after packages are required but before the
> start of the body etc. deplate defines "slots" for this which
> allows users to place the element at any position they want.

If I understand you correctly, Uniforma's generators currently do this with a
"hack" (placing the first "heading" paragraph in the generated HTML <title> tag):

lib\uniforma\generators\html.rb (lines 6-17):

---
pre(:document) do |document|
  title = document.find_first(:heading)
  if title
    %Q{
      <html>
      <head><title>#{title.text}</title></head>
      <body>
    }
  else
    %Q{<html><body>\n}
  end
end
---

So I'm not thinking about "random" positions, only about
pre(:document) and post(:document) actions, which have access to the whole
document.

This approach seems natural enough to me.
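
To make the pre/post idea concrete, here is a minimal standalone sketch. It is not Uniforma's actual code: Node, Document, SketchGenerator and the on() method are hypothetical names made up for this illustration; only pre(:document), post(:document) and find_first(:heading) come from the snippet above. Text is not HTML-escaped, to keep the sketch short.

```ruby
# Hypothetical sketch, NOT Uniforma's real implementation: hooks registered
# with pre(:document) / post(:document) receive the whole document.
Node = Struct.new(:type, :text)

class Document
  attr_reader :nodes

  def initialize(nodes)
    @nodes = nodes
  end

  # Return the first node of the given type, or nil if there is none.
  def find_first(type)
    @nodes.find { |node| node.type == type }
  end
end

class SketchGenerator
  def initialize
    @pre = {}
    @post = {}
    @rules = {}
  end

  # Register a block that runs before any node output.
  def pre(type, &block)
    @pre[type] = block
  end

  # Register a block that runs after all node output.
  def post(type, &block)
    @post[type] = block
  end

  # Register a per-node rendering rule (hypothetical helper).
  def on(type, &block)
    @rules[type] = block
  end

  def generate(document)
    out = String.new
    out << @pre[:document].call(document) if @pre[:document]
    document.nodes.each do |node|
      rule = @rules[node.type]
      out << rule.call(node) if rule
    end
    out << @post[:document].call(document) if @post[:document]
    out
  end
end

html = SketchGenerator.new
html.pre(:document) do |document|
  title = document.find_first(:heading)
  if title
    "<html>\n<head><title>#{title.text}</title></head>\n<body>\n"
  else
    "<html><body>\n"
  end
end
html.on(:heading)   { |node| "<h1>#{node.text}</h1>\n" }
html.on(:paragraph) { |node| "<p>#{node.text}</p>\n" }
html.post(:document) { |_document| "</body></html>\n" }

doc = Document.new([Node.new(:heading, "Uniforma"), Node.new(:paragraph, "Hello")])
puts html.generate(doc)
```

Because the hook gets the whole document, it can pull the heading into <title> before any body output is written, which covers the header/footer case without a general "slot" mechanism.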

> - on the long (or intermediate-distance) run, you might also think
> of some plugin-mechanism (e.g. e-mail obfuscation that may be
> loaded when converting the document without being hard-coded,
> although this could also be done by post-processing the output).

Yep. I've thought about a syntax like this (a "plug-in" that rewrites some URLs):

Uniforma.textile('mydocument.text').to_html do
  rewrite(:href, %r{http://somesite\.com}) do |url|
    url.gsub(/somesite/, 'othersite')
  end
end
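
One way such a block-based rule could work underneath, as a rough sketch only: the RewriteRules class and its apply method are hypothetical and not part of Uniforma; only the rewrite(:href, pattern) { |url| ... } call shape is taken from the proposal above.

```ruby
# Hypothetical sketch: collecting rewrite rules from a block and applying
# them to attribute values. RewriteRules is an invented name, not Uniforma API.
class RewriteRules
  def initialize(&block)
    @rules = []
    # Evaluate the block on this object so a bare `rewrite(...)` call works.
    instance_eval(&block) if block
  end

  # Register a rule: values of `attribute` matching `pattern` go through the block.
  def rewrite(attribute, pattern, &block)
    @rules << [attribute, pattern, block]
  end

  # Apply every matching rule, in registration order, to one attribute value.
  def apply(attribute, value)
    @rules.reduce(value) do |current, (attr, pattern, handler)|
      if attr == attribute && pattern.match?(current)
        handler.call(current)
      else
        current
      end
    end
  end
end

rules = RewriteRules.new do
  rewrite(:href, %r{http://somesite\.com}) do |url|
    url.gsub(/somesite/, 'othersite')
  end
end

puts rules.apply(:href, 'http://somesite.com/page')  # prints http://othersite.com/page
puts rules.apply(:href, 'http://example.org/')       # prints http://example.org/ (unchanged)
```

A generator would then call apply(:href, ...) whenever it emits a link, so the plug-in stays outside the generator code itself.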

>> * non-line based formats parsers (in fact, it also has one "toy" parser for
>> HTML, which even works! on not-very-complex HTML documents)
>
>From a pragmatic point of view, using hpricot and writing and map
>classes on its output could be the better strategy.

From a pragmatic point of view, I already had HTMLSax-related code
before I started working on Uniforma :)

I'll think about converting this part to use Hpricot, but later, once the core
(parsers and generators) works well.

V.