Austin Ziegler
10/14/2004 1:42:00 PM
On Thu, 14 Oct 2004 21:22:43 +0900, Alexey Verkhovsky <alex@verk.info> wrote:
> Hi all,
>
> I am writing some sort of BBS in Ruby (on Rails). I downloaded and
> included RedCloth for template rendering (in 5 lines of code and 15
> lines of test - wow!). It's cool, but it allows any HTML to be included.
>
> Now, I don't want to let some kiddie include some <javascript/> that
> would make an innocent BBS thread pop 50 new browsers - no matter how
> cool it might seem.
>
> I wonder if there is any existing code to sanitize user inputs by
> replacing dangerous HTML tags (like the aforementioned <javascript/>),
> that I could use with RedCloth to alleviate this risk.
>
> Ditto for plain text inputs (user names, subjects and other such).
I'm doing some work on Ruwiki, currently in CVS, that covers this. At
the moment it covers it too well, but it does cover it. (I just fixed
this.)
# Find HTML tags
SIMPLE_TAG_RE = %r{<[^<>]+?>} # Ensure that only the tag is grabbed.
HTML_TAG_RE   = %r{\A<        # Tag must be at start of match.
                   (/)?       # Closing tag?
                   ([\w:]+)   # Tag name
                   (?:\s+     # Space
                    ([^>]+)   # Attributes
                    (/)?      # Singleton tag?
                   )?         # The above three are optional
                   >}x
ATTRIBUTES_RE = %r{([\w:]+)(=(?:\w+|"[^"]+?"|'[^']+?'))?}x
ALLOWED_ATTR  = %w(style title type lang dir class id cite datetime abbr) +
                %w(colspan rowspan compact start media)
ALLOWED_HTML  = %w(abbr acronym address b big blockquote br caption cite) +
                %w(code col colgroup dd del dfn dir div dl dt em h1 h2 h3) +
                %w(h4 h5 h6 hr i ins kbd li menu ol p pre q s samp) +
                %w(small span strike strong style sub sup table tbody) +
                %w(td tfoot th thead tr tt u ul var)
# Clean the content of unsupported HTML and attributes. This includes
# XML namespaced HTML. Sorry, but there's too much possibility for
# abuse.
def clean(content)
  content = content.gsub(SIMPLE_TAG_RE) do |tag|
    tagset = HTML_TAG_RE.match(tag)
    if tagset.nil?
      # Not a recognisable tag: escape it so that it is shown as text.
      tag = Ruwiki.clean_entities(tag)
    else
      closer, name, attributes, single = tagset.captures
      if ALLOWED_HTML.include?(name.downcase)
        unless closer or attributes.nil?
          # Keep only the attributes on the allowed list.
          attributes = attributes.scan(ATTRIBUTES_RE).map do |set|
            if ALLOWED_ATTR.include?(set[0].downcase)
              set.join
            else
              nil # Dropped by compact below.
            end
          end.compact.join(" ")
          tag = "<#{closer}#{name} #{attributes}#{single}>"
        else
          tag = "<#{closer}#{name}>"
        end
      else
        # Tag is not on the allowed list: escape it.
        tag = Ruwiki.clean_entities(tag)
      end
    end
    tag
  end
end
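To see what the two expressions hand back for a single tag, here is a small self-contained sketch. HTML_TAG_RE and ATTRIBUTES_RE are copied from above; the helper name kept_attributes, the sample tag, and the short allowed list are made up for illustration and are not part of Ruwiki.

```ruby
# Copies of the two expressions from the post, so this runs on its own.
HTML_TAG_RE   = %r{\A<        # Tag must be at start of match.
                   (/)?       # Closing tag?
                   ([\w:]+)   # Tag name
                   (?:\s+     # Space
                    ([^>]+)   # Attributes
                    (/)?      # Singleton tag?
                   )?         # The above three are optional
                   >}x
ATTRIBUTES_RE = %r{([\w:]+)(=(?:\w+|"[^"]+?"|'[^']+?'))?}x

# Hypothetical helper: returns the attributes of +tag+ whose names appear
# in +allowed+, or nil if +tag+ does not look like a tag at all.
def kept_attributes(tag, allowed)
  tagset = HTML_TAG_RE.match(tag)
  return nil if tagset.nil?
  _closer, _name, attributes, _single = tagset.captures
  # scan yields one [name, "=value"] pair per attribute found.
  attributes.to_s.scan(ATTRIBUTES_RE).select do |set|
    allowed.include?(set[0].downcase)
  end.map { |set| set.join }
end

sample = '<p class="x" onclick="evil()">'
p HTML_TAG_RE.match(sample).captures
p kept_attributes(sample, %w(class title))
```

The first line printed shows the four captures (closer, name, attribute string, singleton marker); the second shows that only the attribute on the allowed list survives.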
Ruwiki.clean_entities converts all instances of & => &amp;, < => &lt;,
and > => &gt;.
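For anyone without the Ruwiki source to hand, a helper along these lines might look like the sketch below. This is an assumption about its behaviour based only on the description above, not the actual Ruwiki code; the names ENTITY_MAP and escape_entities are invented for the example.

```ruby
# Hypothetical stand-in for Ruwiki.clean_entities; not the real implementation.
ENTITY_MAP = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

def escape_entities(text)
  # One pass over the string, so the "&" that starts "&lt;" is not
  # escaped a second time.
  text.gsub(/[&<>]/) { |ch| ENTITY_MAP[ch] }
end

puts escape_entities("<script>alert('x & y')</script>")
```

Doing the replacement in a single gsub avoids the ordering problem that comes with three chained substitutions, where escaping & last would mangle the entities just produced.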
-austin
--
Austin Ziegler * halostatue@gmail.com
* Alternate: austin@halostatue.ca
: as of this email, I have [ 5 ] Gmail invitations