comp.lang.ruby

Problem when removing accents from a String

César Díaz

4/12/2009 5:06:00 PM

Hi,

I've spent a lot of time trying to replace accented Latin characters in strings.
What I want is something like this:

Given the word "camión", I would like to get the following string:
"camion".

I've found several solutions on the Internet, but they only work in the
Rails console. When I launch my Rails application on Mongrel
(I am using NetBeans), none of them works: from "camión" I get
"camin" instead of "camion".

For example, I've tried these snippets:


1.*********************************
def nice_slug(str)
  accents = {
    ['á','à','â','ä','ã'] => 'a',
    ['Ã','Ä','Â','À','Á'] => 'A',
    ['é','è','ê','ë'] => 'e',
    ['Ë','É','È','Ê'] => 'E',
    ['í','ì','î','ï'] => 'i',
    ['Í','Î','Ì','Ï'] => 'I',
    ['ó','ò','ô','ö','õ'] => 'o',
    ['Õ','Ö','Ô','Ò','Ó'] => 'O',
    ['ú','ù','û','ü'] => 'u',
    ['Ú','Û','Ù','Ü'] => 'U',
    ['ç'] => 'c', ['Ç'] => 'C',
    ['ñ'] => 'n', ['Ñ'] => 'N'
  }
  accents.each do |ac,rep|
    ac.each do |s|
      str = str.gsub(s, rep)
    end
  end
  str = str.gsub(/[^a-zA-Z0-9 ]/,"")
  str = str.gsub(/[ ]+/," ")
  str = str.gsub(/ /,"-")
  str = str.downcase
end


2.*************************
"camión".parameterize

It throws the following error:

undefined method `normalize' for "camión":String

c:/ruby/lib/ruby/gems/1.8/gems/activesupport-2.2.2/lib/active_support/inflector.rb:283:in `transliterate'
c:/ruby/lib/ruby/gems/1.8/gems/activesupport-2.2.2/lib/active_support/inflector.rb:262:in `parameterize'
c:/ruby/lib/ruby/gems/1.8/gems/activesupport-2.2.2/lib/active_support/core_ext/string/inflections.rb:106:in `parameterize'

3.*************************************
"camión".mb_chars.decompose.scan(/[a-zA-Z0-9]/).join

It throws the following error:

undefined method `decompose' for "camión":String


I am using Ruby 1.8.6, Rails 2.2, Mongrel 1.1.5 and Windows XP.


Any help would be appreciated.

1 Answer

Axel Etzold

4/12/2009 5:58:00 PM


Dear César,

Have you also tried iconv (http://www.ruby-doc.org/stdlib/libdoc/iconv/rdoc/classes/... )?
A quick fix might be to use it to convert from an encoding that has accented
characters (UTF-8 or ISO-8859-1) to one that does not (ASCII), dropping the accents along the way.

I can't try it out on Windows XP right now, but you can see some discussion and examples here:

http://www.ruby-forum.com/t...
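
For readers on a newer Ruby, where iconv is no longer part of the standard library, the same "transcode to ASCII and drop what does not fit" idea can be sketched with String#encode. This is only a sketch under assumptions: it needs Ruby 2.2 or later (for String#unicode_normalize), the helper name strip_accents_via_transcoding is made up for illustration, and it is not the iconv call discussed in the linked thread. The strings use \u and \x escapes so the result does not depend on how the source file itself is encoded.

```ruby
# Hypothetical helper (not from the thread): transcode +str+ from the
# encoding named by +from+ to plain ASCII, dropping what ASCII lacks.
def strip_accents_via_transcoding(str, from = 'UTF-8')
  # Tag the bytes with their real encoding, then convert to UTF-8.
  utf8 = str.dup.force_encoding(from).encode('UTF-8')
  # NFD splits an accented letter into base letter + combining mark,
  # e.g. U+00F3 becomes "o" followed by U+0301.
  decomposed = utf8.unicode_normalize(:nfd)
  # Combining marks have no ASCII equivalent, so they are replaced by "".
  decomposed.encode('US-ASCII', undef: :replace, invalid: :replace, replace: '')
end

puts strip_accents_via_transcoding("cami\u00F3n")                # camion
puts strip_accents_via_transcoding("cami\xF3n".b, 'ISO-8859-1')  # camion
```

Without the decomposition step the accented letter would be dropped as a whole, which gives "camin" rather than "camion".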

Another option might be to use the HTML entities gem.
It gives you the entity names of the accented characters, and you can then
remove the accents in a small routine using regexps:

require 'rubygems'
require 'htmlentities'
coder = HTMLEntities.new
string = "<élan>"
res = coder.encode(string, :named) # => "&lt;&eacute;lan&gt;"

Then strip the accent name from each entity with a regexp, keeping only the base letter:

res = res.gsub(/&(.)(acute|grave|circ|tilde|cedil|ring|slash|uml);/,'\1')
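
The regexp step can be tried on its own, without installing the gem, by feeding it the encoded string shown in the comment above. A small sketch, with two assumptions: the constant name ACCENT_ENTITY is made up for illustration, and the pattern here also lists `uml`, so that entities such as &ouml; are covered.

```ruby
# Matches a named entity made of one base letter plus an accent name,
# e.g. "&eacute;", and captures the base letter.
ACCENT_ENTITY = /&([A-Za-z])(acute|grave|circ|tilde|cedil|ring|slash|uml);/

# Input copied from the comment in the snippet above.
encoded = "&lt;&eacute;lan&gt;"

# "&lt;" and "&gt;" are left alone: "t;" is not one of the accent names.
stripped = encoded.gsub(ACCENT_ENTITY, '\1')
puts stripped  # &lt;elan&gt;
```

The remaining &lt; and &gt; still have to be decoded afterwards, for instance with the gem's own decode method.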

This works well for the Western and Northern European languages,
but it fails for some accents of Central and Eastern European languages, such as the Polish ogoneks (http://en.wikipedia.org/w...) and the Czech carons (http://en.wikipedia.org/...).
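
For the ogonek and caron cases, one possible alternative is to skip the entity names and strip Unicode combining marks directly. The sketch below assumes Ruby 2.2 or later and UTF-8 input; the helper name strip_combining_marks is hypothetical. Letters that have no canonical decomposition, such as the Polish "ł", pass through unchanged and would still need a hand-written mapping.

```ruby
# Hypothetical helper: decompose each character (NFD), then delete the
# combining marks (Unicode category Mn) that the decomposition exposes.
def strip_combining_marks(str)
  str.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
end

puts strip_combining_marks("m\u0105\u017C")    # Polish word with ogonek and dot above: maz
puts strip_combining_marks("\u010Desk\u00FD")  # Czech word with caron and acute: cesky
puts strip_combining_marks("\u0142") == "\u0142"  # true: l with stroke does not decompose
```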


Best regards,

Axel