comp.lang.ruby
Problem when removing accents from a String
César Díaz
4/12/2009 5:06:00 PM
Hi,
I've spent quite some time trying to replace accented Latin characters in strings.
What I want is something like this:
Given the word "camión", I would like to get the following string:
"camion".
I've found several solutions on the Internet, but they only work in the
Rails console. When I launch my Rails application on Mongrel
(I am using NetBeans) none of them work: from "camión" I get
"camin" instead of "camion".
For example, I've tried with these snippets:
1.*********************************
def nice_slug(str)
  accents = {
    ['á','à','â','ä','ã'] => 'a',
    ['Á','À','Â','Ä','Ã'] => 'A',
    ['é','è','ê','ë'] => 'e',
    ['É','È','Ê','Ë'] => 'E',
    ['í','ì','î','ï'] => 'i',
    ['Í','Ì','Î','Ï'] => 'I',
    ['ó','ò','ô','ö','õ'] => 'o',
    ['Ó','Ò','Ô','Ö','Õ'] => 'O',
    ['ú','ù','û','ü'] => 'u',
    ['Ú','Ù','Û','Ü'] => 'U',
    ['ç'] => 'c', ['Ç'] => 'C',
    ['ñ'] => 'n', ['Ñ'] => 'N'
  }
  accents.each do |ac,rep|
    ac.each do |s|
      str = str.gsub(s, rep)
    end
  end
  str = str.gsub(/[^a-zA-Z0-9 ]/,"")
  str = str.gsub(/[ ]+/," ")
  str = str.gsub(/ /,"-")
  str = str.downcase
end
2.*************************
"camión".parameterize
It throws the following error:
undefined method `normalize' for "cami�n:String
c:/ruby/lib/ruby/gems/1.8/gems/activesupport-2.2.2/lib/active_support/inflector.rb:283:in `transliterate'
c:/ruby/lib/ruby/gems/1.8/gems/activesupport-2.2.2/lib/active_support/inflector.rb:262:in `parameterize'
c:/ruby/lib/ruby/gems/1.8/gems/activesupport-2.2.2/lib/active_support/core_ext/string/inflections.rb:106:in `parameterize'
3.*************************************
"camión".mb_chars.decompose.scan(/[a-zA-Z0-9]/).join
It throws the following error:
undefined method `decompose' for "cami�n:String
I am using Ruby 1.8.6, Rails 2.2, Mongrel 1.1.5 and Windows XP.
Any help would be appreciated.
1 Answer
Axel Etzold
4/12/2009 5:58:00 PM
Dear César,
Have you also tried iconv (http://www.ruby-doc.org/stdlib/libdoc/iconv/rdoc/classes/...)?
A quick fix might be to use it to convert between the different, incompatible
encodings, ignoring the accents (which exist in the UTF-8 or ISO-8859-1 encoding, but not in the target ASCII encoding).
I can't try it out on Windows XP right now, but you can see some discussion and examples here:
http://www.ruby-forum.com/t...
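Iconv has not been part of the standard library since Ruby 2.0, so it cannot be shown running on a current interpreter. As a rough sketch of the same idea, assuming Ruby 2.2 or later rather than the 1.8.6 used in the question, the conversion to ASCII can be written with String#unicode_normalize and String#encode. The helper name to_ascii is hypothetical. The last lines show that dropping every character without an ASCII equivalent, with no decomposition first, gives the "camin" from the question.

```ruby
# encoding: utf-8

# Hypothetical helper (not from the thread): transliterate accented
# Latin letters to ASCII using only core String methods (Ruby >= 2.2).
def to_ascii(str)
  # NFD splits "ó" into "o" followed by a combining acute accent (U+0301).
  decomposed = str.unicode_normalize(:nfd)
  # Convert to US-ASCII and replace every character that has no ASCII
  # equivalent with "", which removes the combining accent only.
  decomposed.encode("US-ASCII", undef: :replace, replace: "")
end

puts to_ascii("camión")   # prints camion

# Without the decomposition step the whole "ó" is dropped.
dropped = "camión".encode("US-ASCII", undef: :replace, replace: "")
puts dropped              # prints camin
```

Letters that have no canonical decomposition (for example "ø") are removed entirely by this sketch, so it is only a starting point.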
Another option might be to use the htmlentities gem.
It gives you the names of the accented characters, and you can then
remove the accents in a small routine using regexps:
  require 'rubygems'
  require 'htmlentities'
  coder = HTMLEntities.new
  string = "<élan>"
  res = coder.encode(string, :named) # => "&lt;&eacute;lan&gt;"
and replace the name of the accent with a regexp:
  res.gsub(/&(.)(acute|grave|circ|tilde|cedil|ring|slash);/,'\1')
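To see what the regexp above does on its own, here is a small sketch that runs without the gem. The entity-encoded input is hard-coded, the helper name strip_accent_entities is hypothetical, and the "uml" alternative (as in &uuml;) is an addition for this sketch that the original regexp does not have.

```ruby
# Accent names from the regexp above; "uml" is an addition for this
# sketch, not part of the original answer.
ACCENT_ENTITY = /&([A-Za-z])(acute|grave|circ|tilde|cedil|ring|slash|uml);/

# Hypothetical helper: replace each accent entity by its base letter.
def strip_accent_entities(encoded)
  encoded.gsub(ACCENT_ENTITY, '\1')
end

# Input as coder.encode(string, :named) returns it for the example
# string above, hard-coded so the sketch does not need the gem.
puts strip_accent_entities("&lt;&eacute;lan&gt;")   # prints &lt;elan&gt;
puts strip_accent_entities("cami&oacute;n")         # prints camion
```

Entities that are not accents, such as &lt; and &gt;, are left untouched, so the result still has to go through coder.decode to get plain text back.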
This works well for the western and northern European languages,
but it fails for some accents of central and eastern European languages, such as the Polish ogoneks (http://en.wikipedia.org/w...)
and the Czech carons (http://en.wikipedia.org/...) ...
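One possible way around that limitation, again assuming Ruby 2.2 or later and not something tested on the 1.8.6 setup from the question, is to decompose the string and delete the combining marks, which also covers ogoneks and carons. The helper name strip_marks is hypothetical.

```ruby
# encoding: utf-8

# Hypothetical helper: remove combining marks after canonical decomposition.
def strip_marks(str)
  # \p{Mn} matches nonspacing marks such as the acute accent, ogonek and
  # caron, which NFD separates from their base letters.
  str.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
end

puts strip_marks("Dvořák")   # prints Dvorak
puts strip_marks("pięć")     # prints piec
puts strip_marks("Łódź")     # prints Łodz
```

Letters such as "Ł" have no decomposition into a base letter plus a mark, so they pass through unchanged and would still need an explicit mapping.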
Best regards,
Axel