comp.lang.ruby

Parsing text with regular expression

Sebastian probst Eide

4/29/2007 8:43:00 PM

Hi,
I am writing a class that parses text. It checks each word and counts
how many times it occurs in the text. It also checks for 'special'
words, that is, words that are capitalized, all upper case, or in mixed
case, and adds a flag to those words. Words that are not special must
also meet a minimum length requirement. The information is
stored in a hash like this:

{'word' => {:count => 1, :special => false},
 'other_word' => {:count => 3, :special => true}}
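
For readers following along, here is a minimal sketch of how a hash of that shape could be built. It is not the poster's actual class: the method name count_words and the min_length default of 3 are assumptions made for illustration, and the scan pattern is the one quoted later in this post.

```ruby
# Hypothetical helper: count_words and min_length are illustrative names,
# not taken from the original class.
def count_words(text, min_length = 3)
  stats = {}
  text.scan(/\w{2,}[-\w]?/) do |word|
    # Any upper-case letter marks the word as special
    # (capitalized, all upper case, or mixed case).
    special = !(word =~ /[A-Z]/).nil?
    # Ordinary words must meet the minimum length.
    next if !special && word.length < min_length
    stats[word] ||= { :count => 0, :special => special }
    stats[word][:count] += 1
  end
  stats
end

stats = count_words("Ruby is fun and Ruby is fast")
# "is" is dropped (too short); "Ruby" is counted twice and flagged special.
```

Keying the hash on the word itself keeps "Ruby" and "ruby" as separate entries, which matters here because the capitalization is exactly what is being tracked.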

Everything is working fine so far. What I am struggling to
implement is the following:
I want to check the context in which the 'special' words appear, to see
whether a capitalized word is capitalized only because it is
the first word of a new sentence, or something like that.

I thought I could check by looking for something like this:

text =~ /[[:punct:]]\s?WORD_I_AM_LOOKING_FOR/
and if I got a match (=~ returns nil when nothing matches), it would mean
that the word is at the beginning of a sentence. But how do I insert a
variable into the regular expression? Or is there a different, much
cleverer way to do this sort of check?

Currently I am scanning for each word like this:

_inn.scan(/\w{2,}[-\w]?/i) do |word|
...
end

and then doing the checking of the words inside that iterator.
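
One possible way to do the context check inside that same iterator, sketched under assumptions: inside a scan block, $~ holds the MatchData for the current match, so $~.pre_match gives the text that precedes the word. The method name capitalized_mid_sentence and the simple "ends with . ! or ? plus whitespace" rule for a sentence boundary are illustrative choices, not part of the original code.

```ruby
# Hypothetical helper: returns the capitalized words that appear somewhere
# other than at the start of a sentence.
def capitalized_mid_sentence(text)
  found = []
  text.scan(/\w{2,}[-\w]?/) do |word|
    # Grab the text before this match first; later regex tests overwrite $~.
    before = $~.pre_match
    next unless word =~ /\A[A-Z]/
    # Assumed rule: a sentence starts at the beginning of the text or
    # right after ".", "!" or "?" followed by whitespace.
    at_sentence_start = before.empty? || !(before =~ /[.!?]\s+\z/).nil?
    found << word unless at_sentence_start
  end
  found.uniq
end

names = capitalized_mid_sentence("The cat met Alice. Then Alice left.")
# "The" and "Then" open sentences, so only "Alice" is reported.
```

A word that shows up capitalized at least once in mid-sentence is probably a genuinely special word; one that is only ever capitalized at a sentence start probably is not.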

Hope you have understood my problem and that you can point me in the
right direction.

best regards
Sebastian

--
Posted via http://www.ruby-....

2 Answers

Tim Hunter

4/29/2007 8:51:00 PM

Sebastian probst Eide wrote:
> I thought I could check by looking for something like this:
>
> text =~ /[[:punct:]]\s?WORD_I_AM_LOOKING_FOR/
> and if I got something else than 0 as a result it would mean that the
> word is in the beginning of a sentence. But how do I insert a variable
> into the regular expression?
Use #{} interpolation, like this:

word = "hello"

text =~ /[[:punct:]]\s?#{word}/

The contents of "word" are inserted as regular expression source, so
"word" can hold any regular expression.
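
A short sketch of the interpolation in practice. Because the interpolated string is treated as pattern source, a word containing characters such as "." or "+" would be read as regex syntax; Regexp.escape guards against that when a literal match is wanted. Note also that =~ returns the index of the match, or nil when there is none. The sample text and the added \b word boundary are assumptions for illustration.

```ruby
text = "It works. Hello world. Say hello again."

# Escape the word so any regex metacharacters in it match literally.
word = "Hello"
pattern = /[[:punct:]]\s?#{Regexp.escape(word)}\b/
position = text =~ pattern        # index of the "." that precedes "Hello"

# A word that never follows punctuation gives nil, not 0.
missing = text =~ /[[:punct:]]\s?#{Regexp.escape("world")}\b/

# Without escaping, "." in the word would match any character.
escaped = Regexp.escape("a.b")    # the dot is backslash-escaped
```

Testing the result against nil (or simply using it in an if) is more reliable than comparing with 0, since 0 is a valid match position.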

Sebastian probst Eide

4/29/2007 8:55:00 PM

Timothy Hunter wrote:
> Sebastian probst Eide wrote:
>> I thought I could check by looking for something like this:
>>
>> text =~ /[[:punct:]]\s?WORD_I_AM_LOOKING_FOR/
>> and if I got something else than 0 as a result it would mean that the
>> word is in the beginning of a sentence. But how do I insert a variable
>> into the regular expression?
> Use #{}, like this
>
> word = "hello"
>
> text =~ /[[:punct:]]\s?#{word}/
>
> "word" can be any regular expression.

Huh... that was the first thing I tried. I must have done something else
wrong in the same expression, because it didn't work. I'll try
again.
Thanks, Timothy

Sebastian
