yermej
5/1/2008 2:32:00 PM
On May 1, 3:41 am, Brubix <bruno.bazz...@tin.it> wrote:
> I have some text documents containing series of questions and answers.
> I need to extract all questions and answers to load a database.
> No problem in reading the text document or writing to the database.
> I need help with the regexps to parse the document (please note that
> some questions and answers extend on two or more lines).
>
> This is an example of the documents I have to deal with:
qa = "*1) Is this the first question ?
a. Yes
b. No
*2) Is this question composed
of two lines ?
a. Yes, indeed
b. Maybe
c. I dont' know"
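# Group 1: a question, from "*N)" through the first "?" (/m lets "." cross
# newlines). Group 2: everything up to the next "*", i.e. the answer block.
qa.scan(/(\*\d+\).*?\?.*?)([^*]*)/m).map do |q, ans|
  # Each answer is a line that starts with a lowercase letter and a period.
  [q, ans.scan(/\n([a-z]\..*)/).flatten]
end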
=> [["*1) Is this the first question ?", ["a. Yes", "b. No"]],
    ["*2) Is this question composed\nof two lines ?",
     ["a. Yes, indeed", "b. Maybe", "c. I dont' know"]]]
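That assumes '*' appears only at the start of a question, that each
question ends at its first '?', and that every answer fits on a single
line starting with a lowercase letter and a period. An answer that wraps
onto a second line will be cut off after its first line, but it should
be a start.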
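
If answers can also wrap, one option is to go line by line instead of
using a single regexp. This is only a rough sketch, not tested against
your real documents: parse_qa is a name I made up, the sample text with
the wrapped answer is invented, and it assumes a continuation line never
starts with "*N)" or with a lowercase letter followed by a period.

```ruby
# Hypothetical helper (not part of the original reply): walks the text
# line by line and folds continuation lines into the previous question
# or answer.
def parse_qa(text)
  questions = []
  text.each_line do |line|
    line = line.strip
    next if line.empty?

    if line =~ /\A\*\d+\)/
      # "*N)" starts a new question.
      questions << { question: line, answers: [] }
    elsif questions.empty?
      # Ignore anything that appears before the first question.
      next
    elsif line =~ /\A[a-z]\./
      # "a.", "b.", ... starts a new answer for the current question.
      questions.last[:answers] << line
    else
      # Continuation line: extend the last answer if there is one,
      # otherwise extend the question text.
      current = questions.last
      target = current[:answers].empty? ? current[:question] : current[:answers].last
      target << " " << line
    end
  end
  questions
end

# Invented sample: question 2 and its answer "c." both span two lines.
sample = <<~QA
  *1) Is this the first question ?
  a. Yes
  b. No
  *2) Is this question composed
  of two lines ?
  a. Yes, indeed
  b. Maybe
  c. I dont' know
  for sure
QA

p parse_qa(sample)
```

Wrapped lines are joined with a single space here, so "of two lines ?"
ends up on the same line as the rest of question 2 and "for sure" is
appended to answer c. Swap the " " for "\n" if you want to keep the
original line breaks.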