comp.lang.ruby

excessively verbose request for help with regex and arrays

Simon Schuster

8/16/2007 4:09:00 AM

text = "(20:29:55) awhilewhileaway: I also need to assemble the
cover/back, and figure out the innards of the aimlog
formatting/keyword searches"

what I want is essentially 3 fields, the text itself, the speaker
(stripped of the ":") and the date information, however as far as the
date information goes, it will be part of a short-stepped process,
which will only need to reference the previous one, so all data can
keep overwriting within two variables, as in: time_since_last -
time_current = time_it_took ... I think. I'm new to programming and
left math in highschool, so it's a weird (but very fun) place for my
mind to be. :) still working it out.

so I would like two of the fields of this array to be hashes..
array[0] being a hash and having a numerical value, array[1] being a
hash and having personA or personB value, and then array[2] being a
string. that works, I think...?

wow! I had no idea I knew this much when I started the e-mail. :) any
hints/solutions for me to play around with? it's the parentheses of
the regex that kind of has me stuck, mostly, as well as how to deal
with clock arithmetic when it rolls over at midnight, I foresee that
being confusing.

6 Answers

Eric Hodel

8/16/2007 4:25:00 AM


On Aug 15, 2007, at 21:08, Simon Schuster wrote:
> text = "(20:29:55) awhilewhileaway: I also need to assemble the
> cover/back, and figure out the innards of the aimlog
> formatting/keyword searches"
>
> what I want is essentially 3 fields, the text itself, the speaker
> (stripped of the ":") and the date information, however as far as the
> date information goes, it will be part of a short-stepped process,
> which will only need to reference the previous one, so all data can
> keep overwriting within two variables, as in: time_since_last -
> time_current = time_it_took ... I think. I'm new to programming and
> left math in highschool, so it's a weird (but very fun) place for my
> mind to be. :) still working it out.

Well, you haven't explained what you really want to do with your data
yet, so that all sounds quite a bit complicated. Why not start out
with just a simple split on space:

time, speaker, content = text.split ' ', 3

Then you can parse the time:

require 'time'
time = Time.parse time

Cut off the ':' on the speaker:

speaker = speaker.sub(/:$/, '')

and you'll be left with:

p time, speaker, content
Wed Aug 15 20:29:55 -0700 2007
"awhilewhileaway"
"I also need to assemble the cover/back, and figure out the innards
of the aimlog formatting/keyword searches"
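
The three steps above could be combined into one helper, roughly as sketched below. The method name `parse_line` is hypothetical, and the sketch assumes every line looks like "(HH:MM:SS) speaker: message" with no spaces in the speaker name. The time is left as a plain string here so the result does not depend on today's date.

```ruby
# Hypothetical helper combining the split and sub steps shown above.
# Assumes the form "(HH:MM:SS) speaker: message", speaker without spaces.
def parse_line(text)
  # A limit of 3 keeps any spaces inside the message intact.
  stamp, speaker, content = text.split(' ', 3)
  stamp = stamp.delete('()')        # "(20:29:55)" -> "20:29:55"
  speaker = speaker.sub(/:$/, '')   # drop the trailing colon
  [stamp, speaker, content]
end

p parse_line("(20:29:55) awhilewhileaway: I also need to assemble the cover/back")
```

The returned stamp can still be handed to Time.parse later if real Time objects are wanted.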

--
Poor workers blame their tools. Good workers build better tools. The
best workers get their tools to do the work for them. -- Syndicate Wars



Alex Gutteridge

8/16/2007 4:42:00 AM


On 16 Aug 2007, at 13:08, Simon Schuster wrote:

> text = "(20:29:55) awhilewhileaway: I also need to assemble the
> cover/back, and figure out the innards of the aimlog
> formatting/keyword searches"
>
> what I want is essentially 3 fields, the text itself, the speaker
> (stripped of the ":") and the date information, however as far as the
> date information goes, it will be part of a short-stepped process,
> which will only need to reference the previous one, so all data can
> keep overwriting within two variables, as in: time_since_last -
> time_current = time_it_took ... I think. I'm new to programming and
> left math in highschool, so it's a weird (but very fun) place for my
> mind to be. :) still working it out.
>
> so I would like two of the fields of this array to be hashes..
> array[0] being a hash and having a numerical value, array[1] being a
> hash and having personA or personB value, and then array[2] being a
> string. that works, I think...?
>
> wow! I had no idea I knew this much when I started the e-mail. :) any
> hints/solutions for me to play around with? it's the parentheses of
> the regex that kind of has me stuck, mostly, as well as how to deal
> with clock arithmetic when it rolls over at midnight, I foresee that
> being confusing.

I'm not sure I fully understand what you want to do; perhaps you
should post a more complete set of data. The part where you describe
Arrays of Hashes is a bit confusing as well. Can you describe your
data structure using code rather than English? The first regexp part
is easy enough, though:

text = "(20:29:55) awhilewhileaway: I also need to assemble"
text.scan(/(\(.+?\)) (.+?): (.+)/) { |time, name, data|
  p time
  p name
  p data
}

This doesn't check for multi-line strings, names with ':' in them, or
other weirdness, though, so be careful with real-world data.
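
One way to be stricter about those cases is sketched below. The constant `LINE` and the method `parse_message` are hypothetical names, and the pattern assumes a speaker name contains no spaces or colons, which may not hold for real screen names. The /m flag lets the message part span several lines, and lines that don't fit the shape return nil instead of garbage.

```ruby
# Hypothetical pattern: a timestamp in parentheses, then a speaker name
# without spaces or colons (an assumption), then the message.
LINE = /\A\((\d{2}:\d{2}:\d{2})\) ([^\s:]+): (.*)\z/m

def parse_message(text)
  m = LINE.match(text)
  return nil unless m   # not a chat line; the caller decides what to do
  m.captures            # [time, name, data]
end

p parse_message("(20:29:55) awhilewhileaway: cover/back, and\nthe innards")
p parse_message("no timestamp here")
```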

If you have an Array of text lines then you can just iterate through
(I use map below), scan each one and store the data in another Array
(no need for Hashes unless I misunderstand you):

irb(main):031:0> text_a = ["(20:29:55) awhilewhileaway: I also need to assemble","(20:39:55) away: I also need embl"]
=> ["(20:29:55) awhilewhileaway: I also need to assemble", "(20:39:55) away: I also need embl"]
irb(main):032:0> res = text_a.map{|l| l.scan(/(\(.+?\)) (.+?): (.+)/)[0]}
=> [["(20:29:55)", "awhilewhileaway", "I also need to assemble"], ["(20:39:55)", "away", "I also need embl"]]
irb(main):033:0> res[0]
=> ["(20:29:55)", "awhilewhileaway", "I also need to assemble"]
irb(main):034:0> res[0][0]
=> "(20:29:55)"
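
If named fields are easier to work with than positions like res[0][0], each line could instead become one Hash, as in the sketch below. The method name `to_messages` and the keys :time, :speaker and :text are illustrative choices, not something from the original question. The parentheses are moved outside the first capture group so the stored time is just "20:29:55".

```ruby
# Sketch: one Hash per chat line instead of a nested Array.
# The keys :time, :speaker and :text are illustrative names.
def to_messages(lines)
  lines.map do |l|
    time, name, data = l.scan(/\((.+?)\) (.+?): (.+)/)[0]
    { :time => time, :speaker => name, :text => data }
  end
end

messages = to_messages([
  "(20:29:55) awhilewhileaway: I also need to assemble",
  "(20:39:55) away: I also need embl"
])
p messages[0][:speaker]
p messages[1][:time]
```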

Hope that helps.

Alex Gutteridge

Bioinformatics Center
Kyoto University



Simon Schuster

8/16/2007 4:46:00 AM


thanks, but since the time is only going to be used for arithmetic,
parsing it for additional information isn't helpful, and the roll-over
will be problematic.

(23:54:45) - (00:03:45) != 00:09:00

as for the use of the data, basically, at this stage, I'm working on
formatting aimlogs into "bookish" dialogue, with an eventual goal of
utilizing lulu.com's API to generate books behind my back. :D maybe
thinking about making a gaim plugin if it turns out, with many more
ideas for what else I could do, but not nearly the ruby rigors I need
to actualize them (yet!!) :P


Alex Gutteridge

8/16/2007 5:06:00 AM


On 16 Aug 2007, at 13:46, Simon Schuster wrote:

> thanks, but since the time is only going to be used for arithmetic
> parsing it for additional information isn't helpful, and the roll-over
> will be problematic.
>
> (23:54:45) - (00:03:45) != 00:09:00

Does this help?

Parse the two times as Eric suggested (Time.parse needs require
'time'). If the second (later) time is less than the first, then add a
day to it (60*60*24 seconds). Then subtract the first from the second
to get the difference in seconds.

irb(main):023:0> t1 = Time.parse('23:54:45')
=> Thu Aug 16 23:54:45 +0900 2007
irb(main):024:0> t2 = Time.parse('00:03:45')
=> Thu Aug 16 00:03:45 +0900 2007
irb(main):025:0> t2 += (60 * 60 * 24) if t2 < t1
=> Fri Aug 17 00:03:45 +0900 2007
irb(main):026:0> diff = t2 - t1
=> 540.0
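
The same roll-over can also be handled with plain integers and modulo arithmetic, without Time objects, as in the sketch below. The names `SECONDS_PER_DAY`, `seconds_since_midnight` and `elapsed` are hypothetical, and the sketch assumes less than 24 hours pass between two consecutive messages.

```ruby
SECONDS_PER_DAY = 24 * 60 * 60

# Convert "HH:MM:SS" to seconds since midnight.
def seconds_since_midnight(stamp)
  h, m, s = stamp.split(':').map { |part| part.to_i }
  h * 3600 + m * 60 + s
end

# Elapsed seconds from earlier to later. Ruby's % takes the sign of the
# divisor, so a negative difference wraps around to the next day.
# Assumes the two stamps are less than 24 hours apart.
def elapsed(earlier, later)
  (seconds_since_midnight(later) - seconds_since_midnight(earlier)) % SECONDS_PER_DAY
end

p elapsed('23:54:45', '00:03:45')   # 540
p elapsed('20:29:55', '20:39:55')   # 600
```

Because nothing here depends on the current date or time zone, the result is the same whenever the script is run.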

Alex Gutteridge

Bioinformatics Center
Kyoto University



Simon Schuster

8/16/2007 5:21:00 AM


yes, this helps a lot! I should have assumed that parsing the time
would enable arithmetic like Fri - Thurs; instead I assumed I'd have
to convert it all to integers.

I will have to think over more of exactly what I'm trying to do with
the arrays/hashes, after I read more about hashes, I think. thanks!

