Asp Forum - Counting Tabs and splitting by that number

Nick Bo

9/28/2008 9:53:00 PM

Basically, I have a document which I am opening, and then I am reading
each line of the file and having to split it up into two arrays and then
into a hash, from which I have to get output like this:

application/activemessage has no extensions
application/andrew-inset has extensions ez
application/applefile has no extensions
application/atom has extensions atom
application/atomcat+xml has extensions atomcat
application/atomicmail has no extensions
application/atomserv+xml has extensions atomsrv
application/batch-SMTP has no extensions
application/beep+xml has no extensions
application/cals-1840 has no extensions

I have determined that if a line contains no tabs then that entry has no
extensions, so I put an if statement at the beginning to see whether the
line contains a tab; if not, it saves false at the current position in
the array inside the each loop.

file.each_line do |line|
  next if line[0] == ?#    # skip comment lines
  next if line == "\n"     # skip blank lines
  string = line
  if string.include?("\t") == false
    mimeValue[i] = false
    mimeKey[i] = string.split
  else
    # THIS IS WHERE MY ISSUE IS NOW
    mimeKey[i], mimeValue[i] = string.split("\t\t\t")
  end
end

My problem now is that the number of tabs separating the two parts
varies: one line may have 3 tabs, another may have 5, and another
just 1. So I am in a rut now. How do I determine how many tabs are in
the line (the string variable) so that I can split the two parts into
their appropriate arrays? I was thinking I could do some kind of recursion
which would test for a tab and, if found, add 1 to a count, so that I
could then do something like

mimeKey[i], mimeValue[i] = string.split("\t" * tabCount)

I know there is a lot in my message so here is a summary:

HOW TO COUNT \t IN A STRING THEN SPLIT BY THAT NUMBER OF \t
--
Posted via http://www.ruby-....

6 Answers

Siep Korteling

9/28/2008 10:45:00 PM

Nick Bo wrote:
> Basically, I have a document which I am opening, and then I am reading
(...)
>
> I know there is a lot in my message so here is a summary:
>
> HOW TO COUNT \t IN A STRING THEN SPLIT BY THAT NUMBER OF \t

Split on \t anyway and dump all empty results, like this:

str = "beep+xml\t\t\t atom"
res = str.split("\t").reject { |item| item.empty? }
p res
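
A small sketch of why the quoting matters for this approach: in Ruby, only double-quoted strings turn \t into a real tab, while single-quoted strings keep a backslash followed by the letter t. The variable names below are illustrative only.

```ruby
single = 'a\tb'   # single quotes: backslash and "t" stay as two characters
double = "a\tb"   # double quotes: \t becomes one real tab character

single_len = single.length   # 4
double_len = double.length   # 3

# A line read from a file contains real tabs, so only the double-quoted
# separator matches it.
line = "beep+xml\t\t\t atom"
miss = line.split('\t')                           # nothing to split on
hit  = line.split("\t").reject { |s| s.empty? }   # empty fields dropped

p miss   # the whole line as one element
p hit    # the two fields
```

The second field still carries its leading space, so a strip on each element may be wanted afterwards.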

hth,

Siep

brabuhr

9/28/2008 10:54:00 PM

On Sun, Sep 28, 2008 at 5:52 PM, Nick Bo <bornemann1@nku.edu> wrote:
> #THIS IS WHERE MY ISSUE IS NOW
> mimeKey[i], mimeValue[i] = string.split("\t\t\t")
>
> My problem now is that the number of tabs separating the two parts
> varies: one line may have 3 tabs, another may have 5, and another
> just 1.
>
> mimeKey[i], mimeValue[i] = string.split("\t" * tabCount)
>
> I know there is a lot in my message so here is a summary:
>
> HOW TO COUNT \t IN A STRING THEN SPLIT BY THAT NUMBER OF \t

Your tabs are consecutive and you don't actually care how many there are?
string.split(/\t+/)
?
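
A minimal sketch of how the regex split could feed the hash and the output asked for in the original post. The sample lines are made up to stand in for the real file, and the names lines, mime and report are assumptions for illustration, not the poster's code.

```ruby
# Hypothetical sample lines standing in for the real file (tabs written as \t).
lines = [
  "# comment line\n",
  "\n",
  "application/activemessage\n",
  "application/andrew-inset\t\t\tez\n",
  "application/atom\t\t\t\t\tatom\n",
  "application/atomcat+xml\tatomcat\n",
]

mime = {}
lines.each do |line|
  next if line.start_with?("#")   # skip comment lines
  next if line.strip.empty?       # skip blank lines
  # A run of one or more tabs counts as a single separator; the limit of 2
  # keeps everything after the first run together in one field.
  key, value = line.chomp.split(/\t+/, 2)
  mime[key] = value               # value is nil when the line had no tab
end

report = mime.map do |type, exts|
  exts ? "#{type} has extensions #{exts}" : "#{type} has no extensions"
end
puts report
```

Because a line without a tab yields a nil value, no separate include?("\t") check is needed in this sketch.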

Nick Bo

9/28/2008 11:06:00 PM

Incorrect. If I do it that way, then with 5 tabs in between the two
parts I want to separate I get 4 blank strings, giving me a total of
6 elements.
eg = "abcdefg \t\t\t\t\t hi"
eg.split("\t") --> ["abcdefg ", "", "", "", "", " hi"]
eg.split("/\t+/") just gives me ["abcdefg \t\t\t\t\t hi"] because it doesn't
match the pattern given to split at all, so it makes the whole thing
one element of the array.

Bill Kelly

9/28/2008 11:19:00 PM

From: "Nick Bo" <bornemann1@nku.edu>
>
> eg = "abcdefg \t\t\t\t\t hi"
> eg.split("\t") --> ["abcdefg ", "", "", "", "", " hi"]
> eg.split("/\t+/") just gives me ["abcdefg \t\t\t\t\t hi"] because it doesn't
> match the pattern given to split at all, so it makes the whole thing
> one element of the array.

Huh?

>> eg = "abcdefg \t\t\t\t\t hi"
=> "abcdefg \t\t\t\t\t hi"
>> eg.split(/\t+/)
=> ["abcdefg ", " hi"]

Regards,

Bill

Nick Bo

9/28/2008 11:38:00 PM

Bill Kelly wrote:
> From: "Nick Bo" <bornemann1@nku.edu>
>>
>> eg = "abcdefg \t\t\t\t\t hi"
>> eg.split("\t") --> ["abcdefg ", "", "", "", "", " hi"]
>> eg.split("/\t+/") just gives me ["abcdefg \t\t\t\t\t hi"] because it doesn't
>> match the pattern given to split at all, so it makes the whole thing
>> one element of the array.
>
> Huh?
>
>>> eg = "abcdefg \t\t\t\t\t hi"
> => "abcdefg \t\t\t\t\t hi"
>>> eg.split(/\t+/)
> => ["abcdefg ", " hi"]
>
>
> Regards,
>
> Bill

It wouldn't give me the two, I so wish it did, but I found a way around it.
This is my solution and it works perfectly:
eg = "abcdefg \t\t\t\t\t\t hi"
splitArray = eg.split("\t")
splitArray.delete("")

# then, inside the loop:
arrayKey[i] = splitArray[0]
arrayValue[i] = splitArray[1]
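
One thing to watch with this workaround: Array#delete changes the array in place and returns the removed value rather than the array, so its result should not be assigned back to the array variable. A short sketch, with parts, returned and cleaned as illustrative names:

```ruby
parts = "abcdefg \t\t\t\t\t\t hi".split("\t")

returned = parts.delete("")   # removes every "" from parts, in place
p returned   # => ""
p parts      # => ["abcdefg ", " hi"]

# reject builds a new array instead, so its result can be assigned safely.
cleaned = "abcdefg \t\t\t\t\t\t hi".split("\t").reject { |s| s.empty? }
p cleaned    # => ["abcdefg ", " hi"]
```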

Thanks for everyone's help

Mark Thomas

9/29/2008 2:44:00 PM

> It wouldn't give me the two, I so wish it did, but I found a way around it.
> This is my solution and it works perfectly:
> eg = "abcdefg \t\t\t\t\t\t hi"
> splitArray = eg.split("\t")
> splitArray.delete("")

IMO, the regex solution is better:

splitArray = eg.split(/\t+/)

I think you put the regex in quotes, which turns it into a plain string.
Leave the quotes out.
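
A small sketch of the difference, using the example string from earlier in the thread (the variable names are illustrative):

```ruby
eg = "abcdefg \t\t\t\t\t hi"

# String pattern: split looks for the literal characters "/", tab, "+", "/",
# which never occur in eg, so nothing is split.
as_string = eg.split("/\t+/")

# Regexp pattern: one or more consecutive tabs act as a single separator.
as_regexp = eg.split(/\t+/)

p as_string   # => ["abcdefg \t\t\t\t\t hi"]
p as_regexp   # => ["abcdefg ", " hi"]

# Optional cleanup of the spaces left around the tabs.
trimmed = as_regexp.map { |s| s.strip }
p trimmed     # => ["abcdefg", "hi"]
```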

-- Mark.

comp.lang.ruby
