comp.lang.ruby

simple match of items in 2 arrays

Charles L. Snyder

1/8/2006 8:46:00 PM

Hi

Inexperienced ruby user question:

I have 2 large csv files:
1) Zipdata = basically a frequency table of zipcodes:
02115, 10
64108, 9
99234, 8 etc

2) Ziplookup (ziplu) = several column table of data assoc with
zipcodes:
02115, +43.59906, -75,99343, Hoboken, NJ, Johnson Cty
55021, +55.5454, - 64,8585, Kansas City, MO, Jackson Cty

I am trying to take each entry in the zipdata and compare the zipcode
to the lookup table
When a match is found, combine the data into a new line /
multidimensional hash table eg -
02115, 10, +43.59906, -75,99343, Hoboken, NJ, Johnson Cty

Here is what I have:
outf = File.read "ZIP_CODES.txt"
ziplu=[]
zipdata = []
result = []

outf.each {|e| ziplu.push e.chomp}

# open the csv file of zip code occurrences and their frequency (zips),
# and add each zipcode to an array (zipdata)
doc = File.read "zips.txt"
doc.each {|f| zipdata.push f}

# compare each line from zipdata to the lookup file and put into new
# array
result.push(zipdata.each {|a| ziplu.find_all{|x| /a/ =~ x})

I can't get the comparison to work, and don't know what format to put
the result into (hash, array of arrays, file) - from which I can
easily output the results (ie., lat long, city, state) by zipcode...

Any guidance appreciated..

CLS
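
A note on the comparison in the last line of the code above: `/a/` is a regexp literal that matches the letter "a"; it does not refer to the block variable `a`. In addition, `each` returns its receiver, so `result` would end up holding `zipdata` itself. A minimal sketch of the difference, using made-up sample strings (the variable names are illustrative, not from the post):

```ruby
zip   = "64108"
line  = "64108, +39.08, -94.58, Kansas City, MO, Jackson Cty"
other = "02115, +43.59906, -75.99343, Hoboken, NJ, Johnson Cty"

# /a/ looks for the letter "a", so it matches any line that contains one.
literal_hit = line.match?(/a/)

# #{...} interpolates the variable; \A anchors the match at the start.
pattern     = /\A#{Regexp.escape(zip)}\b/
interp_hit  = line.match?(pattern)
interp_miss = !other.match?(pattern)

# each returns the receiver; map returns the values computed by the block.
each_result = [1, 2].each { |n| n * 2 }
map_result  = [1, 2].map  { |n| n * 2 }
```

Comparing the first field directly (`line.split(",").first == zip`) avoids the regexp altogether.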

6 Answers

Christer Nilsson

1/9/2006 10:16:00 AM


Charles L. Snyder wrote:
> I can't get the comparison to work, and don't know what format to put
> the result into (hash, array of arrays, file) - from which I can
> easily output the results (ie., lat long, city, state) by zipcode...
>
> Any guidance appreciated..

Charles,

I don't know if this is the best way of doing it, but it works at least.
Btw, your file has errors. Is the code supposed to deal with them?
E.g. commas in the latitude, spaces after the minus sign.
Have you considered using a database table?

Christer

class ZipCodeReader
  # One record per zipcode; freq stays 0 until readfreq fills it in.
  ZipCode = Struct.new(:zipcode, :lat, :lng, :city, :state, :county, :freq)

  # Load the lookup table into a hash keyed by zipcode.
  def read(filename)
    @zipcode = {}
    IO.foreach(filename) do |line|
      arr = line.chomp.split(",")
      zipcode = ZipCode.new(arr[0], arr[1].to_f, arr[2].to_f,
                            arr[3].lstrip, arr[4].lstrip, arr[5].lstrip, 0)
      @zipcode[arr[0]] = zipcode
    end
  end

  # Attach the frequency to every zipcode already present in the table.
  def readfreq(filename)
    IO.foreach(filename) do |line|
      arr = line.chomp.split(",")
      @zipcode[arr[0]].freq = arr[1].lstrip.to_i unless @zipcode[arr[0]].nil?
    end
  end

  # Print all records sorted by zipcode.
  def dump
    arr = @zipcode.sort
    arr.each {|x| print x.to_s + "\n"}
  end
end

zipcode = ZipCodeReader.new
zipcode.read("zipcode.txt")
zipcode.readfreq("zipdata.txt")
zipcode.dump
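
If the code did have to cope with the malformed rows mentioned above (a comma used as a decimal point, a space after the minus sign), the line could be repaired before it is split. A minimal sketch, assuming that real field separators are always a comma followed by a space, so that a comma sitting directly between two digits can only be a decimal comma. `normalize_line` is a hypothetical helper, not part of the posted class:

```ruby
# Repair the two kinds of damage seen in the sample rows.
def normalize_line(line)
  line
    .gsub(/(\d),(\d)/, '\1.\2')      # "75,99343" -> "75.99343"
    .gsub(/([+-])\s+(\d)/, '\1\2')   # "- 64"     -> "-64"
end

fixed = normalize_line("55021, +55.5454, - 64,8585, Kansas City, MO, Jackson Cty")
```

If the separator assumption does not hold for the real file, the first substitution would also merge adjacent numeric fields, so it should be checked against actual data first.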


--
Posted via http://www.ruby-....


Robin Stocker

1/9/2006 10:28:00 AM


Hi,

You can use the CSV module. As Christer already pointed out, you have
errors in your CSV file.

Robin


require 'csv'
require 'yaml'

zips = {}

ziplookup = CSV.parse(File.read('ziplu'), ', ')
zipdata = CSV.parse(File.read('zipdata'), ', ')

zipdata.each do |data|
  zip_code = data.first
  h = {}
  zips[zip_code] = h
  h[:frequency] = data[1].to_i
  lookup = ziplookup.find{ |row| row.first == zip_code }
  next unless lookup
  # something like this:
  h[:lat] = lookup[1].gsub(',', '.').to_f
  h[:long] = lookup[2].gsub(',', '.').to_f
  h[:city] = lookup[3]
end

y zips
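
In current versions of Ruby's csv library, `CSV.parse` takes the separator as the `col_sep:` keyword. A sketch of the same idea in that form, under two assumptions that are not from the post: the sample data is inline rather than read from files, and the decimal commas are already corrected. The lookup rows are indexed by zipcode once instead of being searched with `find` for every entry:

```ruby
require 'csv'

ziplu_text = <<~DATA
  02115, +43.59906, -75.99343, Hoboken, NJ, Johnson Cty
  55021, +55.5454, -64.8585, Kansas City, MO, Jackson Cty
DATA

zipdata_text = <<~DATA
  02115, 10
  64108, 9
DATA

# Index the lookup rows by their first field (the zipcode).
lookup_by_zip = CSV.parse(ziplu_text, col_sep: ', ')
                   .to_h { |row| [row.first, row] }

zips = {}
CSV.parse(zipdata_text, col_sep: ', ').each do |zip_code, freq|
  entry = { frequency: freq.to_i }
  if (row = lookup_by_zip[zip_code])
    entry[:lat]  = row[1].to_f
    entry[:long] = row[2].to_f
    entry[:city] = row[3]
  end
  zips[zip_code] = entry
end

p zips
```

Zipcodes without a lookup row keep only their frequency, as in the original loop.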





kbass

1/9/2006 12:33:00 PM


Robin Stocker wrote:
> Hi,
>
> You can use the CSV module. As Christer already pointed out, you have
> errors in your CSV file.
>
> Robin



Robert Klemme

1/9/2006 12:47:00 PM


Robin Stocker wrote:
> Hi,
>
> You can use the CSV module. As Christer already pointed out, you have
> errors in your CSV file.
>
> Robin
>
>
> require 'csv'
> require 'yaml'
>
>
> zips = {}
>
> ziplookup = CSV.parse(File.read('ziplu'), ', ')
> zipdata = CSV.parse(File.read('zipdata'), ', ')
>
> zipdata.each do |data|
> zip_code = data.first
> h = {}
> zips[zip_code] = h
> h[:frequency] = data[1].to_i
> lookup = ziplookup.find{ |lookup| lookup.first == zip_code }
> next unless lookup
> # something like this:
> h[:lat] = lookup[1].gsub(',', '.').to_f
> h[:long] = lookup[2].gsub(',', '.').to_f
> h[:city] = lookup[3]
> end
>
> y zips

I'd change this a bit:
- read only lookup zips into mem and process the other line by line
(saves mem)
- use a hash for lookup

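# untested
require 'csv'

# zipdata holds "zip, frequency" rows; index the frequencies by zipcode
zips = File.open('zipdata') do |io|
  h = {}
  io.each_line do |line|
    zip, freq = CSV.parse_line(line)
    h[zip] = freq
  end
  h
end

# stream the lookup table and insert the frequency after the zipcode
File.open('ziplu') do |io|
  io.each_line do |line|
    rec = CSV.parse_line(line)
    freq = zips[rec[0]] or next
    rec[1, 0] = freq
    puts rec.join(',')
  end
end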

HTH

Kind regards

robert
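
The two points above (keep one table in a hash, stream the other) can be tried without any files on disk. A minimal sketch using plain `split` and `StringIO` in place of the real files; the helper names and the choice to hold the frequency table in memory are assumptions made for illustration:

```ruby
require 'stringio'

# Build a hash of zipcode => frequency string from "zip, freq" lines.
def load_frequencies(io)
  io.each_line.with_object({}) do |line, freq_by_zip|
    zip, freq = line.chomp.split(',').map(&:strip)
    freq_by_zip[zip] = freq if zip && freq
  end
end

# Stream the lookup table and write one merged line per zipcode
# that has a frequency; rows without one are skipped.
def merge_lookup(io, freq_by_zip, out = $stdout)
  io.each_line do |line|
    fields = line.chomp.split(',').map(&:strip)
    freq = freq_by_zip[fields.first] or next
    fields.insert(1, freq)
    out.puts fields.join(', ')
  end
end

freqs  = load_frequencies(StringIO.new("02115, 10\n64108, 9\n"))
lookup = StringIO.new(
  "02115, +43.59906, -75.99343, Hoboken, NJ, Johnson Cty\n" \
  "55021, +55.5454, -64.8585, Kansas City, MO, Jackson Cty\n"
)
merged = StringIO.new
merge_lookup(lookup, freqs, merged)
```

With real files, `File.open(path) { |io| ... }` can be passed in place of the `StringIO` objects. Each hash lookup is constant time on average, so the cost grows with the number of lines rather than with the product of the two file sizes.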


Charles L. Snyder

1/10/2006 1:34:00 AM


Thanks everyone - Robert's method works for me with minor alteration.
The errors in the example were not in the actual file.

BTW:
Christer's answer gives me:
"undefined method `lstrip' for nil:NilClass (NoMethodError)"

Robin's method gives me:
private method `gsub' called for nil:NilClass (NoMethodError)

Thanks to all !
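
Both messages are what Ruby reports when a method is called on `nil`. In the two scripts that would happen if a row has fewer fields than expected (for example a blank or truncated line), so that `arr[3]` or `lookup[1]` is `nil`. The thread does not confirm that this is the cause here. A minimal sketch of a guard, where `parse_lookup_line` is a hypothetical helper and the minimum of six fields is taken from the sample rows:

```ruby
# Return the stripped fields, or nil when the line has fewer than six.
def parse_lookup_line(line)
  fields = line.chomp.split(',').map(&:strip)
  fields.length >= 6 ? fields : nil
end

good  = parse_lookup_line("02115, +43.59906, -75.99343, Hoboken, NJ, Johnson Cty\n")
short = parse_lookup_line("02115, +43.59906\n")
blank = parse_lookup_line("\n")
```

A caller can then skip or log the `nil` results instead of failing partway through the file.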

Christer Nilsson

1/10/2006 1:42:00 AM


Charles L. Snyder wrote:
> Thanks everyone - Robert's method works for me with minor alteration.
> The errors in the example were not in the actual file.
>
> BTW:
> Christer's answer gives me:
> "undefined method `lstrip' for nil:NilClass (NoMethodError)"
>
> Robin's method gives me:
> private method `gsub' called for nil:NilClass (NoMethodError)
>
> Thanks to all !

CLS,

Can you give me some input that triggers the problem?

Christer
