M. Edward (Ed) Borasky
10/29/2007 3:41:00 AM
John Carter wrote:
> Haven't tried, but just on general principles: I do lots of data
> mining, but I don't use databases.
>
> Databases are vastly complicated by the need to handle updates,
> inserts, deletes and transactions.
>
> For data mining, working with a flat file snapshot can be 100 times
> faster!
I haven't found that to be the case for the kind of data mining I do.
Ruby, Perl, etc. are great for extracting the data into something like
CSV format, but once you've got a CSV, or a bunch of CSVs, it's a lot
faster to copy them into a database (PostgreSQL COPY is your friend --
INSERT is *way* too slow), index them and just throw queries at them.
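
For what it's worth, here is a rough sketch of that workflow in Ruby. It
only writes the CSV and builds the SQL text; it doesn't connect to anything.
The table name (hits), the columns, and the sample rows are all made up for
illustration, and the helper names write_csv and load_statements are mine.
It also does no quoting or escaping of identifiers or paths, so treat it as
a starting point rather than something to paste into production.

```ruby
require "csv"
require "tmpdir"

# Hypothetical sample records, standing in for whatever your
# extraction script pulls out of the raw data.
COLUMNS = %w[name day hits]
ROWS = [
  ["alice", "2007-10-01", 12],
  ["bob",   "2007-10-02", 7],
]

# Write the extracted records to a CSV file with a header row.
def write_csv(path, columns, rows)
  CSV.open(path, "w") do |csv|
    csv << columns
    rows.each { |row| csv << row }
  end
  path
end

# Build the SQL for one bulk load followed by one index.
# Assumes the table already exists and the names need no quoting.
def load_statements(table, columns, csv_path, index_column)
  [
    "COPY #{table} (#{columns.join(', ')}) FROM '#{csv_path}' " \
      "WITH (FORMAT csv, HEADER true);",
    "CREATE INDEX #{table}_#{index_column}_idx ON #{table} (#{index_column});",
  ]
end

csv_path = write_csv(File.join(Dir.tmpdir, "hits.csv"), COLUMNS, ROWS)
puts load_statements("hits", COLUMNS, csv_path, "day")
```

Note that COPY ... FROM 'file' reads the file on the database server, so the
path has to be readable by the server process. If the CSV lives on the
client, psql's \copy does the same job from the client side. Building the
index after the load, rather than before, avoids updating the index for
every row as it comes in.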