comp.lang.ruby

Finding out what row DBI chokes on?

Francis Hwang

12/30/2004 6:35:00 PM

Sometimes I deal with a legacy table with screwy datetime info, so you
get fields that in MySQL are '0000-00-00' or other kinds of invalid
values. When I pull these out with DBI I get an ArgumentError that simply
says "argument out of range", which I believe is caused by DBI
calling Time.gm( 0, 0, 0, 0, 0, 0 ) or something like that.

Now, it makes sense to me that DBI doesn't want to deal with this, but
in some cases it might be convenient for me to get a message that tells
me specifically what row has the bad value. Does anybody know how I
might do this, short of hacking DBI directly?
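The suspicion above is easy to check outside DBI: Time.gm rejects a zero month, so feeding it the fields of '0000-00-00 00:00:00' raises the same ArgumentError (the exact message wording varies by Ruby version):

```ruby
# Time.gm takes (year, month, day, hour, min, sec); a month of 0 is out
# of range, which reproduces the error DBI surfaces.
begin
  Time.gm(0, 0, 0, 0, 0, 0)
  puts "no error"
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end
```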

Francis Hwang
http://f...



4 Answers

Shashank Date

12/31/2004


Hi Francis,

--- Francis Hwang <sera@fhwang.net> wrote:

> Some times I deal with a legacy table with screwy
> datetime info, so you
> get fields that in MySQL are '0000-00-00' or other
> kinds of invalid
> values ... I pull this out with DBI I get an
> ArgumentError that simply
> says "argument out of range", which I believe is
> being caused by DBI
> calling Time.gm( 0, 0, 0, 0, 0, 0 ) or something
> like that.

Is it possible to do validation at the SQL level
itself? What does the SQL that generates the result
look like?

-- shanko







Francis Hwang

1/1/2005 9:21:00 PM



On Dec 30, 2004, at 7:00 PM, Shashank Date wrote:

> Hi Francis,
>
> --- Francis Hwang <sera@fhwang.net> wrote:
>
>> Some times I deal with a legacy table with screwy
>> datetime info, so you
>> get fields that in MySQL are '0000-00-00' or other
>> kinds of invalid
>> values ... I pull this out with DBI I get an
>> ArgumentError that simply
>> says "argument out of range", which I believe is
>> being caused by DBI
>> calling Time.gm( 0, 0, 0, 0, 0, 0 ) or something
>> like that.
>
> Is it possible to do validation at the SQL level
> itself? What does the SQL generates the result look
> like?
>
> -- shanko

Perhaps I didn't explain my problem very clearly. None of my current
code is generating bad datetimes, but at work I deal with a database
with legacy data going back 7 years, and from time to time I come
across bad datetimes that were inserted years before. DBI can't handle
these at all. Here's test code to show it:

1) First I create a table and insert a bogus datetime:

mysql> create table test ( dt datetime );
Query OK, 0 rows affected (0.26 sec)

mysql> insert into test( dt ) values( '2005-01-01' );
Query OK, 1 row affected (0.13 sec)

mysql> insert into test( dt ) values( '2004-01-01' );
Query OK, 1 row affected (0.37 sec)

mysql> insert into test( dt ) values( '0000-00-00' );
Query OK, 1 row affected (0.21 sec)

mysql> select * from test;
+---------------------+
| dt |
+---------------------+
| 2005-01-01 00:00:00 |
| 2004-01-01 00:00:00 |
| 0000-00-00 00:00:00 |
+---------------------+
3 rows in set (0.24 sec)

2) Then I try to query this through DBI:

irb(main):001:0> require 'dbi'
=> true
irb(main):002:0> dbh = DBI.connect( 'dbi:Mysql:test:localhost',
'francis', 'xxxxxxx' )
=> #<DBI::DatabaseHandle:0x5e76a4 @trace_mode=2,
@handle=#<DBI::DBD::Mysql::Database:0x5e6ec0 @handle=#<Mysql>,
@mutex=#<Mutex:0x5df4b8 @waiting=[], @locked=false>,
@have_transactions=true, @attr={"AutoCommit"=>true}>,
@trace_output=#<IO:0x1d3a5c>>
irb(main):003:0> dbh.select_all( 'select * from test' )
ArgumentError: argument out of range
from /usr/local/lib/ruby/site_ruby/1.8/dbi/sql.rb:59:in `gm'
from /usr/local/lib/ruby/site_ruby/1.8/dbi/sql.rb:59:in `as_timestamp'
from /usr/local/lib/ruby/site_ruby/1.8/dbi/sql.rb:79:in `send'
from /usr/local/lib/ruby/site_ruby/1.8/dbi/sql.rb:79:in `coerce'
from /usr/local/lib/ruby/site_ruby/1.8/DBD/Mysql/Mysql.rb:418:in `fill_array'
from /usr/local/lib/ruby/site_ruby/1.8/DBD/Mysql/Mysql.rb:415:in `each_with_index'
from /usr/local/lib/ruby/site_ruby/1.8/DBD/Mysql/Mysql.rb:415:in `each'
from /usr/local/lib/ruby/site_ruby/1.8/DBD/Mysql/Mysql.rb:415:in `each_with_index'
from /usr/local/lib/ruby/site_ruby/1.8/DBD/Mysql/Mysql.rb:415:in `fill_array'
from /usr/local/lib/ruby/site_ruby/1.8/DBD/Mysql/Mysql.rb:425:in `fetch'
from /usr/local/lib/ruby/site_ruby/1.8/dbi/dbi.rb:1145:in `fetch_all'
from /usr/local/lib/ruby/site_ruby/1.8/dbi/dbi.rb:1144:in `loop'
from /usr/local/lib/ruby/site_ruby/1.8/dbi/dbi.rb:1144:in `fetch_all'
from /usr/local/lib/ruby/site_ruby/1.8/dbi/dbi.rb:909:in `fetch_all'
from /usr/local/lib/ruby/site_ruby/1.8/dbi/dbi.rb:666:in `select_all'
from /usr/local/lib/ruby/site_ruby/1.8/dbi/dbi.rb:662:in `execute'
from /usr/local/lib/ruby/site_ruby/1.8/dbi/dbi.rb:662:in `select_all'
from (irb):3
irb(main):004:0>

Now, it isn't surprising or necessarily a bug that DBI won't even
sensibly process this value, and catching the exception is easy enough.
But it slows me down a little that

1) I don't know exactly what the bad value is. You'd be surprised how
many ways MySQL lets you insert bad datetime strings.
2) I don't know which row the bad value is in if I've done a select
that should return more than 1 row.

So I'm wondering if, short of patching DBI, there is a way for a DBI
client (Lafcadio, in my case) to get more info to return a more
informative error message. I'm big into informative errors.
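Short of patching DBI itself, one workaround for a client is to coerce rows one at a time and wrap the coercion step, so the failing row index and raw value make it into the error message. The sketch below is self-contained: the row source is stubbed with an Array and `coerce_timestamp` is a simplified stand-in for DBI's `as_timestamp` coercion in dbi/sql.rb, not DBI API; with real DBI you would loop over `sth.fetch` instead.

```ruby
# Simplified stand-in for DBI's timestamp coercion: split the string
# into numeric fields and hand them to Time.gm, which raises
# ArgumentError on out-of-range values like a 0 month.
def coerce_timestamp(str)
  Time.gm(*str.scan(/\d+/).map { |s| s.to_i })
end

# Coerce one row at a time, rescuing the coercion error so the row
# index and the raw offending value can be reported.
def fetch_all_verbose(raw_rows)
  raw_rows.each_with_index.map do |raw, i|
    begin
      coerce_timestamp(raw)
    rescue ArgumentError => e
      raise ArgumentError, "#{e.message} at row #{i}, raw value #{raw.inspect}"
    end
  end
end

fetch_all_verbose(["2005-01-01 00:00:00", "2004-01-01 00:00:00"]) # fine
begin
  fetch_all_verbose(["2005-01-01 00:00:00", "0000-00-00 00:00:00"])
rescue ArgumentError => e
  puts e.message   # names row 1 and the '0000-00-00' value
end
```

The cost is one rescue per row instead of one around the whole result set, but the exception now pinpoints the bad datum.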

Francis Hwang
http://f...



Shashank Date

1/2/2005 3:31:00 PM


Hi Francis,

Francis Hwang wrote:
> Perhaps I didn't explain my problem very clearly. None of my current
> code is generating bad datetimes, but at work I deal with a database
> with legacy data going back 7 years, and from time to time I come across
> bad datetimes that were inserted years before.

This much I had guessed from your earlier email, so I guess you had
explained the problem clearly ;-)

Apologies if my answers did not make sense to you.

> DBI can't handle these at all. Here's test code to show it:

<snip>

> Now, it isn't surprising or necessarily a bug that DBI won't even
> sensibly process this value, and catching the exception is easy enough.
> But it slows me down a little that
>
> 1) I don't know exactly what the bad value is. You'd be surprised how
> many ways MySQL lets you insert bad datetime strings.

I deal with such data almost on a routine basis (although not using
MySQL), and use standard "scrubbing techniques":

1. Try to limit the data in the SELECT; borrowing from your example:

SELECT * FROM TEST WHERE DT > '1997-01-01';

or

SELECT DISTINCT DT FROM TEST;

to see what values show up.

2. Use the datetime / string conversion functions to try to identify the
bad data.

3. If the data set is small (< 100,000 rows) use CURSOR logic (usually
frowned upon) to identify the bad rows

4. Export the table to a CSV, run text filtering tools over it, and
then import it back.

Again, not knowing exactly what your situation is, I do not know if all
of these are even relevant. I guess what I am trying to say is: as far
as possible, try to handle it at the backend BEFORE it reaches DBI.
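As a Ruby-side variation on point 2 (illustrative only, not something from DBI): fetch the column as text, e.g. SELECT CAST(dt AS CHAR) FROM test, and vet each string before any Time conversion happens. The regex and range checks below are a deliberately rough sketch.

```ruby
# Flag datetime strings that Time.gm would choke on: wrong shape, or
# month/day fields outside their valid ranges.
VALID_DT = /\A(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})\z/

def bad_datetime?(str)
  m = VALID_DT.match(str) or return true
  month, day = m[2].to_i, m[3].to_i
  !(1..12).cover?(month) || !(1..31).cover?(day)
end

rows = ["2005-01-01 00:00:00", "0000-00-00 00:00:00", "2004-13-01 00:00:00"]
rows.select { |s| bad_datetime?(s) }
# => ["0000-00-00 00:00:00", "2004-13-01 00:00:00"]
```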


> 2) I don't know which row the bad value is in if I've done a select that
> should return more than 1 row.
> So I'm wondering if, short of patching DBI, there is a way for a DBI
> client (Lafcadio, in my case) to get more info to return a more
> informative error message. I'm big into informative errors.

I am afraid there may be no way to do that. But I am by no means a DBI
expert.

HTH,

> Francis Hwang
> http://f...

-- shanko

Francis Hwang

1/2/2005 4:53:00 PM



On Jan 2, 2005, at 10:31 AM, Shashank Date wrote:

> I deal with such data almost on a routine basis (although not using
> MySQL), and use standard "scrubbing techniques":
>
> 1. Try to limit the data in the SELECT, like borrowing from your
> example:
>
> SELECT * FROM TEST WHERE DT > '1997-01-01';
>
> or
>
>
> SELECT DISTINCT DT FROM TEST;
>
> to see what values show up
>
> 2. Use the datetime / string conversion functions to try to identify
> the bad data.
>
> 3. If the data set is small (< 100,000 rows) use CURSOR logic (usually
> frowned upon) to identify the bad rows
>
> 4. Export the table to a CSV and then use the text filtering tools and
> then Import it back
>
> Again, not knowing exactly what your situation is, I do not know if
> all these are even relevant. So I guess, what I am trying to say is,
> as far as possible try to handle it at the backend BEFORE it reaches
> DBI.

These scrubbing techniques do help, thanks.

Francis Hwang
http://f...