[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.lang.python

Very weird behavior in MySQLdb "execute"

John Nagle

2/4/2008 7:31:00 PM

This has me completely mystified. Some SELECT operations performed through
MySQLdb produce different results than with the MySQL graphical client.
This failed on a Linux server running Python 2.5, and I can reproduce it
on a Windows client running Python 2.4. Both are running MySQL 2.5.

The table involved is:

CREATE TABLE domaincache
(
domain VARCHAR(255) NOT NULL UNIQUE PRIMARY KEY,
rating CHAR(1) NULL,
rating_info ENUM
('error','no_domain','no_website','blocked','no_location','negative_info','non_commercial','unverified')
NULL,
special_case ENUM('normal','special'),
rating_id BIGINT UNSIGNED NULL,
last_update_end TIMESTAMP NOT NULL,
version SMALLINT UNSIGNED NOT NULL,
INDEX (rating_id)
);

Nothing exciting there.

In the MySQL query browser, I can do either

select * from domaincache where domain = "adwords.google.com"
or
select * from domaincache where domain = "google.com"

and I get one record back from each, with the correct info. That's correct.

Querying the database from Python gives different results. The database
connection is made with:

db = MySQLdb.connect(host="localhost",
use_unicode = True, charset = "utf8",
user=username, passwd=password, db=database)

When I make the same queries from Python, via IDLE, typing in commands:

cursor.execute('SELECT * FROM domaincache WHERE domain="adwords.google.com"')

returns 0L, no rows, which is wrong. That domain is in the database, and
a SELECT from the graphical client will find it.

But

cursor.execute('SELECT * FROM domaincache WHERE domain="google.com"')

returns 1L, one row, which is correct.

The database is InnoDB, and CHECK TABLE says the database is valid.

Restarting the MySQL instance changes the database. The entry "google.com"
disappears, and is replaced by "www.google.com". This must indicate a hanging
transaction that wasn't committed.

But that transaction didn't come from the Python IDLE session I've been
making test calls from. Those queries should match the graphical client
exactly.

So why don't they agree?




20 Answers

Paul Boddie

2/4/2008 8:07:00 PM

0

On 4 Feb, 20:30, John Nagle <na...@animats.com> wrote:
> This has me completely mystified. Some SELECT operations performed through
> MySQLdb produce different results than with the MySQL graphical client.
> This failed on a Linux server running Python 2.5, and I can reproduce it
> on a Windows client running Python 2.4. Both are running MySQL 2.5.

I'm not actively using MySQL at the moment, so my comments are just
passing remarks that may or may not help.

> The table involved is:
>
> CREATE TABLE domaincache
> (
> domain VARCHAR(255) NOT NULL UNIQUE PRIMARY KEY,

Passing remark #1: I believe that "domain" is a reserved word in SQL.
Of course, this is MySQL we're talking about, so you might get away
with this, but bizarre side-effects are sometimes a hallmark of MySQL,
too.

> rating CHAR(1) NULL,
> rating_info ENUM
> ('error','no_domain','no_website','blocked','no_location','negative_info','non_commercial','unverified')
> NULL,
> special_case ENUM('normal','special'),
> rating_id BIGINT UNSIGNED NULL,
> last_update_end TIMESTAMP NOT NULL,
> version SMALLINT UNSIGNED NOT NULL,
> INDEX (rating_id)
> );
>
> Nothing exciting there.
>
> In the MySQL query browser, I can do either
>
> select * from domaincache where domain = "adwords.google.com"
> or
> select * from domaincache where domain = "google.com"

Passing remark #2: In SQL, double quotes are usually used to "protect"
identifiers from being recognised as reserved words. Now, I believe
that MySQL can be quite relaxed about this, but this could be an issue
if some conformance mode gets set somewhere.

> and I get one record back from each, with the correct info. That's correct.
>
> Querying the database from Python gives different results. The database
> connection is made with:
>
> db = MySQLdb.connect(host="localhost",
> use_unicode = True, charset = "utf8",
> user=username, passwd=password, db=database)
>
> When I make the same queries from Python, via IDLE, typing in commands:
>
> cursor.execute('SELECT * FROM domaincache WHERE domain="adwords.google.com"')
>
> returns 0L, no rows, which is wrong. That domain is in the database, and
> a SELECT from the graphical client will find it.

Try using parameters instead of attempting to put the desired value of
domain directly into the query...

> But
>
> cursor.execute('SELECT * FROM domaincache WHERE domain="google.com"')

....or try using single quotes when delimiting the desired value of
domain.

> returns 1L, one row, which is correct.
>
> The database is InnoDB, and CHECK TABLE says the database is valid.
>
> Restarting the MySQL instance changes the database. The entry "google.com"
> disappears, and is replaced by "www.google.com". This must indicate a hanging
> transaction that wasn't committed.

Cheap shot: I guess this is why I'm using PostgreSQL.

Seriously, though, MySQL 5.x can be quite usable provided that you
switch on conformance to standards and do as much inside explicit
transaction blocks as possible. Indeed, this latter point might point
to a solution: the MySQLdb module might automatically put statements
inside transactions (like many other DB-API modules), whereas other
clients might not use transactions unless you use BEGIN and ROLLBACK/
COMMIT - a recommended practice given that I've seen MySQL abort
inserts and leave them half finished.

> But that transaction didn't come from the Python IDLE session I've been
> making test calls from. Those queries should match the graphical client
> exactly.
>
> So why don't they agree?

Transactions?

Paul

Carsten Haese

2/4/2008 8:37:00 PM

0

On Mon, 2008-02-04 at 11:30 -0800, John Nagle wrote:
> Restarting the MySQL instance changes the database. The entry "google.com"
> disappears, and is replaced by "www.google.com". This must indicate a hanging
> transaction that wasn't committed.
>
> But that transaction didn't come from the Python IDLE session I've been
> making test calls from. Those queries should match the graphical client
> exactly.
>
> So why don't they agree?

I don't have a definitive answer, but I do have a piece of generic
advice. If two database sessions receive differing results, the cause
could be any of the following factors:

1) One session is in a transaction, but the other is not.

2) Both sessions are in a transaction, but they are at different
isolation levels.

3) Both sessions are in a transaction, but the transactions were started
at different times.

It's entirely possible that the graphical client was operating outside
of a transaction, showing you phantom rows of a transaction in progress,
whereas the Python IDLE session was inside a transaction at something
like "read last committed" (or whatever MySQL calls it) isolation level.

Any of the factors above determine which "snapshot" of the data will be
visible to the database clients. Unless you are *absolutely certain*
that both clients should have seen the exact same snapshot, it's really
not all that weird that you are seeing discrepancies, especially in
light of the fact that you had an uncommitted transaction hanging around
somewhere.

Hope this helps,

--
Carsten Haese
http://informixdb.sourc...


John Nagle

2/4/2008 9:25:00 PM

0

Carsten Haese wrote:
> On Mon, 2008-02-04 at 11:30 -0800, John Nagle wrote:
>> Restarting the MySQL instance changes the database. The entry "google.com"
>> disappears, and is replaced by "www.google.com". This must indicate a hanging
>> transaction that wasn't committed.
>>
>> But that transaction didn't come from the Python IDLE session I've been
>> making test calls from. Those queries should match the graphical client
>> exactly.
>>
>> So why don't they agree?
>
> I don't have a definitive answer, but I do have a piece of generic
> advice. If two database sessions receive differing results, the cause
> could be any of the following factors:
>
> 1) One session is in a transaction, but the other is not.
>
> 2) Both sessions are in a transaction, but they are at different
> isolation levels.
>
> 3) Both sessions are in a transaction, but the transactions were started
> at different times.

I see what's wrong, I think. I haven't been doing a commit after
a SELECT. I've been careful to commit after write-type actions,
but not after read-only actions. I'm using InnoDB in default mode,
which is REPEATABLE READ, and I've recently switched to long-running
processes which keep the database connection open for hours. So the
data view for a given connection never changes, regardless of what's
happening in other threads.

John Nagle

John Nagle

2/4/2008 9:25:00 PM

0

Carsten Haese wrote:
> On Mon, 2008-02-04 at 11:30 -0800, John Nagle wrote:
>> Restarting the MySQL instance changes the database. The entry "google.com"
>> disappears, and is replaced by "www.google.com". This must indicate a hanging
>> transaction that wasn't committed.
>>
>> But that transaction didn't come from the Python IDLE session I've been
>> making test calls from. Those queries should match the graphical client
>> exactly.
>>
>> So why don't they agree?
>
> I don't have a definitive answer, but I do have a piece of generic
> advice. If two database sessions receive differing results, the cause
> could be any of the following factors:
>
> 1) One session is in a transaction, but the other is not.
>
> 2) Both sessions are in a transaction, but they are at different
> isolation levels.
>
> 3) Both sessions are in a transaction, but the transactions were started
> at different times.

I see what's wrong, I think. I haven't been doing a commit after
a SELECT. I've been careful to commit after write-type actions,
but not after read-only actions. I'm using InnoDB in default mode,
which is REPEATABLE READ, and I've recently switched to long-running
processes which keep the database connection open for hours. So the
data view for a given connection never changes, regardless of what's
happening in other threads.

John Nagle

Steve Holden

2/5/2008 3:43:00 AM

0

John Nagle wrote:
> Carsten Haese wrote:
>> On Mon, 2008-02-04 at 11:30 -0800, John Nagle wrote:
>>> Restarting the MySQL instance changes the database. The entry "google.com"
>>> disappears, and is replaced by "www.google.com". This must indicate a hanging
>>> transaction that wasn't committed.
>>>
>>> But that transaction didn't come from the Python IDLE session I've been
>>> making test calls from. Those queries should match the graphical client
>>> exactly.
>>>
>>> So why don't they agree?
>> I don't have a definitive answer, but I do have a piece of generic
>> advice. If two database sessions receive differing results, the cause
>> could be any of the following factors:
>>
>> 1) One session is in a transaction, but the other is not.
>>
>> 2) Both sessions are in a transaction, but they are at different
>> isolation levels.
>>
>> 3) Both sessions are in a transaction, but the transactions were started
>> at different times.
>
> I see what's wrong, I think. I haven't been doing a commit after
> a SELECT. I've been careful to commit after write-type actions,
> but not after read-only actions. I'm using InnoDB in default mode,
> which is REPEATABLE READ, and I've recently switched to long-running
> processes which keep the database connection open for hours. So the
> data view for a given connection never changes, regardless of what's
> happening in other threads.
>


I believe you are correct in your belief that a commit() is never
required after SELECT. Once a database change is committed it should
become visible to all other connections (under normal circumstances).

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.hold...

Gabriel Genellina

2/5/2008 6:01:00 AM

0

On 5 feb, 01:42, Steve Holden <st...@holdenweb.com> wrote:
> John Nagle wrote:
> > Carsten Haese wrote:
> >> On Mon, 2008-02-04 at 11:30 -0800, John Nagle wrote:
> >>> Restarting the MySQL instance changes the database.  The entry "google.com"
> >>> disappears, and is replaced by "www.google.com".  This must indicate a hanging
> >>> transaction that wasn't committed.
>
> >>> But that transaction didn't come from the Python IDLE session I've been
> >>> making test calls from.  Those queries should match the graphical client
> >>> exactly.
>
> >>> So why don't they agree?
> >> I don't have a definitive answer, but I do have a piece of generic
> >> advice. If two database sessions receive differing results, the cause
> >> could be any of the following factors:
>
> >> 1) One session is in a transaction, but the other is not.
>
> >> 2) Both sessions are in a transaction, but they are at different
> >> isolation levels.
>
> >> 3) Both sessions are in a transaction, but the transactions were started
> >> at different times.
>
> >      I see what's wrong, I think.  I haven't been doing a commit after
> > a SELECT.  I've been careful to commit after write-type actions,
> > but not after read-only actions.  I'm using InnoDB in default mode,
> > which is REPEATABLE READ, and I've recently switched to long-running
> > processes which keep the database connection open for hours.  So the
> > data view for a given connection never changes, regardless of what's
> > happening in other threads.
>
> I believe you are correct in your belief that a commit() is never
> required after SELECT. Once a database change is committed it should
> become visible to all other connections (under normal circumstances).

As said before, that depends on the transaction isolation level in
use. You describe the "read committed" level, where non-repeatable
reads may occur. A higher level is "repeatable read" where the
original data rows are always included in the result set for the same
query, even if other concurrent transaction modified that same rows
(but new "phantom" rows may appear in later reads).
Unfortunately the Python DBAPI 2.0 does not define any way to set this
parameter.

--
Gabriel Genellina

John Nagle

2/5/2008 5:59:00 PM

0

Steve Holden wrote:
> John Nagle wrote:
>> Carsten Haese wrote:
>>> On Mon, 2008-02-04 at 11:30 -0800, John Nagle wrote:
>>>> Restarting the MySQL instance changes the database. The entry
>>>> "google.com"
>>>> disappears, and is replaced by "www.google.com". This must indicate
>>>> a hanging
>>>> transaction that wasn't committed.
>>>>
>>>> But that transaction didn't come from the Python IDLE session I've been
>>>> making test calls from. Those queries should match the graphical
>>>> client
>>>> exactly.
>>>>
>>>> So why don't they agree?
>>> I don't have a definitive answer, but I do have a piece of generic
>>> advice. If two database sessions receive differing results, the cause
>>> could be any of the following factors:
>>>
>>> 1) One session is in a transaction, but the other is not.
>>>
>>> 2) Both sessions are in a transaction, but they are at different
>>> isolation levels.
>>>
>>> 3) Both sessions are in a transaction, but the transactions were started
>>> at different times.
>>
>> I see what's wrong, I think. I haven't been doing a commit after
>> a SELECT. I've been careful to commit after write-type actions,
>> but not after read-only actions. I'm using InnoDB in default mode,
>> which is REPEATABLE READ, and I've recently switched to long-running
>> processes which keep the database connection open for hours. So the
>> data view for a given connection never changes, regardless of what's
>> happening in other threads.
>>
>
>
> I believe you are correct in your belief that a commit() is never
> required after SELECT. Once a database change is committed it should
> become visible to all other connections (under normal circumstances).

No. If you're using InnoDB in default mode, you're in "repeatable read"
mode, which means the same SELECT on the same connection without an
intervening commit will produce the same result, regardless of any
other transactions.

I have one FCGI program which does only SELECT calls, for processing quick
lookup-only requests. This program wasn't picking up changes made by
transactions from other programs, because it's a long-running program
with a persistent database connection, and it didn't do a COMMIT.
Once I made it do a COMMIT after each SELECT, it started picking up
the changes being made elsewhere.

So you really do have to COMMIT after a SELECT, if you are reusing
the database connection. CGI programs usually don't have this issue,
because their connections don't live long, but long-running FCGI (and maybe
Twisted) programs do.

John Nagle

John Nagle

2/5/2008 9:25:00 PM

0

Paul Boddie wrote:
> On 4 Feb, 20:30, John Nagle <na...@animats.com> wrote:
>> This has me completely mystified. Some SELECT operations performed through
>> MySQLdb produce different results than with the MySQL graphical client.
>> This failed on a Linux server running Python 2.5, and I can reproduce it
>> on a Windows client running Python 2.4. Both are running MySQL 2.5.
>
> I'm not actively using MySQL at the moment, so my comments are just
> passing remarks that may or may not help.
>
>> The table involved is:
>>
>> CREATE TABLE domaincache
>> (
>> domain VARCHAR(255) NOT NULL UNIQUE PRIMARY KEY,
>
> Passing remark #1: I believe that "domain" is a reserved word in SQL.

Not since SQL-99. DOMAIN stopped being a reserved word in SQL-2003.
See "http://developer.mimer.com/validator/sql-reserved-words...
It is not a reserved word in MySQL.

>>
>> select * from domaincache where domain = "adwords.google.com"
>> or
>> select * from domaincache where domain = "google.com"
>
> Passing remark #2: In SQL, double quotes are usually used to "protect"
> identifiers from being recognised as reserved words. Now, I believe
> that MySQL can be quite relaxed about this, but this could be an issue
> if some conformance mode gets set somewhere.

The SQL standard requires single quotes, while MySQL allows both.
Actually, the real code lets the MySQLdb interface do the quoting; this
was just a manual test.

> Cheap shot: I guess this is why I'm using PostgreSQL.

Actually, most of the above comments were totally irrelevant. The
real problem (discussed in the Python newsgroup) was a failure to
COMMIT after a SELECT. MySQL InnoDB databases run in "repeatable read" mode,
and if you have a long-running process and don't COMMIT after a SELECT,
the results of redoing a SELECT will not change, regardless of other
updates to the table. So it really is necessary to COMMIT after SELECT
to see new updates to the database, even when not making changes.

John Nagle

Paul Boddie

2/5/2008 10:25:00 PM

0

On 5 Feb, 22:25, John Nagle <na...@animats.com> wrote:
> Paul Boddie wrote:
> > Passing remark #1: I believe that "domain" is a reserved word in SQL.
>
> Not since SQL-99. DOMAIN stopped being a reserved word in SQL-2003.
> See "http://developer.mimer.com/validator/sql-reserved-words...
> It is not a reserved word in MySQL.

Well, it was just a passing remark, useful to bear in mind if you have
to use another database system at some point.

> > Passing remark #2: In SQL, double quotes are usually used to "protect"
> > identifiers from being recognised as reserved words. Now, I believe
> > that MySQL can be quite relaxed about this, but this could be an issue
> > if some conformance mode gets set somewhere.
>
> The SQL standard requires single quotes, while MySQL allows both.

Yes, that's what standards are for, after all.

> Actually, the real code lets the MySQLdb interface do the quoting; this
> was just a manual test.

Fair enough.

> > Cheap shot: I guess this is why I'm using PostgreSQL.
>
> Actually, most of the above comments were totally irrelevant.

Well, they were just late night suggestions.

> The real problem (discussed in the Python newsgroup) was a failure to
> COMMIT after a SELECT. MySQL InnoDB databases run in "repeatable read" mode,
> and if you have a long-running process and don't COMMIT after a SELECT,
> the results of redoing a SELECT will not change, regardless of other
> updates to the table. So it really is necessary to COMMIT after SELECT
> to see new updates to the database, even when not making changes.

I always find this page to summarise the situation well enough:

http://www.postgresql.org/docs/8.3/static/transactio...

I think that, if you haven't actually updated the database within your
transaction, and your question would suggest that you hadn't because
you'd be thinking of the transactional issues under such
circumstances, then merely ending the transaction would allow your
code to "synchronise" its view of the database with the current state.
In other words, a ROLLBACK would also be satisfactory.

I must admit that I hadn't thought too much about transactions which
merely query the database until I noticed that in PostgreSQL you can
accumulate a number of locks if you have a long-running process which
doesn't terminate transactions in a timely fashion. Inspecting the
behaviour of your favourite DB-API module can be useful to see exactly
what happens under the covers.

Paul

Frank Aune

2/6/2008 3:04:00 PM

0

On Tuesday 05 February 2008 18:58:49 John Nagle wrote:
> So you really do have to COMMIT after a SELECT, if you are reusing
> the database connection. CGI programs usually don't have this issue,
> because their connections don't live long, but long-running FCGI (and maybe
> Twisted) programs do.

I've experienced the same thing for long-running tasks even when using
different connections/cursors against the same db for db queries and log
writing dbhandlers respectively.

Whenever I did a SELECT() on the first connection, the cursor would
stop "seeing" new entries commited in the log table by the other connection.
I always assumed you needed COMMIT() after adding new content to the
database, not after every single query, but this perhaps indicate otherwise?

Regards,
Frank