Asp Forum - Optimize an SP running against a table with 290 million rows.

MattB

3/28/2007 9:50:00 PM

The following stored procedure is used to calculate the number of unread
messages in a users inbox. The table is large (290 million rows), and
profiler shows an average of 20K reads per execution, with a few reads of
100K+. The index that the table uses (IX_Unread) has logical fragmentation of
about 15% with a scan density of about 31% over 1,311,883 pages
(fragmentation data collected on a month old copy of the DB on a testing
system.) We have been encountering I/O issues on our SAN, and I believe they
are partly due to this specific SP. I have just set up an automatic defrag
job that will occur from 10 PM to 3 AM each night on that index. I believe
that the fragmented index is what leads to the amount of reads, but I'd like
the opinion of the community here to suggest how the following stored
procedure may be optimized to reduce the amount of reads:

ALTER PROCEDURE dbo.spGetUnreadCounts
(
@intAccountGroupID int
) AS

SET NOCOUNT ON

DECLARE @tmpFolders TABLE(
FolderID int
)

INSERT INTO @tmpFolders (FolderID) VALUES (-1)
INSERT INTO @tmpFolders (FolderID) VALUES (-2)
INSERT INTO @tmpFolders (FolderID) VALUES (-4)
INSERT INTO @tmpFolders (FolderID) VALUES (-5)
INSERT INTO @tmpFolders (FolderID) VALUES (-7)

INSERT INTO @tmpFolders (FolderID)
SELECT
FolderID
FROM
Folders WITH (NOLOCK)
WHERE
AccountGroupID = @intAccountGroupID

SELECT
f.FolderID,
ISNULL(d.UnreadCount, 0) As UnreadCount
FROM
@tmpFolders f
LEFT OUTER JOIN (
SELECT
m.FolderID,
COUNT(*) As UnreadCount
FROM
vwLookup_Accounts a WITH (NOLOCK)
JOIN Messages m WITH (NOLOCK) ON m.AccountID = a.AccountID
WHERE
a.AccountGroupID = @intAccountGroupID
AND m.Unread in (1,2)
GROUP BY
m.FolderID
) d ON d.FolderID = f.FolderID

RETURN
GO
SET QUOTED_IDENTIFIER OFF
GO
SET ANSI_NULLS ON
GO

Thanks! If you need any more information, please let me know.

Matt

8 Answers

mesaalejandro

3/28/2007 11:56:00 PM

MattB,

Can you post more info like table structure, constraints, indexes and the
result of "dbcc showcontig". It will be helpful to have the execution plan
for that sp.

AMB

"MattB" <MattB@discussions.microsoft.com> wrote in message
news:E276043A-48E9-4A6C-A9C1-80FB466EA41D@microsoft.com...
> The following stored procedure is used to calculate the number of unread
> messages in a users inbox. The table is large (290 million rows), and
> profiler shows an average of 20K reads per execution, with a few reads of
> 100K+. The index that the table uses (IX_Unread) has logical fragmentation
> of
> about 15% with a scan density of about 31% over 1,311,883 pages
> (fragmentation data collected on a month old copy of the DB on a testing
> system.) We have been encountering I/O issues on our SAN, and I believe
> they
> are partly due to this specific SP. I have just set up an automatic defrag
> job that will occur from 10 PM to 3 AM each night on that index. I believe
> that the fragmented index is what leads to the amount of reads, but I'd
> like
> the opinion of the community here to suggest how the following stored
> procedure may be optimized to reduce the amount of reads:
>
> ALTER PROCEDURE dbo.spGetUnreadCounts
> (
> @intAccountGroupID int
> ) AS
>
> SET NOCOUNT ON
>
> DECLARE @tmpFolders TABLE(
> FolderID int
> )
>
> INSERT INTO @tmpFolders (FolderID) VALUES (-1)
> INSERT INTO @tmpFolders (FolderID) VALUES (-2)
> INSERT INTO @tmpFolders (FolderID) VALUES (-4)
> INSERT INTO @tmpFolders (FolderID) VALUES (-5)
> INSERT INTO @tmpFolders (FolderID) VALUES (-7)
>
> INSERT INTO @tmpFolders (FolderID)
> SELECT
> FolderID
> FROM
> Folders WITH (NOLOCK)
> WHERE
> AccountGroupID = @intAccountGroupID
>
> SELECT
> f.FolderID,
> ISNULL(d.UnreadCount, 0) As UnreadCount
> FROM
> @tmpFolders f
> LEFT OUTER JOIN (
> SELECT
> m.FolderID,
> COUNT(*) As UnreadCount
> FROM
> vwLookup_Accounts a WITH (NOLOCK)
> JOIN Messages m WITH (NOLOCK) ON m.AccountID = a.AccountID
> WHERE
> a.AccountGroupID = @intAccountGroupID
> AND m.Unread in (1,2)
> GROUP BY
> m.FolderID
> ) d ON d.FolderID = f.FolderID
>
> RETURN
> GO
> SET QUOTED_IDENTIFIER OFF
> GO
> SET ANSI_NULLS ON
> GO
>
>
> Thanks! If you need any more information, please let me know.
>
> Matt

MattB

3/29/2007 5:08:00 PM

Table: Folders (320,290 rows)
FolderID int
AccountGroupID int
FolderTypeID int
ParentFolderID int
DateCreated datetime
FolderName varchar(150)
FolderStatusID int
DateStatusChanged datetime
ShareID int
AppID int

View: vwLookup_Accounts (984,284 rows)
AccountID int
AccountGroupID int
AppID int
DateCreated datetime
FileSvrAccountID int
StatusID int
DateStatusChanged datetime

Table: Messages (281 million rows)
MessageID int
AccountID int
AppID int
MsgTypeID int
MsgCategoryID int
MsgStatusID int
DateCreated datetime
DateStatusChanged datetime
Priority smallint
Unread tinyint
FolderID int
FileSvrAccountID int
FileInstanceID int
NextAccountID int
NextMessageID int
LastActionID int
ListColor varchar(20)
RawFrom varchar(200)
HdrTo varchar(8000)
HdrFrom varchar(8000)
HdrSubject varchar(8000)
HdrDate datetime
HdrAttachCount int
HdrSize int
HdrPriority tinyint
DateExpires datetime

INDEX: Unread messages (Messages table)
AccountID int
FolderID int
Unread tinyint
MsgTypeID int
MessageID int
- non clustered
- primary filegroup

DBCC SHOWCONTIG on the Unread messages index in the Messages table. May need
to drop this in notepad for easier viewing:

Messages
2050106344
IX_Messages_AcctID_FolderID_Unread_MsgTypeID_MessageID
26 0
1311883 (null) (null) (null) (null)
(null) 0
527586 (null)
(null) 31.082267000513657
163986 527587 15.256466865539551
(null)

Also, one final thing. In the list of indexes on the Messages table, there
are NO clustered indexes. I was under the impression that it was important to
have a clustered index as the data will be sorted based on that index, but
since this is legacy and I wasn't employed when these settings were initially
configured, I haven't felt I had enough reason to set any of the indexes as
the "clustered" index. If I chose to change an index, should I expect it to
recreate the index and cause downtime? I ask because this is a 24/7
environment, and I don't want to change anything that might cause 8+ hours of
slow SQL performance.

Thank you for your help!

Alejandro Mesa

3/29/2007 6:12:00 PM

MattB,

> Also, one final thing. In the list of indexes on the Messages table, there
> are NO clustered indexes. I was under the impression that it was important to
> have a clustered index as the data will be sorted based on that index, but
> since this is legacy and I wasn't employed when these settings were initially
> configured, I haven't felt I had enough reason to set any of the indexes as
> the "clustered" index. If I chose to change an index, should I expect it to
> recreate the index and cause downtime? I ask because this is a 24/7
> environment, and I don't want to change anything that might cause 8+ hours of
> slow SQL performance.

Without a clustered index we will not be able to defrag the table unless you
create one clustered index or you dump the data to a device, recreate the
table and load the data from the device.

If you will add a clustered index, it will be better to drop nonclustered
existing ones and recreate them after the clustered one has been created.

I am afraid that the info you gave us is not enough.

What about the definition of the view involved (select statement)?
Any execution plan to take a look at?

AMB

"MattB" wrote:

> Table: Folders (320,290 rows)
> FolderID int
> AccountGroupID int
> FolderTypeID int
> ParentFolderID int
> DateCreated datetime
> FolderName varchar(150)
> FolderStatusID int
> DateStatusChanged datetime
> ShareID int
> AppID int
>
> View: vwLookup_Accounts (984,284 rows)
> AccountID int
> AccountGroupID int
> AppID int
> DateCreated datetime
> FileSvrAccountID int
> StatusID int
> DateStatusChanged datetime
>
> Table: Messages (281 million rows)
> MessageID int
> AccountID int
> AppID int
> MsgTypeID int
> MsgCategoryID int
> MsgStatusID int
> DateCreated datetime
> DateStatusChanged datetime
> Priority smallint
> Unread tinyint
> FolderID int
> FileSvrAccountID int
> FileInstanceID int
> NextAccountID int
> NextMessageID int
> LastActionID int
> ListColor varchar(20)
> RawFrom varchar(200)
> HdrTo varchar(8000)
> HdrFrom varchar(8000)
> HdrSubject varchar(8000)
> HdrDate datetime
> HdrAttachCount int
> HdrSize int
> HdrPriority tinyint
> DateExpires datetime
>
>
> INDEX: Unread messages (Messages table)
> AccountID int
> FolderID int
> Unread tinyint
> MsgTypeID int
> MessageID int
> - non clustered
> - primary filegroup
>
> DBCC SHOWCONTIG on the Unread messages index in the Messages table. May need
> to drop this in notepad for easier viewing:
>
> Messages
> 2050106344
> IX_Messages_AcctID_FolderID_Unread_MsgTypeID_MessageID
> 26 0
> 1311883 (null) (null) (null) (null)
> (null) 0
> 527586 (null)
> (null) 31.082267000513657
> 163986 527587 15.256466865539551
> (null)
>
>
> Also, one final thing. In the list of indexes on the Messages table, there
> are NO clustered indexes. I was under the impression that it was important to
> have a clustered index as the data will be sorted based on that index, but
> since this is legacy and I wasn't employed when these settings were initially
> configured, I haven't felt I had enough reason to set any of the indexes as
> the "clustered" index. If I chose to change an index, should I expect it to
> recreate the index and cause downtime? I ask because this is a 24/7
> environment, and I don't want to change anything that might cause 8+ hours of
> slow SQL performance.
>
>
> Thank you for your help!

MattB

3/29/2007 6:58:00 PM

Thanks for that. I didn't consider the lack of a clustered index before I
thought to perform the defrag. I have seen some articles online that said
that even a drop/create on the unread messages index won't help. I spoke with
the senior developer and he told me that they don't put clustered indexes on
tables that will be queried in more than one way. What do you think?

Last step of the execution plan:

Query cost related to batch: 93.96%
Query text: SELECT f.FolderID, ISNULL(d.UnreadCount,0).....

SELECT (Cost: 0%)
<--Compute Scalar (Cost: 0%)
---<--Nested Loops/Left Outer Join (Cost: 0%)
------<--Sort (Cost: 1%)
---------<--Table Scan (Cost: 0%)
------<--Table spool/lazy spool (Cost: 1%)
---------<--Compute Scalar (Cost: 0%)
------------<--Stream Aggregate (Cost: 2%)
---------------<--Nested Loops/Inner Join (Cost: 66%) tooltip->
Row count: 115,461
Number of executes: 239
Estimated row count: 250,000
CPU cost: 1.06
I/O cost: 0.0000
------------------<--Bookmark lookup (Cost: 0%)
---------------------<--Accounts.IX_Accounts (Cost: 0%)
------------------<--Messages.IX_Messages_Unread (Cost: 28%) tooltip->
Row count: 115,461
Number of executes: 1912
Est. row count: 7,576
CPU cost: 0.00851
I/O cost: 0.0143

Is there an easier way to get this data to you? :)

Also, the view query is simple and is as follows:
CREATE VIEW vwLookup_Accounts AS
SELECT
*
FROM
MessageSvrLookup..Accounts with (nolock)

I'm not very experienced with the results of the execution plan, so if you
need more information, just let me know. Thanks! I appreciate the help.

Matt

"Alejandro Mesa" wrote:

> MattB,
>
> > Also, one final thing. In the list of indexes on the Messages table, there
> > are NO clustered indexes. I was under the impression that it was important to
> > have a clustered index as the data will be sorted based on that index, but
> > since this is legacy and I wasn't employed when these settings were initially
> > configured, I haven't felt I had enough reason to set any of the indexes as
> > the "clustered" index. If I chose to change an index, should I expect it to
> > recreate the index and cause downtime? I ask because this is a 24/7
> > environment, and I don't want to change anything that might cause 8+ hours of
> > slow SQL performance.
>
> Without a clustered index we will not be able to defrag the table unless you
> create one clustered index or you dump the data to a device, recreate the
> table and load the data from the device.
>
> If you will add a clustered index, it will be better to drop nonclustered
> existing ones and recreate them after the clustered one has been created.
>
> I am afraid that the info you gave us is not enough.
>
> What about the definition of the view involved (select statement)?
> Any execution plan to take a look at?
>
>
> AMB
>

Gert-Jan Strik

3/29/2007 7:05:00 PM

Matt,

It looks like the original creator has put a quite a lot of thought in
it. Your SP would not benefit from an added clustered index. The current
nonclustered index is already covering the query.

You did not specify the indexes of view vwLookup_Accounts. The query
would benefit from an index on (AccountGroupID, AccountID). If this is
not an indexed view, then the view's definition is relevant (and not yet
posted). The actual query plan would be interesting too (check out SET
SHOWPLAN_TEXT ON)

The performance of the SP should depend on the amount of folders for the
specified AccountGroup, and the number of accounts in the AccountGroup.
However, the most determining factor is probably the massive number of
"read" messages.

You could try the query below, and see if it runs any faster.

SELECT f.FolderID
, ISNULL((
SELECT COUNT(*)
FROM vwLookup_Accounts a WITH (NOLOCK)
JOIN Messages m WITH (NOLOCK)
ON m.AccountID = a.AccountID
WHERE a.AccountGroupID = @intAccountGroupID
AND m.Unread in (1,2)
AND m.FolderID = f.FolderID
),0) As UnreadCount
FROM @tmpFolders f

However, I expect that the current query is optimal. Have you checked
whether there are AccountGroups that have several millions of messages
(both read and unread)? An accountgroup with 4 million messages would
explain the 20K reads. An accountgroup with 20 million messages would
explain the 100K reads.

Another approach is to consider an indexed view on the Messages table.
If you are using Enterprise Edition, it might be sufficient to define
the following view:

CREATE VIEW dbo.vwGrouped_Messages WITH SCHEMABINDING
AS
SELECT AccountID, FolderID, COUNT_BIG(*) AS COUNT
FROM dbo.Messages
WHERE Unread IN (1,2)
GROUP BY AccountID, FolderID;

CREATE UNIQUE CLUSTERED INDEX PK_vwGrouped_Messages
ON dbo.vwGrouped_Messages(AccountID, FolderID);

If you using Standard Edition, or if the indexed view is not used
automatically in your SP, then you could rewrite the query in the SP to:

SELECT f.FolderID
, ISNULL(d.UnreadCount, 0) As UnreadCount
FROM @tmpFolders f
LEFT OUTER JOIN (
SELECT m.FolderID
, SUM(COUNT) As UnreadCount
FROM vwLookup_Accounts a WITH (NOLOCK)
JOIN vwGrouped_Messages m WITH (NOEXPAND,NOLOCK)
ON m.AccountID = a.AccountID
WHERE a.AccountGroupID = @intAccountGroupID
GROUP BY m.FolderID
) d ON d.FolderID = f.FolderID

Please note that creating the index on the view will require quite some
time during which you will have a lot of blocking. It will also require
some additional disk space (perminently). When it is done,
inserts/updates/deletes will have become somewhat more expensive and
might encounter more blocking. On the other hand, selects for unread
messages should be much much faster, because they require a lot less
I/O.

HTH,
Gert-Jan

Alejandro Mesa

3/29/2007 8:06:00 PM

MattB,

Take a look at the msg posted from Gert-Jan Strik, it is very detailed. See
"set showplan_text on" in BOL and, if possible, post the result here.

> I spoke with
> the senior developer and he told me that they don't put clustered indexes on
> tables that will be queried in more than one way. What do you think?

Table fragmentation will be a huge obstacle for whatever query hitting the
table (not the want you posted in the original msg). A table of that magnitud
will get benefits from using a clustered index.

The reasons given to you by the senior developer are not good ones to not
create a clustered index.

AMB

"MattB" wrote:

> Thanks for that. I didn't consider the lack of a clustered index before I
> thought to perform the defrag. I have seen some articles online that said
> that even a drop/create on the unread messages index won't help. I spoke with
> the senior developer and he told me that they don't put clustered indexes on
> tables that will be queried in more than one way. What do you think?
>
>
> Last step of the execution plan:
>
>
> Query cost related to batch: 93.96%
> Query text: SELECT f.FolderID, ISNULL(d.UnreadCount,0).....
>
>
> SELECT (Cost: 0%)
> <--Compute Scalar (Cost: 0%)
> ---<--Nested Loops/Left Outer Join (Cost: 0%)
> ------<--Sort (Cost: 1%)
> ---------<--Table Scan (Cost: 0%)
> ------<--Table spool/lazy spool (Cost: 1%)
> ---------<--Compute Scalar (Cost: 0%)
> ------------<--Stream Aggregate (Cost: 2%)
> ---------------<--Nested Loops/Inner Join (Cost: 66%) tooltip->
> Row count: 115,461
> Number of executes: 239
> Estimated row count: 250,000
> CPU cost: 1.06
> I/O cost: 0.0000
> ------------------<--Bookmark lookup (Cost: 0%)
> ---------------------<--Accounts.IX_Accounts (Cost: 0%)
> ------------------<--Messages.IX_Messages_Unread (Cost: 28%) tooltip->
> Row count: 115,461
> Number of executes: 1912
> Est. row count: 7,576
> CPU cost: 0.00851
> I/O cost: 0.0143
>
>
> Is there an easier way to get this data to you? :)
>
>
> Also, the view query is simple and is as follows:
> CREATE VIEW vwLookup_Accounts AS
> SELECT
> *
> FROM
> MessageSvrLookup..Accounts with (nolock)
>
>
> I'm not very experienced with the results of the execution plan, so if you
> need more information, just let me know. Thanks! I appreciate the help.
>
> Matt
>
>
> "Alejandro Mesa" wrote:
>
> > MattB,
> >
> > > Also, one final thing. In the list of indexes on the Messages table, there
> > > are NO clustered indexes. I was under the impression that it was important to
> > > have a clustered index as the data will be sorted based on that index, but
> > > since this is legacy and I wasn't employed when these settings were initially
> > > configured, I haven't felt I had enough reason to set any of the indexes as
> > > the "clustered" index. If I chose to change an index, should I expect it to
> > > recreate the index and cause downtime? I ask because this is a 24/7
> > > environment, and I don't want to change anything that might cause 8+ hours of
> > > slow SQL performance.
> >
> > Without a clustered index we will not be able to defrag the table unless you
> > create one clustered index or you dump the data to a device, recreate the
> > table and load the data from the device.
> >
> > If you will add a clustered index, it will be better to drop nonclustered
> > existing ones and recreate them after the clustered one has been created.
> >
> > I am afraid that the info you gave us is not enough.
> >
> > What about the definition of the view involved (select statement)?
> > Any execution plan to take a look at?
> >
> >
> > AMB
> >

MattB

3/29/2007 10:50:00 PM

Thanks for the response!

If I created another indexed view on the table called vwGrouped_Messages
like you described, and then also created the clustered index on that view,
would I even have to worry about blocking since it's adding the index to a
view and not the actual Messages table, or would it still affect the Messages
table?

If I can create the view and index without affecting table blocking in our
production environment, I want to perform the existing query side-by-side the
new query, and use profiler to monitor the difference in reads/speed. What do
you think?

Thanks again for all the help. Oh, and here is the execution plan using the
SET SHOWPLAN_TEXT ON:

DECLARE @tmpFolders TABLE(
FolderID int
)

INSERT INTO @tmpFolders (FolderID) VALUES (-1)

(3 row(s) affected)

StmtText

------------------------------------------------------------------------------
|--Table Insert(OBJECT:(@tmpFolders), SET:(@tmpFolders.[FolderID]=-1))

(1 row(s) affected)

StmtText
---------------------------------------------------

INSERT INTO @tmpFolders (FolderID) VALUES (-2)

(1 row(s) affected)

StmtText

------------------------------------------------------------------------------
|--Table Insert(OBJECT:(@tmpFolders), SET:(@tmpFolders.[FolderID]=-2))

(1 row(s) affected)

StmtText
---------------------------------------------------

INSERT INTO @tmpFolders (FolderID) VALUES (-4)

(1 row(s) affected)

StmtText

------------------------------------------------------------------------------
|--Table Insert(OBJECT:(@tmpFolders), SET:(@tmpFolders.[FolderID]=-4))

(1 row(s) affected)

StmtText
---------------------------------------------------

INSERT INTO @tmpFolders (FolderID) VALUES (-5)

(1 row(s) affected)

StmtText

------------------------------------------------------------------------------
|--Table Insert(OBJECT:(@tmpFolders), SET:(@tmpFolders.[FolderID]=-5))

(1 row(s) affected)

StmtText
-----------------------------------------------------

INSERT INTO @tmpFolders (FolderID) VALUES (-7)

(1 row(s) affected)

StmtText

------------------------------------------------------------------------------
|--Table Insert(OBJECT:(@tmpFolders), SET:(@tmpFolders.[FolderID]=-7))

(1 row(s) affected)

StmtText

---------------------------------------------------------------------------------------------------------------------------------------------

INSERT INTO @tmpFolders (FolderID)
SELECT
FolderID
FROM
Folders WITH (NOLOCK)
WHERE
AccountGroupID = @intAccountGroupID

(1 row(s) affected)

StmtText

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|--Table Insert(OBJECT:(@tmpFolders),
SET:(@tmpFolders.[FolderID]=[Folders].[FolderID]))
|--Top(ROWCOUNT est 0)
|--Index
Seek(OBJECT:([ewMessageSvrData].[dbo].[Folders].[IX_Folders_AccountGroupID_FolderID]),
SEEK:([Folders].[AccountGroupID]=[@intAccountGroupID]) ORDERED FORWARD)

(3 row(s) affected)

StmtText

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

SELECT
f.FolderID,
ISNULL(d.UnreadCount, 0) As UnreadCount
FROM
@tmpFolders f
LEFT OUTER JOIN (
SELECT
m.FolderID,
COUNT(*) As UnreadCount
FROM
vwLookup_Accounts a WITH (NOLOCK)
JOIN Messages m WITH (NOLOCK) ON m.Account

(1 row(s) affected)

StmtText

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|--Compute Scalar(DEFINE:([Expr1005]=isnull([Expr1003], 0)))
|--Nested Loops(Left Outer Join, OUTER
REFERENCES:([f].[FolderID]))
|--Sort(ORDER BY:([f].[FolderID] ASC))
| |--Table Scan(OBJECT:(@tmpFolders AS [f]))
|--Table Spool
|--Compute
Scalar(DEFINE:([Expr1003]=Convert([Expr1008])))
|--Stream Aggregate(DEFINE:([Expr1008]=Count(*)))
|--Nested Loops(Inner Join, OUTER
REFERENCES:([Accounts].[AccountID]) WITH PREFETCH)
|--Bookmark
Lookup(BOOKMARK:([Bmk1000]), OBJECT:([ewMessageSvrLookup].[dbo].[Accounts])
WITH PREFETCH)
| |--Index
Seek(OBJECT:([ewMessageSvrLookup].[dbo].[Accounts].[IX_Accounts]),
SEEK:([Accounts].[AccountGroupID]=[@intAccountGroupID]) ORDERED FORWARD)
|--Index
Seek(OBJECT:([ewMessageSvrData].[dbo].[Messages].[IX_Messages_AcctID_FolderID_Unread_MsgTypeID_MessageID]
AS [m]), SEEK:([m].[AccountID]=[Accounts].[AccountID] AND
[m].[FolderID]=[f].[FolderID] AND [m].[Unread]

(11 row(s) affected)

StmtText
----------

RETURN

(1 row(s) affected)

StmtText
-----------------------------
SET STATISTICS PROFILE OFF

(1 row(s) affected)

"Gert-Jan Strik" wrote:

> Matt,
>
> It looks like the original creator has put a quite a lot of thought in
> it. Your SP would not benefit from an added clustered index. The current
> nonclustered index is already covering the query.
>
> You did not specify the indexes of view vwLookup_Accounts. The query
> would benefit from an index on (AccountGroupID, AccountID). If this is
> not an indexed view, then the view's definition is relevant (and not yet
> posted). The actual query plan would be interesting too (check out SET
> SHOWPLAN_TEXT ON)
>
> The performance of the SP should depend on the amount of folders for the
> specified AccountGroup, and the number of accounts in the AccountGroup.
> However, the most determining factor is probably the massive number of
> "read" messages.
>
> You could try the query below, and see if it runs any faster.
>
> SELECT f.FolderID
> , ISNULL((
> SELECT COUNT(*)
> FROM vwLookup_Accounts a WITH (NOLOCK)
> JOIN Messages m WITH (NOLOCK)
> ON m.AccountID = a.AccountID
> WHERE a.AccountGroupID = @intAccountGroupID
> AND m.Unread in (1,2)
> AND m.FolderID = f.FolderID
> ),0) As UnreadCount
> FROM @tmpFolders f
>
> However, I expect that the current query is optimal. Have you checked
> whether there are AccountGroups that have several millions of messages
> (both read and unread)? An accountgroup with 4 million messages would
> explain the 20K reads. An accountgroup with 20 million messages would
> explain the 100K reads.
>
>
> Another approach is to consider an indexed view on the Messages table.
> If you are using Enterprise Edition, it might be sufficient to define
> the following view:
>
> CREATE VIEW dbo.vwGrouped_Messages WITH SCHEMABINDING
> AS
> SELECT AccountID, FolderID, COUNT_BIG(*) AS COUNT
> FROM dbo.Messages
> WHERE Unread IN (1,2)
> GROUP BY AccountID, FolderID;
>
> CREATE UNIQUE CLUSTERED INDEX PK_vwGrouped_Messages
> ON dbo.vwGrouped_Messages(AccountID, FolderID);
>
> If you using Standard Edition, or if the indexed view is not used
> automatically in your SP, then you could rewrite the query in the SP to:
>
> SELECT f.FolderID
> , ISNULL(d.UnreadCount, 0) As UnreadCount
> FROM @tmpFolders f
> LEFT OUTER JOIN (
> SELECT m.FolderID
> , SUM(COUNT) As UnreadCount
> FROM vwLookup_Accounts a WITH (NOLOCK)
> JOIN vwGrouped_Messages m WITH (NOEXPAND,NOLOCK)
> ON m.AccountID = a.AccountID
> WHERE a.AccountGroupID = @intAccountGroupID
> GROUP BY m.FolderID
> ) d ON d.FolderID = f.FolderID
>
> Please note that creating the index on the view will require quite some
> time during which you will have a lot of blocking. It will also require
> some additional disk space (perminently). When it is done,
> inserts/updates/deletes will have become somewhat more expensive and
> might encounter more blocking. On the other hand, selects for unread
> messages should be much much faster, because they require a lot less
> I/O.
>
> HTH,
> Gert-Jan
>

Gert-Jan Strik

3/30/2007 5:44:00 PM

Matt,

MattB wrote:
>
> Thanks for the response!
>
> If I created another indexed view on the table called vwGrouped_Messages
> like you described, and then also created the clustered index on that view,
> would I even have to worry about blocking since it's adding the index to a
> view and not the actual Messages table, or would it still affect the Messages
> table?

It would still affect inserts/updates/deletes to the Messages table,
because the indexed view is always immediately updated. So if you update
a row in Messages with AccountID = 5 and Unread IN (1,2), then this row
is locked in the Messages table. The corresponding row in
vwGrouped_Messages will have to be locked too.

> If I can create the view and index without affecting table blocking in our
> production environment, I want to perform the existing query side-by-side the
> new query, and use profiler to monitor the difference in reads/speed. What do
> you think?

You will have to estimate whether the impact is acceptable. The index
creation has about the same impact as adding another nonclustered index
to the base table. Afterwards the impact on blocking will be about the
same as that of an added index to the base table, but with less
granulariy in the locking mechanism (as if you get page locks instead of
page locks), but more selectivity (only for rows where Unread IN (1,2)).

Of course, if it turns out that the indexed view results in too much
blocking, you can simply change the query back and drop the view (or
index). The drop will be done within a second.

> Thanks again for all the help. Oh, and here is the execution plan using the
> SET SHOWPLAN_TEXT ON:
[snip]
> |--Compute Scalar(DEFINE:([Expr1005]=isnull([Expr1003], 0)))
> |--Nested Loops(Left Outer Join, OUTER
> REFERENCES:([f].[FolderID]))
> |--Sort(ORDER BY:([f].[FolderID] ASC))
> | |--Table Scan(OBJECT:(@tmpFolders AS [f]))
> |--Table Spool
> |--Compute
> Scalar(DEFINE:([Expr1003]=Convert([Expr1008])))
> |--Stream Aggregate(DEFINE:([Expr1008]=Count(*)))
> |--Nested Loops(Inner Join, OUTER
> REFERENCES:([Accounts].[AccountID]) WITH PREFETCH)
> |--Bookmark
> Lookup(BOOKMARK:([Bmk1000]), OBJECT:([ewMessageSvrLookup].[dbo].[Accounts])
> WITH PREFETCH)
> | |--Index
> Seek(OBJECT:([ewMessageSvrLookup].[dbo].[Accounts].[IX_Accounts]),
> SEEK:([Accounts].[AccountGroupID]=[@intAccountGroupID]) ORDERED FORWARD)
> |--Index
> Seek(OBJECT:([ewMessageSvrData].[dbo].[Messages].[IX_Messages_AcctID_FolderID_Unread_MsgTypeID_MessageID]
> AS [m]), SEEK:([m].[AccountID]=[Accounts].[AccountID] AND
> [m].[FolderID]=[f].[FolderID] AND [m].[Unread]

I think the execution plan is optimal. The Messages table is seeked for
each folder and account combination using a Nested Loop. This explains
the high number of reads. Usually, this type of read does not translate
into high I/O, because the cache hit ratio will be high.

The index on Messages is somewhat wide (some of its column are not
needed in this query), but perfectly suited for the query. The Accounts
table does not (yet) have a perfect index. If there are AccountGroups
with many Accounts, then it pays to add an index on
Accounts(AccountGroupID,AccountID). However, it is doubtful whether the
performance improvement is big enough to elimate your problem. You might
simply be pushing your hardware to its limit, and might have to find the
solution there...

HTH,
Gert-Jan

microsoft.public.sqlserver.programming

Optimize an SP running against a table with 290 million rows.

MattB

mesaalejandro

MattB

Alejandro Mesa

MattB

Gert-Jan Strik

Alejandro Mesa

MattB

Gert-Jan Strik

x Login to ForumsZone