Asp Forum - Updating a Very larg table - microsoft.public.sqlserver.programming

CTS_DBA

3/19/2007 3:55:00 AM

Hi,

I have a issue with updating a column in a larger table which has about 35
million records. what I need to do is select the NULL value for a particular
column and populate it with a column value from the same table.

something like

update dbo.table1 set col1=col2 where col1 is null

i'm currently running this process in 10,000 rows per transaction and take
about 3 minutes to finish 100,000.

table has a clustered index and none of the columns involved is in any index.

can some one please suggest a better approach?
I'm running SQL 2000 EE SP4 on win 2003

thanks

--
CTS DBA

4 Answers

manganb

3/19/2007 5:24:00 AM

If you are comfortable with it, I would change the recovery mode to
Simple and then run it that way. Your bottle-neck may be at the
Transaction log level.

On Mar 18, 8:55 pm, CTS_DBA <CTS...@discussions.microsoft.com> wrote:
> Hi,
>
> I have a issue with updating a column in a larger table which has about 35
> million records. what I need to do is select the NULL value for a particular
> column and populate it with a column value from the same table.
>
> something like
>
> update dbo.table1 set col1=col2 where col1 is null
>
> i'm currently running this process in 10,000 rows per transaction and take
> about 3 minutes to finish 100,000.
>
> table has a clustered index and none of the columns involved is in any index.
>
> can some one please suggest a better approach?
> I'm running SQL 2000 EE SP4 on win 2003
>
> thanks
>
> --
> CTS DBA

SivaCh

3/19/2007 8:55:00 AM

Hi,
I have a suggestion here. Create a NonClustered Index on Col1 and then
perform your desired operation. I hope that will increase the performance of
update a lot.

Note: If you don't have a big issue with disk size, you can goahead and
create NON clustered index.

thanks,
Siva

"manganb@gmail.com" wrote:

> If you are comfortable with it, I would change the recovery mode to
> Simple and then run it that way. Your bottle-neck may be at the
> Transaction log level.
>
>
>
> On Mar 18, 8:55 pm, CTS_DBA <CTS...@discussions.microsoft.com> wrote:
> > Hi,
> >
> > I have a issue with updating a column in a larger table which has about 35
> > million records. what I need to do is select the NULL value for a particular
> > column and populate it with a column value from the same table.
> >
> > something like
> >
> > update dbo.table1 set col1=col2 where col1 is null
> >
> > i'm currently running this process in 10,000 rows per transaction and take
> > about 3 minutes to finish 100,000.
> >
> > table has a clustered index and none of the columns involved is in any index.
> >
> > can some one please suggest a better approach?
> > I'm running SQL 2000 EE SP4 on win 2003
> >
> > thanks
> >
> > --
> > CTS DBA
>
>
>

Roy Harvey

3/19/2007 11:51:00 AM

How are you limiting the transactions to 10,000 rows? If you are
using ROWCOUNT then I would expect each batch to take longer than the
last, as each UPDATE has to scan past more and more rows that are
already processed to reach the first row that has to be changed. That
is what the non-clustered index suggested in another response will
correct.

If you are using some other means than ROWCOUNT to batch the updates
that would be very important.

Roy Harvey
Beacon Falls, CT

On Sun, 18 Mar 2007 20:55:10 -0700, CTS_DBA
<CTSDBA@discussions.microsoft.com> wrote:

>Hi,
>
>I have a issue with updating a column in a larger table which has about 35
>million records. what I need to do is select the NULL value for a particular
>column and populate it with a column value from the same table.
>
>something like
>
>update dbo.table1 set col1=col2 where col1 is null
>
>i'm currently running this process in 10,000 rows per transaction and take
>about 3 minutes to finish 100,000.
>
>table has a clustered index and none of the columns involved is in any index.
>
>can some one please suggest a better approach?
>I'm running SQL 2000 EE SP4 on win 2003
>
>thanks

CTS_DBA

3/20/2007 5:45:00 AM

Hi,

yep I've been using the Rowcount...

One other condition that devlopers forgot to mention earlier came to light
lately, I actually don't have to search for the NULL values, I can keep
updating All of them.

so I got rid of the rowocunt and traverse the identity column ranges.. it
works in roughly 40 min for 35 mill

Thanks guys
--
CTS DBA

"Roy Harvey" wrote:

> How are you limiting the transactions to 10,000 rows? If you are
> using ROWCOUNT then I would expect each batch to take longer than the
> last, as each UPDATE has to scan past more and more rows that are
> already processed to reach the first row that has to be changed. That
> is what the non-clustered index suggested in another response will
> correct.
>
> If you are using some other means than ROWCOUNT to batch the updates
> that would be very important.
>
> Roy Harvey
> Beacon Falls, CT
>
> On Sun, 18 Mar 2007 20:55:10 -0700, CTS_DBA
> <CTSDBA@discussions.microsoft.com> wrote:
>
> >Hi,
> >
> >I have a issue with updating a column in a larger table which has about 35
> >million records. what I need to do is select the NULL value for a particular
> >column and populate it with a column value from the same table.
> >
> >something like
> >
> >update dbo.table1 set col1=col2 where col1 is null
> >
> >i'm currently running this process in 10,000 rows per transaction and take
> >about 3 minutes to finish 100,000.
> >
> >table has a clustered index and none of the columns involved is in any index.
> >
> >can some one please suggest a better approach?
> >I'm running SQL 2000 EE SP4 on win 2003
> >
> >thanks
>

microsoft.public.sqlserver.programming

Updating a Very larg table

CTS_DBA

manganb

SivaCh

Roy Harvey

CTS_DBA

x Login to ForumsZone