Cowboy
3/5/2008 3:32:00 PM
I still think XML may be your best option, but preload the XML into a single
load table and then run set-based processes on the data there. This works
best if you can create a dedicated table to load into; you may need multiple
load tables if different clients send differently shaped data. This will
still hammer the server for a bit, but it should not be an issue for
thousands of records. After you fill a destination table, you need to get its
newly generated primary key values back into the load table so dependent rows
can reference them, especially if you are using row GUIDs or IDENTITY
columns.
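
As a rough illustration of the load-table idea, here is a minimal C# (.NET
2.0) sketch that reads the uploaded XML into a DataSet and bulk-copies one of
its tables into a staging table. The element name "Record", the staging table
dbo.RecordLoad, and the connection handling are all assumptions for the
example, not anything from your schema.

// Minimal sketch of the load-table approach. "Record" and "dbo.RecordLoad"
// are hypothetical names; adjust to the real XML layout and staging table.
using System.Data;
using System.Data.SqlClient;

public static class XmlStagingLoader
{
    public static void LoadToStaging(string xmlPath, string connectionString)
    {
        // ReadXml builds one DataTable per repeating element in the document.
        DataSet ds = new DataSet();
        ds.ReadXml(xmlPath);
        DataTable records = ds.Tables["Record"];

        using (SqlConnection conn = new SqlConnection(connectionString))
        {
            conn.Open();
            using (SqlBulkCopy bulk = new SqlBulkCopy(conn))
            {
                bulk.DestinationTableName = "dbo.RecordLoad";
                bulk.BatchSize = 5000; // commit in chunks, not one giant batch
                bulk.WriteToServer(records);
            }
        }
        // Set-based INSERT ... SELECT / UPDATE statements (or stored
        // procedures) then move rows from dbo.RecordLoad into the real tables.
    }
}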
When you get to millions of records, this can still cause issues for the
server. In those cases, you may want to preprocess outside of SQL Server. If
you opt for this, you essentially create bcp files (or files for BULK INSERT)
outside of SQL. When you get to the step of assigning IDENTITY values, you
can use DBCC commands to reserve a "hole" in the IDENTITY range the size of
the batch being loaded. The customer's unique key and the new IDENTITY-driven
primary key can then be loaded into a hashtable. Load time for a million
records was about 2-3 seconds, with a million lookups taking far less than a
second. At least those were the stats we worked with on a project that
preprocessed offline.
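
To make the "hole" idea concrete, here is a hedged sketch of reserving a block
of IDENTITY values with DBCC CHECKIDENT and handing them out in memory while
the load file is written. The table name is hypothetical, and the reseed step
is not concurrency-safe on its own; it assumes nothing else inserts into the
table while the offline file is being built.

// Sketch only: reserve 'count' IDENTITY values on an assumed table, then map
// each customer key to one of the reserved values while writing the bcp file.
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

public static class IdentityReserver
{
    // Returns the first value of a reserved block of 'count' IDENTITY values.
    // Assumes the table already has rows; reseed semantics differ on an
    // empty table.
    public static long ReserveRange(SqlConnection conn, string table, int count)
    {
        long current;
        using (SqlCommand cmd = new SqlCommand(
            "SELECT ISNULL(IDENT_CURRENT(@t), 0)", conn))
        {
            cmd.Parameters.AddWithValue("@t", table);
            current = Convert.ToInt64(cmd.ExecuteScalar());
        }

        // Push the seed past the block we intend to hand out ourselves.
        // DBCC cannot be parameterized, so 'table' must come from trusted config.
        string reseed = string.Format(
            "DBCC CHECKIDENT ('{0}', RESEED, {1})", table, current + count);
        using (SqlCommand cmd = new SqlCommand(reseed, conn))
        {
            cmd.ExecuteNonQuery();
        }
        return current + 1;
    }
}

// Usage while building the load file: customer key -> assigned primary key.
//   Dictionary<string, long> keyMap = new Dictionary<string, long>();
//   long next = IdentityReserver.ReserveRange(conn, "dbo.Customer", recordCount);
//   // for each record in the file: keyMap[customerKey] = next++;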
If you can load into a single table from the XML and then run set-based
commands against it, it is less time-consuming to code.
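
One way to handle the "get the primary key values back into the load table"
step mentioned above is SQL Server 2005's OUTPUT clause, so the whole move
stays set-based. The table and column names below are invented for
illustration, and the sketch assumes dbo.RecordLoad has a nullable CustomerId
column to receive the new keys.

// Sketch: move rows from the staging table into a destination table and copy
// the generated IDENTITY values back, keyed on the customer's natural key.
using System.Data.SqlClient;

public static class StagingMerge
{
    private const string MoveSql = @"
        DECLARE @map TABLE (CustomerKey varchar(50), CustomerId int);

        INSERT INTO dbo.Customer (CustomerKey, CustomerName)
        OUTPUT inserted.CustomerKey, inserted.CustomerId INTO @map
        SELECT l.CustomerKey, l.CustomerName
        FROM dbo.RecordLoad AS l;

        UPDATE l
        SET l.CustomerId = m.CustomerId
        FROM dbo.RecordLoad AS l
        JOIN @map AS m ON m.CustomerKey = l.CustomerKey;";

    public static void MoveCustomers(SqlConnection conn)
    {
        using (SqlCommand cmd = new SqlCommand(MoveSql, conn))
        {
            cmd.ExecuteNonQuery(); // one set-based statement batch, no loops
        }
    }
}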
What about batching? It is a possibility, but realize you are just
flattening the performance curve. Rather than a single huge hit, you take
many smaller hits. This could well be the solution you are looking for, but
you may find that the approach breaks down as volume grows. Of course, you
can always move SQL Server to a bigger box, the data files to a SAN, etc.,
to alleviate some, if not all, of the issues. At least until you grow to
millions of records being loaded this way.
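
For what it is worth, the batching described in the original post usually
ends up looking something like the sketch below: a configurable number of
parameterized EXEC calls concatenated into a single command, so each batch is
one round trip. The procedure name and parameters are placeholders, and keep
in mind a single command is capped at 2100 parameters, which puts a hard
ceiling on the batch size.

// Sketch of configurable-size EXEC batching. dbo.UpsertRecord and its
// parameters are hypothetical; each statement gets uniquely named parameters
// so the whole batch stays parameterized.
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Text;

public class RecordRow
{
    public string Key;
    public string Value;
}

public static class BatchedExec
{
    public static void Execute(SqlConnection conn, IList<RecordRow> rows,
                               int batchSize)
    {
        for (int i = 0; i < rows.Count; i += batchSize)
        {
            int upper = Math.Min(i + batchSize, rows.Count);
            StringBuilder sql = new StringBuilder();

            using (SqlCommand cmd = new SqlCommand())
            {
                cmd.Connection = conn;
                for (int j = i; j < upper; j++)
                {
                    sql.AppendFormat("EXEC dbo.UpsertRecord @key{0}, @value{0};\n", j);
                    cmd.Parameters.AddWithValue("@key" + j, rows[j].Key);
                    cmd.Parameters.AddWithValue("@value" + j, rows[j].Value);
                }
                cmd.CommandText = sql.ToString();
                cmd.ExecuteNonQuery(); // one round trip per batch, not per record
            }
        }
    }
}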
Another potential option is making the process that creates the XML more
transactional, so single records are sent as they occur. If the feed comes
from a client, that may not be an option. Even internally, there may be valid
reasons not to go record by record.
Also, in most of these systems, you have to check for existing records
before loading, as inevitably someone will send you the same info twice. You
may already have some system of checks and balances to avoid this. If not,
consider adding one, as a process failure in the middle of a large batch of
records can be extremely hard to clean up.
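
Here is a hedged sketch of that existing-record check, run against the
staging table before the merge; the table and key names are again invented.
Whether duplicates should simply be dropped or routed to an update path
depends on the business rules.

// Sketch: remove staging rows whose customer key already exists in the
// destination table, so a re-sent file cannot insert the same record twice.
using System.Data.SqlClient;

public static class DuplicateFilter
{
    private const string DedupeSql = @"
        DELETE l
        FROM dbo.RecordLoad AS l
        WHERE EXISTS (SELECT 1
                      FROM dbo.Customer AS c
                      WHERE c.CustomerKey = l.CustomerKey);";

    public static int RemoveAlreadyLoaded(SqlConnection conn)
    {
        using (SqlCommand cmd = new SqlCommand(DedupeSql, conn))
        {
            return cmd.ExecuteNonQuery(); // number of duplicate rows removed
        }
    }
}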
--
Gregory A. Beamer
MVP, MCP: +I, SE, SD, DBA
*************************************************
| Think outside the box!
|
*************************************************
<mj2736@yahoo.com> wrote in message
news:e95705f2-1199-473f-ba0a-62acce54aa36@n36g2000hse.googlegroups.com...
> In our .Net 2.0 web service we need to insert/update data records into
> SQL Server 2005 that have been uploaded to us in XML documents. There
> could be anywhere from one to literally thousands of records in each
> uploaded transaction, and there are several dozen destination tables.
>
> Since performance is critical for this particular app, we were
> thinking of combining the stored procedure calls into configurable-
> sized batches using CommandType.Text and multiple concatenated EXEC
> statements, so we minimize database roundtrips as opposed to calling
> each SP individually in its own SqlCommand.
>
> Is this a good approach? One concern is that SQL Server would have to
> compile each and every batch since they will all be different, and the
> overhead involved in this could become significant. Is this really
> something to worry about, or not? Wouldn't a few batch compiles per
> transaction still be better than potentially thousands of roundtrips?
> Are there other problems with this approach?
>
> Of course we are planning to stress test to see how it actually
> responds, but I just wanted to get some opinions first in case someone
> has done this before and can give advice/recommendations/suggestions
> before we get too far into it.
>
> Thanks!
> MJ