microsoft.public.sqlserver.programming

How to maximise write performance for streaming data via C#

drewnoakes

3/19/2007 6:32:00 PM

Hello,

I'm working on a component that takes streaming data and writes it to a
database. Each item in the stream has a key, and multiple items per key will
be seen. They must then be inserted/updated in a particular SQL Server 2005
table. The schema of items is fixed and maps to the table's columns. There
will be only one row per key in the table.

My component currently batches them up over about 500ms (configurable, but
ideally in this range). When multiple items have the same key during this
time, only the last item need be kept.
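
For concreteness, the batching logic is roughly the following (a simplified
sketch; the type and member names are made up for illustration):

using System;
using System.Collections.Generic;
using System.Threading;

class Item { public string Key; public int Col1; /* ...and the other data columns */ }

class CoalescingWriter
{
    private Dictionary<string, Item> pending = new Dictionary<string, Item>();
    private readonly object sync = new object();
    private readonly Timer timer;

    public CoalescingWriter()
    {
        // Flush roughly every 500ms (configurable).
        timer = new Timer(delegate { Flush(); }, null, 500, 500);
    }

    public void OnItem(Item item)
    {
        lock (sync) { pending[item.Key] = item; } // last write per key wins
    }

    private void Flush()
    {
        Dictionary<string, Item> batch;
        lock (sync)
        {
            if (pending.Count == 0) return;
            batch = pending;                          // swap under the lock,
            pending = new Dictionary<string, Item>(); // write outside it
        }
        WriteBatch(batch.Values);
    }

    private void WriteBatch(ICollection<Item> items)
    {
        // The fast database write is the part I'm asking about.
    }
}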

I'm currently observing around 1200 messages per second, though the peak may
be higher. Even after duplicate keys are collapsed, I'll need to support
writing 1000 rows per second, each of about 8 columns. The server will be
running other services too.

Some random ideas:

- I'm assuming sending XML or SQL statements is out of the question due to
the string-parsing overhead.
- Would a C# stored procedure that parses a binary blob (image?) with a
custom binary encoding make sense?
- Is the problem easier if I can guarantee that a row with my key already
exists (i.e. do inserts up-front, and only update in realtime)?
- Someone mentioned BCP as a fast way to load data.
- Am I trying to be too clever here?

Thank you in advance for your thoughts and help.

Best regards,

Drew Noakes.
1 Answer

Robert Klemme

3/20/2007 8:45:00 AM

On 19.03.2007 19:32, drewnoakes wrote:
> I'm working on a component that takes streaming data and writes it to a
> database. Each item in the stream has a key, and multiple items per key will
> be seen. They must then be inserted/updated in a particular SQL Server 2005
> table. The schema of items is fixed and maps to the table's columns. There
> will be only one row per key in the table.
>
> My component currently batches them up over about 500ms (configurable, but
> ideally in this range). When multiple items have the same key during this
> time, only the last item need be kept.

That seems like a clever idea to me.

> I'm currently observing around 1200 messages per second, though the peak may
> be higher. Even after duplicate keys are collapsed, I'll need to support
> writing 1000 rows per second, each of about 8 columns. The server will be
> running other services too.
>
> Some random ideas:
>
> - I'm assuming sending XML or SQL statements is out of the question due to
> the string-parsing overhead.
> - Would a C# stored procedure that parses a binary blob (image?) with a
> custom binary encoding make sense?

I'm not sure databases are particularly good at parsing stuff. I'd rather
hand records to the DB than make it rip something apart. Consider also that
the data has to be packaged into a BLOB before the DB can unpack it again;
that sounds like too many transformations in between.

> - Is the problem easier if I can guarantee that a row with my key already
> exists (i.e. do inserts up-front, and only update in realtime)?
> - Someone mentioned BCP as a fast way to load data.

Since you have to both update and insert, you probably need to fill a
staging table and then execute two operations against the real target
table: one update and one insert. But yes, the up-front loading can be
made pretty fast by using bulk-load operations.
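
I don't write C#, but from what I've seen of ADO.NET something like the
following should be close (untested sketch; the table and column names are
invented, and you'd map all eight columns, not just the one shown):

using System.Data;
using System.Data.SqlClient;

// "batch" holds the deduplicated rows, one per key.
static void WriteBatch(SqlConnection conn, DataTable batch)
{
    // 1) Bulk-load the batch into an empty staging table.
    using (SqlBulkCopy bulk = new SqlBulkCopy(conn))
    {
        bulk.DestinationTableName = "dbo.ItemStaging";
        bulk.WriteToServer(batch);
    }

    // 2) One set-based update plus one set-based insert against the target.
    const string merge =
        "UPDATE t SET t.Col1 = s.Col1 " +
        "  FROM dbo.Items t JOIN dbo.ItemStaging s ON t.ItemKey = s.ItemKey; " +
        "INSERT INTO dbo.Items (ItemKey, Col1) " +
        "  SELECT s.ItemKey, s.Col1 FROM dbo.ItemStaging s " +
        "  WHERE NOT EXISTS (SELECT * FROM dbo.Items t " +
        "                    WHERE t.ItemKey = s.ItemKey); " +
        "TRUNCATE TABLE dbo.ItemStaging;";

    using (SqlCommand cmd = new SqlCommand(merge, conn))
        cmd.ExecuteNonQuery();
}

You'd probably want the update/insert pair wrapped in a transaction so
readers never see a half-applied batch.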

> - Am I trying to be too clever here?
>
> Thank you in advance for your thoughts and help.

Dunno whether that's possible from C#'s DB connectivity, but with JDBC you
can batch-invoke stored procedures. The procedure would first update a row
and, if that reports 0 rows changed, insert it instead. When called in
batches of 100 this is pretty fast, because on average you get less than
one network round trip per record.
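
A rough C# equivalent (untested sketch; dbo.UpsertItem is an invented
procedure, and Item is the record type from your sketch) would concatenate
the calls so the whole batch still travels in one round trip:

using System.Collections.Generic;
using System.Data.SqlClient;
using System.Text;

// The invented procedure would contain the update-then-insert logic:
//   UPDATE dbo.Items SET Col1 = @Col1 WHERE ItemKey = @Key;
//   IF @@ROWCOUNT = 0
//       INSERT INTO dbo.Items (ItemKey, Col1) VALUES (@Key, @Col1);

static void WriteBatch(SqlConnection conn, IEnumerable<Item> batch)
{
    StringBuilder text = new StringBuilder();
    SqlCommand cmd = new SqlCommand();
    cmd.Connection = conn;

    int i = 0;
    foreach (Item item in batch)
    {
        // One EXEC per row; keep batches around 100 rows so the
        // parameter count stays well under SQL Server's 2100 limit.
        text.AppendFormat("EXEC dbo.UpsertItem @k{0}, @v{0}; ", i);
        cmd.Parameters.AddWithValue("@k" + i, item.Key);
        cmd.Parameters.AddWithValue("@v" + i, item.Col1);
        i++;
    }

    if (i == 0) return;
    cmd.CommandText = text.ToString();
    cmd.ExecuteNonQuery(); // single round trip for the whole batch
}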

Ultimately you'll have to implement test versions of several candidates
and measure.

Kind regards

robert


F'up to microsoft.public.sqlserver.programming