Alvin Bruney [MVP]
1/8/2007 3:48:00 AM
See inline.
> Scalability implies lots of web servers all referring back to a central
> SQL server, which in turn implies limited caching which in turn hurts
> performance opportunities.
It most certainly does not. Caching avoids the SQL bottleneck.
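For what it's worth, the cache-aside pattern that makes that possible looks
roughly like this in ASP.NET. This is only a sketch: the connection string,
the cache key scheme, and the House query are my own assumptions, not
anything from your post.

    using System;
    using System.Data;
    using System.Data.SqlClient;
    using System.Web;
    using System.Web.Caching;

    public static class HouseCache
    {
        // Cache-aside: serve the DataSet from memory when we can, and only
        // fall back to SQL Server on a cache miss.
        public static DataSet GetHouse(int houseId, string connectionString)
        {
            string key = "house:" + houseId;
            DataSet ds = HttpRuntime.Cache[key] as DataSet;
            if (ds != null)
                return ds;                 // no SQL round trip

            ds = new DataSet("House");
            using (SqlConnection cn = new SqlConnection(connectionString))
            using (SqlDataAdapter da = new SqlDataAdapter(
                       "SELECT * FROM House WHERE HouseId = @id", cn))
            {
                da.SelectCommand.Parameters.AddWithValue("@id", houseId);
                da.Fill(ds, "House");      // Fill opens/closes the connection itself
            }

            // Let seldom-used houses fall out after 20 minutes of inactivity.
            HttpRuntime.Cache.Insert(key, ds, null,
                Cache.NoAbsoluteExpiration, TimeSpan.FromMinutes(20));
            return ds;
        }
    }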
> I would like to use typed datasets for all the benefits they have, and I
> would like to use timestamps to assist in concurrent edit checking.
Timestamps won't help you with concurrency, because the timestamp isn't
guaranteed to be accurate; Windows is not a real-time OS.
Even a minor lag will throw off your sync on a heavy-traffic day.
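For reference, the check you have in mind is usually written as an optimistic
UPDATE against a SQL Server timestamp (rowversion) column; whether that is
dependable enough under heavy traffic is exactly the question. A minimal
sketch, with table, column and parameter names of my own invention:

    using System.Data;
    using System.Data.SqlClient;

    public static class HouseUpdate
    {
        // Optimistic concurrency: the UPDATE only succeeds if the row still
        // carries the RowVersion value that was originally read into the dataset.
        public static bool TrySave(string connectionString, int houseId,
                                   string newName, byte[] originalRowVersion)
        {
            const string sql =
                "UPDATE House SET Name = @name " +
                "WHERE HouseId = @id AND RowVersion = @rv";

            using (SqlConnection cn = new SqlConnection(connectionString))
            using (SqlCommand cmd = new SqlCommand(sql, cn))
            {
                cmd.Parameters.AddWithValue("@name", newName);
                cmd.Parameters.AddWithValue("@id", houseId);
                cmd.Parameters.Add("@rv", SqlDbType.Timestamp).Value = originalRowVersion;

                cn.Open();
                // 0 rows affected means somebody else changed the row since it was
                // read, so the caller refreshes the dataset and asks the user to
                // reconfirm the edit.
                return cmd.ExecuteNonQuery() == 1;
            }
        }
    }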
> I would like to cache the datasets in the asp.net application.
Nope, the Cache is a poor choice here because it is per process. You have a
multi-CPU architecture on a web farm, and that leads to cache duplication.
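You can see the duplication for yourself by logging the worker process ID
whenever the cache gets filled; on a web garden or a farm the same house
shows up once per worker process. A throwaway diagnostic sketch:

    using System;
    using System.Diagnostics;

    public static class CacheDiagnostics
    {
        // Drop this into the cache-fill path: every worker process logs its own
        // PID, i.e. every process fills and holds its own private copy of the
        // same house.
        public static void NoteFill(string key)
        {
            Trace.WriteLine(string.Format(
                "Cache fill for {0} in worker process {1} on {2}",
                key, Process.GetCurrentProcess().Id, Environment.MachineName));
        }
    }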
> I would cache the data key in a client-side cookie.
What happens if cookies are lost, unreadable, or the client turns them off?
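If you keep the cookie anyway, the design at least needs a guard for exactly
that case: read it defensively and fall back to the no-affinity path when it
is missing or unreadable. A minimal sketch, with the cookie name "datakey"
being my assumption:

    using System.Web;

    public static class AffinityKey
    {
        // Returns the house ID from the datakey cookie, or null when the cookie
        // is absent or unreadable, in which case no cache affinity is required.
        public static int? TryRead(HttpRequest request)
        {
            HttpCookie cookie = request.Cookies["datakey"];
            int houseId;
            if (cookie == null || !int.TryParse(cookie.Value, out houseId))
                return null;
            return houseId;
        }
    }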
> multiple servers and CPUs). *In my view getting into cache
> synchronisation across web servers will hurt the very performance gains we
> are trying to get via caching in the first place.*
Yes.
> As a user becomes interested in a specific set of data (house) the datakey
> cookie would be set, and this would drive the selection of web process
> that is best suited to serve the request.
Yes, but this is all driven by the client. That's not a particularly good
choice, since the client doesn't have to follow the rules you impose; that
is, a client can easily disable cookies.
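For concreteness, the selection you describe further down (modulo 4 of the
house ID, plus a liveness check) boils down to something like this. The
process URLs are placeholders, and the alive flags stand in for the heartbeat
data you hope NLB would expose through a provider API:

    public static class ProcessSelector
    {
        // Hypothetical per-process URLs; in your scenario these would come from
        // the application pool / NLB configuration.
        private static readonly string[] ProcessUrls =
        {
            "http://web1a.example.com",
            "http://web1b.example.com",
            "http://web2a.example.com",
            "http://web2b.example.com",
        };

        public static string Select(int? houseId, bool[] processAlive)
        {
            if (houseId == null)
                return ProcessUrls[0];               // no affinity required

            int preferred = houseId.Value % ProcessUrls.Length;
            if (processAlive[preferred])
                return ProcessUrls[preferred];

            // Preferred process is down according to the heartbeat: pick the
            // next live one, accepting a one-off cache miss on that box.
            for (int i = 1; i < ProcessUrls.Length; i++)
            {
                int candidate = (preferred + i) % ProcessUrls.Length;
                if (processAlive[candidate])
                    return ProcessUrls[candidate];
            }
            return ProcessUrls[0];                   // nothing alive; degrade gracefully
        }
    }

The idea would be that something like AffinityKey.TryRead above supplies
houseId, and the heartbeat supplies processAlive.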
> I *think* I can give each web process a
> distinct URL by using application pool configuration, but I haven't
> confirmed this yet.
That doesn't solve your cache affinity problem.
> different web process; however, I believe the DoS risk is sufficiently
> small that this design is still widely applicable.
You are pushing a cookie to the client; the wrong client can regenerate
multiple cookies that in turn drive the caching mechanism in your
architecture, right?
Then it's easy to flood the cache architecture from the client, since every
request looks valid.
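The usual mitigation ties into the authorisation point in your own list
further down: make sure the caller is actually entitled to the house ID
before you let it drive a cache fill. A rough sketch, where IsAuthorized is a
placeholder for your application's own check:

    using System.Security.Principal;
    using System.Web;

    public static class CacheGuard
    {
        // Gate the cache fill on authorisation so an anonymous client cannot
        // mint arbitrary house IDs and flood the cache with junk entries.
        public static void DemandAccess(HttpContext context, int houseId)
        {
            IPrincipal user = context.User;
            if (user == null || !user.Identity.IsAuthenticated ||
                !IsAuthorized(user, houseId))
            {
                throw new HttpException(403, "Not authorized for this house.");
            }
        }

        // Placeholder: a real implementation would consult the application's own
        // authorisation store (e.g. which houses this user may work on).
        private static bool IsAuthorized(IPrincipal user, int houseId)
        {
            // Assumption for the sketch: membership in a "HouseEditors" role.
            return user.IsInRole("HouseEditors");
        }
    }

Call DemandAccess before the cache-aside fill shown earlier, so only vetted
IDs ever reach the cache.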
--
Regards,
Alvin Bruney
------------------------------------------------------
Shameless author plug
Excel Services for .NET is coming...
OWC Black book on Amazon and
www.lulu.com/owc
"Martin" <x@y.z> wrote in message
news:%23KIyBnNLHHA.4460@TK2MSFTNGP03.phx.gbl...
> Hello all,
>
> We know that designing a web application that is both scalable and
> high-performance is difficult.
>
> Scalability implies lots of web servers all referring back to a central
> SQL server, which in turn implies limited caching which in turn hurts
> performance opportunities.
>
> Clearly there is no right answer for all scenarios, but I have been
> thinking over a particular design which I would like to get your views
> on...
>
> This scenario involves a collection of data which is concerned with an
> overall user operation. The data is persisted across multiple tables, but
> the primary keys are hierarchical in nature. E.g. a house relates to rooms,
> which relate to furniture. The house primary key forms part of the room and
> furniture primary keys.
>
> I would like to use typed datasets for all the benefits they have, and I
> would like to use timestamps to assist in concurrent edit checking. One
> dataset would hold the data for one house (plus related tables).
> I would like to cache the datasets in the asp.net application. If it
> times out, so be it; I can go fetch a new version.
> I would expect data edits to be applied to the database as part of the web
> request operation, so dataset and database remain in sync.
> I would not anticipate using Session state for this application.
> I would cache the data key in a client-side cookie.
>
> I require affinity to the specific cache and therefore web process (across
> multiple servers and CPUs). *In my view getting into cache
> synchronisation across web servers will hurt the very performance gains we
> are trying to get via caching in the first place.*
>
> As a user becomes interested in a specific set of data (house) the datakey
> cookie would be set, and this would drive the selection of web process
> that is best suited to serve the request. Consequently as the user works
> with the site, different requests may be served by different web
> processes/servers. If the datakey cookie is not set, then no cache
> affinity is required.
>
> I have looked for some extension to Microsoft's Network Load Balancing
> (NLB) using a provider pattern to allow me to control the selection
> criteria of a specific web process, but without success.
> I want to take advantage of the NLB heartbeat facility. The scenario I
> imagine is, say, a collection of four web processes (spread, say, across
> two servers, each with a dual processor). I *think* I can give each web
> process a distinct URL by using application pool configuration, but I
> haven't confirmed this yet.
>
> So I would expect my web process selection algorithm to be driven by the
> value of the cookie holding a datakey. The algorithm would distribute the
> requests according to the data keys. I was thinking of something simple,
> like modulo 4 of the house ID in this scenario. When a server goes down,
> NLB should know this and expose it to my provider code. My web process
> selection algorithm would check that the required web process is alive
> (referring to the NLB API), and make an alternate selection if necessary.
>
> So far as I am aware, the piece of the picture that is missing is a
> provider pattern API in NLB to facilitate this. I wonder if this is
> something that is on the drawing board at Microsoft (or a third party
> supplier for that matter).
>
> Apart from that missing piece, the main disadvantage I can see in this
> design is its weakness against denial-of-service attacks. Theoretically,
> attackers would need to select just four distinct IDs that each hit a
> different web process; however, I believe the DoS risk is sufficiently
> small that this design is still widely applicable.
>
> Other issues that I know come into play include:
> security of the datakey
> overhead of establishing potential SSL sessions on each new web process,
> as the datakey changes (I think this is relatively infrequent)
> authentication cookies would need to apply to the scope of the entire web farm.
> authorisation to access data would need to occur in the web application,
> not just the database.
>
> NB this design does not preclude distributed web farm clusters on
> different continents (each cluster potentially caching the same data),
> because at the end of the day, if concurrent data edits are detected, the
> dataset can be refreshed from the database and the user can reconfirm
> their edit operation.
> Also in this scenario, there are likely to be multiple databases
> synchronised using replication. Typically the set of editable data would
> be configured for each database.
>
> I would welcome a lively discussion on the viability of the design.
>
> Thanks very much for your time.
>
> Martin
>
>