Why Session State Should Not Be Stored In A Distributed Cache

Wednesday, 30 September 2009 21:16 by sunny

Web developers often refer to session state stores and caches interchangeably, while in actuality they serve different purposes.

A cache serves as a caching layer between a web application and an external data source. Caches exist mainly to lighten the load on the external data source, thus improving the performance of the application.

The purpose of a session state store is to store a user’s workspace. Session state stores enable client activity to be persisted consistently across several HTTP requests.

Applications that utilize a cache write to the external data source, typically a database, and read from the cache. This technique can significantly improve the performance of applications that mostly read from the data source.
Caches are designed to be read speedily, simultaneously by many clients and threads. Some cache implementations can link the cached object to the data source, such that if the data source is updated, the cache is invalidated.
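
The read-from-cache, write-to-source pattern described above can be sketched roughly as follows. This is a minimal illustration, not any particular product's API: the `CacheAside` class and its method names are hypothetical, and a plain dictionary stands in for the database.

```python
class CacheAside:
    """Toy cache sitting in front of a slower primary data source.

    Hypothetical illustration: a dict stands in for the database.
    """

    def __init__(self, database):
        self.database = database
        self.cache = {}
        self.database_reads = 0  # counts how often the primary source is hit

    def read(self, key):
        # Serve from the cache when possible; fall back to the data source.
        if key in self.cache:
            return self.cache[key]
        self.database_reads += 1
        value = self.database[key]
        self.cache[key] = value
        return value

    def write(self, key, value):
        # Writes go to the data source; the cached copy is invalidated.
        self.database[key] = value
        self.cache.pop(key, None)


store = CacheAside({"product:1": "kettle"})
first = store.read("product:1")      # cache miss, reads the database
second = store.read("product:1")     # cache hit, database untouched
store.write("product:1", "toaster")  # update invalidates the cached copy
third = store.read("product:1")      # miss again, sees the new value
```

Note that losing the cache here costs only an extra database read, which is why the pattern suits data that is read far more often than it is written.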

Session state stores are not linked to external data sources by design, although an application can use them that way. While it may be necessary to store certain parts of a client’s session state in a database, there is usually no need to store all of it there. In fact, many database-less web applications rely solely on session state to operate.
Session state implementations are designed such that each client has exclusive access to its session data. Even though caches can be consumed by an application to simulate this, exclusivity is not enforced and there is always a chance that a client will be able to access another client’s session data due to either poor design or security flaws.

A distributed cache spreads an application’s caching layer across many machines, which allows high-traffic web applications to scale out by adding more machines as demand increases. Performance can also be improved by distributing session state data across many machines; therefore, it is worthwhile to examine the requirements and nuances of cached data and session state before deciding which distributed solution to apply.

Distributed caches, like local caches, are most effective when used to cache data that changes infrequently. They also support simultaneous fast reads of a cached object by many threads. This is where session state sharply differs from cached data.

Session state has little need for fast multi-threaded access to a single stored resource, because only the owning client reads or updates its session. In addition, the usage pattern of session state is unpredictable: some applications update session data very frequently while others do not. Designers of session state stores therefore normally make the safe assumption that session state is write-heavy.

Caches are configured by default to use either an optimistic concurrency mechanism or no concurrency control at all, to access cached data. This design is driven by the strong requirement to eliminate blocking by any means possible, and works superbly due to the lower proportion of writes to reads.
Session state stores, on the other hand, utilize a pessimistic concurrency mechanism to access stored data. This works effectively because of the exclusive nature of resource access.

The number of concurrent session state accesses to a stored resource can increase if a user opens up several web browser instances of the same application or if the application makes use of numerous AJAX calls. Notwithstanding multiple instances and AJAX-intensive applications, a user’s session cannot have more than a handful of concurrent access attempts.
A pessimistic concurrency mechanism, as used by session state stores, can gracefully handle a few concurrent accesses on a write-heavy resource and, more importantly, provide consistent data to all operations. Inconsistencies in served data can arise if a cache with no concurrency control is employed to store write-heavy session state. This problem becomes more apparent if the application is AJAX-intensive.
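
To make the difference concrete, here is a small sketch of pessimistic, per-session locking next to a lost update on an unguarded store. It is an illustration under stated assumptions rather than how any real session state store is built: the `SessionStore` class, its methods and the `clicks` counter are all hypothetical.

```python
import threading


class SessionStore:
    """Toy store that holds one lock per session (pessimistic concurrency)."""

    def __init__(self):
        self._data = {}
        self._locks = {}
        self._guard = threading.Lock()  # protects the lock table itself

    def _lock_for(self, session_id):
        with self._guard:
            return self._locks.setdefault(session_id, threading.Lock())

    def update(self, session_id, change):
        # The session lock is held for the whole read-modify-write cycle.
        with self._lock_for(session_id):
            state = dict(self._data.get(session_id, {}))
            change(state)
            self._data[session_id] = state

    def get(self, session_id):
        with self._lock_for(session_id):
            return dict(self._data.get(session_id, {}))


def add_click(state):
    state["clicks"] = state.get("clicks", 0) + 1


def simulate_requests(store, session_id, count):
    for _ in range(count):
        store.update(session_id, add_click)


# Eight concurrent "AJAX requests" against the same session.
store = SessionStore()
threads = [
    threading.Thread(target=simulate_requests, args=(store, "s1", 100))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
locked_total = store.get("s1")["clicks"]

# No concurrency control: two requests read the same snapshot and the
# last write silently discards the other request's update.
plain_cache = {"s1": {"clicks": 0}}
request_a = dict(plain_cache["s1"])
request_b = dict(plain_cache["s1"])
request_a["clicks"] += 1
request_b["clicks"] += 1
plain_cache["s1"] = request_a
plain_cache["s1"] = request_b
unlocked_total = plain_cache["s1"]["clicks"]
```

With the lock in place every one of the 800 updates is counted; in the unguarded interleaving, two updates leave a count of one.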

Critical applications that rely on session state require failover and redundancy support. These features are usually built into commercial session state storage solutions.
Caches have no need for failover or redundancy because caches are simply a caching layer: if the requested data cannot be retrieved from the cache, it can always be fetched from the primary source. Therefore, most distributed cache implementations do not support failover or redundancy, an issue solution architects seldom remember when moving session storage to a distributed cache.

The conundrum of where to store session state arises when an application needs to scale to accommodate more users.
While there are a few commercial distributed session state storage solutions, there are no free robust alternatives, and the usual consensus is to store session state in freely available distributed cache solutions, or eliminate session state entirely from the application.

Moreover, even when session state is stored manageably in a distributed cache, the servers that cache infrequently changing data are most often the same ones used to store session state. Sharing the cache this way leads to performance degradation.
This occurs because whenever the cache server needs to store a new cached object or remove an expired one, it has to momentarily suspend all read operations internally on all other cached objects until the object is added or removed. The overall outcome is sub-optimal reads for cached infrequently changing data.

Developers and architects should carefully weigh these issues before moving locally stored session state to distributed storage and should, whenever possible, opt for a solution that was built specifically for distributed session state storage.


When Documentation Does Not Match Implementation

Saturday, 19 September 2009 19:39 by sunny

Not too long ago, I needed to piece together the communication protocol between the ASP.NET state server and the web server, because I was working on a peer-to-peer version of the ASP.NET state server.

I made some effort to obtain the protocol as described in this post. Along the way, I found the protocol documentation by Microsoft, and was delighted. I could now design my implementation of the state server based on this information.

While going through the documentation, I noticed that the format of some of the messages did not match what I had observed earlier, so I decided to verify the protocol to be extra sure. Boy, was I in for a surprise.

First, there are bold, wrong statements in the documentation:

From section 3.1.5.3: “Because a client sends a lock-cookie value along with the session state data, the state server MUST store the lock-cookie value. Internally, the state server MUST also store the date and time when the state server received the lock-cookie value. This information is necessary if the state server ever has to send response-locked messages, as specified in sections 2.2.5.2 and 2.2.5.4.”

This statement is wrong and misleading. A LockCookie value is used by a Set Request message to unlock a locked session entry (if it is locked) before storing the new data.
The state server has no use for storing the client’s LockCookie value. In fact, the state server MUST never store any LockCookie value sent by the client, or in any way let the client influence LockCookie values, as they are generated exclusively by the server.

From section 3.1.5.5: “A client can acquire an exclusive lock on session state by using either a successful GetExclusive_Request or Set_Request message.”

This is another off-the-mark statement. A client can only acquire an exclusive lock with the GetExclusive Request.
Even a cursory look at the SessionStateStoreProviderBase class is sufficient to confirm that this statement is wrong.
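
The locking behavior argued for above can be modeled with a toy in-memory server. This is only a sketch of the points made in this post, not the real wire protocol: the `ToyStateServer` class, its method names and the status strings are hypothetical, and refusing a Set request that carries the wrong cookie is an assumption made just to keep the example complete.

```python
import itertools


class ToyStateServer:
    """Hypothetical in-memory model of the lock handling described above."""

    def __init__(self):
        self._entries = {}  # session id -> {"data": ..., "lock_cookie": ...}
        self._next_cookie = itertools.count(1)  # cookies come from the server only

    def get_exclusive(self, session_id):
        entry = self._entries.get(session_id)
        if entry is None:
            return ("NOT_FOUND", None, None)
        if entry["lock_cookie"] is not None:
            # Already locked: report the cookie of the current lock holder.
            return ("LOCKED", None, entry["lock_cookie"])
        # Only GetExclusive acquires a lock, with a server-generated cookie.
        entry["lock_cookie"] = next(self._next_cookie)
        return ("OK", entry["data"], entry["lock_cookie"])

    def set(self, session_id, data, lock_cookie=None):
        entry = self._entries.setdefault(
            session_id, {"data": None, "lock_cookie": None}
        )
        locked = entry["lock_cookie"] is not None
        if locked and entry["lock_cookie"] != lock_cookie:
            return "LOCKED"  # assumption: a mismatched cookie is refused
        # Set releases the lock (if any) and stores the data. It never
        # acquires a lock and never stores the client-supplied cookie.
        entry["lock_cookie"] = None
        entry["data"] = data
        return "OK"


server = ToyStateServer()
server.set("abc", b"v1")
status, data, cookie = server.get_exclusive("abc")       # acquires the lock
second_attempt = server.get_exclusive("abc")             # entry is locked
set_status = server.set("abc", b"v2", lock_cookie=cookie)  # unlocks, stores
after = server.get_exclusive("abc")                      # fresh lock and cookie
```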

Then, there is important information that is left unstated:

The document does not mention anywhere that the ActionFlags header indicates that the server should store the presented data only if the unique session id does not already exist. If the ActionFlags header value is set to 1 and the server already has the presented session id, the existing session data will not be replaced with the new data; however, the state server will still reply with an OK response (as if it had stored the data). This behavior is not easily noticed by the casual observer, but it is important for implementing a 100% compatible state server.
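
The ActionFlags behavior described above amounts to a "store only if absent" rule that still reports success. A minimal sketch, assuming a dictionary as the session table; the `handle_set_request` function name and the `"OK"` status string are hypothetical:

```python
def handle_set_request(sessions, session_id, data, action_flags=0):
    """Toy handler for a Set request (hypothetical names, not the wire format)."""
    if action_flags == 1 and session_id in sessions:
        # The session already exists: keep the old data, but still
        # answer OK as if the new data had been stored.
        return "OK"
    sessions[session_id] = data
    return "OK"


sessions = {}
first_reply = handle_set_request(sessions, "abc", b"initial", action_flags=1)
second_reply = handle_set_request(sessions, "abc", b"ignored", action_flags=1)
kept = sessions["abc"]
third_reply = handle_set_request(sessions, "abc", b"updated")
replaced = sessions["abc"]
```

From the client's side the first two replies are indistinguishable, which is why the behavior is easy to miss when observing traffic.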

There are other inaccuracies and misleading statements in the documentation that make it virtually impossible for anyone to develop a state server implementation using the Microsoft documentation.
I had to painstakingly piece the protocol together from scratch. I’ve published the correct version of the protocol in PDF format and HTML format for reference purposes.

What's more, if you take a look at the history of the Microsoft document, you’ll notice that it has been edited more than twenty times over the course of almost three years. You’d think that after twenty edits it would be somewhat accurate, but after almost three years of editing the documentation for a major ASP.NET server, Microsoft still manages to get it wrong. It’s safe to say that either Microsoft is doing this intentionally or they do not know how their own technology works.

What’s even more disturbing is that this documentation is for a protocol, not for a piece of software. Do protocols change every other month? Imagine the chaos that would ensue if every developer had to second-guess each statement in RFC 2821 when writing an email client and then also make sure that the protocol hadn’t changed every other month.

It seems the only reason Microsoft publishes these specs at all is to pay lip service to the European Union, because there is no point in publishing specs that are inaccurate and can't save developers time when implementing a technology.


I Want a Native C# Compiler

Monday, 26 January 2009 11:39 by sunny

Wouldn’t it be nice if C# code could be compiled directly to machine code? Having such a compiler would position C# as a serious system programming language.
Developers would be able to write system software for routers, for instance, in C#.

I don’t see any reason why such a compiler should not exist. In fact, the creation of a native C# compiler would be well justified.

C# is a well-designed programming language. It would be a shame if the language were stuck forever with the .NET/Mono frameworks, especially when you consider that there is no reason why the language has to be inextricably tied to these frameworks.
There is no requirement that C# code must compile to IL and there are no language-level assumptions that the compiled code has to be machine-independent.
High-level features routinely used by developers, such as threading and reflection, are .NET library calls and have no connection to the language itself. The only high-level feature that the language implies is a garbage collection system. In fact, there are language-level hints that C# can be a system-level programming language -- how often do you use the volatile, stackalloc and fixed keywords?

The C# developer base is huge, so a native C# compiler would push the language even further, to new platforms and projects that are currently unsuitable for development with C#. It would enable developers to write ALL their code, high-level and low-level, in C#. Higher-level code would be compiled to IL, whereas lower-level code would be compiled to machine code.

A native C# compiler would be great for coding libraries, for instance, a proprietary encryption or compression algorithm. With a native C# compiler, an algorithm can be coded in C# and compiled as a native code library. This library can be linked to and used in systems without any .NET (or alternative) frameworks.
The best part is that if this library is needed for a .NET/Mono project, all that is needed is recompilation, and the algorithm will carry over to managed code without having to port the library or use unmanaged calls. It will work great in both the managed and unmanaged worlds.

The language is an open standard, so anyone with the time, expertise and resources can create a native C# compiler. Also, to be successful, this compiler only needs to support C# version 2.0.
Language features introduced in versions 3.0 and 4.0 are not that important and can be considered “Microsoft extensions” to the C# language. Indeed, Anders Hejlsberg admitted that features added in C# 2.0 were features they didn’t have time or didn’t know how to properly implement in C# 1.0.

If you are still not convinced about the viability of such a compiler, take a look at the compilers available for the D language. They are living proof that such a compiler is feasible and will compile a modern language directly to machine code, complete with a (small) runtime, memory management and type system.

 

Categories:   C# | Random Thoughts

The Soul of An Engineer

Saturday, 2 December 2006 23:34 by sunny

It’s difficult to describe the inner workings of the engineer. Peering through his glasses may seem like looking for what Alice found through the looking glass.

Basically, an engineer is someone who applies scientific or mathematical principles to develop solutions that are useful. To meet this noble goal, the engineer has to see a world that has never been, in order to create it. The best practices for achieving this include ‘concretizing’ mathematical models, processes, methodologies, reasoning and imagination.

As an engineer, I know that this required out-of-box thinking can make us seem rather eccentric. Many of us reason in Boolean logic, some of us jot down queer looking notes on napkins during dinner parties and some (a lot less) wake up, lying next to their dog.

A true engineer is born and not made. You notice their inquisitive and creative minds even when they are very young.

You know you are one when:

1. The first thing you do after waking up is pull down your white board, recompile the Linux kernel or start AutoCAD.
2. You’d rather fix your car (with all the extra gizmos you installed) than go to the mechanic.
3. You prefer to use Gauss-Jordan elimination to solve simultaneous equations.
4. You have worked extensively with MATLAB at some point in your life.
5. You’d rather grab an AMD Opteron quad-core than grab a bikini-clad Angelina Jolie.

It’s the passion and sense of responsibility that drives us.
It shows in everything from designing a new coffee-maker to building the next Taj Mahal to working on avionics for the next generation of Tomcat fighter jets. Getting those thermodynamic, torsion and discrete math equations right could be the difference between a happy home and a burnt one, between losing and winning a raging war, and could uplift a nation in its entirety.

I’d like to give kudos to the lady engineers. The ones who took up this gargantuan profession despite all the odds. The ones with pencils in their ears whilst wearing high-heeled shoes. A lady engineer is courageous, motivated and adept at work; she understands people and real-world issues and expresses her abstract thoughts even more fluently. She is firm and yet quite tender.
If Alice were an engineer, she would have told Humpty Dumpty she could put him together again, thereby shortening the poem.

Engineers have some rather exceptional traits.
An excellent engineer, more than anyone else, thinks in paradigms. We have the ability to pull together and apart any object or situation with just our imagination. We dream the dreams of giants and we can see greatness in the modest.

To my fellow engineers – Though the world around you may not understand you quite as much as you understand it, remember that most of us were born well before our time and that God loves the Engineer more than any other profession because we live out the Creator’s first commandment.

Also there’s a divine perk to the profession -- God is an Engineer. All engineers go to heaven.
Categories:   Random Thoughts