The Sentinel’s duty (a modern polling fairy tale)


I like real-time web apps: I like using them and I like designing them.
Even when the context seems very similar, every real-time web app is a different challenge with its own specific needs.

Because of constraints around backward compatibility and browser support, only on a few occasions have I been able to play with SignalR or the newest technologies to deliver truly real-time data in an enterprise scenario.

The strategy I have to implement most often is polling, so project after project I’ve tried different approaches.

My current favorite is the “Sentinel approach”, or at least that’s what I call it; I don’t know whether this “pattern” already exists under a different, well-known name.

If you have to deal with high-traffic websites, optimizing the data is not a detail, so it is a good idea to optimize your polling too, especially if you can’t count on a big server farm for your sparkling new project.

So, sadly, long polling is not a valid option in a lot of cases either.

Besides, I normally prefer to work with static files in order not to overload the database and the server, and for the same reason my polling rate is rarely below 5 seconds.

To clarify: with the Sentinel approach you get an almost-real-time app rather than a truly real-time app, but luckily that is enough for a lot of web applications.

A simple example

Let’s assume that you want to build an application to follow a live event, like a chess match.

You want to be updated every time a new move is made, and in this case being updated with a delay of 5 seconds (in the worst case) is acceptable.

Speaking about the data for a second:

It makes a huge difference whether you own the data or you depend on a third-party source.
If you own the data you can do whatever you want, but if you depend on a third party there are different scenarios you may have to deal with.

The most common:

  1. Incremental: any time there is new data, the third party provides only the missing delta (the last move in the chess match).
  2. Overall: any time there is new data, the third party sends ALL the data up to the latest information (all the moves made in the chess match so far).
  3. Mixed: they provide both the Incremental and the Overall (a rough sketch of the first two follows this list).
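
To make the difference concrete, here is a rough TypeScript sketch of the two payloads for the chess example; the field names and shapes are purely hypothetical.

// Incremental: the provider pushes only the missing delta (the last move).
interface IncrementalUpdate {
  moveNumber: number;  // e.g. 42
  move: string;        // e.g. "Qxf7#", in algebraic notation
  playedAt: number;    // Unix timestamp in milliseconds
}

// Overall: the provider resends the whole game up to the latest move.
interface OverallSnapshot {
  matchId: string;
  lastUpdated: number; // Unix timestamp in milliseconds
  moves: string[];     // every move played so far
}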

Excluding the third one, the two scenarios above drive the design of your application: how to store the information inside it, how to update the UI with that information, how to correct a piece of information that arrived wrong, what to do in case of missing data or a failing call, what to do when you bootstrap the application while the match has already started and you have to rebuild the state up to the latest moment, and so on.

If you own the data you will probably go with the third option to make your life easier, but in any case you will request the same file every 5 seconds, full of information you probably don’t need because you already have it, so you will discard it.

If you have to request the Overall file every time, the bytes wasted per request grow as the match goes on; but even with the Incremental file you ask for information you then have to discard, and the Incremental file probably won’t be as small as you expect.

And how can you take advantage of a caching service (Akamai, for instance) without risking losing new data?

When in doubt, trust the Sentinel.
So what is the Sentinel and what is its duty?

The Sentinel is a tiny file (a few bytes) that warns you when new data is present.
In the Overall file case (the simpler one from this perspective), every time the third party sends you new information you rewrite the Overall file and update a timestamp field in the Sentinel file.
If the timestamp stored in the client is different, the app asks for the Overall file and updates its internal timestamp with the one in the Sentinel; five seconds later the app will fetch the Sentinel again, find the same timestamp it already has stored, and won’t ask for the Overall file.

And so on, until the third party sends something new.
Basically your polling watches the Sentinel, and the Sentinel gives you the information (a timestamp, in our simple case) to decide what to do.
So while waiting for a new move during the chess match (and a move can take a very looong time) the app asks the server for a few bytes every 5 seconds instead of kilobytes every 5 seconds.
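
On the publishing side the duty is tiny. A minimal Node sketch, assuming the static files are served from a hypothetical public/ folder (names and paths are only illustrative):

import { writeFile } from "node:fs/promises";

// Every time the third party sends new data: rewrite the Overall file
// and bump the timestamp in the Sentinel file.
async function publish(snapshot: object): Promise<void> {
  const t = Date.now();

  // The heavy file, rewritten with all the data up to the latest move.
  await writeFile("public/overall.json", JSON.stringify(snapshot));

  // The tiny file the clients actually poll every 5 seconds.
  await writeFile("public/sentinel.json", JSON.stringify({ t }));
}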

I know it can seem a bit paranoid, but trust me: if you have thousands or tens of thousands of people following the chess match at the same time, the difference between bytes and kilobytes is a huuuuge difference, and often it’s that difference that keeps your server(s) up and running.

A sample Sentinel can be:

{"t":1417105815282}

You can use the timestamp in the Sentinel as a cache-buster for the Overall file.

overall.json?t=1417105815282

In this way you take full advantage of services like Akamai, because every new client connecting will ask for a file with the same timestamp, already cached by Akamai, and won’t hit the origin.
Simplifying: you are synchronizing the clients on the same timestamp.
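
Putting the client side together, here is a minimal sketch of the polling loop; the endpoint names, the render function and the 5-second rate are placeholders, not a definitive implementation.

// Timestamp of the last Overall file this client has processed.
let lastT = 0;

// App-specific UI update, stubbed here.
function render(overall: unknown): void {
  console.log("new data", overall);
}

async function poll(): Promise<void> {
  // The Sentinel is only a few bytes, so polling it is cheap.
  const sentinel: { t: number } = await (await fetch("sentinel.json")).json();

  // Same timestamp as last time: nothing new, skip the heavy Overall file.
  if (sentinel.t === lastT) return;

  // New data: use the Sentinel timestamp as a cache-buster, so every
  // client asks the CDN for exactly the same URL.
  const overall = await (await fetch(`overall.json?t=${sentinel.t}`)).json();
  lastT = sentinel.t;
  render(overall);
}

// Poll the Sentinel every 5 seconds.
setInterval(() => { poll().catch(console.error); }, 5000);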

Of course I’ve simplified the possible scenarios and their implications a LOT, but the Sentinel approach should be clear (hopefully!). Depending on the data, the hardware, the expected traffic, the polling rate needed and a lot of other details related to the specific business, the Sentinel can be enriched to allow more sophisticated logic at the cost of just a few bytes more.
You can also make different decisions on how and where to store and read the information, instead of using a static file as I suggested in this example.
