Has replication as we know it reached the end of its usefulness?

In a fully connected, always-on, any-device world there doesn't seem to be space for replication (or synchronization, as other vendors call it). So why not scrap it and move on? I'm sure no one will miss the little diamond indicators denouncing a replication conflict.

If you followed the teaser entry to the full post, you are ready for the real argument. I'm a big proponent of replication/sync; I just think it needs some fixing. In reality we are not fully but mostly connected, and not always on but occasionally out of coverage. The big success in mobile devices is push mail (another fancy term for sync). So why do I declare the end of its usefulness? Let us look at how replication works today. Replication is:

User initiated: client start/stop, button pressed, server console command typed
On schedule
Between clusters: streaming on data change
Network unaware: not being able to reach the other partner is considered a failure rather than a transient condition

The first two are available for clients and servers, #3 only for (cluster) servers. In a mobile world that model doesn't fit. I don't want to replicate manually or wait for the next schedule to occur. I need the information I entered to be available wherever I look for it. I don't want to distinguish between online and offline, connected and out of coverage. Some examples to illustrate this:

One success factor for the Blackberry was exactly this behaviour. You can read and reply to your email regardless of your connection status. Outgoing messages simply get queued until the network is available.
Another page from history: what was (besides the size and battery life) the one defining feature of the original Palm? No save button. Information entered was entered, full stop. When the Palm got network connectivity (think: put into its cradle), data was sent to the other participating devices (your desktop) automatically.

What works for email needs to work for applications too. So what needs to be fixed in replication? Definitely not the mechanism as such. It has proven to be robust, reliable (short of occasional clearances of the replication history) and recoverable (think resync after error). Network awareness and triggers need to change:

We need an option: sync after data change. A document created or changed locally gets queued for replication immediately. That would be very similar (in concept) to the cluster streaming replication model. For the server it is a little trickier, since the server would need to keep track of which servers to replicate to, and admins need to have a say.
Replication needs to use a network-aware queue that stops replicating to a server if it isn't reachable (but retries at defined intervals to clear the queue). The queue needs to be smart enough to eliminate duplicate requests. Only when a certain threshold is crossed should the status go from pending to warning to failure.
Partial replication needs to be easier. Today entering a formula is close to rocket science. Some developer provision needs to be made, so a user can select something like "Replication profile {add-a-name-here}" that would define the subset for her. That profile, defined by the developer, would allow the use of @UserName or @UserNameList. It would allow specifying things like: replicate everything except the "request" form, plus those request forms where @UserName is in one of the following fields:... This will make it VERY easy to take applications offline. After all, in a lot of typical Notes applications I want MY data (and not your project plan, although I occasionally wouldn't mind peeking, which my access would allow, so reader fields alone don't cover the use case).
Currently servers can't initiate a replication to a client (push replication, so to speak). That needs to change somehow. I would introduce a publish/subscribe model. A client "subscribes" to a set of documents in a database, and whenever a document matching the subscription criteria is touched, the server sends a message to a message queue saying that data has changed. If the client is connected, it will receive the message instantly and can initiate a client pull replication. If the client is not connected, it will get the message the next time it connects and will know which databases to replicate. One could argue: why not put the changed data into the message queue and eliminate the replication step? The answer is manifold: change notifications would be very small and wouldn't require a lot of storage. A client implementation might offer the user a choice: retrieve this and that, but not everything now. But foremost: replication is tried and tested, and we don't want to have different mechanisms of data synchronisation, so the message queue would be just another trigger.
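
To make the queue idea a bit more tangible, here is a minimal sketch in Python of what such a network-aware replication queue could look like. It is an illustration only, not how Notes or Domino work today: the class ReplicationQueue, the callbacks is_reachable and replicate, and the thresholds warn_after and fail_after are all made-up names. A local save and a "data has changed" notification from a subscription would both simply call on_change.

```python
from collections import OrderedDict
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    WARNING = "warning"
    FAILURE = "failure"


@dataclass
class ServerState:
    failed_attempts: int = 0
    status: Status = Status.PENDING


class ReplicationQueue:
    """Hypothetical network-aware queue of (server, database) requests."""

    def __init__(self, is_reachable, replicate, warn_after=3, fail_after=10):
        # is_reachable(server) -> bool and replicate(server, database) are
        # assumed callbacks supplied by the host application.
        self._is_reachable = is_reachable
        self._replicate = replicate
        self._warn_after = warn_after
        self._fail_after = fail_after
        self._pending = OrderedDict()  # used as an ordered set of requests
        self._servers = {}

    def on_change(self, server, database):
        """Trigger: a local save or a change notification for a subscription."""
        # A request that is already queued is not queued again.
        self._pending.setdefault((server, database), None)

    def pending(self):
        return list(self._pending)

    def status(self, server):
        return self._servers.get(server, ServerState()).status

    def drain(self):
        """Call at each retry interval or when connectivity comes back."""
        unreachable = set()
        for key in list(self._pending):
            server, database = key
            if server in unreachable:
                continue  # already counted as a miss in this pass
            state = self._servers.setdefault(server, ServerState())
            if not self._is_reachable(server):
                # Transient condition: keep the request and only escalate
                # the status once a threshold is crossed.
                unreachable.add(server)
                state.failed_attempts += 1
                if state.failed_attempts >= self._fail_after:
                    state.status = Status.FAILURE
                elif state.failed_attempts >= self._warn_after:
                    state.status = Status.WARNING
                continue
            self._replicate(server, database)
            del self._pending[key]
            state.failed_attempts = 0
            state.status = Status.PENDING
```

A caller would invoke on_change whenever a document is saved or a notification arrives, and drain on a timer. The existing replication mechanism stays untouched behind the replicate callback, so the queue is just another trigger. Error handling for a replication that fails halfway is deliberately left out of this sketch.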

What's your take?

Posted by Stephan H Wissel on 17 November 2009 | Comments (2) | categories: IBM Notes Lotus Notes Show-N-Tell Thursday

posted by Karsten Lehmann on Wednesday 18 November 2009 AD:
Well, it's a good idea. But we would need an NSF storage on a mobile device to use any of the replication features.
As long as no mobile NSF store is available, this could also be solved by an ISV's custom implementation.

Is IBM dev working on such a mobile NSF version? :-)

posted by Stephan H. Wissel on Wednesday 18 November 2009 AD:
I don't see a direct connection between an NSF on a mobile device and this approach. There are a number of mobile apps that sync with Domino without using a local NSF. The beauty of the "please replicate" notification is that you can create adapters for any platform (I would use JMS as the queue, so most of the code is ready-baked), which then use whatever mechanism is in place to pull data from Domino.
:-)

stw