This functionality is available in versions 0.0.6 and up.
One consequence of being able to replicate documents between devices and servers is that the same document may be edited in more than one place at the same time. When the databases containing these concurrent edits are replicated, there needs to be some way to bring the divergent documents back together. Cloudant's MVCC data model is used to do this. This page describes how it works.
The Sync datastore participates in master-less replication with Cloudant or Apache CouchDB. What this means is that there is no canonical copy of the documents in each database. One of the main results of this fact is that changes may happen to a document in many different places concurrently. When these changes are replicated between previously disconnected databases, conflicts arise. However, Cloudant and Cloudant Sync provide ways to both access and resolve conflicts from within your application.
It's important to understand that this data model is in place to make sure that:
- The user loses no data -- we keep all versions of a document that haven't been superseded. That is, all leaf nodes of the tree.
- The application has as much information as possible to resolve the conflicts, as it's able to examine all of the leaf nodes of the tree before resolving a conflict.
Cloudant Sync's MVCC data layer is key to the conflict resolution process, and can be visualised as a tree structure.
A document is really a tree of the document and its history. This is neat because it allows us to store multiple versions of a document. In the main, there's a single, linear tree -- just a single branch -- running from the creation of the document to the current revision. This is the usual case, and looks like this, with the revisions represented by their revision IDs:
1-x --- 2-x --- 3-x --- 4-x
                         ^
     "winning" revision /
The fact that the document is a tree implies that it's possible, however, to create further branches in the tree.
When a document has been replicated to more than one place, it's possible to edit it concurrently in two places. When the datastores storing the document then replicate with each other again, they each add their changes to the document's tree. This causes an extra branch to be added to the tree for each concurrent set of changes. When this happens, the document is said to be conflicted. This creates multiple current revisions of the document, one for each of the concurrent changes.
Say we last replicated the document above at the 2-x revision. We make two changes locally (3-x and 4-x) and the remote datastore has a single change made to it (3-y). On replicating back from the remote, the local datastore ends up with a document like this:
            replicated from remote
                     |
                     v
             ------ 3-y
           /
1-x --- 2-x --- 3-x --- 4-x
                         ^
     "winning" revision /
We now have two non-deleted leaf nodes: the document is conflicted.
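The scenario above can be modelled with a small, self-contained Swift sketch. The `Revision` and `RevisionTree` types are hypothetical and exist only for illustration; they are not part of the CDTDatastore API, and real revision IDs carry a hash rather than a single letter.

```swift
// Toy model of a document's revision tree (illustrative names only).
struct Revision {
    let id: String        // e.g. "3-y"
    let parent: String?   // nil for the root revision
    let deleted: Bool
}

struct RevisionTree {
    var revisions: [String: Revision] = [:]

    mutating func add(_ id: String, parent: String?, deleted: Bool = false) {
        revisions[id] = Revision(id: id, parent: parent, deleted: deleted)
    }

    // Leaves are revisions that no other revision names as its parent.
    var leaves: [Revision] {
        let parents = Set(revisions.values.compactMap { $0.parent })
        return revisions.values
            .filter { !parents.contains($0.id) }
            .sorted { $0.id < $1.id }
    }

    // A document is conflicted when it has more than one non-deleted leaf.
    var isConflicted: Bool {
        return leaves.filter { !$0.deleted }.count > 1
    }
}

var tree = RevisionTree()
tree.add("1-x", parent: nil)
tree.add("2-x", parent: "1-x")    // last revision both sides agreed on
tree.add("3-x", parent: "2-x")    // local edit
tree.add("4-x", parent: "3-x")    // local edit
print(tree.isConflicted)          // false: still a single branch

tree.add("3-y", parent: "2-x")    // arrives by replication from the remote
print(tree.leaves.map { $0.id })  // ["3-y", "4-x"]
print(tree.isConflicted)          // true
```

Nothing is overwritten when 3-y arrives: it simply becomes a second leaf hanging off 2-x, which is why both versions remain available to the application.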
To make things easier, calling [CDTDatastore -getDocumentWithId:error:] returns one of the leaf nodes of the branches of the conflicted document. It selects the node to return in an arbitrary but deterministic way, which means that all replicas of the database will return the same revision for the document. The other leaf revisions of a conflicted document are still there, however, waiting to be merged.
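To illustrate what "arbitrary but deterministic" can mean in practice, here is a minimal sketch of one possible selection rule. The rule itself is an assumption modelled on CouchDB-style behaviour (prefer non-deleted leaves, then the higher generation number, then the greater revision ID); this page does not specify the rule the library uses, and the `Leaf` type and `winner(among:)` function are hypothetical names.

```swift
// Hypothetical description of a leaf revision; not a CDTDatastore type.
struct Leaf {
    let revId: String    // "<generation>-<suffix>", e.g. "4-x"
    let deleted: Bool

    // The number before the first "-" in the revision ID.
    var generation: Int {
        return Int(revId.prefix(while: { $0 != "-" })) ?? 0
    }
}

// Assumed rule: non-deleted beats deleted, then higher generation wins,
// then the greater revision ID string breaks any remaining tie.
func winner(among leaves: [Leaf]) -> Leaf? {
    return leaves.max { a, b in
        if a.deleted != b.deleted { return a.deleted }
        if a.generation != b.generation { return a.generation < b.generation }
        return a.revId < b.revId
    }
}

let x4 = Leaf(revId: "4-x", deleted: false)
let y3 = Leaf(revId: "3-y", deleted: false)

// The order in which a replica happens to hold the leaves does not matter.
print(winner(among: [x4, y3])?.revId ?? "none")   // 4-x
print(winner(among: [y3, x4])?.revId ?? "none")   // 4-x
```

Because the rule depends only on the revisions themselves and not on arrival order or local state, every replica holding the same leaves picks the same winner without any coordination.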
When a document has been changed in many places, it becomes conflicted. This means that there are a number of active, alternative versions of the document. Applications -- whether on device or a web app communicating with the Cloudant or CouchDB HTTP interfaces -- must resolve the conflicts by creating a merged version of the active versions of the document, then updating the document with this and deleting the now obsolete leaf nodes.
Fortunately, Cloudant Sync has helper methods to simplify this. There's a method which returns all the documents in a conflicted state, along with a helper method to streamline the process of resolving conflicts.
There's a method on the CDTDatastore interface:
Objective-C:
- (NSArray*)getConflictedDocumentIds;
Swift:
public func getConflictedDocumentIds() -> [AnyObject]!
This method returns an array of the IDs of all conflicted documents, which you can iterate over:
Objective-C:
for (NSString *docId in [datastore getConflictedDocumentIds]) {
    NSLog(@"%@", docId);
}
Swift:
for docId in datastore.getConflictedDocumentIds() {
    print(docId)
}
Once you've found the list of documents, you need to resolve them. This is done one-by-one, passing a document ID and an object able to resolve conflicts to the resolveConflictsForDocument:resolver:error: method of the CDTDatastore interface.
The CDTConflictResolver protocol has one method:
Objective-C:
@protocol CDTConflictResolver
-(CDTDocumentRevision *)resolve:(NSString*)docId
                      conflicts:(NSArray*)conflicts;
@end
Swift:
public protocol CDTConflictResolver {
    func resolve(docId: String!, conflicts: [AnyObject]!)
        -> CDTDocumentRevision!
}
This method is passed the docId and the list of active revisions, including the current winning revision. A rather simplistic implementation which returns a revision from the list of conflicts would be:
Objective-C:
@implementation CDTPickFirstResolver
-(CDTDocumentRevision *)resolve:(NSString*)docId
                      conflicts:(NSArray*)conflicts
{
    return [conflicts objectAtIndex:0];
}
@end
Swift:
class CDTPickFirstResolver : NSObject, CDTConflictResolver {
    func resolve(docId: String!, conflicts: [AnyObject]!) -> CDTDocumentRevision! {
        return conflicts.first! as! CDTDocumentRevision
    }
}
Clearly, in the general case this will discard the user's data(!), but it'll do for this example.
It is also possible to return a modified CDTDocumentRevision from resolve, perhaps by merging data from the conflicts:
Objective-C:
@implementation CDTMergeResolver
-(CDTDocumentRevision *)resolve:(NSString*)docId
                      conflicts:(NSArray*)conflicts
{
    CDTDocumentRevision *rev = conflicts[0];
    rev.body = /* ...update body, perhaps with data from the other conflicts */
    rev.attachments = /* ...you can also create/update/delete attachments */
    return rev;
}
@end
Swift:
class CDTMergeResolver : NSObject, CDTConflictResolver {
    func resolve(docId: String!, conflicts: [AnyObject]!) -> CDTDocumentRevision! {
        // Cast first: the elements of `conflicts` are typed as AnyObject.
        let rev = conflicts.first! as! CDTDocumentRevision
        rev.body = /* ...update body, perhaps with data from the other conflicts */
        rev.attachments = /* ...you can also create/update/delete attachments */
        return rev
    }
}
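As a rough illustration of what "merging data from the conflicts" could look like, the sketch below merges document bodies field by field, keeping the winning revision's value whenever both sides define the same key. It works on plain dictionaries so that it is self-contained; `mergeBodies(winner:others:)` is a hypothetical helper, the string-only value type is a simplifying assumption, and the policy shown is just one of many an application might choose.

```swift
// Merge the bodies of conflicting revisions field by field.
// `winner` is the body of the current winning revision; `others` are the
// bodies of the remaining conflicting leaves. (Hypothetical helper.)
func mergeBodies(winner: [String: String],
                 others: [[String: String]]) -> [String: String] {
    var merged = winner
    for body in others {
        // Keep the value already present when both sides define a key;
        // only fields missing from the merged body are added.
        merged.merge(body) { current, _ in current }
    }
    return merged
}

let local = ["name": "Ada", "email": "ada@example.com"]
let remote = ["name": "Ada L.", "phone": "555-0100"]

let merged = mergeBodies(winner: local, others: [remote])
for (key, value) in merged.sorted(by: { $0.key < $1.key }) {
    print("\(key): \(value)")
}
// email: ada@example.com
// name: Ada
// phone: 555-0100
```

In a resolver, a dictionary produced this way would be assigned to the body of the revision returned from resolve. Note that this policy silently drops the remote change to "name"; an application that cannot afford that would need a per-field rule or user input.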
Conceptually, the resolveConflictsForDocument:resolver:error: method does the following:
1. Get all the non-deleted leaf node revisions for the document.

                ------ 3-y
              /
   1-x --- 2-x --- 3-x --- 4-x

   That's 3-y and 4-x here.
2. Call -resolve:conflicts: with the list of revisions from (1).
3. Take the returned revision and update the current winning revision (4-x) with this revision.
4. Delete the other non-deleted leaf nodes (3-y in this case) of the document tree.
The tree ends up looking like this:
             ------ 3-y --- 4-deleted
           /
1-x --- 2-x --- 3-x --- 4-x --- 5-x
                                 ^
             "winning" revision /
The winning revision is now the only non-deleted leaf node, so the document is no longer conflicted.
All this happens inside a transaction, ensuring consistency.
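The four conceptual steps can be walked through on a toy revision tree. This is a sketch of the idea only, not the library's implementation: `Node`, `activeLeaves`, `generation` and `resolve` are hypothetical names, revision IDs are simplified to match the diagrams on this page, and the transaction is not modelled.

```swift
// Toy revision tree keyed by revision ID (illustrative names only).
struct Node {
    let parent: String?
    let deleted: Bool
    let body: String
}

var tree: [String: Node] = [
    "1-x": Node(parent: nil, deleted: false, body: "v1"),
    "2-x": Node(parent: "1-x", deleted: false, body: "v2"),
    "3-x": Node(parent: "2-x", deleted: false, body: "v3 local"),
    "4-x": Node(parent: "3-x", deleted: false, body: "v4 local"),
    "3-y": Node(parent: "2-x", deleted: false, body: "v3 remote"),
]

// Non-deleted revisions that no other revision names as its parent.
func activeLeaves(_ tree: [String: Node]) -> [String] {
    let parents = Set(tree.values.compactMap { $0.parent })
    return tree.filter { !parents.contains($0.key) && !$0.value.deleted }
        .keys.sorted()
}

// The number before the first "-" in a revision ID.
func generation(_ id: String) -> Int {
    return Int(id.prefix(while: { $0 != "-" })) ?? 0
}

func resolve(_ tree: inout [String: Node], winner: String,
             resolver: ([String]) -> String) {
    let leaves = activeLeaves(tree)                        // step 1
    guard leaves.count > 1 else { return }                 // not conflicted
    let merged = resolver(leaves.map { tree[$0]!.body })   // step 2
    // Step 3: add the resolved revision as a child of the winner.
    tree["\(generation(winner) + 1)-x"] =
        Node(parent: winner, deleted: false, body: merged)
    // Step 4: add a deleted revision on top of every other leaf.
    for leaf in leaves where leaf != winner {
        tree["\(generation(leaf) + 1)-deleted"] =
            Node(parent: leaf, deleted: true, body: "")
    }
}

resolve(&tree, winner: "4-x") { bodies in bodies.joined(separator: " + ") }
print(activeLeaves(tree))     // ["5-x"]
print(tree["5-x"]!.body)      // v3 remote + v4 local
```

Note that "deleting" a losing leaf adds a deleted revision on top of it rather than removing history, which is what allows the resolution itself to replicate to other databases.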
This resolution can be replicated to the remote document store, bringing the two databases into a consistent state.
You could imagine an application running the following method via a timer to periodically fix up any conflicts:
Objective-C:
- (void)resolveConflictsInDatastore:(CDTDatastore*)datastore
{
    // CDTConflictResolver is a protocol, so the resolver is held as id<CDTConflictResolver>.
    id<CDTConflictResolver> pickFirst = [[CDTPickFirstResolver alloc] init];
    for (NSString *docId in [datastore getConflictedDocumentIds]) {
        NSError *error = nil;
        if (![datastore resolveConflictsForDocument:docId
                                           resolver:pickFirst
                                              error:&error]) {
            // handle error
        }
    }
}
Swift:
public func resolveConflictsInDatastore(datastore:CDTDatastore)
{
    let pickFirst = CDTPickFirstResolver()
    for docId in datastore.getConflictedDocumentIds() {
        do {
            try datastore.resolveConflictsForDocument(docId as! String,
                                                      resolver: pickFirst)
        } catch {
            // handle error
        }
    }
}
How often this should run depends on your application, but you'd probably want to consider:
- Running every few minutes.
- Running when a pull replication completes.
We're always looking at ways to improve the experience around conflicts, so be sure to file an issue if you have suggestions or problems.