Distributed Reconciliation

Y. Chong, Y. Hamadi, M. Shapiro and K. Moody

 

IceCube is envisioned to provide reconciliation services to distributed applications/primaries where each application/primary is able to retain its autonomy in managing its objects.

Conflicting operations are deemed to be a rare occurrence in optimistic replication systems; therefore users are able to perform tentative operations on their local copies (replicas) of the shared objects without any immediate synchronisation. Operations are deemed tentative because they could conflict with other operations performed by some other users and may be aborted by the system at a later time. Due to the lack of synchronisation, replicas inevitably diverge and become inconsistent and have to be made consistent with each other through a process called reconciliation. IceCube performs reconciliation by computing a schedule (an ordered set that includes only non-conflicting operations) which excludes some conflicting operations. When the computed schedule of operations is executed at the replicas, the system converges to a new single consistent state.

However, the current version of IceCube like many other optimistic replication systems such as Bayou [6] and CVS [1]  employ centralised policies to perform reconciliation. Such a configuration requires all tentative operations to be propagated to the elected site for reconciliation. Furthermore all the authoritative copies of all the objects have to be stored at this site to guarantee that the computed schedules are correct. Such a centralised architecture presents several drawbacks like limited scalability, single point of failure and lost of autonomy (applications/primaries have to surrender control over their objects to a single site).

The main objective of this project is to develop a distributed algorithm for computing the schedules. Our motivations for this approach are:

1.      Fault-tolerance - ability to make progress if one or more primaries fail or are non-contactable.

2.      Autonomous control - each primary/application retains its authority in committing operations acting on its objects.

3.      Privacy – a tentative operation is only propagated to the primaries that have the authority to commit or abort it. 

The model we are currently exploring relies heavily on the formal constraint theory. We represent the problem using Distributed Constraints Satisfaction Problem (DCSP) formulation [3].

Formally, a DCSP consist of 4 elements (X,D,C,A) where:

1.      X is a set of n variables  x1, x2, …, xn

2.      D is a set of domain where D1, D2, …, Dn, is the set of possible values for the variables x1, x2, …, xn respectively,

3.      C is a set of constraints on their values.

4.      A = {A1, A2,…, Am} is a partition of the n variables amongst m autonomous sequential processes  Agent1, Agent2, …, Agentm  called agents, where each agent Agentk “owns" a subset Ak of the variables in x1,..,xn

The solution to a DCSP is an instantiation of all n variables such that all the constraints on their values are respected.

Therefore we can formalise a distributed reconciliation problem as a DCSP in which there are:

1.      X is the set of 2n variables  x1, x2, …, xn,s1, s2, …, sn where n is the number of operations in the system to be reconciled,

2.      D is the set of domain where Dx1, Dx2,…, Dxn, Ds1, Ds2, …, Dsn , , is the set of possible values for the variables x1, x2, …, xn,s1, s2, …, sn respectively, where Dxi=[0..1] and Dsi=[-(n-1) .. (n-1)] | 1 ≤ in

3.      C is the set of predicates defining the IceCube constraints between the operations {see [4] for definition},

4.      A is a partition of the 2n variables amongst the set of m autonomous primaries P1, P2, …, Pm defined by f: X θ A such that f(xi) = Pk  ==> f(si) = Pk and f(si) = Pk  ==> f(xi) = Pk | 1 ≤ in; 1 ≤ km

Solving a distributed reconciliation problem would then be equivalent to solving a DCSP problem which is to search for an assignment of values to the set of variables such that none of the constraints are violated. Although satisfaction ensures that the computed schedules are correct, it doesn’t provide any guarantees on the quality of the computed schedules. For example, a correct schedule could exclude higher priority operations and include lower priority ones. Although correctness is important, correctness alone is not enough; we would also like the quality of the aggregate schedules to be as close to the quality of the optimal (albeit a centrally-computed) solution (i.e. we want to maximise the number of non-conflicting operations in the computed schedule).

Despite its optimality, complete protocol such as ADOPT [5] is not practical. Indeed we have to manage hundreds of operations distributed amongst dozens of primaries. Therefore we are considering protocols that are correct but possibly not optimal.

We have implemented a simulator (see screenshot below) to evaluate the viability of the model. The simulator generates a random problem that is bounded by the parameters defined by the user. 

The graph on the left represents the set of primaries as a flat structure. Each primary in the system is represented as a node in the graph and two nodes are linked by an edge if there are constraints between any two of their operations. The graph on the right depicts a hierarchical structure of the problem. The main purpose of the acyclical hierarchical structure is to ensure that a solution can be computed and that the algorithm will terminate. This structure can be built using various heuristics [3]. We have identified a method which can ensure, for some input, a backtrack-free search. Interestingly, when a backtrack-free search is not possible, our method minimises backtracking. The display below the two graphs shows the results of the centralised and the distributed computations.

The above problem simulates a distributed system containing 20 primaries. Our backtracking-minimising method constructed a 14-level hierarchical structure. The optimal (i.e. the best possible) solution to the problem was computed using Disolver [2]. As can be seen from the screenshot above, 36 out of the 100 tentative operations were scheduled in the optimal solution. Our distributed algorithm was able to produce a solution that contains 27 out of the possible 36 operations, giving the distributed solution a quality of about 75% of the maximum quality.

We are currently developing various heuristics to further maximise the quality of the results and hence the reconciliation.

References:

1.      P. Cederqvist, R. Pesch, et al. 2001. Version management with CVS. http://www.cvshome.org/docs/manual.

2.      Y. Hamadi. Disolver: A Distributed Constraint Solver. http://research.microsoft.com/%7Eyoussefh/DisolverWeb/Disolver.html.

3.      Y. Hamadi. Interleave Backtracking in Distributed Constraint Networks. International Journal on Artificial Intelligence Tools, Volume 11, Number 4, 2002.

4.      Y. Hamadi, M. Shapiro. Pushing log-based Reconciliation. International Journal on Artificial Intelligence Tools, Volume 14, Number 3-4, 2005.

5.      P. J. Modi, W. Shen, M. Tambe, M. Yokoo. An asynchronous complete method for distributed constraint optimization. The 2nd International Joint Conference on Autonomous Agents & Multiagent Systems, July 2003.

6.      D. Terry, M. Theimer, K. Petersen, A. Demers, M. Spreitzer, C. Hauser. Managing update conflicts in Bayou, a weakly connected replicated storage system. Proceedings 15th Symposium on Operating Systems Principles, December 1995.