Distributed
Reconciliation
Y. Chong, Y.
Hamadi, M. Shapiro and K.
Moody
IceCube is envisioned to provide reconciliation services
to distributed applications/primaries where each application/primary is able to
retain its autonomy in managing its objects.
Conflicting
operations are deemed to be a rare occurrence in optimistic replication systems;
therefore users are able to perform tentative operations on their local copies
(replicas) of the shared objects without any immediate synchronisation.
Operations are deemed tentative because they could conflict with other
operations performed by some other users and may be aborted by the system at a
later time. Due to the lack of synchronisation, replicas inevitably diverge and
become inconsistent and have to be made consistent with each other through a
process called reconciliation. IceCube performs
reconciliation by computing a schedule (an ordered set that includes only
non-conflicting operations) which excludes some conflicting operations. When the
computed schedule of operations is executed at the replicas, the system
converges to a new single consistent state.
However, the
current version of IceCube like many other optimistic
replication systems such as Bayou
[6] and CVS
[1] employ centralised policies
to perform reconciliation. Such a configuration requires all tentative
operations to be propagated to the elected site for reconciliation. Furthermore
all the authoritative copies of all the objects have to be stored at this site
to guarantee that the computed schedules are correct. Such a centralised
architecture presents several drawbacks like limited scalability, single point
of failure and lost of autonomy (applications/primaries have to surrender
control over their objects to a single site).
The main
objective of this project is to develop a distributed algorithm for computing
the schedules. Our motivations for this approach are:
1.
Fault-tolerance -
ability to make progress if one or more primaries fail or are
non-contactable.
2.
Autonomous control -
each primary/application retains its authority in committing operations acting
on its objects.
3.
Privacy a tentative
operation is only propagated to the primaries that have the authority to commit
or abort it.
The model we
are currently exploring relies heavily on the formal constraint theory. We
represent the problem using Distributed
Constraints Satisfaction Problem (DCSP) formulation [3].
Formally, a
DCSP consist of 4 elements (X,D,C,A)
where:
1.
X is a set of n variables x1, x2,
, xn
2.
D is a set of domain
where D1, D2,
, Dn, is the set of
possible values for the variables x1, x2,
, xn
respectively,
3.
C is a set of
constraints on their values.
4.
A =
{A1, A2,
, Am} is a partition of
the n variables amongst m autonomous sequential processes Agent1, Agent2,
, Agentm
called agents, where each
agent Agentk owns" a subset
Ak of the variables in
x1,..,xn
The solution
to a DCSP is an instantiation of all n variables such that all the
constraints on their values are respected.
Therefore we
can formalise a distributed reconciliation problem as a DCSP in which there
are:
1.
X is the set of 2n variables x1, x2,
, xn,s1, s2,
, sn where n is the number of operations in the
system to be reconciled,
2.
D is the set of domain
where Dx1, Dx2,
, Dxn, Ds1, Ds2,
, Dsn , , is
the set of possible values for the variables x1, x2,
, xn,s1, s2,
, sn respectively, where
Dxi=[0..1] and Dsi=[-(n-1) .. (n-1)] | 1 ≤
i ≤ n
3.
C is the set of
predicates defining the IceCube constraints between
the operations {see
[4] for definition},
4.
A is a partition of
the 2n variables amongst the set of
m autonomous primaries P1, P2,
, Pm defined by f: X θ A
such
that f(xi) = Pk ==> f(si) =
Pk and f(si) = Pk ==> f(xi) = Pk | 1 ≤ i ≤ n; 1 ≤ k ≤ m
Solving a
distributed reconciliation problem would then be equivalent to solving a DCSP
problem which is to search for an assignment of values to the set of variables
such that none of the constraints are violated. Although satisfaction ensures
that the computed schedules are correct, it doesnt provide any guarantees on
the quality of the computed schedules. For example, a correct schedule could
exclude higher priority operations and include lower priority ones. Although
correctness is important, correctness alone is not enough; we would also like
the quality of the aggregate schedules to be as close to the quality of the
optimal (albeit a centrally-computed) solution (i.e. we want to maximise the
number of non-conflicting operations in the computed
schedule).
Despite its
optimality, complete protocol such as ADOPT
[5] is not practical. Indeed we
have to manage hundreds of operations distributed amongst dozens of primaries.
Therefore we are considering protocols that are correct but possibly not
optimal.
We have
implemented a simulator (see screenshot below) to evaluate the viability of the
model. The simulator generates a random problem that is bounded by the
parameters defined by the user.

The graph on
the left represents the set of primaries as a flat structure. Each primary in
the system is represented as a node in the graph and two nodes are linked by an
edge if there are constraints between any two of their operations. The graph on
the right depicts a hierarchical structure of the problem. The main purpose of
the acyclical hierarchical structure is to ensure that a solution can be
computed and that the algorithm will terminate. This structure can be built
using various heuristics [3].
We have identified a method which can ensure, for some input, a backtrack-free
search. Interestingly, when a backtrack-free search is not possible, our method
minimises backtracking. The display below the two graphs shows the results of
the centralised and the distributed computations.
The above
problem simulates a distributed system containing 20 primaries. Our
backtracking-minimising method constructed a 14-level hierarchical structure.
The optimal (i.e. the best possible) solution to the problem was computed using
Disolver [2]. As can be seen from the screenshot above,
36 out of the 100 tentative operations were scheduled in the optimal solution.
Our distributed algorithm was able to produce a solution that contains 27 out of
the possible 36 operations, giving the distributed solution a quality of about
75% of the maximum quality.
We are
currently developing various heuristics to further maximise the quality of the
results and hence the reconciliation.
1.
P. Cederqvist, R. Pesch, et al. 2001. Version management with CVS.
http://www.cvshome.org/docs/manual.
2.
Y. Hamadi. Disolver: A Distributed Constraint Solver.
http://research.microsoft.com/%7Eyoussefh/DisolverWeb/Disolver.html.
3.
Y. Hamadi.
Interleave Backtracking in
Distributed Constraint Networks. International Journal on Artificial
Intelligence Tools, Volume 11, Number 4,
2002.
4.
Y. Hamadi, M. Shapiro. Pushing
log-based Reconciliation. International
Journal on Artificial Intelligence Tools, Volume 14, Number 3-4,
2005.
5.
P. J.
Modi, W. Shen, M. Tambe, M. Yokoo. An asynchronous complete method
for distributed constraint optimization. The 2nd International Joint Conference on
Autonomous Agents & Multiagent Systems, July
2003.
6.
D. Terry, M. Theimer, K. Petersen, A. Demers, M. Spreitzer, C. Hauser. Managing update conflicts in
Bayou, a weakly connected replicated storage system. Proceedings 15th
Symposium on Operating Systems Principles, December
1995.