Cluster Wide Services

The existing OpenDaylight service deployment model assumes symmetric
clusters, where all services are activated on all nodes in the cluster.
However, many services require that there is a single active service
instance per cluster. We call such services 'singleton services'.
Examples of singleton services are global MD-SAL RPC services, services
that use centralized data processing, or the OpenFlow Topology Manager,
which needs to interact with all OF switches connected to a clustered
controller and determine how the switches are interconnected.

Developers of singleton services must create logic that determines
the active service instance, manages service failovers and ensures
that a service instance always runs in the surviving partition of a
cluster. This logic must interact with the Entity Ownership Service
(EOS), and it is not easy to implement and debug. Moreover, each
developer of an ODL-based singleton service has to design and implement
essentially the same functionality, each with its own behavior,
engineering and issues. The Cluster Singleton Service is intended to
abstract this funtionality into a simple-to-use service that can be
used by all developers of ODL-based singleton services.

The General Cluster Singleton Service Approach

One of the key elements in the design of the Cluster Singleton Service is the notion of a single cluster service instance, which corresponds to a single Entity instance in the Entity Ownership Service (EOS). The EOS represents the base Leadership choice for one Entity instance. Therefore, candidate elections can be moved ("outsourced") to the EOS. Every Cluster Singleton Service type must have its own Entity, and every Cluster Singleton service instance must have its own Entity Candidate. Every registered Entity Candidate must be notified about its actual role in the cluster.

To ensure that there is only one active (i.e. fully instantiated) service instance in the cluster at any given time, we use the "double-candidate" approach: a service instance maintains not only a candidate registration for the ownership of the service’s own Entity in the cluster, but also an additonal (guard) ownership registration that facilitates full & graceful shutdown of the service instance before the "leadership" of the service is relinquished (i.e. before a service instance is deactivated). To achieve "leadership" of a singleton service, a service candidate must hold ownership of both these entities (see the sequence diagram below).

Double Candidate Solution (Async. Close Guard)
Figure 1. Double Candidate Solution (Async. Close Guard)

The double-candidate approach prevents the shutdown of a service instance with outstanding asynchronous operations, such as unfinished MD-SAL Data Store transactions. The main entity candidate reflects the actual role of the service instance in the cluster; the close guard entity candidate is a guard that tracks outstanding asynchronous operations. Every new Leader must register its own close guard entity candidate. A Leader that wishes to relinquish its leadership must close its close guard entity candidate. This is typically done in the last step of the service’s shutdown procedure. When the old Leader relinquishes its close guard entity ownership, the new Leader will take the ownership for the close guard entity candidate (it has to hold ownership for both candidate signatures). That is the marker to full start cluster node application instance and the old leader stops successfully. Figure 1 shows the entire sequence.

The double-candidate approach (with the asynchronous close guard entity) creates the following prerequisite: "the actual ownership doesn’t change with the registration of a new candidate".

The Cluster Singleton Service

The double-candidate solution can be used for all ODL services. Developers of Singleton services no longer have to keep re-implement the same code for every new service. Moreover, the Cluster Singleton Service hides the interactions with the EOS service from its user. These interactions can be encapsulated in an "ODL Cluster Singleton Service Provider" parent component.

Class Diagram Cluster Singleton Service
Figure 2. Class Diagram Cluster Singleton Service

Cluster Singleton Service Grouping

In some use cases, closely cooperating services should "share the same fate", i.e. they should always be instantiated on the same Cluster Node. In case of a failover, leaders for all services in the fate-sharing set should be instantiated on the same (surviving) Cluster Node. For best efficiency and performance, the shard leaders for shards owned by these services should also reside on the same Cluster Node as the services. In this case, interactions with the EOS are provided for the entire set of fate-sharing ClusterSingletonService instances by a single double-candidate Entity instance.

Class Diagram Cluster Singleton Service Group
Figure 3. Class Diagram Cluster Singleton Service Group

Cluster Singleton Service Provider

The Provider implementation is realized as a standalone service which has to be instantiated for every ClusterNode and it has to be available for every dependent application. Its class diagram looks as follows.

Class Diagram Cluster Singleton Service Provider
Figure 4. Class Diagram Cluster Singleton Service Provider

Example: Cluster Singleton Service RPC Implementation

We’d like to show a grouping RPC service sample. RPC services don’t need to be a part of the same project.

public class SampleClusterSingletonServiceRPC_1 implements ClusterSingletonService, AutoCloseable {

    /* Property contains an entity name guard for all instances of this group of services */
    private static final String CLUSTER_SERVICE_GROUP_IDENTIFIER = "sample-service-group";

    private ClusterSingletonServiceRegistration registration;

    public SampleClusterSingletonServiceRPC_1(final ClusterSingletonServiceProvider provider) {
        Preconditions.checkArgument(provider != null);
        this.registration = provider.registerClusterSingletonService(this);
    }

    @Override
    public void instantiateServiceInstance() {
        // TODO : implement start service functionality
    }

    @Override
    public ListenableFuture<Void> closeServiceInstance() {
        // TODO : implement sync. or async. stop service functionality
        return Futures.immediateFuture(null);
    }

    @Override
    public String getServiceGroupIdentifier() {
        return CLUSTER_SERVICE_GROUP_IDENTIFIER;
    }

    @Override
    public void close() throws Exception {
        if (registration != null) {
            registration.close();
            registration = null;
        }
    }

}

public class SampleClusterSingletonServiceRPC_2 implements ClusterSingletonService, AutoCloseable {

    /* Property contains an entity name guard for all instances of this group of services */
    private static final String CLUSTER_SERVICE_GROUP_IDENTIFIER = "sample-service-group";

    private ClusterSingletonServiceRegistration registration;

    public SampleClusterSingletonServiceRPC_1(final ClusterSingletonServiceProvider provider) {
        Preconditions.checkArgument(provider != null);
        this.registration = provider.registerClusterSingletonService(this);
    }

    @Override
    public void instantiateServiceInstance() {
        // TODO : implement start service functionality
    }

    @Override
    public ListenableFuture<Void> closeServiceInstance() {
        // TODO : implement sync. or async. stop service functionality
        return Futures.immediateFuture(null);
    }

    @Override
    public String getServiceGroupIdentifier() {
        return CLUSTER_SERVICE_GROUP_IDENTIFIER;
    }

    @Override
    public void close() throws Exception {
        if (registration != null) {
            registration.close();
            registration = null;
        }
    }

}

Both RPCs are instantiated for some ClusterNode and both RPCs have only one instance in the entire Cluster.

Cluster Singleton Application

Applications packaged as OSGi modules can be viewed as services too. The OSGI container can be viewed as an application loader. Every OSGi application has its own lifecycle that should be adapted to use EOS. Only the application instance that has become the "leader" should be fully loaded and initialized. Basically, we would like to encapsulate interactions with the EOS in a cluster-aware ODL application Loader.

Life cycle of plug-ins in OSGi
Figure 5. Life cycle of plug-ins in OSGi

Application Module Instantiation

Every "ODL application" contains the Provider class, which is instantiated in the AbstractModule<ODL app> class. A Module has the method createInstance(), which starts an application Provider. So an application provider must implement the ClusterSingletonService interface and the application provider initialization (or constructor) must register itself to the ClusterSingletonServiceProvider. The application Provider body will be initialized by the leader ClusterNode election for the master only.

Base Cluster-wide app instantiation
Figure 6. Base Cluster-wide app instantiation

So we are able to hide whole EOS interaction from the user and encapsulate it inside the "ClusterSingletonServiceProvider" implementation. An application or a service only needs to implement the relevant interface and register itself to the ??? provider.

A simplified sequence diagram (without the double-candidate) is displayed in the following figure:

Simply Cluster-wide app instantiation (without double candidate)
Figure 7. Simply Cluster-wide app instantiation (without double candidate)

The full sequence implementation diagram for the AbstractClusterProjectProvider ??? is displayed in the following figure:

Cluster-wide app instantiation
Figure 8. Cluster-wide app instantiation
public class ClusterSingletonProjectSample implements ClusterSingletonService, AutoCloseable {

    /* Property contains an entity name guard for all instances of this group of services */
    private static final String CLUSTER_SERVICE_GROUP_IDENTIFIER = "sample-service-group";

    private ClusterSingletonServiceRegistration registration;

    public ClusterSingletonProjectSample(final ClusterSingletonServiceProvider provider) {
        Preconditions.checkArgument(provider != null);
        this.registration = provider.registerClusterSingletonService(this);
    }

    @Override
    public void instantiateServiceInstance() {
        // TODO : implement start project functionality

    }

    @Override
    public ListenableFuture<Void> closeServiceInstance() {
        // TODO : implement sync. or async. stop project functionality
        return Futures.immediateFuture(null);
    }

    @Override
    public String getServiceGroupIdentifier() {
        return CLUSTER_SERVICE_GROUP_IDENTIFIER;
    }

    @Override
    public void close() throws Exception {
        if (registration != null) {
            registration.close();
            registration = null;
        }
    }

}

public class ApplicationModule extends ProjectAbstractModule<? extends AbstractStatisticsManagerModule> {

    ...

    @Override
    public java.lang.AutoCloseable createInstance() {
        AbstractServiceProvider projectProvider =
            new ClusterSingletonProjectSample(getClusterSingletonServiceProviderDependency());
        return projectProvider;
    }
}