Oracle Flex Cluster: Leaf Node Failover


Oracle 12c introduced Flex Clusters, which use a hub-and-spoke topology that allows the cluster to scale well beyond pre-12c clusters, as it requires:

  • fewer network interactions between nodes in the cluster, and
  • less contention for key Clusterware resources such as the OCR and voting disks.

A Flex Cluster has two types of nodes: Hub Nodes and Leaf Nodes.

Hub Nodes

  • These nodes are essentially the same as conventional nodes in pre-12c clusters and form the core of the cluster.
  • Each Hub Node is connected to the other Hub Nodes via the private network for peer-to-peer communication.
  • Each Hub Node can access the shared storage and hence the OCR and voting disks residing on it.
  • A Hub Node may host an ASM instance, database instance(s) and applications.
  • Each cluster must have at least one Hub Node and can have up to 64 Hub Nodes.

Leaf Nodes

  • Leaf Nodes are more loosely coupled to the cluster than Hub Nodes and are not connected to one another.
  • Each Leaf Node is connected to the cluster through a single Hub Node, through which it requests data.
  • Although Leaf Nodes do not require direct access to shared storage, they may be given access so that they can be converted to Hub Nodes in the future.
  • They run a lightweight version of the Clusterware.
  • They cannot host database or ASM instances.
  • Leaf Nodes can host various types of applications, such as Fusion Middleware, E-Business Suite (EBS), and Identity Management (IDM). If a Leaf Node fails, the applications it hosts can fail over to a different node.
  • There may be zero or more Leaf Nodes in a Flex Cluster.
  • All Leaf Nodes are on the same public and private network as the Hub Nodes.

Hub Nodes can run in an Oracle Flex Cluster configuration without any Leaf Nodes as cluster member nodes, but for a Leaf Node to be part of the cluster, the cluster must have at least one Hub Node. When Clusterware starts on a Leaf Node, the Leaf Node automatically uses GNS to discover the Hub Nodes and connects to the cluster through one of them. One Hub Node may be associated with zero or more Leaf Nodes. The Hub Node periodically exchanges heartbeat messages with its associated Leaf Nodes so that they can participate in the cluster.
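On a running 12c Grid Infrastructure installation, the cluster mode and the configured role of each node can be checked with crsctl and olsnodes (run from the Grid home as the grid owner or root; output is omitted here as it depends on the environment):

```shell
# Confirm that the cluster is running in Flex mode
crsctl get cluster mode status

# Show the configured role (hub or leaf) of every cluster node
crsctl get node role config -all

# olsnodes -a also reports each node's role
olsnodes -a
```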

A Standard Cluster can be converted to a Flex Cluster, but a Flex Cluster cannot be converted back to a Standard Cluster without reconfiguration.
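A sketch of the conversion, assuming GNS with a fixed VIP is already configured (a prerequisite for Flex Cluster mode); the mode change only takes effect after Clusterware is restarted on every node:

```shell
# Verify that GNS is configured and running (prerequisite)
srvctl status gns

# As root: change the cluster mode to flex
crsctl set cluster mode flex

# As root, on each node: restart Clusterware for the change to take effect
crsctl stop crs
crsctl start crs -wait
```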

What happens when a Hub Node ceases to be a part of the cluster?

A Hub Node can be removed from the cluster as a result of:

  • Getting evicted
  • Server shutdown
  • Manually stopping the Oracle Clusterware

In such a scenario, the Leaf Nodes associated with that Hub Node fail over to one of the surviving Hub Nodes in the cluster.

In this article, I will demonstrate:

  • Identification of the Hub Node a Leaf Node is connected to
  • Failover of a Leaf Node following removal of the associated Hub Node from the cluster

Current Scenario:

For the purpose of this demonstration, I have set up a 12.1.0.2 Flex Cluster with the following nodes:

  • Hub Nodes
    • Host01
    • Host02
    • Host03
  • Leaf Nodes
    • Host04
    • Host05

Demonstration

Let us verify that currently Hub Node host01 and Leaf Node host04 are active:
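The active nodes and their roles can be listed with olsnodes; -s shows each node's status and -a its role (illustrative, no output shown):

```shell
# List cluster nodes with their status (Active/Inactive) and role (Hub/Leaf)
olsnodes -s -a
```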

Since host01 is currently the only active Hub Node in the cluster, Leaf Node host04 is associated with host01. This can be verified by looking at the trace file of the ocssdrim process on host04.
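A hedged way to inspect that trace file is to search it for connection-related messages (the exact message text varies by version, so a broad pattern is used here):

```shell
# On host04: show the most recent connection-related messages
# in the ocssdrim trace file
grep -i 'connect' $ORACLE_BASE/diag/crs/host04/crs/trace/ocssdrim.trc | tail -5
```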

Let us start another Hub Node host02 and Leaf Node host05.
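Starting Clusterware on the two additional nodes (run as root on each node; -wait reports progress until startup completes):

```shell
# As root on host02 (Hub Node), then on host05 (Leaf Node)
crsctl start crs -wait
```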

To find the Hub Node associated with Leaf Node host05, we take a look at the trace file of the ocssdrim process on host05:

We can see that Leaf Node host05 is also connected to Hub Node host01.

Let’s stop Oracle Clusterware on host01 to verify that both of the Leaf Nodes fail over to the only other surviving Hub Node in the cluster, i.e. host02.
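Stopping Clusterware on the Hub Node (run as root on host01):

```shell
# As root on host01: stop Clusterware so that its Leaf Nodes fail over
crsctl stop crs
```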

Verify that host04 has failed over to host02:
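The failover can be confirmed by looking for the teardown of the old connection in the Leaf Node's ocssdrim trace, using the same pattern as shown for host05 below (the message text is version-dependent):

```shell
# On host04: show the latest connection-teardown message
grep 'Destroying connection' $ORACLE_BASE/diag/crs/host04/crs/trace/ocssdrim.trc | tail -1
```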

Verify that host05 has also failed over to host02:

[root@host05 ~]# cat $ORACLE_BASE/diag/crs/host05/crs/trace/ocssdrim.trc | grep 'Destroying connection' | tail -1

Summary:

  • Oracle 12c has introduced Flex Clusters which have two types of nodes: Hub Nodes and Leaf Nodes.
  • Whereas each Hub Node can access the shared storage, Leaf Nodes do not require direct access to shared storage and are connected to the cluster through Hub Nodes.
  • When Clusterware is started on a Leaf Node, the Leaf Node automatically uses GNS to discover the Hub Nodes and gets connected to the cluster through one of the Hub Nodes.
  • If a Hub Node ceases to be part of the cluster, the Leaf Nodes associated with it fail over to one of the surviving Hub Nodes in the cluster.
