Oracle Cloud VMware Solution – Fault Domains

This blog summarises the general principles of how OCI Availability Domains and Fault Domains affect the design of VMware vSAN Fault Domains for different types of Oracle Cloud VMware Solution (OCVS) deployments, such as OCVS Single AD Clusters and OCVS Multi AD Clusters, and how these clusters interact with VMware vSphere HA Admission Control policies.

OCVS general overview:

Oracle Cloud VMware Solution (OCVS) is a VMware Software Defined Data Center (SDDC) solution on Oracle Cloud Infrastructure (OCI) that runs on either Intel or AMD Bare Metal Hosts and is based on the following VMware product stack:

  • VMware vSphere Hypervisor (ESXi)
  • VMware vCenter Server
  • VMware vSAN
  • VMware NSX-T
  • VMware HCX – Advanced Edition (Enterprise Edition billed separately)

The focus is on how the design of OCVS cluster fault domains affects the availability and manageability of OCVS Single AD and Multi AD Clusters.

OCVS Cluster minimum sizing:

The minimum supported size for an OCVS Cluster is always 3 Hosts, as this is the minimum host count a vSAN Standard Cluster requires to support a RAID-1 configuration that can tolerate 1 host failure.
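The 3-host minimum follows from the usual vSAN rule that RAID-1 mirroring needs 2 × FTT + 1 hosts (or fault domains) to tolerate FTT failures. A minimal sketch of that arithmetic is shown here; the function name is purely illustrative and not part of any VMware or Oracle tooling.

```python
def min_hosts_for_raid1(failures_to_tolerate: int) -> int:
    """Smallest host count for a vSAN RAID-1 (mirroring) policy.

    Uses the common 2n + 1 rule: n + 1 data copies plus n witness
    components, each placed on a different host.
    """
    if failures_to_tolerate < 1:
        raise ValueError("failures_to_tolerate must be at least 1")
    return 2 * failures_to_tolerate + 1


print(min_hosts_for_raid1(1))  # 3, the OCVS minimum cluster size
```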

For more information about OCVS sizing & scaling please refer to the OCVS – Sizing & Scaling blog post.

What are OCI Availability Domains?

Oracle Cloud Infrastructure is hosted in regions and availability domains. A region is a localized geographic area, and an availability domain is one or more data centers located within a region. A region is composed of one or more availability domains.

What are OCI Fault Domains?

A fault domain is a grouping of hardware and infrastructure within an availability domain. Each availability domain contains three fault domains. Fault domains provide anti-affinity: they let you distribute your instances so that the instances are not on the same physical hardware within a single availability domain. A hardware failure or Compute hardware maintenance event that affects one fault domain does not affect instances in other fault domains.

For more information about Regions, Availability Domains and Fault Domains, please visit the Oracle Cloud Infrastructure Documentation.

OCVS Cluster types:

OCVS Clusters can be either Single AD Clusters or Multi AD Clusters. A Single AD Cluster resides in a single availability domain and is stretched across its 3 fault domains. A Multi AD Cluster is stretched across availability domains, where each availability domain acts as a fault domain.

OCVS Single AD Cluster:

OCVS Single AD Clusters span three OCI Fault Domains within an OCI Availability Domain and start with a minimum setup of three nodes. Each OCVS Host acts as a vSAN Fault Domain. The VMware vSphere HA Failover Capacity for a 3-Node OCVS Cluster will be 33%, which ensures that the VMs running on the OCVS Cluster remain available if a Fault Domain (vSAN Fault Domain) fails.
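The 33% figure is simply the share of cluster capacity that one of three equally sized hosts represents. The sketch below shows that arithmetic. It assumes identically sized hosts, and the function name is hypothetical rather than a vSphere API.

```python
def failover_capacity_percent(total_hosts: int, host_failures: int = 1) -> float:
    """Share of cluster capacity (in percent) to reserve for failover.

    Assumes all hosts have identical CPU and memory capacity.
    """
    if total_hosts <= 0:
        raise ValueError("total_hosts must be positive")
    if not 0 <= host_failures <= total_hosts:
        raise ValueError("host_failures must be between 0 and total_hosts")
    return 100.0 * host_failures / total_hosts


# One host out of three is roughly a third of the cluster.
print(f"{failover_capacity_percent(3):.0f}%")  # 33%
```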

OCVS Single AD Cluster Scaling:

OCVS Single AD Clusters span three OCI Fault Domains within an OCI Availability Domain and start with a minimum setup of three nodes, where each OCVS Host acts as a vSAN Fault Domain. When OCVS Hosts are added to an OCVS Cluster, each newly added host can either act as a separate vSAN Fault Domain or be added to an existing vSAN Fault Domain. The VMware vSphere HA Failover Capacity for a 6-Node OCVS Cluster will be 15%, which ensures that the VMs running on the OCVS Cluster remain available if a Fault Domain (vSAN Fault Domain) fails. This value can be adjusted based on the Failures to Tolerate and the Fault Tolerance Method of the vSAN Objects.
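The two ways of placing a new host can be pictured with a simple mapping from vSAN Fault Domain name to its member hosts. This is an illustrative model only: the host and fault domain names are made up, and the helper does not call vCenter or OCI.

```python
from typing import Dict, List, Optional


def add_host(fault_domains: Dict[str, List[str]], host: str,
             fault_domain: Optional[str] = None) -> None:
    """Place a host into a vSAN Fault Domain model.

    With no fault_domain given, the host becomes its own fault domain;
    otherwise it joins the named one (created if it does not exist yet).
    """
    if any(host in members for members in fault_domains.values()):
        raise ValueError(f"{host} is already placed")
    target = fault_domain if fault_domain is not None else host
    fault_domains.setdefault(target, []).append(host)


# Starting point: three hosts, each acting as its own fault domain.
cluster = {"esxi-1": ["esxi-1"], "esxi-2": ["esxi-2"], "esxi-3": ["esxi-3"]}

add_host(cluster, "esxi-4")            # option 1: separate fault domain
add_host(cluster, "esxi-5", "esxi-1")  # option 2: join an existing one

print(len(cluster))       # 4 fault domains
print(cluster["esxi-1"])  # ['esxi-1', 'esxi-5']
```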

OCVS Multi AD Cluster:

OCVS Multi AD Clusters span three OCI Availability Domains within an OCI Region and start with a minimum setup of three nodes. In this case, each OCI Availability Domain acts as a vSAN Fault Domain, meaning that the OCVS Hosts spread across the OCI Fault Domains within an Availability Domain together form the vSAN Fault Domain for that Availability Domain. In this scenario the VMware vSphere HA Failover Capacity will be 33%, which ensures that the VMs running on the OCVS Cluster remain available if an Availability Domain fails.

OCVS Multi AD Cluster Scale:

OCVS Multi AD Clusters span three OCI Availability Domains within an OCI Region and start with a minimum setup of three nodes. In this case, each OCI Availability Domain acts as a vSAN Fault Domain, meaning that the OCVS Hosts spread across the OCI Fault Domains within an Availability Domain together form the vSAN Fault Domain for that Availability Domain. This solution scales by adding OCVS Hosts to the respective vSAN Fault Domain per Availability Domain. Distribute the hosts evenly across Availability Domains for the best storage availability and workload balancing in case of an Availability Domain outage. In this scenario the VMware vSphere HA Failover Capacity will be 33%, which ensures that the VMs running on the OCVS Cluster remain available if an Availability Domain fails.
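To see why an even spread matters, the sketch below compares the share of hosts lost when the most populated Availability Domain fails. The host counts and AD labels are invented for illustration, and equal host sizes are assumed.

```python
from typing import Dict


def worst_case_loss_percent(hosts_per_ad: Dict[str, int]) -> float:
    """Percentage of hosts lost if the most populated AD fails."""
    total = sum(hosts_per_ad.values())
    if total == 0:
        raise ValueError("cluster has no hosts")
    return 100.0 * max(hosts_per_ad.values()) / total


def least_populated_ad(hosts_per_ad: Dict[str, int]) -> str:
    """AD that should receive the next host to keep the spread even."""
    return min(hosts_per_ad, key=hosts_per_ad.get)


even = {"AD-1": 2, "AD-2": 2, "AD-3": 2}
uneven = {"AD-1": 4, "AD-2": 1, "AD-3": 1}

print(round(worst_case_loss_percent(even)))    # 33
print(round(worst_case_loss_percent(uneven)))  # 67
print(least_populated_ad(uneven))              # AD-2
```

With an even 2/2/2 spread, an Availability Domain outage removes a third of the hosts, which matches the 33% failover capacity. With a 4/1/1 spread, the same outage could remove two thirds of the cluster.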
