What is Oracle ASM to Azure IaaS?

Kellyn Pot'Vin-Gorman explains Oracle Automatic Storage Management and why a direct comparison to Azure IaaS isn't possible.

There is a natural desire for us to make sense out of confusion, and Oracle is a topic that confuses many. One of the challenges of being known as a subject matter expert on a topic that isn't common in the Microsoft space is that people ask you to compare things that just aren't good candidates for comparison. It's rare that I can't take two or three features or products from Oracle and find a similar combination in Microsoft, but one area that isn't so straightforward is Oracle's Automatic Storage Management (ASM). As Oracle on Azure is all Infrastructure as a Service (IaaS), the conversation of ASM will undoubtedly come up in most database discussions, and odd comparisons will ensue.

Automatic Storage Management (ASM) is Oracle's own storage management solution (a volume manager and file system in one) and, for many situations, the recommended one to use with the Oracle RDBMS. Unlike LVM (the OS-level Logical Volume Manager utility), ASM bypasses the need for a second management layer between the disk volume and Oracle. ASM also provides a set of background processes (as a separate, instantiated Oracle instance) to manage and monitor the storage layer.

Figure 1: ASM Instance in Relation to Azure VM and Storage

In single-instance environments, there is commonly one ASM instance per host, even when multiple databases reside there; each database has its own disk group(s) for storing datafiles, redo logs and archive logs managed through ASM.

What is LVM?

When there is confusion, we try to bring database technologists back to LVM as the comparison point for ASM. For those newer to Linux, no comparison of LVM with ASM can be complete without some understanding of LVM features. LVM is a thin software layer that sits on top of storage and/or partitions in Linux/Unix, making it simple to maintain and manage disk, and along with other storage utilities like it, it compares quite easily to ASM in the Azure world.

LVM

  • Creates single, logical volumes from physical volumes or from an entire managed disk.
  • Allows for efficient and effective striping and mirroring of disk when dealing with numerous disk volumes (common when optimizing or working with high IO demands); see the sketch after this list.
  • Allows disks to be added with as little downtime or impact to Service Level Agreements (SLAs) as possible.
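
As a minimal sketch of that striping capability, assuming four hypothetical data disks already attached to the VM (the device names, volume names and sizes are all illustrative), a striped logical volume might be built like this:

  # Initialize the hypothetical disks for LVM and collect them into a volume group
  pvcreate /dev/sdc /dev/sdd /dev/sde /dev/sdf
  vgcreate vg_oradata /dev/sdc /dev/sdd /dev/sde /dev/sdf
  # Create a 500G logical volume striped across all four disks (64K stripe size)
  lvcreate -n lv_oradata -L 500G -i 4 -I 64 vg_oradata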

Where ASM deviates from LVM is that ASM is an Oracle management layer and can interact with the database directly. Advanced features include the ability to stripe (evenly distribute) data across sets of disks, referred to by ASM as a diskgroup. By using ASM, the database can bypass a secondary storage management layer that isn't incorporated into Oracle and offer performance comparable to raw devices. It's not that LVM can't stripe (it does so quite effectively), but that many DBAs never get past the initial creation commands when using ASM and are less likely to take advantage of the advanced features that are important when working in the cloud.

One of the top benefits, and why we might prefer ASM over LVM in Azure, is the ability to add disks to a disk group without having to allocate a new volume. Using LVM, a new volume would have to be created:
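
A minimal sketch of that work, assuming a newly attached Azure data disk has appeared as /dev/sdg (the device name, volume names and mount point are hypothetical):

  # Bring the new disk under LVM and carve a logical volume from it
  pvcreate /dev/sdg
  vgcreate vg_u03 /dev/sdg
  lvcreate -n lv_u03 -l 100%FREE vg_u03
  # Build a file system and mount it as a brand-new /u03 volume
  mkfs.xfs /dev/vg_u03/lv_u03
  mkdir -p /u03
  mount /dev/vg_u03/lv_u03 /u03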

This disk is now a NEW volume on the host, not just a partition that was enlarged to take on more data transparently to the database.

Once I set up all of the directory structure, ownership, etc. on the new volume, I can then allocate it to an Oracle database without ASM by logging into SQL*Plus and running the following command:
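
A sketch of that command, assuming the new volume is mounted as /u03 (the tablespace name, directory structure and sizes are hypothetical):

  -- Add a datafile on the new /u03 volume to an existing tablespace
  ALTER TABLESPACE users
    ADD DATAFILE '/u03/oradata/ORCL/users03.dbf'
    SIZE 10G AUTOEXTEND ON NEXT 1G MAXSIZE 32767M;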

Notice that the file path must be fully qualified with the directory structure I created. The previous volumes used for datafiles in our example were all on /u01 and /u02, so now I have to remember to allocate any new datafiles to the new volume, as well as ensure autoextend is set only on the new volume, which can be higher maintenance for the DBA and development.

Allocating More Disk in ASM

ASM uses an “alias” to identify the disk group, starting with a ‘+’ sign to let Oracle know it is dealing with ASM-managed volumes.
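
Adding a newly provisioned disk to an existing diskgroup is a single command issued from the ASM instance; in this sketch, the diskgroup name and disk path are assumptions based on common conventions:

  -- Add a new disk to the DATA diskgroup; ASM rebalances automatically
  -- (the disk path depends on your ASM disk discovery string)
  ALTER DISKGROUP DATA ADD DISK '/dev/oracleasm/disks/DATA05';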

I can use the additional disk demonstrated in the above command immediately:
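
A sketch of the datafile creation, with the tablespace name assumed:

  -- From SQL*Plus on the database: ASM resolves the +DATA alias itself
  ALTER TABLESPACE users ADD DATAFILE '+DATA' SIZE 10G AUTOEXTEND ON;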

Notice the file path uses the alias of +DATA; ASM knows the logical path to store the file and needs no further information in the command. I also added the new disk to the existing diskgroup, so no other management is required. All datafiles with autoextend can simply grow into the new disk I've added to the diskgroup in ASM.

Although all data could be stored in a single disk group, it is common for data to be dispersed across multiple diskgroups for better performance (as well as using different disks for those diskgroups). These diskgroups can have any naming convention, but most common is '+DATA' for datafiles, archive logs written to '+RECO', and redo logs in '+FRA' (the fast recovery area). Different environment demands may call for different architecture, but this is a scenario we see often in Oracle environments. It comes in handy when we are faced with limitations on IO and disk distribution, striping, etc.
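
As a sketch of how such a diskgroup comes into being (the disk paths and choice of external redundancy are assumptions, not a recommendation), the DBA would run something like this from the ASM instance:

  -- EXTERNAL REDUNDANCY defers mirroring to the underlying storage,
  -- which Azure managed disk already provides
  CREATE DISKGROUP DATA EXTERNAL REDUNDANCY
    DISK '/dev/oracleasm/disks/DATA01',
         '/dev/oracleasm/disks/DATA02';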

Not Apples to Apples

Where this gets confusing for many migrating Oracle to Azure is what ASM is and what it is not. Some think of it as a volume OPTIMIZING layer, and although there are some performance benefits to running ASM, it isn't to be confused with Azure NetApp Files (ANF), which is a storage solution in Azure.

ASM can't address limitations in per-VM IO or network bandwidth the way ANF can, and where ASM can use ANF just as easily as it can use standard or premium managed disk in Azure, ANF can't use ASM. ANF is completely unaware of the ASM layer, as ASM is part of Oracle.

The VM vs. Disk Upper Limits Quandary

Understanding limitations per VM is an important aspect of an architect's job in Azure. As powerful as VMs are, limits are set to ensure that no one becomes the “noisy neighbor” and overwhelms the cloud. However small the possibility, there is a very crucial reason these limits are in place. These limits are a complex combination of storage type, network and VM, including the VM series.

Let's create an example architecture for an Oracle database with the following requirements:

  vCPU   Memory   Database Size   MB/s IO    IO Throughput
  8      24G      6 TiB           325 MB/s   2,000 Reqs/s

Just using the Azure calculator, it may seem appropriate to migrate the database to one of the following VMs:

  VM Instance   vCPU   Memory   Network MiB/s   IOPS/Read MBps/Write MBps   Max Number of Disks/Type
  DS4 v2        8      28G      500             24,000/375/187              32 standard
  D8s v4        8      32G      500             12,800/192/123              16 premium
  E8ds v4       8      64G      1,200           77,000/485(200) cached      16 premium
  M8ms          8      219G     2,000           10,000/100(793) cached      8 premium

Now let's look at disk. For this example, I won't go into ANF storage solutions and will just stick to a standard and a premium set of storage examples:

  Detail           Ultra Disk              Premium SSD           Standard SSD          Standard HDD
  Disk type        SSD                     SSD                   SSD                   HDD
  Scenario         IO-intensive            Production and        Web servers,          Backup, non-critical,
                   workloads such as       performance-          lightly used          infrequent access
                   SAP HANA, top-tier      sensitive workloads   enterprise
                   databases (for                                applications,
                   example, SQL,                                 and dev/test
                   Oracle), and other
                   transaction-heavy
                   workloads
  Max disk size    65,536 gibibytes        32,767 GiB            32,767 GiB            32,767 GiB
                   (GiB)
  Max throughput   2,000 MB/s              900 MB/s              750 MB/s              500 MB/s
  Max IOPS         160,000                 20,000                6,000                 2,000

At first glance, you may think you're just paying a lot more for memory and would simply pick what fits the requirements closest, then match it up with the disk of your choice. Upon inspection of the specific limits between VM and disk, however, you begin to realize that even if you choose a disk with immense throughput, there are limits per VM that will throttle the storage you've chosen. It's important to understand the combinations that go into a successful architecture design to meet the needs of the workload, and no amount of ASM is going to get you around this mistake.

The best combination from what we've displayed here is the following:

  VM Instance   vCPU/Memory   Managed Disk   IOPS/MBps   Disk # and Size
  E8ds v4       8 vCPU/64G    P40 cached     7,500/250   2 * 4095G

This combination meets the requirements for the vCPU and memory, along with the IO demands. By pairing it with the P40 premium disk, I'm able to support the premium disk with caching (always look for VM instances with an ‘s’ in the instance name, as in E8ds v4 in this example). I chose the P40 disk because read-only caching is only supported up to 4095G, and I'd like to use as much of the disk I've allocated as possible. If you attempt to allocate over 4095G, read-only caching will be disabled by default, and this defeats the higher IO we're in search of. We must still keep in mind that we may run into some throttling on the MiB/s of network bandwidth, but this is where we get into advanced discussions about what causes heavy IO across the network, such as backups and data refreshes.
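
As a sketch of that disk configuration with the Azure CLI (the resource group, VM and disk names are hypothetical), attaching a premium data disk just under the caching limit with read-only host caching might look like:

  # Attach a new premium data disk at 4095 GiB with read-only host caching
  az vm disk attach \
    --resource-group oracle-rg \
    --vm-name ora-e8dsv4 \
    --name ora-data-01 \
    --new --size-gb 4095 \
    --sku Premium_LRS \
    --caching ReadOnly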

The Importance of the ASM Discussion

It is essential to separate what ASM (or LVM) does vs. what the specifications, features and limitations are for different areas of Azure infrastructure, such as premium SSD or Ultra Disk in Azure managed disk, and not just VM series, but individual versions of VMs.

What ASM can do is offer a CLI, ASMCMD, to perform common Linux-level tasks via an Oracle tool designed for the DBA to manage and maintain their Oracle storage. Files created via ASM commands should only be deleted with ASM using ASMCMD. The CLI can be used to create new aliases, inspect open files, and extract files from an ASM diskgroup. Once you're using ASM, very few direct Linux OS commands work to inspect or manage the files, which is the important reason for the CLI in ASM.
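
A few representative ASMCMD commands (the diskgroup contents and file names shown are hypothetical):

  # Run as the Grid Infrastructure owner with the ASM environment set
  asmcmd lsdg                                # list diskgroups and their free space
  asmcmd ls +DATA/ORCL/DATAFILE              # browse files inside a diskgroup
  asmcmd cp +DATA/ORCL/DATAFILE/users.259.1054321987 /tmp/users.dbf   # extract a file
  asmcmd rm +DATA/ORCL/mydir/old_alias       # remove an ASM file or alias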

One of the most common operations to run in ASM, outside of datafile creation or resizing, is rebalancing. As files are added or removed via ASMCMD, a rebalance operation will automatically be issued, as the goal is to provide as even a distribution of space usage and file extents as possible across the disks in the diskgroup. The rebalance (RBAL) background process can be identified from the OS level using the following command (the output shown is illustrative):
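
  $ ps -ef | grep rbal
  grid      4321     1  0 10:02 ?        00:00:01 asm_rbal_+ASM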

Note that the process comes from the ASM instance (asm_rbal_+ASM) and not the Oracle database.

Rebalancing is an IO-intensive process, so keep this in mind as you choose the power setting for the rebalance operation. It can be set as high as 11; setting it to 0 turns rebalancing off, and 4 is common, but your mileage may vary. With Azure, everything is about ensuring you don't tip the IO scales into throttling, so take care with the setting for this type of IO-intensive process.
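
For example, a manual rebalance at a moderate power setting can be issued from the ASM instance (the diskgroup name is assumed):

  -- Rebalance the DATA diskgroup at power 4
  ALTER DISKGROUP DATA REBALANCE POWER 4;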

The completion status can be verified in the alert log, or the status of a running rebalance can be queried with the following (run against the ASM instance):
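
  -- No rows returned means no rebalance is currently running
  SELECT group_number, operation, state, power, sofar, est_work, est_minutes
    FROM v$asm_operation;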

You can also see the completion of the rebalance in the alert log, which will display whether the disk rebalance completed, updated successfully, etc., depending on the specific command that was run via ASMCMD.

Conclusion

As complex as this discussion is, and with the myriad confusions around what ASM is and isn't, hopefully this article helped to explain some of the value of ASM, along with the reason we can't compare it to storage performance in Azure. The topic is simply too vast and covers too many layers to boil it down to “to do ASM or not to do ASM”.

If you'd like to know more about Oracle Automatic Storage Management (ASM), you can read up on it from Oracle here: https://docs.oracle.com/en/database/oracle/oracle-database/19/ostmg/index.html

If you’d like to understand more about Azure Infrastructure features and limitations when architecting an environment, I’d start here: https://docs.microsoft.com/en-us/azure/virtual-machines/premium-storage-performance