{"id":106769,"date":"2025-06-20T19:08:53","date_gmt":"2025-06-20T19:08:53","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=106769"},"modified":"2025-05-15T19:13:27","modified_gmt":"2025-05-15T19:13:27","slug":"oracle-asm-monitoring-and-managing-part-2","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/oracle-databases\/oracle-asm-monitoring-and-managing-part-2\/","title":{"rendered":"Oracle ASM: Monitoring and Managing (Part 2)"},"content":{"rendered":"\n<p>In the <a href=\"https:\/\/www.red-gate.com\/simple-talk\/uncategorized\/oracle-asm-a-simple-solution-or-another-complexity-part-1\/\">first part of this series<\/a>, we explored the fundamentals of <strong>Oracle Automatic Storage Management (ASM)<\/strong>\u2014a powerful volume manager and file system integrated with Oracle Database. ASM simplifies storage by abstracting physical disks into disk groups and automating tasks like striping, mirroring, and rebalancing. This automation helps ensure high availability, scalability, and optimal performance for Oracle workloads.<\/p>\n\n\n\n<p>But once ASM is up and running, how do you ensure it stays healthy and efficient? That\u2019s where <strong>monitoring and proactive maintenance<\/strong> come into play.<\/p>\n\n\n\n<p>This article outlines the <strong>key areas to monitor in Oracle ASM<\/strong>, why they matter, and what SQL or tools to use for visibility and alerts.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-disk-group-usage\">Disk Group Usage<\/h2>\n\n\n\n<p><strong>Why it matters<\/strong>: Disk groups are the heart of ASM. If they fill up, database writes can fail, and performance may degrade. 
Although running out of space is rare with today\u2019s larger, less constrained storage, it\u2019s still important to monitor the free space of every disk group.<\/p>\n\n\n\n<p><strong>What to monitor<\/strong>:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"block\" highlight=\"false\" decode=\"true\">SQL&gt; SELECT name, state, type, total_mb, free_mb, usable_file_mb\nFROM v$asm_diskgroup;<\/pre><\/div>\n\n\n\n<p><strong>Recommendations<\/strong>:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li> Set alerts when <code>usable_file_mb<\/code> drops below <strong>15\u201320%<\/strong> of the total disk group size. In larger storage environments, it often makes sense to set thresholds in absolute megabytes free rather than as a percentage free. <\/li>\n\n\n\n<li> ASM may report enough <em>free<\/em> space, but because it accounts for mirroring and the space reserved to restore redundancy after a disk failure, <code>usable_file_mb<\/code> is a more accurate indicator of true usable capacity. <\/li>\n<\/ul>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-disk-health-and-status\">Disk Health and Status<\/h2>\n\n\n\n<p><strong>Why it matters<\/strong>: A single failed or degraded disk can compromise redundancy, performance, and rebalance operations. 
Having a clear understanding of the status and availability of every disk in a disk group is essential.<\/p>\n\n\n\n<p><strong>What to check<\/strong>:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"block\" highlight=\"false\" decode=\"true\">SQL&gt; SELECT name, path, mount_status, header_status, mode_status, state\nFROM v$asm_disk;<\/pre><\/div>\n\n\n\n<p><strong>Red flags<\/strong>:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li><code>MOUNT_STATUS = CLOSED<\/code> or <code>MISSING<\/code> <\/li>\n\n\n\n<li><code>HEADER_STATUS = FORMER<\/code>, <code>UNKNOWN<\/code>, or <code>CANDIDATE<\/code> (unexpectedly) <\/li>\n\n\n\n<li><code>MODE_STATUS = OFFLINE<\/code> or <code>STATE = UNKNOWN<\/code> <\/li>\n<\/ul>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-rebalance-operations\">Rebalance Operations<\/h2>\n\n\n\n<p><strong>Why it matters<\/strong>: ASM automatically rebalances data when disks are added or removed. Frequent or slow rebalances may indicate configuration problems or I\/O bottlenecks. The estimated minutes remaining and the amount of work completed so far give the DBA a clear picture of a rebalance operation\u2019s status.<\/p>\n\n\n\n<p><strong>What to monitor<\/strong>:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"block\" highlight=\"false\" decode=\"true\">SQL&gt; SELECT operation, state, power, actual, sofar, est_work, est_minutes\nFROM v$asm_operation;<\/pre><\/div>\n\n\n\n<p><strong>Tips<\/strong>:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li> Track <strong>how often rebalances occur<\/strong>. <\/li>\n\n\n\n<li> Watch for operations that remain in an active or long-running state. <\/li>\n\n\n\n<li> Use the <code>POWER<\/code> level wisely. Releases after 11g allow higher values, but a higher value completes the rebalance faster only by consuming more I\/O and CPU. 
<\/li>\n<\/ul>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-redundancy-and-mirroring-levels\">Redundancy and Mirroring Levels<\/h2>\n\n\n\n<p><strong>Why it matters<\/strong>: ASM supports different redundancy levels (<code>EXTERNAL<\/code>, <code>NORMAL<\/code>, <code>HIGH<\/code>). Choosing the wrong one for your workload can put data availability at risk. Mission-critical workloads call for higher redundancy levels, whereas development environments can use lower ones.<\/p>\n\n\n\n<p><strong>What to check<\/strong>:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"block\" highlight=\"false\" decode=\"true\">SQL&gt; SELECT name, type, compatibility, database_compatibility\nFROM v$asm_diskgroup;<\/pre><\/div>\n\n\n\n<p><strong>Best practices<\/strong>:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li> Match redundancy levels to workload importance: <\/li>\n\n\n\n<li><strong>HIGH<\/strong>: Mission-critical production systems <\/li>\n\n\n\n<li><strong>NORMAL<\/strong>: General production or staging <\/li>\n\n\n\n<li><strong>EXTERNAL<\/strong>: When using hardware RAID <\/li>\n\n\n\n<li> Automate checks to verify that templates align with the expected configuration. 
<\/li>\n<\/ul>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-file-access-and-i-o-statistics\">File Access and I\/O Statistics<\/h2>\n\n\n\n<p><strong>Why it matters<\/strong>: ASM handles all Oracle files\u2014datafiles, redo logs, control files\u2014and poor I\/O can severely impact database performance.<\/p>\n\n\n\n<p><strong>Files Overview<\/strong>:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"block\" highlight=\"false\" decode=\"true\">SQL&gt; SELECT file_number, type, incarnation, blocks, block_size\nFROM v$asm_file;<\/pre><\/div>\n\n\n\n<p><strong>I\/O Stats<\/strong>:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"block\" highlight=\"false\" decode=\"true\">SQL&gt; SELECT name, reads, writes, read_errs, write_errs\nFROM v$asm_disk;<\/pre><\/div>\n\n\n\n<p><strong>Insights<\/strong>:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li> Join these views with V$ performance views for comprehensive performance analysis. <\/li>\n\n\n\n<li> Track error counts and abnormal read\/write ratios. <\/li>\n<\/ul>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-asm-alert-logs-and-trace-files\">ASM Alert Logs and Trace Files<\/h2>\n\n\n\n<p><strong>Why it matters<\/strong>: Not all issues appear in SQL views. ASM writes key warnings and errors to its alert logs. As any DBA knows, the alert log is the place to go for detailed information on issues and errors. 
For ASM, the ADR Command Interpreter (ADRCI) is a handy tool that provides filtered views of errors.<\/p>\n\n\n\n<p><strong>How to check logs<\/strong>:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"block\" highlight=\"false\" decode=\"true\">adrci&gt; show alert -p \"message_text like '%error%'\"<\/pre><\/div>\n\n\n\n<p><strong>Look for<\/strong>:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li> ORA- errors <\/li>\n\n\n\n<li> Rebalance messages <\/li>\n\n\n\n<li> I\/O errors <\/li>\n\n\n\n<li> Disk offline or failure events <\/li>\n<\/ul>\n<\/div>\n\n\n<p><strong>Tip<\/strong>: Set up log monitoring tools or scripts to scan for critical messages in real time.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-asm-and-cluster-instance-availability-rac-environments\">ASM and Cluster Instance Availability (RAC Environments)<\/h2>\n\n\n\n<p><strong>Why it matters<\/strong>: In Oracle RAC environments, each node runs an ASM instance. ASM unavailability on any node can affect database operations. Because RAC uses shared storage, it\u2019s essential to monitor the ASM instance on each RAC node. As with the database instances, the ASM instance on each node must have a unique name. It\u2019s common to number the ASM instances (e.g., +ASM1, +ASM2\u2026).<\/p>\n\n\n\n<p><strong>What to monitor<\/strong>:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"block\" highlight=\"false\" decode=\"true\">crsctl stat res -t<\/pre><\/div>\n\n\n\n<p><strong>Ensure<\/strong>:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li> All ASM instances are online and registered with Oracle Clusterware. <\/li>\n\n\n\n<li> No unexpected restarts or failovers have occurred. 
<\/li>\n<\/ul>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-compatibility-and-feature-settings\">Compatibility and Feature Settings<\/h2>\n\n\n\n<p><strong>Why it matters<\/strong>: ASM compatibility levels control access to advanced features like ASM Cluster File System (ACFS), Flex ASM, and more.<\/p>\n\n\n\n<p><strong>Check settings<\/strong>:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"block\" highlight=\"false\" decode=\"true\">SQL&gt; SELECT name, compatibility, database_compatibility\nFROM v$asm_diskgroup;<\/pre><\/div>\n\n\n\n<p><strong>Validate<\/strong>:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li> Compatibility levels align with Oracle version and features in use. <\/li>\n\n\n\n<li> Features like <strong>ADVM<\/strong> and <strong>Flex ASM<\/strong> are configured only where supported. <\/li>\n<\/ul>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-bonus-tips-for-proactive-monitoring\">Bonus Tips for Proactive Monitoring<\/h2>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li><strong>Set up custom scripts<\/strong> to check space usage and disk health daily. <\/li>\n\n\n\n<li><strong>Leverage Oracle ASMCMD<\/strong> for file-system-like navigation and troubleshooting. <\/li>\n\n\n\n<li><strong>Monitor ASM patch levels<\/strong> to ensure they are in line with Oracle\u2019s recommendations. <\/li>\n\n\n\n<li><strong>Track fragmentation<\/strong> and rebalance trends to catch early signs of performance issues. <\/li>\n<\/ul>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-conclusion\">Conclusion<\/h2>\n\n\n\n<p>Oracle ASM automates many storage management tasks, but like any critical infrastructure, it requires careful monitoring to ensure reliability and performance. 
By keeping a close eye on disk group usage, disk health, I\/O metrics, and rebalance activity, and by setting up proactive alerts, you can prevent issues before they impact your database.<\/p>\n\n\n\n<p>Oracle supports many architectures. Whether you run a single instance, multitenant, or Oracle RAC, these best practices will help you unlock the full power of ASM while maintaining the high availability and performance your Oracle environment requires.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the first part of this series, we explored the fundamentals of Oracle Automatic Storage Management (ASM)\u2014a powerful volume manager and file system integrated with Oracle Database. ASM simplifies storage by abstracting physical disks into disk groups and automating tasks like striping, mirroring, and rebalancing. This automation helps ensure high availability, scalability, and optimal performance&#8230;&hellip;<\/p>\n","protected":false},"author":316206,"featured_media":106771,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[53,143533],"tags":[4459,159318],"coauthors":[48576],"class_list":["post-106769","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-featured","category-oracle-databases","tag-oracle","tag-oracle-asm"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/106769","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/316206"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=106769"}],"version-history":[{"count":1,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/106769\/revisions"}],"predecessor-version":[{"id":106770,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/106769\/revisions\/106770"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media\/106771"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=106769"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=106769"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=106769"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=106769"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}