Infinite I/O

The awesomeness that happened when we integrated with SolarWinds VMAN

Posted by Dan Perkins on Nov 14, 2016

Today, I'll outline how you can use SolarWinds Integrated Virtual Infrastructure Monitor (IVIM), SolarWinds Virtualization Manager (VMAN), Infinio Accelerator, and a little scripting to build an automated latency remediation process. To fit all of this into a single post, I'll focus on the required details and leave out fine-tuning and exception handling. I'll also point you to a bare-bones script that you can adapt for your purposes.

To start, let me answer the question “why did we do this?” As we’ve said before, Infinio Accelerator is the easiest and quickest route to stellar VM storage performance. Likewise, SolarWinds IVIM and SolarWinds VMAN provide best-of-breed virtualization monitoring. Pairing Infinio with SolarWinds gives you the means both to detect latency issues and to address them directly. In a lot of ways it is a no-brainer. The world is transitioning to the software-defined data center (SDDC), and we plan to play our part.

The process in this post is based on VMware vSphere PowerCLI and uses a simple script to set the storage policy of every VMDK associated with a given VM. In IVIM, we create a custom alert that calls the script and passes it the VM's UUID so that the script can target that specific virtual machine.

 

Prerequisites 

First up, install VMware vSphere PowerCLI on the SolarWinds IVIM server.  Jump over here for the latest version as of this writing. Provided that has been installed and PowerShell is working as expected, you should be good to go. You will also need the credentials of a vSphere user that can act as the script runner.
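Before moving on, it can help to confirm that PowerShell on the IVIM server can actually see PowerCLI. The check below is a minimal sketch: the helper name `Test-PowerCliAvailable` is hypothetical, and it assumes the core PowerCLI component is named `VMware.VimAutomation.Core`, which newer releases ship as a module and older releases register as a snap-in.

```powershell
# Hypothetical helper (not part of the original script): returns $true when
# PowerCLI appears to be installed, either as a module or as a legacy snap-in.
function Test-PowerCliAvailable {
    # Newer PowerCLI releases ship as PowerShell modules.
    $modules = Get-Module -ListAvailable -Name 'VMware.VimAutomation.Core'
    if ($modules) { return $true }

    # Older releases register snap-ins. Get-PSSnapin only exists in Windows
    # PowerShell, so check for the cmdlet before calling it.
    if (Get-Command -Name Get-PSSnapin -ErrorAction SilentlyContinue) {
        $snapins = Get-PSSnapin -Registered -Name 'VMware.VimAutomation.Core' -ErrorAction SilentlyContinue
        if ($snapins) { return $true }
    }

    return $false
}

if (Test-PowerCliAvailable) {
    Write-Output 'PowerCLI found.'
} else {
    Write-Output 'PowerCLI not found; install it before configuring the alert.'
}
```

Run this as the same Windows account that SolarWinds uses to execute alert actions, since module paths can differ between accounts.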

Next, place the attached PowerShell script onto the IVIM server. I chose E:\Scripts, with drive E: being my data drive.  Regardless of where you place this file, note down the directory for later.

 

Alert Configuration 

The high latency alert in SolarWinds IVIM is what drives this workflow. We chose this style of alert, but you can trigger on high IOPS or another alert type. To configure the alert, log in to the IVIM web UI and navigate to the Manage Alerts page (http://<URL>/Orion/Alerts/Default.aspx). If, like us, you are configuring the alert based on high latency, you can either modify the existing “VM Disk Latency” alert or create a new custom alert. We chose to create a new alert named “Fix High Disk Latency - Infinio Accelerator”, which is a copy of the built-in VM Disk Latency alert. The following section shows how our alert differs from the built-in alert.

We break it out by alert configuration section.  Refer to the images where provided for additional context and clarity. 

  2. Trigger Condition (see Image 1)
  • I want to alert on: Virtual Machine
  • The actual trigger condition:
    • Trigger alert when: All conditions must be satisfied (AND)
      • VM power state is equal to powered on
      • VM latency read is greater than or equal to 4
Image 1: Trigger Conditions

 

  5. Trigger Action
  • Message displayed when this alert is triggered: High VM Disk Latency detected, Infinio Accelerator to be applied.
  • Trigger Actions (see Image 2)
    • Log the Alert to the NetPerfMon Event Log settings
      • Read or write latency on virtual machine ${DisplayName} is higher than threshold.
    • Execute an external program settings (see Image 3)
      • Name of action
        • Set Accelerator storage policy
      • Network path to external action
        • powershell -command "&E:\Scripts\Accelerate.ps1 -UUID ${N=SwisEntity;M=UUID}"
          • “E:\Scripts” should be modified to match your environment.
    • Log the Alert to the NetPerfMon Event Log settings
      • Infinio Accelerator storage policy applied to VM ${DisplayName}.
 
Image 2: Trigger Actions

 

 


Image 3: Execute an External Program

 

6. Reset Actions

  • Log the Alert to the NetPerfMon Event Log settings
    • Read and write latency on virtual machine ${DisplayName} are back below the threshold.

That should just about do it.  To summarize, here is a screenshot of our full configuration (see Image 4).

 


Image 4: Alert Configuration Summary

 

Alert Execution

As you have already seen, the alert calls an external program that resides on the IVIM server. This program is a simple PowerShell script that uses the VMware PowerCLI libraries to perform the following:

  1. Read a VM UUID as a command line argument
  2. Connect to the vCenter server using the script runner’s credentials
  3. Get and save references to the required objects, including:
    1. VM, from the provided UUID
    2. VMDKs, from the VM object
  4. Apply the Infinio Accelerator Storage Policy to each VMDK
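The four steps above can be sketched roughly as follows. This is not the published script, just an illustration under stated assumptions: the function name `Set-AcceleratorPolicy` and its parameters are hypothetical, the policy name must be whatever your Accelerator install created, and the lookup assumes the alert passes the VM's BIOS UUID (`Config.Uuid`).

```powershell
# Sketch only. Function name, parameters, and the UUID property are assumptions.
function Set-AcceleratorPolicy {
    param(
        # Step 1: the VM UUID handed over by the alert.
        [Parameter(Mandatory = $true)][string]$UUID,
        # vCenter address and script runner credentials.
        [Parameter(Mandatory = $true)][string]$Server,
        [Parameter(Mandatory = $true)][System.Management.Automation.PSCredential]$Credential,
        # Name of the storage policy that includes the Accelerator IO Filter.
        [Parameter(Mandatory = $true)][string]$PolicyName
    )

    # Step 2: connect to vCenter as the script runner.
    $connection = Connect-VIServer -Server $Server -Credential $Credential -ErrorAction Stop
    try {
        # Step 3.1: find the VM. Assumes the BIOS UUID; if your alert passes the
        # instance UUID, compare against Config.InstanceUuid instead.
        $vm = Get-VM -Server $connection |
            Where-Object { $_.ExtensionData.Config.Uuid -eq $UUID }
        if (-not $vm) { throw "No VM found with UUID $UUID" }

        # Step 3.2: collect the VM's virtual disks (VMDKs).
        $disks = Get-HardDisk -VM $vm

        # Step 4: apply the storage policy to each disk.
        $policy = Get-SpbmStoragePolicy -Name $PolicyName -Server $connection -ErrorAction Stop
        Get-SpbmEntityConfiguration -HardDisk $disks |
            Set-SpbmEntityConfiguration -StoragePolicy $policy
    }
    finally {
        # Always release the vCenter session, even if a step above failed.
        Disconnect-VIServer -Server $connection -Confirm:$false
    }
}
```

Saved as a script, the file would begin with its own `param([string]$UUID)` block and end with a call to the function, so that the `-UUID` argument from the alert action reaches it. Filtering every VM through `Where-Object` is simple but slow in large inventories; a server-side filter would be a reasonable refinement.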

Make sure to update the vCenter server address and the script runner credentials as indicated in the script. As a best practice, you should never hard-code credentials; we did so here only for the sake of brevity and to ensure that we shared a fully working script.
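If you would rather not keep the password in the script, one common alternative is to store the credential with PowerShell's built-in `Export-Clixml` and read it back with `Import-Clixml`. The helper names and the file path below are hypothetical, and this is a sketch rather than the approach the published script takes.

```powershell
# Hypothetical helpers, not part of the original script.
function Save-ScriptRunnerCredential {
    param(
        [Parameter(Mandatory = $true)][System.Management.Automation.PSCredential]$Credential,
        [Parameter(Mandatory = $true)][string]$Path
    )
    # Export-Clixml serializes the credential to XML. On Windows the password
    # is encrypted for the current user on the current machine.
    $Credential | Export-Clixml -Path $Path
}

function Get-ScriptRunnerCredential {
    param([Parameter(Mandatory = $true)][string]$Path)
    # Returns a PSCredential that can be passed to Connect-VIServer -Credential.
    Import-Clixml -Path $Path
}

# One-time setup (interactive), with an assumed path:
#   Save-ScriptRunnerCredential -Credential (Get-Credential) -Path 'E:\Scripts\vcenter-cred.xml'
# Inside the alert script:
#   $cred = Get-ScriptRunnerCredential -Path 'E:\Scripts\vcenter-cred.xml'
```

Because the encryption is tied to the Windows account and machine, create the file while logged in as the account that runs the alert action on the IVIM server; a file created by a different user will not decrypt.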

In this script we use the default Infinio Accelerator Storage Policy that is created when you install Accelerator, but you can choose any Storage Policy as long as you add the Accelerator IO Filter to it. The details are outside the scope of this post, but you can read more about how this allows you to customize storage tiers and manage additional policy-based storage options.

You can find the full sample script on our GitHub page.

 

The Full Stack 

Putting all the pieces together, here is how it plays out.  

  1. If the performance of a monitored VM degrades to the point where it meets the trigger condition, the alert fires, resulting in both logging to the IVIM console and execution of the external script. When the script is called, the alert passes the VM UUID as a command line argument, which the script interprets and processes. The script connects to vCenter and uses the UUID to retrieve a reference to the VM that caused the alert. The script then enumerates the VM’s VMDKs and applies the Infinio Accelerator Storage Policy to each of them.
  2. Once the above happens, the VM’s IO warms the cache to the point where the cache (1) reduces average latency and (2) offloads IO from the storage array and storage network. The combination of (1) and (2) often results in an increase in the amount of IO that can be completed in a given time period (see Image 5).
  3. Finally, the alert clears and is reset.

 

Here is a graph from the Infinio Accelerator Management Console which depicts step #2.

 

Image 5: Latency reduction within Infinio's UI 
 
A: Caching is enabled automatically via application of the storage policy and IO Filter.
B: Cache starts to warm, resulting in increased overall IO and a reduction in latency.
C: Cache is fully warm, resulting in no IO to the backend array, a substantial increase in IO from the cache, and a significant reduction in latency.

 

That's it!  With those steps you now have automatic remediation of storage latency.  


 
