On IOPS, shared storage and a fresh idea (Part 3): tying it all together in the stack

Note: This is part three; if you haven't already, have a read of part one and part two first.

Hello there, and thank you for dropping back for part 3…

I suppose I should start with the disappointing news that I have yet to test this option for VDI-in-a-Box. And despite Aaron Parker's suggestions, it wasn't due to lack of inspiration, it was down to lack of time! This series has gathered a lot of interest from the community and storage vendors alike, and I feel I should set the record straight before I go any further:

  1. This isn't a production idea; you would be crazy to use it in a live environment.
  2. Throughout this entire project, we’re focusing on pooled stateless. Stateful desktops would be a separate post entirely.
  3. This wasn’t an attack on products in this market space, merely a fresh view on an old problem.
  4. If I had the skills or funds necessary to take this project to a production solution, I wouldn't have posted it. I would already be hard at work creating a reasonably priced product!

Now that my declarations are out of the way, I’d first like to talk about the moral of the story. This isn’t an unfamiliar expression:

IOPS mitigation is not about read IOPS, it's about WRITE IOPS!

VMware, Citrix and Microsoft have similar but distinct solutions for read IOPS negation. Similar in the sense that they all try to negate storage read IOPS. The key difference is that XenServer's local disk cache, IntelliCache, has the out-of-box functionality to cache the majority of reads to local disk (think SSD) without the baked-in 512 MB soft limit of the Microsoft Hyper-V and VMware equivalents.

Long story short, VMware's and Microsoft's solutions give you roughly 512 MB of recognisable read IOPS negation un-tuned, but enabled. Of course this value can be tuned upwards, but the low default cache size would suggest, at least to me, that tuning up will have an upsetting effect on the host.

This, to me, is why IntelliCache has the upper hand in the (value-add product) VDI space for read IOPS negation, and Citrix even throw in the hypervisor as part of your XenDesktop licensing, so win win. But what about those pesky write IOPS?

Let’s look at Citrix for a moment!

If we review part one and part two for a moment: the half-baked idea formed by Barry Schiffer, Ingmar Verheij, Kees Baggerman, Remko Weijnen and I, when combined with Citrix IntelliCache, turned out to be a phenomenal combination for an IOPS avoidance product. But the key point we had in the back of our heads was: Citrix Provisioning Server already has this driver!

Citrix Provisioning Server has a RAM cache driver, with a configurable size, baked into the Provisioning Services product. This driver works in a very similar way to the EWF driver, just without the API and flexibility of the Microsoft driver.

There are Citrix customers out there using RAM caching with XenApp. Those I spoke to have assigned a large cache (8 GB+, with 3-4 GB utilised day to day) to mitigate the chance of a spillover; it could still happen, but that is an acceptable risk to their implementation. I struggled to find a XenDesktop customer using this method who would talk about their implementation.

But with XenDesktop in mind, using an overly large cache size to avoid spillover just doesn't scale the way a consolidated Server Based Computing model would... which leaves us back in the same dilemma: I'd like to cache in RAM, but I don't want to risk users losing data when they perform a task we haven't factored for.

So with all this in mind, let's talk about the final hurdle I faced with this project.

RAM cache flood, and my half-baked solution:

The Microsoft EWF driver suffered the same problem as the Citrix Provisioning Server driver when the cache filled: it froze, or outright bug-checked, if you tried to reference a file that had been placed into the cache before the cache filled with other information.

Jeff Silverman has written a great article on what happens in this event with the Citrix Provisioning Services RAM cache, and I urge you to go read it so I don't have to repeat it! Go now, I'll wait right here for you to get back!

Read that article? Ok good, let’s continue.

To combat this scenario with the Windows EWF driver, I created a very rough-around-the-edges Windows service which used polling (sorry, Remko) to check the size of the cache periodically via the EWF API and write the cache out to disk.

The function in the EWF API called EwfCommitAndDisableLive allowed me to dump the RAM cache to disk and then subvert the cache entirely: once this event has occurred, the RAM cache simply disables itself and writes pass straight through to disk.

With this in mind, I tuned my service to spill to disk when the cache grew beyond 80% of the RAM left available (i.e. not in use by the operating system) at boot. This worked well up to a point, but the key failure of this approach became apparent when you opened a large number of applications and they struggled to occupy the space left around the cache.

My second attempt proved much more fruitful: I monitored the free system bytes in memory, and if this dropped below a certain threshold the EWF driver began its dump routine to disk. Once the dump routine was complete, the RAM was cleared and the writes had been committed to storage, with the disk used directly for the remainder of the session. Once this spillover had occurred, a notification bubble fired in the user's session warning them of the degradation of service, and that they should save their work and log off at the next convenient moment...
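To make the shape of that watchdog concrete, here's a minimal sketch of the polling logic in Python. The EWF call is a hypothetical stand-in (the real service called the EWF API from a Windows service); only the spill decision is modelled:

```python
class FakeEwfVolume:
    """Stand-in for an EWF-protected volume; models only the one
    behaviour we care about: a live RAM overlay that can be
    committed to disk and disabled, a la EwfCommitAndDisableLive."""

    def __init__(self):
        self.live = True  # overlay active, writes land in RAM

    def commit_and_disable_live(self):
        # Flush the RAM overlay to disk; writes pass straight
        # through to disk from this point on.
        self.live = False


def poll_once(volume, free_bytes, threshold_bytes, notify):
    """One watchdog iteration: if free system memory drops below the
    threshold, dump the cache and warn the user. Returns True only
    if a spill was triggered on this cycle."""
    if volume.live and free_bytes < threshold_bytes:
        volume.commit_and_disable_live()
        notify("RAM cache spilled to disk - please save your work "
               "and log off at the next convenient moment.")
        return True
    return False
```

The real service polled on a timer and read the free-byte counter from the operating system; here those inputs are simply parameters so the decision logic stands alone.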

Voila! No blue screen, spill to disk, and user notification of degradation.

This wasn't foolproof, it was very rough, and it didn't handle files being written that were larger than the RAM cache, but in my opinion it satisfied the biggest fear and business case against using the Citrix Provisioning Server RAM caching: the RAM cache flood scenario. I was happy with the results and it scaled well to the size of my lab; it showed what is possible with a RAM filter driver, and it allowed me to prove my point before I started to poke the hornet's nest of the storage market. So let's park the EWF story for now and focus on my favorite subject, Citrix technologies.

Note: I won't be making this service and driver solution publicly available; it's not a production-ready solution and Microsoft would sue the pants off me. I'm sure you'll understand why, but if you want more information drop me an email.

The next parts of this blog are all theoretical, but I know certain people I want to listen, are listening (Hi Kenneth, hope you are well :)).

Negating the negated: let's talk about that spillover.

So what about that spillover event? By spilling from RAM to disk, we change a steady, fairly predictable stream of IOPS from each desktop into a situation where you might well ask: "couldn't a collection of spillovers at the same time cause your storage to become swamped with IOPS?"

Absolutely, and I’ve been thinking about this quite a bit… But there’s another hidden potential here even in a RAM spill over scenario…

With a little bit of trickery couldn’t we also negate this spillover event with a simple provisioning job?

With a bit of time spent thinking about this scenario, this is what I came up with…

Why doesn't the controller for XenApp or XenDesktop (PVS / DDC) copy, or (my personal preference) create, a local blank differencing disk when a VM boots?

The hypervisor could be completely agnostic at this point, and we could spill over to local disk, keeping write IOPS completely away from the shared storage, even in a spillover event.
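As a sketch of that boot-time job (all names here are hypothetical; no such controller task exists today), the controller would simply place a blank child disk on the host's local storage before the VM starts:

```python
import os


def attach_local_diff_disk(vm_name, local_store):
    """Hypothetical boot-time step for the PVS / DDC controller:
    create a fresh, blank differencing disk on the hypervisor's
    local storage, so every write - including a RAM-cache spill -
    lands locally, never on the shared storage that holds the
    read-only base image."""
    diff_path = os.path.join(local_store, vm_name + "-diff.vhd")
    # A real controller would call the hypervisor's virtual disk
    # API here; an empty file stands in for the blank child disk.
    with open(diff_path, "wb"):
        pass
    return diff_path
```

Because the disk is blank and disposable, a pooled stateless desktop can be torn down and recreated at every boot with no copy traffic to the shared array at all.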

This approach (in this half-baked enthusiast's view) would negate the negated... but don't take my word for it, let's go through some theoretical solutions.

Solution A: implementing a spillover routine with the Citrix Provisioning Server driver.

So, with my success with the Microsoft driver, I got to thinking: how easy would this be to do utilizing the Citrix Provisioning Services driver? Without access to the code I'm going to be making slightly risky statements here, but I have a lot of faith that Citrix could make this happen.

Everyone I have spoken to about this idea sees the value in the ability to spill out of RAM... so Citrix, please make it so. Below are some ideas for deployment methods, assuming Citrix do march down this route, and the pros and cons I see with each scenario.

Bear in mind, I'm not a storage guy, a full-time developer or a software architect, I'm just an enthusiast who sees potential, so drop your comments below as to what you think!

Machine Creation Services.

Idea A: Provisioning Services' improved RAM caching, Machine Creation Services and IntelliCache on XenServer.

Utilizing XenServer and MCS, we could leverage IntelliCache to negate the read IOPS, as proven with my own EWF driver testing, but still deliver that spillover mechanism allowing continuation of service.

Pros:

  • Read IOPS negation.
  • RAM caching to remove write IOPS (bar a spillover).
  • Continuation of service in a cache flood scenario.

Cons:

  • Limited to XenServer.
  • Over provisioning of RAM necessary per desktop.
  • RAM spillover will result in a large amount of IOPS to the shared storage.

Idea B: Provisioning Services' improved RAM caching, Machine Creation Services and IntelliCache on XenServer... with local copy!

Same benefits as before, but now we have zero reliance on the shared storage while the VM is up (except for ID disk actions).

Pros:

  • Read IOPS negation.
  • RAM caching to remove write IOPS.
  • Uses local resources in a spillover.
  • Continuation of service in a cache flood scenario.

Cons:

  • Limited to XenServer.
  • Over provisioning of RAM necessary per desktop.

So that's MCS in a nutshell with per-VM caching. I think this solution has so much potential I can't believe it hasn't been done, but I digress. Let's park that topic for now and move on to Citrix Provisioning Services.

Citrix Provisioning Services:

So let's look at the favorite technology of many.

Idea A: Citrix Provisioning Services and an improved RAM caching driver.

In a pure Provisioning Services environment, we would force our read IOPS over the LAN instead of the storage protocol, but still deliver a spill back to disk to allow continuation of service.

Pros:

  • Hypervisor agnostic.
  • RAM caching to remove write IOPS (bar a spillover).
  • Continuation of service in a cache flood scenario.
  • Potentially no shared storage needed, at all, if caching on the PVS server.

Cons:

  • Read IOPS aren’t really negated, they’re just forced over another technology.
  • Over provisioning of RAM necessary per desktop.
  • RAM spillover will result in a large amount of IOPS to the shared storage / PVS server.

Idea B: Citrix Provisioning Services and an improved RAM caching driver... with local copy!

Taking the above benefits, but with the gain of utilizing local storage in the spillover event.

Pros:

  • Hypervisor agnostic.
  • RAM caching to remove write IOPS.
  • Uses local resources in a spillover.
  • Continuation of service in a cache flood scenario.
  • Potentially no shared storage needed, at all.

Cons:

  • Read IOPS aren’t really negated, they’re just forced over another technology.
  • Over provisioning of RAM necessary per desktop.

So let's review:

And there we have it: four solutions to IOPS negation utilizing the Provisioning Server RAM caching driver, with a little bit of modification to deliver a robust solution to RAM caching.

The copying or creation of differencing disks would again deliver additional benefits, letting you leverage the hardware you put into each hypervisor without the shared storage investment.

Win Win, but is it?

There’s an oversight here:

There's a niggle that's been bothering me for some time, and you'll probably note I mentioned it as a con to most of the solutions above... I'm going to lay it out on the table in complete honesty...

"Isn't over-provisioning RAM on a per-desktop basis a waste of resource? Wouldn't it be better if we could share that resource across all VMs on a per-hypervisor basis?"

You betcha! If we are assigning out (for argument's sake) 1 GB of RAM cache per VM, that RAM is locked into that VM; if one machine spills, the free RAM sitting in the other desktops' caches is wasted.

And you would be completely insane not to reserve this RAM per VM: if the host's memory becomes over-committed, the cache will merely spill out to a page-file type arrangement, negating all your benefits.

Ultimately, assigning RAM in this way could be seen as wasteful in the grand scheme of things…
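To put rough numbers on that waste (the figures below are purely illustrative assumptions, not measurements):

```python
vms_per_host = 100
per_vm_cache_gb = 1.0    # reserved per desktop, usable only by that VM
typical_use_gb = 0.3     # what an average desktop actually dirties

# Per-VM reservation: the headroom locked inside quiet VMs helps nobody else.
reserved_gb = vms_per_host * per_vm_cache_gb                  # 100 GB reserved
idle_gb = vms_per_host * (per_vm_cache_gb - typical_use_gb)   # ~70 GB sitting idle

# Pooled model: one shared pool sized for the aggregate working set,
# plus headroom for a handful of heavy desktops.
pooled_gb = vms_per_host * typical_use_gb + 10 * per_vm_cache_gb
```

Same protection, well under half the RAM, purely because the headroom is shared rather than fenced off per VM.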

But there are other ways to skin this cat!

So this leads me on to something I was also considering, which popped up in a Twitter conversation recently: RAM disks and hypervisors.

Current storage vendors will create a storage pool consisting of RAM inside a VM, one per hosting hypervisor, presented as local storage to deliver good IOPS from that VM. This VM can perform additional RAM optimizations like compression, de-duplication and sharing of similar pages to reduce the page count.

This approach is super clever, but, in my humble opinion (please don't kill me, vendors), it is wasteful. Running a storage repository VM per host carries the overhead of that virtual machine, and takes away from the agnostic solution that is possible...

What If the hypervisor provides the RAM disk?

So let's exclude ESXi for a second, as getting a RAM disk into that platform would require VMware to allow access to the stack, and look at Hyper-V (2012) and XenServer...

On Unix and Windows platforms, RAM disks have been available for years. They were once a necessity, and a number of vendors still provide them for high-performance IO environments.

Let's say (for argument's sake) Citrix and Microsoft decide to provide a snap-in to their hypervisors to allow native RAM disks (or a vendor writes one themselves!), and maybe they even decide to provide RAM compression, spillover to local disk, de-duplication and page sharing on this volume from the hypervisor stack...

Wouldn't this provide all the benefits we've spoken about, without the need for a VM per host? And, using thin provisioning, allow all desktops to share the large pool of RAM disk available?

Yes, yes, it would.
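As a toy model of that thin-provisioned, hypervisor-level RAM disk (my own sketch; no shipping product works this way): every desktop sees a large virtual write disk, but physical pages are only claimed from the shared pool as blocks are actually written.

```python
class ThinRamPool:
    """Hypervisor-wide RAM pool shared by all desktops. Pages are
    allocated on first write (thin provisioning), so quiet VMs
    consume almost nothing and busy VMs can draw on the headroom."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.pages = {}  # (vm, block) -> allocated

    def write(self, vm, block):
        """Return True if the write is absorbed in RAM, False if the
        pool is exhausted and this write should spill to local disk."""
        key = (vm, block)
        if key in self.pages:
            return True  # rewriting an allocated page costs nothing extra
        if len(self.pages) >= self.capacity:
            return False  # pool full: spill this write to local disk
        self.pages[key] = True
        return True

    def used(self):
        return len(self.pages)
```

The point of the model: capacity is sized for the host's aggregate working set, not one fixed slice per VM, which is exactly what the per-desktop reservation approach cannot do.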

Above is an example of how I would see this working.

So random pictures are fine and all, but what about the read IOPS negation technologies? And what about combining these with XenServer or Hyper-V?

XenServer and IntelliCache:

Well, there you have it: all the benefits of a per-VM filter, leveraging IntelliCache for reads and spilling out to local disk... RESULT!

Pros:

  • Read IOPS negation.
  • Write IOPS negation.
  • No shared storage required for running VMs.
  • Shared pool of RAM for all desktops to use.

Cons:

  • A small one, but no migration of VMs.

Provisioning Services?

And again, all the benefits of a per-VM filter, reads redirected via the LAN and spilling out to local disk... RESULT!

Pros:

  • Write IOPS negation.
  • No shared storage required for running VMs.
  • Shared pool of RAM for all desktops to use.

Cons:

  • A small one, but no migration of VMs.
  • No real read IOPS negation.

And Hyper-V + CSV Cache?

Well here’s an accumulation of my thoughts:

So let's just talk for a second about what we are seeing... utilizing a RAM disk with spillover, and copied VHDs on boot, we remove the need for shared storage completely and host the IOPS on the hypervisor, natively, without the need for additional VMs.

And see that little icon down the bottom? Yep, that's right: live migration from host to host, thanks to Microsoft's Shared Nothing Live Migration!

Pros:

  • Some write IOPS negation.
  • No shared storage required for running VMs.
  • Live migration.
  • Shared pool of RAM for all desktops to use.
  • Write-back to local disk.

Cons:

  • No full read IOPS negation.

Review.

I'm sure there's loads of reading in here, and there will be tons of thought and questions after this blog post. This has been in my head for roughly nine months now, and it feels like a victory to finally get it all down on paper.

At the end of the day, RAM caching is the way of the future: you pay no maintenance on it, you keep your IOPS off your storage, and with a little magic from Microsoft, or particularly Citrix, you could see these benefits.

If and when these technologies make it out to you, you could quite happily stick with Machine Creation Services, leverage IntelliCache, and your shared storage requirement could be a cheap NAS from any vendor. Provisioning Services also sees a lot of benefit from this approach, but the real crème de la crème is in that IntelliCache feature.

The key functionality to these approaches is simple:

  • Cache in RAM.
  • Spill to disk.
  • Keep the differences on local storage if it spills.
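Those three behaviours fit in a few lines. A minimal sketch (a dict stands in for the local differencing disk; real drivers obviously work at the block layer):

```python
class SpillingWriteCache:
    """Cache writes in RAM; when the cache fills, flush everything to
    the local store and pass all further writes straight through, so
    the differences always stay on local storage."""

    def __init__(self, ram_limit_blocks, local_store):
        self.ram = {}
        self.ram_limit = ram_limit_blocks
        self.local_store = local_store  # dict standing in for local disk
        self.spilled = False

    def write(self, block, data):
        if (not self.spilled and block not in self.ram
                and len(self.ram) >= self.ram_limit):
            # Spill: commit the RAM differences to the local disk,
            # then disable the RAM cache for the rest of the session.
            self.local_store.update(self.ram)
            self.ram.clear()
            self.spilled = True
        target = self.local_store if self.spilled else self.ram
        target[block] = data

    def read(self, block):
        if block in self.ram:
            return self.ram[block]
        return self.local_store.get(block)
```

Note the one-way door: once spilled, the cache never re-arms for that session, mirroring the commit-and-disable behaviour described above.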

One last thing?

In true Mark Templeton fashion, you know I have one last thing. This actually hit me today while writing this post, and to be honest, I'm so amazed by the potential of this idea I'm going to build it in the lab this weekend.

But until then, a teaser…

What if I told you there was already an overlay, freely available and baked into a Microsoft technology, that would allow you to overlay local storage in Hyper-V with RAM? With a powerful API attached to it, and the ability to write specific files to the local disk, subverting the overlay only when needed?

  • RAM disk type technology? > yep
  • already available? > yep
  • Powerful API? > yep
  • already can spill to disk? > yep

Yep, yep, yep! Check back in a few days.

(If you know what it is already, keep it to yourself until I’ve gotten out of the lab!)

What do you think?

It doesn't take a half-baked enthusiast like me to see this potential, and I'd be really eager to hear your comments on these approaches. If you would prefer to keep your comments offline, you can reach me at andrew@andrewmorgan.ie.


3 Comments on “On IOPS, shared storage and a fresh idea. (Part 3) tying it all together in the stack”

  1. Steve Greenberg

    I think we have come to a similar conclusion: that the RAM disk should be provided by the hypervisor itself. This was my suggestion early in the Twitter discussion and I think in the end it IS best. It is standard now to see customers with servers with 256 GB of RAM installed. In many scenarios this truly leaves a chunk of RAM available for this purpose. Instead of being VM based it can be a storage repository, and therefore larger and possibly able to absorb spillover in RAM. PS: XenServer can now migrate machines using local storage.

