While attending E2EVC vienna recently, I found myself attending the Citrix Geekspeak session, although the opportunity to rub shoulders with fellow Citrix aficionado’s was a blast, I found myself utterly frustrated spending time talking about storage challenges in VDI.
The topic was interesting and informative, and while there were a few idea’s shared about solid state arrays, read IOPS negation with PVS or MCS there really wasn’t a vendor based, one size fits all solution to reduce both Read and Write IOPS without leveraging paid for (and quite expensive I may add) solutions like Atlantis ILIO, Fusion IO, etc. The conversation was tepid, deflating and we moved on fairly quickly to another topic.
So we got to talking…
In the company of great minds and my good friends Barry Schiffer, Iain Brighton, Ingmar Verheij, Kees Baggerman, Remko Weijnen and Simon Pettit. Over lunch we got to talking about this challenge and the pro’s and con’s to IO negation technologies.
Citrix Provisioning Server…
We spoke about the pro’s and con’s delivered by Citrix provisioning server, namely rather than actually reducing read IOPS to shared storage, it merely puts the pressure instead on the network rather than a SAN or local resources in the hypervisor.
The RAM caching option for differencing is really clever with Citrix Provisioning server, but it’s rarely utilised due to the ability to fill the cache in RAM and the ultimate blue screen that will follow this event.
Citrix provisioning server does require quite a bit of pre work to get the PVS server configured, highly available then the education process of learning how to manage images with Provisioning server, it’s more of an enterprise tool.
Citrix Machine Creation Services…
We spoke about the pro’s and con’s of delivered by Citrix Machine Creation Services and Intellicache, this technology is much smarter around caching regular reads to a local device but it’s adoption is mainly SMB and the requirements for NFS and XenServer were a hard sell… Particularly with a certain hypervisors dominance to date.
Machine Creation Services is stupidly simple to deploy, install XenServer, stick the image on an NFS share (even windows servers can host these)… Bingo. Image changes are based on Snapshots so the educational process is a null point.
But again MCS is only negating read IO, and this assumes you have capable disk’s under the hypervizor to run multiple workspaces from a single disk or array. It’s also specific to XenDesktop, sadly. So no hosted shared desktop solution.
We mulled over SSD based storage quite alot with MCS and in agreement we discussed how MCS and Intellicache could be leveraged on an SSD, but the write cache or differencing disk activity would be so write intensive it would not be long before the SSD would start to wear out.
The mission Statement:
So all this got us to thinking, without paying for incredibly expensive shared storage, redirecting the issue onto another infrastructure service, or adding on to your current shared storage to squeeze out those precious few IOPS, what could be done with VDI and storage to negate this issue?
So we got back to talking about RAM:
RAM has been around for a long time, and it’s getting cheaper as time goes by. We pile the stuff into hypervisors for shared server workloads. Citrix provisioning server chews up the stuff for caching images in server side and the PVS client even offers a ram caching feature too, but at present it limit’s the customer to a Provisioning server and XenDesktop or XenApp.
But what if we could leverage a RAM caching mechanism? decoupled from a paid for technology, or provisioning server, that’s freely available and can be leveraged in any hypervisor or technology stack? Providing a catch all, free and easy caching mechanism?
and then it hit me…
This technology has been around for years, cleverly tucked away in Thin Client architecture and Microsoft have been gradually adding more and more functionality to this heavily underestimated driver…
The Microsoft Extended Write Filter has been happily tucked away in the Microsoft embedded operating systems since XP and is fairly unfamiliar to most administrators. The Microsoft extended write filter saves all disk writes to memory (except for specific chunks of registry) resulting in the pc being clean each time it is restarted.
This technology has been mostly ignored unless you are particularly comfortable with Thin Client architectures. Interestingly during my initial research for this mini project, I found that many car pc enthusiasts or early SSD adopters have been hacking this component out of Windows Embedded and installing it into their mainstream windows operating systems to cut down on wear and tear to flash or Solid state disks.
Microsoft have been gradually building this technology with each release adding a powerful API to view cache hits, contents of the cache and even write the cache out to disk as required, in an image update scenario… or even better, when a RAM spill over is about to happen…
If this technology is tried and trusted in the car pc world, could it be translated to VDI?…
To the lab!
Excited with my research Thus far, I decided to take the plunge. I began by extracting the EWF drivers and registry keys necessary to run this driver from a copy of windows Embedded standard…
With a bit of trial, error, hair pulling and programming I managed to get this driver into a clean windows 7 x86 image, so from here I disabled the driver and started testing first without this driver.
(I’m not going to go into how to extract the driver yet, please check back later for a follow up post)
Test A: XenDesktop, no RAM caching
To start my testing, I went with a XenDesktop image, hosted on XenServer without intellicache. I sealed the shared image on an NFS shared from a windows server. I performed a cold boot, opened a few applications then shutdown again.
I tracked the read and write IOPS Via Windows Perfmon on the shared NFS disk, and below were my findings:
(Black line denotes Write IOPS to shared storage)
Test B: XenDesktop, with the Microsoft Extended Write Filter.
The results below pretty much speak for themselves…
(again, black line denotes write IOPS)
I couldn’t believe how much of a difference this little driver had made to the write IOPS…
Just for ease of comparison, here’s a side by side look at what exactly is happening here:
So where do we go from here:
Well, no matter what you do, Microsoft will not support this option and they’d probably sue the pants off me if I wrapped this solution up for download…
But this exercise on thinking outside of the box raised some really interesting questions… Around other technologies and the offerings they already have in their stack.
In my next few posts on this subject I’ll cover the below topic’s at length and discuss my findings.
- I’ll be looking at how to leverage this with Citrix Intellicache for truly IO less desktops.
- I’ll be looking at how this technology could be adopted by Microsoft for their MED-V stack.
- I’ll be looking at one of my personal favourite technologies, VDI in a Box.
- I’ll be looking at how we could leverage this API ,for spill over to disk in a cache flood scenario or even a management appliance to control the spill overs to your storage.
And Finally, I’ll be looking at how Citrix or Microsoft could quite easily combine two of their current solutions to provide an incredible offering.
And that’s it for now, just a little teaser for the follow up blog posts which I’ll gradually release before Citrix Synergy.