VMWare

Meltdown – Performance Impact Evaluation (Citrix XenApp 6.5)

2018-01-15

Meltdown is a vulnerability whose fix may have a performance impact. Microsoft has stated that the impact will be more severe if you:

a) Are running an older OS
b) Are using an older processor
c) Run applications that generate lots of context switches

Unfortunately, the environment we are operating hits all of these nails on the head.  We are using, I believe, the oldest OS that Microsoft is patching this for.  We are using older processors from 2011-2013 which do not have the PCID optimization (as reported by the SpeculationControl test script) which means performance is impacted even more.

I’m in a large environment where we have the ability to shuffle VM’s around hosts and put VM’s on specific hosts.  This will allow us to measure the impact of Meltdown in its entirety.  Our clusters are dedicated to Citrix XenApp 6.5 servers.

Looking at our cluster and all of the Citrix XenApp VM’s, we have some VM’s that are ‘application siloed’ (that is, they only host a single application) and some VM’s that are ‘generic’.

In order to determine the impact I looked at our cluster, summed up the total of each type of VM and then just divided by the number of hosts. We have 3 different geographical areas that have different VM types and user loads. I am going to examine each of these workload types across the different geographical areas and see what the outcomes are.

Since Meltdown impacts applications and workloads that have lots of context switches, I used perfmon on each server to record its context switches.

The metrics I am interested in are the context switch values, as they’ve been identified as the element that highlights the impact. My workloads look like this:

Based on this chart, our largest impact should be at Location B, followed by Location A, and lastly Location C.

However, the processors for each location are as follows:

Location A: Intel Xeon 2680
Location B: Intel Xeon 2650 v2
Location C: Intel Xeon 2680

The processors may play a role, as newer generation processors are supposed to fare better.

In order to test Meltdown in a side-by-side comparison of systems with and without the mitigation I took two identical hosts and populated them with an identical amount and type of servers. On one host we patched all the VM’s with the mitigation and with the other host we left the VM’s without the patches.

Using the wonderful ControlUp Console, we can compare the results in its real-time dashboard. Unfortunately, the dashboard only gives us a “real time” view. ControlUp offers a product called “Insights” that can show the historical data, but our organization has not subscribed to it, so I’ve had to track the performance by exporting the ControlUp views at an interval and then manually sorting and presenting the data. Using the Insights view would be much, much faster.

ControlUp has 3 different views I was hoping to explore. The first is the hosts view, which shows performance metrics pulled directly from the VMWare host. The second is the computers view, and the last is the sessions view. The computers and sessions views show metrics pulled directly from the Windows server itself. However, I am unable to accurately judge performance from the Windows Server metrics because of how Windows measures CPU utilization.

Another wonderful thing about ControlUp is we can logically group our VM’s into folders, from there ControlUp can sum the values and present it in an easily digestible presentation.  I created a logical structure like so and populated my VM’s:

 

And then within ControlUp we can “focus” on each “Location” folder and if we select the “Folder” view it presents the sums of the logical view.

HOSTS

In the hosts view, we can very quickly see the impact, ranging from 5%-26%. However, this is a real-time snapshot, so I tracked the average and examined only the “business hours”, as our load is VERY focused on 8AM-4PM. After these hours we see a significant drop in our load. If the servers are not being stressed, the performance seems to be a lot more even, or the difference is not noticeable (in a cumulative sense).

FOLDERS

Some interesting results. We are consistently seeing longer login times and application launch times. 2/3 of the environments have lower user counts on the unpatched servers, with the Citrix load balancing making that determination. The one environment that had more users on the mitigation servers is actually our least loaded in terms of servers per host and users per server, so it’s possible that more users would open up a gap, but as of now it shows that one of our environments can support an equal number of users.

Examining this view and the historical data presented I encountered oddities — namely the CPU utilization seemed to be fairly even more often than not, but the hosts view showed greater separation between the machines with mitigation and without.  I started to explore why and believe I may have come across this issue previously.

2008R2-era servers have less accurate reporting of CPU utilization.

I believe this is also the API that ControlUp is using with these servers to report on usage. When I was examining a single server with Process Explorer I noticed a *minimum* CPU utilization of 6%, but Task Manager and ControlUp would report 0% utilization at various points. The issue is one of accuracy, adding and rounding. The more users on a server, with more processes each consuming a sliver of CPU, the greater the inaccuracy. Example:

Left, Task Manager. Right, Process Explorer

We have servers with hundreds of users utilizing a workflow like this where they are using just a fraction of a percent of CPU resources. Task Manager and the like do not catch these values and round them *down*. If you have 100 users using a process that consumes 0.4% CPU then our inaccuracy is on the scale of 40%! So relying on the VM metrics of ControlUp or Windows itself is not helpful. Unfortunately, this destroys my ability to capture information within the VM, requiring us to rely solely on the information within VMWare. To be clear, I do NOT believe Windows 2012 R2 and greater OS’s have this discrepancy (although I have not tested), so this issue manifests itself pretty viciously in the XenApp 6.5-era servers. Essentially, if Meltdown is increasing CPU times on processes by a fraction of a percent, then Windows will report as if everything is OK and you will probably not notice or think there is an issue!
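To make the rounding problem concrete, here is a minimal Python sketch. It assumes a simplified model in which a tool rounds each process's CPU figure down to a whole percent before adding them up. That model is only an illustration, not a description of how Task Manager or ControlUp is actually implemented, and the function names are hypothetical.

```python
import math


def true_total(per_process_cpu):
    """Sum the real per-process CPU percentages."""
    return math.fsum(per_process_cpu)


def truncated_total(per_process_cpu):
    """Sum after rounding each process DOWN to a whole percent.

    Assumption: this mimics a tool that cannot show fractions of a percent.
    """
    return sum(math.floor(p) for p in per_process_cpu)


# 100 users, each running one process that consumes 0.4% CPU.
usage = [0.4] * 100

real = true_total(usage)
shown = truncated_total(usage)
print(f"real load:     {real:.1f}%")
print(f"reported load: {shown}%")
print(f"hidden load:   {real - shown:.1f}%")
```

In this model every process looks idle on its own, so the reported total is zero while the real load sits around 40%.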

In order to try and determine if this impact is detectable I have two servers with the same base image, with one having the mitigation installed and other did not.  I used process explorer and “saved” the process list over the course of a few hours.  I ensured the servers had a similar amount of users using a specific application that only presented a specific workload so everything was as similar as possible.  In addition, I looked at processes that aren’t configurable (or since the servers have the same base image they are configured identically).  Here were my results:

Just eyeballing it, it appears that the mitigation has had an impact on the fractional level. When taking the average of the winlogon.exe and iexplore.exe processes into account:

 

These numbers may seem small, but once you start considering the number of users the amount wasted grows dramatically. For 100 users, winlogon.exe goes from consuming a total of 1.6% to 7.1% of the CPU, resulting in an additional load of 5.5%. The iexplore.exe case is even more egregious as it spawns 2 processes per user, and these averages are per process. For 100 users, 200 iexplore.exe processes will be spawned. The iexplore.exe CPU utilization goes from 15.6% to 38.8%, for an additional load of 23.2%. Adding the mitigation patch can impact our load pretty dramatically, even though it may be underreported. The underreporting can hurt users on a far greater scale, because more users get added to servers that don’t have the resources Windows reports they have. For an application like IE, this may just mean greater slowness, depending on the workload/workflow, but if you have an application more sensitive to these performance scenarios your users may experience slowness even though the servers themselves look OK from most (all?) reporting tools.
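The totals above can be reproduced with a few lines of Python. The per-process averages in this sketch are back-calculated from the totals quoted in the text (total divided by process count), so treat them as derived values rather than the raw measurements. The function name is hypothetical.

```python
def added_load(before_total, after_total):
    """Extra CPU load, in percentage points, attributed to the mitigation."""
    return round(after_total - before_total, 1)


# (process, processes for 100 users, total CPU % unpatched, total CPU % patched)
workloads = [
    ("winlogon.exe", 100, 1.6, 7.1),
    ("iexplore.exe", 200, 15.6, 38.8),  # iexplore.exe spawns 2 processes per user
]

for name, count, before, after in workloads:
    per_proc_before = before / count  # derived per-process average, unpatched
    per_proc_after = after / count    # derived per-process average, patched
    print(
        f"{name}: {per_proc_before:.3f}% -> {per_proc_after:.3f}% per process, "
        f"+{added_load(before, after)}% total for 100 users"
    )
```

Each per-process figure is well under 1%, which is exactly the range that whole-percent reporting hides.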

Continuing with the HOSTS view, I exported all data ControlUp collects on a minute interval and then added the data to Excel and created my pivot tables with the hosts that are hosting servers with the mitigation patches and the ones without. This is what I saw for Saturday-Sunday; these days are lightly loaded.

This is Location B. The host with VM’s that are unpatched is in orange and the host with patched VM’s is in blue. The numbers are pretty identical when CPU utilization on the host is around or below 10%, but once it starts to get loaded the separation becomes apparent.

Since these datapoints were every minute, I used a moving average of 20 data points (3 per hour) to present the data in a cleaner way:
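For anyone who wants to do this smoothing outside of Excel, a trailing moving average is easy to sketch in Python. This is an illustration of the technique, not the spreadsheet used for the charts. The function name and the sample values are made up.

```python
from collections import deque


def moving_average(samples, window=20):
    """Trailing moving average over `window` samples.

    Emits one averaged value per input sample once the window is full,
    so the output has len(samples) - window + 1 entries.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    buf = deque(maxlen=window)
    running = 0.0
    averaged = []
    for value in samples:
        if len(buf) == window:
            running -= buf[0]  # the oldest sample is about to drop out
        buf.append(value)
        running += value
        if len(buf) == window:
            averaged.append(running / window)
    return averaged


# Hypothetical per-minute host CPU readings (percent) for one hour.
cpu_per_minute = [10 + (minute % 7) for minute in range(60)]
smoothed = moving_average(cpu_per_minute, window=20)
print(len(cpu_per_minute), "raw points ->", len(smoothed), "smoothed points")
```

A 20-sample window over per-minute data averages each point across the previous 20 minutes, which flattens short spikes while keeping the overall trend visible.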

Looking at the data for this Monday morning, we see the following:

Location B

 

Some interesting events, at 2:00AM the VM’s reboot.  We reboot odd and even servers each day, and in my organization of testing this, I put all the odd VM’s on the blue host, and the even VM’s on the orange host.  So the blue line going up at 2:00AM is the odd (patched) VM’s rebooting.  The reboot cycle is staggered to take place over a 90 minute interval (last VM’s should reboot around 3:30AM).  After the reboot, the servers come up and do some “pre-user” startup work like loading AppV packages, App-V registry prestaging, etc.  I track the App-V registry pre-staging duration during bootup and here are my results:

Registry Pre-staging in AppV is a light-Read heavy-Write exercise.  Registry reading and writing are slow in 2008R2 and our time to execute this task went from 610 seconds to 693 seconds for an overall duration increase of 14%.
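The 14% figure is a plain relative increase. As a quick check in Python (the function name is hypothetical), 610 to 693 seconds works out to roughly 13.6%, which rounds to the 14% quoted above.

```python
def percent_increase(before, after):
    """Relative increase from `before` to `after`, as a percentage."""
    return (after - before) / before * 100


# Registry pre-staging duration in seconds: unpatched vs patched.
print(f"{percent_increase(610, 693):.1f}%")
```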

Looking at Location A and C

Location A

Location C (under construction)

We can see in Location A the CPU load is pretty similar until the 20% mark, then separation starts to ramp up fairly drastically.  For Location C, unfortunately, we are undergoing maintenance on the ‘patched’ VM’s, so I’m showing this data for transparency but it’s only relevant up to the 14th.  I’ll update this in the next few days when the ‘patched’ VM’s come back online.

Now, I’m going to look at how “Windows” is reporting CPU performance vs the hosts’ CPU utilization.

Location A

 

Location B

 

The takeaway here is to NOT TRUST the Windows CPU utilization meter (at least on 2008 R2). The CPU utilization at the VM level does not appear to reflect the load on the hosts. While the VM’s with the patch and without the patch both report nearly identical levels of CPU utilization, at the host level the spread is much more dramatic.

 

Lastly, I am able to pull some other metrics that ControlUp tracks, namely Logon Duration and Application Launch duration. For each of the locations I got a report of the difference between the two environments:

Location A: Average Application Load Time

Location B: Average Application Load Time

 

Location A: Logon Duration

 

Location B: Logon Duration

 

 

 

In each of the metrics recorded we see a worse experience for our user base, from applications taking longer to launch to logon times increasing.

What does this all mean?

In the end, Meltdown has a significant impact on our Citrix XenApp 6.5 environment. The perfect storm of older CPU’s, an older OS and applications that have workflows that are impacted by the patch means our environment is grossly impacted. Location A has a maximum hit (as of today) of 21%, and Location B has a spread of 12%. I had originally predicted that Location B would have the largest impact; however, the newer v2 processors may be playing a role, as they may be more efficient than the older 2680.

In the end, the performance hit is not insignificant and reduces our capacity significantly once these patches are deployed. I plan on adding new articles once I have more data on Meltdown and then further again once we start adding the mitigations against Spectre.

CPU Utilization on the hosts. Orange is a host with VM’s without the Meltdown patches, blue is with the patches.

 


Using VMWare Remote Console with ControlUp

2016-08-13

I wanted to connect to the console session of some of our VM’s but ControlUp doesn’t have a native way of doing so.  Enter Script-Based-Actions and the ability to create those features!  Here is a video of it in action:

VMWare Remote Console on ControlUp

We use multiple individual vCenter servers so I have a list of them I need to connect to in order to find the VM and get the required data.  This takes a bit longer but is still faster than running 6 different vCenter consoles.  You will need to modify the vCenter list in my script and add your own:

 


Citrix Provisioning Services – Updating VMWare Tools and Target Device software — with all native tools

2016-04-05
  1. Prerequisites

Step
Detail
2 servers:
  1. Staging server with Windows 2008 R2 SP1
  2. VMWare machine configured identically to your PVS target devices (BLD server)
‘Windows Server Backup’ feature installed on the staging server
The Citrix utility ‘CVHDMOUNT.EXE’ is installed on the staging server
‘Backup’ space (approx. 100GB)
VMWare VMDK hard disk of a greater size than the PVS disk

 

Copy and install required drivers on staging server

 

Step
Detail
From a Windows Server, browse to a server with Citrix Provisioning Services installed on it.

 

Copy the drivers folder and CVhdMount.exe to the server

 

Open the drivers folder, right-click on cfsdep2.inf and select “Install”

 

Open Device Manager, right-click the computer name and choose “Add Legacy Hardware…”
Select “Next”
Select “Install the hardware that I manually select from a list (Advanced)” and click “Next”
Click “Next”
Click “Have Disk…”
Click “Browse” and go to the “drivers” folder and select “cvhdbusp6.inf” and click “Open”
Click “OK”
Ensure “Citrix Virtual Hard Disk Enumerator PVS” is listed and click “Next”
Click “Finish”

 

Setup VMWare virtual disk

Step
Detail
Open the “VMWare vSphere Client”

 

Right-click on the server you want to do the cloning on and click ‘Edit Settings…’
Under the ‘Hardware’ tab, click ‘Add…’
Select ‘Hard Disk’ and click ‘Next’
Select ‘Create a new virtual disk’ and click ‘Next’

 

Set the ‘Disk Size’ to be greater than the size of the VHD file, select the ‘Disk Provisioning’ options you require, select the ‘Location’ you want to store the disk and remember where it is stored.  You will need this location soon.  Click ‘Next’
Click ‘Next’
Click ‘OK’
Format the disk and set it as active

 

  1. Clone VHD to VMDK

      1. Backup Citrix vDisk

Step
Detail
RDP into the staging server and mount the VHD file you want to update:
Cvhdmount.exe -p 1 \\server\share\vDisks-XenApp\XenApp65Tn01.13.avhd

 

Open Disk Management and confirm your Citrix VHD is mounted and the new VMWare disk is present

 

Open ‘Windows Server Backup’
Click ‘Backup Once…’
Select ‘Different options’ then ‘Next’
Select ‘Custom’ then ‘Next’
Select ‘Add Items’
Select the PVS disk and click ‘OK’
Click ‘Next’
Select ‘Local drives’ and click ‘Next’
Select the ‘Backup Destination’ and click ‘Next’
Click ‘Backup’
Wait for the backup to complete
Click ‘Close’

 

      1. Recover backup to VMDK

Step
Detail
Click ‘Recover’
Select ‘This server’ and click ‘Next’
Click ‘Next’
Select ‘Volumes’ and click ‘Next’
Select the checkbox beside the volume and choose the ‘VMDK’ for the destination volume and click ‘Next’
Click ‘Yes’
Click ‘Recover’
Wait for the Recovery to finish
Click ‘Close’

 

      1. Fix BCD file for VMDK

Step
Detail
Unmount the Citrix vDisk.  Cvhdmount -U 0
In the command prompt, switch to the ‘Destination’ drive and check the BCD file:

 

Notice there are 3 entries that need to be corrected.
Execute the following commands, substituting the ‘E:’ for the proper drive letter:

 

bcdedit /store bcd /set {bootmgr} device partition=E:
bcdedit /store bcd /set {default} device partition=E:
bcdedit /store bcd /set {default} osdevice partition=E:

 

Confirm the BCD file now contains the correct entries:

 

      1. Configure BLD Virtual Machine and attach the VMDK

Step
Detail
Open the vCenter console, select the staging server and ‘Right-click’ and select ‘Edit Settings…’
Select the VMDK file, note the path of the Disk File and click ‘Remove’
Under ‘Removal Options’ select ‘Remove from virtual machine’ and click ‘OK’
Select the associated BLD server of this vDisk and right-click and select ‘Edit Settings…’  In this example, the vDisk I am modifying is XenApp65Tn01 which is associated with BLD server WSCTXBLD351T
Click ‘Add…’
Select ‘Hard Disk’ and click ‘Next’
Select ‘Use an existing virtual disk’ and click ‘Next’
Select ‘Browse’
Navigate to the path noted earlier, select the disk and click ‘Open’
Click ‘Next’
Click ‘Next’
Click ‘Finish’
Click ‘OK’
      1. Disable CDROM attachment on bootup

Step
Detail
Select the associated BLD server of this vDisk and right-click and select ‘Edit Settings…’  In this example, the vDisk I am modifying is XenApp65Tn01 which is associated with BLD server WSCTXBLD351T
Select the ‘CD/DVD drive 1’ and ‘Uncheck’ the ‘Connect at power on’ and click OK

 

Start the VM and uninstall the target device software

Step
Detail
Right-click on the VM and select “Power > Power On”
Right-click on the VM and select “Open Console”
Login to the VM once it boots
Click “Start”
Click “Control Panel”
Click “Programs and Features”
Click on the “Citrix Provisioning Services Target Device x64”
Right-click and choose “Uninstall”
Click “Yes”
Click “OK”
Wait for the uninstall to complete then restart the computer

 

Upgrade VMWare Tools

Step
Detail
Login to the VM once it boots
Browse to the VMWare Tools install and open ‘setup64.exe’

 

Select ‘Next’
Select ‘Custom’ and click ‘Next’
Ensure the ‘NSX’ options are set to ‘Entire feature will be unavailable’ and click ‘Next’
Select ‘Close the applications and attempt to restart them’ and click ‘OK’
Click ‘Finish’
Click ‘Yes’ to restart

 

Install new Citrix Provisioning Services Target Device software

 

Step
Detail
Login to the VM once it boots
Browse to the share that holds the updated software and open ‘PVS_Device_x64.exe’
Click ‘Install’
Click “Next”
Select ‘Acknowledged’ and click ‘Next’
Choose “I accept the terms in the license agreement” and click “Next”
Click “Next”
Click “Next”
Select ‘Complete’ and click ‘Next’
Click “Install”
Uncheck “Launch Imaging Wizard” and click “Finish”
Click “Yes” to restart the computer.
      1. Remove VMWare Disk from BLD server

Step
Detail
Right-click the VM and select ‘Edit Settings…’
Select the VMDK disk, note the ‘Disk File’ path and click ‘Remove’
Ensure ‘Remove from virtual machine’ is selected and click ‘OK’
Select the CD/DVD Drive and check the ‘Connect at power on’ box and click ‘OK’

 

  1. Clone VMDK to VHD

      1. Backup VMWare Disk

Step
Detail
Right-click on the staging server and select ‘Edit Settings…’
Select ‘Add…’
Select ‘Hard Disk’ and click ‘Next’
Select ‘Use an existing virtual disk’ and click ‘Next’
Click ‘Browse’ and select the disk you noted earlier
Click OK
Click ‘Next’
Click ‘Next’
Click ‘Finish’
Click ‘OK’
RDP into the staging server, browse the backup drive and delete the contents of ‘WindowsImageBackup’

 

Open Disk Management and confirm your VMDK is mounted

 

Right-click on the VMDK and select ‘Shrink Volume…’
Enter a number to shrink the partition so it is *smaller* than your Citrix VHD disk size

 

NOTE If you do NOT shrink the partition you will be unable to restore the partition to the smaller Citrix VHD file.
Confirm the shrink worked successfully
Open ‘Windows Server Backup’
Click ‘Backup Once…’
Select ‘Different options’ then ‘Next’
Select ‘Custom’ then ‘Next’
Select ‘Add Items’
Select the VMDK disk and click ‘OK’
Click ‘Next’
Select ‘Local drives’ and click ‘Next’
Select the ‘Backup Destination’ and click ‘Next’
Click ‘Backup’
Wait for the backup to complete
Click ‘Close’
Go to the vCenter console and Right-click on the staging server and select ‘Edit Settings…’
Select the VMDK used for updating VMWare Tools/Target Device software and click ‘Remove’
Select ‘Remove from virtual machine and delete files from disk’

 

      1. Recover backup to VMDK

Step
Detail
RDP into the staging server and mount the VHD file you want to update:
Cvhdmount.exe -p 1 \\server\share\vDisks-XenApp\XenApp65Tn01.13.avhd

 

Open ‘Windows Server Backup’
Click ‘Recover’
Select ‘This server’ and click ‘Next’
Click ‘Next’
Select ‘Volumes’ and click ‘Next’
Select the checkbox beside the volume and choose the ‘Citrix vDisk’ for the destination volume and click ‘Next’
Click ‘Yes’
Click ‘Recover’
Wait for the Recovery to finish
Click ‘Close’

 

      1. Fix BCD file for PVS vDisk

Step
Detail
In the command prompt, switch to the ‘Destination’ drive and check the BCD file:

 

Notice there are 3 entries that need to be corrected.
Execute the following commands, substituting the ‘E:’ for the proper drive letter:

 

bcdedit /store bcd /set {bootmgr} device partition=E:
bcdedit /store bcd /set {default} device partition=E:
bcdedit /store bcd /set {default} osdevice partition=E:

 

Confirm the BCD file now contains the correct entries:

This process could be scripted to make it less manual, faster, and less error prone, but given how rarely we actually do this type of update, I have just created a manual document for now.


Extend disk space on a VMWare PVS system

2014-07-14

 

 


Install VMWare drivers offline into a Citrix PVS vDisk

2014-02-28

I am attempting to disable interrupt coalescing for some testing that we are doing with a latency sensitive application and I have 2 VMWare virtual machines configured as such that work as expected.

The latency settings I have done are here:
http://www.vmware.com/files/pdf/techpaper/VMW-Tuning-Latency-Sensitive-Workloads.pdf

Essentially, we turned off interrupt coalescing on the physical NIC by doing the following:
Logging into the ESXi host with SSH

(e1000e found as our NIC driver)

(we see InterruptThrottleRate as a parameter)

Then we modified the VMWare virtual machines with these commands:
To do so through the vSphere Client, go to VM Settings > Options tab > Advanced General > Configuration Parameters and add an entry for ethernetX.coalescingScheme with the value of “disabled”

We have 2 NIC’s assigned to each of our PVS VM’s.  One NIC is dedicated for the provisioning traffic and one for access to the rest of the network.  So I had to add 2 lines to my configuration:

For the VMWare virtual machines we just had the one line:

Upon powering up the VMWare virtual machines, the per packet latency dropped significantly and our application was much more responsive.
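As a rough illustration of the per-NIC entries described above, this Python sketch generates one ethernetX.coalescingScheme line per virtual NIC. It assumes the NICs are numbered from 0 (ethernet0, ethernet1, and so on) and uses the key = "value" layout of a VMX file. The function name is hypothetical, and this is not a copy of the original configuration.

```python
def coalescing_lines(nic_count):
    """Build VMX-style entries that disable interrupt coalescing per NIC.

    Assumption: virtual NICs are indexed from 0, i.e. ethernet0, ethernet1, ...
    """
    if nic_count < 1:
        raise ValueError("nic_count must be at least 1")
    return [
        f'ethernet{index}.coalescingScheme = "disabled"'
        for index in range(nic_count)
    ]


# A PVS VM with a provisioning NIC and a production NIC needs two entries.
for line in coalescing_lines(2):
    print(line)
```

A machine with a single NIC would only need the first entry.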

Unfortunately, even with the settings being identical on the VMWare virtual machines and the Citrix PVS image, the PVS image will not disable interrupt coalescing, consistently showing our packets as having higher latency. We built the vDisk image a couple of years ago (~2011) and the vDisk now has outdated drivers that I suspect may be the issue. The VMWare machines have a VMXNET3 driver from August of 2013 and our PVS vDisk has a VMXNET3 driver from March 2011.

To test if a newer driver would help, I did not want to reverse image the vDisk image as that is such a pain in the ass.  So I tried something else.  I made a new maintenance version of the vDisk and then mounted it on the PVS server:

This mounted the vDisk as drive “D:”

I then took the newer driver from the VMWare virtual machine and injected it into the vDisk:

I could see my newer driver installed alongside the existing driver:
Published Name : oem57.inf
Original File Name : vmxnet3ndis6.inf
Inbox : No
Class Name : Net
Provider Name : VMware, Inc.
Date : 08/28/2012
Version : 1.3.11.0

Published Name : oem6.inf
Original File Name : vmmouse.inf
Inbox : No
Class Name : Mouse
Provider Name : VMware, Inc.
Date : 11/17/2009
Version : 12.4.0.6

Published Name : oem7.inf
Original File Name : vmaudio.inf
Inbox : No
Class Name : MEDIA
Provider Name : VMware
Date : 04/21/2009
Version : 5.10.0.3506

Published Name : oem9.inf
Original File Name : vmxnet3ndis6.inf
Inbox : No
Class Name : Net
Provider Name : VMware, Inc.
Date : 11/22/2011
Version : 1.2.24.0

Then unmount the vDisk:

I then set the vDisk to maintenance mode, set my PVS target device as maintenance and booted it up.  When I checked device manager I saw that the driver version was still 1.2.24.0

But clicking “Update Driver…” updated our production NIC to the newer version. I chose “Search automatically” and it found the newer, injected driver. I then rebooted the VM and success!

 


PowerCLI Fix VMWare Time Sync issue

2013-10-15

 


Reboot Server and PVS Streaming Service is started but the console shows the service is down

2013-04-25

Upon rebooting a server we found the Citrix PVS Console showed the server as down. When we investigated the server we found the service was started and there were no errors in the logs that we could see. Restarting the service brought the server back up in the console. We did see one particular error though: the date was suddenly incorrect in the event viewer.

 

Further investigation showed EventID 52: the time service resync’ed an offset.

Since this was a virtual machine we checked the VMWare settings to confirm that the time was not being sync’ed

But the time was still getting offset. Further investigation showed the VMWare host’s time was not set correctly and the server was having its time set to the host’s time, even though the above check box was not set.

It appears VMWare has additional time synchronization settings that are enabled by default and must be explicitly disabled to stop the time from synchronizing in various scenarios.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1189

The time can still be synchronized when VMWare Tools starts on reboot, on a “resume”, when the tools are restarted, and in other scenarios. To prevent it from happening you must edit the VMX file and set the values as stated in the KB article above.


Utilizing PowerShell to make Citrix VM Templates

2012-03-05

Because my company doesn’t utilize provisioning servers to deploy new Citrix XenApp servers, I’ve had to come up with a couple of PowerShell scripts to make VMWare Templates from which I can then deploy multiple XenApp servers. You need VMWare PowerCLI to run this script. This is my script:

This script does the following:
1) Sets the inputs from a piped in object (get-vm VMTOTEMPLATE | create-template)
2) Sets a series of variables ($vm, $name, $newname, $date, $templatename, etc.)
3) We setup a startup script on the target server to make into a template that:
a) Removes the computer from the domain
b) renames the computer to a generic name (XATEMPLATE)
c) Adds registry keys that will allow sysprep to run
d) Configures XenApp to “Image” mode
e) Shuts itself down once running the script is complete
f) deletes the script from running on startup
4) We then set the target to autologin with the local admin user name and password so the startup script in step 3 will be run
5) Begins the cloning by making a new-vm with the target machine
6) We unplug the NIC from VMWare so that when the clone starts up, the script won’t actually remove the original machine from the domain; the clone only unjoins itself
7) start the clone
8) the PowerCLI will now wait till the machine turns itself off…
9) Then it will reconnect the NIC, remove any stale templates and then makes a new template and then removes the clone VM.

Done! 🙂


Purposeful limitations

2012-02-15

I have a Citrix environment without a provisioning server. This means that since I’m going to build my VM’s out as opposed to up, I need to script a method to automate deployment of Citrix servers within the VM environment. Fortunately, VMWare gives us PowerCLI, a PowerShell interface that you can use to manipulate VMWare.

While I was working on this I ran into some weird issues while working with VMWare and deploying OS Customizations.

It turns out if you burn your 3 activations VMWare cannot customize the OS anymore from VMWare. BUT(!) there is a dirty little trick that works for Server 2008 R2 (only one I’ve tested anyhow) that can allow you to work around the issue. The issue is VMWare’s OS Customizations does a Generalize pass using Sysprep. If you exceed the 3 activations, sysprep will fail because it will notice this at the Generalize pass. The solution that I’ve read is to add this line to the sysprep.xml file:

http://support.microsoft.com/kb/929828

Unfortunately, VMWare does not appear to have a way to allow you to add the SkipRearm to the XML that it generates through the OS Customization GUI. But you can add a couple of registry keys that appear to have the same effect. They are:

These two keys will signal to sysprep that this image can be generalized even though it has exceeded its activation count.

So, what I’ve found is you need to set these registry keys *prior* to shutting it down to convert into a template. Then when you deploy from template and it starts up and engages sysprep, sysprep will run without issue.

So how do you put this all together? Here is my scenario:
We do NOT have a golden image template of our Citrix environment. This is because it is incredibly fluid. Changes occur to applications on the servers fairly regularly and documentation/memory is difficult to ensure that when we commit to putting these changes into the template that they are actually done. So how are we going to do this? We are going to take a 3 prong approach:
1) We have a dev environment where developers can modify/change and generally mess up their VM’s to their heart’s content. The whole goal of this environment is to get their application working at 100%. The developers do not need to answer to anyone and have free rein to modify and experiment.
2) We have a test environment where, when the developers think the application is tweaked/modified/configured exactly right, they will pass off documentation to me to install in this environment. Any errors or modifications that happen outside of their documentation will be further documented and verified.
3) Once step two is validated we can then push on to install on the production servers.

The issue we encounter is that some of our application installs are huge, multi-step, non-automated processes with large configuration tweaks post install. Once we get a solid install on one of the production servers, our preferred method of redeployment (because I’ve automated this process, thus eliminating possible failure points) will be to template and redeploy with VMWare. This is the script I’ve written to accomplish this:

I’ve shamelessly stolen and modified this script from elsewhere. To execute this script, copy and paste it in a PowerGUI prompt then run:

It will execute the following:
1) Pull the following parameters from the target VM to reclone:
The VM Object
The VM Objects Name
The New Name for the temp clone
A Template Name
The Datastore the VM resides on
The folder the VM resides in
And the VM Host
2) We then copy a script to our target machine that does the following:
Removes the machine from the domain
Renames the machine
Prepares the addition of our two Sysprep registry key fixes
Prepares a 20 second count down
Deletes the script that executes step 2.
3) We add the default username and password and autologon registry values. As of this writing this is not working for some reason
4) We execute the command that preps the machine for cloning with Citrix. The machine you run this against will need to be rebooted as this will prevent new logons, but existing should continue (I think).
5) We start the cloning process and unplug the NIC on the new machine. We don’t want the new machine to come up and unjoin the original machine from the domain. From here, it will auto-power on. Ideally, it will login automatically and run the preconfigured script. (You may have to login manually to get it to do its thing) Once done it should shut itself down.
6) Now VMWare will wait until the machine is powered off. Once it’s powered off it will reconnect the NIC and make a template out of the VM and delete the temporary clone.
7) Lastly it will now setup the OS Customization dynamically and create a VM with it.

Voila!


PowerShell to Clone VM’s

2012-01-31

I need to create a Provisioning Server of my own. We don’t want to purchase the software to do so and we may not need to do so… VMWare has PowerCLI which may provide enough to do the following:

1) Notify the VM that it needs to disable Citrix logins
2) Have the VM disable logins and then check for logins. If none are found kick-off the cloning process. Kick off consists of:
2a) Unjoin from the domain
2b) Shutdown
If users are still logged in, log them off forcefully at midnight and start the process
3) Use VMWare Templates to clone the VM with the VM name.
4) Shutdown original VM’s with the same names
5) Startup new VM’s and join to domain…

Done?

I have a script that does the cloning… I got it from these sites:
http://www.vtesseract.com/post/16447807254/clone-list-powercli-function
http://communities.vmware.com/docs/DOC-18155

I had issues with running it though. For some reason my PowerShell wouldn’t run it with the comments in it so I had to take them out:

You can run it with a command like so:

Get-VM MyVM | Clone-List

I will need to modify this script to see if I can use VMWare Templates (I think that’s the right terminology) and Citrix XA PowerShell to see if I can get this to work… We shall see 🙂

EDIT – It’s not VMWare Templates… It’s OSCustomizationSpec I think.
