
AppV5 – Package failed configuration in folder with error 0xA040132A-0x3

2015-03-04

When publishing AppV5 applications on a PVS server, we sometimes encounter an issue where packages do not load. Events like the following are usually logged to Event Viewer:

Package {5075e8a4-4335-4101-991e-be88f5862575} version {940e4d49-af37-428c-a129-3bd37e3e4539} failed configuration in folder ‘D:\AppVData\PackageInstallationRoot\5075E8A4-4335-4101-991E-BE88F5862575\940E4D49-AF37-428C-A129-3BD37E3E4539’ with error 0xA040132A-0x3.

With another event that follows:

Part or all packages publish failed.
published: 14
failed: 10
Please check the error events of ‘Configure/Publish Package’ before this message for the details of the failure.

This issue is typically caused by a mismatch between two things: the folders that exist under the “PackageInstallationRoot” (in my example, D:\AppVData\PackageInstallationRoot) and the package entries that exist in the registry. You can find the PackageInstallationRoot value here:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\AppV\Client\Streaming, value: PackageInstallationRoot, data: D:\AppVData\PackageInstallationRoot

and the package registry entries exist here:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\AppV\Client\Streaming\Packages

 

There were more entries in the registry than folders in the PackageInstallationRoot path above.

What you will find is a mismatch between the number of keys in Packages and the number of folders in your PackageInstallationRoot path. In our case, 14 folders existed in that path and 10 were missing, yet all 24 entries existed in the registry.

To fix this error I created a script that creates the missing folders found in the registry and then runs Get-AppvPublishingServer | Sync-AppvPublishingServer.
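
A minimal sketch of such a script (it assumes the keys under Streaming\Packages are named by package GUID with version GUID subkeys beneath them, mirroring the <PackageId>\<VersionId> folder layout under the PackageInstallationRoot):

$root = (Get-ItemProperty 'HKLM:\SOFTWARE\Microsoft\AppV\Client\Streaming').PackageInstallationRoot

Get-ChildItem 'HKLM:\SOFTWARE\Microsoft\AppV\Client\Streaming\Packages' | ForEach-Object {
    $packageId = $_.PSChildName
    Get-ChildItem $_.PSPath | ForEach-Object {
        $versionId = $_.PSChildName
        $folder = Join-Path $root (Join-Path $packageId $versionId)
        if (-not (Test-Path $folder)) {
            # Recreate the folder the registry says should exist
            New-Item -ItemType Directory -Path $folder | Out-Null
        }
    }
}

Get-AppvPublishingServer | Sync-AppvPublishingServer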

And then all applications were able to be published without issue.


APPV5 – Virtualization Template

2015-02-20

Use it if you want, or not. This virtualization template is applied to the sequencer. I’ve found it removes a lot of useless information that would otherwise get captured in a sequence.

 


Be careful of your AVHD to VHD chains with Citrix PVS across multiple sites

2015-02-05

Citrix PVS is a great product. With a single VHD file you can boot multiple machines, and with the new RAM caching technology introduced in PVS 7.1 it’s super fast.

We have Citrix PVS 7.1 set up across 5 datacenters in 3 geographically dispersed regions, with 2 primary sites each having a primary datacenter and a DR datacenter. Each datacenter has a high-speed local file share for our PVS images. Our Active Directory is at the 2003 functional level. Our PVS setup is configured to stream the vDisks from a local file share. This information is important context for what follows.

I found a problem in one of our datacenters when we had to reboot some VMs. Essentially, our target devices were booting slowly. Like really, really, really slowly. In addition, once they did boot, logging into them was slow, and doing anything once you were logged in was slow. But things did seem to get faster the more you clicked and prodded your way around the system. For instance, clicking the ‘Start’ button took 15 seconds the first time but was normal afterwards, and clicking on any sub-folders was slow with the same symptoms.

This is a poorly performing VM. Notice the high number of retries, slow Throughput and long Boot Time.

Conversely, connecting to a target device at the other city showed these statistics:

Example of a much, much better performing VM.  14 second boot time, throughput about 10x faster than the slower VM and ZERO retries.

So we have an issue. The two cities are configured nearly identically, but one is suffering severe performance problems. We looked at the host the VM was running on, but moving it to different hosts didn’t seem to make any difference. I then RDPed to the VM and launched Process Monitor so I could watch the boot-up process. This is what I saw:

chfs04 is the local file server hosting our vDisk. citrixnas01 is a remote file server across a WAN connection in the other city’s datacenter. For some reason, PVS is reading the *versioned* AVHD file from the local file share, but it is reading the base VHD file from citrixnas01. This is, obviously, a huge issue. Reading the base image over a WAN would explain the poor performance we are experiencing and the high packet retry counts.

But why is it going over the WAN? It turns out that the AVHD file contains a section in its header that describes the location of the parent VHD file. PVS is simply following the parent locator built into the VHD spec for chained (differencing) disks.

Hex editing the avhd file reveals the chain path
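
If you don’t want to break out a hex editor, the same parent locator can be read with a few lines of PowerShell. This is a sketch based on the published VHD specification; it assumes the dynamic disk header sits at its usual 512-byte offset, and the path below is only an example:

$avhd = 'D:\Store\XenAppPn01.6.avhd'   # example path, substitute your own
$fs = [System.IO.File]::OpenRead($avhd)
try {
    # Parent Unicode Name field: 512 bytes of UTF-16 (big-endian), 64 bytes into the dynamic disk header
    $buffer = New-Object byte[] 512
    $null = $fs.Seek(512 + 64, 'Begin')
    $null = $fs.Read($buffer, 0, $buffer.Length)
    [System.Text.Encoding]::BigEndianUnicode.GetString($buffer).TrimEnd([char]0)
} finally {
    $fs.Close()
}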

(Un)fortunately for us, our PVS servers can read the file share across the WAN and cache data locally, so things tended to speed up the longer a VM was in use. To fix this issue immediately, we edited the hosts file on the PVS server to point citrixnas01 at the local file server.
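
For reference, the change amounts to a single hosts file entry on each PVS server in the site (the IP address here is made up; use whatever your local file server answers on):

# Point citrixnas01 at the local file server instead of letting DNS send us over the WAN
Add-Content -Path "$env:windir\System32\drivers\etc\hosts" -Value "10.0.50.25`tcitrixnas01"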

After executing that change I rebooted one of the target devices in the *slow* site.

Throughput is now *much* better, but the number of retries is still concerning

Now, the tricky thing about this issue is we thought we had it covered because we configured each site to have its own store:

Our thinking was that this path is where the PVS service would look when trying to find VHDs and AVHDs, so when it looked for XenAppPn01.6.vhd it would look in that store. Obviously, this line of thinking is not correct. So if you have geographically distant sites, it is important that the path you use when creating a version corresponds to the fastest, local share in every site. For us, the folder structure is identical in all sites, so a hosts file entry pointing citrixnas01 to the local file share works in our scenario for now.

EDIT – I should also note that you can see the incorrect file paths in Process Explorer as well. Changing the hosts file wasn’t enough; we also needed to restart the streaming service, as it held cached information on how to reach the files across the WAN. Process Explorer can show you which VHD files the stream service has open and where (under Files and Handles):

Citrixnas01 shouldn’t be in there…  Host file has been changed…
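
The restart itself is a one-liner (the display name match below is an assumption; adjust it to whatever the PVS streaming service is called on your servers):

Get-Service -DisplayName '*Stream Service*' | Restart-Service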

After a streaming service restart:

Muuuuch better.


How important is it to launch your application in an AppV package while sequencing?

2015-01-22

For a while I was of the mindset that sequencing an AppV package should be as clean as humanly possible. This would include finding all configuration tweaks to files/registry keys ahead of time and implementing them so that any registry keys generated at runtime would be unique to the user. I’ve seen some applications generate a unique GUID that the vendor uses to lock the application to one machine. Since this key would be generated in HKLM, where all users could see it, it prevented new launches.

But if you didn’t launch the application while sequencing, the key wouldn’t get generated until the user launched it in their bubble. This effectively allowed multiple users to use the same application on one server. With this new information in mind, and a new outlook on keeping the AppV registry hive to a bare minimum, forward I strode. And hard into a wall I ran.

What I eventually found was applications that errored on first launch:

But would launch just fine the second time:

So what would cause it to fail the first time? I imagine in most cases the error message is caused by missing files, registry keys or values. So how do you find these newly generated items? Well, the nice thing about AppV is that it stores all these things in two places: your registry hive and your user profile.

Before launching the application I ran “dir /s /b C:\Users\Adtest91 >> Clean.txt” to capture a file listing. I then launched the program twice and ran “dir /s /b C:\Users\Adtest91 >> Working.txt”. Comparing the two files (see the sketch after the list below) revealed the following new paths:

C:\Users\adtest91\AppData\Local\Microsoft\AppVClient\VFS\ADB25534-3FE9-44BD-9FC8-D5AAD8C0E728
C:\Users\adtest91\DesktopFax
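
The comparison itself can be done with Compare-Object rather than by eye (a sketch, using the two listing files captured above):

# Lines only present in Working.txt are files/folders created by the first two launches
Compare-Object (Get-Content .\Clean.txt) (Get-Content .\Working.txt) |
    Where-Object { $_.SideIndicator -eq '=>' } |
    Select-Object -ExpandProperty InputObject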

To see if these two paths were causing my issue, I deleted them and relaunched the application. The application launched just fine. I then backed up those two folders. To rule out the file system with more finality, I deleted my user profile, copied my two backup folders back to their recorded paths, tried launching the program, and got the error message again. With that I felt confident I could rule out files/folders as the cause.

The beautiful thing about AppV registry changes is that they are recorded to your user profile. They are stored here:
HKEY_USERS\<user SID>_Classes\AppV\Client\Packages

Export this key prior to launching your application, launch the application, export the key again, and diff the two exports. Any created or modified registry keys will reside in this location for you to examine.
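
One way to capture and diff the exports (the SID here is the one from my session, shown in the export below; substitute your own):

$key = 'HKU\S-1-5-21-38857442-2693285798-3636612711-15053136_Classes\AppV\Client\Packages'
reg export $key before.reg /y
# ...launch the application in its bubble, then...
reg export $key after.reg /y
Compare-Object (Get-Content .\before.reg) (Get-Content .\after.reg)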

For this issue, this was the key that was generated:

[HKEY_USERS\S-1-5-21-38857442-2693285798-3636612711-15053136_Classes\AppV\Client\Packages\ADB25534-3FE9-44BD-9FC8-D5AAD8C0E728\REGISTRY\MACHINE\SOFTWARE\Classes\TypeLib\{3B7C8863-D78F-101B-B9B5-04021C009402}\1.2\0\win32]
@=”C:\\Windows\\system32\\Richtx32.ocx”

Deciphering the key results in the following missing value:
[HKLM\SOFTWARE\Classes\TypeLib\{3B7C8863-D78F-101B-B9B5-04021C009402}\1.2\0\win32]

Looking locally on the server, this is what I saw in the registry:

There was nothing in the Data field!  AppV5 ‘integrates’ the Classes key in your AppV package when you publish it.  I resequenced and launched the application after the install and checked the key again:

Surprise, surprise. And now the application launches without issue. So it appears that some application installers don’t completely register all of their files until the application is launched (OCX files account for the two instances of this issue I’ve noticed). So now, our policy will always be to, at a minimum, launch the application while sequencing.


AppV 5 – Measuring RegistryStaging load times

2014-12-14

Per my previous post, I have an issue where AppVClient.exe consumes significant CPU resources upon application launch. On a Microsoft forum, another member did some further investigation and discovered that the slowness and delayed launch times are related to registry staging. To confirm and measure the impact of registry staging, I wrote a script that measures how long it takes registry staging to finish for all AppV5 applications on your computer/server.

What this script does is iterate through all your AppV5 applications, launching a cmd.exe inside each package’s AppV environment. It then polls for the RegistryStagingFinished registry key and, once it is found, moves on to the next program. It records the elapsed time for each package and exports the results as a CSV file.
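
A stripped-down sketch of the approach (not the full script). It assumes the /appvve switch launches a process inside the package’s virtual environment, and that the App-V client writes a RegistryStagingFinished value under the package’s version key once staging completes:

$results = foreach ($pkg in Get-AppvClientPackage) {
    # Assumed location of the staging flag for this package/version
    $regPath = "HKLM:\SOFTWARE\Microsoft\AppV\Client\Packages\$($pkg.PackageId)\Versions\$($pkg.VersionId)"
    $timer = [System.Diagnostics.Stopwatch]::StartNew()
    # Launch cmd.exe inside the package's virtual environment to trigger staging
    Start-Process cmd.exe -ArgumentList "/appvve:$($pkg.PackageId)_$($pkg.VersionId)", '/c', 'exit' -WindowStyle Hidden
    while (-not (Get-ItemProperty -Path $regPath -Name RegistryStagingFinished -ErrorAction SilentlyContinue) -and
           $timer.Elapsed.TotalSeconds -lt 300) {
        Start-Sleep -Milliseconds 250
    }
    $timer.Stop()
    [pscustomobject]@{ Package = $pkg.Name; StagingSeconds = [math]::Round($timer.Elapsed.TotalSeconds, 2) }
}
$results | Export-Csv -Path .\RegistryStagingTimes.csv -NoTypeInformation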

By utilizing this script as an AppV prelaunch/startup script, we can optimize our users’ first application startup times and reduce the CPU utilization of first-run applications.


Exploring the Citrix XML 6.5 Broker in more detail

2014-11-28

The Citrix XML broker actually relies on many pieces to ensure fast and proper operation. This CTX article describes the process for XenApp 6 (seems applicable to 6.5 as well).

The part that is relevant to the XML broker is steps 4-9.

4. The user’s credentials are forwarded from XML to the IMA service in HTTP (or HTTPS) form.

5. The IMA then forwards them to the local Lsass.exe.

6. The Lsass.exe encrypts the credentials and passes them to the domain controller.

7. The domain controller returns the SIDs (user’s SID and the list of group SIDs) back to Lsass.exe and to IMA.

8. IMA uses the SIDs to search the Local Host Cache (LHC) for a list of applications and the Worker Group Preference policy for that authenticated user.

9. The list of the applications together with the user’s worker group preference policy are returned to the Web Interface.

So what does this look like?

Starting with packet #32 we see the initial POST request for a list of applications.

Steps 5, 6 and 7 are packets 36-86. LSASS goes back to AD to grab the SIDs.

Step 7.5: It appears that the first time you enumerate applications on a broker, the information is queried from the SQL database and stored in the Local Host Cache. This would be packets 87-94. Subsequent queries do not show this traffic.

Lastly, step 9 is all the traffic we see after packet 95 in red; the return of the XML data.  For our setup, our XML brokers responded with the following timings:

Step 4: 1ms
Step 5: 2ms
Steps 6 and 7: 47ms
Step 7.5 (DB query is not in LHC): 2ms
Step 8: 14ms
Step 9: 14ms

Total: 80ms

With a freshly created LHC, this is what Process Monitor sees:

Again, the IMASrv.exe will actually NOT be present in this list if you have executed at least one query against the XML broker as it is only displayed when it queries the database (wssql011n02) and stores the response in the LHC.

So, what could contribute to slow XML broker performance?

Utilizing the WCAT script, we can continuously hammer the XML broker with however many connections we desire. The XML brokers have a maximum of 16 threads to deal with incoming traffic, but at 80ms per request/response the queue would have to get fairly long to create a noticeable performance impact. In addition, previous tests on the 4.5 broker showed that additional CPUs help improve XML broker performance; I think the 6.5 broker performs better with fewer CPUs.

Utilizing the bandwidth emulator clumsy, I’m going to simulate some poor network performance against the XML broker to see how XML response times vary. The only network hops I can see are to the source (Web Interface), Active Directory, and (potentially) the SQL database.

Just starting the clumsy software with its filter-by-IP capability added about 800ms to the total round-trip time. Something to consider with other network management/threat protection software, I imagine…

Anyways, adding just 20ms of lag between the Web Interface and the XML broker increased processing time by another 500ms; that is, 1,500ms on the low end and 2,200ms on the high end. Increasing packet latency to 100ms brought the total processing time to 2,700ms on the low end and 3,800ms on the high end. I think it’s safe to say that keeping the Web Interface and XML brokers beside each other for the lowest possible latency is a big performance win.

Targeting Active Directory with a latency of 20ms brought times from 420ms to 550ms. Increasing that to 100ms brought the response times up to 890ms. Not too shabby. It seems AD is more resilient to latency.

Targeting the SQL database with a latency of 100ms showed that the first query after the LHC was rebuilt went to 600ms and then back down to 420ms thereafter. Locality to the database seems to have the lowest impact, but the 100ms lag did increase the LHC rebuild time from near instantaneous to about 3 minutes.

I did try to test a heavy disk load against the XML broker, but I was running this server on PVS with RAM cache with disk overflow enabled, which means my LHC is stored in RAM, and no matter how hard I hammered the C:\ drive I couldn’t make an impact.


Load testing Citrix XML broker

2014-11-27

Previously, we encountered performance issues with the Citrix Web Interface due to our user load. I devised a test using Microsoft WCAT to hammer the Web Interface servers. We found that, even after removing the ASPX processing limitation, logins were still slow and some XML brokers were taking a long time to respond.

I’ve been tasked with finding out why. The Citrix XML service is a basic web server that takes an XML POST, processes it and spits back an XML response. To test the performance of the XML service, I created a PowerShell script to send the same XML request that occurs when you log in through the Web Interface.

XML-Test.ps1:
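
A rough sketch of such a script (the XML body is whatever request you captured from a Web Interface login; /scripts/wpnbr.dll as the XML service endpoint and the Invoke-WebRequest parameters are assumptions):

# Captured XML POST body from a Web Interface login (not reproduced here)
$xmlRequest = Get-Content .\CapturedRequest.xml -Raw

Import-Csv .\list_of_xml.csv | ForEach-Object {
    $url = "http://$($_.Broker):$($_.Port)/scripts/wpnbr.dll"
    $ms = (Measure-Command {
        Invoke-WebRequest -Uri $url -Method Post -ContentType 'text/xml' -Body $xmlRequest | Out-Null
    }).TotalMilliseconds
    # Append one line per request: timestamp, broker, port, milliseconds
    "$(Get-Date),$($_.Broker),$($_.Port),$ms" | Add-Content .\xml-times.csv
}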

 
The list_of_xml.csv looks like this:

Farm,Broker,Port
farm,wsctxrshipxml1,80

The output of the file looks like so:

11/26/2014 3:34:29 PM,wsctxrshipxml1,80,312.197
11/26/2014 3:34:30 PM,wsctxrshipxml1,80,345.0929
11/26/2014 3:34:31 PM,wsctxrshipxml1,80,255.165
11/26/2014 3:34:33 PM,wsctxrshipxml1,80,294.3027
11/26/2014 3:34:34 PM,wsctxrshipxml1,80,300.1806

The times are in milliseconds, in the far-right column.
Utilizing this information, we can gauge how quickly the XML brokers respond, and using this as a baseline we can start load testing to see how response times change under load.

To do this, we go back to WCAT. I set my XML.ubr file like so:

I then started the client:

start "" wcclient.exe localhost

And let it roll. When monitoring the server, the process that takes up the most time is IMASRV.EXE. I imagine that is because the XML service is really just a simple web server: it accepts the traffic, hands it off to IMASRV.EXE to actually process, gets the response back, and sends it to the requestor.

Load testing started at 11:49:00. After 40 concurrent connections the XML service was responding to requests at 2,000ms per request.

With this testing we can now try to improve the performance of the XML broker. We monitored one of our brokers, made some changes, and reran the test to see the impact. The largest positive impact we saw came from adding CPUs to the XML brokers. The following graphs illustrate the differences we saw:

 

I capped the top graph at around 20,000ms for responding to an XML request, and the bottom graph shows the maximum number of concurrent connections at 3,000ms. 3,000ms would be very high for XML enumeration in my humble opinion, but still tolerable. On a single-CPU system, XML enumeration can only sustain about 12-15 concurrent requests before it tops out; 2-CPU systems do slightly better at 24-28 concurrent requests, and 4 CPUs handle about 60 concurrent requests. With 8 CPUs, we exceed 3,000ms at 120 concurrent requests. Ideally you would keep all requests under 1,000ms, which for 8 CPUs is about 30 concurrent sessions, and 21 sessions for 4 CPUs. 1 CPU can only sustain about 2 concurrent requests to stay under 1,000ms, while 2 CPUs can sustain about 5.

Again, this is the same query that the Web Interface sends when you log in to a farm. So if you have 10 farms and they all take 1,000ms to respond to an XML request, you will sit at the login screen for 10 seconds. StoreFront allows parallel requests, which would (potentially) reduce that to 1 second, but for Web Interface (and even StoreFront), having an optimized XML broker configuration is ideal and, apparently, is very dependent on the number of CPUs you can give it.

Recommendation: as many as possible. Unfortunately, I did not have the ability to test 16- or 32-core systems, but for an enterprise environment I would try to keep 8 as a minimum.