Logons

Printers and their impact on logon duration

2019-03-05
/ /
in Blog
/

40 second logons.  100 second logons.  200 second logons.  

Our users had an average logon time of 8-11 seconds.  But some users were hitting 40-200 seconds.  Users were frustrated and calling the help desk.

Why?

I did not want to individually examine all of the users whom had a long logon duration.  There was dozens of dozens of users, but after sampling 5 of them, the root cause was consistent.  Printers were causing long logons.  

What I found was users were encountering the long logons on specific workstations.  Further investigation revealed these workstations had globally mapped printers.  Some workstations had dozens of printers.  Some of the printers were long gone, having been moved/re-IP’ed, shutdown, or re-provisioned and the entry in the workstation not updated.  Citrix, depending on Receiver version, attempts to do a check to validate the printer’s operation before mapping and this would wait for a network timeout (or a crash of Receiver).  When I stopped the print spooler on the local workstation, logons came back down to the proper range.

What I sought to do was create something to make identifying this root cause easier, quicker, and with more specificity — in other words was there a particular printer causing the delay?

Citrix Printing Methods

Citrix has two methods of printing.  “Direct connection” and local printing.  Direct connection has the potential for larger impact on logon duration as it executes additional steps and actions on the Citrix server.

To enable “Direct Connection”, Citrix has a policy “Direct connection to print servers”.  By default, this policy is enabled.

With this policy enabled, it changes the behavior of how Citrix connects printers you have mapped locally as network printers.  Citrix has a diagram here:

But I feel this is missing temporal information.  I’ve created a video to highlight the steps.

Direct Connection Process

Why is this important?

Citrix has an option that is an absolute requirement for some applications.  
“Wait for printers to be created (server desktop)” (Virtual Apps and Desktops 7.x) or “Start this application without waiting for printers to be created. (Unchecked)” (Citrix XenApp 6.5).

This policy needs to be enabled for numerous applications that require a specific printer is present before the application starts.  This is usually due to applications that have pre-defined printers like label printers and do a check on app launch.  If you have an application like that you probably  have this feature enabled.

Citrix has many options for managing printers and they can impact your logon process.

Focusing on Direct Connection being enabled, if “Wait for printers to be created (server desktop)” policy is also enabled, care MUST be taken to minimize logon times.

Direct connection does something that can have a very adverse affect on logon time that is enabled by default.  When it connects directly to the print server it will check and install the print driver on the Citrix server.  The installation activity is visible via process monitor.

The DrvInst.exe process is the installation of printer drivers

The installation of printer driver, by default, will only occur if the driver is inbox or prestaged on the Citrix server.  But the check to see if a print driver needs to be installed occurs every time a session is created.  This behavior can be changed by policy.

Why does this impact logon times?  Driver installation can take a while, especially if there is communication issues between the Citrix server and print server.  Or if the print server is under stress and just simply slow to respond.  Or if the driver package that needs to be loaded is especially large, has lots of forms or page formats, multiple trays, etc.  This all adds up.

How can I tell what the impact of printers might be on Logon Duration?

I’ve been working on updating a Script-Based Action (SBA) originally posted by Guy Leech on the Citrix blogs.  This enhancement to the SBA will output more information breaking down what’s consuming your logon times.  Specifically, this tackles how long printers take in the ControlUp column: “Logon Duration – Other”

I’ve created a video showing how to enable the additional logging and how to run the new Script Based Action and what the output looks like:

What’s new in the output?

For the individual printers, direct connection printers now show two lines, the Driver load time and Printer load/connection time. 

The driver load time is titled with Driver and then the UNC path to the printer. 

The Printer and UNC path shows the the actual connection, printer configuration and establishment time.  All Direct Connection printers will be shown with UNC paths.

Citrix Universal Print Driver (UPD) mapped printers using the Citrix Universal Print Driver (print jobs that get sent back to the client for processing) are shown with their “Friendly Name” and do not have a matching “Driver” component as no driver loading is required.  In the example output above, “\\printsrv\HP Color LaserJet 5550” is a direct connection printer and “Zebra R110Xi HF (300 dpi)” is a Citrix UPD printer.

Here is an example of the full output:

The “Connect to Printers” is a sub-phase under “Pre-Shell (Userinit)“.  Each “Printer:” and “Driver:” is a sub-phase under “Connect to Printers“.  In this example screenshot, “Pre-Shell (Userinit)” was consuming 117.6 seconds of a 133.2 second logon duration.  Prior to this script that was all that was known.  Now we can see that connecting to printers consumes 97.7 of those 117.6 seconds!

Awesome!  What’s the catch?

In order to generate this information, some more verbose logging is required on the Citrix server side.   I specifically worked to ensure that no 3rd party utilities would be required so we simply need to enable 4 additional features.

Command Line Audit
PrintServer/Operational Logs
Audit Process Creation/Termination

I’ve written a batch script to enable these features:

Easy!

Wait a minute!  Windows Server 2008R2 doesn’t have Command Line Auditing!

Good catch!  Windows Server 2008R2 doesn’t have command line auditing so it will miss the driver load information.  There is no way around this using native tools, but modification could be done to use a 3rd party tool like sysmon.  The output on a 2008R2 system would show the following (with all other features enabled).

Awesome!  Let me at this script!

This script is available on the ControlUp site here.  If you have ControlUp you can run simply add it via the “Script Based Actions” button and further derive more information on your users logon performance!  

Last Gotcha

In some instances the “Connect to Printers phase” may start before “Pre-Shell”. This has been observed and is actually happening! The operating system is attempting to start *some* printer events asynchronously in certain situations and so the connect to Printers may start before userinit, but overall connecting to printers may still be blocking logons while it trys and completes.

Steve Elgan kindly pointed out that it’s important you size your event logs to ensure the data is present for the users whom you want to monitor. If your event logs are sized too small than the data being queried by the script won’t be present. So ensure you have your Security and the Printer event logs sized large enough to capture all of this data! For an idea of scale, you can look at your log size and the oldest entry there. If your log is 1MB in size and your oldest entry is from an hour ago, than sizing your log to 24MB should capture about a day’s worth of information.

Hope this helps you reduce your long logon durations!

Read More

Citrix Logon Simulator’s – Part 2

2019-02-06
/ /
in Blog
/

In my previous post I was looking at utilizing a Logon Simulator to setup some proactive monitoring of a Citrix environment.  I setup some goals for myself:

  1. Minimize the number of VM’s to run the robots
  2. As little resource consumption as possible
  3. Still provide operational alerts
  4. Operate on-premise

I want the footprint of these robots to be tiny.  This must done on Server Core.

I want to run multiple instances of the logon simulator concurrently.

I need to be able to test “Stores” that do not have “Receiver for Web” sites enabled.

I want it so if I reboot the robot he picks up and starts running.

The Choice.

In order to successfully hit these specified targets I opt’ed to use ControlUp’s Logon Simulator.  It can target the Store Service so it works with our “Receiver for Web”-less stores.  It also has the features to generate events that can be targeted to send out notifications of an application launch failure.

The Setup

In order to achieve my goals I need the following:

  • A Service Account that will be logging onto the Citrix servers
  • The robot (Windows 2019 Server Core)

I installed Server Core 2019 and added it to the domain.

Configure Autologon

I configured group policy preferences to setup AutoLogon for my service account.  This group policy object is set to the OU the robots reside in.

Group Policy Preferences settings to configure Autologon for our service account.

However, I did not include the required “DefaultPassword” registry with the password.  In order to embed the password in a more secure fashion, I had to manually use Sysinternals AutoLogons.  This keeps the password from being stored in plain text in the registry but this does need to be manually executed on each robot.

Configuring Autologon

The account MUST be a regular user and not a member of the “Administrators” group.  This is a requirement of the ControlUp Logon Simulator.

Prerequisites gotcha’s

Because I selected to use pure Server Core, there are some components that require fixing for full compatibility.  This can actually be alleviated immediately by installing the Feature On Demand (FoD) “Server App Compatibility”, but this would increase both memory utilization and consume more disk space for our robot.  However, if you prefer the easy way out, adding the FoD fixes everything and you can skip the “Fixes” section.  Or just run the Logon Simulator on a operating system with the desktop experience.  Otherwise, follow the steps in each of the solutions.

Fixes

Unable to install Citrix Reciever/Workspace App

“TrolleyExpress.exe – System Error”
The code execution cannot proceed because oledlg.dll was not found.  Reinstalling the program may fix this problem.

Solution

Copy oledlg.dll from the SysWow64 in either the install.wim or another “Windows Server 2019 (Desktop Experience)” install and put it in the C:\Windows\SysWow64 folder of your robot.

 

ControlUp Logon Simlulator is cropped on smaller displays

Solution

Set the resolution larger in your VM.

 

ControlUp Logon Simulator errors when you attempt to save the configuration

“The logon simulator failed for the following reason: Creating an instance of the COM component with CLSID {C0B4E2F3-BA21-4773-8DBA-335EC946EB8B} from the IClassFactory failed due to the following error: 80040111 ClassFactory cannot supply requested class (Exception from HRESULT: 0x80040111 (CLASS_E_CLASSNOTAVAILABLE)).  An application event log has been written to handle this crash.”

Solution

Copy ExplorerFrame.dll from the SysWow64 in either the install.wim or another “Windows Server 2019 (Desktop Experience)” install and put it in the C:\Windows\SysWow64 folder of your robot.  Add the following registry:

ControlUp Logon Simulator detects admin rights

Admin rights detected The logon simulator should not be run as an administrator, please restart the app as a standard user.

Solution

Run the logon simulator as a standard user.


Configuration

Once you’ve implemented all the fixes, install Citrix Workspace App and ControlUp Logon Simulator with an account that is an administrator.

Configure ControlUpLogonSim.  With the simulator open, enter your Storefront details, ensuring to use the “Store” account as seen in the Storefront console.

 

 

For the “Resource to Launch” ensure the name matches the display name in Storefront:

 

In order to avoid session stealing in the simulator, each application will require a unique user name.  Setup a unique account per application you are going to test.

 

From here, enter your logon credentials for the account associated with the application.  Run your first test by clicking the green triangle and ensure it works correctly.

 

Now that we have a successful run we set “Repeat Test” to ON and save the configuration.

I then created another application to monitor by renaming the “Resource to Launch” as another application and saved a second configuration.  I saved all my files to a C:\Swinst folder.

 

The point of all of this is to ensure the simulator is running in an automated fashion.  To do so, we need to be able to configure the simulator to “launch” multiple different applications when the operating system starts.  We have already configured autologon, we’ve setup our configuration files for each application we want to monitor, now we need to set the monitor to auto-start.

Add the following registry key:

And create a file “C:\Swinst\StartAppMonitors.cmd” with the following contents:

And watch the magic fly!

And so the final question and the point of all this work, how much does this consume for our resources?

 

1.1GB of RAM for the ENTIRE system, a peak CPU consumption of 7%, and the processes required to do the monitor use no CPU and only ~55MB of RAM.  Each Citrix process consumes ~20MB of memory and is the most significant consumer of CPU but at the single digit % range.

I anticipate doing some more stress testing to determine what the maximum amount of monitors I can get on one system, but I’m thrilled with these results.  With this one box I would expect to be able to monitor dozens of application…  Maybe a hundred?

In the end, this was a fair bit of work to get this setup on Server Core, but I do believe the savings in resource consumption and overhead reduction will pay off.

Read More

Citrix Logon Simulator’s – Part 1

2019-02-04
/ /
in Blog
/

“Help!  I can’t launch my application!”
“Is something wrong with Citrix?”
“Hey man, I heard Citrix was down?”
“Can you help? I need to get this work done!  The deadline is today and I can’t open my app!”

Welcome to the world of a Citrix Administrator.  If an application stops working then the calls flood in and you get pinged a million times.  Is there a way to be proactive about when an application goes down so you maximize your time trying to fix the issue between the failure and first call?

The answer to this is a Citrix Logon Simulator.

There are a few different logon simulators out there including two powershell scripts I wrote for performance testing.  One for testing the “Web” service and one for the “Store” service.  The difference between the two services is found in the name, with “Web” typically appended to the web service (eg, “/Citrix/StoreWeb”) and the “Store” service ending thusly (eg, “/Citrix/Store”).

The “Web” service is the user-facing front end for Storefront.  When you open a browser and go to your Storefront URL you are using the web service.

“Web” service. User logs into Storefront and launches an app using a web browser

Each “Web” service has a corresponding “Store” service.  However, the “Store” service does not require a “Web” service and you can create Store services without a Web Service.  Store services are used when you configure Citrix Reciever/Workspace App to connect to Storefront.  When you launch apps via Citrix Receiver/Workspace App you are using the Store service.  In addition, when thin clients are configured to use Storefront they, typically, use the Store service.

Using the “Store” service. Not the program is “Citrix Workspace” and not a web browser.

I haven’t found a logon simulator that tests each service, the products out there only test one or the other.

In Summary

Web Service

Testing the web service simulates a users experience authenticating and launching an application via a web browser.  These simulators launch a web browser, browse to your URL, login, find your application and then click it to launch it.  This does make an impressive demo of automation watching the web browser do actions without a human.

Store Service

Testing the store service simulates a user authenticating and launching an application, via a Citrix client.  This is done through API’s or REST API calls, it’s up to the client to generate a GUI from the information returned (if desired).

Drawbacks

Web Service Simulators

Executing the logon simulators that use the web service is impressive watching the web browser manipulate itself.  However, the requirement of using a web browser was limiting in that executing multiple concurrent applications was not feasible.  Options seemed to range from providing a list of applications to be tested (which are tested sequentially) or expanding the number of VM’s that will run the logon simulator.  In addition, it’s possible to create “Store” services that host applications without the corresponding “Web” service. Simulators that only test the Web Service will be unable to do any type of testing for these Stores.

Single VM

If you opt for a single VM to test as many applications as possible, the time between applications increases linearly.  For 10 applications to be tested to validate they are operational, with a 2 minute interval you will only be testing that application every 20 minutes (at best).  If you have 100 applications you’ll be waiting over 3 hours between tests.

Multiple VM’s

Multiple VM’s operate very well with the web service.  But they introduce another wrinkle.  How feasible is it to expand your environment 100 VM’s to test if 100 applications are operational?  With a operating system overhead of ~1GB at best, you’ll be consuming at least 100vCPU and 100GB more of memory.  This is pretty much dedicating a host just for robots.  This may be worth it when compared to the cost of time when a application is down… But this is now becoming a very, very expensive solution.  Hardware costs for hosts, hypervisor licenses, and Windows licenses compound to the pain of a multi-VM solution.

Store Service Simulators

In terms of a demo, the store service is less visually impressive.  The automation is done programmatically so you don’t get to see those gratifying “clicks”.  However, there is no browser requirement.  This does mean you don’t need a full GUI operating system to run a store service logon simulator.  In addition, because the Store Service simulator operates via API calls, it’s completely possible to run multiple in parallel.  This means there is a very large opportunity to save VM costs by consolidating all the testing into one, or just a couple VM’s and for each test it will keep that tight interval.  However, the draw back of the Store Service simulator is additional configuration is required for testing through a Netscaler.  Essentially, if you have not setup the DNS SRV record to allow Store Service communication than a Store Service simulator will not work externally.

What’s the plan?

After exploring these considerations, I set out to design something with some goals.

  1. Minimize the number of VM’s to run the robots
  2. As little resource consumption as possible
  3. Still provide operational alerts
  4. Operate on-premise

My next post will explore how to achieve this and the solution I settled on.

Read More

ControlUp – Dissecting Logon Times a step further (invalid Home Directory impact)

2016-09-07
/ /
in Blog
/

Continuing on from my previous post, we were still having certain users with logons in the dozens of seconds to minutes.  I wanted to find out why and see if there is anything further that could be done.

60second_profile

 

After identifying a user with a long logon with ControlUp I ran the ‘Analyze Logon Duration’ script:

51-1second_profile

 

Jeez, 59.4 seconds to logon with 51.2 seconds of that spent on the User Profile portion.  What is going on?  I turned to process monitor to capture the logon process:

screen-shot-2016-09-07-at-8-24-18-pm

Well, there appears to be a 1 minute gap between the cmd.exe command from when WinLogon.exe starts it.  The stage it ‘freezes’ at is “Please wait for the user profile service”.

 

Since there is no data recorded by Process Monitor I tested by deleting the users profile.  It made no difference, still 60 seconds.  But, since I now know it’s not the user profile it must be something else.  Experience has taught me to blame the user object and look for network paths.  50 seconds or so just *feels* like a network timeout.  So I examined the users AD object:

screen-shot-2016-09-07-at-8-46-19-pm

 

Well well well, we have a path.  Is it valid?

screen-shot-2016-09-07-at-8-49-42-pm

 

 

It is not valid.  So is my suspicion correct?  I removed the Home Directory path and relaunched:

without_homedir_logon_time

Well that’s much, much better!

So now I want ControlUp to identify this potential issue.  Unfortunately, I don’t really see any events I can key in on that says ‘Attempting to map home drive’.  But what we can do is pull that AD attribute and test to see if it’s valid and let us know if it’s not.  This is the output I now generate:

new_script

 

I revised the messaging slightly as I’ve found the ‘Group Policy’ phase can be affected if GPP Drive Maps reference the home directory attribute as well.

 

So I took my previous script and updated it further.  This time with a check for valid home directories.  I also added some window sizing information to give greater width for the response as ‘Interim Delay’ was getting truncated when there were long printer names.  Here is the further updated script:

Read More