I joined Citrix Consulting three years ago, and one of my first projects was to deploy a moderately sized XenDesktop environment using Citrix Cloud and Azure for about 600 VDI desktops. I had not worked with either Citrix Cloud or Azure before, so the project was really interesting, and we ended up with a very pleased customer that was moving their entire infrastructure into Azure. What I did not understand then, because the size of the deployment was modest, was how different the architecture has to be if the design calls for scaling the environment to thousands of VDAs.

If your Azure deployment will scale beyond 2,500 VDAs, a multi-subscription architecture will be required, and you might want to scale out the number of subscriptions even before this point. This is because, in VDI, we do a significant amount of interaction with Azure Resource Manager (ARM), including reading the power state of machines, starting machines, de-allocating machines, and creating and deleting disks. Each of these tasks requires many API calls into Azure, and for a long time these API calls were the limiting factor for sizing an Azure deployment. Microsoft now allows these calls to be spread across many ARM servers, and I have been told they are no longer a bottleneck. However, VDI integration with Azure also makes API calls against a set of lower-level resource managers, including those for compute resources and storage resources. These integrations have now become the bottleneck for managing a VDI solution within a single subscription. To read more about resource limits in Azure, see this article: Throttling Resource Manager Requests.

Due to these limitations, Machine Creation Services (MCS) developers have built rate limits into MCS to safeguard customers. To stay under these limits, a single Azure subscription can support 2,500 desktops when integrated with Citrix Cloud and 1,000 desktops when integrated with an on-premises delivery controller.

Note: Prior to the middle of August 2021, the limit was 1,200 desktops per Azure subscription. The limit was changed for Citrix Cloud deployments because of enhanced integration methods between MCS and Azure.

So, if a design calls for more desktops than that, you need a multi-subscription architecture. The usual approach is to install control resources such as delivery controllers or cloud connectors, StoreFront servers, ADC gateways, Director servers, and licensing servers into a hub subscription, which hosts all central services and acts as the centralized location for network traffic. A "spoke subscription" is created for every 1,000 (or now 2,500) desktops, and each spoke is connected to the hub subscription using VNET peering.
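To make the sizing rule concrete, here is a minimal sketch of the arithmetic. The function name `Get-SpokeSubscriptionCount` is my own invention for illustration and is not part of the migration scripts; the defaults simply reuse the per-subscription limits quoted above (2,500 with Citrix Cloud, 1,000 with an on-premises delivery controller).

```powershell
# Hypothetical helper: how many spoke subscriptions a design needs.
function Get-SpokeSubscriptionCount {
    param(
        [Parameter(Mandatory)][int]$DesktopCount,
        # 2500 for Citrix Cloud, 1000 for an on-premises delivery controller
        [int]$DesktopsPerSubscription = 2500
    )
    if ($DesktopCount -le 0) { return 0 }
    # Round up: a partially filled subscription still counts as one.
    [int][math]::Ceiling([double]$DesktopCount / $DesktopsPerSubscription)
}

Get-SpokeSubscriptionCount -DesktopCount 6000                                  # Citrix Cloud
Get-SpokeSubscriptionCount -DesktopCount 6000 -DesktopsPerSubscription 1000    # on-premises
```

With 6,000 desktops this works out to three spokes for Citrix Cloud and six for an on-premises controller. You may still choose to add spokes earlier than the limit requires.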

One design that I worked on recently looks like this:

Three XenDesktop sites were built in three separate Azure regions. Each region had a hub subscription and one to three spoke subscriptions. VNET peering was configured between each spoke subscription and the hub subscription in the same Azure region. This architecture is extensible by adding a new subscription for every additional 1,000 desktops. Each spoke subscription was given a single /22 IP subnet to make IP management easy. A /22 contains 1,024 addresses; Azure reserves five of them in every subnet, which leaves 1,019 for desktops. Each subscription has a separate Citrix Virtual Apps and Desktops host connection. If you are using Citrix Cloud, the host connection settings should be modified as described in this great blog post by Katie Gould.
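When sizing the spoke subnets, keep in mind that Azure reserves five addresses in every subnet (the first four and the last), so the usable count in a /22 is slightly below the classic figure of 1,022. The helper below is a small sketch of that calculation; the function name is hypothetical.

```powershell
# Hypothetical helper: usable IP addresses in an Azure subnet of a given prefix length.
function Get-AzureUsableAddressCount {
    param(
        # Azure subnets can be no smaller than /29.
        [Parameter(Mandatory)][ValidateRange(8, 29)][int]$PrefixLength
    )
    $total = [math]::Pow(2, 32 - $PrefixLength)
    # Azure reserves the first four addresses and the last address of each subnet.
    [long]($total - 5)
}

Get-AzureUsableAddressCount -PrefixLength 22
```

For a /22 this returns 1,019, which still covers a 1,000-desktop spoke with a little headroom.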

So now you know that a multi-subscription model is required for scaling VDI on Azure. But what do you do if you have already built an environment with thousands of desktops in a single Azure subscription?

If most of your desktops are pooled non-persistent, the solution is fairly easy. Set up the new subscriptions, add the networks, and configure VNET peering between the hub and spoke subscriptions. Create the new host connections and deploy new desktops using the new architecture. Using cloud resources makes this pretty easy for such a big change.

But what if you have already deployed many thousands of persistent desktops to users? That is a very different task, because your users will not be happy if you just give them a new desktop in a new subscription. They will lose all their installed applications and whatever configuration they have done on their desktops.

I have worked with several customers lately that are up against this issue. I helped by creating a migration script to move the persistent desktops from one Azure subscription to another. The basic flow of the script is as follows:

  1. Get the old catalog and new catalog.
  2. Get the old Hosting Connection and new Hosting Connection.
  3. Get a list of assigned user(s).
  4. Copy the OSDisk to the new Azure Resource Group in the new Azure subscription.
  5. Create a new Azure VM with a new NIC in the new network using the copied OS disk.
  6. Delete the machine from the old Delivery Group.
  7. Delete the machine from the old Catalog.
  8. Add the machine to the new Catalog.
  9. Add the machine to the new Delivery Group.
  10. Assign the user(s) to the machine.
  11. If enabled, delete the old machine from Azure:
    1. VM
    2. NIC
    3. Disks
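For readers who want to see what steps 4 and 5 can look like in Az PowerShell, here is a minimal sketch. It is not an excerpt from the migration scripts. It assumes the Az.Accounts, Az.Compute, and Az.Network modules are installed, that Connect-AzAccount has already been run with an identity that can read the source disk and write to the target resource group, that both subscriptions are in the same tenant and region, and that the source VM is deallocated. The function name, the `-osdisk` and `-nic` naming, and the `default` subnet name are all assumptions for illustration. The Citrix broker steps (6 to 10) are left out.

```powershell
# Sketch only: copy a desktop's OS disk to another subscription and build a VM from it.
function Copy-DesktopToSubscription {
    param(
        [Parameter(Mandatory)][string]$VMName,
        [Parameter(Mandatory)][string]$SourceSubscriptionId,
        [Parameter(Mandatory)][string]$SourceResourceGroupName,
        [Parameter(Mandatory)][string]$TargetSubscriptionId,
        [Parameter(Mandatory)][string]$TargetResourceGroupName,
        [Parameter(Mandatory)][string]$TargetVirtualNetworkName,
        [Parameter(Mandatory)][string]$TargetNetworkResourceGroupName,
        [Parameter(Mandatory)][string]$Location,
        [string]$SubnetName = 'default'   # hypothetical subnet name
    )

    # Read the source VM and its OS disk from the old subscription.
    Set-AzContext -Subscription $SourceSubscriptionId | Out-Null
    $sourceVm   = Get-AzVM -ResourceGroupName $SourceResourceGroupName -Name $VMName
    $sourceDisk = Get-AzDisk -ResourceGroupName $SourceResourceGroupName `
                             -DiskName $sourceVm.StorageProfile.OsDisk.Name

    # Step 4: copy the OS disk into the new resource group in the new subscription.
    Set-AzContext -Subscription $TargetSubscriptionId | Out-Null
    $diskConfig = New-AzDiskConfig -SourceResourceId $sourceDisk.Id -Location $Location `
                                   -CreateOption Copy -SkuName $sourceDisk.Sku.Name
    $newDisk    = New-AzDisk -Disk $diskConfig -DiskName "$VMName-osdisk" `
                             -ResourceGroupName $TargetResourceGroupName

    # Step 5: create a new NIC in the target network and attach the copied disk to a new VM.
    $vnet   = Get-AzVirtualNetwork -Name $TargetVirtualNetworkName `
                                   -ResourceGroupName $TargetNetworkResourceGroupName
    $subnet = $vnet.Subnets | Where-Object { $_.Name -eq $SubnetName }
    $nic    = New-AzNetworkInterface -Name "$VMName-nic" -ResourceGroupName $TargetResourceGroupName `
                                     -Location $Location -SubnetId $subnet.Id

    $vmConfig = New-AzVMConfig -VMName $VMName -VMSize $sourceVm.HardwareProfile.VmSize
    $vmConfig = Set-AzVMOSDisk -VM $vmConfig -ManagedDiskId $newDisk.Id -CreateOption Attach -Windows
    $vmConfig = Add-AzVMNetworkInterface -VM $vmConfig -Id $nic.Id
    New-AzVM -VM $vmConfig -ResourceGroupName $TargetResourceGroupName -Location $Location `
             -DisableBginfoExtension
}
```

Because the OS disk is attached rather than reimaged, the desktop keeps its computer name, domain membership, and installed applications, which is the whole point of migrating rather than redeploying.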

The script requires that the new Catalog and Delivery Group are pre-created. The catalog will not be an MCS-integrated catalog. It will be a power-managed static machine catalog because we need to be able to add desktops to it manually. The main drawback of this type of catalog compared to those integrated with MCS is that when you delete a desktop from the catalog, it will not be deleted from Azure. So, you will need to develop a separate process to handle that requirement. But from a day-to-day perspective, the power-managed static catalog allows administrators to start, stop, restart, and shut down desktops and to place them in maintenance mode.

When creating the catalog, you will have to add one VM to the catalog. Normally, I just pick any VM I know of in Active Directory that is not a Citrix desktop. You can add it, then just remove it after the catalog has been created.

The script is very configurable. It can be used with on-premises Citrix Virtual Apps and Desktops or Citrix Cloud. It can be run from a CSV file to feed the desktops to migrate or just pointed at a catalog. You can define a time limit so that the script will only run for, say, eight hours, and you can limit the number of migrations that will run in one batch. You can skip desktops with active sessions or disconnected sessions. You can delete successfully migrated desktops from the source Azure Resource Group or leave them in place until you are sure the user is fine on their new desktop.
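The time limit and batch limit amount to a check made before each desktop is started. The sketch below shows one way such a gate could be written. It is an illustration rather than the script's actual code: the function name is hypothetical, while the parameter names mirror the `$UseTimeLimit`, `$TimeLimit`, and `$MaxRecordCount` settings listed later in this post.

```powershell
# Hypothetical gate: should the batch start another migration?
function Test-MigrationShouldContinue {
    param(
        [Parameter(Mandatory)][datetime]$StartTime,
        [Parameter(Mandatory)][int]$ProcessedCount,
        [bool]$UseTimeLimit = $true,
        [double]$TimeLimit = 8,          # hours
        [int]$MaxRecordCount = 100,
        [datetime]$Now = (Get-Date)      # injectable so the logic can be tested
    )
    # Stop once the batch size has been reached.
    if ($ProcessedCount -ge $MaxRecordCount) { return $false }
    # Stop once the time window has elapsed, if a window is enforced.
    if ($UseTimeLimit -and (($Now - $StartTime).TotalHours -ge $TimeLimit)) { return $false }
    return $true
}

$start = Get-Date
Test-MigrationShouldContinue -StartTime $start -ProcessedCount 0
```

Checking before each desktop, rather than interrupting a migration in flight, means a desktop is never left half-moved when the window closes.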

There are also two versions of the script. Initially I wrote a single-threaded script (MigrateStatic.ps1) that migrates one desktop at a time. This is the easier one to troubleshoot because you can set breakpoints at each stage and watch the migration as it runs. But this script will not scale unless you run several instances of it simultaneously.

I then created a version that uses two scripts. The first script (MigrateStaticJobs.ps1) acts as a job manager that runs a configurable number of instances of the second (MigrateScript.ps1) in parallel. In my testing, if I run more than five simultaneous jobs, Azure has trouble completing the tasks and the script becomes unreliable. I recommend that you perform some migrations with test desktops using different numbers of parallel migrations to be sure the script will work reliably at whatever level of parallelism you choose.
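To illustrate the job-manager idea, here is a minimal sketch of a throttled loop built on PowerShell background jobs. It is not the code from MigrateStaticJobs.ps1: the function name is hypothetical, and the work is passed in as a script block so the example can run without Azure or Citrix. A real job manager would more likely start MigrateScript.ps1 with `Start-Job -FilePath`.

```powershell
# Sketch: run one job per machine, never more than $SimultaneousJobs at once.
function Invoke-ThrottledJobs {
    param(
        [Parameter(Mandatory)][string[]]$MachineNames,
        [Parameter(Mandatory)][scriptblock]$Work,
        [int]$SimultaneousJobs = 5
    )
    $results = @()
    $running = @()
    foreach ($name in $MachineNames) {
        # Block until a slot is free.
        while ($running.Count -ge $SimultaneousJobs) {
            $done     = @(Wait-Job -Job $running -Any)
            $results += Receive-Job -Job $done
            Remove-Job -Job $done
            $doneIds  = $done.Id
            $running  = @($running | Where-Object { $doneIds -notcontains $_.Id })
        }
        $running += Start-Job -ScriptBlock $Work -ArgumentList $name
    }
    # Drain whatever is still running.
    if ($running.Count -gt 0) {
        $results += $running | Wait-Job | Receive-Job
        $running | Remove-Job
    }
    $results
}

# Usage with a stand-in for the real migration work:
Invoke-ThrottledJobs -MachineNames 'VDI-001', 'VDI-002', 'VDI-003' -SimultaneousJobs 2 `
                     -Work { param($n) "migrated $n" }
```

Each background job runs in its own process, so a failure in one migration does not take down the others. The order in which results come back is not guaranteed.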

During a customer deployment, we had to change the initial script to stop the BGInfo extension from being enabled on the newly created virtual machines. It turns out that Azure automatically enables the BGInfo extension on newly created virtual machines. If you don't have the BGInfo extension installed in your images, the create VM command takes about 40 minutes to run and then times out. Therefore, we included -DisableBginfoExtension in the New-AzVM command. If you need BGInfo enabled, remove that parameter from the migration scripts.

New-AzVM -VM $VirtualMachine -ResourceGroupName $targetResourceGroupName -Location $Location -DisableBginfoExtension


Script Legal Disclaimer

The following legal disclaimer must be understood before using the scripts I have created. Citrix Tech Support will not provide support for this script, and I might not be able to respond to inquiries in a timely fashion.

This software / sample code is provided to you “AS IS” with no representations, warranties or conditions of any kind. You may use, modify and distribute it at your own risk. CITRIX DISCLAIMS ALL WARRANTIES WHATSOEVER, EXPRESS, IMPLIED, WRITTEN, ORAL OR STATUTORY, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NONINFRINGEMENT. Without limiting the generality of the foregoing, you acknowledge and agree that (a) the software / sample code may exhibit errors, design flaws or other problems, possibly resulting in loss of data or damage to property; (b) it may not be possible to make the software / sample code fully functional; and (c) Citrix may, without notice or liability to you, cease to make available the current version and/or any future versions of the software / sample code. In no event should the software / code be used to support of ultra-hazardous activities, including but not limited to life support or blasting activities. NEITHER CITRIX NOR ITS AFFILIATES OR AGENTS WILL BE LIABLE, UNDER BREACH OF CONTRACT OR ANY OTHER THEORY OF LIABILITY, FOR ANY DAMAGES WHATSOEVER ARISING FROM USE OF THE SOFTWARE / SAMPLE CODE, INCLUDING WITHOUT LIMITATION DIRECT, SPECIAL, INCIDENTAL, PUNITIVE, CONSEQUENTIAL OR OTHER DAMAGES, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. Although the copyright in the software / code belongs to Citrix, any distribution of the code should include only your own standard copyright attribution, and not that of Citrix. You agree to indemnify and defend Citrix against any and all claims arising from your use, modification or distribution of the code.


Preparing the Environment

Before you can use the script to migrate desktops, the following prerequisites need to be taken care of:

  1. Create new subscriptions.
  2. Create new VNETs.
  3. Configure VNET Peering relationships.
  4. Create new Hosting Connections.
  5. Create new Catalogs.
  6. Create new Delivery Groups.
  7. Prepare a machine to run the scripts on. I used a Windows 10 Desktop.
  8. If using Citrix Cloud, you can optionally save the credentials to make the script easier to run.

The service principal used to connect Citrix Virtual Apps and Desktops to Azure should be dedicated to MCS. That gives you the best API rate performance available, because some limits are enforced per service principal.

Installing the Scripts

Installing the scripts is easy. Just follow these steps:

  1. First decide what machine will be used. If it does not have the matching Broker SDK, install it. For Citrix Cloud, install the Remote PowerShell SDK; for an on-premises implementation, install the Studio console, which will install the Broker Admin SDK.
  2. Create a folder.
  3. Download (here) and unblock the scripts.
  4. Configure the scripts for your environment.

There are two separate scripts:

  • MigrateStatic.ps1: This can be used to migrate one desktop at a time. This one is much easier to troubleshoot.
  • MigrateStaticJobs.ps1: This script is a master script that calls another script to migrate desktops in parallel using a job manager. The script that is called is MigrateScript.ps1.

Most organizations will use the MigrateStaticJobs script so that several migrations can be performed in parallel. In my testing, the script works reliably with up to five jobs running in parallel. If more than five migrations at a time are required, first set up test desktops to determine whether more can be run in parallel in your environment.

Script Settings

The following settings should be modified to match the environment and requirements. Most of the settings should be self-explanatory.

#### Global Settings ####################################################

$UsingCitrixCloud = $true

#Set these to $true to skip desktops that have an active or disconnected session.  Set to $false to migrate them even if there is a session.
$SkipActiveSessions = $true
$SkipDisconnectedSessions = $false
$DeleteOldMachineOnSuccess = $true

#Set this to $true to limit the number of hours the script will run.
$UseTimeLimit = $true
#Enter processing time limit in hours
$TimeLimit = 10

#For On-Premises deployments enter a DDC to run Citrix SDK commands against.
#note $adminAddress is ignored when using the cloud sdk
$adminAddress = "ddc1.zylowski.com"

#From Information
$SourceCatName = "AzureStaticDesktop"
$sourceDTGroup = "AzureStaticDesktop"
$sourceConn = "AzureMCSTest"

#Source Azure Info
$sourceResourceGroupName = 'rz_mcsmanaged'
$sourceSubscriptionId = '0967Y12-914b-434d-8769-f117e62574f0'

#Set this to true to capture desktops from the source catalog.  If this is $false then the CSV must be pre-populated
$UseMachineCatalogAsSource = $true
#Enter the limit of number of desktops to capture from the catalog.  This is a good way to limit the number of possible migrations
$MaxRecordCount = 100
$csvFileName = "MigrateStatic.csv"

#To Information
$targetCatName = "AzureStaticNew"
$targetDTGroup = "AzureStaticNew"
$targetConn = "AzureBennet"

#Target Azure Info
$targetResourceGroupName = 'rz_MCSManaged2'
$targetSubscriptionId = '552a04d0-1237-4992-b407-255k33a1dbf'
$targetVirtualNetworkName = 'RZResourceGroup-vnet'
$targetNetworkResourceGroupName = 'RZResourceGroup'
$Location = "eastus"
#Location Listing – Use first column entry
#This list can be updated using get-AzLocation
#eastasia           East Asia
#southeastasia      Southeast Asia
#centralus          Central US
#eastus             East US
#eastus2            East US 2
#westus             West US
#northcentralus     North Central US
#southcentralus     South Central US
#northeurope        North Europe
#westeurope         West Europe
#japanwest          Japan West
#japaneast          Japan East
#brazilsouth        Brazil South
#australiaeast      Australia East
#australiasoutheast Australia Southeast
#southindia         South India
#centralindia       Central India
#westindia          West India
#canadacentral      Canada Central
#canadaeast         Canada East
#uksouth            UK South
#ukwest             UK West
#westcentralus      West Central US
#westus2            West US 2
#koreacentral       Korea Central
#koreasouth         Korea South
#francecentral      France Central
#francesouth        France South
#australiacentral   Australia Central
#australiacentral2  Australia Central 2
#uaecentral         UAE Central
#uaenorth           UAE North
#southafricanorth   South Africa North
#southafricawest    South Africa West
#switzerlandnorth   Switzerland North
#switzerlandwest    Switzerland West
#germanynorth       Germany North
#germanywestcentral Germany West Central
#norwaywest         Norway West
#norwayeast         Norway East
#brazilsoutheast    Brazil Southeast
#Enter the number of migrations to run in parallel
#Be cautious I am not sure where this will have issues with API limits
#I have only tested up to 5 reliably
$SimultaneousJobs = 5
#The number of times Azure commands will be retried if they do not succeed the first time.
$AzureCommandRetries = 5

#############################################################################
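The `$AzureCommandRetries` setting implies a retry wrapper around each Azure call. The sketch below shows one common way to write such a wrapper. It is an assumption about the pattern, not the script's own implementation, and the function name and the fixed delay are mine.

```powershell
# Sketch: run a command, retrying up to $Retries more times if it throws.
function Invoke-WithRetry {
    param(
        [Parameter(Mandatory)][scriptblock]$Command,
        [int]$Retries = 5,          # retries after the first attempt
        [int]$DelaySeconds = 10     # pause between attempts
    )
    for ($attempt = 0; $attempt -le $Retries; $attempt++) {
        try {
            return & $Command
        }
        catch {
            # Out of retries: rethrow the last error to the caller.
            if ($attempt -eq $Retries) { throw }
            Write-Warning "Attempt $($attempt + 1) failed: $($_.Exception.Message)"
            Start-Sleep -Seconds $DelaySeconds
        }
    }
}

# Usage: cmdlets need -ErrorAction Stop so that failures are thrown and can be caught.
# Invoke-WithRetry -Retries 5 -Command { Get-AzVM -ResourceGroupName 'rz_mcsmanaged' -ErrorAction Stop }
```

A short pause between attempts gives Azure throttling a chance to clear before the call is repeated.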

Authentication for the Script

When using an on-premises delivery controller, the permissions of the logged-on user are used when connecting to the DDC to run commands.

When the script is run against Citrix Cloud, the SDK allows for saving API credentials so that they can be used for every run without having to log on again at the beginning of each script session. To save the credentials, first obtain an API Key in Citrix Cloud by going to Identity and Access Management → API Access → Secure Clients.

Take note of your CustomerID. In the box, add a name for the computer that will run the scripts, then click Create Client. You will be presented with an ID and Secret to use in your scripting, as shown below.

In an administrative PowerShell session, run a command like the following using your CustomerID, API ID, and API Secret. This will save an authentication profile in the logged-on user's Windows profile that can then be used by the script when it is run.

Set-XDCredentials -CustomerId "drilt561er20" -APIKey "9c432d88-66f7-232f-a4gg-cv2d99cf5194" -ProfileType CloudAPI -StoreAs "MigrateStatic" -SecretKey "wBQCC0eRTXbqIkhKPcT4lQ=="

The script then uses the following command to retrieve the credentials. Please note, this line must be uncommented in the script to be used. The default is to ask for credentials every time. Check lines 112 to 123 in the script.

Get-XDAuthentication -ProfileName "MigrateStatic"

For Azure, the script uses Connect-AzAccount during every run.

Logging

The last thing to go over is the script's logging. The script will create a log folder under the script folder and write a set of log files to it during every run. One is a normal log for the managing script, and one is a PowerShell transcript. Every migrated desktop will also have its own log. The transcript will capture all errors experienced during the script run, but the standard log will be easier to read, as shown below.

The log shows the migration process and at the end has a summary. When using the MigrateStaticJobs script it will create an overall log file, as well as one for every desktop being migrated.

The summary information is as follows:

  • Desktops Migrated
  • Desktops with Failed Migrations
  • Desktops Skipped Due to Time Limit
  • Desktops Skipped Due To Some Problem
  • Desktops Skipped In Use
  • Desktops Skipped Already Migrated
  • Old Desktops Delete Failed
  • Old Desktops Delete Succeeded
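A summary like this can be produced by tallying one result record per desktop. The following is a small sketch of that idea. The function name and the `Status` property are hypothetical and are not taken from the script.

```powershell
# Sketch: count per-desktop results by status.
function Get-MigrationSummary {
    param([Parameter(Mandatory)][object[]]$Results)
    $summary = [ordered]@{}
    foreach ($group in ($Results | Group-Object -Property Status)) {
        $summary[$group.Name] = $group.Count
    }
    $summary
}

# Usage with made-up records:
$results = @(
    [pscustomobject]@{ Machine = 'VDI-001'; Status = 'Migrated' }
    [pscustomobject]@{ Machine = 'VDI-002'; Status = 'Migrated' }
    [pscustomobject]@{ Machine = 'VDI-003'; Status = 'Skipped In Use' }
)
Get-MigrationSummary -Results $results
```

Keeping one record per desktop also makes it easy to feed the failed or skipped machines back into a CSV for the next run.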

I hope this post gives you the information you need on what's required to scale large, Azure-based VDI deployments. As with most projects that require scale, getting the architecture right up front will be critical to the success of the project. With Azure, the multi-subscription hub-and-spoke model provides the simplest architecture for a scalable VDI infrastructure.