Kubernetes has built-in support for scaling. The Horizontal Pod Autoscaler (HPA) is a Kubernetes resource that scales deployments, replica sets, replication controllers, and other scalable workloads. Autoscaling not only makes an application more reliable by improving its availability and fault tolerance but also helps with cost management. Scaling decisions are based on metrics of the target resource. By default, HPA obtains these metrics from the metrics server and decides whether to scale the application up or down.

HPA periodically collects metrics from the metrics server. It then calculates the mean value, compares it with the user-specified target value, and takes an appropriate action (or no action).
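The calculation can be sketched in a few lines of shell. The replica formula, ceil(currentReplicas * mean / target), and the 10 percent tolerance are the ones described in the Kubernetes HPA documentation. The function name and the sample readings are illustrative assumptions, and the sketch leaves out the minReplicas/maxReplicas bounds and stabilization windows that a real HPA also applies.

```shell
# hpa_desired_replicas CURRENT_REPLICAS TARGET VALUE...
# VALUE... are per-pod metric readings (hypothetical sample data).
hpa_desired_replicas() {
  local replicas="$1" target="$2"
  shift 2
  printf '%s\n' "$@" | awk -v replicas="$replicas" -v target="$target" -v tol=0.1 '
    { sum += $1; n++ }
    END {
      mean  = sum / n          # average across pods
      ratio = mean / target    # how far the average is from the target
      # Within tolerance of the target: keep the current replica count.
      if (ratio >= 1 - tol && ratio <= 1 + tol) { print replicas; exit }
      want = replicas * ratio
      up = int(want); if (want > up) up++   # ceiling for positive values
      print up
    }'
}

hpa_desired_replicas 1 20 30      # one pod at 30, target 20 -> prints 2
hpa_desired_replicas 2 20 15 15   # two pods at 15 each      -> prints 2
```

The second call shows why the system settles: once the average is below the target, the same formula no longer asks for more replicas.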

Citrix ADC CPX and Custom Metrics

By default, the metrics server only provides CPU and memory metrics of the pod(s). There’s a nonlinear correlation between the CPU/memory metrics and the throughput of the container.

What does that mean?

Say a container serves 100 requests per second (RPS) at 10 percent CPU usage and 200 RPS at 20 percent CPU usage. That doesn’t mean the same container will serve 300 RPS at 30 percent CPU usage. It may need around 35 to 40 percent CPU to serve 300 RPS. In such situations, custom metrics play a pivotal role in making a more accurate scaling decision.

In addition to requests per second, another useful metric is application latency. Similar to the requests per second metric, application latency doesn’t have a linear correlation to the CPU/memory usage. In situations where the latency of an application increases, a custom metric like application latency proves to be a precise metric to autoscale the application.

Users generally have a better idea of the most appropriate metric for scaling their application. For example, a user might want to scale based on bandwidth usage of the service or need to scale a database service based on the number of queries. This warrants scaling based on custom metrics.

Citrix ADC is a battle-tested proxy with a rich set of metrics, making it the right choice of proxy to deploy along with applications. This blog post covers how you can set up a custom metric server that will enable users to expose any metric provided by Citrix ADC CPX and autoscale with the help of HPA. For this post, requests per second (RPS) will be used as an example custom metric for HPA.

This example can also be found in the Citrix Ingress Controller documentation.

Deployment

Figure 1. Visual representation of CPX autoscaling with custom metrics from Prometheus-adapter

Figure 1 is a visual representation of how HPA works in this setup. This is a two-tier model of Citrix ADCs. The Tier 1 VPX/MPX load balances the CPX deployment, and the CPX pods in Tier 2, in turn, load balance the application. A Prometheus server, a Prometheus Adapter, and an HPA controller for the CPX deployment are also deployed.

The HPA controller keeps polling the Prometheus Adapter for custom metrics such as HTTP request rate or bandwidth. Whenever the limit defined by the user in the HPA is exceeded, it scales the CPX deployment and creates another CPX pod to handle the additional load.

Let’s look at the components from the figure above.

  • Citrix ADC VPX/MPX — Citrix ADC VPX/MPX, in Tier 1, load balances client requests among various CPX pods in the cluster.
  • Citrix ADC CPX — Citrix ADC CPX acts as a Tier 2 load balancer for microservice apps. This CPX pod runs with Citrix Ingress Controller and Citrix ADC Metrics Exporter as sidecars.
  • Citrix Ingress Controller — Citrix Ingress Controller (CIC) is an ingress controller built around the Kubernetes ingress resource. It programmatically configures Citrix ADC using NITRO API based on the ingress resource configuration. There are two types of CICs in Figure 1: One is used for configuring the VPX, and the other for configuring the CPX where it’s running as a sidecar container.
  • Citrix ADC Metrics Exporter — This Exporter is a sidecar container that exposes the CPX’s metrics. Exporter collects metrics from the CPX and exposes them in a format that Prometheus can understand.
  • Prometheus — A CNCF graduated project, Prometheus is used to pull metrics from the exporter and expose them to the HPA Controller via Prometheus Adapter. HPA uses this data for autoscaling.
  • Prometheus Adapter — Prometheus Adapter contains an implementation of the Kubernetes resource metrics API and custom metrics API. You can use this adapter with the Horizontal Pod Autoscaler v2. On a K8s cluster that is already running Prometheus and collecting the appropriate metrics, it can also replace the traditional metrics server.

Put HPA Into Action

Step 1: Clone repo and change directory

git clone https://github.com/citrix/citrix-k8s-ingress-controller.git

cd citrix-k8s-ingress-controller/example/hpa-demo/

Step 2: Set values for VPX

VPX_IP="VPX_IP"
CPX_IMAGE="quay.io/citrix/citrix-k8s-cpx-ingress:13.0-47.103"
CIC_IMAGE="quay.io/citrix/citrix-k8s-ingress-controller:1.6.1"
EXPORTER="quay.io/citrix/citrix-adc-metrics-exporter:1.4.0"
VPX_PASSWORD="VPX_PASSWORD"
VPX_VIP="VIRTUAL_IP_VPX"

Open values.sh in the current directory and replace the placeholders with the appropriate values. VPX_PASSWORD should be the password of the nsroot user. VIRTUAL_IP_VPX is the IP on which the guestbook application (the sample application used for the demo) will be accessed.
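A quick check can catch a placeholder that was left unedited. The helper below is hypothetical and not part of the repository; it assumes values.sh keeps the placeholder strings VPX_IP, VPX_PASSWORD, and VIRTUAL_IP_VPX as quoted values until you replace them.

```shell
# check_values FILE: fail if a template placeholder is still assigned as a value.
# Hypothetical helper; the placeholder names are taken from the listing in Step 2.
check_values() {
  if grep -Eq '="(VPX_IP|VPX_PASSWORD|VIRTUAL_IP_VPX)"' "$1"; then
    echo "placeholders still present in $1" >&2
    return 1
  fi
  echo "$1 looks filled in"
}

# Usage, from the hpa-demo directory:
#   check_values values.sh
```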

Step 3: Configure custom metric

HPA can be configured with one or more of the metrics exposed by the Citrix ADC Metrics Exporter. In this example, RPS is used as the custom metric. RPS is represented by the counter name citrixadc_http_requests_rate in Citrix ADC. This metric name needs to be set in the two files below:

hpa.yaml

metrics:
  - type: Pods
    pods:
      metricName: citrixadc_http_requests_rate
      targetAverageValue: 20
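The metricName and targetAverageValue fields above belong to the older autoscaling/v2beta1 API, which Kubernetes stopped serving in version 1.25. On a newer cluster, the same intent would be written against autoscaling/v2, roughly as sketched below. Only the HPA name (cpx-builtin), the metric name, and the target of 20 come from the demo; the Deployment name and the replica bounds are assumptions to adjust.

```shell
# Prints a HorizontalPodAutoscaler manifest in the autoscaling/v2 schema.
# ASSUMPTIONS: scaleTargetRef.name, minReplicas, and maxReplicas are placeholders.
render_hpa_v2() {
  cat <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: cpx-builtin
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cpx-builtin        # assumed Deployment name
  minReplicas: 1             # assumed lower bound
  maxReplicas: 4             # assumed upper bound
  metrics:
    - type: Pods
      pods:
        metric:
          name: citrixadc_http_requests_rate
        target:
          type: AverageValue
          averageValue: "20"
EOF
}

render_hpa_v2
```

With cluster access, the output could be applied with `render_hpa_v2 | kubectl apply -f -`.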

values.yaml (under the rules section)

rules:
  default: false
  custom: 
    - seriesQuery: '{__name__= "citrixadc_http_requests_rate"}'
      seriesFilters: []
      resources:
        overrides:
          k8s_namespace:
            resource: namespace
          k8s_pod_name:
            resource: pod
      name:
        matches: "citrixadc_http_requests_rate"
        as: ""
      metricsQuery: <<.Series>>{<<.LabelMatchers>>,container_name!="POD"}
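To see what the metricsQuery template turns into, the sketch below mimics the substitution by hand. It assumes the adapter replaces <<.Series>> with the metric name and <<.LabelMatchers>> with matchers built from the overridden labels (k8s_namespace and k8s_pod_name); the exact order and form of the matchers the adapter generates may differ, and the namespace and pod names are made up.

```shell
# render_query NAMESPACE POD...
# Approximates the PromQL the rule above would produce for a set of pods.
render_query() {
  local ns="$1" pods
  shift
  # Join pod names with "|" to form a regex alternation, then drop the trailing "|".
  pods=$(printf '%s|' "$@")
  pods=${pods%|}
  printf '%s{%s,container_name!="POD"}\n' \
    'citrixadc_http_requests_rate' \
    "k8s_namespace=\"$ns\",k8s_pod_name=~\"$pods\""
}

render_query default cpx-1 cpx-2
# citrixadc_http_requests_rate{k8s_namespace="default",k8s_pod_name=~"cpx-1|cpx-2",container_name!="POD"}
```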

Step 4: Create all the resources

Create all the required Kubernetes resources by running the create_all.sh script. HPA has been configured to scale the CPX deployment whenever the average RPS per CPX pod goes above 20. In other words, each CPX pod is expected to handle at most 20 RPS.

./create_all.sh
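Once the resources are up, one way to confirm that the Prometheus Adapter is serving the metric is to query the custom metrics API directly. The sketch below only builds the request path; the default namespace and the v1beta1 API version are assumptions about this deployment, and the function name is hypothetical.

```shell
# custom_metric_path NAMESPACE METRIC
# Builds the custom metrics API path for a per-pod metric across all pods.
custom_metric_path() {
  printf '/apis/custom.metrics.k8s.io/v1beta1/namespaces/%s/pods/*/%s\n' "$1" "$2"
}

custom_metric_path default citrixadc_http_requests_rate
# /apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/citrixadc_http_requests_rate

# With cluster access:
#   kubectl get --raw "$(custom_metric_path default citrixadc_http_requests_rate)"
```

If the adapter rule is working, the response should list one value per CPX pod; an empty list usually means Prometheus has not scraped the exporter yet or the rule did not match.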

Step 5: Add DNS entry in the hosts file

The domain name (hostname) needs to be added to the hosts file so that requests for http://www.guestbook.com can be resolved. This hostname should be mapped to the VIP on the Tier 1 ADC (VPX/MPX). On most Unix-like systems, the hosts file is /etc/hosts.
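As a small sketch, the entry can be generated and appended from the shell. The address 203.0.113.10 is a placeholder for your own VIP, and the function name is hypothetical.

```shell
# hosts_entry IP HOSTNAME: print one hosts-file line (IP, tab, hostname).
hosts_entry() {
  printf '%s\t%s\n' "$1" "$2"
}

hosts_entry 203.0.113.10 www.guestbook.com

# To append it for real (needs root), using the VIP set in Step 2:
#   hosts_entry "$VPX_VIP" www.guestbook.com | sudo tee -a /etc/hosts
```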

Step 6: Send traffic and see the CPX deployment autoscale

There are two shell scripts in the folder: one for sending traffic below the threshold (16_curl.sh) and one for sending traffic above the threshold (30_curl.sh). The effect of both can be visualized in a Grafana dashboard, as shown in Figure 2 and Figure 6.

Run traffic within limit

Run the 16_curl.sh script to send 16 HTTP requests per second to the CPX.
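The contents of 16_curl.sh are not reproduced here, but a fixed-rate sender can be sketched as follows. This is an illustration under assumptions, not the repository's script: the function name is hypothetical, and launching a batch of requests and then sleeping for one second only approximates the requested rate.

```shell
# send_rps RATE SECONDS URL
# Fires RATE background requests, waits one second, and repeats SECONDS times.
# CURL can be overridden (for example in tests); it defaults to curl.
send_rps() {
  local rate="$1" seconds="$2" url="$3" s=0 i
  while [ "$s" -lt "$seconds" ]; do
    i=0
    while [ "$i" -lt "$rate" ]; do
      # -s hides progress output, -o /dev/null discards the response body
      ${CURL:-curl} -s -o /dev/null "$url" &
      i=$((i + 1))
    done
    sleep 1
    s=$((s + 1))
  done
  wait   # let the last batch finish
}

# Usage: about 16 requests per second for one minute
#   send_rps 16 60 http://www.guestbook.com/
```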

Running kubectl get hpa cpx-builtin shows the state of the HPA, as in Figures 3, 4, and 5.

Figure 2. Grafana dashboard when 16 HTTP requests are sent per second.
Figure 3. HPA state with 16 RPS (requests per second)

Run traffic above limit

Now run the 30_curl.sh script to send 30 requests per second to the CPX. Once the threshold of 20 RPS is crossed, the CPX deployment is autoscaled from one pod to two pods. The average value of the HTTP request rate metric then goes down from 30 to 15 (see Figure 5).
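The numbers in this step can be reproduced with integer arithmetic. The sketch below assumes the load is spread evenly across pods and ignores the HPA tolerance and replica bounds; the function name is hypothetical.

```shell
# settle TOTAL_RPS TARGET
# Repeats the scaling step until the replica count stops changing,
# printing the replica count and the per-pod average at each step.
settle() {
  local total="$1" target="$2" replicas=1 next
  while :; do
    echo "replicas=$replicas avg_rps=$((total / replicas))"
    # With an even spread, replicas * average == total, so the desired
    # count is ceil(total / target), computed here without floats.
    next=$(( (total + target - 1) / target ))
    [ "$next" -eq "$replicas" ] && break
    replicas=$next
  done
}

settle 30 20
# replicas=1 avg_rps=30
# replicas=2 avg_rps=15
```

Running `settle 16 20` prints a single line, which matches the first run where no scaling takes place.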

Figure 4. State of HPA when the average exceeds the target
Figure 5. The number of replicas has gone up from 1 to 2 and the average is 15 RPS
Figure 6. Grafana dashboard with 2 CPXs load balancing the traffic

Step 7: Clean up

Clean up by executing the delete_all.sh script.

./delete_all.sh

Learn More

In this post, I’ve shown how to deploy an HPA for autoscaling Citrix ADC CPX using custom metrics provided by the Citrix ADC. The metric I used as an example is HTTP request rate, but users can choose from a variety of metrics provided by Citrix ADC. You can find them in the metrics.json file on the Citrix ADC Metrics Exporter GitHub page. Please note that if a Tier 1 VPX isn’t present, you can use a NodePort service to expose the CPX.

Learn more about Citrix ADC, and check out our Citrix GitHub page for more exciting information.