HiberScale technology provides hibernated nodes that can be resumed to reduce costs and startup delays. For more information, see HiberScale technology.
You can enable Kompass HiberScale technology for workloads by defining a WorkloadDescriptor custom resource.
The WorkloadDescriptor references a workload and defines protection options - Headroom reduction (spike protection) and Spot protection - that control when HiberScale resumes hibernated nodes.
When protection is activated for a workload, HiberScale creates hibernated node pools for that workload, based on the QScaler CRD.
You can override the local nodepool settings, such as instance types and number of nodes, as well as the global HiberScale settings, as described in Advanced configuration and observability.
WorkloadDescriptor overview
A WorkloadDescriptor is a namespaced resource.
The main fields include:
- workloadReference: Identifies the workload by apiVersion, kind, and name. Supported workload types include Deployment and StatefulSet. 
- protection: Defines conditions that trigger HiberScale for spike and Spot protection. 
- resources: Specifies the vCPU and memory reserved for hibernated nodes that protect the workload. 
Example:
apiVersion: kompass.zesty.co/v1alpha1
kind: WorkloadDescriptor
metadata:
  name: test-deployment
  namespace: test
spec:
  workloadReference:
    apiVersion: apps/v1
    kind: Deployment
    name: test-deployment
  protection:
    spike:
      active: true
      threshold: "10%"
      strategy: "default"
    spot:
      active: false
  resources:
    cpu: 8000m
    memory: 8Gi
For a full reference of the fields, see WorkloadDescriptor field reference.
Lifecycle
After a WorkloadDescriptor is created, Kompass performs the following operations before the resource is activated:
- Image size calculation: Kompass calculates the uncompressed size of container images using the QCacheRevisionCreation CRD. For the first WorkloadDescriptor, the calculation starts immediately. For later ones, the calculation respects the cooldown defined in the QubexConfig (revisionMinCreationInterval, default 30 minutes).
- Shard updates: Kompass creates or updates QCacheShard CRs with image size and distribution details.
- QNode updates: For the first WorkloadDescriptor, Kompass creates new QNodes. For subsequent ones, existing QNodes wake up and download new images.
- Active state: When QNodes reach CacheLag=0 (all images downloaded), Kompass sets the WorkloadDescriptor state to Active. Pre-pulled images reduce startup delays when HiberScale resumes hibernated nodes.
The protection section contains options that define when HiberScale resumes hibernated nodes:
- Spike protection (Headroom reduction): reacts to unschedulable Pods 
- Spot protection: reacts to Spot interruption notices 
Spike protection
The spike section specifies how Kompass manages unschedulable Pods. When active, it can adjust the HPA minReplicas according to a strategy that you configure. For more information on how Kompass protects workloads, see Headroom reduction.
- active: Determines whether the workload is protected for spikes and minimum replica management.
  - true: Kompass resumes hibernated nodes when the number of unschedulable Pods exceeds the threshold.
  - false: Kompass ignores unschedulable Pods and does not resume hibernated nodes.
 
- threshold: The percentage of Pods in the workload that must be unschedulable before Kompass resumes hibernated nodes. 
- strategy: Defines how Kompass manages the HPA minReplicas value.
  - manual: Kompass does not control minReplicas; the user manages it directly.
  - default: Kompass sets minReplicas based on past usage to reduce costs without affecting SLAs.
  - conservative: Similar to default, but uses more cautious calculations. This reduces savings but provides additional protection for sensitive workloads.
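As an illustration of the spike fields above, the following fragment is a minimal sketch of a WorkloadDescriptor that uses the conservative strategy. The resource name, namespace, 20% threshold, and resource values are hypothetical and chosen only for the example; they are not recommendations.

```yaml
apiVersion: kompass.zesty.co/v1alpha1
kind: WorkloadDescriptor
metadata:
  name: checkout-api            # hypothetical name
  namespace: shop               # hypothetical namespace
spec:
  workloadReference:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout-api          # hypothetical Deployment
  protection:
    spike:
      active: true              # resume hibernated nodes on unschedulable Pods
      threshold: "20%"          # illustrative value, not a recommendation
      strategy: "conservative"  # more cautious minReplicas management than "default"
    spot:
      active: false
  resources:
    cpu: 4000m                  # illustrative sizing
    memory: 8Gi
```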
 
Argo CD
When using Argo CD, configure it so that the changes Kompass makes to the HPA are not treated as conflicts; otherwise, Argo CD may revert the Kompass adjustments.
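One way to do this is the ignoreDifferences setting of an Argo CD Application, sketched below. The Application name, repository URL, and path are hypothetical. The example also assumes that minReplicas is the only HPA field Kompass changes, which this page does not confirm.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: test-deployment                          # hypothetical Application name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/org/repo.git    # hypothetical repository
    path: manifests                              # hypothetical path
    targetRevision: HEAD
  destination:
    server: https://kubernetes.default.svc
    namespace: test
  ignoreDifferences:
    # Do not report drift on the field assumed to be managed by Kompass.
    - group: autoscaling
      kind: HorizontalPodAutoscaler
      jsonPointers:
        - /spec/minReplicas
  syncPolicy:
    syncOptions:
      # Also leave the ignored field untouched during sync.
      - RespectIgnoreDifferences=true
```

Without RespectIgnoreDifferences=true, Argo CD hides the difference in its diff view but can still overwrite the field on the next sync.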
Spot protection
The spot section specifies how Kompass responds to Spot interruptions. 
- active: Determines whether the workload is protected for Spot interruptions. For more information on how Kompass protects workloads, see Spot management.
  - true: Kompass resumes hibernated nodes in response to interruption notices.
  - false: Kompass ignores interruption notices.
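For example, a workload that needs only Spot protection might use a spec like the sketch below. The StatefulSet name is hypothetical. The fragment assumes that threshold and strategy can be omitted when spike.active is false, mirroring how the overview example omits extra fields for an inactive spot section.

```yaml
spec:
  workloadReference:
    apiVersion: apps/v1
    kind: StatefulSet
    name: orders-db        # hypothetical StatefulSet
  protection:
    spike:
      active: false        # unschedulable Pods do not resume hibernated nodes
    spot:
      active: true         # interruption notices resume hibernated nodes
```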
 
Resources
The resources section defines CPU and memory values for hibernated nodes.
- Values determine the resources allocated to the pool of hibernated nodes used to protect the workload.
  - Recommended vCPU and RAM allocation: It is recommended to allocate an amount of resources equal to the current Pod requests multiplied by the maximum number of Pods expected.
- You can prevent workload protection in these ways:
  - Delete the WorkloadDescriptor. This is preferred when using IaC tools.
  - Set both spike.active=false and spot.active=false. This prevents HiberScale from creating hibernated nodes without deleting the WorkloadDescriptor. If there are previously created hibernated nodes, they will be removed.
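The sizing recommendation above can be worked through with a small example. The per-Pod requests and the maximum Pod count below are hypothetical inputs; substitute the values from your own workload.

```yaml
# Assumed inputs (hypothetical):
#   per-Pod requests:      cpu 500m, memory 512Mi
#   maximum expected Pods: 16
#
# cpu:    16 x 500m  = 8000m
# memory: 16 x 512Mi = 8192Mi = 8Gi
spec:
  resources:
    cpu: 8000m
    memory: 8Gi
```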
 
Deactivation
You can deactivate HiberScale for a workload by setting the active field to false for both spike and spot.
For more information, see Resources.
Advanced configuration and observability
The following options provide additional configuration and integration. They are not required for basic operation but can be used to tune behavior or integrate with observability systems.
QScaler local setting overrides
The QScaler CRD defines hibernated node pools. For each autoscaler or Karpenter nodepool that protects a workload, Kompass creates a corresponding QScaler object.
You can override the following settings from the upstream nodepool:
- instanceTypes: compatible instance types that support hibernation 
- maxHibernatedQNodes: maximum number of hibernated nodes 
- maxRunningQNodes: maximum number of Kompass-created nodes 
Example:
spec:
 overrides:
   instanceTypes:
   - c5.xlarge
   - c5.2xlarge
   maxHibernatedQNodes:
     type: Absolute
     value: 10
   maxRunningQNodes:
     type: Absolute
     value: 10
Sizing types for maximum values:
- Absolute: fixed number of nodes 
- PercentOfPool: percentage of the maximum size of the upstream nodepool 
- PercentOfRunning: percentage of the current number of nodes in the upstream nodepool 
Recommendation: For predictability, use Absolute sizing.
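The percentage-based sizing types can be sketched in the same way. This page does not show how value is written for PercentOfPool or PercentOfRunning; the fragment below assumes a plain number interpreted as a percentage, so check the QScaler CRD schema before relying on it. The upstream pool size in the comment is hypothetical.

```yaml
spec:
  overrides:
    maxHibernatedQNodes:
      type: PercentOfPool
      # Assumed format. With a hypothetical upstream nodepool maximum of
      # 40 nodes, 25 percent would allow up to 10 hibernated nodes.
      value: 25
    maxRunningQNodes:
      type: Absolute
      value: 10
```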
QubexConfig global setting overrides
The QubexConfig CRD defines global HiberScale settings.
You can override the default settings.
Common setting:
- cache.revisionMinCreationInterval: Time to wait between checking for new images in the cluster. 
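As a rough sketch, an override of this setting might look like the fragment below. Both the nesting under spec and the duration format are assumptions, since this page gives only the field path and the 30-minute default; confirm them against the QubexConfig CRD schema.

```yaml
spec:
  cache:
    # Assumed duration format. Raises the interval from the
    # documented default of 30 minutes to 60 minutes.
    revisionMinCreationInterval: 60m
```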
Metrics ingestion
Kompass control plane components expose Prometheus metrics on a /metrics endpoint. These endpoints can be scraped by Prometheus to ingest Kompass metrics into an observability system.
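A minimal Prometheus scrape job for such an endpoint might look like the sketch below. The job name, Service address, and port are hypothetical, because this page does not list the Kompass Service names or ports; replace them with the values from your installation.

```yaml
scrape_configs:
  - job_name: kompass                # hypothetical job name
    metrics_path: /metrics           # endpoint named on this page
    scrape_interval: 30s
    static_configs:
      - targets:
          # Hypothetical Service DNS name and port.
          - kompass-controller.kompass.svc:8080
```

In clusters that use Kubernetes service discovery or the Prometheus Operator, the same endpoint would typically be targeted through kubernetes_sd_configs or a ServiceMonitor instead of static targets.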