April 28, 2026

    vCluster on EKS with Karpenter: Dev & QA Environments in Under 5 Minutes

    Hardik Shah

    Cloud Architect & AWS Expert

    AWS
    EKS
    Kubernetes
    vCluster
    Karpenter
    DevOps
    QA
    Platform Engineering
    Cost Optimization
    GitOps
    Containers

    Running dedicated Amazon EKS clusters for every developer or QA engineer is slow (30–45 minutes per cluster), expensive, and operationally heavy. This post shows how combining vCluster with a shared EKS host cluster and Karpenter for node autoscaling lets your teams spin up fully isolated environments in under 5 minutes — at up to 70% lower infrastructure cost.

    The Problem: One Cluster Per Environment Doesn't Scale

    Modern engineering teams rightly insist on environment isolation — dev, QA, staging, and production should never share the same cluster. But taking this to its logical extreme creates a painful reality:

    | Pain Point | Impact |
    | --- | --- |
    | Provisioning a new EKS cluster takes 30–45 minutes | QA engineers sit idle; sprint velocity drops |
    | Every cluster needs its own ALB, Route 53, monitoring agents | Infrastructure cost multiplies linearly with team count |
    | Platform team is the sole gatekeeper | Bottleneck slows down every squad |
    | IAM roles and RBAC configs multiply | Security and access management becomes unwieldy |
    | Idle clusters run 24/7 even after tests finish | Wasted AWS spend every month |

    Deloitte faced exactly these challenges. After adopting EKS + vCluster they achieved 89% faster provisioning and reclaimed 500+ engineering hours per year.

    What Is vCluster?

    vCluster is an open-source project from Loft Labs that creates virtual Kubernetes clusters running as pods inside a real host cluster. Think of it as Kubernetes-in-Kubernetes — but lightweight and fast.

    Each virtual cluster has:

    • Its own kube-apiserver, controller manager, and CoreDNS
    • Its own namespaces, RBAC, and resource quotas
    • Complete isolation — teams can't see each other's workloads
    • Syncing to the host cluster for actual scheduling, networking, and storage

    Unlike plain namespaces (which share the same apiserver), vClusters are architecturally isolated. Unlike real clusters (which carry full control-plane cost), vClusters are just pods on the host.

    Figure 1: vCluster on a shared Amazon EKS host cluster with Karpenter and shared controllers.

    Architecture: Four Layers

    1. EKS Host Cluster + Karpenter

    One shared EKS cluster serves as the foundation. Karpenter takes over workload capacity from managed node groups: it watches for unschedulable pods and provisions right-sized EC2 nodes in ~60 seconds. When vClusters are idle, Karpenter's consolidation policy bin-packs the remaining workloads and terminates underused nodes, so ephemeral QA environments cost near zero when not active.

    2. Virtual Clusters (vCluster)

    Each dev team or QA environment gets its own vCluster — provisioned in under 5 minutes via the vCluster web console, vcluster create CLI, or a Helm chart in an ArgoCD/Flux GitOps pipeline.

    3. Shared Controllers (once on the host)

    | Controller | Purpose |
    | --- | --- |
    | AWS Load Balancer Controller | Provisions ALBs for Ingress objects created inside vClusters |
    | Karpenter | Autoscales EC2 nodes based on actual pod demand across all vClusters |
    | EBS CSI Driver | Dynamically provisions gp3 EBS volumes for PVCs in vClusters |
    | Monitoring Agent | Single Prometheus/Datadog agent covers all virtual clusters |

    4. Single ALB with Path-Based Routing

    One Application Load Balancer fronts all virtual clusters. Each app inside a vCluster creates an Ingress with a unique path prefix, and the ALB routes traffic using listener rules — no separate load balancer per env.

    Figure 2: Single Application Load Balancer using path-based rules to route traffic across virtual clusters.
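
    As a rough sketch of what one of those Ingress objects could look like: the AWS Load Balancer Controller merges Ingresses that carry the same alb.ingress.kubernetes.io/group.name annotation onto one ALB, so each team adds the shared group name and its own path prefix. The group name shared-dev-qa, the orders-api Service, and the /team-a/orders prefix are illustrative assumptions, not values from this setup.

    ```yaml
    # Applied inside a vCluster; synced to the host, where the ALB controller picks it up.
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: orders-api                 # hypothetical app name
      namespace: default
      annotations:
        # Ingresses sharing this group name are served by the same ALB
        alb.ingress.kubernetes.io/group.name: shared-dev-qa   # hypothetical group name
        alb.ingress.kubernetes.io/scheme: internet-facing
        alb.ingress.kubernetes.io/target-type: ip
    spec:
      ingressClassName: alb            # the shared IngressClass on the host
      rules:
        - http:
            paths:
              - path: /team-a/orders   # hypothetical prefix, unique per vCluster/app
                pathType: Prefix
                backend:
                  service:
                    name: orders-api
                    port:
                      number: 80
    ```

    Without a common group.name the controller provisions a separate ALB for each Ingress, so the annotation is what keeps this to a single load balancer. The app is assumed to serve requests under its own prefix.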

    Step-by-Step Setup

    Step 1: Install Karpenter

    Start with a standard EKS cluster (no Auto Mode). Keep one managed node group for system/Karpenter workloads, then install Karpenter:

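    export CLUSTER_NAME="<your-cluster>"
    # The karpenter.sh/v1 manifests used later require Karpenter 1.0 or newer
    export KARPENTER_VERSION="1.0.0"

    # Karpenter charts are published as OCI artifacts, so no `helm repo add` is needed.
    # Assumes the Karpenter controller IAM permissions and an SQS interruption queue
    # named after the cluster already exist.
    helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
      --namespace kube-system \
      --version "$KARPENTER_VERSION" \
      --set settings.clusterName="$CLUSTER_NAME" \
      --set settings.interruptionQueue="$CLUSTER_NAME" \
      --wait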

    Then create a NodePool and EC2NodeClass:

    apiVersion: karpenter.sh/v1
    kind: NodePool
    metadata:
      name: vcluster-pool
    spec:
      template:
        spec:
          nodeClassRef:
            group: karpenter.k8s.aws
            kind: EC2NodeClass
            name: vcluster-nodeclass
          requirements:
            - key: karpenter.sh/capacity-type
              operator: In
              values: ["spot", "on-demand"]   # Spot-first
            - key: karpenter.k8s.aws/instance-category
              operator: In
              values: ["m", "c", "r"]
      limits:
        cpu: "200"
        memory: 800Gi
      disruption:
        consolidationPolicy: WhenEmptyOrUnderutilized   # v1 API value; WhenUnderutilized was v1beta1
        consolidateAfter: 5m
    ---
    apiVersion: karpenter.k8s.aws/v1
    kind: EC2NodeClass
    metadata:
      name: vcluster-nodeclass
    spec:
      amiSelectorTerms:
        - alias: al2023@latest
      role: KarpenterNodeRole-<your-cluster>
      subnetSelectorTerms:
        - tags:
            karpenter.sh/discovery: <your-cluster>
      securityGroupSelectorTerms:
        - tags:
            karpenter.sh/discovery: <your-cluster>

    Step 2: Configure Shared IngressClass and StorageClass

    # IngressClass — shared ALB ingress controller
    apiVersion: networking.k8s.io/v1
    kind: IngressClass
    metadata:
      name: alb
      annotations:
        ingressclass.kubernetes.io/is-default-class: "true"
    spec:
      controller: ingress.k8s.aws/alb
    ---
    # StorageClass — gp3 EBS via standard EBS CSI driver
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: ebs-gp3-sc
      annotations:
        storageclass.kubernetes.io/is-default-class: "true"
    provisioner: ebs.csi.aws.com
    volumeBindingMode: WaitForFirstConsumer
    parameters:
      type: gp3
      encrypted: "true"

    Step 3: Deploy vCluster Platform via Helm

    export DOMAIN_NAME="<your-platform-domain>"

    helm repo add vcluster https://charts.loft.sh
    helm repo update

    # The final --set pins the vCluster Platform pods to on-demand nodes
    helm upgrade --install vcluster-pro vcluster/vcluster-platform \
      --namespace vcluster-platform \
      --create-namespace \
      --version 4.0.1 \
      --set config.loftHost="$DOMAIN_NAME" \
      --set admin.create=true \
      --set admin.username=admin \
      --set admin.password="<strong-password>" \
      --set ingress.enabled=true \
      --set ingress.host="$DOMAIN_NAME" \
      --set ingress.ingressClass=alb \
      --set nodeSelector."karpenter\.sh/capacity-type"=on-demand

    Step 4: Create a Virtual Cluster

    Log in to the vCluster console and create a new virtual cluster with this sync configuration:

    sync:
      fromHost:
        ingressClasses:
          enabled: true   # Virtual cluster sees host's ALB IngressClass
        storageClasses:
          enabled: true   # Virtual cluster sees host's gp3 StorageClass
      toHost:
        ingresses:
          enabled: true   # Ingresses sync to host → ALB creates path rules
    
    controlPlane:
      coredns:
        enabled: true
        embedded: true

    Why this sync config matters:

    • fromHost: makes the host's alb IngressClass and gp3 StorageClass visible inside the vCluster, so workloads there can use the shared ALB controller and EBS CSI driver transparently
    • toHost: Ingress objects created by app teams inside the vCluster propagate to the host, triggering ALB path-rule creation automatically
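
    To illustrate the fromHost half, here is a minimal sketch of a PersistentVolumeClaim an app team might apply inside the vCluster. It assumes the ebs-gp3-sc StorageClass from Step 2 and vCluster's default syncing of PVCs to the host; the claim name and size are made up for the example.

    ```yaml
    # Applied inside the vCluster, not on the host
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: test-data                  # hypothetical claim name
      namespace: default
    spec:
      accessModes:
        - ReadWriteOnce
      storageClassName: ebs-gp3-sc     # host StorageClass, visible via sync.fromHost.storageClasses
      resources:
        requests:
          storage: 10Gi                # illustrative size
    ```

    Because the StorageClass uses WaitForFirstConsumer, the EBS volume is only created once a pod that mounts the claim is scheduled.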

    Before vs After: The Developer Experience

    ❌ Before: Dedicated EKS Clusters

    • 🕐 30–45 min wait for a new environment
    • 👷 Platform team bottleneck on every request
    • 💸 1 ALB + Route 53 + monitoring per environment
    • 🔑 New IAM roles and RBAC per cluster
    • 🖥️ Idle nodes running 24/7

    ✅ After: vCluster + Karpenter

    • ⚡ Under 5 minutes, fully self-service
    • 🚀 No platform team involvement
    • 💰 1 shared ALB for all environments
    • 🔐 vCluster RBAC — isolated without extra IAM
    • 📉 Karpenter consolidates idle nodes automatically

    Cost & Efficiency Impact

    | Metric | Before | After | Gain |
    | --- | --- | --- | --- |
    | Provisioning time | 30–45 min | < 5 min | 89% faster |
    | Platform team involvement | ~2 hrs/env | 0 hrs (self-service) | 500+ hrs/yr reclaimed |
    | EKS control planes | 1 per environment | 1 shared | 90%+ reduction |
    | Load balancers | 1 per environment | 1 shared ALB | Cost eliminated |
    | EC2 cost (Karpenter + Spot) | On-demand only | Spot-first, auto-consolidated | Up to 70% cheaper |

    GitOps: Tie vCluster Lifecycle to Pull Requests

    Use ArgoCD to create and destroy vClusters automatically based on PR lifecycle:

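    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: qa-env-feature-xyz
      namespace: argocd
      finalizers:
        - resources-finalizer.argocd.argoproj.io   # Cascade-delete the vCluster when this Application is deleted
    spec:
      project: default   # Required field; assumes the built-in default project
      source:
        repoURL: https://github.com/myorg/vcluster-configs
        targetRevision: HEAD
        path: environments/qa-env-feature-xyz
      destination:
        server: https://kubernetes.default.svc
        namespace: qa-env-feature-xyz
      syncPolicy:
        automated:
          prune: true     # Remove resources that are deleted from Git
          selfHeal: true
        syncOptions:
          - CreateNamespace=true   # Create the target namespace if it is missing

    PR opened → ArgoCD creates vCluster → App deployed → QA tests run. PR merged or closed → the Application is deleted → its finalizer removes the vCluster → Karpenter consolidates idle nodes → node cost for that environment drops to near zero.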
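
    One way to wire this to pull requests without hand-writing an Application per branch is Argo CD's ApplicationSet controller with its pull request generator. The sketch below is an assumption about how that could look here, not the configuration used in this post: the myorg/myapp repository, the github-token Secret, and the environments/qa-template path are hypothetical.

    ```yaml
    apiVersion: argoproj.io/v1alpha1
    kind: ApplicationSet
    metadata:
      name: qa-envs
      namespace: argocd
    spec:
      generators:
        - pullRequest:
            github:
              owner: myorg               # hypothetical GitHub org
              repo: myapp                # hypothetical application repo whose PRs are watched
              tokenRef:
                secretName: github-token # hypothetical Secret holding a GitHub token
                key: token
            requeueAfterSeconds: 300     # poll for opened/closed PRs every 5 minutes
      template:
        metadata:
          name: 'qa-env-pr-{{number}}'   # one Application per open PR
        spec:
          project: default
          source:
            repoURL: https://github.com/myorg/vcluster-configs
            targetRevision: HEAD
            path: environments/qa-template   # hypothetical shared vCluster template
          destination:
            server: https://kubernetes.default.svc
            namespace: 'qa-env-pr-{{number}}'
          syncPolicy:
            automated:
              prune: true
              selfHeal: true
            syncOptions:
              - CreateNamespace=true
    ```

    When a PR is merged or closed the generator stops emitting its entry, and the ApplicationSet controller removes the matching Application. With default settings that deletion should also remove the Application's resources, including the vCluster, but it is worth confirming against your Argo CD version.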

    When to Use This Pattern

    ✅ Great fit

    • Multiple dev/QA teams
    • Ephemeral environments (PR-scoped)
    • Platform team bottlenecks
    • Cost-conscious AWS workloads

    ⚠️ Needs care

    • Cluster-level CRD installs
    • Very high I/O workloads
    • Node-level compliance isolation

    ❌ Not ideal

    • Production environments
    • Shared GPU node pools

    Conclusion

    The combination of vCluster + EKS + Karpenter is one of the most impactful platform engineering patterns available today. With a single shared host cluster, you can support 100+ isolated virtual clusters, provision them in under 5 minutes, and let every dev and QA team operate independently — no platform team handoffs, no waiting, no wasted spend.

    Karpenter handles the compute efficiency problem: Spot-first provisioning, ~60 second node startup, and automatic consolidation when environments go idle. vCluster handles the isolation problem: full Kubernetes API separation without the cost of real cluster control planes.

    If your team is still waiting 45 minutes for a test environment, this is the architecture change worth making next sprint.

    About Hardik Shah

    Hardik is a dedicated Cloud Architect specializing in AWS solutions and DevOps automation. With years of industry experience, he focuses on building scalable, resilient architectures and sharing technical insights to help teams optimize their cloud-native journeys.