Reference
Create Cluster input fields
| Field | Type | Required | Notes |
|---|---|---|---|
| name | string | Yes | 1-128 chars, alphanumeric + hyphens. Used in AWS resource naming. |
| description | string | Yes | Human-readable cluster description. |
| cloud_account_id | int | Yes | Must reference a supported cloud account with valid credentials. |
| cluster_type | enum | Yes | Must be k8s. |
| secret | string | No | 24-32 chars. Keycloak client secret. Auto-generated if omitted. |
| settings | JSON | Yes | Provider-specific configuration (see below). |
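
The field rules above can be sketched as a client-side pre-check. This is only an illustration: the function name `check_create_cluster_input`, the dict-shaped payload, and the sample values are assumptions, and the API performs its own validation regardless.

```python
# Hypothetical client-side pre-check; not part of the documented API.
REQUIRED_FIELDS = ("name", "description", "cloud_account_id", "cluster_type", "settings")


def check_create_cluster_input(payload: dict) -> list:
    """Return a list of human-readable problems found in a create-cluster payload."""
    problems = [f"missing required field: {f}" for f in REQUIRED_FIELDS if f not in payload]
    # cluster_type must be k8s.
    if "cluster_type" in payload and payload["cluster_type"] != "k8s":
        problems.append("cluster_type must be k8s")
    # secret is optional (auto-generated if omitted), but must be 24-32 chars when supplied.
    secret = payload.get("secret")
    if secret is not None and not 24 <= len(secret) <= 32:
        problems.append("secret must be 24-32 chars")
    return problems


example = {
    "name": "demo-cluster",
    "description": "Example cluster",
    "cloud_account_id": 1,  # placeholder ID
    "cluster_type": "k8s",
    "settings": {"key_pair": "my-key-pair"},  # placeholder key pair name
}
print(check_create_cluster_input(example))  # prints []
```

An empty list means the payload passes these basic checks; name format rules are listed under Limits.
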
Settings (AWS K8s)
| Field | Type | Required | Notes |
|---|---|---|---|
| region_name | string | No | AWS region. Default: us-east-1. |
| key_pair | string | Yes | EC2 key pair name for SSH access (required for control plane). |
| control_plane_instance_type | string | No | EC2 instance type for control plane. Default: t3.xlarge. |
| vpc_id | string | No | Existing VPC ID. A new VPC is created if omitted. |
| subnet_id | string | No | Existing subnet ID. Resolved automatically if omitted. |
| autoscaler_node_groups | [NodeGroup] | No | Node group definitions for the cluster autoscaler. |
| integrations | [string] | No | List of integrations to enable (e.g., ["jupyterhub"]). |
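
As a minimal sketch of how the defaults in this table could be applied, the helper below fills in `region_name` and `control_plane_instance_type` when they are omitted and rejects settings without `key_pair`. The helper name `resolve_aws_settings` is hypothetical; the service may resolve defaults differently.

```python
# Defaults taken from the AWS settings table; the helper itself is illustrative.
AWS_SETTINGS_DEFAULTS = {
    "region_name": "us-east-1",
    "control_plane_instance_type": "t3.xlarge",
}


def resolve_aws_settings(settings: dict) -> dict:
    """Return a copy of settings with documented defaults filled in."""
    if not settings.get("key_pair"):
        # key_pair is the only required AWS setting.
        raise ValueError("key_pair is required in AWS settings")
    # Caller-supplied values override the defaults.
    return {**AWS_SETTINGS_DEFAULTS, **settings}


resolved = resolve_aws_settings({"key_pair": "my-key-pair", "region_name": "eu-west-1"})
print(resolved["region_name"], resolved["control_plane_instance_type"])
```
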
Settings (Cudo K8s)
| Field | Type | Required | Notes |
|---|---|---|---|
| autoscaler_node_groups | [NodeGroup] | No | Node group definitions. Uses id field instead of name. |
| control_plane_size | string | No | Shorthand: sm, md, or lg. Resolves to a control plane profile. Default: md. |
| ssh_public_key | string | No | SSH public key for debugging access. |
| dev_mode | bool | No | Enable development mode builds. |
| no_vault | bool | No | Disable LUKS encryption and Vault KMS. |
Node group fields (AWS)
| Field | Type | Required | Notes |
|---|---|---|---|
| name | string | Yes | Node group identifier (e.g., control-plane, gpu-workers). |
| role | string | Yes | control or worker. |
| min_size | int | Yes | Autoscaler minimum node count. |
| max_size | int | Yes | Autoscaler maximum node count. |
| instance_types | [string] | Yes | List of EC2 instance types (e.g., ["t3.xlarge"]). Multiple types enable Fleet diversification. |
| labels | JSON | No | Kubernetes node labels. |
| taints | [Taint] | No | Kubernetes node taints. |
| allocation_strategy | string | No | EC2 Fleet strategy: lowest-price, diversified, capacity-optimized. Default: lowest-price. |
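
The AWS node group rules above might be checked along these lines. The function name `normalize_aws_node_group` is hypothetical, and the `min_size <= max_size` check is an assumption that this reference does not state explicitly.

```python
ROLES = ("control", "worker")
ALLOCATION_STRATEGIES = ("lowest-price", "diversified", "capacity-optimized")
REQUIRED_NODE_GROUP_FIELDS = ("name", "role", "min_size", "max_size", "instance_types")


def normalize_aws_node_group(group: dict) -> dict:
    """Validate an AWS node group and fill in the default allocation strategy."""
    for field in REQUIRED_NODE_GROUP_FIELDS:
        if field not in group:
            raise ValueError(f"missing required field: {field}")
    if group["role"] not in ROLES:
        raise ValueError("role must be control or worker")
    if not group["instance_types"]:
        raise ValueError("instance_types must list at least one EC2 instance type")
    # Assumption: the autoscaler minimum may not exceed the maximum.
    if group["min_size"] > group["max_size"]:
        raise ValueError("min_size must not exceed max_size")
    strategy = group.get("allocation_strategy", "lowest-price")
    if strategy not in ALLOCATION_STRATEGIES:
        raise ValueError(f"unknown allocation_strategy: {strategy}")
    return {**group, "allocation_strategy": strategy}


gpu_workers = normalize_aws_node_group({
    "name": "gpu-workers",
    "role": "worker",
    "min_size": 0,
    "max_size": 4,
    "instance_types": ["t3.xlarge"],
})
print(gpu_workers["allocation_strategy"])  # prints lowest-price
```
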
Node group fields (Cudo)
| Field | Type | Required | Notes |
|---|---|---|---|
| id | string | Yes | Node group identifier. Cudo uses id instead of name. |
| role | string | Yes | control or worker. |
| min_size | int | Yes | Autoscaler minimum. |
| max_size | int | Yes | Autoscaler maximum. |
| vcpus | int | Yes | Number of vCPUs per node (instead of instance type). |
| memory_gib | int | Yes | Memory in GiB per node. |
| boot_disk_size_gib | int | Yes | Boot disk size in GiB. |
| data_center_id | string | Yes | Cudo data center (e.g., us-dallas-1). |
| machine_type | string | Yes | Hardware family (e.g., intel-broadwell, intel-broadwell-v100). |
| gpus | int | No | Number of GPUs per node. |
| gpu_model | string | No | GPU model (e.g., V100, A100). |
| labels | JSON | No | Kubernetes node labels. |
Cudo control plane profiles
| Profile | vCPU | Memory (GiB) | Boot Disk (GiB) |
|---|---|---|---|
| vantage-k8s-control-sm | 4 | 8 | 50 |
| vantage-k8s-control-md (default) | 8 | 16 | 100 |
| vantage-k8s-control-lg | 16 | 32 | 200 |
Select a profile with instance_type or control_plane_size in the cluster settings.
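
The profile table can be read as a lookup from the `control_plane_size` shorthand to a profile. The sketch below models only that shorthand; `instance_type` is left out because its accepted values are not specified here. The names `ControlPlaneProfile` and `resolve_control_plane_profile` are hypothetical.

```python
from typing import NamedTuple


class ControlPlaneProfile(NamedTuple):
    name: str
    vcpus: int
    memory_gib: int
    boot_disk_gib: int


# Values copied from the Cudo control plane profiles table.
PROFILES = {
    "sm": ControlPlaneProfile("vantage-k8s-control-sm", 4, 8, 50),
    "md": ControlPlaneProfile("vantage-k8s-control-md", 8, 16, 100),
    "lg": ControlPlaneProfile("vantage-k8s-control-lg", 16, 32, 200),
}


def resolve_control_plane_profile(settings: dict) -> ControlPlaneProfile:
    """Map control_plane_size (default md) to its profile."""
    size = settings.get("control_plane_size", "md")
    if size not in PROFILES:
        raise ValueError(f"control_plane_size must be one of {sorted(PROFILES)}")
    return PROFILES[size]


print(resolve_control_plane_profile({}).name)  # prints vantage-k8s-control-md
```
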
Limits
| Limit | Value | Notes |
|---|---|---|
| Cluster name length | 1-128 chars | Alphanumeric + hyphens. |
| Cluster name characters | Alphanumeric + hyphens | Must start and end with alphanumeric. |
| Node groups per cluster | 20 | Contact support to increase. |
| Max nodes per node group | 100 | Contact support to increase. |
| Cluster limit per tier | Varies by subscription | Check your subscription settings. |
| Slurm-on-K8s cluster name | Lowercase letters, numbers, dashes | Must start with a letter, no trailing dash. |
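
The two naming rules above can be expressed as regular expressions. This sketch assumes "alphanumeric" means ASCII letters and digits, and it does not apply the 128-character limit to Slurm-on-K8s names because the table does not say whether it applies. Both function names are hypothetical.

```python
import re

# Alphanumeric + hyphens, starting and ending with an alphanumeric character.
CLUSTER_NAME_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")
# Lowercase letters, numbers, dashes; starts with a letter; no trailing dash.
SLURM_NAME_RE = re.compile(r"[a-z](?:[a-z0-9-]*[a-z0-9])?")


def is_valid_cluster_name(name: str) -> bool:
    """Check the general cluster name rules (1-128 chars)."""
    return 1 <= len(name) <= 128 and CLUSTER_NAME_RE.fullmatch(name) is not None


def is_valid_slurm_cluster_name(name: str) -> bool:
    """Check the stricter Slurm-on-K8s cluster name rules."""
    return SLURM_NAME_RE.fullmatch(name) is not None


for candidate in ("gpu-cluster-01", "-bad-start", "Slurm-1", "slurm-1"):
    print(candidate, is_valid_cluster_name(candidate), is_valid_slurm_cluster_name(candidate))
```
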
Error codes
| Condition | Error type | Phase |
|---|---|---|
| Name already in use | ClusterNameInUse | Validation |
| Name format invalid | InvalidInput | Validation |
| key_pair missing from settings | InvalidInput | Validation |
| Instance type not available in region | InvalidInput | Validation |
| Cloud account missing role_arn | ClusterCouldNotBeDeployed | Background |
| VPC / IAM creation fails | ClusterCouldNotBeDeployed | Background |
| Keycloak client creation fails | UnexpectedBehavior | Synchronous |
AWS resources created
| Resource | Naming convention |
|---|---|
| VPC | Auto-created 10.0.0.0/16 |
| Subnets | Public + private |
| IAM Role | vantage-{client_id}-node-role |
| Instance Profile | vantage-{client_id}-instance-profile |
| EC2 Instance | vantage-{client_id}-control-plane |
| Launch Templates | vantage-{client_id}-{node_group_name} (created by autoscaler) |
| EC2 Fleet Instances | Tagged vantage-cluster={client_id} |
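
When searching an AWS account for the resources belonging to one cluster, the naming conventions above can be generated from the cluster's `client_id`. The helper below is a hypothetical convenience, and the `client_id` value in the example is a placeholder.

```python
def aws_resource_names(client_id: str, node_group_names=()) -> dict:
    """Build the expected AWS resource names and tag for a cluster."""
    return {
        "iam_role": f"vantage-{client_id}-node-role",
        "instance_profile": f"vantage-{client_id}-instance-profile",
        "control_plane_instance": f"vantage-{client_id}-control-plane",
        # Launch templates are created by the autoscaler, one per node group.
        "launch_templates": [f"vantage-{client_id}-{name}" for name in node_group_names],
        # EC2 Fleet instances carry this tag.
        "fleet_tag": {"vantage-cluster": client_id},
    }


names = aws_resource_names("abc123", ["gpu-workers"])  # "abc123" is a placeholder
print(names["iam_role"])
print(names["launch_templates"])
```
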
Database operations
| Table | Operation | When |
|---|---|---|
| cloud_account | SELECT | Provider + credential resolution |
| cluster | SELECT | Name uniqueness check |
| cluster | INSERT | Cluster creation |
| cluster | UPDATE | Status → ready (via markClusterReady) |
External service calls per provider
AWS
| Service | Call | Phase |
|---|---|---|
| Keycloak | Create OAuth2 client | Synchronous |
| AWS STS | AssumeRole | Background |
| AWS EC2 | VPC, subnet, SG creation | Background |
| AWS IAM | Role, instance profile, policies | Background |
| AWS EC2 | RunInstances (control plane) | Background |
| vdeployer-web | POST /deploy | markClusterReady |
Cudo Compute
| Service | Call | Phase |
|---|---|---|
| Keycloak | Create OAuth2 client | Synchronous |
| Cudo Compute | List machine types | Background |
| Cudo Compute | Create disk | Background |
| Cudo Compute | Create VM | Background |
| Cudo Compute | Poll disk/VM status | Background |
| vdeployer-web | POST /deploy | markClusterReady |