Create a Slurm cluster
Prerequisites
Before creating a Slurm cluster, you need:
- A Vantage account and organization.
- A configured Cloud Account for your chosen provider — see Compute Providers.
- AWS only: An SSH key pair created in the target AWS region. Vantage uses the key pair name to provision cluster nodes — it never receives the private key.
AWS
The most common Slurm path. Vantage uses CloudFormation to provision a VPC, Auto Scaling groups, and IAM roles, then installs Slurm on the controller and worker nodes.
- Open Clusters: Click Clusters in the left sidebar, then click Prepare Cluster.
- Choose type: Select Slurm and click Continue.
- Configure the cluster:
  - Enter a Cluster Name (max 27 characters, must be unique). The name is used as the CloudFormation stack name.
  - Select your AWS Cloud Account. The provider is detected automatically.
  - Pick a Region. The dropdown loads after you select the cloud account.
  - The Head Node Machine Type field auto-fills a default. Click Select Head Node to browse instance types by vCPU, GPU, and price.
  - Select an SSH Key Name. The list loads after you pick a region. If it is empty, create a key pair in the AWS EC2 console first.
- Networking (optional): Click Advanced Options to pin the cluster to a specific VPC, Head Node Subnet, and Compute Node Subnet. Leave these empty to use the defaults: Vantage creates a VPC, public and private subnets, an Internet Gateway, a NAT Gateway, and security groups automatically.
- Set partitions: A default partition named `compute` is pre-filled. For each partition:
  - Give it a Partition Name.
  - Click Select Compute Node to choose the instance type for worker nodes.
  - Set the Maximum node count. Vantage scales up to this limit when jobs are waiting.
  - Click Add Partition to create additional partitions for different workload types (e.g., a GPU partition alongside a CPU partition).
- Submit: Click Prepare Cluster. Vantage generates a CloudFormation template and creates the stack. Provisioning typically takes a few minutes.
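Because the cluster name doubles as the CloudFormation stack name, it can help to check a candidate name before opening the wizard. The sketch below is an illustration, not Vantage's own validation: the 27-character limit and the uniqueness requirement come from the steps above, the character rule is CloudFormation's stack-name rule (a letter first, then letters, digits, or hyphens), and the function name `cluster_name_problems` and the `existing_names` set are hypothetical.

```python
import re

# 27 is the limit stated in the wizard steps above.
MAX_CLUSTER_NAME_LENGTH = 27
# CloudFormation stack names start with a letter and contain only
# letters, digits, and hyphens. Vantage may apply stricter rules.
STACK_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def cluster_name_problems(name: str, existing_names: set[str]) -> list[str]:
    """Return human-readable problems; an empty list means the name looks usable."""
    problems = []
    if not name:
        problems.append("name is empty")
    elif len(name) > MAX_CLUSTER_NAME_LENGTH:
        problems.append(
            f"name is {len(name)} characters; the limit is {MAX_CLUSTER_NAME_LENGTH}"
        )
    if name and not STACK_NAME_PATTERN.fullmatch(name):
        problems.append(
            "name must start with a letter and contain only letters, digits, and hyphens"
        )
    if name in existing_names:
        problems.append("name is already in use")
    return problems
```

For example, `cluster_name_problems("gpu-research-01", {"prod"})` returns an empty list, while a 28-character name or a name already present in `existing_names` returns one problem each.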
What Vantage provisions on AWS
| Resource | Details |
|---|---|
| VPC | 10.0.0.0/16 CIDR (created only if you do not provide a VPC) |
| Subnets | Public and private subnets |
| Internet Gateway | Outbound traffic from the public subnet |
| NAT Gateway | Outbound traffic from the private subnet |
| Security groups | Slurm inter-node communication |
| IAM instance profiles | Grant nodes access to assume the cluster role |
| EC2 Auto Scaling group | Worker nodes with configured instance type and limits |
| Slurm controller | Always-on head node (EC2 instance) |
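To get a feel for the address space in the table above, the following sketch carves the 10.0.0.0/16 VPC range into subnets with Python's standard `ipaddress` module. The /24 subnet size and the choice of the first two blocks are assumptions for illustration only; the generated template may size and place its subnets differently.

```python
import ipaddress

# 10.0.0.0/16 is the VPC CIDR listed in the table above.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Assumption: /24 subnets, first block public and second block private.
subnets = vpc.subnets(new_prefix=24)
public_subnet = next(subnets)
private_subnet = next(subnets)

# AWS reserves five addresses in every subnet (the first four and the last).
usable_per_subnet = private_subnet.num_addresses - 5

print(vpc.num_addresses)              # 65536 addresses in the whole VPC
print(public_subnet, private_subnet)  # the two example subnets
print(usable_per_subnet)              # addresses left for nodes in one /24
```

If you pin the cluster to your own VPC, the same check (`subnet.subnet_of(vpc)`) confirms that the head node and compute node subnets fall inside the VPC range.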
Azure
- Open Clusters: Click Clusters, then Prepare Cluster.
- Choose type: Select Slurm and click Continue.
- Configure the cluster:
  - Enter a Cluster Name (max 27 characters, must be unique).
  - Select your Azure Cloud Account.
- Submit: Click Prepare Cluster. Azure Slurm clusters use Vantage-managed defaults for node configuration and networking. Partitions are configured post-creation from the Partitions tab on the cluster detail page.
GCP
- Open Clusters: Click Clusters, then Prepare Cluster.
- Choose type: Select Slurm and click Continue.
- Configure the cluster:
  - Enter a Cluster Name (max 27 characters, must be unique).
  - Select your GCP Cloud Account.
- Submit: Click Prepare Cluster. GCP Slurm clusters use Vantage-managed defaults for node configuration and networking. Partitions are configured post-creation from the cluster detail page.
Cudo Compute
- Open Clusters: Click Clusters, then Prepare Cluster.
- Choose type: Select Slurm and click Continue.
- Configure the cluster:
  - Enter a Cluster Name (max 27 characters, must be unique).
  - Select your Cudo Compute Cloud Account.
- Submit: Click Prepare Cluster. Cudo Slurm clusters use Vantage-managed defaults for node configuration and networking. Partitions are configured post-creation.
On-premises / LXD
On-premises clusters connect through a lightweight agent deployed on your infrastructure. Vantage does not provision cloud resources — you provide the compute.
- Open Clusters: Click Clusters, then Prepare Cluster.
- Choose type: Select Slurm and click Continue.
- Configure:
  - Enter a Cluster Name.
  - Select your On-Premises or LXD cloud account.
- Get the agent command: The wizard shows a Vantage Agent installation command. Copy it.
- Install the agent: Run the installation command on your cluster's head node (or on multiple nodes). The agent establishes an outbound HTTPS connection to Vantage, so no inbound firewall rules are required.
- Watch it connect: The cluster flips to `ready` once the agent is reporting. Nodes appear on the detail page as they register.
The agent only needs outbound HTTPS access to Vantage servers (port 443). If your cluster is behind a firewall, ensure outbound connectivity is not blocked.
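Before installing the agent, you may want to confirm that the head node can open an outbound connection on port 443. The following is a minimal sketch using only the standard library; the function name `can_reach` is hypothetical, and the hostname you pass must be the Vantage endpoint shown in your installation command, which this page does not specify.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Calling `can_reach("vantage.example.com")` (a placeholder hostname) returns `False` when a firewall drops or rejects the connection. The check covers TCP reachability only; it does not validate TLS certificates or HTTP proxy settings.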
What happens after submission
After you submit the creation form, the cluster immediately enters `preparing` status. The exact provisioning steps depend on the provider:
- AWS: Vantage generates a CloudFormation template and calls `create_stack`. The stack provisions the VPC, subnets, IAM roles, and EC2 instances asynchronously. Once the head node boots, the Vantage Agent registers the node and uploads the Slurm configuration. The cluster then transitions to `ready`.
- Non-AWS cloud: Vantage provisions infrastructure through your provider's API. The cluster transitions to `ready` once provisioning completes.
- On-premises: Vantage creates the database record and waits for the agent to connect. The cluster transitions to `ready` when the agent first phones home.
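The status lifecycle above lends itself to a simple polling loop. The sketch below makes no assumptions about the Vantage API itself: `get_status` is a hypothetical callable that you supply, returning the current status string however you obtain it. Only the `preparing` and `ready` values come from this page; any other value is returned to the caller as-is.

```python
import time
from typing import Callable


def wait_until_ready(
    get_status: Callable[[], str],
    timeout_s: float = 1800.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns something other than 'preparing'.

    Returns the final status. Raises TimeoutError once the next poll
    would fall past the deadline while the cluster is still preparing.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status != "preparing":
            return status
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError("cluster is still 'preparing' at the deadline")
        sleep(interval_s)
```

The `sleep` parameter exists so the loop can be exercised in tests without waiting. The default timeout and interval are arbitrary starting points, not documented Vantage values.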
Poll the cluster status from the Clusters list or via the API. Start with low maximum node counts, because idle provisioned nodes bill at full rate; you can raise the limits later from the Partitions tab.
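To see why a low maximum is a sensible starting point, it helps to work out the hourly bill if every partition scaled to its limit. The sketch below is plain arithmetic under stated assumptions: the `Partition` class is hypothetical, the prices are made-up figures rather than real provider rates, and the head node is left out of the total.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    max_nodes: int
    hourly_price_usd: float  # per node; hypothetical figure, not a real price


def worst_case_hourly_cost(partitions: list[Partition]) -> float:
    """Hourly cost if every partition is scaled to its maximum node count."""
    return sum(p.max_nodes * p.hourly_price_usd for p in partitions)


partitions = [
    Partition("compute", max_nodes=10, hourly_price_usd=0.50),
    Partition("gpu", max_nodes=2, hourly_price_usd=3.00),
]
print(worst_case_hourly_cost(partitions))  # 10 * 0.50 + 2 * 3.00 = 11.0
```

Halving the `compute` limit to 5 nodes in this example lowers the ceiling from 11.00 to 8.50 USD per hour, which is the kind of trade-off to weigh before raising limits.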