On-Premises clusters
On-premises clusters run on infrastructure you control — bare-metal servers, local VMs, or LXD containers. Vantage supports multiple methods for connecting your infrastructure, each suited to a different use case and level of automation.
Prerequisites
All on-premises methods require:
- A Vantage account and organization (Sign Up)
- A configured On-Premises or LXD Cloud Account
- Outbound HTTPS access (port 443) from your infrastructure to Vantage servers
Ansible
Use the vantage-agents Ansible role to automate connecting your Slurm cluster to Vantage. The role installs and configures both vantage-agent and jobbergate-agent on your controller node.
Prerequisites
- An existing Slurm cluster with SSH access to the controller node
- Ansible 2.9+ installed on your workstation
- OIDC credentials from your Vantage cluster (client ID and client secret)
1. Create a Slurm cluster in Vantage
- Open Clusters: Click Clusters in the left sidebar, click Slurm in the cluster type navigation, then click Prepare Cluster.
- Configure the cluster:
  - Enter a Cluster Name (max 27 characters, must be unique).
  - Select your On-Premises or LXD cloud account.
- Submit: Click Create Cluster. Note the OIDC Client ID and OIDC Client Secret from the cluster detail page; you will need these in the next step.
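The 27-character limit on the Cluster Name can be checked before you open the form. The following is a minimal sketch, not part of Vantage tooling: valid_cluster_name is a hypothetical helper that tests only the length, since uniqueness can only be verified by Vantage itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks only the 27-character limit stated above.
# Uniqueness within your organization is not checked here.
valid_cluster_name() {
  local name="$1"
  # Reject empty names and names longer than 27 characters.
  [ -n "$name" ] && [ "${#name}" -le 27 ]
}

if valid_cluster_name "myslurmcluster"; then
  echo "name length ok"
else
  echo "name must be 1-27 characters" >&2
fi
```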
2. Install the Ansible role
ansible-galaxy install git+https://github.com/vantagecompute/ansible-role-vantage_agents.git,,vantage_agents
3. Create a playbook
- hosts: util_node
  become: true
  vars:
    oidc_client_id: "<your-oidc-client-id>"
    oidc_client_secret: "<your-oidc-client-secret>"
    cluster_name: "<your-cluster-name>"
    install_type: "snap" # or "pypi"
  roles:
    - role: vantage_agents
4. Run the playbook
ansible-playbook -i your_inventory your_playbook.yml
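The playbook targets a host group named util_node, so the inventory passed with -i needs a group of that name. A minimal sketch of generating one follows; render_inventory is a hypothetical helper, and the host name and SSH user are placeholders for your own controller node.

```shell
#!/usr/bin/env bash
# render_inventory HOST USER
# Prints a one-host INI inventory whose group matches "hosts: util_node".
render_inventory() {
  local host="$1" user="$2"
  printf '[util_node]\n%s ansible_user=%s\n' "$host" "$user"
}

# Placeholder controller address and SSH user; replace with your own.
render_inventory "controller.example.com" "ubuntu"
```

Redirect the output to a file (for example `render_inventory controller.example.com ubuntu > your_inventory`) and pass that file to ansible-playbook. If you prefer not to store the OIDC secret in the playbook, Ansible's `-e key=value` option supplies extra variables that take precedence over the values under vars.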
5. Verify the connection
Once the playbook completes:
- The cluster transitions from preparing to ready in the Vantage UI
- Nodes appear in the cluster detail page as they register
- You can submit jobs and launch notebooks immediately
The Ansible role supports both Snap (default) and PyPI installation methods. Set install_type: "pypi" in your playbook vars to use Python virtual environments with systemd services instead of Snap packages.
Terraform
Use the terraform-vantage-agents module to automate connecting your Slurm cluster to Vantage. The module SSH-es into your controller node and installs both vantage-agent and jobbergate-agent.
Prerequisites
- An existing Slurm cluster with SSH access to the controller node
- Terraform 1.0+ installed on your workstation
- OIDC credentials from your Vantage cluster (client ID and client secret)
1. Create a Slurm cluster in Vantage
- Open Clusters: Click Clusters in the left sidebar, click Slurm in the cluster type navigation, then click Prepare Cluster.
- Configure the cluster:
  - Enter a Cluster Name (max 27 characters, must be unique).
  - Select your On-Premises or LXD cloud account.
- Submit: Click Create Cluster. Note the OIDC Client ID and OIDC Client Secret from the cluster detail page; you will need these in the next step.
2. Configure the Terraform module
module "vantage_agents" {
  source = "github.com/vantagecompute/terraform-vantage-agents"
  ssh_user = var.ssh_user
  ssh_private_key = file(var.ssh_private_key)
  host = var.host
  oidc_client_id = var.oidc_client_id
  oidc_client_secret = var.oidc_client_secret
  cluster_name = var.cluster_name
  # Optional: set to "pypi" for Python venv + systemd instead of Snap
  # install_type = "pypi"
}
3. Apply the configuration
terraform init
terraform apply
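The module block reads its inputs from var.* references, so those values have to be supplied when Terraform runs. One option is Terraform's TF_VAR_<name> environment variables, sketched below. This assumes your root configuration declares variables with the same names as in the module block; the SSH user, key path, and host shown are placeholders.

```shell
#!/usr/bin/env bash
# Terraform reads an environment variable named TF_VAR_<name> as the value
# of the input variable <name>.
export TF_VAR_ssh_user="ubuntu"                        # placeholder SSH user
export TF_VAR_ssh_private_key="$HOME/.ssh/id_ed25519"  # placeholder key path (read with file())
export TF_VAR_host="controller.example.com"            # placeholder controller address
export TF_VAR_cluster_name="myslurmcluster"
# Take the OIDC values from the current environment instead of a file.
export TF_VAR_oidc_client_id="${OIDC_CLIENT_ID:-}"
export TF_VAR_oidc_client_secret="${OIDC_CLIENT_SECRET:-}"

# Preview the changes only when Terraform is installed and the directory
# has already been initialised with "terraform init".
if command -v terraform >/dev/null 2>&1 && [ -d .terraform ]; then
  terraform plan
fi
```

Keeping the secret in the environment avoids writing it to a .tfvars file, although it may still end up in the Terraform state.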
4. Verify the connection
Once Terraform completes:
- The cluster transitions from preparing to ready in the Vantage UI
- Nodes appear in the cluster detail page as they register
- You can submit jobs and launch notebooks immediately
The Terraform module supports both Snap (default) and PyPI installation methods. Set install_type = "pypi" to use Python virtual environments with systemd services instead of Snap packages.
Kubernetes
Connect an existing Kubernetes cluster to Vantage. Vantage does not provision cloud resources for on-premises K8s — you provide the compute infrastructure.
Prerequisites
- An existing Kubernetes cluster with a reachable API server
- A configured On-Premises or LXD cloud account in Vantage
- Outbound HTTPS access (port 443) from your cluster to Vantage servers
1. Create a Kubernetes cluster in Vantage
- Open Clusters: Click Clusters in the left sidebar (the Kubernetes view is shown by default), then click Prepare Cluster.
- Configure:
  - Enter a Cluster Name.
  - Select your On-Premises or LXD cloud account.
- Submit: Click Create Cluster. The modal closes and you are redirected to the cluster view.
2. Connect your cluster
After creating the cluster, the cluster detail page shows that the connection is not yet established and provides instructions for connecting your infrastructure.
Follow the on-screen instructions to install the Vantage connector on your cluster. The connector only needs outbound HTTPS access to Vantage servers (port 443). If your cluster is behind a firewall, ensure outbound connectivity is not blocked.
Once the connector is active:
- The cluster transitions from preparing to ready
- Nodes appear in the cluster detail page as they register
- You can launch notebooks and deploy workloads immediately
Multipass and Juju on-premises clusters only support Slurm, not Kubernetes. For Kubernetes on your own hardware, use this method.
Troubleshooting
Cluster stays in "preparing"
- Verify the connector was installed and is running on your control plane node.
- Check that port 443 outbound is not blocked by a firewall.
- Confirm the cloud account credentials are valid.
Nodes not appearing
- Install the connector on each worker node, not just the control plane.
- Check that each node can reach Vantage servers over HTTPS.
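Both troubleshooting lists above come back to outbound reachability on port 443. A minimal sketch of a TCP-level check to run on a node follows. It assumes bash with /dev/tcp support and the coreutils timeout command, and it does not validate TLS or HTTP. This guide does not state the Vantage server hostname, so VANTAGE_HOST is a placeholder you need to set yourself.

```shell
#!/usr/bin/env bash
# can_reach_https HOST
# Returns 0 when a TCP connection to HOST:443 opens within 5 seconds.
can_reach_https() {
  local host="$1"
  # bash opens a TCP socket when redirecting to /dev/tcp/<host>/<port>.
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/443"' _ "$host" 2>/dev/null
}

# Placeholder default; replace with the host named in your connector instructions.
host="${VANTAGE_HOST:-vantage.invalid}"
if can_reach_https "$host"; then
  echo "outbound 443 to $host: ok"
else
  echo "outbound 443 to $host: blocked or unreachable"
fi
```

A failure here points at DNS, routing, or a firewall rule rather than at the connector itself.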
Juju
Deploy a production-like Slurm cluster on your local machine using Juju charms and LXD containers. Charmed HPC provides a full HPC stack — controller, compute, database, REST API, login node, and Vantage integration — all running inside LXD containers on a single host.
Prerequisites
- Ubuntu 22.04 or 24.04 (host machine)
- LXD installed and initialised (lxd init)
- Juju snap installed
- Internet access to download charms and container images
1. Bootstrap a Juju controller
juju bootstrap lxd
This creates a controller named lxd-default inside an LXD container. It manages all models and workloads.
2. Add a model for your Slurm cluster
juju add-model myslurmcluster
Models are logical environments. Name yours whatever you like — myslurmcluster is used throughout this guide.
3. Deploy the required charms
Use Ubuntu 24.04 as the base for Slurm charms. MySQL uses Ubuntu 22.04 (charm default).
juju deploy slurmctld --base ubuntu@24.04 --channel 25.11/edge
juju deploy slurmdbd --base ubuntu@24.04 --channel 25.11/edge
juju deploy mysql --channel 8.0/stable
juju deploy slurmrestd --base ubuntu@24.04 --channel 25.11/edge
juju deploy slurmd --base ubuntu@24.04 --channel 25.11/edge
juju deploy sackd --base ubuntu@24.04 --channel 25.11/edge
juju deploy jobbergate-agent --channel latest/edge --base ubuntu@24.04
juju deploy vantage-agent --channel latest/edge --base ubuntu@24.04
4. Add relations (integrations)
juju relate slurmctld slurmdbd
juju relate slurmctld slurmrestd
juju relate slurmctld slurmd
juju relate slurmdbd mysql
juju relate sackd slurmctld
juju relate jobbergate-agent sackd
juju relate vantage-agent sackd
These connections enable:
- Accounting and configuration sharing
- REST API access
- Compute node registration
- Vantage integration with Slurm
5. Configure the charms
Compute node initial state
Set slurmd nodes to start in idle (ready to run jobs) or down (manual resume). Use lowercase only:
juju config slurmd default-node-state=idle
OIDC credentials (for Vantage integration)
Replace the example values with your real OIDC client ID and secret.
juju config jobbergate-agent jobbergate-agent-oidc-client-id=your-client-id
juju config jobbergate-agent jobbergate-agent-oidc-client-secret=your-client-secret
juju config vantage-agent vantage-agent-oidc-client-id=your-client-id
juju config vantage-agent vantage-agent-oidc-client-secret=your-client-secret
Vantage cluster name
juju config vantage-agent vantage-agent-cluster-name=myslurmcluster
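Because juju config accepts several key=value pairs in one call, the five commands above can be grouped per application. The sketch below wraps them in a hypothetical configure_agents function; JUJU_CMD is an illustrative override, not a Juju feature, used here to print the commands instead of running them.

```shell
#!/usr/bin/env bash
# configure_agents CLIENT_ID CLIENT_SECRET CLUSTER_NAME
# Sets each agent's options with one "juju config" call per application.
configure_agents() {
  local id="$1" secret="$2" cluster="$3"
  # JUJU_CMD is a hypothetical override; it defaults to the real juju client.
  local juju=${JUJU_CMD:-juju}
  $juju config jobbergate-agent \
    "jobbergate-agent-oidc-client-id=$id" \
    "jobbergate-agent-oidc-client-secret=$secret"
  $juju config vantage-agent \
    "vantage-agent-oidc-client-id=$id" \
    "vantage-agent-oidc-client-secret=$secret" \
    "vantage-agent-cluster-name=$cluster"
}

# Dry run: print the two commands instead of executing them.
JUJU_CMD="echo juju" configure_agents "your-client-id" "your-client-secret" "myslurmcluster"
```

Drop the JUJU_CMD prefix to apply the settings to the current model.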
6. Monitor deployment
juju status --watch 1s
Wait until all applications show active and all units are idle.
Expected status:
| Application | Expected status |
|---|---|
| slurmctld, slurmdbd, mysql, slurmd, slurmrestd, sackd | active |
| jobbergate-agent, vantage-agent | active (once OIDC and cluster name are set) |
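Instead of watching the status output by eye, the check can be scripted. The following is a rough sketch that assumes jq is installed and relies on the application-status field in the output of juju status --format=json. The all_apps_active helper is hypothetical, and it looks only at application status, not at whether each unit is idle.

```shell
#!/usr/bin/env bash
# all_apps_active: reads "juju status --format=json" on stdin and succeeds
# only when every application reports the status "active". Requires jq.
all_apps_active() {
  jq -e '[.applications[] | .["application-status"].current] | all(. == "active")' >/dev/null
}

# Poll every 10 seconds, for up to 10 minutes, when juju and jq are available.
if command -v juju >/dev/null 2>&1 && command -v jq >/dev/null 2>&1; then
  for ((i = 0; i < 60; i++)); do
    if juju status --format=json | all_apps_active; then
      echo "all applications active"
      break
    fi
    sleep 10
  done
fi
```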
7. Verify the Slurm cluster
SSH into the controller unit (machine 0):
juju ssh 0
Check the partition and nodes:
sinfo
You should see the slurmd partition and at least one node in idle state.
If the node is down, resume it:
sudo scontrol update nodename=slurmd-0 state=resume
Submit a test job:
sbatch -p slurmd --wrap="sleep 10; hostname"
squeue
After a few seconds, squeue should show the job running (R) then disappear. Check the output:
cat slurm-*.out
Exit the controller:
exit
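The submit-and-check sequence above can also be scripted from the controller shell. This sketch uses sbatch --parsable to capture the job ID and polls squeue until the job leaves the queue; wait_for_job is a hypothetical helper, and the partition name slurmd is taken from the steps above.

```shell
#!/usr/bin/env bash
# wait_for_job JOB_ID [INTERVAL_SECONDS]
# Polls squeue until the job no longer appears in the queue.
wait_for_job() {
  local job_id="$1" interval="${2:-2}"
  # -h drops the header, so empty output means the job has left the queue.
  while [ -n "$(squeue -h -j "$job_id" 2>/dev/null)" ]; do
    sleep "$interval"
  done
}

# Submit and wait only where the Slurm client tools are available.
if command -v sbatch >/dev/null 2>&1; then
  # --parsable prints the job ID, followed by ";cluster" on multi-cluster setups.
  job_id=$(sbatch --parsable -p slurmd --wrap="sleep 10; hostname")
  job_id="${job_id%%;*}"
  wait_for_job "$job_id"
  cat "slurm-${job_id}.out"
fi
```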
Troubleshooting
Deployment failures
juju status
juju debug-log --include slurmctld
Container issues
lxc list
lxc info <container-name> --show-log
lxc restart <container-name>
Network connectivity
lxc network show lxdbr0
lxc exec <container-name> -- ping google.com
Slurm services not running
juju ssh slurmctld/0
sudo systemctl status slurmctld
sudo systemctl status slurmdbd
sudo systemctl restart slurmctld slurmdbd
sudo journalctl -u slurmctld -f
Next steps
- Submit jobs to your cluster
- Create a Multipass cluster — for simpler single-node setups
- Connect manually — for production on-premises deployments
Multipass
Deploy a single-node Slurm cluster in a Multipass VM on your local machine. Multipass provides lightweight Ubuntu VMs managed through a simple CLI — ideal for development, testing, and learning Vantage without cloud infrastructure.
Multipass clusters are created and managed entirely through the terminal using the Vantage CLI.
Use Multipass when you want to:
- Try Vantage without cloud infrastructure
- Develop and test Slurm job scripts locally
- Learn HPC workflows on your laptop
For a multi-node HPC environment, consider Juju (Charmed HPC) instead. For connecting existing production hardware, use the Manual method.
Prerequisites
- A Vantage account and organization (Sign Up)
- A Linux, macOS, or Windows machine with at least 8 GB RAM and 20 GB disk space
- Snap package manager (Linux) or Homebrew (macOS)
1. Install the Vantage CLI
pip install vantage-cli
Verify the installation:
vantage version
Log in to your Vantage account:
vantage login
2. Install Multipass
# Linux (via snap)
sudo snap install multipass
# macOS (via Homebrew)
brew install --cask multipass
# Windows — download the installer from https://multipass.run/install
Verify the installation:
multipass --version
3. Deploy the cluster
Deploy a single-node Slurm cluster:
vantage app deploy slurm-multipass-localhost
Monitor the deployment progress:
vantage app status slurm-multipass-localhost
Wait until the VM is running and Slurm services are ready:
multipass list
The output should show the VM in a Running state.
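Waiting for the Running state can be automated by reading the State line from multipass info. This is a minimal sketch with hypothetical helper names; the VM name slurm-node matches the one used in the following steps, and the check covers the VM state only, not whether the Slurm services inside it are up.

```shell
#!/usr/bin/env bash
# vm_state NAME: prints the State field reported by "multipass info".
vm_state() {
  multipass info "$1" 2>/dev/null | awk '/^State:/ {print $2}'
}

# wait_for_vm NAME [ATTEMPTS]: succeeds once the VM reports "Running",
# checking every 2 seconds for the given number of attempts (default 30).
wait_for_vm() {
  local name="$1" attempts="${2:-30}" i
  for ((i = 0; i < attempts; i++)); do
    if [ "$(vm_state "$name")" = "Running" ]; then
      return 0
    fi
    sleep 2
  done
  return 1
}

if command -v multipass >/dev/null 2>&1; then
  if wait_for_vm slurm-node; then
    echo "slurm-node is running"
  else
    echo "slurm-node did not reach Running in time" >&2
  fi
fi
```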
Custom resources
By default, the cluster uses 4 CPUs, 8 GB RAM, and 50 GB disk. Customize resources during deployment:
vantage app deploy slurm-multipass-localhost \
--cpus=4 \
--memory=8G \
--disk=50G
4. Access the cluster
Connect to the VM:
multipass shell slurm-node
Or use SSH:
ssh ubuntu@$(multipass info slurm-node --format json | jq -r '.info."slurm-node".ipv4[0]')
5. Verify Slurm
Once connected to the VM, verify that Slurm is running:
sinfo
squeue
Submit a test job:
srun --nodes=1 --ntasks=1 hostname
Or submit a batch job:
sbatch <<EOF
#!/bin/bash
#SBATCH --job-name=hello-world
#SBATCH --output=hello-output.txt
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:01:00
echo "Hello from Slurm on Multipass!"
hostname
date
EOF
Check the job output:
cat hello-output.txt
Managing the cluster
| Action | Command |
|---|---|
| Start the cluster | vantage app start slurm-multipass-localhost or multipass start slurm-node |
| Stop the cluster | vantage app stop slurm-multipass-localhost or multipass stop slurm-node |
| Restart the cluster | vantage app restart slurm-multipass-localhost or multipass restart slurm-node |
| Delete the cluster | vantage app delete slurm-multipass-localhost or multipass delete slurm-node --purge |
| View cluster info | vantage app info slurm-multipass-localhost or multipass info slurm-node |
File transfer
# Copy files to the VM
multipass transfer myfile.txt slurm-node:/home/ubuntu/
# Copy files from the VM
multipass transfer slurm-node:/home/ubuntu/results.txt ./
# Mount a local directory in the VM
multipass mount ./workspace slurm-node:/mnt/workspace
Troubleshooting
VM won't start
# Check Multipass status
multipass version
sudo snap logs multipass
# Restart Multipass
sudo snap restart multipass
# Check system resources
free -h
df -h
Network issues
# Check VM connectivity
multipass exec slurm-node -- ping google.com
# Reset network
multipass stop slurm-node
multipass start slurm-node
Slurm services not running
# SSH into the VM
multipass shell slurm-node
# Check service status
sudo systemctl status slurmctld
sudo systemctl status slurmd
sudo systemctl status slurmdbd
# Restart services
sudo systemctl restart slurmctld slurmd slurmdbd
# Check logs
sudo journalctl -u slurmctld -f
Next steps
- Submit jobs to your cluster
- Create a Juju cluster — for multi-node HPC environments
- Connect manually — for production on-premises deployments
Manual
Connect your existing Slurm or Kubernetes cluster to Vantage by manually installing the Vantage connector on your infrastructure. This method gives you full control over the installation process and works with any Linux server that meets the prerequisites.
Prerequisites
- A Vantage account and organization (Sign Up)
- A configured On-Premises or LXD Cloud Account
- Outbound HTTPS access (port 443) from your infrastructure to Vantage servers
- SSH access to your cluster nodes
Create a Slurm cluster
- Open Clusters: Click Clusters in the left sidebar, click Slurm in the cluster type navigation, then click Prepare Cluster.
- Configure the cluster:
  - Enter a Cluster Name (max 27 characters, must be unique).
  - Select your On-Premises or LXD cloud account.
- Submit: Click Create Cluster. A success toast confirms the cluster was created. The modal closes and you are redirected to the cluster view.
Create a Kubernetes cluster
- Open Clusters: Click Clusters in the left sidebar (the Kubernetes view is shown by default), then click Prepare Cluster.
- Configure:
  - Enter a Cluster Name.
  - Select your On-Premises or LXD cloud account.
- Submit: Click Create Cluster. The modal closes and you are redirected to the cluster view.
Connect your cluster
After creating the cluster, the cluster detail page shows a message that the connection is not yet established and provides a link to installation instructions. Follow those instructions to install the Vantage connector on your cluster node.
The connector only needs outbound HTTPS access to Vantage servers (port 443). If your cluster is behind a firewall, ensure outbound connectivity is not blocked.
Once the connector is active:
- The cluster transitions from preparing to ready
- Nodes appear in the cluster detail page as they register
- You can submit jobs and launch notebooks immediately
On-premises clusters report their location as configured by your admin. Partitions and compute pools are configured post-creation from the cluster detail page.
Troubleshooting
Cluster stays in "preparing"
- Verify the connector was installed and is running on your head node.
- Check that port 443 outbound is not blocked by a firewall.
- Confirm the cloud account credentials are valid.
Nodes not appearing
- Install the connector on each compute node, not just the head node.
- Check that each node can reach Vantage servers over HTTPS.
Which method should I use?
- Ansible — You manage your Slurm infrastructure with Ansible and want automated, repeatable deployments.
- Terraform — You manage your Slurm infrastructure with Terraform and want infrastructure-as-code deployments.
- Kubernetes — You have an existing Kubernetes cluster and want to connect it to Vantage.
- Juju (Charmed HPC) — You want a multi-node Slurm environment that simulates a production HPC cluster, running locally in LXD containers.
- Multipass — You want a quick, single-node Slurm environment on your laptop for development and testing.
- Manual — You have existing servers or bare-metal hardware and want full control over the installation process. Supports both Slurm and Kubernetes.
Ansible, Terraform, and Juju clusters are configured through their respective tooling. Manual clusters are configured through the Vantage web UI. Multipass clusters are created through the Vantage CLI.