Mount storage in endpoints
Endpoints can mount PersistentVolumeClaims (PVCs) to serve models or data directly from persistent storage instead of pulling them from a container registry on every pod restart. This is useful for large model files that are expensive to download.
Prerequisites
- A Kubernetes cluster in ready status
- A PVC containing the model or data files (see Work with PVCs)
- A registered model (see Register a model)
Mount storage during endpoint creation
- Navigate to Workbench > Endpoints and click Deploy Endpoint.
- Configure the model, runtime, and sizing as usual.
- In the Storage section of the creation wizard, click Add Volume.
- Select the PVC containing your model files.
- Set the Mount path (for example, /models).
- In the runtime configuration, point the model path to the mount location:
  MODEL_PATH=/models/my-model.pt
- Complete the rest of the form and click Deploy.
The endpoint pods mount the PVC at startup. When the endpoint scales out, all replicas share the same PVC, so the PVC must use the ReadOnlyMany or ReadWriteMany access mode.
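As an illustration of how serving code might consume the mounted volume, the sketch below reads MODEL_PATH and fails fast if the file is not present on the mount. The helper name `resolve_model_path` is hypothetical and not part of the platform; it assumes the runtime exposes MODEL_PATH as an environment variable, as in the example above.

```python
import os
from pathlib import Path


def resolve_model_path(default: str = "/models/my-model.pt") -> Path:
    """Return the model file path taken from MODEL_PATH.

    Hypothetical helper: falls back to the example path used in this guide
    and raises early if the file is not visible on the mounted PVC.
    """
    path = Path(os.environ.get("MODEL_PATH", default))
    if not path.is_file():
        raise FileNotFoundError(
            f"Model file not found at {path}; "
            "check the PVC mount path and the MODEL_PATH setting."
        )
    return path
```

Checking the path at startup surfaces a wrong mount path or a missing file immediately, rather than on the first inference request.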
Access mode considerations
| Access mode | Works with endpoints? | Notes |
|---|---|---|
| ReadWriteOnce | Single replica only | Pod is locked to one node |
| ReadOnlyMany | Yes (recommended) | Multiple replicas, read-only access |
| ReadWriteMany | Yes | Multiple replicas, read-write access |
For serving, ReadOnlyMany is usually sufficient and avoids write contention.
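The table can also be expressed as a small pre-deployment check. This is only a sketch that transcribes the table above; the function name `supports_replicas` is hypothetical and is not a platform API.

```python
# Access modes from the table above, grouped by replica support.
MULTI_REPLICA_MODES = {"ReadOnlyMany", "ReadWriteMany"}
SINGLE_REPLICA_MODES = {"ReadWriteOnce"}


def supports_replicas(access_mode: str, replicas: int) -> bool:
    """Return True if a PVC with this access mode can back `replicas` endpoint pods."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if access_mode in MULTI_REPLICA_MODES:
        return True
    if access_mode in SINGLE_REPLICA_MODES:
        # ReadWriteOnce is listed as single replica only.
        return replicas == 1
    raise ValueError(f"Unknown access mode: {access_mode}")
```

For example, `supports_replicas("ReadWriteOnce", 3)` returns `False`, while `supports_replicas("ReadOnlyMany", 3)` returns `True`.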
Tip: If your storage class does not support ReadOnlyMany, consider using an NFS-backed PVC, which supports multi-reader access by default.