Pods

The running instances behind your inference endpoints.

Pods are the running containers that serve inference requests for your endpoints. Each endpoint manages one or more pods — the endpoint's autoscaler creates and destroys pods based on traffic.

The pods list

The list view shows every pod across all endpoints in your workspace. Each row displays:

  • Name — the pod's identifier.
  • Endpoint — which endpoint this pod belongs to.
  • Status — current phase (Running, Pending, Failed).
  • Ready — whether the pod is accepting requests.
  • Restarts — restart count.
  • Age — how long the pod has been running.

Click a pod to see its full specification, conditions, and events.

When to check pods

  • Endpoint not serving — check if pods are stuck in Pending or CrashLoopBackOff.
  • High latency — check pod resource utilization and readiness probes.
  • Debugging — pod logs and events give you container-level detail.
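
The checks above can be sketched as a small triage routine. This is only an illustration, not a Vantage Compute API: the `PodRow` class, its field names, the `triage` function, and the `restart_threshold` default are all hypothetical, modeled on the columns shown in the pods list.

```python
from dataclasses import dataclass


@dataclass
class PodRow:
    """One row of the pods list (hypothetical field names)."""
    name: str
    endpoint: str
    status: str    # e.g. "Running", "Pending", "Failed", "CrashLoopBackOff"
    ready: bool
    restarts: int


def triage(pods, restart_threshold=3):
    """Return (pod name, reason) pairs for pods worth inspecting.

    restart_threshold is an assumed cutoff, not a product default.
    """
    findings = []
    for pod in pods:
        if pod.status in ("Pending", "Failed", "CrashLoopBackOff"):
            # Pod is not serving at all: check its events first.
            findings.append((pod.name, f"status is {pod.status}"))
        elif not pod.ready:
            # Running but not accepting requests: check readiness probes.
            findings.append((pod.name, "running but not ready"))
        elif pod.restarts >= restart_threshold:
            # Serving, but unstable: check container logs.
            findings.append((pod.name, f"{pod.restarts} restarts"))
    return findings


# Made-up sample rows for illustration only.
pods = [
    PodRow("chat-7f9c", "chat", "Running", True, 0),
    PodRow("chat-2b1d", "chat", "Pending", False, 0),
    PodRow("embed-91aa", "embed", "Running", False, 1),
    PodRow("embed-c03e", "embed", "Running", True, 5),
]
for name, reason in triage(pods):
    print(f"{name}: {reason}")
```

The order of the checks mirrors the list above: a pod that is not running is flagged before readiness or restart counts are considered, since those only matter once the pod has started.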
Note: Pods are managed by the endpoint's autoscaler. You don't create or delete pods directly; they're created and terminated automatically based on traffic and scaling policy.
