Pods
Pods are the running containers that serve inference requests for your endpoints. Each endpoint manages one or more pods, and its autoscaler creates and destroys them based on traffic and scaling policy.
The pods list
The list view shows every pod across all endpoints in your workspace. Each row displays:
- Name — the pod's identifier.
- Endpoint — which endpoint this pod belongs to.
- Status — current phase (Running, Pending, Failed).
- Ready — whether the pod is accepting requests.
- Restarts — number of times the pod's containers have restarted.
- Age — time elapsed since the pod was created.
Click a pod to see its full specification, conditions, and events.
When to check pods
- Endpoint not serving — check whether pods are stuck in Pending or restarting repeatedly (CrashLoopBackOff).
- High latency — check pod resource utilization and readiness probes.
- Debugging — pod logs and events give you container-level detail.
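The checks above can be sketched as a small triage function over rows of the pods list. This is only an illustration: the `PodRow` type, its field names, and the thresholds (`pending_grace_seconds`, `max_restarts`) are hypothetical and are not part of any documented API. They simply mirror the columns shown in the list view.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PodRow:
    """One row of the pods list (hypothetical shape mirroring the columns)."""
    name: str
    endpoint: str
    status: str        # e.g. "Running", "Pending", "Failed", "CrashLoopBackOff"
    ready: bool
    restarts: int
    age_seconds: int


def needs_attention(
    pod: PodRow,
    pending_grace_seconds: int = 300,  # assumed threshold, tune to your workload
    max_restarts: int = 3,             # assumed threshold, tune to your workload
) -> Optional[str]:
    """Return a short reason if the pod looks unhealthy, otherwise None."""
    if pod.status in ("Failed", "CrashLoopBackOff"):
        return f"status is {pod.status}"
    if pod.status == "Pending" and pod.age_seconds > pending_grace_seconds:
        return "stuck in Pending"
    if pod.status == "Running" and not pod.ready:
        return "running but not ready"
    if pod.restarts > max_restarts:
        return f"restarted {pod.restarts} times"
    return None


# Example rows with made-up values.
rows = [
    PodRow("pod-a", "demo", "Running", True, 0, 3600),
    PodRow("pod-b", "demo", "Pending", False, 0, 900),
    PodRow("pod-c", "demo", "Running", False, 1, 120),
]

for pod in rows:
    reason = needs_attention(pod)
    if reason:
        print(f"{pod.name} ({pod.endpoint}): {reason}")
```

With these sample rows, `pod-b` is flagged as stuck in Pending and `pod-c` as running but not ready. Pods flagged this way are the ones worth opening for their conditions, events, and logs.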
Note: Pods are managed by the endpoint's autoscaler. You don't create or delete pods directly; they're created and terminated automatically based on traffic and scaling policy.
Next steps
- Deploying an endpoint — how endpoints are created
- Autoscaling — how pod count adjusts to traffic