Pods

Pods are the running containers that serve inference requests for your endpoints. Each endpoint manages one or more pods, and the endpoint's autoscaler creates and destroys pods based on traffic.

The pods list

The list view shows every pod across all endpoints in your workspace. Each row displays:

Name: the pod's identifier.
Endpoint: which endpoint this pod belongs to.
Status: current phase (Running, Pending, Failed).
Ready: whether the pod is accepting requests.
Restarts: restart count.
Age: how long the pod has existed.

Click a pod to see its full specification, conditions, and events.

When to check pods

Endpoint not serving: check if pods are stuck in Pending or CrashLoopBackOff.
High latency: check pod resource utilization and readiness probes.
Debugging: pod logs and events give you container-level detail.
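
The checks above can be sketched as a small triage function over the rows of the pods list. This is only an illustration: the `PodRow` record, its field names, the `needs_attention` helper, and the restart threshold of 3 are all assumptions made for the example, not part of any product API.

```python
from dataclasses import dataclass


@dataclass
class PodRow:
    """One row of the pods list. Field names are hypothetical, chosen to mirror the columns."""
    name: str
    endpoint: str
    status: str    # e.g. "Running", "Pending", "Failed", "CrashLoopBackOff"
    ready: bool
    restarts: int


def needs_attention(pod: PodRow, max_restarts: int = 3) -> list[str]:
    """Return the reasons a pod is worth inspecting; an empty list means it looks healthy.

    max_restarts is an arbitrary example threshold, not a documented default.
    """
    reasons = []
    # A pod that is not running cannot serve requests.
    if pod.status in {"Pending", "Failed", "CrashLoopBackOff"}:
        reasons.append(f"status is {pod.status}")
    # A running pod that is not ready is not accepting requests.
    if pod.status == "Running" and not pod.ready:
        reasons.append("running but not ready")
    # Frequent restarts usually mean the container keeps crashing.
    if pod.restarts > max_restarts:
        reasons.append(f"{pod.restarts} restarts")
    return reasons


# Example rows with made-up names.
pods = [
    PodRow("pod-a", "endpoint-1", "Running", True, 0),
    PodRow("pod-b", "endpoint-1", "Pending", False, 0),
    PodRow("pod-c", "endpoint-2", "Running", False, 5),
]

for pod in pods:
    for reason in needs_attention(pod):
        print(f"{pod.name} ({pod.endpoint}): {reason}")
```

Pods flagged this way are the ones to open in the detail view, where the conditions, events, and logs explain why the pod is not serving.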

info

Pods are managed by the endpoint's autoscaler. You don't create or delete pods directly; they are created and terminated automatically based on traffic and scaling policy.

Next steps

Deploying an endpoint: how endpoints are created
Autoscaling: how pod count adjusts to traffic
