Skip to content

/devices/verify returns generic 500 when dashboard assets are missing instead of a clear error #4603

@strickvl

Description

@strickvl

When a ZenML server is running without bundled dashboard assets, hitting the browser device-auth route:

  • /devices/verify?...

can return a generic:

{"detail":"An unexpected error occurred."}

with a server-side TemplateNotFound: 'index.html' traceback.

This makes device authorization failures much harder to diagnose and leaves the CLI stuck polling with authorization_pending.

Why this matters

The underlying problem may be packaging/build-related (missing dashboard assets), but the current runtime behavior hides that root cause behind an opaque 500.

That creates a confusing user story:

  1. CLI starts OAuth device flow successfully
  2. browser opens /devices/verify?...
  3. user sees generic error JSON
  4. CLI keeps polling /api/v1/login
  5. CLI reports authorization_pending / device still pending
  6. actual root cause is only visible in server logs

Reproduction

This was reproduced with a local server setup where the installed ZenML package did not include:

  • zenml/zen_server/dashboard/index.html

Observed behavior:

  • GET /health -> 200
  • GET / -> 404
  • GET /devices/verify?device_id=test&user_code=test -> 500 {"detail":"An unexpected error occurred."}

Server logs showed:

jinja2.exceptions.TemplateNotFound: 'index.html' not found in search path: '/opt/venv/lib/python3.11/site-packages/zenml/zen_server/dashboard'

Root cause

From code inspection, the problem appears to be:

  • zen_server_api.py root route (/) checks whether dashboard/index.html exists and returns 404 if it does not.
  • But the catch-all route used for SPA paths like /devices/verify directly calls TemplateResponse("index.html", ...) without first checking whether the file exists.
  • Middleware then converts the resulting exception into the generic:
    • {"detail":"An unexpected error occurred."}

So the runtime discovers “dashboard assets missing” late and opaquely.

Expected behavior

If dashboard assets are missing, routes like /devices/verify should fail with a clear, explicit error, for example:

  • 404 Dashboard assets are not installed
  • or 500 ZenML dashboard bundle missing from installation

Anything would be better than the current generic 500.

Suggested fix

One of these would help:

  1. Make the catch-all route perform the same existence check that / already performs before rendering index.html
  2. Return a clearer error message when dashboard assets are absent
  3. Optionally fail earlier at startup if dashboard-backed routes are enabled but dashboard assets are missing

Additional context

This showed up while debugging a local Docker login flow on kitaru, but the issue seems broader than Kitaru specifically: any ZenML server installation missing dashboard assets could hit the same opaque failure mode.

Metadata

Metadata

Assignees

No one assigned

    Labels

    core-teamIssues that are being handled by the core team

    Projects

    Status

    No status

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions