
WordPress import: media endpoint returns absolute URLs, breaking internal-media recognition and port/origin changes #657

@shinobiworks

Description

The WordPress import media endpoint (/_emdash/api/import/wordpress/media) builds absolute URLs for imported media by reading request.url directly, instead of returning the relative /_emdash/api/media/file/{storageKey} form used by every other media endpoint.

In packages/core/src/astro/routes/api/import/wordpress/media.ts:

const url = new URL(requestUrl);
const baseUrl = `${url.protocol}//${url.host}`;
// ...
const newUrl = `${baseUrl}/_emdash/api/media/file/${storageKey}`;

This diverges from the codebase's convention for internal media URLs:

| Endpoint | URL form |
| --- | --- |
| api/media.ts (listing) | /_emdash/api/media/file/{key} (relative) |
| api/media/upload-url.ts | /_emdash/api/media/file/{key} (relative) |
| api/media/[id]/confirm.ts | /_emdash/api/media/file/{key} (relative) |
| api/import/wordpress/media.ts | ${baseUrl}/_emdash/api/media/file/{key} (absolute) |

This matters for two reasons:

  1. media/normalize.ts only recognizes the relative prefix as internal media. INTERNAL_MEDIA_PREFIX = "/_emdash/api/media/file/" is matched with url.startsWith(INTERNAL_MEDIA_PREFIX), so an absolute URL like http://localhost:4321/_emdash/api/media/file/... fails that check and gets treated as an external URL, bypassing the local provider's enrichment (dimensions, storage key, etc.).

  2. The absolute URL is pinned to whatever origin request.url had at import time. In dev, Astro auto-increments the port when 4321 is in use (so the import runs on 4322); rendering later on 4321 then hits ERR_CONNECTION_REFUSED. Behind a reverse proxy, request.url is the internal origin (http://localhost:4321) rather than the public origin, which is the same scenario api/public-url.ts explicitly warns about. The getPublicOrigin(url, config) helper exists precisely to resolve config.siteUrl → EMDASH_SITE_URL env → url.origin, but the WP import path never calls it.
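To make the first point concrete, here is a minimal sketch of the prefix check as described above. The constant value is quoted from this report; the helper name isInternalMediaUrl is hypothetical, and the real media/normalize.ts may be structured differently.

```typescript
// Value as quoted in this report; the surrounding code is an illustration only.
const INTERNAL_MEDIA_PREFIX = "/_emdash/api/media/file/";

// Hypothetical helper name: wraps the startsWith check described above.
function isInternalMediaUrl(url: string): boolean {
  return url.startsWith(INTERNAL_MEDIA_PREFIX);
}

const relativeUrl = "/_emdash/api/media/file/01KPFPPG9XSST1WM1RR1BMGN4G.png";
const absoluteUrl = `http://localhost:4321${relativeUrl}`;

console.log(isInternalMediaUrl(relativeUrl)); // true: treated as internal media
console.log(isInternalMediaUrl(absoluteUrl)); // false: falls through to the external path
```

The absolute form contains the prefix but does not start with it, which is why the import output is never recognized as internal.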

The existingUrl branch (content-hash dedup reuse) has the same issue.
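For reference, the fallback order attributed to getPublicOrigin above (config.siteUrl, then EMDASH_SITE_URL, then url.origin) could look roughly like the sketch below. This is not the actual helper from api/public-url.ts: the function name, the OriginConfig type, and passing the env value as a parameter are all assumptions made to keep the example self-contained.

```typescript
// Hypothetical config shape; the real EmDash config type is not shown in this issue.
interface OriginConfig {
  siteUrl?: string;
}

// Sketch of the fallback chain: config.siteUrl -> EMDASH_SITE_URL -> request origin.
// envSiteUrl stands in for the EMDASH_SITE_URL environment variable.
function resolvePublicOriginSketch(
  requestUrl: URL,
  config: OriginConfig,
  envSiteUrl?: string,
): string {
  // Empty strings are treated as "not set".
  const configured = config.siteUrl || envSiteUrl;
  // URL#origin drops any path or trailing slash from the configured value.
  return configured ? new URL(configured).origin : requestUrl.origin;
}

const internalUrl = new URL("http://localhost:4321/_emdash/api/import/wordpress/media");

console.log(resolvePublicOriginSketch(internalUrl, {}));
// "http://localhost:4321" (what the import path effectively uses today)
console.log(resolvePublicOriginSketch(internalUrl, { siteUrl: "https://example.com/" }));
// "https://example.com"
console.log(resolvePublicOriginSketch(internalUrl, {}, "https://cms.example.com"));
// "https://cms.example.com"
```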

Possible fixes (if a fix is welcome):

  1. Minimal (preferred): drop the baseUrl prefix entirely and return /_emdash/api/media/file/${storageKey}, matching the other media endpoints. The requestUrl parameter then becomes unused and can be removed. Renders resolve the URL against whatever origin serves the content, and normalize.ts recognizes it as internal.
  2. Alternative: route URL generation through getPublicOrigin(url, config) from api/public-url.ts. Yields a stable absolute URL when config.siteUrl / EMDASH_SITE_URL is set, but the relative form already covers the same need and is simpler.
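A rough sketch of what option 1 amounts to, assuming both the new-upload branch and the existingUrl dedup branch go through one helper. The helper name internalMediaUrl, the urlMap shape, and the example WordPress URL are hypothetical; this is not the actual patch.

```typescript
// Hypothetical helper: builds the same relative form the other media endpoints return.
function internalMediaUrl(storageKey: string): string {
  return `/_emdash/api/media/file/${storageKey}`;
}

// Assumed shape: original WordPress URL -> internal URL.
const urlMap: Record<string, string> = {};

// New upload and content-hash dedup reuse would both use the same helper,
// so neither branch depends on request.url.
urlMap["https://old.example.com/wp-content/uploads/photo.png"] =
  internalMediaUrl("01KPFPPG9XSST1WM1RR1BMGN4G.png");

console.log(urlMap);
```

Because no origin is involved, nothing import-time-specific ends up persisted by the later rewrite-urls step.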

I have a working local patch for option 1 I can turn into a PR if that direction is acceptable.

Steps to reproduce

  1. Start a fresh EmDash dev server (emdash@0.4.0) with some other process already bound to :4321, so Astro falls back to :4322. (Equivalent setup: any reverse-proxied deployment where the internal origin differs from the public origin.)
  2. POST /_emdash/api/setup/dev-bypass?token=1 to get a PAT.
  3. Run the WP import endpoints in order (analyze → prepare → execute → media → rewrite-urls) against a WXR with at least one post that references an attachment image.
  4. Inspect the media response urlMap — every value is http://localhost:4322/_emdash/api/media/file/{key}.
  5. Stop the dev server, start it again, this time on :4321.
  6. Visit a post that embeds an imported image.

Expected: the image loads (URL is relative so it resolves against the current origin).
Actual: the browser requests http://localhost:4322/_emdash/api/media/file/{key} and gets ERR_CONNECTION_REFUSED. The DB content has the import-time origin baked in.

Additionally: media/normalize.ts never enriches these values because the absolute form fails the INTERNAL_MEDIA_PREFIX check.
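The difference between Expected and Actual comes down to standard URL resolution, which can be checked with the WHATWG URL constructor. The page path /posts/hello below is a made-up example; the ports and key are the ones from this report.

```typescript
const storedRelative = "/_emdash/api/media/file/01KPFPPG9XSST1WM1RR1BMGN4G.png";
const importOrigin = "http://localhost:4322"; // origin at import time
const pageUrl = "http://localhost:4321/posts/hello"; // hypothetical page served later

// A relative URL resolves against whatever page embeds it.
const resolvedRelative = new URL(storedRelative, pageUrl).href;

// An absolute URL ignores the base entirely, so the import-time origin sticks.
const resolvedAbsolute = new URL(`${importOrigin}${storedRelative}`, pageUrl).href;

console.log(resolvedRelative);
// http://localhost:4321/_emdash/api/media/file/01KPFPPG9XSST1WM1RR1BMGN4G.png
console.log(resolvedAbsolute);
// http://localhost:4322/_emdash/api/media/file/01KPFPPG9XSST1WM1RR1BMGN4G.png
```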

Environment

  • emdash: 0.4.0
  • @emdash-cms/cloudflare: 0.4.0
  • astro: 6.1.6
  • Node.js: 22.22.2
  • OS: Linux (Docker sandbox)
  • Template: starter-cloudflare

Logs / error output

GET http://localhost:4322/_emdash/api/media/file/01KPFPPG9XSST1WM1RR1BMGN4G.png
    net::ERR_CONNECTION_REFUSED


No server-side errors; the endpoint returns `success: true` and the `urlMap` with absolute URLs is persisted into content by the subsequent `rewrite-urls` call.


Labels: bug
