Skip to content

Thumbnail management#436

Open
aliceif wants to merge 11 commits intofedify-dev:mainfrom
aliceif:thumbnail_management
Open

Thumbnail management#436
aliceif wants to merge 11 commits intofedify-dev:mainfrom
aliceif:thumbnail_management

Conversation

@aliceif
Copy link
Copy Markdown
Contributor

@aliceif aliceif commented Apr 19, 2026

I decided to implement a way to clean up old thumbnails from storage.

Idea here:

  • provide a cutoff date
  • only delete thumbnails of remote posts that have not been ...
    • bookmarked
    • liked
    • reacted to
    • shared
    • quoted
  • ... by any user on the server

This is being done in a new page in the admin area titled "Thumbnail Cleanup":
image

The media table entry stays intact, so that the information of media existing does not get lost. In addition, some clients seem to (be able to) ignore thumbnails.

I have already tested it with both a locally hosted testing environment and my productive instance, on which it let me reclaim over 3GB out of 10GB storage used according to minio.

Related issue: #409

@dahlia dahlia self-assigned this Apr 21, 2026
@dahlia dahlia added enhancement New feature or request labels Apr 21, 2026
@dahlia dahlia linked an issue Apr 21, 2026 that may be closed by this pull request
@dahlia dahlia modified the milestone: Hollo 0.8 Apr 21, 2026
Copy link
Copy Markdown
Member

@dahlia dahlia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for your work! Since this is a new feature, could you change the target of this pull request to branch main instead of stable?

@aliceif aliceif changed the base branch from stable to main April 21, 2026 09:11
@aliceif aliceif force-pushed the thumbnail_management branch from dc1f74a to 5a06003 Compare April 21, 2026 21:53
@aliceif
Copy link
Copy Markdown
Contributor Author

aliceif commented Apr 21, 2026

and, after some effort, rebased onto main...

@aliceif aliceif requested a review from dahlia April 21, 2026 22:00
Copy link
Copy Markdown
Member

@dahlia dahlia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, thanks for your contribution! I left several review comments.

Also, could you add a changelog item to the CHANGES.md document?

Comment thread src/cleanup/processors.ts
);
}

const key = medium.thumbnailUrl.replace(STORAGE_URL_BASE as string, "");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the key derivation is off for S3-style storage URLs. If STORAGE_URL_BASE is something like https://bucket.example.com, replacing the prefix here leaves /media/.../thumbnail.webp, but the object was uploaded under media/.../thumbnail.webp without the leading slash. disk.delete() will miss the real object, so the job can look successful without actually reclaiming anything. Could we derive the key from the URL pathname and normalize the leading slash away before deleting?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not sure about this. I did try this on a filesystem-based install in my home network and also my productive instance which uses minio, and it worked on both, but my ASSETS_URL_BASE there is https://minio-hollo.x27.one/hollo-live/ - with trailing slash and a sub directory.
Wouldn't the best approach be to normalize the STORAGE_URL_BASE to include (or not include) a trailing slash (maybe in the storage part of the codebase?). Seems simpler than trying to reverse what media/${id}/thumbnail.webp does in media#uploadThumbnail with string manipulation.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think normalizing STORAGE_URL_BASE would help, but I'd still prefer to derive the delete key from the URL pathname here.

The worker needs the exact storage key that was uploaded, and simple prefix replacement feels a bit fragile once STORAGE_URL_BASE can vary by trailing slash or include a subpath. Parsing the URL and normalizing from there seems safer to me.

Comment thread src/cleanup/processors.ts

const disk = drive.use();
await disk.delete(key);
await db
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the file is removed we still keep serving medium.thumbnailUrl everywhere. serializeMedium() and the web UI both use it as-is, so any cleaned item will point clients at a dead local thumbnail and show a broken preview. I think we need to either clear or fall back the thumbnail URL here, or teach the renderers to ignore thumbnails once thumbnailCleaned is true.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is something that needs figuring out before this is fully released - for my immediate needs breaking thumbnails felt file, but we will need actual handling for it.
In the mastodon API, preview URL is declared as nullable, so we can send that to the clients, if thumbnailCleaned is set (and probably also clear the meta:small properties) - they should know their preferred way of handling missing thumbnails: https://docs.joinmastodon.org/entities/MediaAttachment/#preview_url
For the post renderer that is built into hollo, I am not sure. The non-thumbnail url could be fallen back on, but that may be an external resource and could still disappear, or we could use a placeholder?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me.

For the Mastodon API, returning preview_url: null when thumbnailCleaned is set sounds like the right direction, and clearing the related thumbnail metadata at the same time would keep the response consistent.

For Hollo's own renderer, I don't think we need a perfect UX solution in this PR, but we do need to stop emitting a dead local thumbnail URL. Even a simple first pass such as “don't render the thumbnail when it has been cleaned” would be better than serving broken images.

Comment thread src/pages/thumbnail_cleanup.tsx
Comment thread src/cleanup/worker.ts Outdated
Comment thread src/pages/thumbnail_cleanup.tsx Outdated
Comment thread src/pages/thumbnail_cleanup.tsx Outdated
Comment thread src/components/DashboardLayout.tsx Outdated
Comment thread src/pages/thumbnail_cleanup.tsx Outdated
@aliceif aliceif requested a review from dahlia April 22, 2026 06:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

How to clear the cache?

2 participants