Add image processors refactor to v5 migration guide #45556
Open
yonigozlan wants to merge 1 commit into huggingface:main
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
stevhliu approved these changes on Apr 21, 2026
| ### Image Processors |
| The old slow/fast dual-file design — a PIL-based `image_processing_<model>.py` paired with a torchvision-based `image_processing_<model>_fast.py` — has been replaced with a named-backend architecture: |
Member
Suggested change
| The old slow/fast dual-file design has been replaced with a named-backend architecture. Each model previously had a PIL-based `image_processing_<model>.py` and a torchvision-based `image_processing_<model>_fast.py`. The new layout is: |
| - `image_processing_<model>.py` → **torchvision** backend (default; was previously `FooImageProcessorFast`) |
| - `image_processing_pil_<model>.py` → **PIL** backend (was previously `FooImageProcessor`) |
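For readers migrating their own files, the rename rule in the two bullets above can be sketched as a small helper. This is only an illustration of the naming scheme: `migrated_filename` is a hypothetical function, not part of `transformers`, and it assumes the input is an old-style filename.

```python
import re

# Hypothetical helper (not a transformers API): maps an old-style image
# processor filename to the name it would have under the new layout.
_OLD_NAME = re.compile(r"image_processing_(?P<model>\w+?)(?P<fast>_fast)?\.py")


def migrated_filename(old_name: str) -> str:
    match = _OLD_NAME.fullmatch(old_name)
    if match is None:
        raise ValueError(f"not an image processor filename: {old_name!r}")
    model = match["model"]
    if match["fast"]:
        # Old torchvision ("fast") file becomes the default file.
        return f"image_processing_{model}.py"
    # Old PIL ("slow") file gains a `pil_` infix.
    return f"image_processing_pil_{model}.py"


print(migrated_filename("image_processing_vit_fast.py"))
print(migrated_filename("image_processing_vit.py"))
```

Note that the helper cannot tell an old PIL file from a file that has already been migrated, so it should only be applied to names from the old layout.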
| Processor classes now inherit from `TorchvisionBackend` or `PilBackend` (defined in `image_processing_backends.py`), which provide ready-made implementations of all standard operations (`resize`, `rescale`, `normalize`, `center_crop`, `pad`) and a default `_preprocess` pipeline. `BaseImageProcessor` (in `image_processing_utils`) handles the shared preprocessing boilerplate — kwargs validation, default-filling from class attributes, and input preparation — so model-specific processors contain only what is genuinely unique to the model. Most processors now simply inherit from a backend and declare class-attribute defaults; only processors with custom logic (e.g. patch tiling) need to override `_preprocess`. |
Member
Suggested change
| Processor classes now inherit from `TorchvisionBackend` or `PilBackend` (defined in `image_processing_backends.py`), which provide ready-made implementations of all standard operations (`resize`, `rescale`, `normalize`, `center_crop`, `pad`) and a default `_preprocess` pipeline. `BaseImageProcessor` (in `image_processing_utils`) handles shared preprocessing boilerplate: kwargs validation, default-filling from class attributes, and input preparation. Model-specific processors contain only what is unique to the model. Most processors inherit from a backend and declare class-attribute defaults. Only those with custom logic (e.g. patch tiling) need to override `_preprocess`. |
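A toy sketch of the inheritance pattern this paragraph describes may help readers picture it. Everything below is a stand-in written for illustration: `ToyBaseImageProcessor`, `ToyBackend`, and the two model classes are hypothetical, operate on plain lists of numbers, and do not reproduce the real `BaseImageProcessor`, `TorchvisionBackend`, or `PilBackend` signatures.

```python
class ToyBaseImageProcessor:
    """Stand-in for the shared base: validates kwargs and fills defaults."""

    do_rescale = True
    rescale_factor = 1 / 255
    do_normalize = True
    image_mean = 0.5
    image_std = 0.5
    _valid_kwargs = ("do_rescale", "rescale_factor", "do_normalize", "image_mean", "image_std")

    def preprocess(self, images, **kwargs):
        unknown = set(kwargs) - set(self._valid_kwargs)
        if unknown:
            raise TypeError(f"unexpected kwargs: {sorted(unknown)}")
        # Any option the caller did not pass falls back to the class attribute.
        settings = {name: kwargs.get(name, getattr(self, name)) for name in self._valid_kwargs}
        return self._preprocess([list(image) for image in images], **settings)


class ToyBackend(ToyBaseImageProcessor):
    """Stand-in for a backend: standard operations plus a default pipeline."""

    def rescale(self, image, factor):
        return [pixel * factor for pixel in image]

    def normalize(self, image, mean, std):
        return [(pixel - mean) / std for pixel in image]

    def _preprocess(self, images, do_rescale, rescale_factor, do_normalize, image_mean, image_std):
        processed = []
        for image in images:
            if do_rescale:
                image = self.rescale(image, rescale_factor)
            if do_normalize:
                image = self.normalize(image, image_mean, image_std)
            processed.append(image)
        return processed


class ToyModelImageProcessor(ToyBackend):
    """Typical case: only class-attribute defaults, no custom logic."""

    image_mean = 0.0
    image_std = 1.0


class ToyTilingImageProcessor(ToyBackend):
    """Custom case: overrides `_preprocess` to split each image into tiles."""

    tile_size = 2

    def _preprocess(self, images, **settings):
        processed = super()._preprocess(images, **settings)
        size = self.tile_size
        return [[image[i:i + size] for i in range(0, len(image), size)] for image in processed]


print(ToyModelImageProcessor().preprocess([[0, 255]]))
print(ToyTilingImageProcessor().preprocess([[0, 255, 255, 0]], do_normalize=False))
```

The point of the split is that `ToyModelImageProcessor` carries no methods at all, while `ToyTilingImageProcessor` reuses the default pipeline and adds only its model-specific step.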
| - Minor change: `XXXFastImageProcessorKwargs` is removed in favor of `XXXImageProcessorKwargs` which will be shared between fast and slow processors (https://github.com/huggingface/transformers/pull/40931) |
| ### Image Processors |
Member
Suggested change
| ### Image processors |
What does this PR do?
As discussed internally @vasqu
Cc @stevhliu