
Drop noisy generate warnings when do_sample=False (or num_beams=1) #45559

Open
ArthurZucker wants to merge 5 commits into huggingface:main from ArthurZucker:drop-sampling-flag-warnings

Conversation


@ArthurZucker ArthurZucker commented Apr 22, 2026

Summary

GenerationConfig.validate() was warning about sampling-only flags (temperature, top_p, top_k, min_p, top_h, typical_p, epsilon_cutoff, eta_cutoff) whenever do_sample was not True, and about beam-only flags (early_stopping, length_penalty) whenever num_beams == 1 -- even when those values were inherited from the model's generation_config.json. In practice, nearly every popular Hub model ships with a non-default temperature/top_p, so users got a warning for every generate(do_sample=False) call.

This PR threads a user_set_attributes set through GenerationConfig.__init__ and update(), so validate() can distinguish attributes the caller explicitly provided from values inherited from the model's default config. The sampling-only and beam-only warnings now only fire for user-set attributes.

Behavior

  • generate(do_sample=False) on a model whose hub config has temperature=0.6, top_p=0.9: silent (values inherited, not user intent).
  • generate(do_sample=False, top_p=0.8): warns about top_p (user explicitly set both -- almost certainly a mistake).
  • GenerationConfig(do_sample=False, temperature=0.5): warns (both explicit).
  • generation_config.validate(strict=True) (e.g. from save_pretrained) with no user_set_attributes preserves the original "refuse to save bad configs" behavior.
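The provenance tracking described above can be sketched in isolation. This is a toy stand-in, not the actual transformers implementation; only `user_set_attributes` and `_is_user_set` are names taken from the PR, and the class, defaults, and warning plumbing here are illustrative assumptions:

```python
# Toy sketch (NOT the transformers source) of gating validation warnings on
# a `user_set_attributes` provenance set, as this PR does.
import warnings

class ToyGenerationConfig:
    def __init__(self, **kwargs):
        # Defaults, standing in for values loaded from a model's generation_config.json.
        self.do_sample = False
        self.temperature = 1.0
        self.top_p = 1.0
        # Record which attributes the caller explicitly provided.
        self.user_set_attributes = set(kwargs)
        for name, value in kwargs.items():
            setattr(self, name, value)
        self.validate()

    def update(self, **kwargs):
        # kwargs forwarded from a generate() call also count as user intent.
        self.user_set_attributes |= set(kwargs)
        for name, value in kwargs.items():
            setattr(self, name, value)
        self.validate()

    def validate(self):
        def _is_user_set(name):
            return name in self.user_set_attributes

        # Warn about a sampling-only flag only when BOTH `do_sample` and the
        # flag itself were explicitly set by the caller.
        if self.do_sample is not True and _is_user_set("do_sample"):
            for flag in ("temperature", "top_p"):
                if _is_user_set(flag):
                    warnings.warn(
                        f"`do_sample` is set to `{self.do_sample}`. However, `{flag}` is set to "
                        f"`{getattr(self, flag)}` -- this flag is only used in sample-based generation modes."
                    )

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cfg = ToyGenerationConfig(do_sample=False)  # inherited temperature/top_p: silent
    cfg.update(do_sample=False, top_p=0.8)      # explicit conflict: warns about top_p
print(len(caught))  # 1
```

Inherited defaults never trip the warning because they are absent from `user_set_attributes`; only the explicit `do_sample=False` + `top_p=0.8` combination does.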

cc @Cyrilvallez

ArthurZucker and others added 4 commits April 22, 2026 12:02
Setting `do_sample=False` is an explicit request for greedy decoding; at
that point we don't care if `temperature`, `top_p`, `top_k`, ... are set
-- they typically come from the model's `generation_config.json` and
have no effect. Same story for `early_stopping` / `length_penalty` when
`num_beams=1`. Remove these "minor issue" warnings from validate() and
update the affected tests to exercise a warning that still fires
(return_dict_in_generate=False + output_scores=True).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous version silenced the warnings entirely when `do_sample=False`.
That was too coarse: if a user explicitly passes `do_sample=False` AND
`top_p=0.8`, they really did make a mistake and deserve the warning.

Thread a `user_set_attributes` set through `GenerationConfig.__init__`
and `update()` so `validate()` can tell which attributes were explicitly
provided by the caller vs. inherited from the model's
`generation_config.json`. The sampling-only and beam-only flag warnings
now fire only for user-set attributes:

- `generate(do_sample=False)` on a model whose hub config has
  `temperature=0.6, top_p=0.9`: silent (values inherited).
- `generate(do_sample=False, top_p=0.8)`: warns about `top_p`.
- `GenerationConfig(do_sample=False, temperature=0.5)`: warns (both
  explicit).
- Direct `validate()` / `validate(strict=True)` calls with no
  `user_set_attributes` preserve the original behavior -- so
  `save_pretrained` still refuses bad configs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Gate section 2.1/2.2 on `do_sample` / `num_beams` themselves being in
`user_set_attributes`. A non-sampling mode inherited from the hub is not
an explicit user intent to skip sampling, so their sampling kwargs
shouldn't be flagged. The warning now fires only when the user provides
both `do_sample=False` (or `num_beams=1`) AND a conflicting flag in the
same context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers every scenario the user flagged:
- hub default sets sampling flags, user does only `generate(do_sample=False)` -> silent
- user sets both `do_sample=False` and `top_p=0.8` -> warn
- hub default forces `do_sample=False`, user sets only `top_p=0.8` -> silent
- analogous cases for `num_beams=1` + beam-only flags

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.


@zucchini-nlp zucchini-nlp left a comment


In general I agree, users don't want to see warnings if they just turn off sampling with one flag. Though I am not sure about the reverse case, when they pass sampling params and the default config has do_sample=False. It's kind of misleading if we silently fall back to greedy generation, no?

Comment on lines -639 to +660

```diff
-        if self.do_sample is not True:
+        #
+        # The warning is suppressed for flags that weren't explicitly set by the caller (see `_is_user_set`): values
+        # inherited from a model's `generation_config.json` are harmless when the user opts into greedy decoding.
+        # We also require `do_sample` itself to be user-set -- otherwise the non-sampling mode was inherited and the
+        # user never expressed intent to skip sampling, so flagging their sampling kwargs would be misleading.
+        if self.do_sample is not True and _is_user_set("do_sample"):
             greedy_wrong_parameter_msg = (
-                "`do_sample` is set not to set `True`. However, `{flag_name}` is set to `{flag_value}` -- this flag is only "
-                "used in sample-based generation modes. You should set `do_sample=True` or unset `{flag_name}`."
+                "`do_sample` is set to `{do_sample}`. However, `{flag_name}` is set to `{flag_value}` -- this flag is "
+                "only used in sample-based generation modes. You should set `do_sample=True` or unset `{flag_name}`."
```
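As a quick standalone check of the new wording: the template string is taken from the added lines of this hunk, while the `format()` call and sample flag values are illustrative.

```python
# The replacement message template from the diff above; formatting it shows
# the corrected wording ("is set to `False`" rather than the old garbled
# "is set not to set `True`").
greedy_wrong_parameter_msg = (
    "`do_sample` is set to `{do_sample}`. However, `{flag_name}` is set to `{flag_value}` -- this flag is "
    "only used in sample-based generation modes. You should set `do_sample=True` or unset `{flag_name}`."
)
rendered = greedy_wrong_parameter_msg.format(do_sample=False, flag_name="top_p", flag_value=0.8)
print(rendered)
```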

great opportunity to move to hub-dataclass validation 😄

Comment on lines +182 to +188

```python
        # Inverse provenance case: `do_sample` inherited from a model's config (so not user-set this call), user only
        # sets a sampling flag. The conflict shouldn't produce noise because the user never asked for greedy.
        logger.warning_once.cache_clear()
        greedy_hub_config = GenerationConfig(do_sample=False)  # mimics a model's default config forcing greedy
        with CaptureLogger(logger) as captured_logs:
            greedy_hub_config.update(top_p=0.8)
        self.assertEqual(len(captured_logs.out), 0)
```

I am not sure about this one. Beginner users might expect this to just work and sample with top-p, while we silently fall back to greedy.

Comment on lines 711 to 723
# 2.4. check `num_return_sequences`
if self.num_return_sequences is not None and self.num_return_sequences > 1:
if self.num_beams is None or self.num_beams == 1:
if not self.do_sample:
raise ValueError(
"Greedy methods (do_sample != True) without beam search do not support "
f"`num_return_sequences` different than 1 (got {self.num_return_sequences})."
)
elif (
self.num_beams is not None
and self.num_return_sequences is not None
and self.num_return_sequences > self.num_beams
):
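For context, the check in this hunk can be exercised as a self-contained function. This is a sketch extracted from the logic above, not the transformers source: the function name is made up, and the second error message (for the `num_return_sequences > num_beams` branch, which the hunk cuts off) is paraphrased.

```python
# Standalone sketch of the section 2.4 check: greedy decoding without beam
# search can only return one sequence, and beam search can return at most
# `num_beams` sequences.
def check_num_return_sequences(num_return_sequences, num_beams, do_sample):
    if num_return_sequences is not None and num_return_sequences > 1:
        if num_beams is None or num_beams == 1:
            if not do_sample:
                raise ValueError(
                    "Greedy methods (do_sample != True) without beam search do not support "
                    f"`num_return_sequences` different than 1 (got {num_return_sequences})."
                )
        elif num_return_sequences > num_beams:
            # Paraphrased error for the branch truncated in the hunk above.
            raise ValueError(
                f"`num_return_sequences` ({num_return_sequences}) cannot exceed `num_beams` ({num_beams})."
            )

# Beam search and sampling both allow multiple sequences; pure greedy does not.
check_num_return_sequences(3, num_beams=5, do_sample=False)  # ok: beam search
check_num_return_sequences(3, num_beams=1, do_sample=True)   # ok: sampling
try:
    check_num_return_sequences(3, num_beams=1, do_sample=False)
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```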

So the user-set check applies only to greedy decoding -- no changes for the other params?

