Drop noisy generate warnings when do_sample=False (or num_beams=1) #45559
ArthurZucker wants to merge 5 commits into huggingface:main from
Conversation
Setting `do_sample=False` is an explicit request for greedy decoding; at that point we don't care if `temperature`, `top_p`, `top_k`, ... are set -- they typically come from the model's `generation_config.json` and have no effect. Same story for `early_stopping` / `length_penalty` when `num_beams=1`. Remove these "minor issue" warnings from validate() and update the affected tests to exercise a warning that still fires (return_dict_in_generate=False + output_scores=True). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous version silenced the warnings entirely when `do_sample=False`. That was too coarse: if a user explicitly passes `do_sample=False` AND `top_p=0.8`, they really did make a mistake and deserve the warning. Thread a `user_set_attributes` set through `GenerationConfig.__init__` and `update()` so `validate()` can tell which attributes were explicitly provided by the caller vs. inherited from the model's `generation_config.json`. The sampling-only and beam-only flag warnings now fire only for user-set attributes: - `generate(do_sample=False)` on a model whose hub config has `temperature=0.6, top_p=0.9`: silent (values inherited). - `generate(do_sample=False, top_p=0.8)`: warns about `top_p`. - `GenerationConfig(do_sample=False, temperature=0.5)`: warns (both explicit). - Direct `validate()` / `validate(strict=True)` calls with no `user_set_attributes` preserve the original behavior -- so `save_pretrained` still refuses bad configs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Gate section 2.1/2.2 on `do_sample` / `num_beams` themselves being in `user_set_attributes`. A non-sampling mode inherited from the hub is not an explicit user intent to skip sampling, so their sampling kwargs shouldn't be flagged. The warning now fires only when the user provides both `do_sample=False` (or `num_beams=1`) AND a conflicting flag in the same context. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers every scenario the user flagged: - hub default sets sampling flags, user does only `generate(do_sample=False)` -> silent - user sets both `do_sample=False` and `top_p=0.8` -> warn - hub default forces `do_sample=False`, user sets only `top_p=0.8` -> silent - analogous cases for `num_beams=1` + beam-only flags Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
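The provenance-tracking idea behind these commits can be sketched in plain Python, independent of transformers. `TinyGenerationConfig`, `SAMPLING_ONLY_FLAGS`, and `_is_user_set` below are hypothetical stand-ins for the real classes and helpers, and the defaults mimic values "inherited" from a model's `generation_config.json`:

```python
import warnings

SAMPLING_ONLY_FLAGS = ("temperature", "top_p", "top_k")  # subset, for illustration

class TinyGenerationConfig:
    """Minimal sketch (not the transformers API) of recording which attributes
    the caller explicitly provided vs. which came from defaults."""

    def __init__(self, **kwargs):
        # Defaults, as if loaded from a model's generation_config.json.
        self.do_sample = True
        self.temperature = 1.0
        self.top_p = 1.0
        self.top_k = 50
        # Provenance: only names the caller passed explicitly.
        self.user_set_attributes = set(kwargs)
        for name, value in kwargs.items():
            setattr(self, name, value)
        self.validate()

    def update(self, **kwargs):
        self.user_set_attributes |= set(kwargs)
        for name, value in kwargs.items():
            setattr(self, name, value)
        self.validate()

    def validate(self):
        def _is_user_set(name):
            return name in self.user_set_attributes

        # Warn only when the user explicitly asked for greedy decoding AND
        # explicitly set a conflicting sampling-only flag in the same context.
        if self.do_sample is not True and _is_user_set("do_sample"):
            for flag in SAMPLING_ONLY_FLAGS:
                if _is_user_set(flag):
                    warnings.warn(f"`{flag}` is only used when `do_sample=True`.")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    TinyGenerationConfig(do_sample=False)             # inherited sampling values: silent
    TinyGenerationConfig(do_sample=False, top_p=0.8)  # explicit conflict: warns
print(len(caught))  # 1
```

Only the second construction warns: in the first, `temperature`/`top_p`/`top_k` come from defaults rather than the caller, so the conflict is ignored.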
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
zucchini-nlp
left a comment
in general i agree, users don't want to see warnings if they just turn off sampling with one flag. Though I am not sure about the reverse case, when they pass sampling params and the default config has do_sample=False. Kinda misleading if we silently fall back to greedy generation, no?
```diff
-        if self.do_sample is not True:
+        #
+        # The warning is suppressed for flags that weren't explicitly set by the caller (see `_is_user_set`): values
+        # inherited from a model's `generation_config.json` are harmless when the user opts into greedy decoding.
+        # We also require `do_sample` itself to be user-set -- otherwise the non-sampling mode was inherited and the
+        # user never expressed intent to skip sampling, so flagging their sampling kwargs would be misleading.
+        if self.do_sample is not True and _is_user_set("do_sample"):
             greedy_wrong_parameter_msg = (
-                "`do_sample` is set not to set `True`. However, `{flag_name}` is set to `{flag_value}` -- this flag is only "
-                "used in sample-based generation modes. You should set `do_sample=True` or unset `{flag_name}`."
+                "`do_sample` is set to `{do_sample}`. However, `{flag_name}` is set to `{flag_value}` -- this flag is "
+                "only used in sample-based generation modes. You should set `do_sample=True` or unset `{flag_name}`."
```
great opportunity to move to hub-dataclass validation 😄
```python
        # Inverse provenance case: `do_sample` inherited from a model's config (so not user-set this call), user only
        # sets a sampling flag. The conflict shouldn't produce noise because the user never asked for greedy.
        logger.warning_once.cache_clear()
        greedy_hub_config = GenerationConfig(do_sample=False)  # mimics a model's default config forcing greedy
        with CaptureLogger(logger) as captured_logs:
            greedy_hub_config.update(top_p=0.8)
        self.assertEqual(len(captured_logs.out), 0)
```
i am not sure about this one. Beginner users might expect this to just work and sample with top-p, while we silently fall back to greedy
```python
        # 2.4. check `num_return_sequences`
        if self.num_return_sequences is not None and self.num_return_sequences > 1:
            if self.num_beams is None or self.num_beams == 1:
                if not self.do_sample:
                    raise ValueError(
                        "Greedy methods (do_sample != True) without beam search do not support "
                        f"`num_return_sequences` different than 1 (got {self.num_return_sequences})."
                    )
            elif (
                self.num_beams is not None
                and self.num_return_sequences is not None
                and self.num_return_sequences > self.num_beams
            ):
```
so the user-set check only applies to greedy decoding, no changes on other params?
Summary
`GenerationConfig.validate()` was warning about sampling-only flags (`temperature`, `top_p`, `top_k`, `min_p`, `top_h`, `typical_p`, `epsilon_cutoff`, `eta_cutoff`) whenever `do_sample` was not `True`, and about beam-only flags (`early_stopping`, `length_penalty`) whenever `num_beams == 1` -- even when those values were inherited from the model's `generation_config.json`. In practice, nearly every popular Hub model ships with a non-default `temperature`/`top_p`, so users got a warning for every `generate(do_sample=False)` call.

This PR threads a `user_set_attributes` set through `GenerationConfig.__init__` and `update()`, so `validate()` can distinguish attributes the caller explicitly provided from values inherited from the model's default config. The sampling-only and beam-only warnings now only fire for user-set attributes.

Behavior

- `generate(do_sample=False)` on a model whose hub config has `temperature=0.6, top_p=0.9`: silent (values inherited, not user intent).
- `generate(do_sample=False, top_p=0.8)`: warns about `top_p` (user explicitly set both -- almost certainly a mistake).
- `GenerationConfig(do_sample=False, temperature=0.5)`: warns (both explicit).
- `generation_config.validate(strict=True)` (e.g. from `save_pretrained`) with no `user_set_attributes` preserves the original "refuse to save bad configs" behavior.

cc @Cyrilvallez
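The warning policy described above reduces to a two-input rule. This is a sketch of that rule only (a hypothetical `should_warn` helper, not transformers code), assuming the effective config has `do_sample=False` in every case:

```python
def should_warn(do_sample_user_set: bool, flag_user_set: bool) -> bool:
    """Warn about a sampling-only flag only when the user explicitly chose
    greedy decoding AND explicitly set the conflicting flag."""
    return do_sample_user_set and flag_user_set

# hub sets temperature/top_p; user only passes do_sample=False -> silent
assert should_warn(do_sample_user_set=True, flag_user_set=False) is False
# user passes both do_sample=False and top_p=0.8 -> warn
assert should_warn(do_sample_user_set=True, flag_user_set=True) is True
# hub forces do_sample=False; user only passes top_p=0.8 -> silent
assert should_warn(do_sample_user_set=False, flag_user_set=True) is False
```

The last case is the one the reviewers question above: the conflict is real but neither silencing nor warning is obviously right, since the user's sampling flag is silently ignored.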