The Geometry of Catastrophic Forgetting: Scale and Batch Size Reveal a Two-Dimensional Behavioral Shift in Language Models
transformers pytorch trend-analysis fine-tuning batch-size catastrophic-forgetting perplexity forgetting huggingface huggingface-transformers hugging-face llm llms perplexity-ai qwen qwen2-5 model-scaling
Updated Apr 8, 2026 · Python