feat: add configurable k8s tolerations to all backend deploy templates #578

Open
jasonqinzhou wants to merge 1 commit into main from feat/k8s-tolerations-support

jasonqinzhou (Contributor) commented Mar 11, 2026

Adds optional K8sConfig.k8s_tolerations field to deployment_config.yaml and plumbs it through the TRT-LLM, vLLM, and SGLang k8s_deploy.yaml.j2 templates so generated manifests can target tainted node pools.

The field is guarded by a Jinja2 if block, so omitting it or setting it to an empty string produces no tolerations section. The default ships two common tolerations as examples:

  • dedicated=user-workload (GCP GKE node pools)
  • nvidia.com/gpu Exists (GPU-only nodes)
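
As a rough illustration of the guard described above, the sketch below renders a small stand-in template with Jinja2. The template text, the `render` helper, and the indentation width are assumptions made for illustration and are not the actual `k8s_deploy.yaml.j2` contents; only the `K8sConfig.k8s_tolerations` name is taken from this PR.

```python
from jinja2 import Template

# Hypothetical stand-in for the tolerations part of a k8s_deploy.yaml.j2 template.
# The real templates may use a different layout and indentation width.
TEMPLATE = Template(
    "extraPodSpec:\n"
    "{% if K8sConfig.k8s_tolerations %}"
    "  tolerations:\n"
    "{{ K8sConfig.k8s_tolerations | indent(4, True) }}\n"
    "{% endif %}"
)


def render(tolerations):
    """Render the stand-in template with the given tolerations value."""
    return TEMPLATE.render(K8sConfig={"k8s_tolerations": tolerations})


# A YAML fragment held as a multi-line string, like the shipped default.
GPU_TOLERATION = (
    "- key: nvidia.com/gpu\n"
    "  operator: Exists\n"
    "  effect: NoSchedule"
)

with_tolerations = render(GPU_TOLERATION)  # contains a tolerations section
without_tolerations = render("")           # the guard drops the section
```

An empty string, `None`, or a missing key are all falsy in Jinja2, so each of them skips the block in this sketch.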

Summary by CodeRabbit

Release Notes

  • New Features
    • Added support for configurable Kubernetes tolerations across all deployment backends (SGLang, TRT-LLM, vLLM). Users can now specify custom tolerations to control pod scheduling on nodes with specific taints. The defaults include tolerations for dedicated workloads and GPU nodes.

coderabbitai Bot commented Mar 11, 2026

Walkthrough

The pull request adds optional Kubernetes tolerations support across three backend deployment templates (sglang, trtllm, vllm) and introduces a new K8sConfig.k8s_tolerations configuration field. Tolerations are conditionally inserted into extraPodSpec sections when configured, enabling pods to tolerate specific node taints during scheduling.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Kubernetes Configuration Schema: src/aiconfigurator/generator/config/deployment_config.yaml | Adds new optional K8sConfig.k8s_tolerations field with default multi-line YAML tolerations list including dedicated workload and NVIDIA GPU taints. |
| Backend Deployment Templates: src/aiconfigurator/generator/config/backend_templates/sglang/k8s_deploy.yaml.j2, src/aiconfigurator/generator/config/backend_templates/trtllm/k8s_deploy.yaml.j2, src/aiconfigurator/generator/config/backend_templates/vllm/k8s_deploy.yaml.j2 | Adds conditional tolerations blocks to extraPodSpec sections across multiple rendering contexts (worker, Frontend), gated by K8sConfig.k8s_tolerations with appropriate indentation. No changes to existing pod spec sections. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 Hops through the templates with glee,
Tolerations now flow wild and free!
Taints beware, our pods shall not fail,
A carrot-patch scheduled without travail! 🥕✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely describes the main change: adding configurable Kubernetes tolerations to backend deployment templates across all three supported backends. |
| Description check | ✅ Passed | The description covers Overview and Details sections but is missing 'Where should the reviewer start?' and 'Related Issues' sections from the template. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/aiconfigurator/generator/config/deployment_config.yaml`:
- Around line 95-104: The runtime crash happens because templates call "{{
K8sConfig.k8s_tolerations | indent(10, True) }}" which requires a string but
user YAML may parse K8sConfig.k8s_tolerations as a list; fix by coercing lists
to a YAML multiline string before template rendering: detect when
K8sConfig.k8s_tolerations is a list in the template rendering pipeline (where
templates use the indent filter), serialize the list to a YAML block string
(preserve indentation/newlines), and replace the value with that string so the
Jinja2 indent filter receives a string; alternatively register a custom Jinja2
filter that yaml-dumps non-string input then calls indent, and ensure templates
use that filter consistently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a984c60b-e225-428c-8806-382edeeace24

📥 Commits

Reviewing files that changed from the base of the PR and between d00ef1a and aa1788a.

📒 Files selected for processing (4)
  • src/aiconfigurator/generator/config/backend_templates/sglang/k8s_deploy.yaml.j2
  • src/aiconfigurator/generator/config/backend_templates/trtllm/k8s_deploy.yaml.j2
  • src/aiconfigurator/generator/config/backend_templates/vllm/k8s_deploy.yaml.j2
  • src/aiconfigurator/generator/config/deployment_config.yaml

Comment on lines +95 to +104
- key: K8sConfig.k8s_tolerations
  required: false
  default: |-
    - key: dedicated
      operator: Equal
      value: user-workload
      effect: NoExecute
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if there's any type conversion or validation for k8s_tolerations
rg -n "k8s_tolerations" --type py -C3

Repository: ai-dynamo/aiconfigurator

Length of output: 50


🏁 Script executed:

#!/bin/bash
# Search for k8s_tolerations in all file types
rg "k8s_tolerations" --type-list

# Search for k8s_tolerations across all files
rg "k8s_tolerations" -C2

# Look for the indent filter usage in templates
rg "indent" --type-list | head -20
rg "indent.*10" -C2

Repository: ai-dynamo/aiconfigurator

Length of output: 10615


🏁 Script executed:

#!/bin/bash
# Find template files
find . -name "*.jinja*" -o -name "*.j2" | head -20

# Check the generator structure
ls -la src/aiconfigurator/generator/ 2>/dev/null | head -30

Repository: ai-dynamo/aiconfigurator

Length of output: 2691


🏁 Script executed:

#!/bin/bash
# Look for YAML schema loading and type handling
rg "evaluate_expression\|apply_defaults" --type py -C3 | head -50

# Check how YAML configs are loaded
rg "yaml.load\|yaml.safe_load" --type py -C2

Repository: ai-dynamo/aiconfigurator

Length of output: 50


🏁 Script executed:

#!/bin/bash
# Find where configs are loaded and K8sConfig is constructed
rg "K8sConfig\|load.*config\|yaml.load" -C3 --type py | head -100

# Look for the Jinja2 environment setup
rg "jinja|Environment|indent" --type py -C2 | head -100

Repository: ai-dynamo/aiconfigurator

Length of output: 6713


🏁 Script executed:

#!/bin/bash
# Check the main API or generator entry points
cat src/aiconfigurator/generator/api.py | head -100

Repository: ai-dynamo/aiconfigurator

Length of output: 3346


🏁 Script executed:

#!/bin/bash
# Look for how user configs are merged or applied to defaults
rg "evaluate_expression\|apply_defaults\|merge.*config" --type py -C3

Repository: ai-dynamo/aiconfigurator

Length of output: 50


🏁 Script executed:

#!/bin/bash
# Check the full rendering engine setup
cat src/aiconfigurator/generator/rendering/engine.py | head -200

Repository: ai-dynamo/aiconfigurator

Length of output: 8156


🏁 Script executed:

#!/bin/bash
# Search for where user configs are applied
rg "user.*config\|load.*yaml\|apply" --type py src/aiconfigurator/generator/ -C2

Repository: ai-dynamo/aiconfigurator

Length of output: 50


🏁 Script executed:

#!/bin/bash
# Look for config merging or value processing
rg "merge\|update.*config\|parse.*config" --type py src/aiconfigurator/generator/ -C2

Repository: ai-dynamo/aiconfigurator

Length of output: 50


🏁 Script executed:

#!/bin/bash
# Look for prepare_template_context function
rg "prepare_template_context" --type py -A 30

Repository: ai-dynamo/aiconfigurator

Length of output: 10204


🏁 Script executed:

#!/bin/bash
# Check where param_values comes from - look at aggregators and enumerate
rg "param_values\|K8sConfig\|ServiceConfig" --type py src/aiconfigurator/generator/ | head -50

Repository: ai-dynamo/aiconfigurator

Length of output: 50


🏁 Script executed:

#!/bin/bash
# Check test files to see how k8s_tolerations is expected to work
find . -name "*test*.py" -exec rg "k8s_tolerations" {} \; -print

Repository: ai-dynamo/aiconfigurator

Length of output: 50


🌐 Web query:

Jinja2 indent filter behavior with non-string inputs

💡 Result:

indent is a string-only filter in Jinja: its signature is indent(s: str, width=4, first=False, blank=False), and it does not coerce arbitrary inputs to text. If you pass a non-string (e.g., a list, dict, or int), rendering typically fails with a runtime exception (a TypeError or AttributeError, depending on the type) when the filter performs string operations such as s += "\n" and s.splitlines(). If s is a markupsafe.Markup, indent preserves that “safe” type by using Markup for the indentation and newline too. [1][2]

What to do with non-strings

  • Coerce explicitly: {{ value|string|indent(4) }} (or {{ "%s"|format(value)|indent(4) }}).
  • If it’s structured data, render it first (common patterns): tojson(indent=2) (already pretty-printed) or join lines: {{ my_list|join('\n')|indent(4) }}.

Sources
[1] Jinja “Template Designer Documentation” (filter signature for indent). (jinja.palletsprojects.com)
[2] Jinja source (do_indent implementation in filters.py, showing Markup handling and string operations). (github.com)


🏁 Script executed:

#!/bin/bash
# Find and read the apply_defaults function
rg "def apply_defaults" --type py -A 40

Repository: ai-dynamo/aiconfigurator

Length of output: 3381


🏁 Script executed:

#!/bin/bash
# Look for how user configs are merged with defaults
cat src/aiconfigurator/generator/rendering/schemas.py 2>/dev/null | head -150

Repository: ai-dynamo/aiconfigurator

Length of output: 3634


🏁 Script executed:

#!/bin/bash
# Check aggregators for config merging/processing
cat src/aiconfigurator/generator/aggregators.py 2>/dev/null | head -150

Repository: ai-dynamo/aiconfigurator

Length of output: 6310


🏁 Script executed:

#!/bin/bash
# Check if there are any tests that use custom k8s_tolerations or test this scenario
rg "test.*tolerations\|k8s_tolerations.*:" --type py -B2 -A5

Repository: ai-dynamo/aiconfigurator

Length of output: 50


🏁 Script executed:

#!/bin/bash
# Check aggregators more carefully for any k8s_tolerations handling
grep -n "k8s_tolerations" src/aiconfigurator/generator/aggregators.py

Repository: ai-dynamo/aiconfigurator

Length of output: 50


Runtime crash: indent filter expects string but user-provided tolerations may be parsed as a list.

The templates use {{ K8sConfig.k8s_tolerations | indent(10, True) }} which requires a string value. The |- default correctly provides a string. However, when users customize tolerations in their own YAML config:

K8sConfig:
  k8s_tolerations:
    - key: my-taint
      operator: Equal
      value: my-value
      effect: NoSchedule

This is parsed as a Python list. The Jinja2 indent filter is string-only and does not coerce non-strings, so template rendering will fail with an exception (a TypeError or AttributeError, depending on the input type).

Fix: do one of the following:

  1. Require users to provide multi-line string syntax in their configs (document this clearly)
  2. Add a custom filter that converts lists to YAML before indenting
  3. Add pre-template type validation that coerces list → YAML string
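
One way options 2 and 3 could look is sketched below. It is only a sketch under stated assumptions: the filter name `yaml_indent`, the stand-in template string, and the use of PyYAML's `safe_dump` are illustrative choices, not code from this repository. The filter accepts either a YAML string or already-parsed data and returns an indented block, so both ways of writing the setting render identically.

```python
import textwrap

import yaml
from jinja2 import Environment


def yaml_indent(value, width=4):
    """Hypothetical filter: indent a YAML string, dumping parsed data first."""
    if not isinstance(value, str):
        # Lists and dicts from a user config are serialized to block-style YAML.
        # sort_keys=False keeps the key order the user wrote.
        value = yaml.safe_dump(value, default_flow_style=False, sort_keys=False)
    # Drop trailing newlines so the template controls line endings.
    return textwrap.indent(value.rstrip("\n"), " " * width)


env = Environment()
env.filters["yaml_indent"] = yaml_indent

# Stand-in template; the real templates would keep their own guard and layout.
template = env.from_string(
    "tolerations:\n{{ K8sConfig.k8s_tolerations | yaml_indent(2) }}\n"
)

# The same toleration written as parsed YAML (a list) and as a multi-line string.
as_list = [
    {"key": "my-taint", "operator": "Equal", "value": "my-value", "effect": "NoSchedule"}
]
as_text = (
    "- key: my-taint\n"
    "  operator: Equal\n"
    "  value: my-value\n"
    "  effect: NoSchedule"
)

from_list = template.render(K8sConfig={"k8s_tolerations": as_list})
from_text = template.render(K8sConfig={"k8s_tolerations": as_text})
```

Doing the coercion inside a filter keeps the fix local to rendering. Coercing once in the context-preparation step would work equally well and would leave the templates unchanged.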
