feat: add configurable k8s tolerations to all backend deploy templates by jasonqinzhou · Pull Request #578 · ai-dynamo/aiconfigurator

jasonqinzhou · 2026-03-11T21:23:58Z

Adds optional K8sConfig.k8s_tolerations field to deployment_config.yaml and plumbs it through the TRT-LLM, vLLM, and SGLang k8s_deploy.yaml.j2 templates so generated manifests can target tainted node pools.

The field is guarded by a Jinja2 `if` block, so omitting it or setting it to an empty string produces no tolerations section. The default ships two common tolerations as an example:

- dedicated=user-workload (GCP GKE node pools)
- nvidia.com/gpu Exists (GPU-only nodes)
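
The guard behaviour described above can be illustrated with a small standard-library sketch. It does not use Jinja2: `render_tolerations` is a hypothetical helper that mimics what an `if` guard followed by an indent step would emit. The width of 10 follows the `indent(10, True)` call quoted in the review below, while the position of the `tolerations:` key is an assumption about the template layout, not something taken from the PR.

```python
import textwrap

# The default from deployment_config.yaml, kept as a multi-line string.
DEFAULT_TOLERATIONS = """\
- key: dedicated
  operator: Equal
  value: user-workload
  effect: NoExecute
- key: nvidia.com/gpu
  operator: Exists
  effect: NoSchedule"""


def render_tolerations(value, width=10):
    """Hypothetical stand-in for the guarded template block.

    Falsy values (None, "") yield no output, like a Jinja2 `if` guard.
    Otherwise every line, including the first, is indented by `width`
    spaces, similar to `indent(width, True)`.
    """
    if not value:
        return ""
    # Assumed layout: the `tolerations:` key sits two spaces left of the list.
    header = " " * (width - 2) + "tolerations:\n"
    return header + textwrap.indent(value, " " * width) + "\n"


if __name__ == "__main__":
    # An empty string produces no tolerations section at all.
    print(repr(render_tolerations("")))
    # The default string is emitted under an indented `tolerations:` key.
    print(render_tolerations(DEFAULT_TOLERATIONS), end="")
```

In this sketch an omitted field (`None`) and an empty string behave the same way, which matches the intent stated in the description.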

Summary by CodeRabbit

Release Notes

New Features
- Added support for configurable Kubernetes tolerations across all deployment backends (SGLang, TRT-LLM, vLLM). Users can now specify custom tolerations to manage pod scheduling on nodes with specific taints, including default tolerations for dedicated workloads and GPU nodes.


coderabbitai · 2026-03-11T21:41:37Z

Walkthrough

The pull request adds optional Kubernetes tolerations support across three backend deployment templates (sglang, trtllm, vllm) and introduces a new K8sConfig.k8s_tolerations configuration field. Tolerations are conditionally inserted into extraPodSpec sections when configured, enabling pods to tolerate specific node taints during scheduling.

Changes

Cohort / File(s)	Summary
Kubernetes Configuration Schema `src/aiconfigurator/generator/config/deployment_config.yaml`	Adds new optional K8sConfig.k8s_tolerations field with default multi-line YAML tolerations list including dedicated workload and NVIDIA GPU taints.
Backend Deployment Templates `src/aiconfigurator/generator/config/backend_templates/sglang/k8s_deploy.yaml.j2`, `src/aiconfigurator/generator/config/backend_templates/trtllm/k8s_deploy.yaml.j2`, `src/aiconfigurator/generator/config/backend_templates/vllm/k8s_deploy.yaml.j2`	Adds conditional tolerations blocks to extraPodSpec sections across multiple rendering contexts (worker, Frontend), gated by K8sConfig.k8s_tolerations with appropriate indentation. No changes to existing pod spec sections.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 Hops through the templates with glee,
Tolerations now flow wild and free!
Taints beware, our pods shall not fail,
A carrot-patch scheduled without travail! 🥕✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely describes the main change: adding configurable Kubernetes tolerations to backend deployment templates across all three supported backends.
Description check	✅ Passed	The description covers Overview and Details sections but is missing 'Where should the reviewer start?' and 'Related Issues' sections from the template.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/aiconfigurator/generator/config/deployment_config.yaml`:
- Around line 95-104: The runtime crash happens because templates call "{{
K8sConfig.k8s_tolerations | indent(10, True) }}" which requires a string but
user YAML may parse K8sConfig.k8s_tolerations as a list; fix by coercing lists
to a YAML/ multiline string before template rendering: detect when
K8sConfig.k8s_tolerations is a list in the template rendering pipeline (where
templates use the indent filter), serialize the list to a YAML block string
(preserve indentation/newlines), and replace the value with that string so the
Jinja2 indent filter receives a string; alternatively register a custom Jinja2
filter that yaml-dumps non-string input then calls indent, and ensure templates
use that filter consistently.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a984c60b-e225-428c-8806-382edeeace24

📥 Commits

Reviewing files that changed from the base of the PR and between d00ef1a and aa1788a.

📒 Files selected for processing (4)

src/aiconfigurator/generator/config/backend_templates/sglang/k8s_deploy.yaml.j2
src/aiconfigurator/generator/config/backend_templates/trtllm/k8s_deploy.yaml.j2
src/aiconfigurator/generator/config/backend_templates/vllm/k8s_deploy.yaml.j2
src/aiconfigurator/generator/config/deployment_config.yaml

coderabbitai · 2026-03-11T21:41:40Z

+  - key: K8sConfig.k8s_tolerations
+    required: false
+    default: |-
+      - key: dedicated
+        operator: Equal
+        value: user-workload
+        effect: NoExecute
+      - key: nvidia.com/gpu
+        operator: Exists
+        effect: NoSchedule


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if there's any type conversion or validation for k8s_tolerations
rg -n "k8s_tolerations" --type py -C3

Repository: ai-dynamo/aiconfigurator

Length of output: 50

🏁 Script executed:

#!/bin/bash
# Search for k8s_tolerations in all file types
rg "k8s_tolerations" --type-list
# Search for k8s_tolerations across all files
rg "k8s_tolerations" -C2
# Look for the indent filter usage in templates
rg "indent" --type-list | head -20
rg "indent.*10" -C2

Repository: ai-dynamo/aiconfigurator

Length of output: 10615

🏁 Script executed:

#!/bin/bash
# Find template files
find . -name "*.jinja*" -o -name "*.j2" | head -20
# Check the generator structure
ls -la src/aiconfigurator/generator/ 2>/dev/null | head -30

Repository: ai-dynamo/aiconfigurator

Length of output: 2691

🏁 Script executed:

#!/bin/bash
# Look for YAML schema loading and type handling
rg "evaluate_expression\|apply_defaults" --type py -C3 | head -50
# Check how YAML configs are loaded
rg "yaml.load\|yaml.safe_load" --type py -C2

Repository: ai-dynamo/aiconfigurator

Length of output: 50

🏁 Script executed:

#!/bin/bash
# Find where configs are loaded and K8sConfig is constructed
rg "K8sConfig\|load.*config\|yaml.load" -C3 --type py | head -100
# Look for the Jinja2 environment setup
rg "jinja|Environment|indent" --type py -C2 | head -100

Repository: ai-dynamo/aiconfigurator

Length of output: 6713

🏁 Script executed:

#!/bin/bash
# Check the main API or generator entry points
cat src/aiconfigurator/generator/api.py | head -100

Repository: ai-dynamo/aiconfigurator

Length of output: 3346

🏁 Script executed:

#!/bin/bash
# Look for how user configs are merged or applied to defaults
rg "evaluate_expression\|apply_defaults\|merge.*config" --type py -C3

Repository: ai-dynamo/aiconfigurator

Length of output: 50

🏁 Script executed:

#!/bin/bash
# Check the full rendering engine setup
cat src/aiconfigurator/generator/rendering/engine.py | head -200

Repository: ai-dynamo/aiconfigurator

Length of output: 8156

🏁 Script executed:

#!/bin/bash
# Search for where user configs are applied
rg "user.*config\|load.*yaml\|apply" --type py src/aiconfigurator/generator/ -C2

Repository: ai-dynamo/aiconfigurator

Length of output: 50

🏁 Script executed:

#!/bin/bash
# Look for config merging or value processing
rg "merge\|update.*config\|parse.*config" --type py src/aiconfigurator/generator/ -C2

Repository: ai-dynamo/aiconfigurator

Length of output: 50

🏁 Script executed:

#!/bin/bash
# Look for prepare_template_context function
rg "prepare_template_context" --type py -A 30

Repository: ai-dynamo/aiconfigurator

Length of output: 10204

🏁 Script executed:

#!/bin/bash
# Check where param_values comes from - look at aggregators and enumerate
rg "param_values\|K8sConfig\|ServiceConfig" --type py src/aiconfigurator/generator/ | head -50

Repository: ai-dynamo/aiconfigurator

Length of output: 50

🏁 Script executed:

#!/bin/bash
# Check test files to see how k8s_tolerations is expected to work
find . -name "*test*.py" -exec rg "k8s_tolerations" {} \; -print

Repository: ai-dynamo/aiconfigurator

Length of output: 50

🌐 Web query:

Jinja2 indent filter behavior with non-string inputs

💡 Result:

indent is a string-only filter in Jinja: its signature is indent(s: str, width=4, first=False, blank=False). It does not coerce arbitrary inputs to text for you. If you pass a non-string (e.g., list, dict, int), you’ll typically get a runtime TypeError when the filter tries to do string operations like s += "\n" / s.splitlines(). If s is a markupsafe.Markup, indent preserves that “safe” type by using Markup for the indentation and newline too. [1][2]

What to do with non-strings

Coerce explicitly: {{ value|string|indent(4) }} (or {{ "%s"|format(value)|indent(4) }}).

If it’s structured data, render it first (common patterns): tojson(indent=2) (already pretty-printed) or join lines: {{ my_list|join('\n')|indent(4) }}.

Sources
[1] Jinja “Template Designer Documentation” (filter signature for indent). (jinja.palletsprojects.com)
[2] Jinja source (do_indent implementation in filters.py, showing Markup handling and string operations). (github.com)

Citations:

1: https://jinja.palletsprojects.com/en/2.10.x/templates/?highlight=placeholder&utm_source=openai

2: https://github.com/pallets/jinja/blob/main/src/jinja2/filters.py

🏁 Script executed:

#!/bin/bash
# Find and read the apply_defaults function
rg "def apply_defaults" --type py -A 40

Repository: ai-dynamo/aiconfigurator

Length of output: 3381

🏁 Script executed:

#!/bin/bash
# Look for how user configs are merged with defaults
cat src/aiconfigurator/generator/rendering/schemas.py 2>/dev/null | head -150

Repository: ai-dynamo/aiconfigurator

Length of output: 3634

🏁 Script executed:

#!/bin/bash
# Check aggregators for config merging/processing
cat src/aiconfigurator/generator/aggregators.py 2>/dev/null | head -150

Repository: ai-dynamo/aiconfigurator

Length of output: 6310

🏁 Script executed:

#!/bin/bash
# Check if there are any tests that use custom k8s_tolerations or test this scenario
rg "test.*tolerations\|k8s_tolerations.*:" --type py -B2 -A5

Repository: ai-dynamo/aiconfigurator

Length of output: 50

🏁 Script executed:

#!/bin/bash
# Check aggregators more carefully for any k8s_tolerations handling
grep -n "k8s_tolerations" src/aiconfigurator/generator/aggregators.py

Repository: ai-dynamo/aiconfigurator

Length of output: 50

Runtime crash: indent filter expects string but user-provided tolerations may be parsed as a list.

The templates use {{ K8sConfig.k8s_tolerations | indent(10, True) }} which requires a string value. The |- default correctly provides a string. However, when users customize tolerations in their own YAML config:

K8sConfig:
  k8s_tolerations:
    - key: my-taint
      operator: Equal
      value: my-value
      effect: NoSchedule

This is parsed as a Python list. The Jinja2 indent filter is string-only and will crash with a TypeError during template rendering (it does not coerce non-strings).

Fix: one of the following:

1. Require users to provide multi-line string syntax in their configs (document this clearly)
2. Add a custom filter that converts lists to YAML before indenting
3. Add pre-template type validation that coerces list → YAML string
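
The pre-template coercion option (list → YAML string) could be sketched roughly as follows, using only the standard library. `coerce_tolerations` is a hypothetical helper, not part of aiconfigurator. It assumes each toleration is a flat mapping and writes values as JSON scalars, which YAML parsers also accept. A real implementation would more likely delegate to a YAML library such as PyYAML's `yaml.safe_dump`.

```python
import json
import textwrap


def coerce_tolerations(value):
    """Return tolerations as a YAML block string (hypothetical helper).

    Strings pass through unchanged; a list of flat dicts is serialized
    one toleration per list item. Anything else is rejected early so the
    failure is not deferred to template rendering.
    """
    if value is None or isinstance(value, str):
        return value or ""
    if not isinstance(value, list):
        raise TypeError(
            f"k8s_tolerations must be str or list, got {type(value).__name__}"
        )

    lines = []
    for item in value:
        if not isinstance(item, dict):
            raise TypeError("each toleration must be a mapping")
        for i, (key, val) in enumerate(item.items()):
            prefix = "- " if i == 0 else "  "
            # json.dumps quotes strings and renders ints/bools in a form YAML accepts.
            lines.append(f"{prefix}{key}: {json.dumps(val)}")
    return "\n".join(lines)


# The user-provided list form from the example above.
user_value = [
    {"key": "my-taint", "operator": "Equal", "value": "my-value", "effect": "NoSchedule"}
]
as_text = coerce_tolerations(user_value)
# textwrap.indent stands in for the Jinja2 indent filter here.
print(textwrap.indent(as_text, " " * 10))
```

Coercing before rendering keeps the templates unchanged, whereas the custom-filter option would require every template to switch to the new filter.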


jasonqinzhou requested review from Arsene12358, Ethan-ES, Harrilee, ilyasher, simone-chen and tianhaox as code owners March 11, 2026 21:23

github-actions Bot added the feat label Mar 11, 2026

coderabbitai Bot reviewed Mar 11, 2026

jasonqinzhou wants to merge 1 commit into main from feat/k8s-tolerations-support