chore: clean up outdated or corrupted data #886

Open
tianhaox wants to merge 1 commit into main from chore/clean_up_outdated_data
@tianhaox
Contributor

chore: clean up outdated or corrupted data

…erf alignment only.

Signed-off-by: Tianhao Xu <tianhaox@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented Apr 21, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions Bot added the chore label Apr 21, 2026
@github-actions
Contributor

Sanity Check Chart Generation Report

New perf data files were detected in this PR. Please use the link above to
download sanity check charts for the new perf data to compare the collected
perf data vs SOL (theoretical max performance).

Below is a report of whether the chart generation was successful for each op.
It doesn't validate whether the perf data itself is sane.

Chart Generation Report for system: a100_sxm, backend: sglang, backend_version: 0.5.8.post1

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend sglang --backend-version 0.5.8.post1 --system a100_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:01 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:01 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:01 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:01 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=sglang
15:29:01 [aiconfigurator] [E] [main.py:639] No perf database for system=a100_sxm backend=sglang version=0.5.8.post1.
15:29:01 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/a100_sxm/sglang/0.5.8.post1
15:29:01 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:01 [aiconfigurator] [E] [main.py:650] Available versions: 0.5.9
15:29:01 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.
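
The per-op errors in this report have the shape Python produces when an attribute is read from `None`. The sketch below reproduces that message under one assumption: that chart generation receives `None` when no perf database exists for the requested version. `PerfDatabase`, `load_perf_database`, and `gemm_chart_status` are hypothetical names for illustration, not aiconfigurator's actual API.

```python
from typing import Optional


class PerfDatabase:
    """Hypothetical stand-in for a loaded perf database."""

    def __init__(self) -> None:
        self._gemm_data = {}


def load_perf_database(available: set, version: str) -> Optional[PerfDatabase]:
    # Assumption: the loader returns None when the version has no data directory.
    return PerfDatabase() if version in available else None


def gemm_chart_status(db: Optional[PerfDatabase]) -> str:
    # Reading an attribute from None raises AttributeError with the same
    # wording that appears in the report lines.
    try:
        db._gemm_data
    except AttributeError as exc:
        return f"gemm Error: {exc}"
    return "gemm OK"


print(gemm_chart_status(load_perf_database({"0.5.9"}, "0.5.8.post1")))
```

If that assumption holds, every op fails in the same way for a removed version, which would match the uniform error pattern in each block.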

Chart Generation Report for system: a100_sxm, backend: sglang, backend_version: 0.5.8

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend sglang --backend-version 0.5.8 --system a100_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:03 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:03 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:03 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:03 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=sglang
15:29:03 [aiconfigurator] [E] [main.py:639] No perf database for system=a100_sxm backend=sglang version=0.5.8.
15:29:03 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/a100_sxm/sglang/0.5.8
15:29:03 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:03 [aiconfigurator] [E] [main.py:650] Available versions: 0.5.9
15:29:03 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: a100_sxm, backend: trtllm, backend_version: 1.2.0rc5

  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend trtllm --backend-version 1.2.0rc5 --system a100_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:04 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:04 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:04 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:04 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:04 [aiconfigurator] [E] [main.py:639] No perf database for system=a100_sxm backend=trtllm version=1.2.0rc5.
15:29:04 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/a100_sxm/trtllm/1.2.0rc5
15:29:04 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:04 [aiconfigurator] [E] [main.py:650] Available versions: 1.0.0
15:29:04 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.
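
The "Searched:" and "Available versions:" lines suggest a directory layout of `<systems path>/data/<system>/<backend>/<version>`. A minimal sketch for listing the versions present on disk, assuming that layout (inferred from the log, not confirmed from the source code) and using a hypothetical helper name:

```python
from pathlib import Path


def available_versions(systems_root: Path, system: str, backend: str) -> list:
    # Assumed layout: <systems_root>/data/<system>/<backend>/<version>/
    backend_dir = systems_root / "data" / system / backend
    if not backend_dir.is_dir():
        return []
    # Each subdirectory name is treated as one backend version.
    return sorted(p.name for p in backend_dir.iterdir() if p.is_dir())
```

Running this against a checkout before and after the cleanup would show which version directories the PR removed.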

Chart Generation Report for system: a100_sxm, backend: vllm, backend_version: 0.12.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend vllm --backend-version 0.12.0 --system a100_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:05 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:05 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:05 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:05 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=vllm
15:29:05 [aiconfigurator] [E] [main.py:639] No perf database for system=a100_sxm backend=vllm version=0.12.0.
15:29:05 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/a100_sxm/vllm/0.12.0
15:29:05 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:05 [aiconfigurator] [E] [main.py:650] Available versions: 0.14.0
15:29:05 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: b200_sxm, backend: sglang, backend_version: 0.5.6.post2

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend sglang --backend-version 0.5.6.post2 --system b200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:07 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:07 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:07 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:07 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=sglang
15:29:07 [aiconfigurator] [E] [main.py:639] No perf database for system=b200_sxm backend=sglang version=0.5.6.post2.
15:29:07 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/b200_sxm/sglang/0.5.6.post2
15:29:07 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:07 [aiconfigurator] [E] [main.py:650] Available versions: 0.5.9
15:29:07 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: b200_sxm, backend: trtllm, backend_version: 1.0.0rc6

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend trtllm --backend-version 1.0.0rc6 --system b200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:08 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:08 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:08 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:08 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:08 [aiconfigurator] [E] [main.py:639] No perf database for system=b200_sxm backend=trtllm version=1.0.0rc6.
15:29:08 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/b200_sxm/trtllm/1.0.0rc6
15:29:08 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:08 [aiconfigurator] [E] [main.py:650] Available versions: 1.2.0rc5, 1.3.0rc10
15:29:08 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.
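
Each smoke test log names the requested version and the versions still available. A small sketch for pulling both out of the log text, assuming the message formats shown in this report stay as they are (the `summarize` helper is hypothetical):

```python
import re

# Patterns mirror the log lines in this report; the trailing "." after the
# version is part of the message, not of the version string.
MISSING = re.compile(r"No perf database for system=(\S+) backend=(\S+) version=(\S+)\.$")
AVAILABLE = re.compile(r"Available versions: (.+)$")


def summarize(log_text: str) -> dict:
    result = {}
    for line in log_text.splitlines():
        line = line.strip()
        m = MISSING.search(line)
        if m:
            result["system"], result["backend"], result["requested"] = m.groups()
        m = AVAILABLE.search(line)
        if m:
            result["available"] = [v.strip() for v in m.group(1).split(",")]
    return result


sample = """\
15:29:08 [aiconfigurator] [E] [main.py:639] No perf database for system=b200_sxm backend=trtllm version=1.0.0rc6.
15:29:08 [aiconfigurator] [E] [main.py:650] Available versions: 1.2.0rc5, 1.3.0rc10
"""
print(summarize(sample))
```

A summary like this makes it easier to check that each failing version is one the PR intended to delete.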

Chart Generation Report for system: b200_sxm, backend: vllm, backend_version: 0.14.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend vllm --backend-version 0.14.0 --system b200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:10 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:10 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:10 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:10 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=vllm
15:29:10 [aiconfigurator] [E] [main.py:639] No perf database for system=b200_sxm backend=vllm version=0.14.0.
15:29:10 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/b200_sxm/vllm/0.14.0
15:29:10 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:10 [aiconfigurator] [E] [main.py:650] Available versions: 0.19.0
15:29:10 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: b200_sxm, backend: vllm, backend_version: 0.14.1

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend vllm --backend-version 0.14.1 --system b200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:12 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:12 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:12 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:12 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=vllm
15:29:12 [aiconfigurator] [E] [main.py:639] No perf database for system=b200_sxm backend=vllm version=0.14.1.
15:29:12 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/b200_sxm/vllm/0.14.1
15:29:12 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:12 [aiconfigurator] [E] [main.py:650] Available versions: 0.19.0
15:29:12 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: b200_sxm, backend: vllm, backend_version: 0.16.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • dsa_module Error ❌: 'NoneType' object has no attribute 'query_context_dsa_module'
  • dsa_module Error ❌: 'NoneType' object has no attribute 'query_context_dsa_module'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend vllm --backend-version 0.16.0 --system b200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:13 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:13 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:13 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:13 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=vllm
15:29:13 [aiconfigurator] [E] [main.py:639] No perf database for system=b200_sxm backend=vllm version=0.16.0.
15:29:13 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/b200_sxm/vllm/0.16.0
15:29:13 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:13 [aiconfigurator] [E] [main.py:650] Available versions: 0.19.0
15:29:13 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: b200_sxm, backend: vllm, backend_version: 0.17.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • dsa_module Error ❌: 'NoneType' object has no attribute 'query_context_dsa_module'
  • dsa_module Error ❌: 'NoneType' object has no attribute 'query_context_dsa_module'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend vllm --backend-version 0.17.0 --system b200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:15 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:15 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:15 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:15 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=vllm
15:29:15 [aiconfigurator] [E] [main.py:639] No perf database for system=b200_sxm backend=vllm version=0.17.0.
15:29:15 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/b200_sxm/vllm/0.17.0
15:29:15 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:15 [aiconfigurator] [E] [main.py:650] Available versions: 0.19.0
15:29:15 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: gb200, backend: sglang, backend_version: 0.5.8.post1

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend sglang --backend-version 0.5.8.post1 --system gb200 --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:16 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:16 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:16 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:16 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=sglang
15:29:16 [aiconfigurator] [E] [main.py:639] No perf database for system=gb200 backend=sglang version=0.5.8.post1.
15:29:16 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/gb200/sglang/0.5.8.post1
15:29:16 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:16 [aiconfigurator] [E] [main.py:650] Available versions: 0.5.9
15:29:16 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: gb200, backend: trtllm, backend_version: 1.0.0rc6

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend trtllm --backend-version 1.0.0rc6 --system gb200 --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:18 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:18 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:18 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:18 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:18 [aiconfigurator] [E] [main.py:639] No perf database for system=gb200 backend=trtllm version=1.0.0rc6.
15:29:18 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/gb200/trtllm/1.0.0rc6
15:29:18 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:18 [aiconfigurator] [E] [main.py:650] Available versions: 1.3.0rc10
15:29:18 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: gb200, backend: trtllm, backend_version: 1.1.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend trtllm --backend-version 1.1.0 --system gb200 --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:20 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:20 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:20 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:20 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:20 [aiconfigurator] [E] [main.py:639] No perf database for system=gb200 backend=trtllm version=1.1.0.
15:29:20 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/gb200/trtllm/1.1.0
15:29:20 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:20 [aiconfigurator] [E] [main.py:650] Available versions: 1.3.0rc10
15:29:20 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: gb200, backend: trtllm, backend_version: 1.2.0rc5

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend trtllm --backend-version 1.2.0rc5 --system gb200 --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:21 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:21 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:21 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:21 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:21 [aiconfigurator] [E] [main.py:639] No perf database for system=gb200 backend=trtllm version=1.2.0rc5.
15:29:21 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/gb200/trtllm/1.2.0rc5
15:29:21 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:21 [aiconfigurator] [E] [main.py:650] Available versions: 1.3.0rc10
15:29:21 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: gb200, backend: trtllm, backend_version: 1.2.0rc6

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend trtllm --backend-version 1.2.0rc6 --system gb200 --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:22 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:22 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:22 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:22 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:22 [aiconfigurator] [E] [main.py:639] No perf database for system=gb200 backend=trtllm version=1.2.0rc6.
15:29:22 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/gb200/trtllm/1.2.0rc6
15:29:22 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:22 [aiconfigurator] [E] [main.py:650] Available versions: 1.3.0rc10
15:29:22 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: gb300, backend: trtllm, backend_version: 1.2.0rc5

  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend trtllm --backend-version 1.2.0rc5 --system gb300 --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:24 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:24 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:24 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:24 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:24 [aiconfigurator] [E] [main.py:639] No perf database for system=gb300 backend=trtllm version=1.2.0rc5.
15:29:24 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/gb300/trtllm/1.2.0rc5
15:29:24 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:24 [aiconfigurator] [E] [main.py:650] Available versions: 1.3.0rc10
15:29:24 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: gb300, backend: trtllm, backend_version: 1.2.0rc6.post3

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend trtllm --backend-version 1.2.0rc6.post3 --system gb300 --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:25 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:25 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:25 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:25 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:25 [aiconfigurator] [E] [main.py:639] No perf database for system=gb300 backend=trtllm version=1.2.0rc6.post3.
15:29:25 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/gb300/trtllm/1.2.0rc6.post3
15:29:25 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:25 [aiconfigurator] [E] [main.py:650] Available versions: 1.3.0rc10
15:29:25 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: h100_sxm, backend: sglang, backend_version: 0.5.8.post1

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend sglang --backend-version 0.5.8.post1 --system h100_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:27 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:27 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:27 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:27 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=sglang
15:29:27 [aiconfigurator] [E] [main.py:639] No perf database for system=h100_sxm backend=sglang version=0.5.8.post1.
15:29:27 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/h100_sxm/sglang/0.5.8.post1
15:29:27 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:27 [aiconfigurator] [E] [main.py:650] Available versions: 0.5.6.post2, 0.5.9
15:29:27 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: h100_sxm, backend: trtllm, backend_version: 1.0.0rc3

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend trtllm --backend-version 1.0.0rc3 --system h100_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:28 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:28 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:28 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:28 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:28 [aiconfigurator] [E] [main.py:639] No perf database for system=h100_sxm backend=trtllm version=1.0.0rc3.
15:29:28 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/h100_sxm/trtllm/1.0.0rc3
15:29:28 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:28 [aiconfigurator] [E] [main.py:650] Available versions: 1.2.0rc5, 1.3.0rc10
15:29:28 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: h100_sxm, backend: vllm, backend_version: 0.12.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend vllm --backend-version 0.12.0 --system h100_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:29 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:29 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:29 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:29 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=vllm
15:29:29 [aiconfigurator] [E] [main.py:639] No perf database for system=h100_sxm backend=vllm version=0.12.0.
15:29:29 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/h100_sxm/vllm/0.12.0
15:29:29 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:29 [aiconfigurator] [E] [main.py:650] Available versions: 0.14.0, 0.19.0
15:29:29 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: h200_sxm, backend: trtllm, backend_version: 1.0.0rc3

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend trtllm --backend-version 1.0.0rc3 --system h200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:31 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:31 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:31 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:31 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:31 [aiconfigurator] [E] [main.py:639] No perf database for system=h200_sxm backend=trtllm version=1.0.0rc3.
15:29:31 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/h200_sxm/trtllm/1.0.0rc3
15:29:31 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:31 [aiconfigurator] [E] [main.py:650] Available versions: 1.2.0rc5, 1.3.0rc10
15:29:31 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: h200_sxm, backend: vllm, backend_version: 0.12.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend vllm --backend-version 0.12.0 --system h200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:33 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:33 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:33 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:33 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=vllm
15:29:33 [aiconfigurator] [E] [main.py:639] No perf database for system=h200_sxm backend=vllm version=0.12.0.
15:29:33 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/h200_sxm/vllm/0.12.0
15:29:33 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:33 [aiconfigurator] [E] [main.py:650] Available versions: 0.19.0
15:29:33 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: l40s, backend: sglang, backend_version: 0.5.5.post3

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend sglang --backend-version 0.5.5.post3 --system l40s --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:34 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:34 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:34 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:34 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=sglang
15:29:35 [aiconfigurator] [E] [main.py:639] No perf database for system=l40s backend=sglang version=0.5.5.post3.
15:29:35 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/l40s/sglang/0.5.5.post3
15:29:35 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:35 [aiconfigurator] [E] [main.py:650] Available versions: 0.5.9
15:29:35 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: l40s, backend: trtllm, backend_version: 1.2.0rc5

  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend trtllm --backend-version 1.2.0rc5 --system l40s --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:36 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:36 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:36 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:36 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:36 [aiconfigurator] [E] [main.py:639] No perf database for system=l40s backend=trtllm version=1.2.0rc5.
15:29:36 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/l40s/trtllm/1.2.0rc5
15:29:36 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:36 [aiconfigurator] [E] [main.py:650] Available versions: 1.0.0
15:29:36 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: l40s, backend: vllm, backend_version: 0.12.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌
command / stdout / stderr
command:
aiconfigurator cli default --backend vllm --backend-version 0.12.0 --system l40s --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:37 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:37 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:37 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:37 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=vllm
15:29:37 [aiconfigurator] [E] [main.py:639] No perf database for system=l40s backend=vllm version=0.12.0.
15:29:37 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/l40s/vllm/0.12.0
15:29:37 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:37 [aiconfigurator] [E] [main.py:650] Available versions: 0.14.0
15:29:37 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.
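
The "Searched:" paths in the logs above suggest that perf data is laid out as `systems/data/<system>/<backend>/<version>`. Below is a minimal sketch of a pre-flight check that lists the versions present before a chart or smoke-test run is attempted. It assumes each version is a plain subdirectory. The helper names `available_versions` and `check_version` are hypothetical and are not part of aiconfigurator.

```python
from pathlib import Path


def available_versions(systems_root, system, backend):
    """List version directory names under <systems_root>/data/<system>/<backend>.

    Assumption: every version is stored as its own subdirectory, as the
    "Searched:" log lines suggest. Returns an empty list if the backend
    directory does not exist.
    """
    backend_dir = Path(systems_root) / "data" / system / backend
    if not backend_dir.is_dir():
        return []
    # Plain lexicographic sort; this is not a semantic version ordering.
    return sorted(p.name for p in backend_dir.iterdir() if p.is_dir())


def check_version(systems_root, system, backend, version):
    """Return (found, available) so a caller can skip or fail early."""
    available = available_versions(systems_root, system, backend)
    return version in available, available


if __name__ == "__main__":
    # Hypothetical invocation from a repository checkout.
    found, available = check_version(
        "src/aiconfigurator/systems", "gb200", "trtllm", "1.2.0rc5"
    )
    if not found:
        print(f"requested version missing; available: {available}")
```

A check like this could let the report skip combinations whose data directory was removed by the cleanup, instead of surfacing a `'NoneType' object has no attribute ...` error for every op.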

@tianhaox
Contributor Author

Sanity Check Chart Generation Report

New perf data files were detected in this PR. Please use the link above to download sanity check charts for the new perf data to compare the collected perf data vs SOL (theoretical max performance).

Below is a report of whether the chart generation was successful for each op. It doesn't validate whether the perf data itself is sane.

Chart Generation Report for system: a100_sxm, backend: sglang, backend_version: 0.5.8.post1

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend sglang --backend-version 0.5.8.post1 --system a100_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:01 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:01 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:01 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:01 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=sglang
15:29:01 [aiconfigurator] [E] [main.py:639] No perf database for system=a100_sxm backend=sglang version=0.5.8.post1.
15:29:01 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/a100_sxm/sglang/0.5.8.post1
15:29:01 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:01 [aiconfigurator] [E] [main.py:650] Available versions: 0.5.9
15:29:01 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: a100_sxm, backend: sglang, backend_version: 0.5.8

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend sglang --backend-version 0.5.8 --system a100_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:03 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:03 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:03 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:03 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=sglang
15:29:03 [aiconfigurator] [E] [main.py:639] No perf database for system=a100_sxm backend=sglang version=0.5.8.
15:29:03 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/a100_sxm/sglang/0.5.8
15:29:03 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:03 [aiconfigurator] [E] [main.py:650] Available versions: 0.5.9
15:29:03 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: a100_sxm, backend: trtllm, backend_version: 1.2.0rc5

  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend trtllm --backend-version 1.2.0rc5 --system a100_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:04 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:04 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:04 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:04 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:04 [aiconfigurator] [E] [main.py:639] No perf database for system=a100_sxm backend=trtllm version=1.2.0rc5.
15:29:04 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/a100_sxm/trtllm/1.2.0rc5
15:29:04 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:04 [aiconfigurator] [E] [main.py:650] Available versions: 1.0.0
15:29:04 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: a100_sxm, backend: vllm, backend_version: 0.12.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend vllm --backend-version 0.12.0 --system a100_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:05 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:05 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:05 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:05 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=vllm
15:29:05 [aiconfigurator] [E] [main.py:639] No perf database for system=a100_sxm backend=vllm version=0.12.0.
15:29:05 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/a100_sxm/vllm/0.12.0
15:29:05 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:05 [aiconfigurator] [E] [main.py:650] Available versions: 0.14.0
15:29:05 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: b200_sxm, backend: sglang, backend_version: 0.5.6.post2

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend sglang --backend-version 0.5.6.post2 --system b200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:07 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:07 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:07 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:07 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=sglang
15:29:07 [aiconfigurator] [E] [main.py:639] No perf database for system=b200_sxm backend=sglang version=0.5.6.post2.
15:29:07 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/b200_sxm/sglang/0.5.6.post2
15:29:07 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:07 [aiconfigurator] [E] [main.py:650] Available versions: 0.5.9
15:29:07 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: b200_sxm, backend: trtllm, backend_version: 1.0.0rc6

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend trtllm --backend-version 1.0.0rc6 --system b200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:08 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:08 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:08 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:08 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:08 [aiconfigurator] [E] [main.py:639] No perf database for system=b200_sxm backend=trtllm version=1.0.0rc6.
15:29:08 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/b200_sxm/trtllm/1.0.0rc6
15:29:08 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:08 [aiconfigurator] [E] [main.py:650] Available versions: 1.2.0rc5, 1.3.0rc10
15:29:08 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: b200_sxm, backend: vllm, backend_version: 0.14.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend vllm --backend-version 0.14.0 --system b200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:10 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:10 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:10 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:10 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=vllm
15:29:10 [aiconfigurator] [E] [main.py:639] No perf database for system=b200_sxm backend=vllm version=0.14.0.
15:29:10 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/b200_sxm/vllm/0.14.0
15:29:10 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:10 [aiconfigurator] [E] [main.py:650] Available versions: 0.19.0
15:29:10 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: b200_sxm, backend: vllm, backend_version: 0.14.1

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend vllm --backend-version 0.14.1 --system b200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:12 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:12 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:12 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:12 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=vllm
15:29:12 [aiconfigurator] [E] [main.py:639] No perf database for system=b200_sxm backend=vllm version=0.14.1.
15:29:12 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/b200_sxm/vllm/0.14.1
15:29:12 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:12 [aiconfigurator] [E] [main.py:650] Available versions: 0.19.0
15:29:12 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: b200_sxm, backend: vllm, backend_version: 0.16.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • dsa_module Error ❌: 'NoneType' object has no attribute 'query_context_dsa_module'
  • dsa_module Error ❌: 'NoneType' object has no attribute 'query_context_dsa_module'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend vllm --backend-version 0.16.0 --system b200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:13 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:13 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:13 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:13 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=vllm
15:29:13 [aiconfigurator] [E] [main.py:639] No perf database for system=b200_sxm backend=vllm version=0.16.0.
15:29:13 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/b200_sxm/vllm/0.16.0
15:29:13 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:13 [aiconfigurator] [E] [main.py:650] Available versions: 0.19.0
15:29:13 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: b200_sxm, backend: vllm, backend_version: 0.17.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • dsa_module Error ❌: 'NoneType' object has no attribute 'query_context_dsa_module'
  • dsa_module Error ❌: 'NoneType' object has no attribute 'query_context_dsa_module'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend vllm --backend-version 0.17.0 --system b200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:15 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:15 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:15 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:15 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=vllm
15:29:15 [aiconfigurator] [E] [main.py:639] No perf database for system=b200_sxm backend=vllm version=0.17.0.
15:29:15 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/b200_sxm/vllm/0.17.0
15:29:15 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:15 [aiconfigurator] [E] [main.py:650] Available versions: 0.19.0
15:29:15 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: gb200, backend: sglang, backend_version: 0.5.8.post1

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend sglang --backend-version 0.5.8.post1 --system gb200 --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:16 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:16 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:16 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:16 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=sglang
15:29:16 [aiconfigurator] [E] [main.py:639] No perf database for system=gb200 backend=sglang version=0.5.8.post1.
15:29:16 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/gb200/sglang/0.5.8.post1
15:29:16 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:16 [aiconfigurator] [E] [main.py:650] Available versions: 0.5.9
15:29:16 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: gb200, backend: trtllm, backend_version: 1.0.0rc6

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend trtllm --backend-version 1.0.0rc6 --system gb200 --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:18 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:18 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:18 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:18 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:18 [aiconfigurator] [E] [main.py:639] No perf database for system=gb200 backend=trtllm version=1.0.0rc6.
15:29:18 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/gb200/trtllm/1.0.0rc6
15:29:18 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:18 [aiconfigurator] [E] [main.py:650] Available versions: 1.3.0rc10
15:29:18 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: gb200, backend: trtllm, backend_version: 1.1.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend trtllm --backend-version 1.1.0 --system gb200 --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:20 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:20 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:20 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:20 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:20 [aiconfigurator] [E] [main.py:639] No perf database for system=gb200 backend=trtllm version=1.1.0.
15:29:20 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/gb200/trtllm/1.1.0
15:29:20 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:20 [aiconfigurator] [E] [main.py:650] Available versions: 1.3.0rc10
15:29:20 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: gb200, backend: trtllm, backend_version: 1.2.0rc5

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend trtllm --backend-version 1.2.0rc5 --system gb200 --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:21 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:21 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:21 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:21 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:21 [aiconfigurator] [E] [main.py:639] No perf database for system=gb200 backend=trtllm version=1.2.0rc5.
15:29:21 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/gb200/trtllm/1.2.0rc5
15:29:21 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:21 [aiconfigurator] [E] [main.py:650] Available versions: 1.3.0rc10
15:29:21 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: gb200, backend: trtllm, backend_version: 1.2.0rc6

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend trtllm --backend-version 1.2.0rc6 --system gb200 --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:22 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:22 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:22 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:22 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:22 [aiconfigurator] [E] [main.py:639] No perf database for system=gb200 backend=trtllm version=1.2.0rc6.
15:29:22 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/gb200/trtllm/1.2.0rc6
15:29:22 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:22 [aiconfigurator] [E] [main.py:650] Available versions: 1.3.0rc10
15:29:22 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: gb300, backend: trtllm, backend_version: 1.2.0rc5

  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend trtllm --backend-version 1.2.0rc5 --system gb300 --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:24 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:24 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:24 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:24 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:24 [aiconfigurator] [E] [main.py:639] No perf database for system=gb300 backend=trtllm version=1.2.0rc5.
15:29:24 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/gb300/trtllm/1.2.0rc5
15:29:24 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:24 [aiconfigurator] [E] [main.py:650] Available versions: 1.3.0rc10
15:29:24 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: gb300, backend: trtllm, backend_version: 1.2.0rc6.post3

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend trtllm --backend-version 1.2.0rc6.post3 --system gb300 --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:25 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:25 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:25 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:25 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:25 [aiconfigurator] [E] [main.py:639] No perf database for system=gb300 backend=trtllm version=1.2.0rc6.post3.
15:29:25 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/gb300/trtllm/1.2.0rc6.post3
15:29:25 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:25 [aiconfigurator] [E] [main.py:650] Available versions: 1.3.0rc10
15:29:25 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: h100_sxm, backend: sglang, backend_version: 0.5.8.post1

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend sglang --backend-version 0.5.8.post1 --system h100_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:27 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:27 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:27 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:27 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=sglang
15:29:27 [aiconfigurator] [E] [main.py:639] No perf database for system=h100_sxm backend=sglang version=0.5.8.post1.
15:29:27 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/h100_sxm/sglang/0.5.8.post1
15:29:27 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:27 [aiconfigurator] [E] [main.py:650] Available versions: 0.5.6.post2, 0.5.9
15:29:27 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: h100_sxm, backend: trtllm, backend_version: 1.0.0rc3

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend trtllm --backend-version 1.0.0rc3 --system h100_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:28 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:28 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:28 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:28 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:28 [aiconfigurator] [E] [main.py:639] No perf database for system=h100_sxm backend=trtllm version=1.0.0rc3.
15:29:28 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/h100_sxm/trtllm/1.0.0rc3
15:29:28 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:28 [aiconfigurator] [E] [main.py:650] Available versions: 1.2.0rc5, 1.3.0rc10
15:29:28 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: h100_sxm, backend: vllm, backend_version: 0.12.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend vllm --backend-version 0.12.0 --system h100_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:29 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:29 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:29 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:29 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=vllm
15:29:29 [aiconfigurator] [E] [main.py:639] No perf database for system=h100_sxm backend=vllm version=0.12.0.
15:29:29 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/h100_sxm/vllm/0.12.0
15:29:29 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:29 [aiconfigurator] [E] [main.py:650] Available versions: 0.14.0, 0.19.0
15:29:29 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: h200_sxm, backend: trtllm, backend_version: 1.0.0rc3

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend trtllm --backend-version 1.0.0rc3 --system h200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:31 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:31 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:31 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:31 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:31 [aiconfigurator] [E] [main.py:639] No perf database for system=h200_sxm backend=trtllm version=1.0.0rc3.
15:29:31 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/h200_sxm/trtllm/1.0.0rc3
15:29:31 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:31 [aiconfigurator] [E] [main.py:650] Available versions: 1.2.0rc5, 1.3.0rc10
15:29:31 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: h200_sxm, backend: vllm, backend_version: 0.12.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend vllm --backend-version 0.12.0 --system h200_sxm --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:33 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:33 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:33 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:33 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=vllm
15:29:33 [aiconfigurator] [E] [main.py:639] No perf database for system=h200_sxm backend=vllm version=0.12.0.
15:29:33 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/h200_sxm/vllm/0.12.0
15:29:33 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:33 [aiconfigurator] [E] [main.py:650] Available versions: 0.19.0
15:29:33 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: l40s, backend: sglang, backend_version: 0.5.5.post3

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend sglang --backend-version 0.5.5.post3 --system l40s --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:34 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:34 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:34 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:34 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=sglang
15:29:35 [aiconfigurator] [E] [main.py:639] No perf database for system=l40s backend=sglang version=0.5.5.post3.
15:29:35 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/l40s/sglang/0.5.5.post3
15:29:35 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:35 [aiconfigurator] [E] [main.py:650] Available versions: 0.5.9
15:29:35 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: l40s, backend: trtllm, backend_version: 1.2.0rc5

  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • CLI smoke test ❌

command / stdout / stderr

command:
aiconfigurator cli default --backend trtllm --backend-version 1.2.0rc5 --system l40s --model Qwen/Qwen3-32B --total-gpus 16

stdout:
15:29:36 [aiconfigurator] [I] [main.py:1532] Loading Dynamo AIConfigurator version: 0.8.0
15:29:36 [aiconfigurator] [I] [main.py:1533] Number of top configurations to output: 5 (change with --top-n)
15:29:36 [aiconfigurator] [W] [main.py:1554] Using default SLA/workload parameters: ISL=4000, OSL=1000, TTFT=2000.0, TPOT=30.0. These act as filters — configurations exceeding these thresholds are excluded. Set them explicitly (e.g. --ttft, --tpot, --isl, --osl) to avoid unexpected filtering.
15:29:36 [aiconfigurator] [I] [main.py:1560] Effective parameters: ISL=4000, OSL=1000, TTFT=2000.0ms, TPOT=30.0ms, backend=trtllm
15:29:36 [aiconfigurator] [E] [main.py:639] No perf database for system=l40s backend=trtllm version=1.2.0rc5.
15:29:36 [aiconfigurator] [E] [main.py:647] Searched: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems/data/l40s/trtllm/1.2.0rc5
15:29:36 [aiconfigurator] [E] [main.py:648] Configured systems paths: /home/runner/work/aiconfigurator/aiconfigurator/src/aiconfigurator/systems
15:29:36 [aiconfigurator] [E] [main.py:650] Available versions: 1.0.0
15:29:36 [aiconfigurator] [E] [main.py:651] Fix: switch --backend-version to one of the available versions, or remove --backend-version to use latest.

Chart Generation Report for system: l40s, backend: vllm, backend_version: 0.12.0

  • gemm Error ❌: 'NoneType' object has no attribute '_gemm_data'
  • context_attention Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • context_attention_with_prefix Error ❌: 'NoneType' object has no attribute '_context_attention_data'
  • generation_attention Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • generation_attention_b Error ❌: 'NoneType' object has no attribute '_generation_attention_data'
  • context_mla_with_prefix Error ❌: 'NoneType' object has no attribute '_context_mla_data'
  • generation_mla Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • generation_mla_b Error ❌: 'NoneType' object has no attribute '_generation_mla_data'
  • moe Error ❌: 'NoneType' object has no attribute '_moe_data'
  • allreduce Error ❌: 'NoneType' object has no attribute '_custom_allreduce_data'
  • CLI smoke test ❌

command / stdout / stderr

The deletions in this PR also trigger the sanity check, so the failures reported above are expected and can be ignored.
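For anyone who wants to confirm locally that a reported failure only reflects a removed data directory, here is a minimal sketch. It assumes the layout implied by the "Searched:" lines in the logs (`<systems path>/data/<system>/<backend>/<version>`). The helper names `available_versions` and `check_version` are hypothetical and are not part of aiconfigurator.

```python
from pathlib import Path
from typing import List, Optional


def available_versions(systems_root: Path, system: str, backend: str) -> List[str]:
    """List version directories under <systems_root>/data/<system>/<backend>.

    The directory layout is an assumption taken from the 'Searched:' log lines.
    """
    backend_dir = systems_root / "data" / system / backend
    if not backend_dir.is_dir():
        return []
    # Plain string sort; this is not a semantic version ordering.
    return sorted(p.name for p in backend_dir.iterdir() if p.is_dir())


def check_version(
    systems_root: Path, system: str, backend: str, version: str
) -> Optional[str]:
    """Return None if data for the version exists, else a hint listing alternatives."""
    versions = available_versions(systems_root, system, backend)
    if version in versions:
        return None
    listed = ", ".join(versions) if versions else "none"
    return (
        f"No perf data directory for system={system} backend={backend} "
        f"version={version}. Available versions: {listed}"
    )
```

A call such as `check_version(Path("src/aiconfigurator/systems"), "h100_sxm", "sglang", "0.5.8.post1")` would return a hint when the directory has been deleted, which matches the situation the report above describes.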
