[BUG] cudaErrorNoKernelImageForDevice on Maxwell GPU #50

@azizi-a

Description

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

I'm getting a cudaErrorNoKernelImageForDevice error with a GTX 960 (Maxwell, compute capability 5.2). Is this GPU not supported by the gpu-legacy image?

Expected Behavior

Inference should complete without errors.

Steps To Reproduce

Try to transcribe some audio with a GTX 960 and driver version 575.64.05 using the gpu-legacy image.

Environment

- OS: Proxmox

CPU architecture

x86-64

Docker creation

services:
  whisper:
    image: lscr.io/linuxserver/faster-whisper:gpu-legacy
    container_name: whisper
    ports:
      - "10300:10300"
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=Etc/UTC
      - WHISPER_MODEL=distil-small.en
      - WHISPER_BEAM=5
      - WHISPER_LANG=en
    volumes:
      - /srv/whisper:/config
    restart: unless-stopped
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities:
              - gpu
              - utility
              - compute

Container logs

whisper  | [custom-init] No custom files found, skipping...
whisper  | [2025-08-28 18:42:44.355] [ctranslate2] [thread 161] [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.
whisper  | INFO:__main__:Ready
whisper  | Connection to localhost (127.0.0.1) 10300 port [tcp/*] succeeded!
whisper  | [ls.io-init] done.
whisper  | INFO:faster_whisper:Processing audio with duration 00:04.110
whisper  | ERROR:asyncio:Task exception was never retrieved
whisper  | future: <Task finished name='wyoming event handler' coro=<AsyncEventHandler.run() done, defined at /lsiopy/lib/python3.12/site-packages/wyoming/server.py:31> exception=RuntimeError('parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device')>
whisper  | Traceback (most recent call last):
whisper  |   File "/lsiopy/lib/python3.12/site-packages/wyoming/server.py", line 41, in run
whisper  |     if not (await self.handle_event(event)):
whisper  |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
whisper  |   File "/lsiopy/lib/python3.12/site-packages/wyoming_faster_whisper/handler.py", line 76, in handle_event
whisper  |     text = " ".join(segment.text for segment in segments)
whisper  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
whisper  |   File "/lsiopy/lib/python3.12/site-packages/wyoming_faster_whisper/handler.py", line 76, in <genexpr>
whisper  |     text = " ".join(segment.text for segment in segments)
whisper  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
whisper  |   File "/lsiopy/lib/python3.12/site-packages/faster_whisper/transcribe.py", line 1164, in generate_segments
whisper  |     encoder_output = self.encode(segment)
whisper  |                      ^^^^^^^^^^^^^^^^^^^^
whisper  |   File "/lsiopy/lib/python3.12/site-packages/faster_whisper/transcribe.py", line 1374, in encode
whisper  |     return self.model.encode(features, to_cpu=to_cpu)
whisper  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
whisper  | RuntimeError: parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device
root@e2326a5e0839:/# nvidia-smi
Thu Aug 28 19:55:23 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.64.05              Driver Version: 575.64.05      CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 960         Off |   00000000:01:00.0 Off |                  N/A |
|  0%   50C    P8             14W /  160W |       2MiB /   2048MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
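Since cudaErrorNoKernelImageForDevice usually means the CUDA kernels in the image were not compiled for the GPU's architecture, it may help to confirm the compute capability and what CTranslate2 actually sees from inside the container. A diagnostic sketch (assumes `nvidia-smi` supports the `compute_cap` query field, available in recent driver releases, and that the `ctranslate2` package is importable in the container's Python):

```shell
# Report the GPU name and its compute capability
# (a GTX 960 is Maxwell, compute capability 5.2 / sm_52).
nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader

# Ask CTranslate2 whether it detects the device and which compute
# types it supports on it; float16 is expected to be unsupported on
# Maxwell, which matches the float32 fallback warning in the logs.
python3 -c "import ctranslate2; \
print('CUDA devices:', ctranslate2.get_cuda_device_count()); \
print('Supported compute types:', ctranslate2.get_supported_compute_types('cuda'))"
```

If the compute capability reported here is below the minimum the gpu-legacy image was built for, the error is a build-target mismatch rather than a driver problem.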
