Is there an existing issue for this?
Current Behavior
I'm getting a cudaErrorNoKernelImageForDevice error with a GTX 960. Is it not supported by the gpu-legacy image?
Expected Behavior
Inference should complete without errors
Steps To Reproduce
Try to transcribe some audio with a GTX 960 and driver version: 575.64.05 using the gpu-legacy image
Environment
OS: Proxmox
CPU architecture
x86-64
Docker creation
services:
  whisper:
    image: lscr.io/linuxserver/faster-whisper:gpu-legacy
    container_name: whisper
    ports:
      - "10300:10300"
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=Etc/UTC
      - WHISPER_MODEL=distil-small.en
      - WHISPER_BEAM=5
      - WHISPER_LANG=en
    volumes:
      - /srv/whisper:/config
    restart: unless-stopped
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities:
                - gpu
                - utility
                - compute
Container logs
whisper | [custom-init] No custom files found, skipping...
whisper | [2025-08-28 18:42:44.355] [ctranslate2] [thread 161] [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.
whisper | INFO:__main__:Ready
whisper | Connection to localhost (127.0.0.1) 10300 port [tcp/*] succeeded!
whisper | [ls.io-init] done.
whisper | INFO:faster_whisper:Processing audio with duration 00:04.110
whisper | ERROR:asyncio:Task exception was never retrieved
whisper | future: <Task finished name='wyoming event handler' coro=<AsyncEventHandler.run() done, defined at /lsiopy/lib/python3.12/site-packages/wyoming/server.py:31> exception=RuntimeError('parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device')>
whisper | Traceback (most recent call last):
whisper | File "/lsiopy/lib/python3.12/site-packages/wyoming/server.py", line 41, in run
whisper | if not (await self.handle_event(event)):
whisper | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
whisper | File "/lsiopy/lib/python3.12/site-packages/wyoming_faster_whisper/handler.py", line 76, in handle_event
whisper | text = " ".join(segment.text for segment in segments)
whisper | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
whisper | File "/lsiopy/lib/python3.12/site-packages/wyoming_faster_whisper/handler.py", line 76, in <genexpr>
whisper | text = " ".join(segment.text for segment in segments)
whisper | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
whisper | File "/lsiopy/lib/python3.12/site-packages/faster_whisper/transcribe.py", line 1164, in generate_segments
whisper | encoder_output = self.encode(segment)
whisper | ^^^^^^^^^^^^^^^^^^^^
whisper | File "/lsiopy/lib/python3.12/site-packages/faster_whisper/transcribe.py", line 1374, in encode
whisper | return self.model.encode(features, to_cpu=to_cpu)
whisper | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
whisper | RuntimeError: parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device
root@e2326a5e0839:/# nvidia-smi
Thu Aug 28 19:55:23 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.64.05 Driver Version: 575.64.05 CUDA Version: 12.9 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce GTX 960 Off | 00000000:01:00.0 Off | N/A |
| 0% 50C P8 14W / 160W | 2MiB / 2048MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
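For context on the failure above: cudaErrorNoKernelImageForDevice means the loaded CUDA binary contains no kernel that can run on the GPU, i.e. neither native code (SASS) compiled for the device's compute capability nor PTX for an equal-or-older architecture that the driver could JIT-compile. The GTX 960 is a Maxwell card with compute capability 5.2 (sm_52), so this would happen if the CTranslate2 build in the image only targets newer architectures. A minimal sketch of the matching rule, using hypothetical architecture sets (not the image's actual build flags):

```python
# Illustrative sketch of CUDA's kernel-image selection rule (not
# CTranslate2's actual code). Compute capabilities are integers,
# e.g. 52 for sm_52 (GTX 960, Maxwell).
def kernel_image_available(device_cc: int,
                           sass_archs: set[int],
                           ptx_archs: set[int]) -> bool:
    # A native binary compiled exactly for this device always works.
    if device_cc in sass_archs:
        return True
    # PTX is forward-compatible: PTX for an equal-or-older arch can be
    # JIT-compiled for this device by the driver.
    return any(p <= device_cc for p in ptx_archs)

# An sm_52 device against a build that ships only sm_60+ binaries
# and sm_90 PTX fails -> cudaErrorNoKernelImageForDevice:
print(kernel_image_available(52, {60, 70, 80}, {90}))   # False
# ...while a build that also ships sm_50 PTX would work:
print(kernel_image_available(52, {60, 70, 80}, {50}))   # True
```

If that is the cause, the fix would be an image built with sm_52 (or sm_50 PTX) included, rather than anything in the compose file.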