This is the code I am using to prompt the Whisper model.
let promptText = "Glossary: Bucatini all'Amatriciana, Pan-Seared Halibut, Cioppino, Tiramisu, Panna Cotta."
let promptTokens = tokenizer.encode(text: promptText).filter { $0 < tokenizer.specialTokens.specialTokenBegin }
let options = DecodingOptions(skipSpecialTokens: true, promptTokens: promptTokens)
let segments = try await whisperKit.transcribe(audioPath: url.path, decodeOptions: options)
let newText = segments.map(\.text).joined()
I consistently get empty output. I would understand an inaccurate transcription, but an empty one? Without the prompt, transcription works normally. For context, I am using the large-v3-v20240930_626mb model (due to device limitations). I have also tried other prompts that mimic regular sentences, with the same outcome (empty output).