Using promptTokens causes the transcription to return an empty result. #372

@oabdullah3

This is the code I am using to prompt the Whisper model.

let promptText = "Glossary: Bucatini all'Amatriciana, Pan-Seared Halibut, Cioppino, Tiramisu, Panna Cotta."
let promptTokens = tokenizer.encode(text: promptText).filter { $0 < tokenizer.specialTokens.specialTokenBegin }
let options = DecodingOptions(skipSpecialTokens: true, promptTokens: promptTokens)
let segments = try await whisperKit.transcribe(audioPath: url.path, decodeOptions: options)
let newText = segments.map(\.text).joined()

I consistently get empty output. I would understand if the transcription were erroneous, but empty? Without the prompt, I get a normal transcription. For context, I am using the large-v3-v20240930_626mb model (due to device limitations). I have also tried other prompts that mimic regular sentences, with the same outcome (empty output).
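
As a sanity check on the prompt itself, it may help to confirm what the filter on promptTokens actually leaves behind. The sketch below is self-contained and does not call WhisperKit: the token IDs are made up, and assumedSpecialTokenBegin is a placeholder for tokenizer.specialTokens.specialTokenBegin (the real value depends on the loaded tokenizer). It only shows that the filter keeps IDs below the boundary, so a non-empty result here would suggest the empty transcription comes from decoding rather than from an empty prompt.

```swift
/// Keeps only IDs strictly below the special-token boundary,
/// mirroring the `.filter { $0 < ...specialTokenBegin }` step in the snippet above.
func stripSpecialTokens(_ tokens: [Int], specialTokenBegin: Int) -> [Int] {
    tokens.filter { $0 < specialTokenBegin }
}

// Assumption: placeholder boundary; read the real one from the tokenizer.
let assumedSpecialTokenBegin = 50_257

// Hypothetical encoder output: text tokens with special tokens at both ends.
let encoded = [50_258, 38, 75, 772, 822, 25, 50_257]

let promptTokens = stripSpecialTokens(encoded, specialTokenBegin: assumedSpecialTokenBegin)

// Log the result before passing it to DecodingOptions.
print("prompt token count:", promptTokens.count)
print("prompt tokens:", promptTokens)
```

In the real code, the same two print statements can be placed right after promptTokens is computed to confirm the prompt is not empty before transcribing.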
