
Modified unused audio_bytes in AIWorker #1

Open
MichaelWaruiru wants to merge 6 commits into Thairu-dev:main from MichaelWaruiru:local-ollama

Conversation


@MichaelWaruiru MichaelWaruiru commented Nov 24, 2025

Summary by CodeRabbit

  • Bug Fixes

    • Added robust error handling for transcription processing to prevent crashes and surface parsing issues.
  • Improvements

    • Prompt refined so responses return only the transcribed text with no extra commentary.
    • Extraction logic enhanced to handle multiple response formats and aggregate parts into a single trimmed transcription.
  • Chores

    • Updated ignore rules to exclude certain build asset images.
  • Documentation

    • Minor README formatting fix for hotkey display.

✏️ Tip: You can customize this high-level summary in your review settings.


coderabbitai Bot commented Nov 24, 2025

Walkthrough

Transcription flow in main.py now forces the model to return only raw transcribed text, prefers response.text, then falls back to concatenating response.parts (handling dict entries and object attributes), trims the result, and wraps parsing in try/except with logging. .gitignore adds resources/*.png.
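The extraction order described above (prefer `response.text`, fall back to joining `response.parts`) can be sketched as a standalone helper. This is a minimal sketch, not the code in main.py; the helper name and the fake-object shapes are illustrative:

```python
def extract_text(response):
    """Prefer response.text; otherwise join text from response.parts."""
    text = getattr(response, "text", None)
    if not text and getattr(response, "parts", None):
        pieces = []
        for part in response.parts:
            if isinstance(part, dict):
                pieces.append(part.get("text", ""))      # dict-style part
            else:
                pieces.append(getattr(part, "text", ""))  # object-style part
        text = " ".join(p for p in pieces if p)
    return (text or "").strip()
```

Either part style (dict entry or object attribute) contributes to the joined result, and the single trailing `strip()` covers both branches.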

Changes

| Cohort / File(s) | Summary |
|---|---|
| Transcription Flow Improvements<br>`main.py` | Prompt updated to require text-only output; extraction now prefers `response.text`, falls back to concatenating `response.parts` (supports dict entries and object attributes); result is trimmed; parsing wrapped in try/except with logging. |
| VCS ignore updates<br>`.gitignore` | Added rule to ignore `resources/*.png` under Build Artifacts. |
| Docs formatting fix<br>`README.md` | Escaped backslash in hotkey table row to correct markdown presentation (no behavioral change). |

Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant Transcribe as TranscriptionFlow
    participant API
    participant Handler as ResponseHandler

    Client->>Transcribe: request transcription
    Transcribe->>API: send prompt (require text-only)
    API-->>Transcribe: response

    Transcribe->>Handler: parse response
    alt response.text present
        Handler-->>Transcribe: return response.text
    else
        Handler->>Handler: iterate response.parts (dicts or objects)
        Handler-->>Transcribe: return concatenated text
    end

    Transcribe->>Transcribe: trim result
    alt parsing error
        Transcribe->>Transcribe: catch & log error
        Transcribe-->>Client: emit error/info
    else
        Transcribe-->>Client: emit transcription
    end
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Review focus:
    • Branching correctness between response.text and response.parts
    • Handling both dict-style and attribute-style parts entries
    • Trimming behavior and empty/whitespace-only results
    • Error handling and logging clarity
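The trimming and empty/whitespace-only concerns in the review focus can be isolated in a tiny guard. This is a sketch; `emit` stands in for the Qt signal's emit method, which is an assumption about the surrounding worker class:

```python
def emit_if_nonempty(text, emit):
    """Strip exactly once; emit only when something remains."""
    cleaned = (text or "").strip()
    if cleaned:
        emit(cleaned)
    return cleaned

received = []
emit_if_nonempty("  hello world  ", received.append)  # emits "hello world"
emit_if_nonempty("   ", received.append)              # whitespace-only: no emit
emit_if_nonempty(None, received.append)               # None: no emit
```

Centralizing the strip-and-check in one place avoids both the double `strip()` and the possibility of emitting an empty transcription.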

Poem

🐰 I nibble bytes and tidy strings,

"Just give me text," is what I sing.
From parts or text I softly comb,
I trim the ends and carry home.
If errors hop, I log — then spring.

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ⚠️ Warning | The title refers to 'Modified unused audio_bytes in AIWorker', but the actual changes involve transcription flow improvements, error handling, gitignore updates, and README formatting—not audio_bytes modifications. | Update the title to accurately reflect the main changes, such as 'Improve transcription parsing with robust error handling' or 'Enhance transcription flow with better response parsing'. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 38d95a6 and 42c2765.

📒 Files selected for processing (2)
  • README.md (1 hunks)
  • main.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • README.md
🧰 Additional context used
🧬 Code graph analysis (1)
main.py (1)
tests/test_stealthit.py (3)
  • text (58-58)
  • text (111-111)
  • emit (198-199)
🪛 Ruff (0.14.6)
main.py

286-286: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (1)
main.py (1)

267-267: LGTM! Clear prompt directive.

The added instruction to return only transcribed text improves output consistency and reduces post-processing needs.
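A prompt directive of the kind being praised here might look like the following. The wording is purely illustrative; the actual string in main.py is not shown in this review:

```python
# Illustrative prompt; the real wording in main.py may differ.
TRANSCRIPTION_PROMPT = (
    "Transcribe the audio exactly as spoken. "
    "Return only the transcribed text, with no extra commentary, "
    "labels, or formatting."
)
```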

Tip

📝 Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

  • Provide your own instructions using the high_level_summary_instructions setting.
  • Format the summary however you like (bullet lists, tables, multi-section layouts, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example instruction:

"Divide the high-level summary into five sections:

  1. 📝 Description — Summarize the main change in 50–60 words, explaining what was done.
  2. 📓 References — List relevant issues, discussions, documentation, or related PRs.
  3. 📦 Dependencies & Requirements — Mention any new/updated dependencies, environment variable changes, or configuration updates.
  4. 📊 Contributor Summary — Include a Markdown table showing contributions:
    | Contributor | Lines Added | Lines Removed | Files Changed |
  5. ✔️ Additional Notes — Add any extra reviewer context.
    Keep each section concise (under 200 words) and use bullet or numbered lists for clarity."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.


Comment @coderabbitai help to get the list of available commands and usage tips.

@Thairu-dev
Owner

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Nov 24, 2025

✅ Actions performed

Full review triggered.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6d25976 and dd25a81.

⛔ Files ignored due to path filters (1)
  • resources/chevron-down.png is excluded by !**/*.png
📒 Files selected for processing (1)
  • main.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
main.py (1)
tests/test_stealthit.py (3)
  • text (58-58)
  • text (111-111)
  • emit (198-199)
🪛 Ruff (0.14.5)
main.py

244-244: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (2)
main.py (2)

225-248: PR title doesn't match the actual changes.

The PR title mentions "Modified unused audio_bytes in AIWorker", but the changes shown here are about enhancing text extraction from transcription responses, not about audio_bytes handling. Consider updating the PR title to accurately reflect the changes, e.g., "Enhanced transcription response parsing with fallback support".


225-225: Good improvement to the transcription prompt.

The explicit instruction to return only transcribed text with no extra commentary helps ensure cleaner, more predictable responses from the model.

Comment thread main.py
Comment on lines +229 to 246
```python
# Safely extract text from response
text = None
try:
    text = getattr(response, "text", None)
    if not text and getattr(response, "parts", None):
        # Some SDKs return parts; try to collect text from parts
        parts = []
        for p in response.parts:
            if isinstance(p, dict):
                parts.append(p.get("text", ""))
            else:
                parts.append(getattr(p, "text", ""))
        text = " ".join([p for p in parts if p]).strip()
    if text:
        self.transcription_ready.emit(text)
    else:
        # print("Empty response from AI (Silence?)")
        pass
    self.transcription_ready.emit(text.strip())
except Exception as e:
    print(f"Transcription parse error: {e}")
```


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

Refactor the text extraction logic to address multiple issues.

Several concerns with this implementation:

  1. Redundant strip() call: Line 243 strips text that was already stripped on line 241.
  2. Nested try-except blocks: The inner try-except (lines 231-246) is nested within the outer one (lines 219-248), which can be confusing and may mask the actual error location.
  3. Bare Exception catching: Line 244 catches a bare Exception (as flagged by the linter). Be more specific about expected exceptions.
  4. Silent error handling: Line 245 only prints the error without informing the user. Consider emitting an error through transcription_ready.emit() or a dedicated error signal.
  5. Unverified response format handling: The fallback to response.parts suggests handling different SDK versions, but this should be verified against the actual Gemini API documentation.

Apply this diff to address the redundant strip and improve error handling:

```diff
         # Safely extract text from response
         text = None
         try:
             text = getattr(response, "text", None)
             if not text and getattr(response, "parts", None):
                 # Some SDKs return parts; try to collect text from parts
                 parts = []
                 for p in response.parts:
                     if isinstance(p, dict):
                         parts.append(p.get("text", ""))
                     else:
                         parts.append(getattr(p, "text", ""))
                 text = " ".join([p for p in parts if p]).strip()
             if text:
-                self.transcription_ready.emit(text.strip())
+                self.transcription_ready.emit(text)
-        except Exception as e:
-            print(f"Transcription parse error: {e}")
+        except (AttributeError, TypeError) as e:
+            error_msg = f"Transcription parse error: {e}"
+            print(error_msg)
+            self.transcription_ready.emit(f"Error: {error_msg}")
```

Verify the Gemini API response format by searching the documentation:


🌐 Web query:

What is the structure of the response object returned by google.generativeai GenerativeModel.generate_content() method? Does it have a text attribute and/or parts attribute?

💡 Result:

Short answer: Yes — the SDK exposes a top-level text property and the full parts structure.

Details (concise)

  • response.text — a convenience string that concatenates/returns the text parts (e.g., examples show print(response.text)). [1][2]
  • The full structured result is available under candidates → each candidate has content → parts (Part objects). Parts may contain text, inline data (images), function-call payloads, etc. You can iterate response.candidates[0].content.parts (or examine response.parts in some SDK convenience examples) to access individual parts. [1][2][3]

Sources: Google GenAI Python SDK docs and API reference. [1][2][3]
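Based on the structure described above (candidates → content → parts), the fallback path not handled by the code under review can be sketched as a defensive walk. The helper name is illustrative, and the attribute paths follow the documented response shape rather than code from this PR:

```python
def text_from_candidates(response):
    """Walk candidates -> content -> parts, joining any text found."""
    for cand in getattr(response, "candidates", None) or []:
        content = getattr(cand, "content", None)
        parts = getattr(content, "parts", None) or []
        texts = [getattr(p, "text", "") for p in parts]
        joined = " ".join(t for t in texts if t).strip()
        if joined:
            return joined
    return ""
```

Using `getattr(..., None) or []` at each level keeps the walk safe when a level is missing entirely, which is the scenario the `response.parts` fallback is guarding against.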


Address code quality issues in the text extraction logic.

The Gemini API response format is correct—the SDK exposes both a top-level text property and the parts structure. However, this code has several minor issues to fix:

  1. Redundant strip(): Line 243 calls strip() on text already stripped on line 241.
  2. Bare Exception: Line 244 catches a bare Exception. Use more specific exceptions like AttributeError or TypeError.
  3. Silent error handling: Errors are only printed, not emitted to the user.

Apply this diff:

```diff
         # Safely extract text from response
         text = None
         try:
             text = getattr(response, "text", None)
             if not text and getattr(response, "parts", None):
                 # Some SDKs return parts; try to collect text from parts
                 parts = []
                 for p in response.parts:
                     if isinstance(p, dict):
                         parts.append(p.get("text", ""))
                     else:
                         parts.append(getattr(p, "text", ""))
                 text = " ".join([p for p in parts if p]).strip()
             if text:
-                self.transcription_ready.emit(text.strip())
+                self.transcription_ready.emit(text)
-        except Exception as e:
-            print(f"Transcription parse error: {e}")
+        except (AttributeError, TypeError) as e:
+            error_msg = f"Transcription parse error: {e}"
+            print(error_msg)
+            self.transcription_ready.emit(f"Error: {error_msg}")
```
🧰 Tools
🪛 Ruff (0.14.5)

244-244: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
In main.py around lines 229-246, the text extraction logic should be tightened:
remove the redundant second .strip() so text is only stripped once before
emitting, replace the bare except with specific exceptions (e.g., except
(AttributeError, TypeError, ValueError) as e:) and instead of printing errors
use the component's error signal to notify the UI (e.g., call
self.transcription_error.emit(str(e)) or, if that signal doesn't exist, call a
provided error handler) so errors are surfaced to the user rather than silently
printed.
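Applied together, the prompt's suggestions yield roughly the following shape. This is a sketch, not the merged code: `transcription_error` is a hypothetical signal, as the prompt itself notes it may not exist, and the exception tuple follows the prompt's example:

```python
def handle_response(self, response):
    """Parse once, strip once, and surface failures via an error signal."""
    try:
        text = getattr(response, "text", None)
        if not text and getattr(response, "parts", None):
            pieces = [
                p.get("text", "") if isinstance(p, dict) else getattr(p, "text", "")
                for p in response.parts
            ]
            text = " ".join(p for p in pieces if p)
        text = (text or "").strip()  # strip exactly once
        if text:
            self.transcription_ready.emit(text)
    except (AttributeError, TypeError, ValueError) as e:
        # Hypothetical error signal, per the prompt's suggestion.
        self.transcription_error.emit(str(e))
```

Emitting through a dedicated error signal lets the UI distinguish a failed parse from a silent (empty) transcription, which a bare `print` cannot do.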
