.Net: Fix: Surface Ollama reasoning ("thinking") content in ChatMessageContent #13888

PrathamAditya wants to merge 3 commits into microsoft:main
Conversation
Automated Code Review
Reviewers: 4 | Confidence: 91%
✗ Correctness
The change adds Ollama thinking/reasoning content extraction via reflection in a shared extension method (ChatMessageExtensions.cs). The main correctness concern is that the thinking `TextContent` item is added to `result.Items` before the regular content items (which are added by the subsequent `foreach` loop). Since `ChatMessageContent.Content` returns `Items.OfType<TextContent>().FirstOrDefault()?.Text` (ChatMessageContent.cs line 40), this causes `.Content` to return the thinking text (e.g. `[THINKING]\nThis is reasoning content`) instead of the actual assistant response (e.g. `Final answer`). This silently breaks any consumer that reads `.Content` to get the model's answer. The test only asserts `Contains("This is reasoning content", result.Content)`, which passes due to this ordering, but doesn't verify that the actual response text is still accessible.
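The ordering pitfall can be reproduced in a few lines against the public Semantic Kernel API (a sketch, not the PR's actual code):

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

// Sketch: the PR adds the thinking TextContent first, then the foreach loop
// adds the real response. Because .Content returns the text of the FIRST
// TextContent item, the thinking string shadows the actual answer.
var items = new ChatMessageContentItemCollection
{
    new TextContent("[THINKING]\nThis is reasoning content"), // added first by the PR
    new TextContent("Final answer"),                          // actual assistant response
};

var message = new ChatMessageContent(AuthorRole.Assistant, items);

// Prints the thinking text, not "Final answer".
Console.WriteLine(message.Content);
```

Reversing the insertion order (or using a non-`TextContent` item type for thinking) would keep `.Content` pointing at the answer.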
✗ Security / Reliability
The PR adds Ollama-specific reflection-based thinking extraction to ChatMessageExtensions.cs, a shared internal utility compiled into SemanticKernel.Abstractions via MEAIUtilities.props (confirmed at SemanticKernel.Abstractions.csproj:18). This means the reflection code runs for ALL providers (OpenAI, Gemini, Mistral, Bedrock, etc.) on every call to ToChatMessageContent, not just Ollama. While the null-conditional chain prevents crashes for non-Ollama types, this is fragile: any future provider whose RawRepresentation accidentally exposes 'Message' and 'Thinking' properties would get unexpected content injected into results. The code should include a type guard to ensure it only runs for Ollama's ChatDoneResponseStream type, or this logic should be moved to the Ollama connector layer rather than the shared abstraction.
✗ Test Coverage
The new test verifies that thinking content appears in the response, but has significant coverage gaps. The thinking `TextContent` is inserted into `Items` before the regular content, which means `result.Content` (which returns the first `TextContent`) now returns thinking text instead of the actual response. The test doesn't verify the `Items` structure, doesn't check that 'Final answer' is accessible, has no negative test for responses without thinking, no streaming test, and no direct unit test for the shared extension method's thinking extraction logic.
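The missing coverage could look roughly like this (an xUnit sketch; `GetResponseWithThinkingAsync` and `GetResponseWithoutThinkingAsync` are hypothetical test helpers standing in for the existing test setup):

```csharp
// Sketch: verify the Items structure and that the final answer stays reachable,
// plus a negative case for responses with no thinking field.
[Fact]
public async Task GetChatMessageContent_KeepsFinalAnswerAccessible_WhenThinkingPresentAsync()
{
    ChatMessageContent result = await this.GetResponseWithThinkingAsync(); // hypothetical helper

    var texts = result.Items.OfType<TextContent>().ToList();
    Assert.Equal(2, texts.Count);
    Assert.Equal("Final answer", result.Content); // .Content must remain the answer
    Assert.Contains(texts, t => t.Text!.Contains("This is reasoning content"));
}

[Fact]
public async Task GetChatMessageContent_HasSingleItem_WhenNoThinkingAsync()
{
    ChatMessageContent result = await this.GetResponseWithoutThinkingAsync(); // hypothetical helper

    Assert.Single(result.Items.OfType<TextContent>());
}
```

Note the first test as written would fail against the current PR, since `.Content` currently returns the thinking text; that is the point of adding it.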
✗ Design Approach
The change extracts Ollama's 'thinking' field from chat responses, but does so in the wrong architectural layer using reflection, and stores it in the wrong content type. `ChatMessageExtensions.cs` is a shared M.E.AI utility used by all connectors via `ChatClientChatCompletionService`; embedding Ollama-specific reflection logic there leaks a connector implementation detail into a generic abstraction. More critically, using `TextContent("[THINKING]\n..")` rather than the existing `ReasoningContent` type causes `ChatMessageContent.Content` (which returns the text of the first `TextContent` item) to surface the thinking string instead of the actual assistant answer whenever thinking precedes regular content in `Items`.
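A minimal sketch of the suggested direction, assuming Semantic Kernel's existing `ReasoningContent` item type (its exact constructor and member names may differ):

```csharp
using System.Linq;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

// Sketch: append thinking as ReasoningContent AFTER the regular text, so
// .Content (first TextContent) still returns the assistant's answer, while
// consumers that want reasoning can filter Items for ReasoningContent.
static ChatMessageContent BuildMessage(string answer, string? thinking)
{
    var items = new ChatMessageContentItemCollection { new TextContent(answer) };
    if (thinking is not null)
    {
        // ReasoningContent is not a TextContent, so it never shadows .Content.
        items.Add(new ReasoningContent(thinking));
    }
    return new ChatMessageContent(AuthorRole.Assistant, items);
}
```

With this shape, `BuildMessage("Final answer", "reasoning").Content` stays `"Final answer"` regardless of whether thinking is present.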
Flagged Issues
- Ollama-specific reflection runs unguarded for all providers. `ChatMessageExtensions.cs` is compiled into SemanticKernel.Abstractions (via MEAIUtilities.props) and called from ChatClientChatCompletionService.cs:71,75 for every `IChatClient`. The reflection on the `Message` and `Thinking` properties executes against OpenAI, Gemini, Mistral, Bedrock, etc. on every chat completion call. Without a type guard, any provider whose raw representation coincidentally exposes matching property names would get unexpected content injected. This logic should either check the concrete type before reflecting or be moved to the Ollama connector layer.
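The suggested type guard could look like this (a sketch; the fully qualified type name for OllamaSharp's `ChatDoneResponseStream` is an assumption and should be verified against the connector's actual dependency):

```csharp
using System.Reflection;

// Sketch: only reflect when the raw representation is actually Ollama's
// response type, so this never runs for OpenAI, Gemini, Mistral, Bedrock,
// or any other provider's raw representation.
static string? TryGetOllamaThinking(object? rawRepresentation)
{
    // Assumed type name; check OllamaSharp's namespace before relying on it.
    if (rawRepresentation is null ||
        rawRepresentation.GetType().FullName != "OllamaSharp.Models.Chat.ChatDoneResponseStream")
    {
        return null;
    }

    object? message = rawRepresentation.GetType()
        .GetProperty("Message")?.GetValue(rawRepresentation);
    return message?.GetType().GetProperty("Thinking")?.GetValue(message) as string;
}
```

Comparing `FullName` by string keeps SemanticKernel.Abstractions free of a hard OllamaSharp dependency, at the cost of breaking silently if OllamaSharp renames the type; moving the logic into the Ollama connector would avoid that trade-off entirely.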
Suggestions
- Add a negative test verifying that responses without a `thinking` field still produce the expected single content item, to guard against regressions from the reflection-based extraction.
- Wrap the `GetValue()` calls on reflected properties in a try-catch for defensive reliability against `TargetInvocationException`.
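The defensive reflection suggested above could be sketched as follows (illustrative helper, not the PR's code):

```csharp
using System.Reflection;

// Sketch: swallow reflection failures so a raw representation whose property
// getter throws cannot break the whole chat completion call.
static string? SafeGetThinking(object message)
{
    try
    {
        return message.GetType().GetProperty("Thinking")?.GetValue(message) as string;
    }
    catch (TargetInvocationException)
    {
        // The getter itself threw; treat the response as having no thinking.
        return null;
    }
}
```

`GetProperty` returning null (no such property) is already handled by the null-conditional; the try-catch covers the separate case of a getter that throws when invoked.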
Automated review by PrathamAditya's agents
Summary
Fixes #13860
Ollama responses can include a `thinking` (reasoning) field, but this information is currently not exposed through `ChatMessageContent` in Semantic Kernel. This PR adds support for surfacing reasoning content by extracting it from the underlying `RawRepresentation`.

Changes

- Reasoning (`thinking`) content is currently not propagated.
- Extract `thinking` from `RawRepresentation` and include it in `ChatMessageContent.Items` as `TextContent`.

Behavior
Before: only the final response (`content`) is available.
After: reasoning (`thinking`) is included alongside the final response.

Design Notes

- Reflection is used on `RawRepresentation` to avoid introducing a dependency on Ollama-specific types in `SemanticKernel.Abstractions`.

Limitations / Future Work
This highlights a broader gap in the current abstraction:
- The `Microsoft.Extensions.AI` model does not currently support reasoning/thinking as a first-class concept.
- Modern LLMs (Ollama, OpenAI reasoning models, etc.) are increasingly returning structured reasoning.

Potential improvements:

- Introduce a first-class `ReasoningContent` type.

Happy to align with maintainers on preferred direction.
Testing
Added unit test: `GetChatMessageContent_ShouldIncludeThinking_WhenPresentInResponseAsync`