fix: normalize OpenAI multi-part content in _get_events_for_messages #1805
octo-patch wants to merge 1 commit into NVIDIA-NeMo:develop
Conversation
Fixes NVIDIA-NeMo#1741. When the OpenAI chat completions API is used with multi-part content (e.g. `"content": [{"type": "text", "text": "…"}]`), llmrails.py was storing the raw list in event text fields instead of a plain string. This caused two problems:

1. Self-check / intent-matching LLM prompts received the Python repr of the list instead of the actual user text.
2. In multi-turn conversations, `get_colang_history()` crashed with `TypeError: must be str or None, not list` at the `rsplit()` call.

Fix: add a static helper `LLMRails._extract_text_content()` that joins all `{"type": "text"}` parts into a single string (silently ignoring image and other non-text parts), and apply it to every place in `_get_events_for_messages` where user-message content is stored in an event field, for both the Colang 1.0 and 2.0 paths.

Tests added for the helper function and for the two failure modes described in the issue.
Greptile Summary: This PR fixes issue #1741 by adding a `_extract_text_content` static helper that normalizes multi-part message content to a plain string. The fix is applied consistently across all four user-message content sites in both the Colang 1.0 and 2.0 paths. Unit and integration tests cover the new behavior well.
| Filename | Overview |
|---|---|
| nemoguardrails/rails/llm/llmrails.py | Adds `_extract_text_content` static helper and applies it at all four user-message content assignment sites in both Colang 1.0 and 2.0 paths of `_get_events_for_messages`; minor edge case with empty text-part keys. |
| tests/test_llmrails.py | Adds 4 unit tests for `_extract_text_content` and 2 async integration tests covering single-turn and multi-turn multi-part content; tests are functional, but each unit test creates an unnecessary `LLMRails` instance for a static method. |
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["msg['content'] received"] --> B{isinstance list?}
    B -- Yes --> C["Filter parts where type == 'text'"]
    C --> D["Join text values with space"]
    D --> E["Return normalized string"]
    B -- No --> F{content is None?}
    F -- Yes --> G["Return ''"]
    F -- No --> H["Return content as-is"]
    E --> I["Set final_transcript / text in event"]
    G --> I
    H --> I
```
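The flow above can be sketched as a standalone function. This is a minimal, self-contained sketch mirroring the helper merged in this PR (the function name and behavior follow the diff shown later in the thread; treat it as illustrative, not the authoritative implementation):

```python
def extract_text_content(content) -> str:
    """Normalize an OpenAI message `content` value to a plain string."""
    if isinstance(content, list):
        # Keep only {"type": "text"} parts; image and other part types
        # are silently ignored.
        return " ".join(
            part.get("text", "")
            for part in content
            if isinstance(part, dict) and part.get("type") == "text"
        )
    # None becomes ""; plain strings pass through unchanged.
    return content if content is not None else ""


if __name__ == "__main__":
    print(extract_text_content("hello"))  # hello
    print(extract_text_content([
        {"type": "image_url", "image_url": {"url": "http://example.com/img.png"}},
        {"type": "text", "text": "Describe this image."},
    ]))  # Describe this image.
```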
Prompt To Fix All With AI
This is a comment left during a code review.
Path: nemoguardrails/rails/llm/llmrails.py
Line: 611-615
Comment:
**Malformed text parts yield leading/trailing spaces**
When a content part has `"type": "text"` but is missing the `"text"` key, `part.get("text", "")` returns `""`, and `" ".join(["", "hello"])` produces `" hello"` (a leading space). Filtering out empty parts before joining avoids this:
```suggestion
return " ".join(
part["text"]
for part in content
if isinstance(part, dict) and part.get("type") == "text" and part.get("text")
)
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: tests/test_llmrails.py
Line: 1486-1526
Comment:
**Unit tests instantiate `LLMRails` unnecessarily for a `@staticmethod`**
`_extract_text_content` is a static method and has no dependency on instance state, so each of the four unit tests creating a full `RailsConfig` + `LLMRails` adds setup overhead without benefit. Calling it as a class method is cleaner:
```python
assert LLMRails._extract_text_content("hello") == "hello"
```
How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix: normalize OpenAI multi-part content..." | Re-trigger Greptile
```python
return " ".join(
    part.get("text", "")
    for part in content
    if isinstance(part, dict) and part.get("type") == "text"
)
```
Malformed text parts yield leading/trailing spaces
When a content part has "type": "text" but is missing the "text" key, part.get("text", "") returns "", and " ".join(["", "hello"]) produces " hello" (a leading space). Filtering out empty parts before joining avoids this:
Suggested change:

```python
return " ".join(
    part["text"]
    for part in content
    if isinstance(part, dict) and part.get("type") == "text" and part.get("text")
)
```
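The join behavior this comment describes can be checked in plain Python, independent of the repo — the original generator keeps empty strings, the suggested filter drops them:

```python
# A malformed part with type "text" but no "text" key, followed by a valid part.
parts = [{"type": "text"}, {"type": "text", "text": "hello"}]

# Original generator: the missing "text" key contributes "" to the join.
original = " ".join(
    p.get("text", "") for p in parts if isinstance(p, dict) and p.get("type") == "text"
)

# Suggested filter: empty/missing text values are dropped before joining.
filtered = " ".join(
    p["text"]
    for p in parts
    if isinstance(p, dict) and p.get("type") == "text" and p.get("text")
)

print(repr(original))  # ' hello' — leading space
print(repr(filtered))  # 'hello'
```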
```python
def test_extract_text_content_string():
    """Plain strings should be returned unchanged."""
    config = RailsConfig.from_content(config={"models": []})
    rails = LLMRails(config=config, llm=FakeLLM(responses=[]))

    assert rails._extract_text_content("hello") == "hello"
    assert rails._extract_text_content("") == ""
    assert rails._extract_text_content(None) == ""


def test_extract_text_content_list_single_text_part():
    """A single-part list with type=text should return the text string."""
    config = RailsConfig.from_content(config={"models": []})
    rails = LLMRails(config=config, llm=FakeLLM(responses=[]))

    content = [{"type": "text", "text": "Hello there"}]
    assert rails._extract_text_content(content) == "Hello there"


def test_extract_text_content_list_multiple_text_parts():
    """Multiple text parts should be joined with a space."""
    config = RailsConfig.from_content(config={"models": []})
    rails = LLMRails(config=config, llm=FakeLLM(responses=[]))

    content = [
        {"type": "text", "text": "First part."},
        {"type": "text", "text": "Second part."},
    ]
    assert rails._extract_text_content(content) == "First part. Second part."


def test_extract_text_content_list_skips_image_parts():
    """Image parts should be silently ignored; only text parts are returned."""
    config = RailsConfig.from_content(config={"models": []})
    rails = LLMRails(config=config, llm=FakeLLM(responses=[]))

    content = [
        {"type": "image_url", "image_url": {"url": "http://example.com/img.png"}},
        {"type": "text", "text": "Describe this image."},
    ]
    assert rails._extract_text_content(content) == "Describe this image."
```
**Unit tests instantiate `LLMRails` unnecessarily for a `@staticmethod`**

`_extract_text_content` is a static method and has no dependency on instance state, so each of the four unit tests creating a full `RailsConfig` + `LLMRails` adds setup overhead without benefit. Calling it on the class is cleaner:

```python
assert LLMRails._extract_text_content("hello") == "hello"
```
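Applying that suggestion, the first unit test collapses to class-level calls with no fixture setup. A self-contained sketch — `LLMRailsStandIn` is a hypothetical stand-in defined here so the snippet runs without the nemoguardrails package; in the real tests the calls would go through `LLMRails` itself:

```python
class LLMRailsStandIn:
    """Stand-in exposing the same static helper, for illustration only."""

    @staticmethod
    def _extract_text_content(content) -> str:
        if isinstance(content, list):
            return " ".join(
                part.get("text", "")
                for part in content
                if isinstance(part, dict) and part.get("type") == "text"
            )
        return content if content is not None else ""


def test_extract_text_content_string():
    """Plain strings should be returned unchanged."""
    # No RailsConfig / LLMRails instance needed for a @staticmethod.
    assert LLMRailsStandIn._extract_text_content("hello") == "hello"
    assert LLMRailsStandIn._extract_text_content("") == ""
    assert LLMRailsStandIn._extract_text_content(None) == ""


test_extract_text_content_string()
```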
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
📝 Walkthrough

The PR adds a static method that normalizes the OpenAI multi-part message content format into single strings. It updates message event creation to use the extracted text instead of raw content lists, preventing the type errors described in the issue.

Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks: ✅ 6 passed
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/test_llmrails.py (1)
1530-1565: Consider asserting normalized transcript content (not only no-crash).

These tests currently prove stability, but they don't explicitly prove that prompt-facing user text is normalized to a string. Adding an assertion on the generated user events/history text would lock in the behavior targeted by issue #1741.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nemoguardrails/rails/llm/llmrails.py`:
- Around line 601-617: The _extract_text_content function can return non-string
values when content is neither a list nor None; modify _extract_text_content so
every return value is explicitly a str: when content is a list keep the existing
join but ensure each part's .get("text", "") is cast/coerced to str before
joining, and for the non-list branch return str(content) if content is not None
(otherwise ""), guaranteeing _extract_text_content always yields a Python str
for downstream uses of final_transcript/text.
---
Nitpick comments:
In `@tests/test_llmrails.py`:
- Around line 1530-1565: Add assertions that validate normalized user text is a
plain string, not only that generate_async doesn't crash: after calling
LLMRails.generate_async (in tests test_generate_async_multipart_user_content and
test_generate_async_multipart_multi_turn_no_type_error) call the
LLMRails.get_colang_history() (or inspect the returned result/assistant events)
and assert that user-facing entries for the user messages ("Hello!", "Hello
again") are present as str types (e.g., isinstance(entry["content"], str) and
entry["content"].contains the expected text). This locks in the normalization
behavior that the tests target and prevents regressions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: fa0f3665-8ca7-4984-a421-0e6393fc89c3
📒 Files selected for processing (2)
- nemoguardrails/rails/llm/llmrails.py
- tests/test_llmrails.py
```python
@staticmethod
def _extract_text_content(content) -> str:
    """Normalize an OpenAI message content value to a plain string.

    The OpenAI spec allows ``content`` to be either a plain string or a list
    of content parts (e.g. ``[{"type": "text", "text": "…"}, …]``). When a
    list is received, extract and join all ``text`` parts so the rest of the
    pipeline always operates on a regular string.
    """
    if isinstance(content, list):
        return " ".join(
            part.get("text", "")
            for part in content
            if isinstance(part, dict) and part.get("type") == "text"
        )
    return content if content is not None else ""
```
Ensure _extract_text_content always returns str on all paths.
At Line 616, non-list non-None values are returned as-is, which can leak non-string types into final_transcript / text and break downstream string operations again.
Proposed hardening:

```python
@staticmethod
def _extract_text_content(content) -> str:
    if content is None:
        return ""

    if isinstance(content, str):
        return content

    if isinstance(content, list):
        parts: List[str] = []
        for part in content:
            if isinstance(part, dict) and part.get("type") == "text":
                text = part.get("text", "")
                if text is None:
                    continue
                parts.append(text if isinstance(text, str) else str(text))
        return " ".join(parts)

    return str(content)
```
+ return str(content)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nemoguardrails/rails/llm/llmrails.py` around lines 601 - 617, The
_extract_text_content function can return non-string values when content is
neither a list nor None; modify _extract_text_content so every return value is
explicitly a str: when content is a list keep the existing join but ensure each
part's .get("text", "") is cast/coerced to str before joining, and for the
non-list branch return str(content) if content is not None (otherwise ""),
guaranteeing _extract_text_content always yields a Python str for downstream
uses of final_transcript/text.
Codecov Report: ✅ All modified and coverable lines are covered by tests.
a6be550 to c69efe5 (Compare)
Fixes #1741

Problem

When the OpenAI chat completions API sends messages with multi-part content (e.g. `"content": [{"type": "text", "text": "Hello"}]`), `llmrails.py` stored the raw list directly in event text fields without normalizing to a string. This caused two failures:

1. Prompt pollution — self-check / intent-matching LLM prompts received the Python repr of the list (`[{'type': 'text', 'text': '...'}]`) instead of the actual user text.
2. `TypeError` crash — in multi-turn conversations where `mask_prev_user_message` fires, `get_colang_history()` called `rsplit()` on the list and raised `TypeError: must be str or None, not list`.

Root cause: `_get_events_for_messages` set `"final_transcript": msg["content"]` and `"text": msg["content"]` without checking whether `content` is a list.

Solution

Add a static helper method `LLMRails._extract_text_content(content)` that:

- returns plain strings unchanged (and `""` for `None`);
- when a list is received, joins all `{"type": "text"}` parts with a space (image and other non-text parts are silently ignored, consistent with how the rest of the codebase handles multimodal content).

Apply the helper to every user-message content assignment in `_get_events_for_messages`, for both the Colang 1.0 and 2.0 code paths.

Testing

- Unit tests for `_extract_text_content` (plain string, single text part, multiple text parts, image parts ignored).
- Integration tests showing `generate_async` no longer crashes with single-turn and multi-turn multi-part content.
- Existing `test_llmrails.py` tests unaffected.

Summary by CodeRabbit
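The crash described above can be reproduced in isolation: `str.rsplit()` raises the exact reported `TypeError` when handed a list as separator. A standalone sketch (this is not the actual `get_colang_history` code, just a minimal reproduction of the failure mode and the fix):

```python
raw = [{"type": "text", "text": "Hello again"}]

# Passing the raw list where a string separator is expected reproduces the
# reported error (e.g. "must be str or None, not list"):
try:
    "bot: hi\nuser: Hello again".rsplit(raw)
except TypeError as exc:
    print(f"TypeError: {exc}")

# Normalizing first — joining the text parts the way the helper does — lets
# the history code operate on a plain string:
normalized = " ".join(
    p.get("text", "")
    for p in raw
    if isinstance(p, dict) and p.get("type") == "text"
)
print("bot: hi\nuser: Hello again".rsplit(normalized))
```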