
fix: normalize OpenAI multi-part content in _get_events_for_messages #1805

Open

octo-patch wants to merge 1 commit into NVIDIA-NeMo:develop from octo-patch:fix/issue-1741-multipart-content-normalization

Conversation


@octo-patch octo-patch commented Apr 21, 2026

Fixes #1741

Problem

When the OpenAI chat completions API sends messages with multi-part content (e.g. "content": [{"type": "text", "text": "Hello"}]), llmrails.py stored the raw list directly in event text fields without normalizing to a string. This caused two failures:

  1. Garbled prompts — self-check and intent-matching prompts received the Python repr of the list ([{'type': 'text', 'text': '...'}]) instead of the actual user text.
  2. TypeError crash — in multi-turn conversations where mask_prev_user_message fires, get_colang_history() called rsplit() on the list and raised TypeError: must be str or None, not list.

Root cause: _get_events_for_messages set "final_transcript": msg["content"] and "text": msg["content"] without checking whether content is a list.
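
For reference, a minimal illustration of the message shape involved (the values and the image URL are made up for the example):

```python
# Illustrative only: the OpenAI-style multi-part content shape that triggered both failures.
msg = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Hello"},
        {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
    ],
}

# Stored verbatim, the list reaches prompts as its Python repr rather than the user text:
print(msg["content"])  # [{'type': 'text', 'text': 'Hello'}, {'type': 'image_url', ...}]
```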

Solution

Add a static helper method LLMRails._extract_text_content(content) that:

  • Returns plain strings unchanged.
  • For list content, joins all {"type": "text"} parts with a space (image and other non-text parts are silently ignored, consistent with how the rest of the codebase handles multimodal content).

Apply the helper to every user-message content assignment in _get_events_for_messages for both the Colang 1.0 and 2.0 code paths.
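
The intended behavior, condensed into a few illustrative assertions (they mirror the unit tests added in this PR rather than quoting them; the exact implementation is in the diff below):

```python
from nemoguardrails import LLMRails

# Illustrative expectations for the new helper.
assert LLMRails._extract_text_content("hello") == "hello"
assert LLMRails._extract_text_content(None) == ""
assert (
    LLMRails._extract_text_content(
        [
            {"type": "text", "text": "First part."},
            {"type": "image_url", "image_url": {"url": "https://example.com/img.png"}},
            {"type": "text", "text": "Second part."},
        ]
    )
    == "First part. Second part."
)
```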

Testing

  • Added unit tests for _extract_text_content (plain string, single text part, multiple text parts, image parts ignored).
  • Added integration tests verifying that generate_async no longer crashes with single-turn and multi-turn multi-part content.
  • All 6 new tests pass; existing test_llmrails.py tests unaffected.

Summary by CodeRabbit

  • Bug Fixes
    • Improved handling of multi-part message content formats to prevent crashes and enhance compatibility with OpenAI messages
  • Tests
    • Added comprehensive test coverage for message content normalization across various input formats

fix: normalize OpenAI multi-part content in _get_events_for_messages (fixes NVIDIA-NeMo#1741)

When the OpenAI chat completions API is used with multi-part content
(e.g. `"content": [{"type": "text", "text": "…"}]`), llmrails.py was
storing the raw list in event text fields instead of a plain string.
This caused two problems:

1. Self-check / intent-matching LLM prompts received the Python repr
   of the list instead of the actual user text.
2. In multi-turn conversations `get_colang_history()` crashed with
   `TypeError: must be str or None, not list` at the `rsplit()` call.

Fix: add a static helper `LLMRails._extract_text_content()` that joins
all `{"type": "text"}` parts into a single string (silently ignoring
image and other non-text parts), and apply it to every place in
`_get_events_for_messages` where user-message content is stored in an
event field — for both Colang 1.0 and 2.0 paths.

Tests added for the helper function and for the two failure modes
described in the issue.
Contributor

greptile-apps Bot commented Apr 21, 2026

Greptile Summary

This PR fixes issue #1741 by adding LLMRails._extract_text_content() to normalize OpenAI multi-part content lists (e.g. [{"type": "text", "text": "…"}]) into plain strings before they are written into Colang events, preventing both garbled prompts and the TypeError crash in multi-turn flows.

The fix is applied consistently across all four user-message content sites in both the Colang 1.0 and 2.0 paths. Unit and integration tests cover the new behavior well.

Confidence Score: 5/5

Safe to merge — all remaining findings are minor style suggestions with no correctness impact.

The fix is narrowly scoped, covers all affected call sites, and is backed by both unit and integration tests. No P0/P1 issues were found.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| nemoguardrails/rails/llm/llmrails.py | Adds _extract_text_content static helper and applies it at all four user-message content assignment sites in both Colang 1.0 and 2.0 paths of _get_events_for_messages; minor edge case with empty text-part keys. |
| tests/test_llmrails.py | Adds 4 unit tests for _extract_text_content and 2 async integration tests covering single-turn and multi-turn multi-part content; tests are functional but each unit test creates an unnecessary LLMRails instance for a static method. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["msg['content'] received"] --> B{isinstance list?}
    B -- Yes --> C["Filter parts where type == 'text'"]
    C --> D["join text values with space"]
    D --> E["Return normalized string"]
    B -- No --> F{content is None?}
    F -- Yes --> G["Return ''"]
    F -- No --> H["Return content as-is"]
    E --> I["Set final_transcript / text in event"]
    G --> I
    H --> I
Reviews (1): Last reviewed commit: "fix: normalize OpenAI multi-part content..."

Comment on lines +611 to +615
return " ".join(
part.get("text", "")
for part in content
if isinstance(part, dict) and part.get("type") == "text"
)

P2 Malformed text parts yield leading/trailing spaces

When a content part has "type": "text" but is missing the "text" key, part.get("text", "") returns "", and " ".join(["", "hello"]) produces " hello" (a leading space). Filtering out empty parts before joining avoids this:

Suggested change
-            return " ".join(
-                part.get("text", "")
-                for part in content
-                if isinstance(part, dict) and part.get("type") == "text"
-            )
+            return " ".join(
+                part["text"]
+                for part in content
+                if isinstance(part, dict) and part.get("type") == "text" and part.get("text")
+            )
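
A quick REPL-style illustration of the difference the suggestion makes (standalone snippet, outside the class):

```python
# A text part with no "text" key contributes an empty string to the join, producing a stray space.
parts = [{"type": "text"}, {"type": "text", "text": "hello"}]

current = " ".join(
    p.get("text", "") for p in parts if isinstance(p, dict) and p.get("type") == "text"
)
suggested = " ".join(
    p["text"] for p in parts if isinstance(p, dict) and p.get("type") == "text" and p.get("text")
)

print(repr(current))    # ' hello'  (leading space)
print(repr(suggested))  # 'hello'
```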

Comment thread tests/test_llmrails.py
Comment on lines +1486 to +1526
def test_extract_text_content_string():
    """Plain strings should be returned unchanged."""
    config = RailsConfig.from_content(config={"models": []})
    rails = LLMRails(config=config, llm=FakeLLM(responses=[]))

    assert rails._extract_text_content("hello") == "hello"
    assert rails._extract_text_content("") == ""
    assert rails._extract_text_content(None) == ""


def test_extract_text_content_list_single_text_part():
    """A single-part list with type=text should return the text string."""
    config = RailsConfig.from_content(config={"models": []})
    rails = LLMRails(config=config, llm=FakeLLM(responses=[]))

    content = [{"type": "text", "text": "Hello there"}]
    assert rails._extract_text_content(content) == "Hello there"


def test_extract_text_content_list_multiple_text_parts():
    """Multiple text parts should be joined with a space."""
    config = RailsConfig.from_content(config={"models": []})
    rails = LLMRails(config=config, llm=FakeLLM(responses=[]))

    content = [
        {"type": "text", "text": "First part."},
        {"type": "text", "text": "Second part."},
    ]
    assert rails._extract_text_content(content) == "First part. Second part."


def test_extract_text_content_list_skips_image_parts():
    """Image parts should be silently ignored; only text parts are returned."""
    config = RailsConfig.from_content(config={"models": []})
    rails = LLMRails(config=config, llm=FakeLLM(responses=[]))

    content = [
        {"type": "image_url", "image_url": {"url": "http://example.com/img.png"}},
        {"type": "text", "text": "Describe this image."},
    ]
    assert rails._extract_text_content(content) == "Describe this image."

P2 Unit tests instantiate LLMRails unnecessarily for a @staticmethod

_extract_text_content is a static method and has no dependency on instance state, so each of the four unit tests creating a full RailsConfig + LLMRails adds setup overhead without benefit. Calling it as a class method is cleaner:

assert LLMRails._extract_text_content("hello") == "hello"

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!


coderabbitai Bot commented Apr 21, 2026

📝 Walkthrough


The PR adds a static method to normalize OpenAI multi-part message content format into single strings. It updates message event creation to use extracted text instead of raw content lists, preventing TypeError crashes and ensuring prompts receive actual text rather than list representations.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Content Normalization: nemoguardrails/rails/llm/llmrails.py | Added _extract_text_content() static method to convert OpenAI multi-part content lists (e.g., [{"type": "text", "text": "..."}]) into space-joined strings, with fallback to empty string for None. Updated _get_events_for_messages() to normalize content via this method when populating UtteranceUserActionFinished.final_transcript, UserMessage.text, and related fields for both Colang 1.0 and 2.x message handling. |
| Test Coverage: tests/test_llmrails.py | Added comprehensive test cases for content extraction: plain string passthrough, None handling, single and multi-part text lists, and mixed-type part filtering. Added integration tests verifying generate_async() handles multi-part content without crashing in single-turn and multi-turn conversation scenarios. |
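
For orientation, a hedged sketch of where the normalization now happens (the event type and field name are taken from the walkthrough above; _build_user_event is a hypothetical name for illustration, and the sketch assumes the helper from this PR branch is importable):

```python
from nemoguardrails import LLMRails

def _build_user_event(msg: dict) -> dict:
    # Hypothetical wrapper; the real code is inline in _get_events_for_messages.
    # Normalize multi-part content to a plain string before it is stored in the event.
    text = LLMRails._extract_text_content(msg["content"])
    return {"type": "UtteranceUserActionFinished", "final_transcript": text}

event = _build_user_event({"role": "user", "content": [{"type": "text", "text": "Hello"}]})
assert event["final_transcript"] == "Hello"
```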

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 6
✅ Passed checks (6 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: normalizing OpenAI multi-part content in the _get_events_for_messages method, which directly addresses the PR's core fix. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #1741 by adding _extract_text_content to normalize multi-part content, updating all affected code paths in _get_events_for_messages, and including comprehensive tests covering unit and integration scenarios. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to fixing issue #1741: the new helper method, its application to user message handling, and comprehensive tests for both Colang 1.0 and 2.x paths. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 80.00%, which is sufficient; the required threshold is 80.00%. |
| Test Results For Major Changes | ✅ Passed | PR qualifies for passing: changes are minor (localized bug fix with +22/-4 net code changes) and test results are explicitly documented in the PR description. |


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/test_llmrails.py (1)

1530-1565: Consider asserting normalized transcript content (not only no-crash).

These tests currently prove stability, but they don’t explicitly prove that prompt-facing user text is normalized to a string. Adding an assertion on generated user events/history text would lock in the behavior targeted by issue #1741.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `tests/test_llmrails.py` around lines 1530 - 1565, add assertions that validate the
normalized user text is a plain string, not only that generate_async doesn't crash: after
calling LLMRails.generate_async (in tests test_generate_async_multipart_user_content and
test_generate_async_multipart_multi_turn_no_type_error), inspect the colang history or the
generated user events and assert that the entries for the user messages ("Hello!", "Hello
again") are present as str values (e.g., isinstance(entry["content"], str)) and contain the
expected text. This locks in the normalization behavior that the tests target and prevents
regressions.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: fa0f3665-8ca7-4984-a421-0e6393fc89c3

📥 Commits

Reviewing files that changed from the base of the PR and between b9cd0fb and b2d9121.

📒 Files selected for processing (2)
  • nemoguardrails/rails/llm/llmrails.py
  • tests/test_llmrails.py

Comment on lines +601 to +617
    @staticmethod
    def _extract_text_content(content) -> str:
        """Normalize an OpenAI message content value to a plain string.

        The OpenAI spec allows ``content`` to be either a plain string or a list
        of content parts (e.g. ``[{"type": "text", "text": "…"}, …]``). When a
        list is received, extract and join all ``text`` parts so the rest of the
        pipeline always operates on a regular string.
        """
        if isinstance(content, list):
            return " ".join(
                part.get("text", "")
                for part in content
                if isinstance(part, dict) and part.get("type") == "text"
            )
        return content if content is not None else ""


⚠️ Potential issue | 🟡 Minor

Ensure _extract_text_content always returns str on all paths.

At Line 616, non-list non-None values are returned as-is, which can leak non-string types into final_transcript / text and break downstream string operations again.

Proposed hardening
 @staticmethod
 def _extract_text_content(content) -> str:
@@
-    if isinstance(content, list):
-        return " ".join(
-            part.get("text", "")
-            for part in content
-            if isinstance(part, dict) and part.get("type") == "text"
-        )
-    return content if content is not None else ""
+    if content is None:
+        return ""
+
+    if isinstance(content, str):
+        return content
+
+    if isinstance(content, list):
+        parts: List[str] = []
+        for part in content:
+            if isinstance(part, dict) and part.get("type") == "text":
+                text = part.get("text", "")
+                if text is None:
+                    continue
+                parts.append(text if isinstance(text, str) else str(text))
+        return " ".join(parts)
+
+    return str(content)
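
To make the concern concrete, a small illustration (assumes the helper from this PR branch is importable; the value 42 is made up):

```python
from nemoguardrails import LLMRails

# With the implementation as proposed in this PR, a non-list, non-None value passes through as-is:
value = LLMRails._extract_text_content(42)
print(type(value), repr(value))  # <class 'int'> 42  -- a non-str can reach final_transcript / text
# With the hardened version above, the same call would return the string "42".
```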
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `nemoguardrails/rails/llm/llmrails.py` around lines 601 - 617, the
_extract_text_content function can return non-string values when content is
neither a list nor None; modify _extract_text_content so every return value is
explicitly a str: when content is a list, keep the existing join but coerce each
part's .get("text", "") to str before joining, and for the non-list branch
return str(content) if content is not None (otherwise ""), guaranteeing
_extract_text_content always yields a Python str for downstream uses of
final_transcript / text.


codecov Bot commented Apr 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@Pouyanpi Pouyanpi force-pushed the develop branch 4 times, most recently from a6be550 to c69efe5 on May 6, 2026 at 16:01

Development

Successfully merging this pull request may close these issues.

bug: Multi-part content format (OpenAI spec) causes garbled prompts and TypeError crash

1 participant