fix(conversation): preserve tool result JSON structure in sliding window management #94

cagataycali · 2025-05-23T19:47:16Z

Description

Previously, the sliding window conversation manager would convert toolResult JSON
structures to plain text when trimming messages, causing LLMs to lose the ability
to properly parse tool results. This led to hallucinations and incorrect tool calls.

Issue: #95

The issue manifested when the conversation exceeded the window size and the trim
point fell on a message containing toolResult content. The _map_tool_result_content()
method would convert structured JSON like:
{"toolResult": {"toolUseId": "123", "content": [...], "status": "success"}}
into plain text like:
"Tool Result JSON Content: {...}"

This fix:

Removes the problematic _map_tool_result_content() method entirely
Adds a new _find_safe_trim_index() method that intelligently finds cut points
that preserve tool use/result pairs together
Introduces helper functions to track tool relationships:
- has_tool_use() and has_tool_result() for checking message content
- get_tool_use_ids() and get_tool_result_ids() for extracting tool IDs
Ensures tool use and result pairs are never separated during trimming
Maintains the original JSON structure of tool results throughout the conversation

The new approach:

Maps all tool IDs to their message indices
Adjusts trim points to keep related tool interactions together
Falls back to basic trimming only when no safe cut point exists
Preserves the integrity of the conversation structure for LLMs

Testing shows this eliminates the hallucination issues and maintains proper
tool interaction patterns even when conversations exceed the window size.

BREAKING CHANGE: The internal message trimming behavior has changed. While this
maintains API compatibility, it may result in slightly different message retention
patterns when the window size is exceeded.

Checklist

I have read the CONTRIBUTING document
I have added tests that prove my fix is effective or my feature works
[N/A] I have updated the documentation accordingly
[N/A] I have added an appropriate example to the documentation to outline the feature
[N/A] My changes generate no new warnings
[N/A] Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

…dow management Previously, the sliding window conversation manager would convert toolResult JSON structures to plain text when trimming messages, causing LLMs to lose the ability to properly parse tool results. This led to hallucinations and incorrect tool calls. The issue manifested when the conversation exceeded the window size and the trim point fell on a message containing toolResult content. The _map_tool_result_content() method would convert structured JSON like: {"toolResult": {"toolUseId": "123", "content": [...], "status": "success"}} into plain text like: "Tool Result JSON Content: {...}" This fix: - Removes the problematic _map_tool_result_content() method entirely - Adds a new _find_safe_trim_index() method that intelligently finds cut points that preserve tool use/result pairs together - Introduces helper functions to track tool relationships: - has_tool_use() and has_tool_result() for checking message content - get_tool_use_ids() and get_tool_result_ids() for extracting tool IDs - Ensures tool use and result pairs are never separated during trimming - Maintains the original JSON structure of tool results throughout the conversation The new approach: 1. Maps all tool IDs to their message indices 2. Adjusts trim points to keep related tool interactions together 3. Falls back to basic trimming only when no safe cut point exists 4. Preserves the integrity of the conversation structure for LLMs Testing shows this eliminates the hallucination issues and maintains proper tool interaction patterns even when conversations exceed the window size. BREAKING CHANGE: The internal message trimming behavior has changed. While this maintains API compatibility, it may result in slightly different message retention patterns when the window size is exceeded.

fede-dash · 2025-05-25T06:08:59Z

src/strands/agent/conversation_manager/sliding_window_conversation_manager.py

+        tool_use_indices = {}  # toolUseId -> message index
+        tool_result_indices = {}  # toolUseId -> message index
+
+        for i, message in enumerate(messages):


get_tool_use_ids() and get_tool_result_ids() here, consider inlining the logic to avoid scanning message["content"] twice per message and allocating temporary lists.

A single-pass loop over message["content"] can extract both toolUseId and toolResultId efficiently:

for content in message.get("content", []): if "toolUse" in content: tool_id = content["toolUse"].get("toolUseId") if tool_id: tool_use_indices[tool_id] = i if "toolResult" in content: tool_id = content["toolResult"].get("toolUseId") if tool_id: tool_result_indices[tool_id] = i

This avoids unnecessary allocations and keeps the loop flat

fede-dash · 2025-05-25T06:18:37Z

src/strands/agent/conversation_manager/sliding_window_conversation_manager.py

+        safe_index = initial_trim_index
+
+        # Adjust if we would cut in the middle of a tool use/result pair
+        for tool_id, use_idx in tool_use_indices.items():


If you would like to avoid so much nesting

for tool_id, use_idx in tool_use_indices.items(): if tool_id not in tool_result_indices: continue result_idx = tool_result_indices[tool_id] # If the pair would be split by the cut, move it earlier to keep them together if use_idx < safe_index <= result_idx: safe_index = min(safe_index, use_idx) continue # Invalid ordering: toolResult appears before toolUse if result_idx < safe_index < use_idx: logger.warning("tool_id=<%s> | found toolResult before toolUse", tool_id)

cagataycali · 2025-05-26T18:34:26Z

Resolved in: #120

cagataycali requested a review from a team as a code owner May 23, 2025 19:47

cagataycali added the bug Something isn't working label May 23, 2025

cagataycali mentioned this pull request May 23, 2025

[BUG]: SlidingWindowConversationManager corrupts toolResult JSON structure causing LLM hallucinations #95

Closed

fede-dash reviewed May 25, 2025

View reviewed changes

cagataycali mentioned this pull request May 26, 2025

feat: Update SlidingWindowConversationManager #120

Merged

6 tasks

cagataycali closed this May 26, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(conversation): preserve tool result JSON structure in sliding window management #94

fix(conversation): preserve tool result JSON structure in sliding window management #94

Uh oh!

cagataycali commented May 23, 2025 •

edited

Loading

Uh oh!

fede-dash May 25, 2025 •

edited

Loading

Uh oh!

fede-dash May 25, 2025

Uh oh!

cagataycali commented May 26, 2025

Uh oh!

Uh oh!

fix(conversation): preserve tool result JSON structure in sliding window management #94

fix(conversation): preserve tool result JSON structure in sliding window management #94

Uh oh!

Conversation

cagataycali commented May 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Checklist

Uh oh!

fede-dash May 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

fede-dash May 25, 2025

Choose a reason for hiding this comment

Uh oh!

cagataycali commented May 26, 2025

Uh oh!

Uh oh!

cagataycali commented May 23, 2025 •

edited

Loading

fede-dash May 25, 2025 •

edited

Loading