Skip to content

Session persistence #302

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Unshure
Copy link
Member

@Unshure Unshure commented Jun 27, 2025

Description

This PR introduces a SessionManagement class to the Strands SDK that is in charge of persisting an agent to some datastore, and re-initializing the agent. The SessionManagement feature set includes:

  • Introduces an agent id attribute used as a unique id for differentiating agents

  • SessionManager: Interface for SessionManagers. This interface introduces a append_message_to_agent_session method that is called in the Agent class when adding a message to the agent messages array, as well as an initialize_agent class for initializing an agent

  • AgentSessionManager: Implementation of the above interface for persisting a single agent to some datastore. This will also re-initialize the agent's messages and state.

  • Session models: Three new models to represent different data types for session persistence:

    • Session: The overarching container to represent a strands agent, and in the future, multi-agent systems. One session can have many agents
    • SessionAgent: A persisted version of an agent, storing the agents state and messages
    • SessionMessage: A persisted version of a message
  • SessionDAO: DAO Interface with CRUDL operations for each of the above new models

  • FileSessionDAO: A implementation of the SessionDAO for storing a session in the local file system. This introduces the following directory and file structure to represent a Session:

    • <storage_dir>/session_<session_id>/
      • session.json # Stores session related data
      • agents/agent_<agent_id>/
        • agent.json # Stores agent related data
        • messages/messages_<message_id>.json # Stores message content
  • S3SessionDAO: Same as the above FileSessionDAO, but stores sessions in S3

Things that still need to be addressed:

  • all unit tests are vibe-coded, need to do a more thorough review
  • Integ tests
  • Update message after guardrail triggers
  • Order event files
  • Review docstrings (some seem to be out of date)
  • Add tests for EventLoopMetric ser/de
  • Respect created_at time for DAO Update methods

Changes for a follow-up pr:

  • Support ConversationManager refreshing (SummarizationConversationManager)

#246

Documentation PR

TODO:

Type of Change

New feature

Testing

How have you tested the change?

  • Vibecoded unit tests and local testing
  • Still needs:
    • Thorough review of unit tests
    • Integration tests

• [x] I ran hatch run prepare

Checklist

• [x] I have read the CONTRIBUTING document
• [x] I have added any necessary tests that prove my fix is effective or my feature works
• [ ] I have updated the documentation accordingly
• [ ] I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
• [x] My changes generate no new warnings
• [x] Any dependent changes have been merged and published

Additional Notes:
N/A

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the
terms of your choice.

AGENT = "AGENT"


@dataclass
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will likely convert these to TypedDict classes. The main reason I had dataclass was for the lambda default_factory, but this introduces more complexity than its worth.

@Unshure Unshure force-pushed the session-persistence branch from 74d7ed6 to 3b81f1b Compare July 1, 2025 14:49
from ..agent.agent import Agent


class AgentSessionManager(SessionManager):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name seems so abstract IMHO. I feel like this should be FileSessionManager or something that indicates that it's file based

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted this to represent a single agent session manager. Since we eventually want to support multi-agent session managers, I figured we might have a GraphSessionManager or SwarmSessionManager in the future as well.

"""

@abstractmethod
def append_message_to_agent_session(self, agent: "Agent", message: Message) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems verbose compared to many of our apis is there a need to specify agent_session here?

I'd suggest append_message

raise NotImplementedError("Subclasses must implement update_session")

@abstractmethod
def initialize_agent(self, agent: "Agent") -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also suggest just initialize; I'm not entirely convinced we need to repeat agent?

agent: The agent whose session to update
message: The message to append
"""
if agent.id is None:
Copy link
Member

@zastrowm zastrowm Jul 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should just default a value (like "default")

My concern is that someone just trying to persist a single agent now needs to add both the session_manager and set an id. So someone persisting a personal agent now has more steps?

Not a blocker though

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this would work by checking if there is only one agent in the session, that agent is given a generated id. Then when it is re-initialized, the id will be set? I like that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

oh, that's better than what I was thinking, because I didn't handle the edge case of "you have multiple sessions on file but the agent has no id set"

from .session_models import Session, SessionAgent, SessionMessage


class SessionDAO(ABC):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many of the methods on here are actually used within the SessionManager? My concern is that we're adding implementations of things that are potentially useful from the standpoint of callers, but not necessary for the manager. I think I'd rather keep the api surface smaller until we have alignment that the apis are the right ones

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, I wonder if we need the DAO. I'm curious, for something like a provider that wanted to do more than just storing/loading messages? Say if they wanted to affect the prompt or something - would they be implementing a DAO or a SessionManager or both? Do we need both?

Copy link
Member Author

@Unshure Unshure Jul 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and forth on this one a bit.

The general request for this feature is some built-in way to persist and re-initialize an agent as it runs. The interface for a SessionManager should be super basic to support any kind of implementation that may come up.

If we also want to include some implementation of how to actually persist and retrieve an agent, I think a DAO makes sense for that. I included a pretty basic data model to represent agents, messages, and some wrapper around many agents which i'm calling a Session. I imagine if customers wants to build on top of the persistence we have build in, we should probably implement a basic CRUDL api for it. interested in hearing your thoughts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One aspect that I'm hung up on is the size of the DAO - 15 methods to implement a new backend was more than I was expecting - but is that perhaps because of the intrinsic complexity?

The other aspect is the distinction between the DAO and the SessionManager, and so I'm interested in the eventual full usage. This makes sense to me:

sessions = S3SessionManager()
agent = Agent(sessions=sessions, id='default')

while this seems to be a bit less ergonomic:

sessions = SessionManager(dao=S3Dao())
agent = Agent(sessions=sessions, id='default')

"""Exception classes for session management operations."""


class SessionException(Exception):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have one file for all exceptions in src/strands/types/exceptions.py



@dataclass
class SessionMessage:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Food for thought?: Have we considered having this in the SDK directly in its implementation of Message and we just strip them when we send it to the providers?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is only important for the session persistence, I would probably keep is separated. If we find another usecase for it though, I would be fine with your suggestion.


@dataclass
class SessionAgent:
"""Agent within a session with lazy message loading."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does the lazy-loading come in to play?

Would this be more clear as AgentMetadata?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old vibecoded docstring. This can be removed


except SessionException:
# Session doesn't exist, create new one
logger.debug("Session not found, creating new session")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably best to log the session-id here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add an example usage of what it looks like when using a session with an agent, from the caller's perspective?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ill add an integration test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: We're Working On It
Development

Successfully merging this pull request may close these issues.

2 participants