Show LLM retries and allow resume from rate-limit state #6438

Draft · wants to merge 2 commits into main
Conversation

raymyers (Contributor):

End-user friendly description of the problem this fixes or functionality that this introduces

Show LLM retries as they happen and allow resume from Rate-Limited state

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below

Give a summary of what the PR does, explaining any non-trivial design decisions

Currently, when a RateLimitError is thrown during a step, the retries happen with no visibility to the user, and then this is shown:

[Screenshot 2025-01-23 103524]

This PR changes that in two ways. First, it passes the exception to _react_to_exception, which already has handling for it, triggering an Agent State of RateLimited:

[Screenshot 2025-01-23 103725]
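As a rough illustration of the first change (a sketch assuming litellm's RateLimitError and simplified names, not the actual diff):

```python
# Minimal sketch, not the actual OpenHands code: route a rate-limit error
# raised during a step to the controller's existing exception handling.
from litellm.exceptions import RateLimitError  # assumed exception type


async def run_one_step(controller, agent, state) -> None:
    try:
        action = agent.step(state)
    except RateLimitError as e:
        # _react_to_exception already knows this error and moves the agent
        # into the RateLimited state instead of surfacing an opaque failure.
        await controller._react_to_exception(e)
        return
    await controller.on_action(action)  # hypothetical continuation
```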

Second, we want to show the user the retry loop as it occurs, so a retry_listener hook is added to the LLM class and used to add the retries to the Event Stream as Errors from the Environment.
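A minimal sketch of that hook, assuming an (attempt, max_attempts) callback signature (the real change threads it through the LLM's retry logic):

```python
from typing import Callable, Optional

# Assumed listener signature: (attempt, max_attempts) -> None
RetryListener = Callable[[int, int], None]


class LLM:
    def __init__(self, config, retry_listener: Optional[RetryListener] = None):
        self.config = config
        self.retry_listener = retry_listener

    def _on_retry(self, attempt: int, max_attempts: int) -> None:
        # Invoked from the retry loop so the caller can surface each retry
        # to the user instead of retrying silently.
        if self.retry_listener is not None:
            self.retry_listener(attempt, max_attempts)
```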

Questions

Is the Event Stream the appropriate place? We want to show it to the user in chat but not expose it to the Agent. I'm concerned that, as an Observation, it might be fed back into the LLM.


Link of any specific issues this addresses

@@ -532,7 +534,7 @@ async def start_delegate(self, action: AgentDelegateAction) -> None:
         agent_cls: Type[Agent] = Agent.get_cls(action.agent)
         agent_config = self.agent_configs.get(action.agent, self.agent.config)
         llm_config = self.agent_to_llm_config.get(action.agent, self.agent.llm.config)
-        llm = LLM(config=llm_config)
+        llm = LLM(config=llm_config, retry_listener=self._notify_on_llm_retry)
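For context, a _notify_on_llm_retry handler along the lines the description suggests might look roughly like this (a sketch; the class and import paths are assumptions rather than the PR's actual code):

```python
# Hypothetical sketch only; real module and class names may differ.
from openhands.events.observation import ErrorObservation  # assumed path
from openhands.events.event import EventSource             # assumed path


class AgentController:
    def _notify_on_llm_retry(self, retries: int, max_retries: int) -> None:
        # Surface each retry to the user by adding an error event to the
        # Event Stream, attributed to the environment rather than the agent.
        self.event_stream.add_event(
            ErrorObservation(f'Retrying LLM request, {retries} / {max_retries}'),
            EventSource.ENVIRONMENT,
        )
```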
Collaborator:
Check out self._status_callback--I think that's the right way to do this

@raymyers (Contributor, Author), Jan 23, 2025:
OK, on another read-through I might understand how status_callback could be repurposed for this. I had assumed it was linked to set_agent_state_to, but they can be called independently...

Collaborator:
so _status_callback isn't about agent status so much as...system status I guess?

At a high level, it's a way to pass messages back to the FE via the websocket without going through the EventStream

Currently the only place it gets used is while the runtime boots up, so we can show a little progress. So we might need to change the FE logic a bit to support a wider set of use cases here

Collaborator:
I do think the little "status indicator" next to the play/pause button is the right place for something like this. Not sure at what point we'd clear it out though.

We have a pattern of sending back a single space ' ' (why not the empty string? no idea) to clear the status message

if self.status_callback is not None:
    msg_id = 'STATUS$LLM_RETRY'
    self.status_callback(
        'info', msg_id, f'Retrying LLM request, {retries} / {max}'
    )
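If the retry does get reported through the status callback, the single-space clearing convention mentioned above would presumably apply once the retries stop; a sketch, assuming the same msg_id:

```python
# Sketch only: clear the retry status using the existing "send a single
# space" convention once the request finally succeeds or gives up.
if self.status_callback is not None:
    self.status_callback('info', 'STATUS$LLM_RETRY', ' ')
```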
raymyers (Contributor, Author):

This might be better as warn rather than info if that were a status, but it would need to be added to frontend/src/services/actions.ts, and that only makes sense if we have others.
