Show LLM retries and allow resume from rate-limit state #6438

Draft · wants to merge 2 commits into main
Conversation

raymyers (Contributor):

End-user friendly description of the problem this fixes or functionality that this introduces

Show LLM retries as they happen and allow resume from Rate-Limited state

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below

Give a summary of what the PR does, explaining any non-trivial design decisions

Currently, when a RateLimitError is thrown during a step, the retries happen with no visibility to the user, and then this is shown:

[Screenshot 2025-01-23 103524]

This PR changes that in two ways. First, it passes the exception to _react_to_exception, which already has handling for it, triggering an Agent State of RateLimited:

[Screenshot 2025-01-23 103725]
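As a rough illustration of the first change (a sketch assuming litellm's RateLimitError and simplified names, not the actual diff):

```python
# Minimal sketch, not the actual OpenHands code: route a rate-limit error
# raised during a step to the controller's existing exception handling.
from litellm.exceptions import RateLimitError  # assumed exception type


async def run_one_step(controller, agent, state) -> None:
    try:
        action = agent.step(state)
    except RateLimitError as e:
        # _react_to_exception already knows this error and moves the agent
        # into the RateLimited state instead of surfacing an opaque failure.
        await controller._react_to_exception(e)
        return
    await controller.on_action(action)  # hypothetical continuation
```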

Second, we want to show the user the retry loop as it occurs, so a retry_listener hook is added to the LLM class and used to add the retries to the Event Stream as Errors from the Environment.
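A minimal sketch of that hook, assuming an (attempt, max_attempts) callback signature (the real change threads it through the LLM's retry logic):

```python
from typing import Callable, Optional

# Assumed listener signature: (attempt, max_attempts) -> None
RetryListener = Callable[[int, int], None]


class LLM:
    def __init__(self, config, retry_listener: Optional[RetryListener] = None):
        self.config = config
        self.retry_listener = retry_listener

    def _on_retry(self, attempt: int, max_attempts: int) -> None:
        # Invoked from the retry loop so the caller can surface each retry
        # to the user instead of retrying silently.
        if self.retry_listener is not None:
            self.retry_listener(attempt, max_attempts)
```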

Questions

Is the Event Stream the appropriate place? We want to show it to the user in chat but not expose it to the Agent. I'm concerned that, as an Observation, it might be fed back into the LLM.


Link of any specific issues this addresses

@@ -532,7 +534,7 @@ async def start_delegate(self, action: AgentDelegateAction) -> None:
         agent_cls: Type[Agent] = Agent.get_cls(action.agent)
         agent_config = self.agent_configs.get(action.agent, self.agent.config)
         llm_config = self.agent_to_llm_config.get(action.agent, self.agent.llm.config)
-        llm = LLM(config=llm_config)
+        llm = LLM(config=llm_config, retry_listener=self._notify_on_llm_retry)
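For context, a _notify_on_llm_retry handler along the lines the description suggests might look roughly like this (a sketch; the class and import paths are assumptions rather than the PR's actual code):

```python
# Hypothetical sketch only; real module and class names may differ.
from openhands.events.observation import ErrorObservation  # assumed path
from openhands.events.event import EventSource             # assumed path


class AgentController:
    def _notify_on_llm_retry(self, retries: int, max_retries: int) -> None:
        # Surface each retry to the user by adding an error event to the
        # Event Stream, attributed to the environment rather than the agent.
        self.event_stream.add_event(
            ErrorObservation(f'Retrying LLM request, {retries} / {max_retries}'),
            EventSource.ENVIRONMENT,
        )
```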
Collaborator:
Check out self._status_callback--I think that's the right way to do this

@raymyers (Contributor, Author), Jan 23, 2025:
OK, on another read-through I might understand how status_callback could be repurposed for this. I had assumed it was linked to set_agent_state_to, but they can be called independently...

Collaborator:
so _status_callback isn't about agent status so much as...system status I guess?

At a high level, it's a way to pass messages back to the FE via the websocket without going through the EventStream

Currently the only place it gets used is while the runtime boots up, so we can show a little progress. So we might need to change the FE logic a bit to support a wider set of use cases here

Collaborator:
I do think the little "status indicator" next to the play/pause button is the right place for something like this. Not sure at what point we'd clear it out though.

We have a pattern of sending back a single space ' ' (why not the empty string? no idea) to clear the status message

if self.status_callback is not None:
    msg_id = 'STATUS$LLM_RETRY'
    self.status_callback(
        'info', msg_id, f'Retrying LLM request, {retries} / {max}'
    )
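If the retry does get reported through the status callback, the single-space clearing convention mentioned above would presumably apply once the retries stop; a sketch, assuming the same msg_id:

```python
# Sketch only: clear the retry status using the existing "send a single
# space" convention once the request finally succeeds or gives up.
if self.status_callback is not None:
    self.status_callback('info', 'STATUS$LLM_RETRY', ' ')
```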
raymyers (Contributor, Author):

This might be better as warn rather than info if that were a status, but it would need to be added to frontend/src/services/actions.ts, and that only makes sense if we have others.
