There are at least some cases when Neuron doesn't handle implicit retries in case the LLM request tool use with an invalid request body (I personally ran into situation with missing required parameters, but it may not do the recovery in more cases, like bad parameter types, invalid array items – less items, more items, bad type of some item, …).
Also, a similar problem comes with MALFORMED_FUNCTION_CALL finish reason sent by the Gemini provider—the app crashes, which isn't acceptable.
I think this is one of the core features of a decent-quality AI library. Every crash could cause data loss, bad customer experience, and other similar problems. As LLMs are stochastic and sometimes hallucinate, they generate invalid JSON schemas, don't respect property definitions, etc. There should be a recovery mechanism that implicitly retries the request, so developers don't have to handle this every time they write an agent that has some tools.
Also, I hope an implicit retry mechanism is implemented for structured outputs or casual API errors (the API is unavailable for a few seconds, it crashed during request processing, etc.). The reasons are pretty much the same as for tool use.
There are at least some cases when Neuron doesn't handle implicit retries in case the LLM request tool use with an invalid request body (I personally ran into situation with missing required parameters, but it may not do the recovery in more cases, like bad parameter types, invalid array items – less items, more items, bad type of some item, …).
Also, a similar problem comes with
MALFORMED_FUNCTION_CALLfinish reason sent by the Gemini provider—the app crashes, which isn't acceptable.I think this is one of the core features of a decent-quality AI library. Every crash could cause data loss, bad customer experience, and other similar problems. As LLMs are stochastic and sometimes hallucinate, they generate invalid JSON schemas, don't respect property definitions, etc. There should be a recovery mechanism that implicitly retries the request, so developers don't have to handle this every time they write an agent that has some tools.
Also, I hope an implicit retry mechanism is implemented for structured outputs or casual API errors (the API is unavailable for a few seconds, it crashed during request processing, etc.). The reasons are pretty much the same as for tool use.