Find answers from the community

Updated last month

Timeout

At a glance

The community member is trying to figure out why a higher timeout set for the OpenAI LLM option is not being used by the retry decorator, which always kicks in at 60 seconds. They are trying to call a chat completion with the DeepSeek R1 model, which always returns a 200 OK response with an empty body at the 60-second mark.

The comments indicate that this is a bug, and a community member has made a local fix and plans to submit a pull request. However, the issue still seems to be that the timeout is not being passed all the way to the OpenAI client call, as the timeout is set to "NOT_GIVEN" there. The community members discuss potential reasons for this, such as the client call setting the timeout somewhere else, and plan to investigate further.

I am trying to figure out how a higher timeout set for the openai llm option isn't clobbered by the retry_decorator here:
retry = create_retry_decorator(
    max_retries=max_retries,
    random_exponential=True,
    stop_after_delay_seconds=60,
    min_seconds=1,
    max_seconds=20,
)

Even though I have set a higher timeout, the retry always kicks in at 60 seconds. Shouldn't the retry decorator use the same timeout value for stop_after_delay_seconds?

Maybe I am missing something?

For additional context, I am trying to call a chat completion with the DeepSeek R1 model, and the POST to https://api.deepseek.com/chat/completions always returns after 60 seconds with a 200 OK but an empty response.
14 comments
Nah, this is a bug. Would appreciate a PR if you have time.
Otherwise I can get to it in a bit
I made a fix locally. Will submit
But, despite that, it still looks like the timeout doesn't make it all the way to the client http call. The method for chat completions in the openai library has timeout as NOT_GIVEN, so this might still be an issue.
So here, or the equivalent in achat.
When it gets to openai here, timeout isn't set.
Probably not something encountered often, but with reasoning models the round trip can take much longer than with regular chat models
If I'm reading the code properly, the timeout is passed in the credentials kwargs
https://github.com/run-llama/llama_index/blob/7391f302e18542c68b9cf5025afb510af4a52324/llama-index-integrations/llms/llama-index-llms-openai/llama_index/llms/openai/base.py#L404

Which is used when creating the openai client OpenAI(...., timeout=60.0)
Sure, but when I step through the code, timeout at the POST call is NOT_GIVEN 🤷🏽‍♂️
So it does say that the timeout in the function is an override of the client-level timeout. So maybe the client call is setting the timeout somewhere else. I'll check the http calls and see if that's where it's showing up.
✨️ appreciate it! Taking a look
This is strange: in the actions on llama-index, that isn't my commit...
Not sure on that one, GitHub getting spooky