Find answers from the community

Updated last month

Timeout

At a glance

The community member is trying to figure out why a higher timeout set for the OpenAI LLM option is not being used by the retry decorator, which always kicks in at 60 seconds. They are trying to call a chat completion with the DeepSeek R1 model, which always returns a 200 OK response with an empty body at the 60-second mark.

The comments indicate that this is a bug, and a community member has made a local fix and plans to submit a pull request. However, the issue still seems to be that the timeout is not being passed all the way to the OpenAI client call, as the timeout is set to "NOT_GIVEN" there. The community members discuss potential reasons for this, such as the client call setting the timeout somewhere else, and plan to investigate further.

I am trying to figure out how a higher timeout set for the openai llm option isn't clobbered by the retry_decorator here:
retry = create_retry_decorator(
    max_retries=max_retries,
    random_exponential=True,
    stop_after_delay_seconds=60,
    min_seconds=1,
    max_seconds=20,
)

Even though I have set a higher timeout, the retry always kicks in at 60 seconds. Shouldn't the retry decorator use the same timeout value for stop_after_delay_seconds?

Maybe I am missing something?

For additional context, I am trying to call a chat completion with the DeepSeek R1 model, and the POST to https://api.deepseek.com/chat/completions always returns after 60 seconds with a 200 OK but an empty response.
14 comments
Nah, this is a bug. Would appreciate a PR if you have time.
Otherwise I can get to it in a bit
I made a fix locally. Will submit
But, despite that, it still looks like the timeout doesn't make it all the way to the client http call. The method for chat completions in the openai library has timeout as NOT_GIVEN, so this might still be an issue.
So here, or the equivalent in achat.
When it gets to openai here, timeout isn't set.
Probably not something encountered often, but with reasoning models the round trip can take much longer than with regular chat models
If I'm reading the code properly, the timeout is passed in the credentials kwargs
https://github.com/run-llama/llama_index/blob/7391f302e18542c68b9cf5025afb510af4a52324/llama-index-integrations/llms/llama-index-llms-openai/llama_index/llms/openai/base.py#L404

Which is used when creating the openai client OpenAI(...., timeout=60.0)
Sure, but when I step through the code, timeout at the POST call is NOT_GIVEN 🤷🏽‍♂️
So it does say that the timeout in the function is an override of the client-level timeout. So maybe the client call is setting the timeout somewhere else. I'll check the http calls and see if that's where it's showing up.
✨️ appreciate it! Taking a look
This is strange: in the actions on llama-index, that isn't my commit...
Not sure on that one, GitHub getting spooky