Response streaming

I mean it starts printing out/streaming the response nodes, but they're nonsensical: it just keeps repeating one source node, and it doesn't behave the same way it does when not streaming.
Not sure I know what you mean haha

So here's my understanding. You can enable streaming on the query engine and then query:

response = query_engine.query("query")

From there, you can either do

response.print_response_stream() to print to stdout

Or you can iterate over the generator yourself to handle the tokens

for word in response.response_gen:
    print(word, end="")  # or handle each token however you like
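Both consumption patterns boil down to draining an iterator of text chunks. A minimal sketch, using a plain Python generator as a stand-in for `response.response_gen` (the names here are illustrative, not LlamaIndex's API):

```python
# Stand-in for response.response_gen: any iterator of text chunks
# behaves the same way from the consumer's point of view.
def fake_response_gen():
    for token in ["Streaming ", "responses ", "arrive ", "token by token."]:
        yield token

# Consume the generator yourself (option 2 above).
chunks = []
for word in fake_response_gen():
    chunks.append(word)  # e.g. append to a UI element instead of printing
full_text = "".join(chunks)
print(full_text)
```

Printing each chunk with `print(word, end="")` as it arrives gives the same visual effect as `print_response_stream()`, while collecting the chunks lets you keep the full text afterwards.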


At the same time, independent of those things, you can do response.source_nodes to get a list of source nodes and similarity scores. These are not streamed; they are static and are set before the response even starts streaming.

So you could iterate over the generator, and after the generator is done, do something with the source nodes right?
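That order of operations, drain the token stream and then render the static source nodes, can be sketched with stand-in objects. The attribute names `response_gen`, `source_nodes`, and `score` mirror the description above; the classes themselves are mocks, not LlamaIndex types:

```python
from dataclasses import dataclass

@dataclass
class FakeSourceNode:
    text: str
    score: float

@dataclass
class FakeStreamingResponse:
    # source_nodes is populated up front; only the answer tokens stream.
    source_nodes: list
    tokens: list

    @property
    def response_gen(self):
        yield from self.tokens

response = FakeStreamingResponse(
    source_nodes=[FakeSourceNode("First retrieved chunk", 0.91),
                  FakeSourceNode("Second retrieved chunk", 0.84)],
    tokens=["The ", "answer ", "streams."],
)

# 1) Stream the answer token by token.
streamed = "".join(response.response_gen)

# 2) After the generator is exhausted, render the (static) sources.
sources = [f"{n.score:.2f}: {n.text}" for n in response.source_nodes]
print(streamed)
print(sources)
```

Because the sources never change while the tokens stream, there is nothing to synchronize: consume the generator first, then read `source_nodes` once.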
Yeah, I'm streaming the LLM response stream, but I was trying to get the response source nodes printed at the same time (I managed to get them streaming), and it doesn't really work properly.

I also tried streaming just the LLM response and printing the source nodes simultaneously to a different element (without streaming), but that didn't seem to work too well either.
But I guess if they're calculated before the response even starts streaming, there must be a way to make them visible even before the LLM response starts streaming?
That part seemed to break it for me, because the data types are different, I guess?
I found a patchy workaround, but let's say it wasn't too easy πŸ™πŸΌ
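Since the source nodes exist before the first token arrives, they can indeed be rendered first, with the stream appended afterwards. A sketch of that ordering with the same kind of stand-ins (none of these names are real LlamaIndex API):

```python
def render_chat_turn(source_nodes, token_gen):
    """Show sources first (they are static), then stream the answer.

    source_nodes: list of (text, score) pairs, known before streaming starts.
    token_gen: iterator of answer tokens.
    Returns the lines that would be written to the UI, in order.
    """
    lines = [f"[source {i}] score={score:.2f} {text}"
             for i, (text, score) in enumerate(source_nodes, start=1)]
    answer = ""
    for token in token_gen:
        answer += token  # in a real UI, update the answer element here
    lines.append(answer)
    return lines

lines = render_chat_turn(
    [("retrieved passage A", 0.88), ("retrieved passage B", 0.75)],
    iter(["Sources ", "can ", "show ", "first."]),
)
print("\n".join(lines))
```

Keeping the sources and the streamed answer in separate UI elements avoids the type mismatch mentioned above: the sources are a fixed list rendered once, and only the answer element is updated per token.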