If the bot is giving bad responses, that could mean the source nodes it is picking may not contain the right answers, or may have similar context for different questions.
Not sure if human-in-the-middle would help in this case.
Human-in-the-middle can help you with things like changing the tone of the answers or adding new prompt instructions.
For this you also don't need a human in the middle.
You'd need to review the bot's answers based on user feedback and then maybe give an updated doc to the bot based on your observations and corrections.
I want to store the liked responses so that they will be used in the answers by the query engine.
Can you please help me with how I can do that?
You can insert the liked responses into the index, but this would only create duplication of your document, which might result in other relevant nodes being left out of retrieval when looking for the right answers.
This would decrease the quality of bot responses drawn from the actual document.
Can I add to the chat history that the user liked or disliked the response, so that the LLM remembers and gives more responses like the liked ones and fewer like the disliked ones?
@WhiteFang_Jr
Not really? I mean, you could add some text and modify that message in the chat history
It's python, you can do whatever your heart desires 🙂
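For example, a minimal sketch, assuming you keep the chat history as llama-index `ChatMessage` objects and track thumbs up/down in your own UI (the `liked` flag and function name are hypothetical):
```python
from llama_index.core.llms import ChatMessage, MessageRole

def annotate_feedback(chat_history: list[ChatMessage], i: int, liked: bool) -> None:
    """Append a feedback note to an assistant message in the chat history."""
    msg = chat_history[i]
    if msg.role == MessageRole.ASSISTANT:
        tag = "[User liked this response]" if liked else "[User disliked this response]"
        msg.content = f"{msg.content}\n\n{tag}"
```
Keep in mind the LLM only sees this as extra text in the prompt, it doesn't "learn" from it.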
I can add text to the chat history saying the user liked or disliked the response.
But what I am asking is: will the LLM learn from it and give responses more similar to liked ones and less similar to disliked ones? Based on the liked and disliked responses in the chat history, can we make the LLM learn which answers are good and which are not?
So that the LLM doesn't repeat disliked answers to other users.
You can only influence the LLM through
- prompting (which is what modifying the messages might do)
- finetuning (e.g. DPO, etc.)
The latter is outside the scope of llama-index
Yes, I want to fine-tune the chat engine responses based on user feedback. Can you provide some example of how we can do it?
Look up some huggingface guides I suppose. I haven't finetuned a chat model in ages. A quick google search will take you far
Can llama-index score the liked and disliked responses and use them in a feedback loop?
That is not possible using llama-index, right?
Not possible in llama-index directly, yeah. You could implement that pretty easily using a raw LLM call to score a response, and then integrate that into some finetuning dataset though
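e.g. a rough sketch of the raw LLM call, assuming the OpenAI integration (the model name and prompt wording are just examples):
```python
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini")  # example model, use whatever you have

def score_response(question: str, answer: str) -> float:
    prompt = (
        "Rate how well the answer addresses the question, from 0 (bad) to 10 (great).\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Reply with only the number."
    )
    # complete() returns a CompletionResponse; .text is the raw model output
    return float(llm.complete(prompt).text.strip())
```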
How can I score similar answers based on liked and disliked responses? If many users give a thumbs up for a question, how can I get a score for that question's answer?
Please let me know how I can do this.
Prompt an LLM to score it? Or just assign a binary 0 or 1 based on thumbs up or down
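For the many-users case, the simplest thing is to treat each thumbs up as 1 and each thumbs down as 0, then average per answer. A sketch (the `votes` store is hypothetical, adapt it to however you log feedback):
```python
# Maps an answer id to the binary feedback it received:
# 1 = thumbs up, 0 = thumbs down (hypothetical logging format)
votes: dict[str, list[int]] = {
    "answer_123": [1, 1, 0, 1],
}

def answer_score(answer_id: str) -> float:
    """Fraction of users who gave this answer a thumbs up."""
    v = votes[answer_id]
    return sum(v) / len(v)

print(answer_score("answer_123"))  # 0.75
```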
And then how do we integrate that with a fine-tuning dataset?
that depends on what the dataset format is supposed to look like? I really recommend reading a few finetuning guides
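For what it's worth, many chat finetuning guides use a chat-style JSONL file, one conversation per line. A sketch of dumping highly-scored Q/A pairs into that shape (the field names vary between trainers, so verify against whatever guide you end up following):
```python
import json

# (question, answer) pairs that scored well in your feedback loop
liked_pairs = [
    ("What is the return policy?", "You can return items within 30 days..."),
]

with open("finetune_dataset.jsonl", "w") as f:
    for question, answer in liked_pairs:
        record = {
            "messages": [
                {"role": "user", "content": question},
                {"role": "assistant", "content": answer},
            ]
        }
        f.write(json.dumps(record) + "\n")
```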
Sure, before moving outside llama-index, I was trying to explore whether I can add the most liked answers to the vector db and prompt that these answers are good and liked. We can influence it that way as well, can't we?
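Something like this is what I was thinking, roughly (assuming an existing `VectorStoreIndex` called `index`; the metadata keys are just made up for illustration):
```python
from llama_index.core import Document

def add_liked_answer(index, question: str, answer: str) -> None:
    """Store a user-approved Q/A pair in the existing index."""
    doc = Document(
        text=f"Q: {question}\nA: {answer}",
        metadata={"source": "user_feedback", "liked": True},
    )
    index.insert(doc)
```
And then adjust the system prompt to tell the LLM that nodes tagged as `user_feedback` are known-good answers.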