If the bot is giving bad responses, that could mean the source nodes it is picking may not contain the right answers, or may have similar context for different questions.
Not sure if human-in-the-middle would help in this case.
Human-in-the-middle can help you with things like changing the tone of the answers or adding new prompt instructions.
For this you also don't need a human in the middle.
You'd need to review the bot's answers based on user feedback and then maybe give an updated doc to the bot based on your observations and corrections.
I want to store the liked responses so that they will be used in the answers by the query engine.
Can you please help me with how I can do that?
You can insert the liked responses into the index, but this would only create duplication of your document, which might result in other relevant nodes being left out of retrieval when looking for the right answers.
This would decrease the quality of bot responses drawn from the actual document.
Can I add to the chat history that the user liked or disliked the response, so that the LLM remembers and gives more responses like the liked ones and fewer like the disliked ones?
@WhiteFang_Jr
Not really? I mean, you could add some text and modify that message in the chat history
It's python, you can do whatever your heart desires 🙂
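For example, a minimal sketch, assuming you keep the chat history as llama-index `ChatMessage` objects and track thumbs up/down in your own UI (the `liked` flag and function name are hypothetical):
```python
from llama_index.core.llms import ChatMessage, MessageRole

def annotate_feedback(chat_history: list[ChatMessage], i: int, liked: bool) -> None:
    """Append a feedback note to an assistant message in the chat history."""
    msg = chat_history[i]
    if msg.role == MessageRole.ASSISTANT:
        tag = "[User liked this response]" if liked else "[User disliked this response]"
        msg.content = f"{msg.content}\n\n{tag}"
```
Keep in mind the LLM only sees this as extra text in the prompt, it doesn't "learn" from it.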
I can add text to the chat history saying the user liked or disliked the response.
But what I am asking is: will the LLM learn from it and give responses more similar to liked ones and less similar to disliked ones? Based on the liked and disliked responses in the chat history, can we make the LLM learn which answers are good and which are not?
So that the LLM doesn't repeat disliked answers to other users.
You can only influence the LLM through
- prompting (which is what modifying the messages might do)
- finetuning (e.g. DPO, etc.)
The latter is outside the scope of llama-index
Yes, I want to fine-tune the chat engine responses based on user feedback. Can you provide some example of how we can do it?
Look up some huggingface guides I suppose. I haven't finetuned a chat model in ages. A quick google search will take you far
Can llama-index score the liked and disliked responses and use them in a feedback loop?
That is not possible using llama-index, right?
Not possible in llama-index directly, yeah. You could implement that pretty easily using a raw LLM call to score a response, and then integrate that into some finetuning dataset though
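e.g. a rough sketch of the raw LLM call, assuming the OpenAI integration (the model name and prompt wording are just examples):
```python
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini")  # example model, use whatever you have

def score_response(question: str, answer: str) -> float:
    prompt = (
        "Rate how well the answer addresses the question, from 0 (bad) to 10 (great).\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Reply with only the number."
    )
    # complete() returns a CompletionResponse; .text is the raw model output
    return float(llm.complete(prompt).text.strip())
```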
How can I score similar answers based on liked and disliked responses? If many users give a thumbs up for a question, how can I get a score for that question's answer?
Please let me know how I can do this.
Prompt an LLM to score it? Or just assign a binary 0 or 1 based on thumbs up or down
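For the many-users case, the simplest thing is to treat each thumbs up as 1 and each thumbs down as 0, then average per answer. A sketch (the `votes` store is hypothetical, adapt it to however you log feedback):
```python
# Maps an answer id to the binary feedback it received:
# 1 = thumbs up, 0 = thumbs down (hypothetical logging format)
votes: dict[str, list[int]] = {
    "answer_123": [1, 1, 0, 1],
}

def answer_score(answer_id: str) -> float:
    """Fraction of users who gave this answer a thumbs up."""
    v = votes[answer_id]
    return sum(v) / len(v)

print(answer_score("answer_123"))  # 0.75
```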
And then how do we integrate that with a fine-tuning dataset?
that depends on what the dataset format is supposed to look like? I really recommend reading a few finetuning guides
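For what it's worth, many chat finetuning guides use a chat-style JSONL file, one conversation per line. A sketch of dumping highly-scored Q/A pairs into that shape (the field names vary between trainers, so verify against whatever guide you end up following):
```python
import json

# (question, answer) pairs that scored well in your feedback loop
liked_pairs = [
    ("What is the return policy?", "You can return items within 30 days..."),
]

with open("finetune_dataset.jsonl", "w") as f:
    for question, answer in liked_pairs:
        record = {
            "messages": [
                {"role": "user", "content": question},
                {"role": "assistant", "content": answer},
            ]
        }
        f.write(json.dumps(record) + "\n")
```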
Sure, before moving outside llama-index, I was trying to explore whether I can add the most liked answers to the vector db and prompt that these answers are good and liked. We can influence it that way as well, can't we?
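Something like this is what I was thinking, roughly (assuming an existing `VectorStoreIndex` called `index`; the metadata keys are just made up for illustration):
```python
from llama_index.core import Document

def add_liked_answer(index, question: str, answer: str) -> None:
    """Store a user-approved Q/A pair in the existing index."""
    doc = Document(
        text=f"Q: {question}\nA: {answer}",
        metadata={"source": "user_feedback", "liked": True},
    )
    index.insert(doc)
```
And then adjust the system prompt to tell the LLM that nodes tagged as `user_feedback` are known-good answers.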