Updated 2 years ago

Can someone explain the embeddings

Can someone explain the embeddings module for GPTIndex? How is it different from doing cosine similarity across a set of vectors and grabbing top k results and injecting that into the prompt?
7 comments
Hi @lucasneg, that's basically the gist of our embeddings support across our indices: https://gpt-index.readthedocs.io/en/latest/how_to/embeddings.html

However, 1) you don't have to worry about token limits (you can feed in more examples than fit within the max prompt size and you'll still get an answer), and 2) with GPT Index tools you can try out different indices and combine them to better synthesize an answer over your data
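
For reference, the baseline described in the question (cosine similarity over a set of vectors, take the top k, inject into the prompt) can be sketched roughly as below. This is a minimal standard-library illustration, not GPT Index code: the function names `cosine_similarity`, `top_k` and `build_prompt` and the prompt layout are made up for the example, and the vectors are assumed to come from whatever embedding model you already use.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as "not similar to anything"
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k):
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), i) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


def build_prompt(question, chunks):
    """Inject the retrieved chunks into a single prompt (hypothetical layout)."""
    context = "\n\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

With this approach alone, nothing stops `build_prompt` from producing a prompt longer than the model's token limit once k or the chunk size grows, which is the limitation point 1) refers to.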
how does the accuracy compare to the tree index? it looks interesting but way too expensive for production usage it seems
is the tokenizer being used? in theory, couldn't I hit a token limit if I inject too much context?
Re question 1): empirically, an embeddings-based approach is better than the tree index for retrieving the top-k documents over a large corpus (you can also specify your own embeddings per document instead of having us call OpenAI for you, e.g. here: https://twitter.com/gpt_index/status/1608975108496068609?s=20&t=V4a7LUq7Bi7O_PvxN_eR8g).

2) yeah, one of the main points of GPT Index is that we build a data structure over your data, so even if the context is larger than the token limit, we handle that under the hood for you by doing iterative LLM calls per text chunk (you can see how each index works here: https://gpt-index.readthedocs.io/en/latest/guides/index_guide.html)
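
To make "iterative LLM calls per text chunk" concrete, here is a rough sketch of one way such a loop could look. It is an assumption-laden illustration, not GPT Index's actual implementation: `answer_over_chunks` is a made-up name, the prompt wording is invented, and `llm` stands for any callable you supply that takes a prompt string and returns a completion string.

```python
def answer_over_chunks(question, chunks, llm):
    """Answer a question over chunks that may not all fit in one prompt.

    `llm` is a caller-supplied callable (hypothetical): prompt str -> answer str.
    Each call sees only one chunk plus the running answer, so no single prompt
    needs to hold the whole context.
    """
    answer = None
    for chunk in chunks:
        if answer is None:
            # First chunk: ask the question directly against this chunk.
            prompt = f"Context:\n{chunk}\n\nQuestion: {question}\nAnswer:"
        else:
            # Later chunks: ask the model to refine the running answer.
            prompt = (
                f"Question: {question}\n"
                f"Existing answer: {answer}\n"
                f"New context:\n{chunk}\n"
                "Refine the existing answer if the new context helps; "
                "otherwise repeat it."
            )
        answer = llm(prompt)
    return answer  # None if there were no chunks
```

The trade-off in a loop like this is one LLM call per chunk, so cost and latency grow with the number of chunks retrieved.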
by accuracy I mean in terms of finding the correct context. I find that embeddings sometimes find the most similar document, but not necessarily the most relevant one
but nice, will check out the iterative logic
πŸ‘ i haven't done extensive experiments, but would love your feedback as to where it's not working for future improvements

Re: iterative logic, check out the response synthesis section too: https://gpt-index.readthedocs.io/en/latest/guides/index_guide.html (you can specify different response modes for each query)