
Updated 5 months ago

Issue reading from local SimpleVectorStore

At a glance

A community member cannot get query results from a local SimpleVectorStore, although the same approach works with their Qdrant setup. The thread covers the code, which loads the vector index from storage, builds a vector retriever and query engine, and queries the engine. The code appears to be constructed correctly, since it can read the document hashes, but it does not seem to compare the query against the data in the vector store. One community member suggests a simpler way to load the index from storage.

I'm having an issue reading from a local SimpleVectorStore and can't figure it out. The question I'm asking is an exact replica of one in the ingested docs. It works fine in the Qdrant instance I have set up (using other code, obviously, but the same concepts).

(code in thread)
6 comments
Plain Text
self.vector_storage_context = StorageContext.from_defaults(vector_store=SimpleVectorStore(), persist_dir="storage/vector")

<skip>

    def _create_query_engine(self, product: str) -> RetrieverQueryEngine:
        try:
            vector_index = load_index_from_storage(storage_context=self.vector_storage_context, embed_model=Settings.embed_model)
            print(vector_index.docstore.get_all_document_hashes())
            
            vector_retriever = VectorIndexRetriever(
                index=vector_index, 
                similarity_top_k=10,
            )
            
            vector_query_engine = RetrieverQueryEngine(
                retriever=vector_retriever,
                response_synthesizer=self.response_synthesizer,
                node_postprocessors=[
                    SimilarityPostprocessor(similarity_cutoff=0.50), 
                    self.cohere_rerank
                ],
            )
            
            vector_query_engine.update_prompts({"response_synthesizer:text_qa_template": self.qa_prompt_tmpl})
            
            embedded_query = Settings.embed_model.get_text_embedding("Why is the sky blue?")
            print(embedded_query[:5])
            response = vector_query_engine.query(QueryBundle(query_str="Why is the sky blue?", embedding=embedded_query))
            print(response)
Results:

Plain Text
INFO:llama_index.core.indices.loading:Loading all indices.

{'5b1b644c50660b2eeb43bb629c2f462bcbdafd4cb4e27ca7501ddfc62bc97c53': 'd8a32ab1-7d70-4385-b2a7-01a5199d1cce', 'e0b24ce3b797398050dae22d27b2e2b24c93d370ef96c358352e38bd0151a97c': '6cccde43-92af-4702-889f-db29fcb45088', 'd2df971ab7a5100975e944a514f1f753d1d0b5156fed7135312444e199b93a93': '8bfa44c9-eeb6-4408-a7ef-c250c53dd88c', 'e9050cb40b3c654d3266997d42beb7fc8d1fd02255f5eeac0679d35df646d997': 'edf5382c-2328-470b-8ffd-b28c6f76c186'}

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"

[-0.026572242379188538, -0.02191539667546749, -0.02187306247651577, 0.015861498191952705, 0.04239140450954437]

Empty Response
Everything seems to be constructed correctly. The output also shows the doc hashes being printed, so it is able to read from storage, but it doesn't seem to be comparing the query with whatever is in the vector store.
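To make the symptom concrete, here is a toy sketch in plain Python. It is not llama_index code; `ToyVectorStore` and `cosine` are hypothetical names invented for illustration. It shows two ways a retriever can return nothing even though documents exist elsewhere: the store being queried holds no embeddings, or every match falls below a similarity cutoff. One unconfirmed hypothesis for this thread is that the docstore is read from disk (hence the hashes) while the vector store being queried is a fresh, empty one.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: node id -> embedding."""

    def __init__(self):
        self.embeddings = {}

    def add(self, node_id, embedding):
        self.embeddings[node_id] = embedding

    def query(self, query_embedding, top_k=10, cutoff=0.0):
        # Score every stored embedding, drop those below the cutoff,
        # then return the best top_k as (node_id, score) pairs.
        scored = [
            (node_id, cosine(query_embedding, emb))
            for node_id, emb in self.embeddings.items()
        ]
        kept = [(node_id, score) for node_id, score in scored if score >= cutoff]
        return sorted(kept, key=lambda pair: pair[1], reverse=True)[:top_k]


# A store that actually contains embeddings (stands in for the persisted one).
persisted = ToyVectorStore()
persisted.add("node-a", [1.0, 0.0])
persisted.add("node-b", [0.0, 1.0])

# A freshly constructed store with nothing in it.
fresh = ToyVectorStore()

query = [0.9, 0.1]
empty_hits = fresh.query(query)                 # nothing to compare against
real_hits = persisted.query(query, cutoff=0.5)  # only close matches survive
```

With no retrieved nodes, a response synthesizer has nothing to work with, which would be consistent with the `Empty Response` above. The same outcome is possible if a cutoff such as 0.50 filters out every candidate, so it may be worth checking the retriever output before the postprocessors run.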
Attachment
Screenshot_2024-10-24_at_8.31.06_AM.png
The loading code seems a little weird. A simpler way to load the index from storage:
Plain Text
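from llama_index.core import StorageContext, load_index_from_storage

# Build the storage context from the persist directory only
storage_context = StorageContext.from_defaults(persist_dir="./storage/vector")

index = load_index_from_storage(storage_context)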
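A quick way to check whether the persisted vector store contains any embeddings is to inspect the JSON file on disk. The sketch below is a minimal diagnostic under stated assumptions: that the store was persisted as JSON with a top-level `embedding_dict` key, and that the file sits in the persist directory under a name such as `default__vector_store.json`. Both details may differ between llama_index versions, so check the actual contents of `storage/vector` first. `count_persisted_embeddings` is a hypothetical helper, not part of the library.

```python
import json
from pathlib import Path


def count_persisted_embeddings(store_file):
    """Return how many embeddings a persisted vector store JSON file holds.

    Assumes the file has a top-level "embedding_dict" mapping node ids to
    vectors; returns 0 if that key is absent.
    """
    data = json.loads(Path(store_file).read_text())
    return len(data.get("embedding_dict", {}))
```

For example, `count_persisted_embeddings("storage/vector/default__vector_store.json")` (file name assumed) returning 0 would suggest the embeddings were never persisted. A non-zero count alongside an empty response would point toward the loading step or the postprocessors instead.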