
Updated 10 months ago

Query Fusion Retriever with metadata filtering fails

At a glance
I have a Redis vector index store, a doc store, and an Upstash vector store defined. When I define my retriever so that it filters the nodes by metadata, it seems to fail.

Steps to Reproduce


defining a base retriever with metadata filter enabled

Plain Text
base_retriever = self.base_index.as_retriever(
    similarity_top_k=self.similarity_top_k,
    filters=MetadataFilters(
        filters=[
            ExactMatchFilter(key="namespace", value=self.namespace)
        ]
    ),
)
retriever = AutoMergingRetriever(
    base_retriever, self.storage_context, verbose=verbose
)

bm25_retriever = BM25Retriever.from_defaults(
    docstore=self.docstore, similarity_top_k=self.similarity_top_k
)

fusion_retriever = QueryFusionRetriever(
    [retriever, bm25_retriever],
    similarity_top_k=self.similarity_top_k,
    num_queries=1,  # set this to 1 to disable query generation
    mode="reciprocal_rerank",
    use_async=True,
    verbose=verbose,
)


https://github.com/run-llama/llama_index/issues/11391#issue-2153660643
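
For readers unfamiliar with `mode="reciprocal_rerank"` in the snippet above: reciprocal rank fusion merges several ranked lists by giving each item a score of 1 / (k + rank) per list and summing the scores. The sketch below is a plain-Python illustration, not LlamaIndex's implementation. The function name `reciprocal_rank_fusion`, the sample IDs, and the constant `k = 60` (a commonly used default) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: float = 60.0) -> List[str]:
    """Fuse ranked lists of IDs; k=60 is an assumed, commonly used constant."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1 here; each list contributes 1 / (k + rank).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda d: scores[d], reverse=True)


# Hypothetical results from a vector retriever and a BM25 retriever.
vector_hits = ["a", "b", "c"]
bm25_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Items that rank well in both lists rise to the top, which is why a result returned by only one of the fused retrievers can still appear in the final output.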
10 comments
Is it my Upstash vector store that is causing the issue?
Is namespace actually a field you added to the metadata of each document?
Yeah, I added it:

Plain Text
def create_document(self, text, filename):
    documents = []
    for idx, page in text.items():
        document = Document(text=page)
        current_date = datetime.now().strftime("%Y-%m-%d")
        document.metadata = {
            "filename": filename,
            "page_number": idx,
            "creation_date": current_date,
            "last_accessed_date": current_date,
            "last_modified_date": current_date,
            "namespace": self.namespace,
        }
        documents.append(document)
    return documents
The current stack uses the Upstash vector store. Should namespace as a metadata field also be present when defining the vector store, or is adding it to the doc enough? Upstash as a vector store currently does not seem to support metadata fields.
Looking at the Upstash code, it should support storing metadata, but not filtering.
It seemed to throw an error when I tried adding the metadata like this:

Plain Text
vector_store = UpstashVectorStore(
                    url=self.config.get("UPSTASH_VECTOR_URL"),
                    token=self.config.get("UPSTASH_VECTOR_TOKEN"),
                    metadata_fields=['namespace']
                )
metadata_fields=['namespace'] is not an arg in the source code 👀
When it adds data, it seems to just throw in all the metadata
Attachment
image.png
yeah, had seen the method
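
Given the observation above that the Upstash integration stores metadata but does not appear to filter on it, one possible workaround is to retrieve more results than needed and filter them client-side by metadata. The sketch below only illustrates the idea with plain Python. `Node` and `filter_by_metadata` are hypothetical stand-ins, not LlamaIndex classes or functions, and the sample namespaces are made up.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a retrieved node (not the LlamaIndex class)."""
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def filter_by_metadata(nodes: List[Node], key: str, value: Any) -> List[Node]:
    """Keep only nodes whose metadata[key] equals value (exact match)."""
    return [n for n in nodes if n.metadata.get(key) == value]


# Hypothetical retrieval results spanning two namespaces.
retrieved = [
    Node("page one", {"namespace": "team-a", "page_number": 1}),
    Node("page two", {"namespace": "team-b", "page_number": 2}),
    Node("no metadata"),
]
kept = filter_by_metadata(retrieved, "namespace", "team-a")
print([n.text for n in kept])
```

Because the filter runs after retrieval, the retriever's top-k would need to be large enough that matching nodes are not crowded out by nodes from other namespaces.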