
Updated 2 weeks ago

Storing Metadata in Neo4j Using VectorStoreIndex

Hello everyone,
I'm encountering some problems with storing metadata in Neo4j using the VectorStoreIndex. I'm creating nodes with important metadata. Here is the relevant part of my code:
def get_metadata(filename):
    for item in metadata:
        if item["url"] == filename:
            return item
    return {}


documents = SimpleDirectoryReader(dir_documents, file_metadata=get_metadata).load_data()
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)
nodes = node_parser.get_nodes_from_documents(documents)
transformations = [title_extractor, qa_extractor]
neo4j_vector_store = Neo4jVectorStore(neo_username, neo_password, neo_url, embed_dim, hybrid_search=True, index_name=index_name)
storage_context = StorageContext.from_defaults(vector_store=neo4j_vector_store)
vector_index = get_vector_index(neo_username, neo_password, neo_url, embed_dim, index_name)
vector_index.insert_nodes(nodes, transformations=transformations, storage_context=storage_context)

I then built a query engine that is used by an agent. When the agent retrieves these nodes, they do not have the metadata I imported earlier. Checking the nodes created in Neo4j, I noticed that not only do they lack this metadata, but the metadata has also been vectorized as part of the content. As a result, I can't access that metadata when retrieving chunks.
Here is the code for the query engine:
node_postprocessors = [MetadataReplacementPostProcessor(target_metadata_key="window"), SimilarityPostprocessor(similarity_cutoff=0.5)]
index_query_engine = index.as_query_engine(similarity_top_k=doc_similarity_top_k, node_postprocessors=node_postprocessors)

Is this a bug? Is the problem with SentenceWindowNodeParser, or with how I include the metadata? What can I do to ensure that the metadata is stored correctly and can be retrieved as expected?
Any help or guidance would be greatly appreciated.
Thank you!
2 comments
I don't think you can pass transformations to insert_nodes? I think that has to go in the constructor?

VectorStoreIndex(..., transformations=transformations)
Thank you, that is true. I resolved the issue by adding the metadata after creating all the nodes; the failing point was that the metadata returned by get_metadata was being sent through SentenceWindowNodeParser.
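For anyone landing here later, the fix described above (attaching the metadata after the nodes exist, rather than through `file_metadata`) could look roughly like the sketch below. It is not the original poster's code. The helper name `attach_metadata` is hypothetical, and it assumes each node's metadata carries a `file_path` entry that matches the `url` field of the records. The stand-in nodes are plain `SimpleNamespace` objects so the snippet runs without LlamaIndex or Neo4j; real nodes expose the same `metadata` dict and `excluded_embed_metadata_keys` list.

```python
from types import SimpleNamespace


def attach_metadata(nodes, records, key="url", source_key="file_path"):
    """Copy the matching record into each node's metadata after parsing.

    Hypothetical helper: `key` and `source_key` are assumptions about how
    records and nodes identify the same file.
    """
    by_key = {record[key]: record for record in records}
    for node in nodes:
        record = by_key.get(node.metadata.get(source_key))
        if record is None:
            continue  # no record for this file, leave the node untouched
        node.metadata.update(record)
        # Mark the added keys so they are left out of the text that gets embedded.
        for name in record:
            if name not in node.excluded_embed_metadata_keys:
                node.excluded_embed_metadata_keys.append(name)
    return nodes


# Stand-in data for illustration only.
records = [{"url": "docs/a.txt", "title": "Guide A"}]
nodes = [
    SimpleNamespace(
        metadata={"file_path": "docs/a.txt", "window": "w"},
        excluded_embed_metadata_keys=["window"],
    ),
    SimpleNamespace(
        metadata={"file_path": "docs/b.txt"},
        excluded_embed_metadata_keys=[],
    ),
]
attach_metadata(nodes, records)
```

With real nodes, this would run on the output of `node_parser.get_nodes_from_documents(documents)` before the nodes are inserted into the index, so the record fields are stored as node metadata and not folded into the embedded text.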