llama_parse/examples/multimodal/multimod...

Question

Hello there, I have some questions about preparing a knowledge base for a multimodal RAG application.

Referencing this guide - https://github.com/run-llama/llama_parse/blob/main/examples/multimodal/multimodal_rag_slide_deck.ipynb

It iterates through each page and creates a TextNode, and at the same time adds the page number and the image path as metadata.

In my case, I am using MarkdownElementNodeParser, which separates text and tables into IndexNode and BaseNode. Similarly, I would like to add the page number and image path to these nodes' metadata. But the sequence of the nodes is already jumbled up from line 2 onwards. So how can I still add the page number and image path to them? Thanks

[1] node_parser = MarkdownElementNodeParser(llm=llm)
[2] nodes = node_parser.get_nodes_from_documents([document])
[3] base_nodes, objects = node_parser.get_nodes_and_objects(nodes)

Logan M · Answer

your initial input documents should probably be per-page already, and have the metadata attached, so that the nodes inherit it

galvangjx · Answer

So before running line 2 of the code, document (the Document object from LlamaParse) should already have the page number and image path as metadata?

My document exists in Azure Blob Storage, and I am passing my file as bytes to the parsing service via .get_json_result().

For context, this is my setup and parsing steps:

import os
import tempfile

from llama_parse import LlamaParse

# define LlamaParse parameters
parser_params = {
    'api_key': LLAMA_CLOUD_API_KEY,
    'result_type': 'markdown',
}
parser = LlamaParse(**parser_params)
extra_info = {"file_name": blob_name}

# code to download blob as bytes

# write document stream to temp file
with tempfile.NamedTemporaryFile(suffix='.pdf', delete=False) as temp_file:
    document_stream.seek(0)
    temp_file.write(document_stream.read())
    temp_file_path = temp_file.name

# parse document
document = parser.get_json_result(temp_file_path, extra_info=extra_info)

# clean up temp file
os.remove(temp_file_path)

Logan M · Answer

get_json_result doesn't return a Document object, but rather a bunch of info that you can use to construct a Document

galvangjx · Answer

I should use load_data then?

Logan M · Answer

maybe? I think you lose out on some metadata though. I would use the json result and construct document objects from it, that way you can get any info you want

Logan M · Answer

This is a thorough example of what the json result is returning
https://github.com/run-llama/llama_parse/blob/main/examples/demo_json_tour.ipynb

From there, you could construct document objects Document(text=text, metadata={'key': 'val', ...})
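
A rough sketch of that idea, in case it helps: build one record per page from the json result, with the page number and image path already in the metadata, so any nodes parsed from it should inherit both. The json layout used here (a list of results, each with a "pages" list whose entries carry "page" and "md"/"text") is an assumption based on the json tour notebook, and build_page_records and image_path_by_page are hypothetical names, not part of LlamaParse.

```python
from typing import Any, Dict, List, Optional


def build_page_records(
    json_results: List[Dict[str, Any]],
    image_path_by_page: Dict[int, str],
    extra_info: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    """Turn a parsed json result into one text+metadata record per page.

    Assumed layout (not verified here): each result has a "pages" list, and
    each page has a "page" number plus "md" and/or "text" content.
    """
    records = []
    for result in json_results:
        for page in result.get("pages", []):
            page_number = page["page"]
            # Start from the shared file-level metadata, then add page-level keys.
            metadata = dict(extra_info or {})
            metadata["page_number"] = page_number
            # Hypothetical lookup: page number -> path of that page's image.
            metadata["image_path"] = image_path_by_page.get(page_number)
            records.append(
                {
                    # Prefer markdown, fall back to plain text if it is empty.
                    "text": page.get("md") or page.get("text", ""),
                    "metadata": metadata,
                }
            )
    return records


# Tiny made-up result, only to show the shape of the output.
sample_result = [
    {"pages": [{"page": 1, "md": "# Title"}, {"page": 2, "md": "| a | b |"}]}
]
records = build_page_records(
    sample_result,
    {1: "data_images/page_1.jpg", 2: "data_images/page_2.jpg"},
    extra_info={"file_name": "deck.pdf"},
)
```

Each record can then be wrapped as Document(text=record["text"], metadata=record["metadata"]) and the per-page documents passed to node_parser.get_nodes_from_documents(...) in place of the single document.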
