
Updated 5 months ago

Docstore

At a glance

The community member is seeking clarification on what the docstore is intended to store - chunks or full documents. The documentation states that the docstore stores chunks, but the community member has found that the ingestion pipeline stores the full document text before chunking when using document management. The comments indicate that the docstore can store both chunks and full documents, as a document and a chunk are essentially the same class under the hood. There is no explicitly marked answer, but the community members provide this clarification.

Hello, can I have some clarification on what the docstore is intended to store? Chunks or full documents?

The documentation here states chunks: https://docs.llamaindex.ai/en/stable/module_guides/storing/docstores/

However, I have found that the ingestion pipeline stores the full document text before chunking when using document management.

https://docs.llamaindex.ai/en/stable/module_guides/loading/ingestion_pipeline/#document-management

Thank you!
7 comments
A document and a chunk are essentially the exact same class under the hood
The docstore can store both
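To illustrate the point above, here is a minimal, standard-library-only sketch of why one store can hold both. It is a toy model, not LlamaIndex's implementation: `ToyNode`, `ToyDocument`, `ToyChunk`, and `ToyDocstore` are hypothetical names invented for this example. The only assumption taken from the thread is that a document and a chunk share essentially the same underlying class, so a store keyed by ID has no reason to tell them apart.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional
import uuid


@dataclass
class ToyNode:
    """Hypothetical stand-in for a shared node base class."""
    text: str
    node_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    ref_doc_id: Optional[str] = None  # id of the source document, if any


class ToyDocument(ToyNode):
    """A full document: same fields as a chunk, only the label differs."""


class ToyChunk(ToyNode):
    """A chunk produced by splitting a document."""


class ToyDocstore:
    """Key-value store keyed by node_id; it accepts any ToyNode subclass."""

    def __init__(self) -> None:
        self._nodes: Dict[str, ToyNode] = {}

    def add(self, *nodes: ToyNode) -> None:
        for node in nodes:
            self._nodes[node.node_id] = node

    def get(self, node_id: str) -> ToyNode:
        return self._nodes[node_id]


# Store the full document and one of its chunks side by side.
doc = ToyDocument(text="Full text of the report. It has two sentences.", node_id="doc-1")
chunk = ToyChunk(text="Full text of the report.", node_id="chunk-1", ref_doc_id=doc.node_id)

store = ToyDocstore()
store.add(doc, chunk)
```

In this model, storing the full document before chunking (as the ingestion pipeline's document management appears to do) and storing chunks afterwards are the same operation on the same store.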
What are your thoughts on contextual retrieval from Anthropic?
It's just metadata/chunk enrichment, nothing new really. We and other libraries have had similar modules for ages. The cooler part about it is leveraging prompt caching to make it scalable.
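As a rough sketch of what "chunk enrichment" means here: each chunk gets a short, document-aware context string prepended before it is embedded or indexed. The function names `enrich_chunks` and `fake_describe` are hypothetical, the sample text is made up, and `fake_describe` is a placeholder for what would normally be an LLM call. Because every such call would reuse the same full-document text, that repeated prefix is where prompt caching could help, which is presumably the scalability point made above.

```python
from typing import Callable, List


def enrich_chunks(
    document_text: str,
    chunks: List[str],
    describe: Callable[[str, str], str],
) -> List[str]:
    """Prepend a short, document-aware context string to every chunk."""
    enriched = []
    for chunk in chunks:
        # In a real setup `describe` would ask an LLM to situate the chunk
        # within the whole document; here it is any callable you pass in.
        context = describe(document_text, chunk)
        enriched.append(f"{context}\n\n{chunk}")
    return enriched


def fake_describe(document_text: str, chunk: str) -> str:
    # Placeholder for an LLM call (assumption): reuse the first sentence
    # of the document as the "context" for every chunk.
    title = document_text.split(".")[0]
    return f"From a document that begins: {title}."


# Made-up sample data for illustration only.
document = "ACME Q2 report. Revenue grew 3% over the previous quarter."
chunks = ["Revenue grew 3% over the previous quarter."]
enriched = enrich_chunks(document, chunks, fake_describe)
```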
Awesome, thanks for the clarification, Logan.