Manan Patel
Joined November 18, 2024
I'm facing an issue while trying to read a large CSV file (20M+ rows) using SimpleDirectoryReader. It seems to struggle with a file of that size.

Is it possible to read this file using CSVReader? Or are there any other recommended approaches within LlamaIndex for efficiently handling large CSV files?
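
For context, here is a rough sketch of the kind of chunked approach I've been considering: streaming the CSV with pandas and building Document objects per chunk instead of handing the whole file to SimpleDirectoryReader at once. The file name and chunk size below are just placeholders:

```python
import pandas as pd
from llama_index.core import Document

documents = []
# Stream the CSV in chunks instead of loading all 20M+ rows at once.
# "data.csv" and the 50,000-row chunk size are placeholders.
for chunk in pd.read_csv("data.csv", chunksize=50_000):
    documents.append(
        Document(
            # One Document per chunk keeps memory bounded and avoids
            # producing a single enormous document for the whole file.
            text=chunk.to_csv(index=False),
            metadata={"source": "data.csv"},
        )
    )
```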
2 comments
Hi everyone,
I'm working with CSV files and exploring the best way to generate and save embeddings for them. I noticed that PagedCSVReader creates one document (and therefore one embedding) per row, which can be time-consuming for large files.

Could you recommend a more efficient approach to generating embeddings while maintaining accuracy for Retrieval-Augmented Generation (RAG)? I'm looking for something that balances embedding granularity and performance, especially for structured tabular data.
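
To make the question concrete, here is a rough sketch of the kind of batching I have in mind: grouping a fixed number of rows into one TextNode so each embedding covers a block of rows rather than a single row. The file name, batch size, and metadata keys are placeholders, and it assumes an embedding model is already configured:

```python
import pandas as pd
from llama_index.core import VectorStoreIndex
from llama_index.core.schema import TextNode

ROWS_PER_NODE = 100  # placeholder; coarser than PagedCSVReader's one row per node

df = pd.read_csv("table.csv")  # placeholder file name
nodes = []
for start in range(0, len(df), ROWS_PER_NODE):
    batch = df.iloc[start:start + ROWS_PER_NODE]
    nodes.append(
        TextNode(
            # to_csv keeps the header with every batch so each node is self-describing.
            text=batch.to_csv(index=False),
            metadata={"row_start": start, "row_end": start + len(batch) - 1},
        )
    )

# One embedding per node (per batch of rows) instead of one per row.
index = VectorStoreIndex(nodes)
```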

Thanks in advance for your insights!
10 comments