
Updated 3 months ago

save_to_string speed

Hey. I'm also using AWS Lambda, but apparently save_to_string takes a very long time to run, even on small text files. Any idea how things can be sped up?
9 comments
Hmm, it's taking too long on lambda, or before lambda?
@Logan M On Lambda, but I also tried locally on my laptop. It can take more than 10 seconds for a 30 KB text file that was indexed with chunk_size = 256
Yeah, if you have a vector index, it's generating a 1536-dimensional vector for each chunk (assuming the default OpenAI embeddings), so saving might take some time when you have a lot of chunks. Not sure how that can be sped up 🤔

If saving is going to be a common operation, it might be worthwhile to set up a 3rd-party vector store (Pinecone, etc.). Then the vectors are never saved in the index
When you save the index to disk, I would guess it's much larger than 30kb
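To put a rough number on that guess, here is a minimal sketch that fakes the embeddings with random floats and measures how large they get once dumped to a JSON string. The chunk count (30) is an assumption: roughly 4 characters per token gives about 7,500 tokens for a 30 KB file, so around 30 chunks at chunk_size = 256. The 1536 dimension assumes OpenAI's text-embedding-ada-002. `estimate_serialized_size` is a hypothetical helper, not part of any library, and this is not how the index itself serializes.

```python
import json
import random

EMBED_DIM = 1536   # assumption: OpenAI text-embedding-ada-002 vector size
NUM_CHUNKS = 30    # assumption: ~30 KB of text split into 256-token chunks


def estimate_serialized_size(num_chunks, dim=EMBED_DIM, seed=0):
    """Return the length of a JSON string holding num_chunks fake embeddings."""
    rng = random.Random(seed)
    # Random floats stand in for real embeddings; only their count and
    # printed length matter for a size estimate.
    vectors = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
               for _ in range(num_chunks)]
    return len(json.dumps(vectors))


if __name__ == "__main__":
    size = estimate_serialized_size(NUM_CHUNKS)
    print(f"{NUM_CHUNKS} chunks x {EMBED_DIM} floats -> {size / 1024:.0f} KB of JSON")
```

Under these assumptions the serialized vectors come out many times larger than the 30 KB source file, which fits the guess above. It does not by itself explain a 10 second save.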
I guess I'm getting a huge JSON when getting the embeddings from OpenAI. If using Pinecone, don't I need to somehow convert this JSON to a string before saving it in the DB? Where would the vectors be saved?
I'm pretty sure when using a store like pinecone, it can send the raw numbers, rather than converting to string (type conversion is probably what's slowing things down, if I had to guess). And this is all handled under the hood

Maybe @jerryjliu0 can confirm this haha
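For what "raw numbers" versus "converting to string" can mean in practice, here is a small sketch using only the standard library. It serializes one 1536-float vector as JSON text and as packed 32-bit floats. The helper names are hypothetical, and this only illustrates the general idea; it is not a description of what Pinecone or the index does under the hood.

```python
import json
import struct


def to_json_string(vector):
    """Text encoding: every float becomes a decimal string."""
    return json.dumps(vector)


def to_raw_bytes(vector):
    """Binary encoding: 4 bytes per value as little-endian float32."""
    return struct.pack(f"<{len(vector)}f", *vector)


def from_raw_bytes(blob):
    """Decode a float32 blob back into a list of Python floats."""
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob))


if __name__ == "__main__":
    vector = [i / 1536 for i in range(1536)]  # stand-in for one embedding
    print("JSON characters:", len(to_json_string(vector)))
    print("Raw bytes:      ", len(to_raw_bytes(vector)))
```

The packed form is a fixed 4 bytes per value, at the cost of float32 precision, while the JSON form spends a variable number of characters on each number and has to be parsed back from text.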
@yoelk With Pinecone, you don't need to do save_to_disk or save_to_string. When you add documents to the Pinecone index, they'll automatically be stored in the Pinecone backend
Thanks @jerryjliu0 and @Logan M !