Batch

hey all. is there a way to use the LlamaIndex LLM class (or other LlamaIndex abstractions) to access the OpenAI/Anthropic message batching APIs?
Not really. All the abstractions are built around real-time interactions 🤔
I think I have a pretty good use case for it.
I would just use the raw openai client. Or if you want to make a PR, I can review it 🙂
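A minimal sketch of that raw-client route against OpenAI's Batch API (the requests.jsonl contents, custom_id values, and model name here are illustrative):
Python
import json
from openai import OpenAI

client = OpenAI()

# One request per node, keyed by a custom_id of our choosing (values illustrative)
requests = [
    {
        "custom_id": "node-1",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": "Generate context for this chunk..."}],
        },
    },
]
with open("requests.jsonl", "w") as f:
    f.write("\n".join(json.dumps(r) for r in requests))

# Upload the request file and submit the batch (~50% cheaper, completes within 24h)
batch_file = client.files.create(file=open("requests.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(batch.id, batch.status)  # poll client.batches.retrieve(batch.id) later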
maybe!! but one PR at a time, eh? Here's my proposed use case/interface:
Python
import time

# Part 1: Submit the batch during the pipeline. Remember, a DocumentContextExtractor
# can literally take hours to run - or days in the extreme case. Batch processing
# can cut costs by 50%!

extractor = DocumentContextExtractor(
    docstore=docstore,
    llm=llm,
    mode="submit_batch"
)

# Must be last transform in pipeline
index.update_nodes(transforms=[transform_A, transform_B, transform_C, extractor])
index.persist(...)


# Part 2: Check batch status and update nodes

# first load index
index = ...

extractor.set_mode("process_batch")
while not extractor.is_batch_complete():
    num_completed = index.update_nodes(transforms=[extractor])
    # update_nodes could return the number of nodes updated, for user feedback
    time.sleep(...)  # user controls the polling frequency

# After this, user can do whatever they want with their context-enabled nodes
I guess the advantage here is just cost savings? Otherwise async calls will achieve something similar, right?

Yea. Would need to be added to the LLM interface for LLMs that support it; the concept doesn't quite exist yet in the codebase
(the above code requires either raw API calls under the hood using the openai library, or an addition to the LLM interface to do batch processing)
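A minimal sketch of what that could look like under the hood with the openai library; process_batch and the nodes_by_id plumbing are hypothetical, but the batches/files calls are the real Batch API:
Python
import json
from openai import OpenAI

client = OpenAI()

def process_batch(batch_id, nodes_by_id):
    # Hypothetical helper: fetch finished results and attach them to nodes.
    batch = client.batches.retrieve(batch_id)
    if batch.status != "completed":
        return 0  # nothing ready yet; the caller sleeps and retries

    completed = 0
    output = client.files.content(batch.output_file_id).text
    for line in output.splitlines():
        result = json.loads(line)
        node = nodes_by_id[result["custom_id"]]  # custom_id doubles as the node id
        node.metadata["context"] = (
            result["response"]["body"]["choices"][0]["message"]["content"]
        )
        completed += 1
    return completed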
yep yep. cost, and also offline/asynchronous execution
the computer can be shut off, or the code can crash/get interrupted, and it's fine
my proposed solution above would be entirely stateless too. it can use the node id as the batch request id, and just check which nodes are already processed (they have a 'context' key), which are currently waiting for processing, and which are ready.
with this approach you don't need to keep a Python script running for days. you just submit, then check again in a day or two
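A minimal sketch of that stateless check, assuming nodes persist a 'context' metadata key once processed:
Python
def partition_nodes(nodes):
    # A node is done once 'context' appears in its metadata; everything else is
    # still waiting on the batch. No state survives between runs except the
    # persisted nodes themselves.
    done = [n for n in nodes if "context" in n.metadata]
    pending = [n for n in nodes if "context" not in n.metadata]
    return done, pending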