Batch

hey all. is there a way to use the LlamaIndex LLM class (or other LlamaIndex abstractions) to access the OpenAI/Anthropic message batching APIs?
Not really. All the abstractions are built around real-time interactions 🤔
I think I have a pretty good use case for it.
I would just use the raw openai client. Or if you want to make a PR, I can review it 🙂
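A minimal sketch of that raw-client route against OpenAI's Batch API (the requests.jsonl contents, custom_id values, and model name here are illustrative):
Python
import json
from openai import OpenAI

client = OpenAI()

# One request per node, keyed by a custom_id of our choosing (values illustrative)
requests = [
    {
        "custom_id": "node-1",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": "Generate context for this chunk..."}],
        },
    },
]
with open("requests.jsonl", "w") as f:
    f.write("\n".join(json.dumps(r) for r in requests))

# Upload the request file and submit the batch (~50% cheaper, completes within 24h)
batch_file = client.files.create(file=open("requests.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(batch.id, batch.status)  # poll client.batches.retrieve(batch.id) later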
maybe!! but one PR at a time, eh? Here's my proposed use case/interface:
Python
import time

# Part 1: Submit the batch during the pipeline. Remember, a DocumentContextExtractor
# can literally take hours to run - or days in the extreme case. Batch processing
# can cut costs by 50%!

extractor = DocumentContextExtractor(
    docstore=docstore,
    llm=llm,
    mode="submit_batch"
)

# Must be last transform in pipeline
index.update_nodes(transforms=[transform_A, transform_B, transform_C, extractor])
index.persist(...)


# Part 2: Check batch status and update nodes

# first load index
index = ...

extractor.set_mode("process_batch")
while not extractor.is_batch_complete():
    num_completed = index.update_nodes(transforms=[extractor])
    # update_nodes could return the number of nodes updated, for user feedback
    time.sleep(...)  # user controls the polling frequency

# After this, user can do whatever they want with their context-enabled nodes
I guess the advantage here is just cost savings? Otherwise async calls will achieve something similar, right?

Yea. Would need to be added to the LLM interface for LLMs that support it; the concept doesn't quite exist yet in the codebase
(the above code requires either raw API calls under the hood using the openai library, or an addition to the LLM interface to do batch processing)
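A minimal sketch of what that could look like under the hood with the openai library; process_batch and the nodes_by_id plumbing are hypothetical, but the batches/files calls are the real Batch API:
Python
import json
from openai import OpenAI

client = OpenAI()

def process_batch(batch_id, nodes_by_id):
    # Hypothetical helper: fetch finished results and attach them to nodes.
    batch = client.batches.retrieve(batch_id)
    if batch.status != "completed":
        return 0  # nothing ready yet; the caller sleeps and retries

    completed = 0
    output = client.files.content(batch.output_file_id).text
    for line in output.splitlines():
        result = json.loads(line)
        node = nodes_by_id[result["custom_id"]]  # custom_id doubles as the node id
        node.metadata["context"] = (
            result["response"]["body"]["choices"][0]["message"]["content"]
        )
        completed += 1
    return completed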
yep yep. cost, and also offline/asynchronous execution
the computer can be shut off, or the code can crash/get interrupted, and it's fine
my proposed solution above would be entirely stateless too. it can use the node id as the batch request id, and just check which nodes are already processed (they have a 'context' key), which are currently waiting for processing, and which are ready.
with this approach you don't need to keep a Python script running for days. you just submit, then check again in a day or two
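A minimal sketch of that stateless check, assuming nodes persist a 'context' metadata key once processed:
Python
def partition_nodes(nodes):
    # A node is done once 'context' appears in its metadata; everything else is
    # still waiting on the batch. No state survives between runs except the
    # persisted nodes themselves.
    done = [n for n in nodes if "context" in n.metadata]
    pending = [n for n in nodes if "context" not in n.metadata]
    return done, pending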