Find answers from the community

Mike
Does anyone have any findings on switching from GPT-3.5 to GPT-4o mini? I'm finding that structured output is quite a bit worse with GPT-4o mini than with GPT-3.5, often failing to return values in the correct format. The speed is also quite a bit slower... but I feel we have to switch since the pricing is so much better...
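One way to put a number on "quite a bit worse" is to score each model on how often its output parses into the expected structure. A minimal sketch with stubbed responses standing in for real API calls (the helper name, required keys, and failure rates are illustrative, not measured):

```python
import json

def format_ok(raw: str, required_keys=("summary", "tags")) -> bool:
    """True when the raw model output is valid JSON with the expected keys."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return all(k in data for k in required_keys)

# Stubbed outputs standing in for real calls to each model.
gpt35_outputs = ['{"summary": "s", "tags": ["a"]}'] * 9 + ["not json"]
mini_outputs = ['{"summary": "s", "tags": ["a"]}'] * 7 + ["oops"] * 3

rate_35 = sum(map(format_ok, gpt35_outputs)) / len(gpt35_outputs)
rate_mini = sum(map(format_ok, mini_outputs)) / len(mini_outputs)
```

Running the same prompt set through both models and comparing these rates gives you a concrete basis for the pricing-vs-reliability trade-off.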
5 comments
We use PydanticProgramExtractor to get a list of tags as well as a summary, and we see a strange error where content is repeated endlessly, which then causes validation to fail.

This is our code:

Plain Text
# imports assume the llama_index >= 0.10 package layout
from llama_index.core.extractors import PydanticProgramExtractor
from llama_index.program.openai import OpenAIPydanticProgram

EXTRACT_TEMPLATE_STR = """\
Here is the content of a section:
----------------
{context_str}
----------------
Given the contextual information, extract out a {class_name} object.\
"""

openai_program_summary = OpenAIPydanticProgram.from_defaults(
    llm=get_llm(model=MODEL_BASIC),
    output_cls=NodeSummaryMetadata,
    prompt_template_str="You must answer in the same language as the context given. {input}",
    extract_template_str=EXTRACT_TEMPLATE_STR,
)

openai_program_keywords = OpenAIPydanticProgram.from_defaults(
    llm=get_llm(model=MODEL_BASIC),
    output_cls=NodeKeywordsMetadata,
    prompt_template_str="You must answer in the same language as the context given. {input}",
    extract_template_str=EXTRACT_TEMPLATE_STR,
)

summary_extractor = PydanticProgramExtractor(
    program=openai_program_summary, input_key="input", num_workers=12
)
keywords_extractor = PydanticProgramExtractor(
    program=openai_program_keywords, input_key="input", num_workers=12
)
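One way to fail fast on the endless-repetition output before Pydantic validation even runs is a cheap repetition check on the raw text (a hypothetical helper, not part of LlamaIndex); capping `max_tokens` on the LLM also bounds the damage when the model gets stuck in a loop:

```python
def looks_repetitive(text: str, chunk: int = 40, threshold: int = 3) -> bool:
    """Return True when the last `chunk` characters repeat `threshold`+ times,
    which is the signature of a model stuck in a generation loop."""
    if len(text) < chunk * threshold:
        return False
    tail = text[-chunk:]
    return text.count(tail) >= threshold

looping = "The tags are: news, politics. " * 20  # degenerate repeated output
normal = (
    "This section discusses the history of the project, its goals, "
    "the main contributors, and the roadmap for the next two releases."
)
```

Rejecting a raw response that trips this check (and retrying) is cheaper than letting a huge repeated payload reach the validator and fail there.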
8 comments
Anyone have any information about the speed differences between Flask and FastAPI? We were thinking about switching from Flask to FastAPI, but it seems to be quite a bit slower: a basic request that uses the chat engine is almost twice as slow...
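FastAPI itself is rarely the bottleneck; a common cause of this kind of slowdown is running a blocking call (like a synchronous chat-engine request) inside an `async def` endpoint, which stalls the event loop so requests serialize. Declaring the endpoint with plain `def` (FastAPI then runs it in a threadpool) or offloading with `asyncio.to_thread` restores concurrency. A standalone sketch of the effect, with `time.sleep` standing in for the chat-engine call and no web framework involved:

```python
import asyncio
import time

def blocking_chat_call():
    # stand-in for a synchronous chat-engine request
    time.sleep(0.05)

async def bad_endpoint():
    # blocking work inside `async def` stalls the whole event loop
    blocking_chat_call()

async def good_endpoint():
    # offload to a thread, as FastAPI does for plain `def` endpoints
    await asyncio.to_thread(blocking_chat_call)

async def timed(endpoint, n=4):
    start = time.perf_counter()
    await asyncio.gather(*(endpoint() for _ in range(n)))
    return time.perf_counter() - start

async def main():
    serial = await timed(bad_endpoint)       # requests serialize: ~n * 0.05s
    overlapped = await timed(good_endpoint)  # requests overlap: ~0.05s
    return serial, overlapped

serial, overlapped = asyncio.run(main())
```

Whether this explains your measurements depends on how the chat engine is invoked in your handlers, but it is the first thing worth ruling out.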
21 comments
Is there a way to retry matching a certain JSON format? Right now I just try to parse it and re-prompt if it fails, but I feel it would be better to have the model correct its mistake instead of retrying entirely. Is this possible?
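Yes: instead of regenerating from scratch, you can send the bad output plus the parse error back and ask the model to repair it (several libraries ship this as a "retry" or "fixing" output parser). A sketch with a stubbed model call, so the shape of the repair prompt is the only real content; the function names are made up for illustration:

```python
import json

def repair_json(llm, raw: str, error: str) -> str:
    """Ask the model to fix its own output instead of regenerating from scratch."""
    prompt = (
        "Your previous output was not valid JSON.\n"
        f"Output:\n{raw}\n"
        f"Error: {error}\n"
        "Return only the corrected JSON."
    )
    return llm(prompt)

def parse_with_repair(llm, raw: str, max_repairs: int = 2):
    for _ in range(max_repairs + 1):
        try:
            return json.loads(raw)
        except json.JSONDecodeError as err:
            raw = repair_json(llm, raw, str(err))
    raise ValueError("could not repair model output")

# Stub model: returns the corrected JSON when shown the trailing-comma error.
def stub_llm(prompt: str) -> str:
    return '{"tags": ["a", "b"]}'

result = parse_with_repair(stub_llm, '{"tags": ["a", "b",]}')
```

The repair call is usually cheaper and more reliable than a full retry because the model only has to fix a localized mistake.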
6 comments
We're seeing this issue when running our project and doing a fresh install of the dependencies. Anyone else experiencing this or know what could cause this?

"Resource wordnet not found. Please use the NLTK Downloader to obtain the resource:"
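That error means NLTK's wordnet corpus isn't on the machine: a fresh install pulls the `nltk` package but not its data files. The usual fix is a one-time `python -m nltk.downloader wordnet`, or a guard at startup like the sketch below (wrapped so it degrades quietly if nltk isn't importable; the function names are illustrative):

```python
try:
    import nltk
except ImportError:
    nltk = None

def wordnet_available() -> bool:
    """True when the wordnet corpus is already on NLTK's search path."""
    if nltk is None:
        return False
    try:
        nltk.data.find("corpora/wordnet")
        return True
    except LookupError:
        return False

def ensure_wordnet() -> None:
    """Download wordnet once, on first run."""
    if nltk is not None and not wordnet_available():
        nltk.download("wordnet", quiet=True)
```

Calling `ensure_wordnet()` during app startup keeps fresh environments from hitting the LookupError at request time.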
14 comments
How can I use chat history in combination with an index? In this example the AI does not know anything about the chat history and just seems to try and query the index.

Plain Text
# imports assume the llama_index >= 0.10 package layout
from pprint import pprint

from flask import Flask
from llama_index.core import PromptTemplate, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.chat_engine import CondenseQuestionChatEngine
from llama_index.core.chat_engine.types import AgentChatResponse
from llama_index.core.llms import ChatMessage, MessageRole

app = Flask(__name__)


@app.route("/history")
def history():
    # Load data
    documents = SimpleDirectoryReader("./src/data/paul_graham").load_data()

    # create index
    index: VectorStoreIndex = VectorStoreIndex.from_documents(documents)

    custom_prompt = PromptTemplate(
        """\
    Given a conversation (between Human and Assistant) and a follow up message from Human, \
    rewrite the message to be a standalone question that captures all relevant context \
    from the conversation.

    <Chat History>
    {chat_history}

    <Follow Up Message>
    {question}

    <Standalone question>
    """
    )

    # list of `ChatMessage` objects
    custom_chat_history = [
        ChatMessage(
            role=MessageRole.USER,
            content="Remember that John Doe is wearing a blue shirt.",
        ),
        ChatMessage(role=MessageRole.ASSISTANT,
                    content=(
                        "Certainly, I'll remember that John Doe is wearing a blue shirt."
                    )
                    ),
    ]

    query_engine = index.as_query_engine()

    chat_engine = CondenseQuestionChatEngine.from_defaults(
        query_engine=query_engine,
        condense_question_prompt=custom_prompt,
        chat_history=custom_chat_history,
        verbose=True,
    )

    chat_response: AgentChatResponse = chat_engine.chat(
        "What color shirt is John Doe wearing?",
        # tool_choice="query_engine_tool"
    )

    pprint("answer:")
    pprint(chat_response.response)

    return "Done."

Response:

Plain Text
I'm sorry, but I cannot answer that question based on the given context information.
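That refusal is the expected behavior of CondenseQuestionChatEngine: it uses the chat history only to rewrite the follow-up into a standalone question, and the answer is then produced solely from the index. The Paul Graham documents say nothing about John Doe, so retrieval finds nothing. A stub pipeline mirroring that data flow (all names here are stand-ins, not LlamaIndex internals):

```python
# History is used only to rewrite the question; the ANSWER comes from the index.
chat_history = [
    ("user", "Remember that John Doe is wearing a blue shirt."),
    ("assistant", "Certainly, I'll remember that John Doe is wearing a blue shirt."),
]
index_docs = ["What I Worked On -- Paul Graham essay text ..."]

def condense(history, question):
    # stand-in for the LLM rewrite step; the follow-up is already standalone
    return question

def query_engine(question):
    # retrieval over the essays finds nothing about John Doe
    if not any("John Doe" in doc for doc in index_docs):
        return "I cannot answer that question based on the given context information."
    return "..."

standalone = condense(chat_history, "What color shirt is John Doe wearing?")
answer = query_engine(standalone)
```

A chat engine that also feeds the history into the response prompt, e.g. `index.as_chat_engine(chat_mode="condense_plus_context")`, can answer from facts stated in the conversation as well as from the index.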
3 comments
Mike · Json

Is there an easy way to turn a llama_index.response.schema.PydanticResponse into json?
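The wrapped object sits on the response's `.response` attribute and is a Pydantic model, so serialization is just the model's own JSON method: `.model_dump_json()` on Pydantic v2 or `.json()` on v1. A dependency-free stand-in to show the shape (the dataclass plays the role of your `output_cls`; with real Pydantic you'd call the methods above instead of `asdict`):

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class NodeSummaryMetadata:  # stand-in for the real Pydantic output_cls
    summary: str
    tags: list

obj = NodeSummaryMetadata(summary="a short summary", tags=["news", "tech"])
as_json = json.dumps(asdict(obj))
# With Pydantic v2: pydantic_response.response.model_dump_json()
```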
2 comments
When using Pydantic classes with OpenAI I sometimes get a validation error. Is there anything I can do about this? Maybe retry?

Plain Text
Expecting value: line 1 column 1 (char 0) (type=value_error.jsondecode; msg=Expecting value; doc=Empty Response; pos=0; lineno=1; colno=1)
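The `doc=Empty Response` part means the LLM call came back with nothing, and the JSON parser then fails at character 0, so a bounded retry around the program call usually clears it. A sketch with a stubbed call (the wrapper is a made-up helper for illustration, not a LlamaIndex API):

```python
import json

def call_with_retries(llm_call, max_retries=3):
    """Retry a structured-output call when the response is empty/unparseable."""
    last_err = None
    for _ in range(max_retries):
        raw = llm_call()
        try:
            return json.loads(raw)
        except json.JSONDecodeError as err:
            last_err = err
    raise RuntimeError(f"no valid JSON after {max_retries} attempts") from last_err

# Fake LLM that fails once with an empty response, then succeeds.
attempts = []
def fake_llm():
    attempts.append(1)
    return "" if len(attempts) == 1 else '{"tags": ["a"]}'

result = call_with_retries(fake_llm)
```

Since empty responses tend to be transient, one or two retries typically resolve the validation error without any prompt changes.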
12 comments