I am having trouble using MultiQueryRetriever with a custom PromptTemplate.
My goal is to take a list of allegations against a police officer and have the LLM, through MultiQueryRetriever, generate one query per allegation-description combination, so that I can fetch the most relevant broken rule for each allegation. My vector store is Chroma, and it contains a police department rule book for officers. I use a custom prompt that instructs the LLM to generate one Chroma query per allegation. To build each query, the LLM must look at the allegation, extract potential violations relevant to it from the description, and then form a query that can fetch the relevant rules from Chroma. (See the prompt below for more detail.)
This is the LineListOutputParser that I've defined:
from typing import List
from langchain_core.output_parsers import BaseOutputParser

class LineListOutputParser(BaseOutputParser[List[str]]):
    """Output parser for a list of lines."""
    def parse(self, text: str) -> List[str]:
        lines = text.strip().split("\n")
        return list(filter(None, lines))  # Remove empty lines
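To make the parser's behavior concrete, here is a minimal sketch of the same splitting logic using only the standard library. `parse_lines` is a hypothetical stand-in for `LineListOutputParser.parse`, not part of LangChain. One detail worth noting: `filter(None, ...)` drops empty strings but keeps whitespace-only lines, so the second hypothetical helper, `parse_lines_stripped`, trims each line before filtering.

```python
from typing import List


def parse_lines(text: str) -> List[str]:
    """Same logic as LineListOutputParser.parse above (stand-in, not LangChain API)."""
    lines = text.strip().split("\n")
    return list(filter(None, lines))  # drops "" but keeps whitespace-only lines


def parse_lines_stripped(text: str) -> List[str]:
    """Variant that trims each line and also drops whitespace-only lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Made-up LLM output with an empty line and a whitespace-only line in the middle.
llm_output = "query about unpaid fare\n\n   \nquery about off-duty alcohol use\n"
print(parse_lines(llm_output))           # three items, one of them is "   "
print(parse_lines_stripped(llm_output))  # two items, both real queries
```

If the model pads its answer with blank or indented lines, the stripped variant avoids sending whitespace-only queries to the vector store.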
This is the custom prompt I've designed:
from langchain_core.prompts import PromptTemplate

chroma_prompt = PromptTemplate(
    input_variables=["allegations", "description", "num_allegations"],
    template=(
        """You are an AI language model assistant. Your task is to analyze the following civilian complaint
description against a police officer, and the allegations that are raised against the officer. Identify
potential acts of misconduct or crimes committed by the officer, and generate {num_allegations} different queries to
retrieve relevant sections from the Police Rulebook (one query per allegation-description combination), stored in a vector database.
By generating multiple perspectives on the analysis, your goal is to help the user overcome some of the limitations of the
distance-based similarity search. Provide these alternative analyses as distinct queries, separated by newlines.
Allegations made against officer: {allegations}
Incident description: {description}
"""
    )
)
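Because the prompt declares three input variables, whatever invokes it has to supply exactly those three keys. The sketch below is a quick, standard-library-only way to list the placeholders in a template and check a candidate input against them. `template_variables` and `missing_variables` are hypothetical helpers, not LangChain functions, and the template is a shortened stand-in for the prompt above. It assumes the prompt uses the plain `{name}` placeholder syntax.

```python
from string import Formatter
from typing import Dict, Set


def template_variables(template: str) -> Set[str]:
    """Return the placeholder names used in a str.format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def missing_variables(template: str, inputs: Dict[str, str]) -> Set[str]:
    """Return the placeholders that the given input dictionary does not cover."""
    return template_variables(template) - set(inputs)


# Shortened stand-in for the prompt text above, with the same three placeholders.
template = (
    "Generate {num_allegations} different queries.\n"
    "Allegations made against officer: {allegations}\n"
    "Incident description: {description}\n"
)

full_input = {"allegations": "Conduct Unbecoming", "description": "...", "num_allegations": "1"}
print(sorted(template_variables(template)))
print(missing_variables(template, full_input))                    # empty set: all covered
print(sorted(missing_variables(template, {"question": "..."})))   # all three are missing
```

Running a check like this on the input that actually reaches the prompt, rather than on the input passed to the outermost call, shows quickly whether a wrapper has replaced the dictionary along the way.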
This is the code section that fetches from Chroma:
# Imports assumed from the packages listed under System Info; CHROMA_PATH is defined elsewhere.
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

def fetch_from_chroma(allegations, description, ia_num, llm, k=2):
    """
    Fetches relevant documents from Chroma through a MultiQueryRetriever (similarity search).
    Parameters:
    - allegations (list[str]): Allegations raised against the officer.
    - description (str): Incident description from the complaint.
    - ia_num (int): Internal Affairs number for logging/debugging.
    - llm: Language model used to generate the queries.
    - k (int): Number of results to fetch per query, set to 2 by default.
    Returns:
    - context_text (str): Combined context text from retrieved documents.
    - sources (list): List of source metadata.
    """
    embedding_function = OpenAIEmbeddings()
    db = Chroma(persist_directory=CHROMA_PATH, embedding_function=embedding_function)
    line_output_parser = LineListOutputParser()
    llm_chain = chroma_prompt | llm | line_output_parser
    retriever = MultiQueryRetriever(
        retriever=db.as_retriever(search_type="similarity", search_kwargs={"k": k}),
        llm_chain=llm_chain,
        parser_key="lines",
    )
    # Invoke the retriever with the input dictionary
    results = retriever.invoke({
        "allegations": ", ".join(allegations),
        "description": description,
        "num_allegations": str(len(allegations)),
    })
    if len(results) == 0:
        print(f"{ia_num} - Unable to find matching results.")
        return "No Context Available", "No Sources Available"
    context_text = "\n\n---\n\n".join([doc.page_content for doc in results])
    sources = [doc.metadata.get("source", None) for doc in results]
    print(f"{ia_num} - Found matching results.")
    return context_text, sources
However, I am getting this error and have no idea why:
KeyError: "Input to PromptTemplate is missing variables {'description', 'allegations', 'num_allegations'}. Expected: ['allegations', 'description', 'num_allegations'] Received: ['question']\nNote: if you intended {description} to be part of the string and not a variable, please escape it with double curly braces like: '{{description}}'."
For some reason, the error says that I only passed in a variable named 'question', but when I call retriever.invoke(), I am clearly passing in the required variables.
Here is an example of the input being passed in:
{'allegations': 'Conformance to Laws, Conduct Unbecoming, Respectful Treatment, Alcohol off Duty', 'description': 'Officer firstname Lastname fled from a taxicab without paying the fare. Officer Lastname was located by Officers from Area A‐7. Officer Lastname A‐7 where he was uncooperative with Sgt. Lastname and refused to talk to him. Sgt. Lastname escorted Officer Lastname back to the o his department equipment was received by Sgt. Lastname including a Glock 40 Serial # number, Radio # number, Handcuffs #number, 3 magazine Police Badge# number, 1 container of OC Spray and 1 bullet resistant vest. Department equipment to be turned over to Sgt. last name', 'num_allegations': '4'}
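The `Received: ['question']` part of the error suggests that the dictionary passed to `retriever.invoke()` never reaches the prompt as-is. MultiQueryRetriever appears to treat its input as a single query and to call `llm_chain` with one key, `question`, so a prompt used with it would need `{question}` as its only variable. Assuming that behavior, one possible workaround is to fold the three fields into a single string and give the prompt a single `{question}` placeholder. In the minimal sketch below, `build_question` and `QUESTION_TEMPLATE` are hypothetical names, and only the standard library is used. The LangChain wiring is shown in comments and has not been run.

```python
from typing import List

# Hypothetical single-variable template: everything the LLM needs arrives in {question}.
QUESTION_TEMPLATE = (
    "You are an AI language model assistant. Analyze the complaint below and "
    "generate one query per allegation to retrieve relevant sections of the "
    "Police Rulebook from a vector database. Separate the queries with newlines.\n"
    "{question}\n"
)


def build_question(allegations: List[str], description: str) -> str:
    """Pack the allegation count, the allegations, and the description into one string."""
    return (
        f"Number of allegations: {len(allegations)}\n"
        f"Allegations made against officer: {', '.join(allegations)}\n"
        f"Incident description: {description}"
    )


question = build_question(
    ["Conformance to Laws", "Alcohol off Duty"],
    "Officer fled from a taxicab without paying the fare.",
)
print(QUESTION_TEMPLATE.format(question=question))

# Assumed LangChain wiring (not run here):
#   prompt = PromptTemplate(input_variables=["question"], template=QUESTION_TEMPLATE)
#   llm_chain = prompt | llm | LineListOutputParser()
#   retriever = MultiQueryRetriever(retriever=db.as_retriever(), llm_chain=llm_chain)
#   results = retriever.invoke(question)  # a plain string, not a dict
```

Another option, if one query per allegation is a hard requirement, is to run the prompt chain directly with the original three variables and call the underlying Chroma retriever once per generated query, without going through MultiQueryRetriever.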
System Info:
langchain==0.3.7
langchain-community==0.2.3
langchain-core==0.3.19
langchain-google-genai==2.0.1
langchain-openai==0.2.8
langchain-text-splitters==0.3.2
Using a Mac
Using Python 3.11.9