I have been trying to productionize my code using LlamaIndex, but I am struggling with filtering by doc IDs and by other metadata.

My documents carry id and section as metadata. I first retrieve a set of nodes based on a prompt, and then I want to ask further questions about only those retrieved IDs.

From the first retrieval,

# Collect one flat record per retrieved node
result_dicts = []
for node in new_nodes:
    result_dict = {
        "doc_id": node.id_,
        "ID": node.metadata.get("id"),
        "Score": node.score,
        "Text": node.node.get_text(),
    }
    result_dicts.append(result_dict)

Now I try the filtering:

import pandas as pd
from llama_index.core.vector_stores import MetadataFilter, MetadataFilters, FilterOperator

# IDs of the top 10 results from the first retrieval
desired_ids = list(pd.DataFrame(result_dicts).head(10)['ID'])
# desired_ids = list(pd.DataFrame(result_dicts).head(10)['doc_id'])

# One equality filter per desired id
filters1 = [
    MetadataFilter(
        key='id',
        value=idd,
        operator=FilterOperator.EQ,
    )
    for idd in desired_ids
]

# A single equality filter on the section
filters2 = [
    MetadataFilter(
        key="section",
        value='experience',
        operator=FilterOperator.EQ,
    )
]

However, I could not combine the two. I also tried passing doc_ids through the retriever:

from llama_index.core.retrievers import VectorIndexRetriever

retriever_f = VectorIndexRetriever(
    doc_ids=desired_ids,
    index=index,
    similarity_top_k=5,
    filters=filters)

However, that did not work, and it appears to be a bug, as stated here: bug

What is the best way in LlamaIndex to apply multiple metadata filters, each accepting a list of values?
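To make the intended semantics concrete, here is a minimal sketch of the filter I am after: a value may match any entry in its list (OR within a key), and every key must match (AND across keys). It is written as a plain-Python post-filter over already retrieved results, so it does not depend on any vector store. The names `matches` and `post_filter` are hypothetical helpers, not LlamaIndex API, and plain dicts stand in for the retrieved nodes. As far as I can tell, `llama_index.core.vector_stores` also exposes `FilterOperator.IN` and `FilterCondition` for expressing this natively, but whether they are honoured seems to depend on the vector store.

```python
from typing import Any, Collection, Mapping


def matches(metadata: Mapping[str, Any],
            allowed: Mapping[str, Collection[Any]]) -> bool:
    """Return True when, for every key in `allowed`, the metadata value
    is one of the permitted values (OR within a key, AND across keys)."""
    return all(metadata.get(key) in values for key, values in allowed.items())


def post_filter(results: list[dict[str, Any]],
                allowed: Mapping[str, Collection[Any]]) -> list[dict[str, Any]]:
    """Keep only the results whose 'metadata' dict passes `matches`.

    Each result is assumed (hypothetically) to look like
    {"metadata": {...}, ...}; adapt the accessor for real node objects.
    """
    return [r for r in results if matches(r["metadata"], allowed)]


# Stand-in data: three retrieved records with id and section metadata
rows = [
    {"metadata": {"id": "a1", "section": "experience"}, "text": "first"},
    {"metadata": {"id": "a2", "section": "education"}, "text": "second"},
    {"metadata": {"id": "zz", "section": "experience"}, "text": "third"},
]

# Keep records whose id is in the list AND whose section is 'experience'
kept = post_filter(rows, {"id": ["a1", "a2"], "section": ["experience"]})
```

Because this filters after retrieval, the first retrieval would need a larger similarity_top_k so that enough candidates survive; filtering inside the vector store would avoid that cost.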
