BM25
BM25 (Wikipedia), also known as Okapi BM25, is a ranking function used in information retrieval systems to estimate the relevance of documents to a given search query.
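To make the idea of a ranking function concrete, here is a minimal sketch of a textbook BM25 scorer in plain Python. It is not the code used by the rank_bm25 package, and the function name `bm25_scores`, the whitespace tokenization, the `k1` and `b` defaults, and the smoothed IDF variant are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score every document in `corpus` against `query` (illustrative BM25 variant)."""
    if not corpus:
        return []
    # Assumption: naive whitespace tokenization, no stemming or lowercasing.
    tokenized = [doc.split() for doc in corpus]
    n_docs = len(tokenized)
    avg_len = sum(len(doc) for doc in tokenized) / n_docs

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for doc in tokenized:
        doc_freq.update(set(doc))

    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        score = 0.0
        for term in query.split():
            if term not in term_freq:
                continue
            # Smoothed, non-negative IDF (one common variant of the formula).
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term frequency saturates with k1 and is normalized by document length via b.
            tf = term_freq[term]
            length_norm = 1 - b + b * len(doc) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


corpus = ["foo", "bar", "world", "hello", "foo bar"]
for text, score in zip(corpus, bm25_scores("foo", corpus)):
    print(f"{text!r}: {score:.3f}")
```

With this variant, the shorter document "foo" scores higher than "foo bar" for the query "foo", because length normalization penalizes the longer document. Documents that do not contain the query term score zero.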
The BM25Retriever retriever uses the rank_bm25 package.
%pip install --upgrade --quiet rank_bm25
from langchain_community.retrievers import BM25Retriever
Create a New Retriever with Texts
retriever = BM25Retriever.from_texts(["foo", "bar", "world", "hello", "foo bar"])
Create a New Retriever with Documents
You can also create a retriever from a list of Document objects.
from langchain_core.documents import Document
retriever = BM25Retriever.from_documents(
    [
        Document(page_content="foo"),
        Document(page_content="bar"),
        Document(page_content="world"),
        Document(page_content="hello"),
        Document(page_content="foo bar"),
    ]
)
Use Retriever
We can now use the retriever!
result = retriever.invoke("foo")
result
[Document(page_content='foo', metadata={}),
 Document(page_content='foo bar', metadata={}),
 Document(page_content='hello', metadata={}),
 Document(page_content='world', metadata={})]
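The result above contains 'hello' and 'world' even though neither mentions "foo". One plausible explanation, sketched below, is that the retriever returns a fixed number of top-scoring documents (four in this output) and fills the remaining slots with zero-score documents. The helper `top_n` and the scores are hypothetical and illustrative only; they are not taken from the library.

```python
def top_n(scores, docs, n=4):
    """Return the n documents with the highest scores (illustrative helper)."""
    # Sort indices by ascending score, then reverse to get descending order.
    ascending = sorted(range(len(scores)), key=lambda i: scores[i])
    return [docs[i] for i in ascending[::-1][:n]]


docs = ["foo", "bar", "world", "hello", "foo bar"]
# Made-up scores for the query "foo": only two documents contain the term.
scores = [0.9, 0.0, 0.0, 0.0, 0.6]

print(top_n(scores, docs))       # four slots: two matches plus two zero-score fillers
print(top_n(scores, docs, n=2))  # two slots: only the matching documents
```

If only matching documents are wanted, requesting fewer results or filtering out zero-score documents afterwards are two possible approaches.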