MLflow Deployments for LLMs

The MLflow Deployments for LLMs is a powerful tool designed to streamline the usage and management of various large language model (LLM) providers, such as OpenAI and Anthropic, within an organization. It offers a high-level interface that simplifies the interaction with these services by providing a unified endpoint to handle specific LLM related requests.

Installation and Setup

Install mlflow with MLflow Deployments dependencies:

pip install 'mlflow[genai]'

Set the OpenAI API key as an environment variable:

export OPENAI_API_KEY=...

Create a configuration file:

endpoints:
  - name: completions
    endpoint_type: llm/v1/completions
    model:
      provider: openai
      name: text-davinci-003
      config:
        openai_api_key: $OPENAI_API_KEY

  - name: embeddings
    endpoint_type: llm/v1/embeddings
    model:
      provider: openai
      name: text-embedding-ada-002
      config:
        openai_api_key: $OPENAI_API_KEY

Start the deployments server:

mlflow deployments start-server --config-path /path/to/config.yaml

Example provided by `MLflow`

The mlflow.langchain module provides an API for logging and loading LangChain models. This module exports multivariate LangChain models in the langchain flavor and univariate LangChain models in the pyfunc flavor.

See the API documentation and examples for more information.

Completions Example

import mlflow
from langchain.chains import LLMChain, PromptTemplate
from langchain_community.llms import Mlflow

llm = Mlflow(
    target_uri="http://127.0.0.1:5000",
    endpoint="completions",
)

llm_chain = LLMChain(
    llm=Mlflow,
    prompt=PromptTemplate(
        input_variables=["adjective"],
        template="Tell me a {adjective} joke",
    ),
)
result = llm_chain.run(adjective="funny")
print(result)

with mlflow.start_run():
    model_info = mlflow.langchain.log_model(chain, "model")

model = mlflow.pyfunc.load_model(model_info.model_uri)
print(model.predict([{"adjective": "funny"}]))

API Reference:

Embeddings Example

from langchain_community.embeddings import MlflowEmbeddings

embeddings = MlflowEmbeddings(
    target_uri="http://127.0.0.1:5000",
    endpoint="embeddings",
)

print(embeddings.embed_query("hello"))
print(embeddings.embed_documents(["hello"]))

API Reference:

MlflowEmbeddings

Chat Example

from langchain_community.chat_models import ChatMlflow
from langchain_core.messages import HumanMessage, SystemMessage

chat = ChatMlflow(
    target_uri="http://127.0.0.1:5000",
    endpoint="chat",
)

messages = [
    SystemMessage(
        content="You are a helpful assistant that translates English to French."
    ),
    HumanMessage(
        content="Translate this sentence from English to French: I love programming."
    ),
]
print(chat(messages))

MLflow Deployments for LLMs

Installation and Setup

Example provided by `MLflow`

Completions Example

API Reference:

Embeddings Example

API Reference:

Chat Example

API Reference:

Was this page helpful?

You can leave detailed feedback on GitHub.

MLflow Deployments for LLMs

Installation and Setup​

Example provided by MLflow​

Completions Example​

API Reference:

Embeddings Example​

API Reference:

Chat Example​

API Reference:

Was this page helpful?

You can leave detailed feedback on GitHub.

Installation and Setup

Example provided by `MLflow`

Completions Example

Embeddings Example

Chat Example