Tags: hands-on, RAG, chain, cl, 08, session, id, history

[RAG Project in Practice 08] Adding Conversation History to RAG

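NLP GitHub projects:

NLP project practice: fasterai/nlp-project-practice
Description: this repository covers the design, training, optimization, deployment, and application of NLP task models, and shares the day-to-day work and hands-on experience of a large-model algorithm engineer.

AI e-book library: https://gitee.com/fasterai/ai-e-book
Description: this repository shares several hundred e-books on AI.

AI algorithm interview notes: fasterai/nlp-interview-handbook#面经
Description: this repository collects NLP algorithm interview experiences from major internet companies, a useful resource for job seekers.

NLP interview questions (剑指Offer): https://gitee.com/fasterai/nlp-interview-handbook
Description: this repository gathers the questions most frequently asked in NLP algorithm engineer interviews.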

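[!NOTE] Adding multi-turn conversation to RAG

Use create_history_aware_retriever to build the rewrite chain

Use create_stuff_documents_chain to build the QA chain

Use create_retrieval_chain to build the RAG chain

Check the result

Add multi-turn conversation and store the conversation history automatically

Wrap the chain in RunnableWithMessageHistory to give it conversation history

Add a session_id

Pass a session_id in the config of each LLM call so that conversation histories are kept apart per session_id

Use streaming responses

Use LangchainCallbackHandler to listen for LangChain events, which makes debugging in the page easier

Add knowledge sources

Switch the LLM to the Qianfan chat model

[!NOTE] Troubleshooting

Network timeouts

Slow responses

Change the document chunking logic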

1. Adding history

Retrieval, before:

query -> retriever
Retrieval, after:
(query, conversation history) -> LLM -> rephrased query -> retriever

Answer generation, before:

(query, context) -> LLM -> answer
Answer generation, after:
(query, conversation history, context) -> LLM -> answer

Both prompts receive the conversation history through the "chat_history" input key; these messages are inserted after the system message and before the human message that contains the latest question.
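To make the message ordering concrete, here is a minimal sketch in plain Python. It does not use LangChain: the `(role, content)` tuples and the `build_messages` helper are hypothetical stand-ins for the message objects that `ChatPromptTemplate` and `MessagesPlaceholder("chat_history")` would produce.

```python
from typing import List, Tuple

# (role, content): a simplified, hypothetical stand-in for LangChain message objects
Message = Tuple[str, str]


def build_messages(
    system_prompt: str, chat_history: List[Message], question: str
) -> List[Message]:
    """Place the history after the system message and before the latest question."""
    return [("system", system_prompt), *chat_history, ("human", question)]


history = [
    ("human", "What is RAG?"),
    ("ai", "RAG combines retrieval with generation."),
]
messages = build_messages(
    "Answer using the reference content.", history, "Why is it useful?"
)
for role, content in messages:
    print(f"{role}: {content}")
```

With an empty history the list collapses to just the system message and the question, which is why the same prompt works for the first turn of a conversation.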

create_history_aware_retriever handles the case where chat_history is empty by passing the query straight to the retriever; otherwise it applies prompt | llm | StrOutputParser() | retriever in sequence.
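The branching described above can be sketched without LangChain. This is only an illustration of the control flow, not the library's implementation; `toy_rephrase` and `toy_retrieve` are hypothetical stand-ins for the LLM rewrite step and the vector-store retriever.

```python
from typing import Callable, List


def history_aware_retrieve(
    question: str,
    chat_history: List[str],
    rephrase: Callable[[str, List[str]], str],
    retrieve: Callable[[str], List[str]],
) -> List[str]:
    """Skip the rewrite step when there is no history, otherwise rewrite first."""
    if not chat_history:
        # First turn: the question is already standalone
        return retrieve(question)
    # Later turns: turn the follow-up into a standalone query, then retrieve
    return retrieve(rephrase(question, chat_history))


def toy_rephrase(question: str, chat_history: List[str]) -> str:
    # Hypothetical stand-in for the LLM call that rewrites the question
    return f"{question} (about: {chat_history[-1]})"


def toy_retrieve(query: str) -> List[str]:
    # Hypothetical stand-in for the vector-store retriever
    return [f"doc for <{query}>"]


first = history_aware_retrieve("What is RAG?", [], toy_rephrase, toy_retrieve)
follow_up = history_aware_retrieve(
    "Why is it useful?", ["What is RAG?"], toy_rephrase, toy_retrieve
)
print(first)
print(follow_up)
```

Skipping the rewrite on the first turn saves one LLM call, since a question with no preceding context cannot depend on earlier messages.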

A memory component manages the conversation history automatically.

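2. Core code

Environment setup
Rewrite chain: rewrites the user's question using the chat history
QA chain: generates an answer from the question and the reference content
RAG chain: combines the rewrite chain and the QA chain into a complete RAG chain
Managing the chat history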

2.1 Environment setup

# @Author: 青松
# WeChat official account: FasterAI
# Python, version 3.10.14
# PyTorch, version 2.3.0
# Chainlit, version 1.1.301

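2.2 Building the `rewrite chain`

Rewrite chain: rewrites the user's question using the conversation context

# Turn the Chroma vector store into a retriever
retriever = vectorstore.as_retriever()

# Prompt that rewrites the user's question using the conversation context
contextualize_q_system_prompt = (
    "Given the conversation below and a follow-up question, rewrite the "
    "follow-up question as a standalone question in its original language."
)
contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

# Rewrite chain: rewrites the user's question using the conversation context
history_aware_retriever = create_history_aware_retriever(
    llm, retriever, contextualize_q_prompt
)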

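2.3 Building the `QA chain`

# Prompt that answers the user's question from the reference content
system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the reference content provided to answer the user's question. "
    "If you don't know the answer, say that you don't know; do not try to make one up. "
    "The reference content is as follows:"
    "\n\n"
    "{context}"
)
qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

# QA chain: generates an answer from the question and the reference content
question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)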

2.4 Building the `RAG chain`

# RAG chain: combines the rewrite chain and the QA chain into a complete RAG chain
rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)
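The chain built by create_retrieval_chain returns a dictionary that keeps the input keys and adds `context` (the retrieved documents) and `answer`. The sketch below mimics only that data flow in plain Python; `toy_retrieval_chain` and the two lambdas are hypothetical stand-ins, not LangChain code.

```python
from typing import Callable, Dict, List


def toy_retrieval_chain(
    retrieve: Callable[[Dict], List[str]],
    answer: Callable[[Dict], str],
) -> Callable[[Dict], Dict]:
    """Compose a retrieval step and an answer step into one callable."""

    def run(inputs: Dict) -> Dict:
        state = dict(inputs)                # keep the original input keys
        state["context"] = retrieve(state)  # step 1: fetch reference passages
        state["answer"] = answer(state)     # step 2: answer from question + context
        return state

    return run


chain = toy_retrieval_chain(
    retrieve=lambda s: [f"passage matching '{s['input']}'"],
    answer=lambda s: f"Based on {len(s['context'])} passage(s): ...",
)
result = chain({"input": "What is RAG?", "chat_history": []})
print(sorted(result))
```

Because the output is a dictionary rather than a bare string, later steps have to select the `answer` key explicitly, for example via `res["answer"]` or `.pick("answer")`.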

2.5 Managing the chat history

# In-memory store of chat histories, keyed by session_id
store = {}


def get_session_history(session_id: str) -> BaseChatMessageHistory:
    # Create an empty history the first time a session_id is seen
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

# Add chat_history handling to rag_chain
conversational_rag_chain = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
    output_messages_key="answer",
)
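As a rough mental model of what the wrapper contributes, the sketch below loads the history for a session, injects it under `chat_history`, runs the chain, and then records the new question and answer. It is a simplified plain-Python illustration, not LangChain's implementation; `toy_store`, `with_history`, and `echo_chain` are hypothetical names.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content), a simplified stand-in
toy_store: Dict[str, List[Message]] = {}


def get_history(session_id: str) -> List[Message]:
    """Return the history for a session, creating it on first use."""
    return toy_store.setdefault(session_id, [])


def with_history(chain: Callable[[Dict], Dict]) -> Callable[[Dict, str], Dict]:
    """Wrap a chain so it reads and updates per-session history."""

    def run(inputs: Dict, session_id: str) -> Dict:
        history = get_history(session_id)
        # Inject a copy of the past messages under the history key
        result = chain({**inputs, "chat_history": list(history)})
        # Record this turn so the next call can see it
        history.append(("human", inputs["input"]))
        history.append(("ai", result["answer"]))
        return result

    return run


def echo_chain(inputs: Dict) -> Dict:
    # Hypothetical chain that only reports how much history it received
    return {"answer": f"seen {len(inputs['chat_history'])} earlier messages"}


chat = with_history(echo_chain)
r1 = chat({"input": "hi"}, "user-a")
r2 = chat({"input": "again"}, "user-a")
r3 = chat({"input": "hi"}, "user-b")
print(r1["answer"], "|", r2["answer"], "|", r3["answer"])
```

The second call for `user-a` sees the two messages from the first turn, while `user-b` starts from an empty history, which is the isolation that session_id provides.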

2.6 Using the chat history via session_id

session_id = cl.user_session.get("session_id")
conversational_rag_chain = cl.user_session.get("conversational_rag_chain")

# Answer the user's question, using session_id to select the chat history.
# ainvoke is awaited so the async Chainlit handler is not blocked by the call.
res = await conversational_rag_chain.ainvoke(
    {"input": message.content},
    config=RunnableConfig(
        configurable={"session_id": session_id},
        callbacks=[cl.LangchainCallbackHandler()]),
)

await cl.Message(content=res["answer"]).send()
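Since histories are looked up by session_id, two chats that share an id also share a history. One possible way to keep them apart, shown here as an assumption rather than the author's approach, is to generate a fresh id per chat with the standard `uuid` module; `new_session_id` and `make_config` are hypothetical helpers, and the plain dict has the same shape as the `configurable` argument used above.

```python
import uuid


def new_session_id() -> str:
    """Return a random 32-character hex id for one chat session."""
    return uuid.uuid4().hex


def make_config(session_id: str) -> dict:
    """Build the config mapping that carries session_id to the history lookup."""
    return {"configurable": {"session_id": session_id}}


config_a = make_config(new_session_id())
config_b = make_config(new_session_id())
print(config_a)
print(config_b)
```

Generating the id once when the chat starts and storing it in the user session would keep it stable across the turns of that chat.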

2.7 Streaming the response

msg = cl.Message(content="")

# Stream the response, using session_id to select the chat history
chain = conversational_rag_chain.pick("answer")  # keep only the 'answer' key of the output
async for chunk in chain.astream(
    {"input": message.content},
    config=RunnableConfig(
        configurable={"session_id": session_id},
        callbacks=[cl.LangchainCallbackHandler()])
):
    await msg.stream_token(chunk)

await msg.send()
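The streaming loop follows the usual async-iterator pattern: consume chunks as they arrive and forward each one to the UI. The self-contained sketch below reproduces that pattern with `asyncio` only; `fake_token_stream` is a hypothetical stand-in for `chain.astream(...)`, and appending to a list stands in for `msg.stream_token(chunk)`.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_token_stream(text: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for chain.astream(): yield the answer piece by piece."""
    for token in text.split(" "):
        await asyncio.sleep(0)  # hand control back to the event loop between chunks
        yield token + " "


async def stream_answer(text: str) -> str:
    shown: List[str] = []
    async for chunk in fake_token_stream(text):
        shown.append(chunk)  # in the app: await msg.stream_token(chunk)
    return "".join(shown).strip()


final_answer = asyncio.run(stream_answer("RAG grounds answers in retrieved text"))
print(final_answer)
```

Each chunk is handled as soon as it is produced, so the user sees partial output instead of waiting for the full answer.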

3. Results

4. Full code

# @Author: 青松
# WeChat official account: FasterAI
# Python, version 3.10.14
# PyTorch, version 2.3.0
# Chainlit, version 1.1.301

import chainlit as cl
from langchain.chains import create_history_aware_retriever
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.retrieval import create_retrieval_chain
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.prompts import MessagesPlaceholder, ChatPromptTemplate
from langchain_core.runnables import RunnableConfig
from langchain_core.runnables.history import RunnableWithMessageHistory

import llm_util
from common import Constants

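# Get the LLM instance
llm = llm_util.get_llm(Constants.MODEL_NAME['QianFan'])

# Get the text embedding model
model_name = "BAAI/bge-small-zh"
encode_kwargs = {"normalize_embeddings": True}
embeddings_model = HuggingFaceBgeEmbeddings(
    model_name=model_name, encode_kwargs=encode_kwargs
)

# Configure the text splitter: chunks of up to 1000 characters with a 100-character overlap
# (chunk_size is measured with len() by default, so it counts characters, not tokens)
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)

# In-memory store of chat histories, keyed by session_id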
store = {}


def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


@cl.on_chat_start
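async def on_chat_start():
    """Handle the chat-start event."""

    # NOTE: a fixed id means every chat session shares one history in `store`
    session_id = "abc123"
    cl.user_session.set("session_id", session_id)

    await send_welcome_msg()

    files = None

    # Wait for the user to upload a file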
    while files is None:
        files = await cl.AskFileMessage(
            content="Please upload a text file to begin!",
            accept=["text/plain"],
            max_size_mb=20,
            timeout=180,
        ).send()

    file = files[0]

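    # Send a message saying the file is being processed
    msg = cl.Message(content=f"Processing `{file.name}`...", disable_feedback=True)
    await msg.send()

    with open(file.path, "r", encoding="utf-8") as f:
        text = f.read()

    # Split the file into text chunks
    texts = text_splitter.split_text(text)

    # Attach metadata to each text chunk
    metadatas = [{"source": f"{i}-pl"} for i in range(len(texts))]

    # Build the Chroma vector store asynchronously
    vectorstore = await cl.make_async(Chroma.from_texts)(
        texts, embeddings_model, metadatas=metadatas
    )

    # Turn the Chroma vector store into a retriever
    retriever = vectorstore.as_retriever()

    # Prompt that rewrites the user's question using the conversation context
    contextualize_q_system_prompt = (
        "Given the conversation below and a follow-up question, rewrite the "
        "follow-up question as a standalone question in its original language."
    )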
    contextualize_q_prompt = ChatPromptTemplate.from_messages(
        [
            ("system", contextualize_q_system_prompt),
            MessagesPlaceholder("chat_history"),
            ("human", "{input}"),
        ]
    )

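    # Rewrite chain: rewrites the user's question using the conversation context
    history_aware_retriever = create_history_aware_retriever(
        llm, retriever, contextualize_q_prompt
    )

    # Prompt that answers the user's question from the reference content
    system_prompt = (
        "You are an assistant for question-answering tasks. "
        "Use the reference content provided to answer the user's question. "
        "If you don't know the answer, say that you don't know; do not try to make one up. "
        "The reference content is as follows:"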
        "\n\n"
        "{context}"
    )
    qa_prompt = ChatPromptTemplate.from_messages(
        [
            ("system", system_prompt),
            MessagesPlaceholder("chat_history"),
            ("human", "{input}"),
        ]
    )

    # QA chain: generates an answer from the question and the reference content
    question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)

    # RAG chain: combines the rewrite chain and the QA chain into a complete RAG chain
    rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)

    # Add chat_history handling to rag_chain
    conversational_rag_chain = RunnableWithMessageHistory(
        rag_chain,
        get_session_history,
        input_messages_key="input",
        history_messages_key="chat_history",
        output_messages_key="answer",
    )

    # Store the chain in the user session so on_message can reuse it
    cl.user_session.set("conversational_rag_chain", conversational_rag_chain)

    # Tell the user the file has been processed by updating the earlier message
    msg.content = f"Processing `{file.name}` done. You can now ask questions!"
    await msg.update()


@cl.on_message
async def on_message(message: cl.Message):
    """Handle incoming user messages."""
    session_id = cl.user_session.get("session_id")
    conversational_rag_chain = cl.user_session.get("conversational_rag_chain")

    msg = cl.Message(content="")

    # Stream the response, using session_id to select the chat history
    chain = conversational_rag_chain.pick("answer")  # keep only the 'answer' key of the output
    async for chunk in chain.astream(
            {"input": message.content},
            config=RunnableConfig(
                configurable={"session_id": session_id},
                callbacks=[cl.LangchainCallbackHandler()])
    ):
        await msg.stream_token(chunk)

    await msg.send()


async def send_welcome_msg():
    image = cl.Image(url="https://qingsong-1257401904.cos.ap-nanjing.myqcloud.com/wecaht.png")

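    # Send an image
    await cl.Message(
        content="**青松** invites you to follow **FasterAI**, which aims to make everyone's AI learning path a little easier! Scan the code now to get on the fast track for AI learning and interviews **(^_^)** ",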
        elements=[image],
    ).send()
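Each chunk in the code above carries a `source` entry in its metadata, which can be used to show where an answer came from. The following is a minimal sketch under stated assumptions: it presumes the full chain output (with its `context` list) is available, as in the non-streaming call of section 2.6, and it uses a hypothetical `Doc` dataclass in place of LangChain's document objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document with metadata."""

    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def collect_sources(context: List[Doc]) -> List[str]:
    """Return the distinct `source` values, in first-seen order."""
    seen: List[str] = []
    for doc in context:
        source = doc.metadata.get("source")
        if source and source not in seen:
            seen.append(source)
    return seen


context = [
    Doc("chunk a", {"source": "0-pl"}),
    Doc("chunk b", {"source": "3-pl"}),
    Doc("chunk a again", {"source": "0-pl"}),
]
sources = collect_sources(context)
print("Sources: " + ", ".join(sources))
```

Note that the streaming handler picks only the `answer` key, so the `context` list would have to be captured separately there before sources could be displayed.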

Source: https://www.cnblogs.com/fasterai/p/18571471
