首页 > 其他分享 >Intro to LLM Agents with Langchain: When RAG is Not Enough

Intro to LLM Agents with Langchain: When RAG is Not Enough

时间:2024-11-10 22:57:16浏览次数:6  
标签:RAG prompt When agent langchain Langchain model tools history

https://towardsdatascience.com/intro-to-llm-agents-with-langchain-when-rag-is-not-enough-7d8c08145834

As always, you can find the code on GitHub, and here are separate Colab Notebooks:

  1. Planning and reasoning
  2. Different types of memories
  3. Various types of tools
  4. Building complete agents

Introduction to the agents

Illustration by author. LLMs are often augmented with external memory via RAG architecture. Agents extend this concept to memory, reasoning, tools, answers, and actions

Let’s begin the lecture by exploring various examples of LLM agents. While the topic is widely discussed, few are actively utilizing agents; often, what we perceive as agents are simply large language models. Let’s consider such a simple task as searching for football game results and saving them as a CSV file. We can compare several available tools:

  • GPT-4 with search and plugins: as you will find in the chat history here, GPT-4 failed to do the task due to code errors
  • AutoGPT through https://evo.ninja/ at least could generate some kind of CSV (not ideal though):

Since the available tools are not great, let’s learn from the first principles of how to build agents from scratch. I am using amazing Lilian’s blog article as a structure reference but adding more examples on my own.

Step 1: Planning

The visual difference between simple “input-output” LLM usage and such techniques as a chain of thought, a chain of thought with self-consistency, a tree of thought

You might have come across various techniques aimed at improving the performance of large language models, such as offering tips or even jokingly threatening them. One popular technique is called “chain of thought,” where the model is asked to think step by step, enabling self-correction. This approach has evolved into more advanced versions like the “chain of thought with self-consistency” and the generalized “tree of thoughts,” where multiple thoughts are created, re-evaluated, and consolidated to provide an output.

In this tutorial, I am using heavily Langsmith, a platform for productionizing LLM applications. For example, while building the tree of thoughts prompts, I save my sub-prompts in the prompts repository and load them:

from langchain import hub
from langchain.chains import SequentialChain

cot_step1 = hub.pull("rachnogstyle/nlw_jan24_cot_step1")
cot_step2 = hub.pull("rachnogstyle/nlw_jan24_cot_step2")
cot_step3 = hub.pull("rachnogstyle/nlw_jan24_cot_step3")
cot_step4 = hub.pull("rachnogstyle/nlw_jan24_cot_step4")

model = "gpt-3.5-turbo"

chain1 = LLMChain(
llm=ChatOpenAI(temperature=0, model=model),
prompt=cot_step1,
output_key="solutions"
)

chain2 = LLMChain(
llm=ChatOpenAI(temperature=0, model=model),
prompt=cot_step2,
output_key="review"
)

chain3 = LLMChain(
llm=ChatOpenAI(temperature=0, model=model),
prompt=cot_step3,
output_key="deepen_thought_process"
)

chain4 = LLMChain(
llm=ChatOpenAI(temperature=0, model=model),
prompt=cot_step4,
output_key="ranked_solutions"
)

overall_chain = SequentialChain(
chains=[chain1, chain2, chain3, chain4],
input_variables=["input", "perfect_factors"],
output_variables=["ranked_solutions"],
verbose=True
)

You can see in this notebook the result of such reasoning, the point I want to make here is the right process for defining your reasoning steps and versioning them in such an LLMOps system like Langsmith. Also, you can see other examples of popular reasoning techniques in public repositories like ReAct or Self-ask with search:

prompt = hub.pull("hwchase17/react")
prompt = hub.pull("hwchase17/self-ask-with-search")

Other notable approaches are:

  • Reflexion (Shinn & Labash 2023) is a framework to equip agents with dynamic memory and self-reflection capabilities to improve reasoning skills.
  • Chain of Hindsight (CoH; Liu et al. 2023) encourages the model to improve on its own outputs by explicitly presenting it with a sequence of past outputs, each annotated with feedback.

Step 2: Memory

We can map different types of memories in our brain to the components of the LLM agents' architecture

  • Sensory Memory: This component of memory captures immediate sensory inputs, like what we see, hear or feel. In the context of prompt engineering and AI models, a prompt serves as a transient input, similar to a momentary touch or sensation. It’s the initial stimulus that triggers the model’s processing.
  • Short-Term Memory: Short-term memory holds information temporarily, typically related to the ongoing task or conversation. In prompt engineering, this equates to retaining the recent chat history. This memory enables the agent to maintain context and coherence throughout the interaction, ensuring that responses align with the current dialogue. In code, you typically add it as conversation history:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain.agents import AgentExecutor
from langchain.agents import create_openai_functions_agent

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
tools = [retriever_tool]
agent = create_openai_functions_agent(
llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

message_history = ChatMessageHistory()
agent_with_chat_history = RunnableWithMessageHistory(
agent_executor,
lambda session_id: message_history,
input_messages_key="input",
history_messages_key="chat_history",
)
  • Long-Term Memory: Long-term memory stores both factual knowledge and procedural instructions. In AI models, this is represented by the data used for training and fine-tuning. Additionally, long-term memory supports the operation of RAG frameworks, allowing agents to access and integrate learned information into their responses. It’s like the comprehensive knowledge repository that agents draw upon to generate informed and relevant outputs. In code, you typically add it as a vectorized database:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

loader = WebBaseLoader("https://neurons-lab.com/")
docs = loader.load()
documents = RecursiveCharacterTextSplitter(
chunk_size=1000, chunk_overlap=200
).split_documents(docs)
vector = FAISS.from_documents(documents, OpenAIEmbeddings())
retriever = vector.as_retriever()

Step 3: Tools

In practice, you want to augment your agent with a separate line of reasoning (which can be another LLM, i.e. domain-specific or another ML model for image classification) or with something more rule-based or API-based

ChatGPT Plugins and OpenAI API function calling are good examples of LLMs augmented with tool use capability working in practice.

  • Built-in Langchain tools: Langchain has a pleiad of built-in tools ranging from internet search and Arxiv toolkit to Zapier and Yahoo Finance. For this simple tutorial, we will experiment with the internet search provided by Tavily:
from langchain.utilities.tavily_search import TavilySearchAPIWrapper
from langchain.tools.tavily_search import TavilySearchResults

search = TavilySearchAPIWrapper()
tavily_tool = TavilySearchResults(api_wrapper=search)

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)
agent_chain = initialize_agent(
[retriever_tool, tavily_tool],
llm,
agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
verbose=True,
)
  • Custom tools: it’s also very easy to define your own tools. Let’s dissect the simple example of a tool that calculates the length of the string. You need to use the @tooldecorator to make Langchain know about it. Then, don’t forget about the type of input and the output. But the most important part will be the function comment between """ """ — this is how your agent will know what this tool does and will compare this description to descriptions of the other tools:
from langchain.pydantic_v1 import BaseModel, Field
from langchain.tools import BaseTool, StructuredTool, tool

@tool
def calculate_length_tool(a: str) -> int:
"""The function calculates the length of the input string."""
return len(a)

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)
agent_chain = initialize_agent(
[retriever_tool, tavily_tool, calculate_length_tool],
llm,
agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
verbose=True,
)

You can find examples of how it works in this script, but you also can see an error — it doesn't pull the correct description of the Neurons Lab company and despite calling the right custom function of length calculation, the final result is wrong. Let’s try to fix it!

Step 4: All together

I am providing a clean version of combining all the pieces of architecture together in this script. Notice, how we can easily decompose and define separately:

  • All kinds of tools (search, custom tools, etc)
  • All kinds of memories (sensory as a prompt, short-term as runnable message history, and as a sketchpad within the prompt, and long-term as a retrieval from the vector database)
  • Any kind of planning strategy (as a part of a prompt pulled from the LLMOps system)

The final definition of the agent will look as simple as this:

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_with_chat_history = RunnableWithMessageHistory(
agent_executor,
lambda session_id: message_history,
input_messages_key="input",
history_messages_key="chat_history",
)

As you can see in the outputs of the script (or you can run it yourself), it solves the issue in the previous part related to tools. What changed? We defined a complete architecture, where short-term memory plays a crucial role. Our agent obtained message history and a sketchpad as a part of the reasoning structure which allowed it to pull the correct website description and calculate its length.

Outro

I hope this walkthrough through the core elements of the LLM agent architecture will help you design functional bots for the cognitive tasks you aim to automate. To complete, I would like to emphasize again the importance of having all elements of the agent in place. As we can see, missing short-term memory or having an incomplete description of a tool can mess with the agent’s reasoning and provide incorrect answers even for very simple tasks like summary generation and its length calculation. Good luck with your AI projects and don’t hesitate to reach out if you need help at your company!

 

标签:RAG,prompt,When,agent,langchain,Langchain,model,tools,history
From: https://www.cnblogs.com/lightsong/p/18538702

相关文章

  • Langchain ReAct
    officialhttps://python.langchain.com/v0.1/docs/modules/agents/agent_types/react/https://python.langchain.com/v0.2/api_reference/langchain/agents/langchain.agents.react.agent.create_react_agent.htmlfromlangchainimporthubfromlangchain_community.llm......
  • 在 Github Action 管道内集成 Code Coverage Report
    GithubActions我们的开源项目Host在Github,并且使用它强大的Actions功能在做CICD。单看GithubActions可能不知道是啥。其实它就是我们常说的CICDpipeline或者叫workflow。当我们Push代码到Github,它会自动触发这些管道。它会帮我们自动build代码,跑testcases,构......
  • 小北的字节跳动青训营与LangChain实战课:深入探索Chain的奥秘(上)写一篇完美鲜花推文?用Se
     前言    最近,字节跳动的青训营再次扬帆起航,作为第二次参与其中的小北,深感荣幸能借此机会为那些尚未了解青训营的友友们带来一些详细介绍。青训营不仅是一个技术学习与成长的摇篮,更是一个连接未来与梦想的桥梁~小北的青训营XMarsCode技术训练营——AI加码,字节跳......
  • localStorage和sessionStorage的区别
    `localStorage`和`sessionStorage`都是浏览器提供的本地存储方案,它们之间有几个关键的区别,包括数据的生命周期、作用域以及存储容量等方面。1.**区别:**  -**生命周期:**   -`localStorage`:存储的数据没有过期时间限制,除非显式删除或浏览器缓存被清除,否则数据将一......
  • python实战(七)——基于LangChain的RAG实践
    一、任务目标    基于之前的RAG实战,相信大家对RAG的实现已经有了一定的了解了。这篇文章将使用LangChain作为辅助,实现一个高效、便于维护的RAG程序。二、什么是LangChain        LangChain是一个用于构建大模型应用程序的开源框架,它内置了多个模块化组件。......
  • [LeetCode] 1343. Number of Sub-arrays of Size K and Average Greater than or Equa
    Givenanarrayofintegersarrandtwointegerskandthreshold,returnthenumberofsub-arraysofsizekandaveragegreaterthanorequaltothreshold.Example1:Input:arr=[2,2,2,2,5,5,5,8],k=3,threshold=4Output:3Explanation:Sub-arrays[2......
  • **直接存储器访问(Direct Storage, DS)**是一种高效的数据传输技术,主要用于加速数据在计
    直接存储器访问(DirectStorage,DS)**直接存储器访问(DirectStorage,DS)**是一种高效的数据传输技术,主要用于加速数据在计算机系统中的传输过程。它允许设备(如硬盘、固态硬盘(SSD)或其他外部存储设备)直接将数据传输到内存,而不经过CPU的中介。通过减少CPU的干预,DS能够显著提高数据的......
  • Langchain-提示工程
    今天我们来尝试用思维链也就是CoT(ChainofThought)的概念来引导模型的推理,让模型生成更详实、更完备的文案什么是ChainofThoughtCoT这个概念来源于学术界,是谷歌大脑的JasonWei等人于2022年在论文《Chain-of-ThoughtPromptingElicitsReasoninginLargeLanguageModel......
  • localeStorage 当前标签页变化监听不到,只能监听不同标签页变化,自己写方法监听
     1.在utils中新建一个文件watchLocalStorage.tsexportdefaultfunctiondispatchEventStroage(){ constsignSetItem=localStorage.setItem localStorage.setItem=function(key,val){ letsetEvent=newEvent('setItemEvent') setEvent.key=key set......
  • langchain agent with tools sample code
    importasynciofromlangchain_openaiimportChatOpenAIfromlangchain.agentsimporttoolfromlangchain_core.promptsimportChatPromptTemplate,MessagesPlaceholderfromlangchain.agents.format_scratchpad.openai_toolsimport(format_to_openai_tool_me......