Battle of the LLM Giants: Google PaLM 2 vs OpenAI GPT-3.5 | by Wen Yang | Jun, 2023


3. Agent utilizing tools and following instructions

Reminder: in order to use the Google Search API (SerpApi), you can sign up for an account here. After that, you can generate a SerpApi API key. The Free Plan allows 100 searches per month.
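LangChain's SerpAPIWrapper reads the key from the SERPAPI_API_KEY environment variable, so one way to make it available (the key value below is a placeholder, not a real key) is:

```python
import os

# Placeholder value; replace with your own SerpApi key.
os.environ["SERPAPI_API_KEY"] = "your-serpapi-api-key"
```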

from langchain.agents import Tool, initialize_agent
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.utilities import SerpAPIWrapper


def chat_agent(query, llm):
    # ======= Step 1: Search tool ========
    # Google search via SerpApi
    search = SerpAPIWrapper()

    # ======= Step 2: Memory ========
    memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

    # ======= Step 3: Chain ========
    # option 1: RetrievalQA with sources chain
    # qa = RetrievalQAWithSourcesChain.from_chain_type(
    #     llm=llm,
    #     chain_type="stuff",
    #     retriever=vectorstore.as_retriever()
    # )

    # option 2: Conversational Retrieval chain
    qa = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=vectorstore.as_retriever(),
        memory=memory,
        return_source_documents=False
    )

    # ======= Step 4: Create a list of tools ========
    tools = [
        # Outside knowledge base
        Tool(
            name='Knowledge Base',
            func=qa.__call__,  # qa.run won't work!!
            description='use this tool when answering general knowledge queries'
        ),
        # Search
        Tool(
            name='Search',
            func=search.run,
            description='use this tool when you need to answer questions about weather or current status of the world'
        )
    ]

    # ======= Step 5: Agent ========
    agent = initialize_agent(
        agent='chat-conversational-react-description',
        llm=llm,
        tools=tools,
        verbose=True,
        max_iterations=3,
        early_stopping_method='generate',
        memory=memory
    )

    return agent(query)

The key idea is that our chat agent has an LLM to generate responses, a toolbox with a list of tools, and short-term memory of past interactions. We want our agent to use the Pinecone knowledge base to answer questions most of the time, and to use the search tool only for questions about the weather or the current state of the world.
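Tool choice hinges on those description strings: the agent's LLM reads them and decides which tool fits the query. As a toy, LangChain-free illustration of that routing idea (hypothetical; the real agent lets the LLM reason over the descriptions rather than keyword-matching):

```python
# Toy illustration of description-based tool routing (hypothetical;
# LangChain's agent asks the LLM to pick a tool, it does not keyword-match).
TOOLS = {
    "Knowledge Base": "use this tool when answering general knowledge queries",
    "Search": "use this tool when you need to answer questions about weather "
              "or current status of the world",
}

def route(query: str) -> str:
    """Pick the tool whose description shares the most words with the query."""
    query_words = set(query.lower().split())
    scores = {
        name: len(query_words & set(desc.lower().split()))
        for name, desc in TOOLS.items()
    }
    return max(scores, key=scores.get)

print(route("What will the weather be like tomorrow?"))  # Search
```

This makes it clear why precise, non-overlapping descriptions matter: if both descriptions mentioned "questions", the routing signal would blur.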

Our first question is:

“Could you plan a two-day trip to Yellowstone national park with daily itineraries?”

Let’s see the responses generated by both agents.

From Palm Agent:

Response from Palm Agent

The Palm agent had issues parsing the LLM output. It also went straight to the Search tool instead of following the instruction to use the knowledge base for general inquiries.

From gpt-3.5 Agent:

Response from gpt-3.5 agent

The gpt-3.5 agent had no problem parsing the output, and it followed the instructions more closely, using the knowledge base to answer the question. The quality is also quite good: it provided a detailed daily itinerary.

Now let’s test a follow-up question, for which we want the agent to use the search tool. The idea is that when a user plans an upcoming trip in our chat, they might want to know the weather at the destination. Here we purposefully asked about “weather there” instead of “weather in Yellowstone” to test whether the agent can remember past conversations.

“What will the weather there be like over the next 7 days?”

The Palm agent searched for the weather in Seattle, which is not what we want.

Palm agent search weather

The gpt-3.5 agent is no better. It searched Greenville, NC, which is also far from our destination, Yellowstone.

Gpt-3.5 agent search weather

Both agents made the correct decision to use the search tool; however, they seem to suffer from a bit of amnesia, with no recollection of the destination we had been chatting about! The issue may be related to how the LangChain agent handles interaction memory. If you have encountered similar issues, or better yet have insights on how to fix this, please let me know!
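One hypothetical workaround, until the memory issue is sorted out, is to resolve vague references like “there” on the application side before the query reaches the agent. A minimal sketch, assuming we track destinations ourselves (the DestinationMemory class and its logic are an assumption, not part of LangChain):

```python
# Hypothetical pre-processing sketch: substitute a remembered destination
# for the vague "there" before handing the query to the agent.
class DestinationMemory:
    def __init__(self):
        self.destination = None

    def update(self, query: str, known_places: list[str]) -> str:
        """Remember any known place mentioned, then rewrite 'there'."""
        for place in known_places:
            if place.lower() in query.lower():
                self.destination = place
        if self.destination and "there" in query.lower().split():
            query = query.replace("there", f"in {self.destination}")
        return query

mem = DestinationMemory()
mem.update("Plan a two-day trip to Yellowstone", ["Yellowstone", "Seattle"])
print(mem.update("What will the weather there be like?", ["Yellowstone", "Seattle"]))
```

The rewritten query (“What will the weather in Yellowstone be like?”) gives the agent the destination explicitly, so the search tool no longer has to guess.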
