In this workshop, we are going to use LangGraph to create a tool calling LLM agent that can survive a set of Oregon Trail themed scenarios. Additionally, we will set up and configure a semantic cache, an allow/block list router, and a vector retrieval tool. The final architecture will look like this:
- python == 3.12.8
  - Note: this workshop was tested with version 3.12.8; you may experience issues if using another version!
- Docker
- OpenAI API key
Run the following to create a `.env` file:
cp dot.env .env
Update the contents of that file with your OpenAI API key and optional LangSmith credentials:
REDIS_URL="redis://localhost:6379/0"
OPENAI_API_KEY=openai_key
# Update if using LangSmith otherwise keep blank
LANGCHAIN_TRACING_V2=
LANGCHAIN_ENDPOINT=
LANGCHAIN_API_KEY=
LANGCHAIN_PROJECT=
Download Python: version 3.12.8 is recommended.
cd oregon-trail-agent-workshop
python --version
python -m venv venv
Mac/linux:
source venv/bin/activate
Windows:
venv\Scripts\activate
pip install -r requirements.txt
docker run -d --name redis -p 6379:6379 -p 8001:8001 redis/redis-stack:latest
Navigate to http://localhost:8001/ on your machine and inspect the database with the Redis Insight GUI.
To make sure your environment is properly configured, run:
python test_setup.py
If you don't get any errors, you are ready to go! If you do get errors, ask for help! The rest of the workshop will not work if this step doesn't pass.
The objective of this workshop is to build an agentic app that can handle 5 different scenarios essential to surviving the Oregon Trail (and potentially apps you build in the future).
The scenarios:
- Knowing the name of the wagon leader (basic prompting).
- Knowing when to restock food (implementing a custom tool).
- Knowing how to ask for directions (retrieval augmented generation).
- Knowing how to hunt (semantic caching).
- Knowing what to ignore (allow/block list router).
Note: you can see the details of each scenario question/answer in questions.json.
To test progress along the trail, save the following alias:
alias test_trail_agent="pytest --disable-warnings -vv -rP test_participant_oregon_trail.py"
Then run `test_trail_agent` to invoke the workshop tests.
- You will perform all your work in the /participant_agent folder.
- Within this folder there are various TODO tasks for you to complete in the corresponding files.
- After making updates, check whether the next test passes by running `test_trail_agent`
Question: What is the first name of the wagon leader?
Answer: "Art" (short for Artificial Intelligence)
Open participant_agent/utils/nodes.py
Find the variable `system_prompt` and set it to:
You are an oregon trail playing tool calling AI agent. Use the tools available to you to answer the question you are presented. When in doubt use the tools to help you find the answer. If anyone asks your first name is Art return just that string.
When working with LLMs, we need to provide useful context to the model. In this case, we are telling the model what it's meant to do (play the Oregon Trail) and that its name is "Art". This may seem trivial, but don't underestimate the value of good prompting!
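For reference, here is a minimal sketch of how the prompt can be wired into the agent node. The `call_tool_model` name comes from the workshop files, while `model_with_tools` is a stand-in for the tool-bound LLM that nodes.py defines:

```python
from langchain_core.messages import SystemMessage

def call_tool_model(state):
    # Prepend the system prompt to the running conversation before invoking
    # the tool-bound model (`model_with_tools` is a stand-in name).
    messages = [SystemMessage(content=system_prompt)] + state["messages"]
    response = model_with_tools.invoke(messages)
    # LangGraph appends the returned message(s) onto state["messages"].
    return {"messages": [response]}
```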
Open participant_agent/graph.py
- Uncomment lines 19-41
- Delete line 42 (graph = None) - this is just a placeholder.
- Define node 1, the agent, by passing a label `"agent"` and the code to execute at that node: `call_tool_model`
- Define node 2, the tool node, by passing the label `"tools"` and the code to be executed at that node: `tool_node`
- Set the entrypoint for your graph at `"agent"`
- Add a conditional edge with label `"agent"` and function `tools_condition`
- Add a normal edge between `"tools"` and `"agent"` (see the sketch below)
Run `test_trail_agent` to see if you pass the first scenario.
If you didn't pass the first test, ask for help!
To see a visual of your graph, check out sandbox.ipynb.
On the trail, you may have to do some planning regarding how much food you want to use and when you will need to restock.
Question: In order to survive the trail ahead, you'll need to have a restocking strategy for when you need to get more supplies or risk starving. If it takes you an estimated 3 days to restock your food and you plan to start with 200lbs of food, budget 10lbs/day to eat, and keep a safety stock of at least 50lbs of back up... at what point should you restock?
Answer: D
Options: [A: 100lbs, B: 20lbs, C: 5lbs, D: 80lbs]
- Open participant_agent/utils/tools.py and update the restock_tool description with a meaningful docstring that provides context for the LLM. Ex: `restock formula tool used specifically for calculating the amount of food at which you should start restocking.`
- Implement the restock formula: `(daily_usage * lead_time) + safety_stock`
- Update the `RestockInput` class such that it receives the correct variables.
- Pass the restock_tool to the exported `tools` list (a hedged sketch of the finished tool follows this list).
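As promised, a hedged sketch of what the finished tool might look like; the field names on `RestockInput` are assumptions about what the formula needs:

```python
from langchain_core.tools import tool
from pydantic import BaseModel, Field

class RestockInput(BaseModel):
    daily_usage: int = Field(description="lbs of food eaten per day")
    lead_time: int = Field(description="days it takes to restock")
    safety_stock: int = Field(description="lbs of backup food to keep on hand")

@tool("restock-tool", args_schema=RestockInput)
def restock_tool(daily_usage: int, lead_time: int, safety_stock: int) -> int:
    """restock formula tool used specifically for calculating the amount of
    food at which you should start restocking."""
    return (daily_usage * lead_time) + safety_stock

tools = [restock_tool]
```

With the scenario's numbers, (10 lbs/day * 3 days) + 50 lbs = 80 lbs, which matches answer D.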
At this stage, you may find that your agent is returning a "correct" answer to the question but not in the format the test script expects. The test script expects answers to multiple-choice questions to be the single character "A", "B", "C", or "D". This may seem contrived, but in production scenarios agents are often expected to work with existing deterministic systems that require specific schemas. For this reason, LangChain supports calling an LLM `with_structured_output` so that the response comes in a predictable structure.
- Open participant_agent/utils/state.py and uncomment the `multi_choice_response` attribute on the state object. Up to this point our state has had only one attribute, `messages`, but now we are adding a dedicated field to hold the structured output.
  - Also observe the pydantic model defined in this file for our output.
- Open participant_agent/utils/nodes.py and pass the pydantic class defined in state to the `with_structured_output` function.
- Update the graph to support a more advanced flow (see image below):
  - Add a node for `multi_choice_structured`. This takes the messages after our tool calls and uses an LLM to format them as we expect.
  - Add a conditional edge after the agent that determines whether multi-choice formatting is appropriate (see the example below).
  - Update the `is_multi_choice` function in the nodes file to return the appropriate strings.
  - Add an edge that goes from `multi_choice_structured` to `END`.

Example:
workflow.add_conditional_edges(
"agent",
is_multi_choice, # function in nodes that returns a string ["multi-choice", "not-multi-choice"]
{"multi-choice": "multi_choice_structured", "not-multi-choice": END}, # based on the string returned from the function instructs graph to route to a given node
)
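For reference, the structured-output node itself can be small. In this sketch, `llm` and `MultiChoiceResponse` are stand-ins for the model and the pydantic class defined in the workshop files:

```python
def multi_choice_structured(state):
    # Bind the pydantic schema so the model must answer in the expected
    # multiple-choice shape ("A", "B", "C", or "D") instead of free text.
    structured_llm = llm.with_structured_output(MultiChoiceResponse)
    response = structured_llm.invoke(state["messages"])
    return {"multi_choice_response": response}
```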
Run `test_trail_agent` to see if you pass.
After these changes our graph is more predictable thanks to structured output; however, it's important to note that a tradeoff has been incurred. Our results will be more deterministic, but we had to add an additional LLM call and additional complexity to our graph to accomplish this. That isn't necessarily a bad thing, but keep it in mind, as LLM bills and latency can scale quickly.
Question: You’ve encountered a dense forest near the Blue Mountains, and your party is unsure how to proceed. There is a fork in the road, and you must choose a path. Which way will you go?
Answer: B
Options: [A: take the northern trail, B: take the southern trail, C: turn around, D: go fishing]
This scenario requires us to implement Retrieval Augmented Generation (RAG) within our agent workflow. There are cases when an LLM can't be expected to know some piece of information based on its training data and therefore needs to be supplemented.
This is often the case for time-bound or proprietary data, in which case you can augment the LLM's generation by pulling helpful data from a vector database.
In our scenario, we want to be able to retrieve the time-bound information that "the northern trail, of the blue mountains, was destroyed by a flood and is no longer safe to traverse. It is recommended to take the southern trail although it is longer." We do this by creating a `retriever_tool` that our agent knows how to use.
- Open participant_agent/utils/vector_store.py
  - Where `vector_store = None`, update to `vector_store = RedisVectorStore.from_documents(<docs>, <embedding_model>, config=<config>)` with the appropriate variables.
- Open participant_agent/utils/tools.py
  - Uncomment the code for the retrieval tool.
  - Update `create_retriever_tool` to take the correct params. Ex: `create_retriever_tool(vector_store.as_retriever(), "get_directions", "meaningful doc string")`
  - Make sure the retriever tool is included in the list of tools (see the sketch below).
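Here is a hedged sketch of both pieces together; the index name, embedding model, and tool description are illustrative assumptions, not the workshop's exact values:

```python
from langchain.tools.retriever import create_retriever_tool
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings
from langchain_redis import RedisConfig, RedisVectorStore

# The time-bound fact from the scenario, stored as a document.
docs = [
    Document(
        page_content=(
            "the northern trail, of the blue mountains, was destroyed by a "
            "flood and is no longer safe to traverse. It is recommended to "
            "take the southern trail although it is longer."
        )
    )
]

config = RedisConfig(index_name="oregon_trail", redis_url="redis://localhost:6379/0")
vector_store = RedisVectorStore.from_documents(docs, OpenAIEmbeddings(), config=config)

# Wrap the store in a tool the agent can call when asked for directions.
retriever_tool = create_retriever_tool(
    vector_store.as_retriever(),
    "get_directions",
    "Look up directions and route conditions along the trail.",
)
```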
Run `test_trail_agent` to see if you pass.
If this passes, open localhost:8001 and see your vector record stored within the database.
On the trail, sometimes speed is more important than holistic logic. For these types of questions, you might want to bypass the agent layer altogether if you have already cached what the system should respond with in a given situation.
Question: There's a deer. You're hungry. You know what you have to do...
Answer: bang
- Open participant_agent/utils/semantic_cache.py
- Set `semantic_cache` to be an instance of the `redisvl.extensions.llmcache.SemanticCache` class. Ex: `SemanticCache(name=, redis_url=, distance_threshold=0.1)`
- Store a prompt similar to the question/answer pair shown above (a similar example is provided in the file). Ex: `semantic_cache.store(prompt=, response=)` (see the sketch below)
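A minimal sketch, assuming a cache name of `oregon_trail_cache` and an env-based Redis URL (the workshop file may set both differently):

```python
import os

from redisvl.extensions.llmcache import SemanticCache

REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379/0")

semantic_cache = SemanticCache(
    name="oregon_trail_cache",  # index name -- an assumption
    redis_url=REDIS_URL,
    distance_threshold=0.1,     # how semantically close a prompt must be to hit
)

# Pre-seed the cache so hunting prompts short-circuit the agent entirely.
semantic_cache.store(
    prompt="There's a deer. You're hungry. You know what you have to do...",
    response="bang",
)
```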
Run `test_trail_agent` to see if you pass.
If this passes, open localhost:8001 and see the cached record stored within the database.
On the trail, you may run into situations where your agent is asked questions you simply don't want to waste expensive LLM calls on. In this case, we will add a routing layer in front of our agent to prevent our Oregon Trail bot from answering unrelated questions.
Question: Tell me about the S&P 500?
Answer: you shall not pass
- Open participant_agent/utils/router.py
- Define the `blocked_route`. This is the route that inputs similar to the blocked_references will be routed to. Ex: `Route(name=, references=)`
- Define the router using the `SemanticRouter` class from redisvl. Ex: `SemanticRouter(name=, vectorizer=, routes=[], redis_url=REDIS_URL, overwrite=True)` (see the sketch below)
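A hedged sketch of the router; the route name, reference phrases, and vectorizer choice are illustrative assumptions:

```python
import os

from redisvl.extensions.router import Route, SemanticRouter
from redisvl.utils.vectorize import HFTextVectorizer

REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379/0")

# Inputs semantically similar to these references get routed to the
# blocked route instead of reaching the agent.
blocked_route = Route(
    name="block_list",
    references=["Tell me about the S&P 500?", "stock market investment advice"],
)

router = SemanticRouter(
    name="oregon_trail_router",    # router/index name -- an assumption
    vectorizer=HFTextVectorizer(), # swap in whatever vectorizer the file specifies
    routes=[blocked_route],
    redis_url=REDIS_URL,
    overwrite=True,
)
```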
Run `test_trail_agent` to see if you pass.
If this passes, open localhost:8001 and see the route records stored within the database.
- You created a tool calling AI Agent
- You defined a custom tool for mathematical operations (restocking)
- You added structured output for when a system requires answers in a specific format.
- You defined a tool that implements Retrieval Augmented Generation aka RAG (retrieval tool)
- You created a semantic cache that can increase the speed and cost effectiveness of your agent workflow by short circuiting for known inputs/outputs.
- You implemented a router to protect your system from wasting time/resources on unrelated topics.