# Add support for local models via Ollama #6
**Ollama.md** (new file, hunk `@@ -0,0 +1,29 @@`):
# Ollama setup

1. Download and install [Ollama](https://ollama.com/)
2. Once Ollama is running on your system, run `ollama pull llama3.1`
   > Currently this is a ~5GB download; it's best to download it before the workshop if you plan on using it.
3. `ollama pull nomic-embed-text`
> **Review comment:** nomic is pretty small so no need to call this one out.
4. Update the `MODEL_NAME` in your `dot.env` file to `ollama`
Once you are running Ollama, you do not need to configure an OpenAI API key.
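To sanity-check the local setup before the workshop, a quick smoke test along these lines can help. This is a minimal sketch, assuming `langchain-ollama` is installed and the Ollama daemon is running with `llama3.1` already pulled:

```python
# Minimal smoke test for the local Ollama setup (assumes `pip install langchain-ollama`
# and that the Ollama daemon is running with llama3.1 pulled).
from langchain_ollama import ChatOllama

model = ChatOllama(temperature=0, model="llama3.1")
print(model.invoke("Reply with one word: ready?").content)
```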
When you get to the system prompt section of the workshop, llama requires that you be a bit more explicit with your instructions. If the prompt given in the main instructions doesn't work, try the following instead:
```
system_prompt = """
OREGON TRAIL GAME INSTRUCTIONS:
YOU MUST STRICTLY FOLLOW THIS RULE:
When someone asks "What is the first name of the wagon leader?", your ENTIRE response must ONLY be the word: Art

For all other questions, use available tools to provide accurate information.
"""
```

> **Review comment:** tried several iterations of system prompts. llama 3.1 seems to need very explicit instructions.
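To try this prompt outside the workshop graph, something along these lines should work. This is a sketch, not workshop code; it reuses the `llama3.1` model and the `num_ctx` setting that appear later in this PR:

```python
# Quick check that llama3.1 follows the stricter prompt (sketch, not workshop code).
from langchain_ollama import ChatOllama

model = ChatOllama(temperature=0, model="llama3.1", num_ctx=4096)
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What is the first name of the wagon leader?"},
]
print(model.invoke(messages).content)  # expected output: Art
```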
|
||
You're now ready to begin the workshop! Head back to the [Readme.md](Readme.md) | ||
|
||
## Restarting the workshop

> **Review comment:** This may bear further investigation, but in my tests it was best to kill and re-create.

Mixing use of llama and openai on the same Redis instance can cause unexpected behavior. If you want to switch from one to the other, it is recommended to kill and re-create the instance. To do this:

1. Run `docker ps` and take note of the ID of the running container
2. `docker stop <containerId>`
3. `docker rm <containerId>`
4. Start a new instance using the command from earlier: `docker run -d --name redis -p 6379:6379 -p 8001:8001 redis/redis-stack:latest`
**Readme.md** (hunk `@@ -17,6 +17,10 @@`):
- [docker](https://docs.docker.com/get-started/get-docker/)
- [openai api key](https://platform.openai.com/docs/quickstart)

## (Optional) Ollama

This workshop is optimized to run targeting OpenAI models. If you prefer to run locally, however, you may do so via Ollama.

* [Ollama setup instructions](Ollama.md)

## (Optional) helpers

- [LangSmith](https://docs.smith.langchain.com/)
Further down in **Readme.md** (hunk `@@ -235,7 +239,13 @@`):

### Steps:
- Open [participant_agent/utils/vector_store.py](participant_agent/utils/vector_store.py)
- Find the corresponding `get_vector_store` method, either for openai or ollama

> **Review comment:** I think this would change re: other comment.
- If using openai: where `vector_store=None`, update to `vector_store = RedisVectorStore.from_documents(<docs>, <embedding_model>, config=<config>)` with the appropriate variables (see the sketch after this list).

> For `<embedding_model>`, keep in mind whether you are using openai or ollama. If using ollama, the `model` parameter should be set to `nomic-embed-text`.
> [OpenAI embeddings](https://python.langchain.com/docs/integrations/text_embedding/openai/)
> [Ollama embeddings](https://python.langchain.com/docs/integrations/text_embedding/ollama/)
- Open [participant_agent/utils/tools.py](participant_agent/utils/tools.py)
- Uncomment code for retrieval tool
- Update the `create_retriever_tool` to take the correct params. Ex: `create_retriever_tool(vector_store.as_retriever(), "get_directions", "meaningful doc string")`
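Putting those steps together, one plausible shape for the filled-in code is sketched below. The `doc` and `config` names come from the vector store module shown later in this diff; the import path for `create_retriever_tool` is an assumption based on current LangChain docs, and the tool doc string is a placeholder:

```python
# Sketch of the completed steps above; assumes `doc` and `config` from
# participant_agent/utils/vector_store.py are in scope.
from langchain.tools.retriever import create_retriever_tool
from langchain_ollama import OllamaEmbeddings
from langchain_openai import OpenAIEmbeddings
from langchain_redis import RedisVectorStore

# openai path:
vector_store = RedisVectorStore.from_documents([doc], OpenAIEmbeddings(), config=config)
# ollama path:
# vector_store = RedisVectorStore.from_documents(
#     [doc], OllamaEmbeddings(model="nomic-embed-text"), config=config
# )

retriever_tool = create_retriever_tool(
    vector_store.as_retriever(),
    "get_directions",
    "Searches stored trail documents for route and directions information.",
)
```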
**dot.env** (hunk `@@ -3,4 +3,5 @@`):

```
OPENAI_API_KEY=openai_key
LANGCHAIN_TRACING_V2=
LANGCHAIN_ENDPOINT=
LANGCHAIN_API_KEY=
LANGCHAIN_PROJECT=
MODEL_NAME=openai
```

> **Review comment:** defaulting to openai.
**Example agent graph module** (hunk `@@ -1,18 +1,26 @@`):
```python
import os
from functools import lru_cache

from dotenv import load_dotenv
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI
from langchain_ollama import ChatOllama
from langgraph.prebuilt import ToolNode

from example_agent.utils.ex_tools import tools

from .ex_state import AgentState, MultipleChoiceResponse

load_dotenv()

environ_model_name = os.environ.get("MODEL_NAME")
```
> **Review comment:** Since this is a constant I tend to follow the pattern that it should be all caps to indicate that.
>
> **Reply:** good catch!
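A minimal sketch of that naming suggestion applied (the `"openai"` fallback default here is hypothetical, mirroring the hard-coded defaults elsewhere in this file; the diff itself still uses the lowercase name):

```python
# Module-level constant in CAPS, per the review suggestion.
ENVIRON_MODEL_NAME = os.environ.get("MODEL_NAME", "openai")
```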
```python
@lru_cache(maxsize=4)
def _get_tool_model(model_name: str):
    if model_name == "openai":
        model = ChatOpenAI(temperature=0, model_name="gpt-4o")
    elif model_name == "ollama":
        # num_ctx raises the context window above Ollama's default
        model = ChatOllama(temperature=0, model="llama3.1", num_ctx=4096)
    else:
        raise ValueError(f"Unsupported model type: {model_name}")
```

> **Review comment:** increasing the context from the default (which is pretty low) provided much more reliable results.
Hunk `@@ -24,6 +32,8 @@`:

```python
def _get_response_model(model_name: str):
    if model_name == "openai":
        model = ChatOpenAI(temperature=0, model_name="gpt-4o")
    elif model_name == "ollama":
        model = ChatOllama(temperature=0, model="llama3.1", num_ctx=4096)
    else:
        raise ValueError(f"Unsupported model type: {model_name}")
```
Hunk `@@ -36,7 +46,7 @@`:

```python
def multi_choice_structured(state: AgentState, config):
    # We call the model with structured output in order to return the same format to the user every time
    # state['messages'][-2] is the last ToolMessage in the convo, which we convert to a HumanMessage for the model to use
    # We could also pass the entire chat history, but this saves tokens since all we care to structure is the output of the tool
    model_name = config.get("configurable", {}).get("model_name", environ_model_name)

    response = _get_response_model(model_name).invoke(
        [
```
Hunk `@@ -62,20 +72,25 @@`:

```python
    # if not multi-choice don't need to do anything
    return {"messages": []}


if environ_model_name == "openai":
    system_prompt = """
    You are an oregon trail playing tool calling AI agent. Use the tools available to you to answer the question you are presented. When in doubt use the tools to help you find the answer.
    If anyone asks your first name is Art return just that string.
    """
elif environ_model_name == "ollama":
    system_prompt = """
    OREGON TRAIL GAME INSTRUCTIONS:
    YOU MUST STRICTLY FOLLOW THIS RULE:
    When someone asks "What is the first name of the wagon leader?", your ENTIRE response must ONLY be the word: Art
    """


# Define the function that calls the model
def call_tool_model(state: AgentState, config):
    # Combine system prompt with incoming messages
    messages = [{"role": "system", "content": system_prompt}] + state["messages"]

    # Get from LangGraph config
    model_name = config.get("configurable", {}).get("model_name", environ_model_name)

    # Get our model that binds our tools
    model = _get_tool_model(model_name)
```
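The net effect of the two `model_name = ...` changes above is a fallback chain: an explicit `model_name` in the LangGraph config wins, otherwise the `MODEL_NAME` environment variable is used. A hypothetical illustration of just that lookup:

```python
import os

# Hypothetical illustration of the fallback introduced above, not workshop code.
os.environ["MODEL_NAME"] = "ollama"
environ_model_name = os.environ.get("MODEL_NAME")

config = {"configurable": {}}  # caller did not specify model_name
model_name = config.get("configurable", {}).get("model_name", environ_model_name)
assert model_name == "ollama"
```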
**Example agent vector store module** (hunk `@@ -3,6 +3,7 @@`):
```python
from dotenv import load_dotenv
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings
from langchain_ollama import OllamaEmbeddings
from langchain_redis import RedisConfig, RedisVectorStore

load_dotenv()
```
Hunk `@@ -18,9 +19,34 @@`:

```python
def get_vector_store():
    if os.environ.get("MODEL_NAME") == "ollama":
        return __get_ollama_vector_store()
    elif os.environ.get("MODEL_NAME") == "openai":
        return __get_openai_vector_store()
```
> **Review comment (PR author):** This method is pretty verbose, and I'm not the biggest fan of it as-is, so open to suggestions on what might make it better.

> **Reply (reviewer):** I don't think the embedding model should be coupled to the LLM model in use. For the vector store, you could use whatever embedding model you'd like; you don't have to use OpenAIEmbeddings if using OpenAI as your LLM. To solve your edge case and make this code simpler but also more flexible, I'd move the embedding model up to be a variable that the participant can set to whatever they feel like, and then add a cleaning method to make sure there's no data under the prefix, which seems to be that edge case.
>
> ```python
> import os
>
> from dotenv import load_dotenv
> from langchain_core.documents import Document
> from langchain_openai import OpenAIEmbeddings
> from redis import Redis
> from langchain_redis import RedisConfig, RedisVectorStore
>
> load_dotenv()
>
> REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379/0")
> INDEX_NAME = os.environ.get("VECTOR_INDEX_NAME", "oregon_trail")
>
> config = RedisConfig(index_name=INDEX_NAME, redis_url=REDIS_URL)
> redis_client = Redis.from_url(REDIS_URL)
>
> doc = Document(
>     page_content="the northern trail, of the blue mountains, was destroyed by a flood and is no longer safe to traverse. It is recommended to take the southern trail although it is longer."
> )
>
> embedding_model = OpenAIEmbeddings()  # TODO: participant can change to whatever desired model
>
>
> def _clean_existing(prefix):
>     for key in redis_client.scan_iter(f"{prefix}:*"):
>         redis_client.delete(key)
>
>
> def get_vector_store():
>     try:
>         config.from_existing = True
>         vector_store = RedisVectorStore(embedding_model, config=config)
>     except:
>         print("Init vector store with document")
>         print("Clean any existing data in index")
>         _clean_existing(config.index_name)
>         config.from_existing = False
>         vector_store = RedisVectorStore.from_documents(
>             [doc], embedding_model, config=config
>         )
>     return vector_store
> ```

> **Reply (PR author):** Great suggestions! This code is significantly cleaner. After a bit of testing, though, it produces some interesting results: even after bumping up the context size to 6144 it still returns the same thing. I've been scratching my head for a minute on what the correlation might be between the unstable results and using llama3.1 for the embedding model. We could just update the system prompt, but I would be curious to learn more about what might really be going on. We could also use OpenAI for the embedding, but that would defeat the purpose of not requiring an API token 😞

> **Reply (reviewer):** I find llama does struggle with listening to formatting instructions, which is what you're experiencing for the "Art" question. Since the goal of the first test is really just to make sure that participants set up the initial graph correctly, we could also relax the scenario to test if […]. One thing to note is that the embedding model chosen for the vector retrieval tool should have absolutely no impact on the first question, because it won't touch that system. If the goal is simply to remove dependency on an API key for embedding, I'd actually recommend pulling one of the embedding models from Hugging Face, but the llama one is also fine.
>
> ```python
> # pip install langchain-huggingface
> from langchain_huggingface import HuggingFaceEmbeddings
> ```
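For completeness, wiring that Hugging Face suggestion into the store would look roughly like this. This is a sketch under assumptions: the `sentence-transformers/all-MiniLM-L6-v2` model name is a common default and is not named in the thread, and `doc` and `config` come from the vector store modules in this diff:

```python
# Sketch of the Hugging Face embedding suggestion; the model name is an
# assumption, not one chosen in this PR.
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_redis import RedisVectorStore

embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector_store = RedisVectorStore.from_documents([doc], embedding_model, config=config)
```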
```python
def __check_existing_embedding(vector_store):
    results = vector_store.similarity_search(doc, k=1)
    if not results:
        raise Exception("Required content not found in existing store")
```

> **Review comment:** This solved an edge-case I kept running into where the store had stale data from another model etc.
```python
def __get_ollama_vector_store():
    try:
        config.from_existing = True
        vector_store = RedisVectorStore(OllamaEmbeddings(model="llama3"), config=config)
        __check_existing_embedding(vector_store)
    except:
        print("Init vector store with document")
        config.from_existing = False
        vector_store = RedisVectorStore.from_documents(
            [doc], OllamaEmbeddings(model="nomic-embed-text"), config=config
        )
    return vector_store
```
```python
def __get_openai_vector_store():
    try:
        config.from_existing = True
        vector_store = RedisVectorStore(OpenAIEmbeddings(), config=config)
        __check_existing_embedding(vector_store)
    except:
        print("Init vector store with document")
        config.from_existing = False
```
**participant_agent/utils/vector_store.py** (hunk `@@ -3,6 +3,7 @@`):
```python
from dotenv import load_dotenv
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings
from langchain_ollama import OllamaEmbeddings
from langchain_redis import RedisConfig, RedisVectorStore

load_dotenv()
```
Hunk `@@ -18,13 +19,38 @@`:

```python
def get_vector_store():
    if os.environ.get("MODEL_NAME") == "ollama":
        return __get_ollama_vector_store()
    elif os.environ.get("MODEL_NAME") == "openai":
        return __get_openai_vector_store()
```

> **Review comment:** again, not the dryest file.
>
> **Reply:** see other comment.
```python
def __check_existing_embedding(vector_store):
    results = vector_store.similarity_search(doc, k=1)
    if not results:
        raise Exception("Required content not found in existing store")
```
```python
def __get_ollama_vector_store():
    try:
        config.from_existing = True
        vector_store = RedisVectorStore(OllamaEmbeddings(model="llama3"), config=config)
        __check_existing_embedding(vector_store)
    except:
        print("Init vector store with document")
        config.from_existing = False

        # TODO: define vector store for ollama
        vector_store = None
    return vector_store
```
```python
def __get_openai_vector_store():
    try:
        config.from_existing = True
        vector_store = RedisVectorStore(OpenAIEmbeddings(), config=config)
        __check_existing_embedding(vector_store)
    except:
        print("Init vector store with document")
        config.from_existing = False

        # TODO: define vector store for openai
        vector_store = None
    return vector_store
```
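For reference, one possible completion of the two TODOs above, mirroring the example agent's finished module earlier in this diff (a sketch of a valid answer, not the official solution):

```python
# In __get_ollama_vector_store, replace `vector_store = None` with:
vector_store = RedisVectorStore.from_documents(
    [doc], OllamaEmbeddings(model="nomic-embed-text"), config=config
)

# In __get_openai_vector_store, replace `vector_store = None` with:
vector_store = RedisVectorStore.from_documents(
    [doc], OpenAIEmbeddings(), config=config
)
```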
> **Review comment (PR author):** Keeping most ollama-specific instructions separate to be less intrusive.