Commit 0776782 (AI-feedback)
## LangSmith Streamlit Chat UI Example

In this example, you will create a ChatGPT-like web app in Streamlit that supports streaming, custom instructions, app feedback, and more. The final app will look like the following:

[![Chat UI](img/chat_overview.png)](https://langsmith-chat-feedback.streamlit.app/)

In making this app, you will use:

- LangChain chains or runnables to handle prompt templating, LLM calls, and memory management
- The LangSmith client to send user feedback and display trace links
- The Streamlit runtime and UI components

In particular, you will save user feedback as simple 👍/👎 scores attributed to traced runs, then walk through how to view it in the LangSmith UI. Feedback benefits LLM applications by providing signal for few-shot examples, model fine-tuning, evaluations, personalized user experiences, and improved application observability.

Now, without further ado, let's get started!
## Prerequisites

To trace your runs and log feedback, you'll need to configure your environment to connect to [LangSmith](https://smith.langchain.com/). To do so, define the following environment variables:

```bash
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
export LANGCHAIN_API_KEY=<your-api-key>
export LANGCHAIN_PROJECT=streamlit-demo
```
We'll be using OpenAI, so configure your API key for it as well:

```bash
export OPENAI_API_KEY=<your-openai-key>
```
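Before launching the app, you can sanity-check that these variables are actually visible to Python. This helper is not part of the example app; it's just a small stdlib sketch:

```python
import os

def missing_env(required):
    """Return the names of required environment variables that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]

required = [
    "LANGCHAIN_TRACING_V2",
    "LANGCHAIN_ENDPOINT",
    "LANGCHAIN_API_KEY",
    "LANGCHAIN_PROJECT",
    "OPENAI_API_KEY",
]
if missing_env(required):
    print("Still need to export:", ", ".join(missing_env(required)))
```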
Since we'll be installing some updated packages, we recommend running in a virtual environment:

```bash
python -m virtualenv .venv
. .venv/bin/activate
```

Then, install the project requirements:

```bash
pip install -r requirements.txt
```
Finally, you should be able to run the app!

## Running the example

Execute the following command:

```bash
streamlit run main.py
```

It should spin up the chat app on your localhost. Feel free to chat, rate the runs, and view the linked traces using the appropriate buttons! Once you've traced some interactions and provided feedback, you can navigate to the `streamlit-demo` project (or whichever `LANGCHAIN_PROJECT` you have configured for this application) to see all the traces for this project.

The aggregate feedback is displayed at the top of the screen, alongside the median and 99th percentile run latencies. In this case, 86% of the runs that received feedback were given a "thumbs up."

![Aggregate Feedback](img/average_feedback.png)
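The aggregate numbers shown in the UI are simple to reason about. As a rough sketch (with made-up scores and latencies, not data from the app), the thumbs-up rate and latency percentiles could be computed like this:

```python
from statistics import median, quantiles

# Hypothetical feedback scores (1 = thumbs up, 0 = thumbs down) and run latencies in seconds
scores = [1, 1, 0, 1, 1, 1, 0]
latencies = [0.8, 1.1, 0.9, 2.4, 1.0, 1.3, 0.7]

thumbs_up_rate = sum(scores) / len(scores)  # fraction of positive feedback
p50 = median(latencies)                     # median latency
p99 = quantiles(latencies, n=100)[98]       # 99th percentile latency

print(f"{thumbs_up_rate:.0%} thumbs up, p50={p50:.2f}s, p99={p99:.2f}s")
```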
You can click one of the auto-populated filters to view only runs that received a positive or negative score, or you can apply other filters based on latency, the number of tokens consumed, or other parameters.

Below, we've filtered to only the runs that were given a "thumbs up" by the user.

![Positive User Feedback](img/user_feedback_one.png)

Click one of the runs to see its full trace. This is useful for visualizing the data flow through the chain.

[![LangSmith](img/langsmith.png)](https://smith.langchain.com/public/1b571b29-1bcf-406b-9d67-19a48d808b44/r)

If you provided feedback to the selected run using one of the 👍/👎 buttons in the chat app, the "user feedback" will be visible in the "feedback" tab.

[![View Feedback](img/chat_feedback.png)](https://smith.langchain.com/public/1b571b29-1bcf-406b-9d67-19a48d808b44/r?tab=1)

You can add the run as an example to a dataset by clicking "+ Add to Dataset".

![Add to Dataset](img/add_to_dataset.png)

Before saving, feel free to modify the example outputs so that the dataset contains the "ideal" ground truth. This is especially useful if you are filtering for "thumbs down" examples and want to save "corrections" in a dataset.
## Code Walkthrough

The app consists of a main script managed by the `streamlit` event loop. Below are some key code snippets of what you've run.

After importing the required modules, you initialize the Streamlit session state with a trace link and run ID, and with a "langchain_messages" key, which is initialized within the `StreamlitChatMessageHistory`.

```python
if "trace_link" not in st.session_state:
    st.session_state.trace_link = None
if "run_id" not in st.session_state:
    st.session_state.run_id = None
memory = ConversationBufferMemory(
    chat_memory=StreamlitChatMessageHistory(key="langchain_messages"),
    return_messages=True,  # Use message formats with the chat model
    memory_key="chat_history",
)
```
Then you define the core logic of the chat model. This example lets you select between two equivalent chains: an `LLMChain`, and a chain built with LangChain's [expression language](https://python.langchain.com/docs/guides/expression_language/).

#### Option 1: Expression Language Chain

The chain built using the LangChain Expression Language can be found in [expression_chain.py](expression_chain.py). It looks like the following:

```python
memory = ConversationBufferMemory(
    chat_memory=StreamlitChatMessageHistory(key="langchain_messages"),
    return_messages=True,
    memory_key="chat_history",
)
ingress = RunnableMap(
    {
        "input": lambda x: x["input"],
        "chat_history": lambda x: memory.load_memory_variables(x)["chat_history"],
        "time": lambda _: str(datetime.now()),
    }
)
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a funky parrot pal. You are not an AI. You are a parrot."
            " You love poetry, reading, funk music, friendship, and squawking!"
            " It's currently {time}.",
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
)
llm = ChatOpenAI(temperature=0.7)
chain = ingress | prompt | llm
```
The expression language lets you compose different `Runnable` objects in a transparent way and provides sync/async, batch, and streaming methods that work end-to-end by default.
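To build intuition for why `chain = ingress | prompt | llm` reads left to right, here is a toy re-implementation of pipe composition. This is not LangChain code, just an illustration of the pattern behind `Runnable` composition with `|`:

```python
class Pipe:
    """Toy runnable: composes with | so data flows left to right."""

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # Feed our output into the next stage's function
        other_fn = other.fn if isinstance(other, Pipe) else other
        return Pipe(lambda x: other_fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

# Toy stand-ins for the ingress map, prompt template, and LLM
ingress = Pipe(lambda x: {"input": x["input"], "time": "12:00"})
prompt = Pipe(lambda d: f"[{d['time']}] Human: {d['input']}")
llm = Pipe(lambda s: s + " -> Squawk!")

chain = ingress | prompt | llm
print(chain.invoke({"input": "hello"}))  # [12:00] Human: hello -> Squawk!
```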
#### Option 2: LLMChain

The second option is to use LangChain's core workhorse, the [LLMChain](https://api.python.langchain.com/en/latest/chains/langchain.chains.llm.LLMChain.html#langchain.chains.llm.LLMChain). The chain is defined in [vanilla_chain.py](vanilla_chain.py) and looks like the following code block:

```python
memory = ConversationBufferMemory(return_messages=True, memory_key="chat_history")
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a funky parrot pal. You are not an AI. You are a parrot."
            " You love poetry, reading, funk music, and friendship!"
            " It's currently {time}.",
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
).partial(time=lambda: str(datetime.now()))
llm = ChatOpenAI(temperature=0.7)
chain = LLMChain(prompt=prompt, llm=llm, memory=memory)
```
#### Streamlit State

Once you've defined the chat model, including its conversational memory, you define another code block to manage the Streamlit session state:

```python
def _get_openai_type(msg):
    if msg.type == "human":
        return "user"
    if msg.type == "ai":
        return "assistant"
    if msg.type == "chat":
        return msg.role
    return msg.type


for msg in st.session_state.messages:
    with st.chat_message(_get_openai_type(msg)):
        st.markdown(msg.content)
    # Re-hydrate memory on app rerun
    memory.chat_memory.add_message(msg)
```

This does two things each time the Streamlit event loop is triggered:

1. Re-renders the chat conversation in the UI
2. Re-hydrates the memory so the chain will resume where you left off
After this, you define a function for logging feedback to LangSmith. It's a simple wrapper around the client:

```python
# Imported above
from langsmith import Client

client = Client()

def send_feedback(run_id, score):
    client.create_feedback(run_id, "user_score", score=score)
```

This will be used in the `on_click` event for the feedback buttons!
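To see how a callback like `send_feedback` gets bound to a button, here is a Streamlit-free sketch. The `functools.partial` wiring mirrors what you would pass as `on_click`, with a stub list standing in for the LangSmith client:

```python
from functools import partial

feedback_log = []  # stand-in for the LangSmith client in this sketch

def send_feedback(run_id, score):
    # The real app calls client.create_feedback(run_id, "user_score", score=score)
    feedback_log.append({"run_id": run_id, "score": score})

# In the app, this would look like: st.button("👍", on_click=partial(send_feedback, run_id, 1))
thumbs_up = partial(send_feedback, "run-123", 1)
thumbs_down = partial(send_feedback, "run-123", 0)

thumbs_up()  # simulate the user clicking 👍
print(feedback_log)  # [{'run_id': 'run-123', 'score': 1}]
```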
The logic for rendering the chat input and streaming the output to the app looks like this:

```python
if prompt := st.chat_input(placeholder="Ask me a question!"):
    st.chat_message("user").write(prompt)
    with st.chat_message("assistant", avatar="🦜"):
        message_placeholder = st.empty()
        full_response = ""
        # runnable_config is defined earlier in the app and carries the tracing callbacks
        for chunk in chain.stream({"input": prompt}, config=runnable_config):
            full_response += chunk.content
            message_placeholder.markdown(full_response + "▌")
        memory.save_context({"input": prompt}, {"output": full_response})
```

This renders a `chat_input` container, and when the user sends an input, it's converted to a "user" chat message. Then an "assistant" message is created, and tokens are streamed in by updating the full response and rendering it as markdown with a "▌" cursor icon to simulate typing.
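The token-accumulation loop is easy to see in isolation. Below, a fake stream stands in for `chain.stream(...)`, and a print replaces `message_placeholder.markdown(...)`:

```python
from types import SimpleNamespace

def fake_stream(tokens):
    # Stand-in for chain.stream(...): yields chunk objects with a .content attribute
    for token in tokens:
        yield SimpleNamespace(content=token)

full_response = ""
for chunk in fake_stream(["Sq", "ua", "wk", "!"]):
    full_response += chunk.content
    # In the app: message_placeholder.markdown(full_response + "▌")
    print(full_response + "▌")

print(full_response)  # Squawk!  (the completed message saved to memory)
```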
Once the response completes, the values are saved to memory, which updates the Streamlit message state so the conversation can be continued on the next loop.

Finally, you can create feedback for the response directly in the app using the following code:

```python
if st.session_state.get("run_id"):
    feedback = streamlit_feedback(
        feedback_type="thumbs",
        key=f"feedback_{st.session_state.run_id}",
    )
    if feedback:
        scores = {"👍": 1, "👎": 0}
        score = scores[feedback["score"]]
        feedback_record = client.create_feedback(
            st.session_state.run_id, "user_score", score=score
        )
        st.session_state.feedback = {
            "feedback_id": str(feedback_record.id),
            "score": score,
        }
```
To add additional comments or corrections via forms, add the following code blocks:

```python
# Prompt for more information, if feedback was submitted
if st.session_state.get("feedback"):
    feedback = st.session_state.get("feedback")
    feedback_id = feedback["feedback_id"]
    score = feedback["score"]
    if score == 0:
        # Add text input with a correction box
        correction = st.text_input(
            label="What would the correct or preferred response have been?",
            key=f"correction_{feedback_id}",
        )
        if correction:
            st.session_state.feedback_update = {
                "correction": {"desired": correction},
                "feedback_id": feedback_id,
            }
    if score == 1:
        comment = st.text_input(
            label="Anything else you'd like to add about this response?",
            key=f"comment_{feedback_id}",
        )
        if comment:
            st.session_state.feedback_update = {
                "comment": comment,
                "feedback_id": feedback_id,
            }
# Update the feedback if additional information was provided
if st.session_state.get("feedback_update"):
    feedback_update = st.session_state.get("feedback_update")
    feedback_id = feedback_update.pop("feedback_id")
    client.update_feedback(feedback_id, **feedback_update)
    # Clear the comments
    _reset_feedback()
```

These blocks use the Streamlit session state to track the state of the feedback dialog and ensure the original feedback is logged immediately, whether or not the user wants to add additional commentary.
## Reusable Tactics

Below are some tactics used in this example that you could reuse in other situations:

1. **Using the Run Collector:** One way to fetch the run ID is with the `RunCollectorCallbackHandler`, which stores all run objects in a simple Python list. The collected run IDs are used to associate logged feedback with runs and to access the trace URLs.

2. **Logging feedback with the LangSmith client:** The LangSmith client is used to create and update feedback for each run. A simple form is thumbs up/down, but it also supports `value`s, `comment`s, `correction`s, and other inputs. This way, users and annotators alike can share explicit feedback on a run.

3. **Accessing URLs from saved runs:** The client also retrieves URLs for saved runs, allowing users to inspect their interactions via a direct link to LangSmith traces.

4. **LangChain Expression Language:** This example optionally uses LangChain's [expression language](https://python.langchain.com/docs/guides/expression_language/) to create the chain and provide streaming support by default. It also gives more visibility into the resulting traces.
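Tactic 1 is easy to illustrate with a stub. The class below is a stand-in for LangChain's `RunCollectorCallbackHandler` (not the real implementation): the chain notifies the handler about each run, and the handler simply appends it to a list you can read afterwards:

```python
import uuid

class RunCollector:
    """Stub of the run-collector pattern: remember every traced run."""

    def __init__(self):
        self.traced_runs = []

    def on_chain_end(self, run):
        self.traced_runs.append(run)

def invoke_chain(user_input, callbacks):
    # Stand-in for chain.invoke(..., config={"callbacks": callbacks})
    run = {"id": str(uuid.uuid4()), "input": user_input}
    for handler in callbacks:
        handler.on_chain_end(run)
    return "Squawk!"

collector = RunCollector()
invoke_chain("hello", callbacks=[collector])
run_id = collector.traced_runs[0]["id"]  # use this id for feedback and trace URLs
```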
## Conclusion

The LangSmith Streamlit Chat UI example provides a straightforward approach to crafting a feature-rich chat interface. If you aim to develop conversational AI applications with real-time feedback and traceability, the techniques and implementations in this guide are tailored for you. Feel free to adapt the code to suit your specific needs.
The commit also adds a critique-chain module, which scores the AI's last response automatically given the user's next message:
```python
from datetime import datetime
import operator

from langchain import chat_models
from langchain import prompts
from langchain.schema import runnable
from langchain import memory
import langsmith
from langchain.output_parsers import openai_functions


def get_critique_chain(
    memory: memory.ConversationBufferMemory, client: langsmith.Client
) -> runnable.Runnable:
    """Return a functions chain that critiques the prediction given the user's next response."""
    ingress = runnable.RunnableMap(
        {
            "input": lambda x: x["input"],
            "chat_history": lambda x: memory.load_memory_variables(x)["chat_history"],
            "time": lambda _: str(datetime.now()),
        }
    )
    prompt = prompts.ChatPromptTemplate.from_messages(
        [
            (
                "system",
                "You are a QA assurance agent shadowing a colleague. Review the following"
                " conversation and score the quality of"
                " the AI assistant's last response, taking the user's next response into account."
                " For instance, if the user corrects the AI saying 'no' or seems frustrated,"
                " you should score the AI's last response poorly."
                "\nIt's currently {time}.\n\n<TRANSCRIPT>",
            ),
            prompts.MessagesPlaceholder(variable_name="chat_history"),
            # TODO: Could fetch previous feedback from this user / store in DB
            # to provide few-shot examples of good and bad responses for this user.
            ("human", "{input}"),
            (
                "system",
                "</TRANSCRIPT>\nBased on the previous messages, how would you "
                "rate the AI's last response? Use the critique function.",
            ),
        ]
    ).partial(time=lambda: str(datetime.now()))

    schema = {
        "name": "critique",
        "description": "Save critique for later review.",
        "parameters": {
            "type": "object",
            "properties": {
                "score": {
                    "type": "integer",
                    "description": "The numeric grade (from 1 to 10) stating how well your colleague's"
                    " response satisfied the user's need.",
                    "minimum": 1,
                    "maximum": 10,
                },
                "comment": {
                    "type": "string",
                    "description": "Step-by-step reasoning or explanation for the score.",
                },
                "correction": {
                    "type": "object",
                    "description": "What would a more appropriate response have been?",
                },
            },
        },
    }

    llm = chat_models.ChatOpenAI(temperature=0.7).bind(functions=[schema])
    chain = ingress | prompt | llm | openai_functions.JsonOutputFunctionsParser()

    feedback_chain = runnable.RunnableMap(
        {
            "result": (lambda x: {"input": x["input"]}) | chain,
            "run_id": operator.itemgetter("run_id"),
        }
    ) | (
        lambda x: client.create_feedback(
            run_id=x["run_id"], key="ai_score", **x["result"]
        )
    )

    return feedback_chain
```
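The `JsonOutputFunctionsParser` at the end of the chain turns the model's function call into a plain dict that is then forwarded to `create_feedback`. A hand-rolled sketch of that parsing step, using an illustrative payload rather than real model output:

```python
import json

# Hypothetical raw message returned by the model when it calls the critique function
raw_message = {
    "function_call": {
        "name": "critique",
        "arguments": '{"score": 3, "comment": "The AI ignored the user\'s correction."}',
    }
}

# The parser's essential job: decode the JSON arguments into a dict
critique = json.loads(raw_message["function_call"]["arguments"])
# critique now holds the kwargs passed on via client.create_feedback(run_id=..., key="ai_score", **critique)
print(critique["score"], critique["comment"])
```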
