|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "id": "66854c19-6b01-4046-8ac5-27940537c35c", |
| 6 | + "metadata": {}, |
| 7 | + "source": [ |
| 8 | + "# Evaluating an Extraction Chain\n", |
| 9 | + "\n", |
| 10 | + "Structured data [extraction](https://python.langchain.com/docs/use_cases/extraction) from unstructured text is a core part of many LLM applications. Common uses include preparing structured rows for database insertion, deriving API parameters for function calling and forms, and building knowledge graphs.\n", |
| 11 | + "\n", |
| 12 | + "This walkthrough presents a method to evaluate an extraction chain. While our example dataset consists of legal contracts, the principles and techniques laid out here apply across many domains and use cases.\n", |
| 13 | + "\n", |
| 14 | + "By the end of this guide, you'll be able to set up and evaluate extraction chains tailored to your specific needs.\n", |
| 15 | + "\n", |
| 16 | + "\n", |
| 17 | + "\n", |
| 18 | + "## Prerequisites\n", |
| 19 | + "\n", |
| 20 | + "This walkthrough requires LangChain and Anthropic. Ensure they're installed and that you've configured the necessary API keys." |
| 21 | + ] |
| 22 | + }, |
| 23 | + { |
| 24 | + "cell_type": "code", |
| 25 | + "execution_count": 1, |
| 26 | + "id": "0464edf8-751b-4b80-9f01-70af40f00c4b", |
| 27 | + "metadata": {}, |
| 28 | + "outputs": [], |
| 29 | + "source": [ |
| 30 | + "%pip install -U --quiet langchain langsmith langchain_experimental anthropic jsonschema" |
| 31 | + ] |
| 32 | + }, |
| 33 | + { |
| 34 | + "cell_type": "code", |
| 35 | + "execution_count": 2, |
| 36 | + "id": "a1e3b17b-736a-4991-8122-11bf3ac121c5", |
| 37 | + "metadata": {}, |
| 38 | + "outputs": [], |
| 39 | + "source": [ |
| 40 | + "import os\n", |
| 41 | + "import uuid\n", |
| 42 | + "\n", |
| 43 | + "uid = uuid.uuid4()\n", |
| 44 | + "os.environ[\"LANGCHAIN_API_KEY\"] = \"YOUR API KEY\"\n", |
| 45 | + "os.environ[\"ANTHROPIC_API_KEY\"] = \"sk-ant-***\"" |
| 46 | + ] |
| 47 | + }, |
| 48 | + { |
| 49 | + "cell_type": "markdown", |
| 50 | + "id": "b876b92c-74a5-47ae-9efc-dab64bf88d19", |
| 51 | + "metadata": {}, |
| 52 | + "source": [ |
| 53 | + "## 1. Create dataset\n", |
| 54 | + "\n", |
| 55 | + "For this task, we will fill out details about legal contracts from their context. We have prepared a small labeled dataset for this walkthrough based on the [Contract Understanding Atticus Dataset (CUAD)](https://github.com/TheAtticusProject/cuad). You can explore the [Contract Extraction](https://smith.langchain.com/public/08ab7912-006e-4c00-a973-0f833e74907b/d) dataset at the provided link." |
| 56 | + ] |
| 57 | + }, |
| 58 | + { |
| 59 | + "cell_type": "code", |
| 60 | + "execution_count": 3, |
| 61 | + "id": "921efddb-8210-4d5f-8705-41e6f9521b28", |
| 62 | + "metadata": {}, |
| 63 | + "outputs": [], |
| 64 | + "source": [ |
| 65 | + "from langsmith import Client\n", |
| 66 | + "\n", |
| 67 | + "share_token = \"08ab7912-006e-4c00-a973-0f833e74907b\"\n", |
| 68 | + "dataset_name = f\"Contract Extraction - {uid}\"\n", |
| 69 | + "\n", |
| 70 | + "client = Client()\n", |
| 71 | + "examples = list(client.list_shared_examples(share_token))\n", |
| 72 | + "dataset = client.create_dataset(dataset_name=dataset_name)\n", |
| 73 | + "client.create_examples(\n", |
| 74 | + " inputs=[e.inputs for e in examples],\n", |
| 75 | + " outputs=[e.outputs for e in examples],\n", |
| 76 | + " dataset_id=dataset.id,\n", |
| 77 | + ")" |
| 78 | + ] |
| 79 | + }, |
| 80 | + { |
| 81 | + "cell_type": "markdown", |
| 82 | + "id": "067a9e51-7fcb-4de5-87af-644b7ca9b893", |
| 83 | + "metadata": {}, |
| 84 | + "source": [ |
| 85 | + "## 2. Define extraction chain\n", |
| 86 | + "\n", |
| 87 | + "Our dataset inputs are quite long, so we will be testing out the experimental [Anthropic Functions](https://python.langchain.com/docs/integrations/chat/anthropic_functions) chain for this extraction task. This chain prompts the model to respond in XML that conforms to the provided schema.\n", |
| 88 | + "\n", |
| 89 | + "Below, we define the contract schema to extract." |
| 90 | + ] |
| 91 | + }, |
| 92 | + { |
| 93 | + "cell_type": "code", |
| 94 | + "execution_count": 4, |
| 95 | + "id": "e51793dc-ee9f-491c-aa91-fb32cf66308c", |
| 96 | + "metadata": {}, |
| 97 | + "outputs": [], |
| 98 | + "source": [ |
| 99 | + "from typing import List, Optional\n", |
| 100 | + "\n", |
| 101 | + "from pydantic import BaseModel\n", |
| 102 | + "\n", |
| 103 | + "\n", |
| 104 | + "class Address(BaseModel):\n", |
| 105 | + " street: str\n", |
| 106 | + " city: str\n", |
| 107 | + " state: str\n", |
| 108 | + " zip_code: str\n", |
| 109 | + " country: Optional[str] = None\n", |
| 110 | + "\n", |
| 111 | + "\n", |
| 112 | + "class Party(BaseModel):\n", |
| 113 | + " name: str\n", |
| 114 | + " address: Address\n", |
| 115 | + " type: Optional[str] = None\n", |
| 116 | + "\n", |
| 117 | + "\n", |
| 118 | + "class Section(BaseModel):\n", |
| 119 | + " title: str\n", |
| 120 | + " content: str\n", |
| 121 | + "\n", |
| 122 | + "\n", |
| 123 | + "class Contract(BaseModel):\n", |
| 124 | + " document_title: str\n", |
| 125 | + " exhibit_number: Optional[str] = None\n", |
| 126 | + " effective_date: str\n", |
| 127 | + " parties: List[Party]\n", |
| 128 | + " sections: List[Section]" |
| 129 | + ] |
| 130 | + }, |
| 131 | + { |
| 132 | + "cell_type": "markdown", |
| 133 | + "id": "25e90fd2-bd7c-499d-b02d-3eee98550031", |
| 134 | + "metadata": {}, |
| 135 | + "source": [ |
| 136 | + "Now we can define our extraction chain. We build it with `create_extraction_chain` and wrap it so that it accepts the dataset's input format." |
| 137 | + ] |
| 138 | + }, |
| 139 | + { |
| 140 | + "cell_type": "code", |
| 141 | + "execution_count": 5, |
| 142 | + "id": "6a18eb63-c525-40d6-be1d-bd4c52fbfbdf", |
| 143 | + "metadata": {}, |
| 144 | + "outputs": [], |
| 145 | + "source": [ |
| 146 | + "from langchain import hub\n", |
| 147 | + "from langchain.chains import create_extraction_chain\n", |
| 149 | + "from langchain_experimental.llms.anthropic_functions import AnthropicFunctions\n", |
| 150 | + "\n", |
| 151 | + "contract_prompt = hub.pull(\"wfh/anthropic_contract_extraction\")\n", |
| 152 | + "\n", |
| 153 | + "\n", |
| 154 | + "extraction_subchain = create_extraction_chain(\n", |
| 155 | + " Contract.schema(),\n", |
| 156 | + " llm=AnthropicFunctions(model=\"claude-2\", max_tokens=20_000),\n", |
| 157 | + " prompt=contract_prompt,\n", |
| 158 | + ")\n", |
| 159 | + "# Dataset inputs have a \"context\" key, but this chain\n", |
| 160 | + "# expects a dict with an \"input\" key\n", |
| 161 | + "chain = (\n", |
| 162 | + " (lambda x: {\"input\": x[\"context\"]})\n", |
| 163 | + " | extraction_subchain\n", |
| 164 | + " | (lambda x: {\"output\": x[\"text\"]})\n", |
| 165 | + ")" |
| 166 | + ] |
| 167 | + }, |
| 168 | + { |
| 169 | + "cell_type": "markdown", |
| 170 | + "id": "ee760e32-6d0b-421a-b73b-c15c60f35883", |
| 171 | + "metadata": {}, |
| 172 | + "source": [ |
| 173 | + "## 3. Evaluate\n", |
| 174 | + "\n", |
| 175 | + "For this evaluation, we'll use the JSON edit distance evaluator, which canonicalizes the predicted and reference JSON and then computes a normalized string edit distance between the two canonical versions. It is a fast way to measure the similarity between two JSON objects without relying on an LLM." |
| 176 | + ] |
| 177 | + }, |
| 178 | + { |
| 179 | + "cell_type": "code", |
| 180 | + "execution_count": 6, |
| 181 | + "id": "65d8ff4f-f5fa-4f21-a412-9a8b21749ec3", |
| 182 | + "metadata": {}, |
| 183 | + "outputs": [], |
| 184 | + "source": [ |
| 185 | + "import logging\n", |
| 186 | + "\n", |
| 187 | + "# We will suppress any errors here since the documents are long\n", |
| 188 | + "# and could pollute the notebook output\n", |
| 189 | + "logger = logging.getLogger()\n", |
| 190 | + "logger.setLevel(logging.CRITICAL)" |
| 191 | + ] |
| 192 | + }, |
| 193 | + { |
| 194 | + "cell_type": "code", |
| 195 | + "execution_count": 7, |
| 196 | + "id": "03383cdd-8fef-480f-bddf-b0616ed6e0c9", |
| 197 | + "metadata": {}, |
| 198 | + "outputs": [ |
| 199 | + { |
| 200 | + "name": "stdout", |
| 201 | + "output_type": "stream", |
| 202 | + "text": [ |
| 203 | + "View the evaluation results for project 'test-extraneous-weather-30' at:\n", |
| 204 | + "https://smith.langchain.com/o/ebbaf2eb-769b-4505-aca2-d11de10372a4/projects/p/06d577cc-77b2-45a2-80a5-34c232bba9af?eval=true\n", |
| 205 | + "[------------------------------------------------->] 16/16" |
| 206 | + ] |
| 207 | + } |
| 208 | + ], |
| 209 | + "source": [ |
| 210 | + "from langchain.smith import RunEvalConfig\n", |
| 211 | + "\n", |
| 212 | + "eval_config = RunEvalConfig(\n", |
| 213 | + " evaluators=[\"json_edit_distance\"],\n", |
| 214 | + ")\n", |
| 215 | + "res = client.run_on_dataset(\n", |
| 216 | + " dataset_name=dataset_name,\n", |
| 217 | + " llm_or_chain_factory=chain,\n", |
| 218 | + " evaluation=eval_config,\n", |
| 219 | + " # In case you are rate-limited\n", |
| 220 | + " concurrency_level=2,\n", |
| 221 | + ")" |
| 222 | + ] |
| 223 | + }, |
| 224 | + { |
| 225 | + "cell_type": "markdown", |
| 226 | + "id": "7f59fb9d-9cb1-4365-9619-28d2897e0dfd", |
| 227 | + "metadata": {}, |
| 228 | + "source": [ |
| 229 | + "## Conclusion\n", |
| 230 | + "\n", |
| 231 | + "In this walkthrough, we showcased a methodical approach to evaluating an extraction chain applied to template filling for legal contracts.\n", |
| 232 | + "You can use similar techniques to evaluate chains intended to return structured output." |
| 233 | + ] |
| 234 | + } |
| 235 | + ], |
| 236 | + "metadata": { |
| 237 | + "kernelspec": { |
| 238 | + "display_name": "Python 3 (ipykernel)", |
| 239 | + "language": "python", |
| 240 | + "name": "python3" |
| 241 | + }, |
| 242 | + "language_info": { |
| 243 | + "codemirror_mode": { |
| 244 | + "name": "ipython", |
| 245 | + "version": 3 |
| 246 | + }, |
| 247 | + "file_extension": ".py", |
| 248 | + "mimetype": "text/x-python", |
| 249 | + "name": "python", |
| 250 | + "nbconvert_exporter": "python", |
| 251 | + "pygments_lexer": "ipython3", |
| 252 | + "version": "3.11.2" |
| 253 | + } |
| 254 | + }, |
| 255 | + "nbformat": 4, |
| 256 | + "nbformat_minor": 5 |
| 257 | +} |
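
Section 3 of the notebook describes the JSON edit distance evaluator as canonicalizing both objects and then computing a normalized string edit distance. A minimal standard-library sketch of that idea follows. It is an illustration under stated assumptions, not LangChain's implementation: the helper names (`canonicalize`, `levenshtein`, `json_edit_distance`) are hypothetical, canonicalization here means sorted keys with compact separators, and the distance is plain Levenshtein divided by the longer string's length. The real evaluator may use a different distance variant or normalization.

```python
import json


def canonicalize(obj) -> str:
    """Serialize with sorted keys and no extra whitespace (assumed canonical form)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping only two rows."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def json_edit_distance(prediction, reference) -> float:
    """Return a score in [0, 1]; 0.0 means the canonical strings are identical."""
    pred, ref = canonicalize(prediction), canonicalize(reference)
    longest = max(len(pred), len(ref))
    return levenshtein(pred, ref) / longest if longest else 0.0
```

Because keys are sorted before comparison, two objects that differ only in key order score 0.0, while a score closer to 1.0 means the prediction diverges more from the reference.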