Commit 9046e88

shawnlewis, Weave Build Bot, vanpelt, tssweeney, and dannygoldstein authored
chore(weave): Weaveflow (wandb#753)
* Eval notebooks * WIP junk * Add working manually configured eval board * Text extraction eval board. * Enable facet sort * Help text and facet titles * Object and method saving & loading by value * Recursively publish object ops * Fix wandb recursive op publishing * Remove breakpoint * Fixes for dynamic object loading, WeaveList alias * Save objects to root type instead of their type * handle reading ref types * Fix boards on home page, render ref as string * Make PanelFiles work when no context present * Return UnknownType when wandb artifact missing, main() fn for boards * Weave op level tracing * Fixes to openai and trace * Ensure recursively published objects go to same project * Fixes for ref comparison and more * Hack fix for run history parquet v3 read * Browse2 * streamTable -> callsTable * Trace page * Style openai and other styling * Better UI * Place holders for new features * More styling and placeholders * Add Boards panel, tweaks * Add @mui/lab * Fix Facet-selected * Add row to dataset from UI * Fix all typing issues * Object editing * Edit more types, bugfixes, UX * More fixes. 
* Logged call fixes, render op code * Start splitting up Browse2 * Continue splitting up Browse2 * More splitting * Finish splitting Browse2 * Changes need to make user API nice * Nested runs table and show refs in a uniform way * Try removing vite plugin block so i can build frontend * chore(bot): update frontend bundle sha [no ci] * Include weave command line tool * Update cli, remove serve for now * Include openai and tiktoken in requirements * Handle simple ref argument, fix crash * Use wandb default entity name in init * Fix table navigation and use SmallRef for OpDef * chore(bot): update frontend bundle sha [no ci] * Fix loading ArrowWeaveList / evaltable * Fix: handle multiple ObjectType with same .name but different fields * UI tweaks * chore(bot): update frontend bundle sha [no ci] * Revert "Fix: handle multiple ObjectType with same .name but different fields" This reverts commit 6616ab2. * Fix SpanDetails when output is null * Fixes * Don't crash when types are unknown, and fix recursive object publishling case * Add helpful print * Graph API and artifact helpers * Working feedback tab * Show feedback in runs table * Don't mess with openai-apikey path * More fixes, API improvements * Move weaveflow classes into weave * Chat model fixes * Add finetuning * server * Graft in weaveflow * Refacor the command to use click and go into pyproject.toml * Fix model loading * Much improved serving * Don't sync wandb artifacts if they already exist! * Progress on deploy command * Fixes for deploy * Switched to cloud run! 
* Cleanup, no more secrets * Accept service account, fix safe_name * Fix RunsTable field rendering * Fix rendering junk in output * Fix nested WeaveEditor links and rendering * Ensure serialized ops have imports needed for function annotations * Fix display of single results, and show run which objects are output of * Fix deserializing same object type class with different fields * Added modal support, made GCP store secrets and limit permissions with a custom service account * Made weave startup without .md files because of modal syncing bug * Modal reqs * Added wandb authentication to the endpoints * Get tests running, make tests lazy by default * lint * ts fix * more lint * more lint fixes * bump node version * sha update * node bump * built * chore(bot): update frontend bundle sha [no ci] * a bunch more lint * back down to node 16 * back down to node 16 * chore(bot): update frontend bundle sha [no ci] * Fix weaveflow issue with artifact_pusher after merge * Mark object types as relocatable to fix lots of tests * More relocatable fixes * Fix more tests * Bring back string-append * Fix broken test * Fix two more tests * Add comment about disabled objecttype constructor, disable tests * Remove invalid test * Fix op versioning test * fixed test_publish_* * test build fix * test build fix 2 * typing: low hanging fruit * typing: low hanging fruit * typing: low-ish hanging fruit * typing: finally typed the op decorator * typing: run.py * typing: eager.py * typing: weave_types.py * typing: monitor.py * typing: execute.py * typing: pyfunc_type_util.py * typing: storage.py * typing: weave_types.py * typing: graph_client.py * typing: api.py * typing: weave_api.py * typing: weave_api.py * typing: wandb_domain_gql.py * typing: serve_fastapi.py * typing: stub.py * typing: panel_eval.py * typing: structured_output.py * typing: chat_model.py * typing: deploy/gcp/__init__.py * typing COMPLETE: deploy/modal/__init__.py * better local tests * better local tests 2 * fixed a few 
ref tests * fixed a few ref tests 2 * fixed last test? * final test fix? * temp fix: Table Summary panel.ipyndb * notebook fix: closures.ipynb * TESTING: Potentially revert - moving eager mode to weaveflow init * Fix for MonitorPanelPlot.ipyndb * Fix for Closures.ipyndb * test fix * fixed empty obj_val * fixes for some notebooks * fix table summary again * Improve error message, fix missing weaveflow openai spans * Potentially Revert: op_def_type.py * Attempt to fix image notebooks * Sanitze before exec * Don't allow op loading in wandb prod * chore(bot): update frontend bundle sha [no ci] * fixes fallback object deserialization * comment improvement * potentially incorrect: fix deserialize 2 * fix mappers_python_def.py from tim weaveflow merge * fix undo deserialize shange * revert serialize fix * feat(weaveflow): make versions table better (wandb#869) * make versions table better * dev tracks * dev tracks * lint * fix versions table * Small fixes * make newlines gray instead of red (wandb#884) * Fix weave serve command * add basic document experience (wandb#883) * tslint * remove new line in requirements.dev.txt * remove new line in requirements.dev.txt * remove new line in requirements.dev.txt - 2 * trying to fix CLA * undo cla * final countdown --------- Co-authored-by: Weave Build Bot <[email protected]> Co-authored-by: Chris Van Pelt <[email protected]> Co-authored-by: Tim Sweeney <[email protected]> Co-authored-by: Danny Goldstein <[email protected]>
1 parent 60bf301 commit 9046e88

170 files changed, +10281 -376 lines changed

.github/workflows/release.yaml

+2 -2
@@ -4,7 +4,7 @@ on:
   workflow_dispatch:
     inputs:
       is_test:
-        description: 'Use Test Pypi'
+        description: "Use Test Pypi"
         required: true
         type: boolean
         default: true
@@ -14,7 +14,7 @@ jobs:
     name: Build and publish to pypi
     runs-on: ubuntu-8core
     timeout-minutes: 10
-    environment:
+    environment:
       name: release
     permissions:
       id-token: write

.github/workflows/upload-assets.yaml

+8 -8
@@ -11,11 +11,11 @@ jobs:
     name: Build frontend assets
     runs-on: ubuntu-8core
     timeout-minutes: 10
-    environment:
+    environment:
       name: release
     permissions:
-      contents: 'write'
-      id-token: 'write'
+      contents: "write"
+      id-token: "write"
     steps:
       - uses: actions/checkout@v3
         with:
@@ -24,7 +24,7 @@ jobs:
       - uses: actions/setup-node@v1
         with:
           node-version: "16.x"
-      - id: 'build'
+      - id: "build"
         run: |
           ./weave/frontend/build.sh
           if [[ -z "$(git status weave/frontend/sha1.txt --porcelain)" ]]
@@ -36,16 +36,16 @@ jobs:
             git diff
             echo "UPLOAD_ASSETS=true" >> "$GITHUB_OUTPUT"
           fi
-      - id: 'auth'
-        name: 'Authenticate to Google Cloud'
-        uses: 'google-github-actions/auth@v1'
+      - id: "auth"
+        name: "Authenticate to Google Cloud"
+        uses: "google-github-actions/auth@v1"
         if: ${{ steps.build.outputs.UPLOAD_ASSETS == 'true' }}
         with:
           workload_identity_provider: ${{ secrets.WORKLOAD_IDENTITY_PROVIDER }}
           # the service account secret format is wrong, hard coding it until it is fixed in core
           #service_account: ${{ secrets.WORKLOAD_IDENTITY_SERVICE_ACCOUNT }}
           service_account: [email protected]
-      - id: 'upload-and-push'
+      - id: "upload-and-push"
         if: ${{ steps.build.outputs.UPLOAD_ASSETS == 'true' }}
         run: |
           ./weave/frontend/bundle.sh

Eval Board Syn.ipynb

+247
@@ -0,0 +1,247 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "5214b543",
+   "metadata": {},
+   "source": [
+    "- Load two eval_results\n",
+    "\n",
+    "EvalResult\n",
+    "- example, label, result, item_summary"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "314bb6b4",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import time\n",
+    "import typing\n",
+    "import weave\n",
+    "import random\n",
+    "import string\n",
+    "from weave import weave_internal\n",
+    "weave.use_frontend_devmode()\n",
+    "from weave.panels import panel_board\n",
+    "from weave import ops_domain"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "550daef6",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def rand_string_n(n: int) -> str:\n",
+    "    return \"\".join(\n",
+    "        random.choice(string.ascii_uppercase + string.digits) for _ in range(n)\n",
+    "    )\n",
+    "\n",
+    "dataset_raw = [{\n",
+    "    'id': str(i),\n",
+    "    'example': rand_string_n(10),\n",
+    "    'label': random.choice(string.ascii_uppercase)} for i in range(50)]\n",
+    "dataset = weave.save(dataset_raw, 'dataset')\n",
+    "#dataset"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d0d930d8",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def predict(dataset_row, config):\n",
+    "    if random.random() < config['correct_chance']:\n",
+    "        return dataset_row['label']\n",
+    "    return random.choice(string.ascii_uppercase)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "eb86b95c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def evaluate(dataset, predict_config):\n",
+    "    eval_result = []\n",
+    "    correct_count = 0\n",
+    "    count = 0\n",
+    "    for dataset_row in dataset:\n",
+    "        start_time = time.time()\n",
+    "        result = predict(dataset_row, predict_config)\n",
+    "        latency = time.time() - start_time\n",
+    "        latency = random.gauss(predict_config['latency_mu'], predict_config['latency_sigma'])\n",
+    "        correct = dataset_row['label'] == result\n",
+    "        if correct:\n",
+    "            correct_count += 1\n",
+    "        count +=1 \n",
+    "        eval_result.append({\n",
+    "            'dataset_id': dataset_row['id'],\n",
+    "            'result': result,\n",
+    "            'summary': {\n",
+    "                'latency': latency,\n",
+    "                'correct': correct\n",
+    "            }\n",
+    "        })\n",
+    "    return {\n",
+    "        'config': predict_config,\n",
+    "        'eval_table': eval_result,\n",
+    "        'summary': {'accuracy': correct_count / len(dataset)}}"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "05d16a5e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "eval_result_raw0 = evaluate(dataset_raw, {'correct_chance': 0.5, 'latency_mu': 0.3, 'latency_sigma': 0.1})\n",
+    "eval_result_raw1 = evaluate(dataset_raw, {'correct_chance': 0.5, 'latency_mu': 0.4, 'latency_sigma': 0.2})\n",
+    "eval_result0 = weave.save(eval_result_raw0, 'eval_result0')\n",
+    "eval_result1 = weave.save(eval_result_raw1, 'eval_result1')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e8065ad6",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "\n",
+    "\n",
+    "varbar = panel_board.varbar()\n",
+    "\n",
+    "dataset_var = varbar.add('dataset', dataset)\n",
+    "eval_result0_var = varbar.add('eval_result0', eval_result0)\n",
+    "eval_result1_var = varbar.add('eval_result1', eval_result1)\n",
+    "\n",
+    "summary = varbar.add('summary', weave.ops.make_list(\n",
+    "    a=weave.ops.TypedDict.merge(weave.ops.dict_(name='res0'), eval_result0_var['summary']),\n",
+    "    b=weave.ops.TypedDict.merge(weave.ops.dict_(name='res1'), eval_result1_var['summary']),\n",
+    "))\n",
+    "\n",
+    "weave.ops.make_list(a=eval_result0_var['eval_table'], b=eval_result0_var['eval_table'])\n",
+    "\n",
+    "concatted_evals = varbar.add('concatted_evals', weave.ops.List.concat(\n",
+    "    weave.ops.make_list(\n",
+    "        a=eval_result0_var['eval_table'].map(\n",
+    "            lambda row: weave.ops.TypedDict.merge(\n",
+    "                weave.ops.dict_(name='res0'), row)),\n",
+    "        b=eval_result1_var['eval_table'].map(\n",
+    "            lambda row: weave.ops.TypedDict.merge(\n",
+    "                weave.ops.dict_(name='res1'), row)))))\n",
+    "\n",
+    "# join evals together first\n",
+    "joined_evals = varbar.add('joined_evals', weave.ops.join_all(\n",
+    "    weave.ops.make_list(a=eval_result0_var['eval_table'], b=eval_result1_var['eval_table']),\n",
+    "    lambda row: row['dataset_id'],\n",
+    "    False))\n",
+    "\n",
+    "# then join dataset to evals\n",
+    "dataset_evals = varbar.add('dataset_evals', weave.ops.join_2(\n",
+    "    dataset_var,\n",
+    "    joined_evals,\n",
+    "    lambda row: row['id'],\n",
+    "    lambda row: row['dataset_id'][0],\n",
+    "    'dataset',\n",
+    "    'evals',\n",
+    "    False,\n",
+    "    False\n",
+    "))\n",
+    "\n",
+    "\n",
+    "main = weave.panels.Group(\n",
+    "    layoutMode=\"grid\",\n",
+    "    showExpressions=True,\n",
+    "    enableAddPanel=True,\n",
+    "    )\n",
+    "\n",
+    "#### Run/config info TODO\n",
+    "\n",
+    "#### Summary info\n",
+    "\n",
+    "main.add(\"accuracy\",\n",
+    "    weave.panels.Plot(summary,\n",
+    "        x=lambda row: row['accuracy'],\n",
+    "        y=lambda row: row['name'],\n",
+    "        color=lambda row: row['name']\n",
+    "    ),\n",
+    "    layout=weave.panels.GroupPanelLayout(x=0, y=0, w=12, h=4))\n",
+    "\n",
+    "\n",
+    "main.add(\"latency\",\n",
+    "    weave.panels.Plot(concatted_evals,\n",
+    "        x=lambda row: row['summary']['latency'],\n",
+    "        y=lambda row: row['name'],\n",
+    "        color=lambda row: row['name'],\n",
+    "        mark='boxplot'),\n",
+    "    layout=weave.panels.GroupPanelLayout(x=12, y=0, w=12, h=4))\n",
+    "\n",
+    "#ct = main.add('concat_t', concatted_evals, layout=weave.panels.GroupPanelLayout(x=0, y=4, w=24, h=12))\n",
+    "# main.add('dataset_table', dataset)\n",
+    "# main.add('joined_evals', joined_evals)\n",
+    "# main.add('dataset_evals', dataset_evals, layout=weave.panels.GroupPanelLayout(x=0, y=4, w=24, h=6))\n",
+    "\n",
+    "##### Example details\n",
+    "\n",
+    "# more ideas: show examples that all got wrong, or that are confusing\n",
+    "\n",
+    "faceted_view = weave.panels.Facet(dataset_evals,\n",
+    "    x=lambda row: row['evals.summary'][0]['correct'],\n",
+    "    y=lambda row: row['evals.summary'][1]['correct'],\n",
+    "    select=lambda row: row.count())\n",
+    "\n",
+    "faceted = main.add('faceted', faceted_view, layout=weave.panels.GroupPanelLayout(x=0, y=4, w=12, h=6))\n",
+    "\n",
+    "main.add(\"example_latencies\",\n",
+    "    weave.panels.Plot(dataset_evals,\n",
+    "        x=lambda row: row['evals.summary']['latency'][0],\n",
+    "        y=lambda row: row['evals.summary']['latency'][1]),\n",
+    "    layout=weave.panels.GroupPanelLayout(x=12, y=4, w=12, h=6))\n",
+    "\n",
+    "faceted_sel = weave.panels.Table(faceted.selected())\n",
+    "faceted_sel.config.rowSize = 2\n",
+    "faceted_sel.add_column(lambda row: row['dataset.id'], 'id')\n",
+    "faceted_sel.add_column(lambda row: row['dataset.example'], 'example')\n",
+    "faceted_sel.add_column(lambda row: row['dataset.label'], 'label')\n",
+    "faceted_sel.add_column(lambda row: weave.ops.dict_(res0=row['evals.result'][0], res1=row['evals.result'][1]), 'result')\n",
+    "faceted_sel.add_column(lambda row: weave.ops.dict_(res0=row['evals.summary'][0]['correct'], res1=row['evals.summary'][1]['correct']), 'correct')\n",
+    "faceted_sel.add_column(lambda row: weave.ops.dict_(res0=row['evals.summary'][0]['latency'], res1=row['evals.summary'][1]['latency']), 'latency')\n",
+    "\n",
+    "main.add('faceted_sel', faceted_sel, layout=weave.panels.GroupPanelLayout(x=0, y=10, w=24, h=12))\n",
+    "\n",
+    "weave.panels.Board(vars=varbar, panels=main)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.7"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
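
For readers who want to follow the data flow of this notebook without a Weave server, here is a minimal sketch in plain Python. It is not part of the commit and does not call any Weave API. It mirrors the notebook's synthetic `predict` and `evaluate` functions, then approximates the two join steps and the facet count with ordinary dictionaries. The helpers `join_by_id` and `facet_counts` are hypothetical names introduced here, and the sketch assumes the `False` arguments passed to `join_all` and `join_2` in the notebook select inner joins. As in the notebook, latency is a Gaussian sample rather than a measured time.

```python
import random
import string
from collections import Counter


def rand_string_n(n: int) -> str:
    """Random uppercase/digit string, as in the notebook."""
    return "".join(random.choice(string.ascii_uppercase + string.digits) for _ in range(n))


def predict(row: dict, config: dict) -> str:
    """Return the true label with probability `correct_chance`, else a random letter."""
    if random.random() < config["correct_chance"]:
        return row["label"]
    return random.choice(string.ascii_uppercase)


def evaluate(dataset: list, config: dict) -> dict:
    """Build one eval result: per-example rows plus an overall accuracy."""
    rows = []
    for row in dataset:
        result = predict(row, config)
        rows.append({
            "dataset_id": row["id"],
            "result": result,
            "summary": {
                # Simulated latency, sampled rather than measured.
                "latency": random.gauss(config["latency_mu"], config["latency_sigma"]),
                "correct": row["label"] == result,
            },
        })
    accuracy = sum(r["summary"]["correct"] for r in rows) / len(dataset)
    return {"config": config, "eval_table": rows, "summary": {"accuracy": accuracy}}


def join_by_id(dataset: list, eval_tables: list) -> list:
    """Hypothetical stand-in for the two joins: one record per dataset row
    that appears in every eval table (inner join on the example id)."""
    indexes = [{r["dataset_id"]: r for r in table} for table in eval_tables]
    return [
        {"dataset": row, "evals": [index[row["id"]] for index in indexes]}
        for row in dataset
        if all(row["id"] in index for index in indexes)
    ]


def facet_counts(joined: list) -> Counter:
    """Count examples per (res0 correct, res1 correct) cell, like the Facet panel."""
    return Counter(
        (rec["evals"][0]["summary"]["correct"], rec["evals"][1]["summary"]["correct"])
        for rec in joined
    )


random.seed(0)
dataset = [
    {"id": str(i), "example": rand_string_n(10), "label": random.choice(string.ascii_uppercase)}
    for i in range(50)
]
res0 = evaluate(dataset, {"correct_chance": 0.5, "latency_mu": 0.3, "latency_sigma": 0.1})
res1 = evaluate(dataset, {"correct_chance": 0.5, "latency_mu": 0.4, "latency_sigma": 0.2})
joined = join_by_id(dataset, [res0["eval_table"], res1["eval_table"]])
counts = facet_counts(joined)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

Because a wrong guess in `predict` can still land on the true label by chance, the observed accuracy tends to sit slightly above `correct_chance`.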
