Commit 08c86b0

feat: add feature parity check tests [RET-2382] [RET-2186] (#13)
* chore: add gripmock dockerfile
* feat: add one ranker stub
* feat: add stub generator
* feat: add batch inference for stub generation
* feat: add batch inference stub embedder
* feat: use monkey patch stub generator for chunker
* feat: add all chunker stubs
* feat: add ranker generator
* refactor: cleanup embedder generator
* refactor: cleanup ranker generator
* chore: remove duplicated stubs
* refactor: remove unused request
* feat: add test ranker with mock
* feat: add stub for go
* fix: add tests and fix id for javascript
* feat: add js embedder tests
* refactor: cleanup chunker go tests
* feat: add embedder stubs
* feat: add ci with mock server
* chore: remove wrong condition on go workflow
* chore: fix go workflow workdir
* chore: rename go workflow
* chore: run mock server on the same go job
* chore: run mock server on the same go job
* chore: run mock server on the same go job
* chore: test stubs ci
* chore: only push stubs on merge
* feat: add variable embedder generator
* feat: add variable embedder generator
* feat: add 368 embedding stubs
* feat: add 368 embedding test for js
* feat: remove unnecessary file
* feat: test ranker mock server python
* feat: stubs
* feat: move files for better structure
* feat(js): make id be present only on the request
* feat: rename test to tests
* feat: remove stubs from generator
* feat: move docker compose to grpc
* fix: rectify stubs path
* fix: rectify stubs path
* fix: rectify stubs and proto path
* feat: add beginning of test generator
* feat: add ranker test generation for all
* feat: iterate through all template of function
* feat: remove placeholder comments
* feat: generate embedder test for py
* fix: fixes embedder dimension change
* doc: add doctest to reshaper
* doc: add doctest to reshaper
* feat: add embedder template for js
* feat: make naming consistent
* feat: add go template for embedder
* feat: generate all embedder tests
* feat: add chunker tests python
* feat: add go chunker test generation
* feat: use generated tests
* feat: add make file to generate stubs and tests
* feat: update stubs and tests to generated ones
* feat: use formatters on tests generated
* feat: run pre-commit for new python tests
* feat: add chunker js [RET-2186] (#14)
* feat: add stub for go
* fix: add tests and fix id for javascript
* refactor: cleanup chunker go tests
* feat: add 368 embedding stubs
* feat: move files for better structure
* feat: remove stubs from generator
* feat(js): add chunker
* refactor: add build request in requester
* refactor(js): add process response in requester
* feat(js): unpack bytes correctly as strings
* feat: remove debug logging
* feat: add playground chunker
* feat(js): add initial chunker test file
* feat(js): add chunker test generator
* feat(js): add chunker jinja template
* feat(js): add complete test to js
* feat: copy chunker tests
* feat(js): add postprocess new tests
* feat: revert workflow changes
* feat: revert embedder
* feat: remove playground and add readme
* feat: update stubs to latest shape
* refactor: remove id from input
* feat: make functions private in requester
* refactor: replace if else by switch
* chore: move id to requester in python

---------

Co-authored-by: Daniel Buades Marcos <[email protected]>
1 parent 74fba03 commit 08c86b0

15 files changed

+475
-68
lines changed

README.md

Lines changed: 64 additions & 27 deletions
@@ -36,17 +36,66 @@ You can now import the Clinia Models client in your project and play with it.

 ## Playground Examples

-### Embedder Model
+### Embedder

-```typescript
-// TODO
-```
+``` typescript
+import { embedder } from '@clinia/models-client-embedder';
+
+async function runEmbedderExample() {
+  const myEmbedder = embedder({
+    host: {
+      Url: '127.0.0.1',
+      Scheme: 'http',
+      Port: 8001,
+    },
+  });

-### Ranker Model
+  const result = await myEmbedder.embed(
+    'embedder_medical_journals_qa',
+    '120240905185426',
+    {
+      texts: ['Clinia is based in Montreal'],
+      id: 'request-123',
+    },
+  );
+
+  console.log(JSON.stringify(result, null, 2));
+}
+
+runEmbedderExample().catch(console.error);
+```

+### Chunker
 ```typescript
+import { chunker } from '@clinia/models-client-chunker';
+
+async function runChunkerExample() {
+  const myChunker = chunker({
+    host: {
+      Url: '127.0.0.1',
+      Scheme: 'http',
+      Port: 8001,
+    },
+  });
+
+  const result = await myChunker.chunk(
+    'chunker',
+    '120252801110000',
+    {
+      texts: ['Clinia is based in Montreal'],
+      id: 'request-123',
+    },
+  );
+
+  console.log(JSON.stringify(result, null, 2));
+}
+
+runChunkerExample().catch(console.error);
+```
+
+### Ranker
+``` typescript
 import { ranker } from '@clinia/models-client-ranker';
-import { v4 as uuidv4 } from 'uuid';

 async function runRankerExample() {
   const myRanker = ranker({
@@ -57,34 +106,22 @@ async function runRankerExample() {
     },
   });

-  // Get model name and version from environment variables.
-  const modelName = process.env.CLINIA_MODEL_NAME;
-  const modelVersion = process.env.CLINIA_MODEL_VERSION;
-  if (!modelName || !modelVersion) {
-    throw new Error('Missing required environment variables: CLINIA_MODEL_NAME or CLINIA_MODEL_VERSION');
-  }
-
-  const rankRequest = {
-    id: uuidv4(),
-    query: 'Where is Clinia based?',
-    texts: ['Clinia is based in Montreal'],
-  };
-
-  const result = await myRanker.rank(modelName, modelVersion, rankRequest);
-  console.log('Rank result:', result);
+  const result = await myRanker.rank(
+    'ranker_medical_journals_qa',
+    '120240905185925',
+    {
+      query: 'hello, how are you?',
+      texts: ['Clinia is based in Montreal'],
+      id: 'request-123',
+    },
+  );

   console.log('Rank result:', result);
 }

 runRankerExample().catch(console.error);
 ```

-### Chunker Model
-
-```typescript
-// TODO
-```
-
 ## Note

 This repository is automatically generated from a private repository within Clinia that contains additional resources including tests, mock servers, and development tools.
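The chunker playground example in the README only prints the raw response. As a rough sketch of how a caller might consume it, the snippet below summarizes a `ChunkResponse`-shaped object per group of chunks. The `Chunk` and `ChunkResponse` types mirror those added in this commit; `summarizeChunks` and the sample data are hypothetical and not part of the package. How the outer array of `chunks` maps back to the input `texts` is not shown in this diff, so each inner array is simply treated as one group.

```typescript
// Types copied from the chunker source added in this commit.
type Chunk = {
  id: string;
  text: string;
  startIndex: number;
  endIndex: number;
  tokenCount: number;
};

type ChunkResponse = {
  id: string;
  chunks: Chunk[][];
};

type GroupSummary = { group: number; chunkCount: number; totalTokens: number };

// Hypothetical helper (not in the package): one summary row per inner array.
function summarizeChunks(response: ChunkResponse): GroupSummary[] {
  return response.chunks.map((group, index) => ({
    group: index,
    chunkCount: group.length,
    totalTokens: group.reduce((sum, chunk) => sum + chunk.tokenCount, 0),
  }));
}

// Made-up response used only for illustration.
const sampleResponse: ChunkResponse = {
  id: 'request-123',
  chunks: [
    [
      { id: 'c1', text: 'Clinia is based', startIndex: 0, endIndex: 15, tokenCount: 3 },
      { id: 'c2', text: ' in Montreal', startIndex: 15, endIndex: 27, tokenCount: 4 },
    ],
  ],
};

const summary = summarizeChunks(sampleResponse);
console.log(JSON.stringify(summary, null, 2));
```

With a real client, the object returned by `myChunker.chunk(...)` would presumably be passed to the helper in place of `sampleResponse`.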
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+../../LICENSE
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+import type { RequesterConfig } from '@clinia/models-client-common';
+import { Chunker } from '../src/chunker';
+import { createGrpcRequester } from '@clinia/models-requester-grpc';
+
+export const chunker = (options: RequesterConfig): Chunker => {
+  return new Chunker({
+    requester: createGrpcRequester(options),
+  });
+};
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+import type { RequesterConfig } from '@clinia/models-client-common';
+import { Chunker } from '../src/chunker';
+import { createGrpcRequester } from '@clinia/models-requester-grpc';
+
+export const chunker = (options: RequesterConfig): Chunker => {
+  return new Chunker({
+    requester: createGrpcRequester(options),
+  });
+};
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+// eslint-disable-next-line import/no-unresolved
+export * from './dist/builds/node';
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+// eslint-disable-next-line import/no-commonjs,import/extensions
+module.exports = require('./dist/models-client-chunker.cjs');
Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
+{
+  "name": "@clinia/models-client-chunker",
+  "version": "1.0.0",
+  "description": "Javascript client for Clinia embed model",
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/clinia/models-client-javascript.git"
+  },
+  "license": "MIT",
+  "author": "Clinia",
+  "type": "module",
+  "exports": {
+    ".": {
+      "node": {
+        "types": {
+          "import": "./dist/node.d.ts",
+          "module": "./dist/node.d.ts",
+          "require": "./dist/node.d.cts"
+        },
+        "import": "./dist/builds/node.js",
+        "module": "./dist/builds/node.js",
+        "require": "./dist/builds/node.cjs"
+      },
+      "default": {
+        "types": "./dist/browser.d.ts",
+        "module": "./dist/builds/browser.js",
+        "import": "./dist/builds/browser.js",
+        "default": "./dist/builds/browser.umd.js"
+      }
+    },
+    "./dist/builds/*": "./dist/builds/*.js"
+  },
+  "jsdelivr": "./dist/builds/browser.umd.js",
+  "unpkg": "./dist/builds/browser.umd.js",
+  "react-native": "./dist/builds/browser.js",
+  "files": [
+    "dist",
+    "index.js",
+    "index.d.ts"
+  ],
+  "scripts": {
+    "build": "yarn clean && yarn tsup",
+    "clean": "rm -rf ./dist || true",
+    "test": "tsc --noEmit && vitest --run",
+    "test:bundle": "publint . && attw --pack ."
+  },
+  "dependencies": {
+    "@clinia/models-client-common": "1.0.0",
+    "@clinia/models-requester-grpc": "1.0.0"
+  },
+  "devDependencies": {
+    "@arethetypeswrong/cli": "0.17.3",
+    "@types/node": "22.10.7",
+    "publint": "0.3.2",
+    "rollup": "4.30.1",
+    "tsup": "8.3.5",
+    "typescript": "5.7.3",
+    "vitest": "^3.0.5"
+  },
+  "engines": {
+    "node": ">= 14.0.0"
+  }
+}
Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
+import {
+  type ClientOptions,
+  type Requester,
+  type Input,
+  type Datatype,
+  type Output,
+  getOutputStringContents,
+} from '@clinia/models-client-common';
+
+export type Chunk = {
+  id: string;
+  text: string;
+  startIndex: number;
+  endIndex: number;
+  tokenCount: number;
+};
+
+export type ChunkRequest = {
+  id: string;
+  texts: string[];
+};
+
+export type ChunkResponse = {
+  id: string;
+  chunks: Chunk[][];
+};
+
+const CHUNK_INPUT_KEY = 'text';
+const CHUNK_OUTPUT_KEY = 'chunk';
+const CHUNK_INPUT_DATATYPE: Datatype = 'BYTES';
+
+export class Chunker {
+  private _requester: Requester;
+
+  constructor(options: ClientOptions) {
+    this._requester = options.requester;
+  }
+
+  private processOutput(output: Output): Chunk[][] {
+    let textChunks = getOutputStringContents(output);
+    let formattedChunks = [];
+    for (let text_chunk of textChunks) {
+      // Filter out "pad" values and then map
+      let filtered = text_chunk.filter((chunk) => chunk !== 'pad');
+      let formatted_text_chunks = filtered.map((chunk) => {
+        try {
+          return JSON.parse(chunk) as Chunk;
+        } catch (e) {
+          throw new Error(`Invalid JSON: ${chunk}`);
+        }
+      });
+      formattedChunks.push(formatted_text_chunks);
+    }
+
+    return formattedChunks;
+  }
+
+  async chunk(
+    modelName: string,
+    modelVersion: string,
+    request: ChunkRequest,
+  ): Promise<ChunkResponse> {
+    const inputs: Input[] = [
+      {
+        name: CHUNK_INPUT_KEY,
+        shape: [request.texts.length],
+        datatype: CHUNK_INPUT_DATATYPE,
+        contents: [
+          {
+            stringContents: request.texts,
+          },
+        ],
+      },
+    ];
+
+    // The chunker model has only one input and one output
+    const outputKeys = [CHUNK_OUTPUT_KEY];
+
+    const outputs = await this._requester.infer(
+      modelName,
+      modelVersion,
+      inputs,
+      outputKeys,
+      request.id,
+    );
+
+    // Since we have only one output, we can directly access the first output.
+    // We already check the size of the output in the infer function therefore we can "safely" access the element 0.
+    const chunks = this.processOutput(outputs[0]);
+
+    return {
+      id: request.id,
+      chunks,
+    };
+  }
+}
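To make the request and response handling in `Chunker` easier to follow in isolation, here is a minimal standalone sketch of the two steps: building the single `BYTES` input and turning decoded string rows back into `Chunk` objects while dropping `'pad'` entries. `buildChunkInput` and `parseChunkRows` are hypothetical names, the local `Input` type only lists the fields used in this diff (the real type lives in `@clinia/models-client-common`), and the sample rows are made up.

```typescript
// Mirrors the Chunk type from the diff above.
type Chunk = {
  id: string;
  text: string;
  startIndex: number;
  endIndex: number;
  tokenCount: number;
};

// Assumed minimal shape: only the fields that Chunker.chunk populates.
type Input = {
  name: string;
  shape: number[];
  datatype: string;
  contents: { stringContents: string[] }[];
};

// Hypothetical helper: same input layout as Chunker.chunk builds inline.
function buildChunkInput(texts: string[]): Input {
  return {
    name: 'text',
    shape: [texts.length],
    datatype: 'BYTES',
    contents: [{ stringContents: texts }],
  };
}

// Hypothetical helper: same filter-then-parse steps as Chunker.processOutput,
// applied to rows that have already been decoded to strings.
function parseChunkRows(rows: string[][]): Chunk[][] {
  return rows.map((row) =>
    row
      .filter((entry) => entry !== 'pad')
      .map((entry) => {
        try {
          return JSON.parse(entry) as Chunk;
        } catch {
          throw new Error(`Invalid JSON: ${entry}`);
        }
      }),
  );
}

// Made-up rows: the second row is padded to the length of the first.
const sampleRows: string[][] = [
  [
    JSON.stringify({ id: 'a', text: 'Clinia is based', startIndex: 0, endIndex: 15, tokenCount: 3 }),
    JSON.stringify({ id: 'b', text: ' in Montreal', startIndex: 15, endIndex: 27, tokenCount: 4 }),
  ],
  [
    JSON.stringify({ id: 'c', text: 'Hello', startIndex: 0, endIndex: 5, tokenCount: 1 }),
    'pad',
  ],
];

const input = buildChunkInput(['Clinia is based in Montreal', 'Hello']);
const parsed = parseChunkRows(sampleRows);
// Two chunks survive in the first row and one in the second.
console.log(input.shape, parsed.map((row) => row.length));
```

One consequence of this approach, visible in the sketch, is that a genuine chunk whose serialized form is exactly the string `pad` would also be dropped; since real chunks are JSON objects, that should not occur in practice.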
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+{
+  "extends": "../../tsconfig.json",
+  "compilerOptions": {
+    "types": ["node", "vitest/globals"],
+    "outDir": "dist",
+    "skipLibCheck": true
+  },
+  "include": ["src", "model", "builds"],
+  "exclude": ["dist", "node_modules"]
+}
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+import type { Options } from 'tsup';
+import { defineConfig } from 'tsup';
+
+import { getBaseBrowserOptions, getBaseNodeOptions } from '../../base.tsup.config';
+
+import pkg from './package.json' with { type: 'json' };
+
+const nodeOptions: Options = {
+  ...getBaseNodeOptions(pkg, __dirname),
+  dts: { entry: { node: 'builds/node.ts' } },
+  entry: ['builds/node.ts', 'src/*.ts'],
+};
+
+const nodeConfigs: Options[] = [
+  {
+    ...nodeOptions,
+    format: 'cjs',
+    name: `node ${pkg.name} cjs`,
+  },
+  {
+    ...nodeOptions,
+    format: 'esm',
+    name: `node ${pkg.name} esm`,
+  },
+];
+
+const browserOptions: Options = {
+  ...getBaseBrowserOptions(pkg, __dirname),
+  globalName: 'chunker',
+};
+
+const browserConfigs: Options[] = [
+  {
+    ...browserOptions,
+    minify: false,
+    name: `browser ${pkg.name} esm`,
+    dts: { entry: { browser: 'builds/browser.ts' } },
+    entry: ['builds/browser.ts', 'src/*.ts'],
+  },
+];
+
+export default defineConfig([...nodeConfigs, ...browserConfigs]);
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+import { defineConfig } from 'vitest/config';
+
+export default defineConfig({
+  test: { environment: 'jsdom', globals: true },
+});

packages/models-client-common/src/output.ts

Lines changed: 13 additions & 0 deletions
@@ -20,3 +20,16 @@ export const getOutputFp32Contents = (output: Output): Float32Array[] => {

   return contents;
 };
+
+export const getOutputStringContents = (output: Output): string[][] => {
+  if (output.datatype !== 'BYTES') {
+    throw new Error('Data type not supported');
+  }
+
+  const contents: string[][] = [];
+  for (const content of output.contents) {
+    if (content.stringContents) contents.push(content.stringContents);
+  }
+
+  return contents;
+};
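The behaviour of `getOutputStringContents` can be checked in isolation with a small sketch. The function body below is copied from the diff; the `Content` and `Output` types are assumed minimal stand-ins that list only the fields the function reads (the real `Output` type is defined elsewhere in `@clinia/models-client-common` and is not shown here), and the sample data is made up.

```typescript
// Assumed minimal shapes: only the fields getOutputStringContents reads.
type Content = { stringContents?: string[] };
type Output = { datatype: string; contents: Content[] };

// Body copied from the diff above.
const getOutputStringContents = (output: Output): string[][] => {
  if (output.datatype !== 'BYTES') {
    throw new Error('Data type not supported');
  }

  const contents: string[][] = [];
  for (const content of output.contents) {
    if (content.stringContents) contents.push(content.stringContents);
  }

  return contents;
};

// Made-up output: the middle entry has no stringContents and is skipped.
const sampleOutput: Output = {
  datatype: 'BYTES',
  contents: [{ stringContents: ['a', 'b'] }, {}, { stringContents: ['c'] }],
};

const rows = getOutputStringContents(sampleOutput);
console.log(rows.length); // 2

// Any other datatype is rejected before the contents are read.
let rejected = false;
try {
  getOutputStringContents({ datatype: 'FP32', contents: [] });
} catch {
  rejected = true;
}
console.log(rejected); // true
```

Entries without `stringContents` are skipped silently rather than reported, so the number of returned rows can be smaller than `output.contents.length`.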
