
VPC Chat completions API #74


Merged: 7 commits from vpc-api into main on Jun 20, 2025

Conversation

@huiwengoh (Collaborator) commented Jun 11, 2025

Note: this is a very early draft PR, mostly skeleton structure.

Sample usage for score():

import openai
from cleanlab_tlm.utils.chat_completions import TLMChatCompletion

openai_kwargs = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "What is 1+1?"}],
}
openai_completion = openai.chat.completions.create(**openai_kwargs)

openai_completion
>> ChatCompletion(id='chatcmpl-Bh5HglTUAg5pxDy0299VjXTzx14mA', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='1 + 1 equals 2.', refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1749608116, model='gpt-4o-mini-2024-07-18', object='chat.completion', service_tier='default', system_fingerprint='fp_34a54ae93c', usage=CompletionUsage(completion_tokens=8, prompt_tokens=14, total_tokens=22, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))

tlm = TLMChatCompletion()
score_response = tlm.score(
    completion=openai_completion,
    **openai_kwargs
)

score_response
>> {'trustworthiness_score': 0.9819999106445161}

@huiwengoh marked this pull request as draft June 11, 2025 02:18
@huiwengoh requested a review from jwmueller June 11, 2025 02:32
Comment on lines 55 to 67
res = requests.post(
    f"{BASE_URL}/score",
    json={
        "tlm_options": self._tlm_options,
        "completion": completion.model_dump(),
        **openai_kwargs,
    },
    timeout=self._timeout,
)

res_json = res.json()

return {"trustworthiness_score": res_json["tlm_metadata"]["score"]}
Member:

couldn't our public implementation of score() be something like this instead?

response_text = completion.get_string_response()
messages = get_messages_list(openai_kwargs)
tlm_prompt = form_prompt_string(messages)
return tlm.get_trustworthiness_score(tlm_prompt, response_text)

or is there a specific reason to use a special backend API?

Member:

I guess with this ^ we'd ideally want to try and extract logprobs from the response if they are available.
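
(An editorial sketch, not code from this PR: one way to pull per-token logprobs out of an OpenAI ChatCompletion using the OpenAI Python SDK's response fields. How the extracted values would then be fed into TLM scoring is not specified here.)

# Sketch: logprobs are only populated when the request was made with logprobs=True.
def extract_logprobs(completion):
    choice = completion.choices[0]
    if choice.logprobs is None or choice.logprobs.content is None:
        return None
    return [token.logprob for token in choice.logprobs.content]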

Collaborator (Author):

A backend API would allow users to do scoring in VPC - otherwise tlm.get_trustworthiness_score() would be calling our SaaS TLM, right?
Also, this would allow better scoring for things like tool calls / structured outputs / other formats in the future, I think, if we retain the full completions object.
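
(An editorial sketch, not code from this PR, illustrating the point above: serializing the whole ChatCompletion via model_dump(), as the quoted score() code does, keeps structured fields that flattening to a single response string would lose. The key names follow the Chat Completions response shape.)

payload = openai_completion.model_dump()
message = payload["choices"][0]["message"]
content = message.get("content")        # plain-text answer, if any
tool_calls = message.get("tool_calls")  # preserved when the model called a tool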

Member:

Sounds good to me. If it helps you code better/faster, I think it's fine to have two different modules: one for SaaS, one for VPC.

@jwmueller (Member) commented Jun 11, 2025

I vote our tutorial for this looks something like this:

Easiest way to use TLM if you're using the Chat Completions API

1. Here's how to score the response from every LLM call you've made:
from cleanlab_tlm.utils.chat_completions import TLMChatCompletion
from openai import OpenAI

client = OpenAI()
tlm = TLMChatCompletion(options={'log':['explanation']})  # See Advanced Tutorial for optional configurations for better/faster results

# Your existing code:

openai_kwargs = dict(
    model="gpt-4.1",
    messages=[
        {"role": "developer", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)

response = client.chat.completions.create(**openai_kwargs)

# Code to score:

score = tlm.score(
    response=response,
    **openai_kwargs
)
2. Here's a convenient wrapper you can use to have one line of code that generates the response and scores its trustworthiness:
class TrustworthyOpenAIClient:
    # has attributes: openai_client = user's own OpenAI client (not Cleanlab's), tlm = TLMChatCompletion
    def __init__(self, openai_client, tlm):
        self.openai_client = openai_client
        self.tlm = tlm

    def create(self, **openai_kwargs):
        response = self.openai_client.chat.completions.create(**openai_kwargs)
        score = self.tlm.score(response=response, **openai_kwargs)
        response.tlm_metadata = score  # ChatCompletion is not a dict, so attach the score as an attribute
        return response


trustworthy_openai = TrustworthyOpenAIClient(client, tlm)
response = trustworthy_openai.create(**openai_kwargs)
print(response.choices[0].message.content)
print(response.tlm_metadata["trustworthiness_score"])
3. Finally, Cleanlab already provides such a wrapper that generates the response for you using Cleanlab's infrastructure (using the same LLM model that you specify) and simultaneously scores its trustworthiness:

UPDATE: IGNORE THIS. We will use the OpenAI client package in this section and just point its base URL at the Cleanlab backend instead.

response = tlm.create(
    model="gpt-4o-mini",
    messages=[{"content": "What is 1+1?","role": "user"}],
)

print(response.text)
print(response['tlm_metadata'].score)
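
(An editorial sketch of the updated plan described in the note above, assuming a hypothetical Cleanlab-hosted endpoint and credential; the real base URL and auth scheme are not defined in this PR:)

from openai import OpenAI

# Point the standard OpenAI client at the Cleanlab backend instead of api.openai.com.
# Both values below are placeholders.
client = OpenAI(
    base_url="https://<cleanlab-backend>/v1",
    api_key="<CLEANLAB_API_KEY>",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is 1+1?"}],
)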

Notes from me

Actually I think the optimal way to offer workflow 2 above is a bit different, but my Python skills / free time are lacking. I think the optimal way is:

The user decorates their call to client.chat.completions.create() with a decorator that then appends the trust score to the response. The functionality would be equivalent to what is shown for workflow 2, but the developer experience would be cleaner (zero changes to existing code, just add a decorator). The challenge is that the decorator has to be associated with an already-instantiated TLMChatCompletion object, i.e. something like:


# Before all your code:

import functools

def trust_score(fn):
    @functools.wraps(fn)
    def wrapper(**kwargs):
        response = fn(**kwargs)
        score = tlm.score(response=response, **kwargs)  # need to somehow get the TLMChatCompletion object into this decorator
        response.tlm_metadata = score
        return response
    return wrapper


from openai import OpenAI
client = OpenAI()

client.chat.completions.create = trust_score(client.chat.completions.create)

# After it's been monkey patched, you don't have to change any of your existing code / infra to also get trust scores:

response = client.chat.completions.create(**openai_kwargs)

print(response.choices[0].message.content)
print(response.tlm_metadata["trustworthiness_score"])


res_json = res.json()

return {"trustworthiness_score": res_json["tlm_metadata"]["trustworthiness_score"]}
Member:

wait why is this hardcoded like this? It should return everything that TLM.get_trustworthiness_score() returns.

Maybe I am confused about whether this PR is for VPC-TLM or SaaS-TLM.
If this PR is for VPC-TLM only, then all the code should be in a separate VPC module, so SaaS users are not confused by it.

I'd prefer to only review the SaaS-TLM version, and you can try to make the VPC-TLM API closely match that.

@huiwengoh changed the title from "[WIP] Chat completions API" to "[WIP] VPC Chat completions API" on Jun 11, 2025
@@ -0,0 +1,67 @@
import os
Member:

TODO: Confirm this VPC version matches our SaaS API as closely as we are easily able to. The analogous SaaS API is defined here:

#75

@jas2600 (Collaborator) commented Jun 13, 2025

For the decorator idea, a decorator factory should work:

# Before all your code:
import functools
import os  # needed below to read OPENAI_API_KEY from the environment

def trust_score(tlm_instance):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            response = fn(*args, **kwargs)
            score = tlm_instance.score(response=response, **kwargs)
            response.tlm_metadata = score
            return response
        return wrapper
    return decorator

from openai import OpenAI
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
client.chat.completions.create = trust_score(tlm)(client.chat.completions.create)

# After it's been monkey patched, you don't have to change any of your existing code / infra to also get trust scores:
...
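
(An editorial usage sketch mirroring workflow 2 above; assumes openai_kwargs and a TLMChatCompletion instance named tlm are already defined:)

# After patching, existing call sites return responses with trust scores attached:
response = client.chat.completions.create(**openai_kwargs)

print(response.choices[0].message.content)
print(response.tlm_metadata["trustworthiness_score"])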

@huiwengoh requested a review from jwmueller June 17, 2025 23:12
@jwmueller (Member) left a comment:

All that matters is that the tutorial runs well; let's not focus on reviewing this.

@huiwengoh marked this pull request as ready for review June 19, 2025 21:19
@huiwengoh changed the title from "[WIP] VPC Chat completions API" to "VPC Chat completions API" on Jun 20, 2025
@huiwengoh merged commit 4dead70 into main Jun 20, 2025
3 checks passed
@huiwengoh deleted the vpc-api branch June 20, 2025 20:56