This PR introduces several interrelated improvements to model handling, OpenAI client initialization, token accounting, and response parsing, preparing the codebase for GPT-5-style models and more robust runtime behavior.

Key changes

Add ModelType.GPT_5

File: typing.py
Registers GPT_5 in the model enum so the rest of the codebase can target gpt-5 explicitly.
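A minimal sketch of the enum addition (the other member names and values shown are illustrative assumptions, not the file's actual contents):

```python
from enum import Enum

class ModelType(Enum):
    # Existing members elided; these two are illustrative placeholders.
    GPT_3_5_TURBO = "gpt-3.5-turbo"
    GPT_4 = "gpt-4"
    # New in this PR: lets the rest of the codebase target gpt-5 explicitly.
    GPT_5 = "gpt-5"
```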
Teach token counters and limits about GPT-5

File: utils.py
Include ModelType.GPT_5 where the code routes message counting to the OpenAI-style token counter.
Add a token limit mapping for GPT-5 (same high limit as other large models, 128k).
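A hedged sketch of the utils.py changes, reusing the ModelType sketch above; the helper and map names (num_tokens_from_openai_messages, model_token_limit_map) are assumptions standing in for the real ones, and the non-GPT-5 limits are illustrative:

```python
import tiktoken

model_token_limit_map = {
    ModelType.GPT_3_5_TURBO: 16384,   # illustrative value
    ModelType.GPT_4: 128000,          # illustrative value
    ModelType.GPT_5: 128000,          # same high limit as other large models
}

def num_tokens_from_openai_messages(messages, model: ModelType) -> int:
    # Simplified stand-in for an OpenAI-style counter: real implementations
    # add per-message overhead tokens; this stub only encodes content fields.
    enc = tiktoken.get_encoding("cl100k_base")
    return sum(len(enc.encode(m.get("content") or "")) for m in messages)

def count_message_tokens(messages, model: ModelType) -> int:
    openai_style = {ModelType.GPT_3_5_TURBO, ModelType.GPT_4, ModelType.GPT_5}
    if model in openai_style:
        # GPT_5 now routes through the same OpenAI-style counter.
        return num_tokens_from_openai_messages(messages, model)
    raise NotImplementedError(f"No token counter for {model}")
```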
Harden OpenAI client initialization and token-completion logic

File: model_backend.py
Fall back to the cl100k_base tiktoken encoding when tiktoken.encoding_for_model fails, so unknown or edge-case model names don't break token counting (see the sketch after this list).
Wrap OpenAI client creation in try/except and surface clearer errors on failure.
Normalize token-limit map values (use large safe defaults for newer models).
For non-GPT-5 models: compute a safe max_tokens for completions and cap it with a conservative safe_cap (2048) to avoid large unintended completions.
For GPT-5 models: avoid sending legacy token/max_tokens parameters; instead send a minimal, GPT-5-friendly request body (response_format plus an extra_body map for model-specific flags like verbosity/reasoning_effort), so incompatible fields never reach the endpoint.
Add defensive logging via log_visualize to help debug what token parameters are being used.
Keep the block that handles older OpenAI python client paths and similarly wrap and log failure cases.
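The sketch below illustrates the encoding fallback, client-init hardening, and request composition under stated assumptions: the function names, the "gpt-5" prefix check, and the extra_body keys are illustrative and mirror the description above rather than the exact diff.

```python
import tiktoken
from openai import OpenAI  # openai>=1.x client

def get_encoding(model_name: str):
    try:
        return tiktoken.encoding_for_model(model_name)
    except KeyError:
        # Unknown/edge-case model names fall back to a sane default encoding.
        return tiktoken.get_encoding("cl100k_base")

def make_client() -> OpenAI:
    try:
        return OpenAI()  # reads OPENAI_API_KEY / base URL from the environment
    except Exception as exc:
        raise RuntimeError(f"OpenAI client initialization failed: {exc}") from exc

def build_completion_kwargs(model_name: str, prompt_tokens: int, token_limit: int) -> dict:
    if model_name.startswith("gpt-5"):
        # GPT-5-style request: no legacy max_tokens; model-specific flags ride
        # in extra_body (key names below are examples, not a fixed contract).
        return {
            "response_format": {"type": "text"},
            "extra_body": {"verbosity": "medium", "reasoning_effort": "medium"},
        }
    # Older models: conservative ceiling against large unintended completions.
    safe_cap = 2048
    return {"max_tokens": max(1, min(token_limit - prompt_tokens, safe_cap))}
```

Callers would merge these kwargs into the chat-completions request; the 2048 value matches the safe_cap described above.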
Make web_spider robust to missing OpenAI client

File: web_spider.py
Wrap client initialization in try/except and set client = None on failure with a warning.
Short-circuit modal_trans if the client is not available and return an empty string (avoid crashing background web spider flows when the OpenAI client can't be initialized).
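A sketch of the web_spider.py guard, assuming modal_trans wraps a single completion call (the logging details are placeholders and the actual call is elided):

```python
import logging
from openai import OpenAI

try:
    client = OpenAI()
except Exception as exc:
    logging.warning("OpenAI client unavailable; web spider disabled: %s", exc)
    client = None

def modal_trans(task_description: str) -> str:
    if client is None:
        # Short-circuit so background spider flows never crash on a missing client.
        return ""
    # ... the original summarization/translation call on `client` goes here ...
    return task_description
```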
Improve ChatAgent message handling for varied OpenAI response shapes

File: chat_agent.py
More defensive parsing of both new OpenAI SDK objects and older dict-style responses:
Avoid KeyError/AttributeError by using getattr/get with sensible defaults.
Strip or ignore fields the local ChatMessage doesn't support (e.g., extra annotations, reasoning fields) while passing through supported extras: refusal, audio, function_call, tool_calls.
Ensure empty/None content doesn't cause failures and preserve useful compatible fields.
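A sketch of the defensive parsing; the helper name and the dict it returns (standing in for ChatMessage's supported fields) are illustrative assumptions:

```python
SUPPORTED_EXTRAS = ("refusal", "audio", "function_call", "tool_calls")

def parse_choice_message(choice) -> dict:
    """Normalize one response choice into ChatMessage-compatible kwargs."""
    # Works for new SDK objects (attribute access) and old dict responses alike.
    if isinstance(choice, dict):
        message = choice.get("message", {})
    else:
        message = getattr(choice, "message", None)

    def read(field, default=None):
        if isinstance(message, dict):
            return message.get(field, default)
        return getattr(message, field, default)

    extras = {k: read(k) for k in SUPPORTED_EXTRAS if read(k) is not None}
    # Unsupported fields (annotations, reasoning, ...) are simply never read.
    return {
        "role": read("role", "assistant"),
        "content": read("content") or "",  # None content must not break callers
        **extras,
    }
```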
Why this change

Adds explicit support for GPT-5 and prepares request/response handling for model differences (fields, behaviors).
Improves runtime robustness when a model isn't known to tiktoken or the OpenAI client initialization fails (useful on machines without correct environment variables or when using alternative base URLs).
Prevents sending incompatible payload fields to GPT-5-style endpoints and reduces risk of large, unbounded completions for older models by applying a conservative cap.
Prevents crashes in background utilities (web_spider) when the OpenAI client can't be created.
Makes ChatAgent resilient to payload shape differences across OpenAI SDK versions and model responses.
Migration notes / compatibility

GPT-5 is added to the ModelType enum; ensure any config or code that references model names is updated if necessary.
The changes intentionally avoid sending max_tokens and other legacy ChatGPT config fields to GPT-5-style models. If you have workflows that expect those fields to control completion length for GPT-5, you may need to adapt them to the new model-specific control parameters (e.g., the extra_body keys used here are examples).
The code falls back to cl100k_base encoding when tiktoken cannot resolve a model name; that may change token counts slightly compared with model-specific encodings but is preferable to failing outright.
Testing notes

Smoke test ideas:
Run unit or integration flows that call OpenAIModel.run() with:
a known older model (e.g., gpt-3.5-turbo) and confirm completions still arrive and max_tokens is capped.
a new gpt-5 model string (if available in your environment) and confirm the request is sent without legacy token fields.
Start the app with no OPENAI_API_KEY or with an invalid BASE_URL to confirm web_spider doesn't crash and prints the warning.
Trigger ChatAgent flows and simulate both new SDK-style responses and older dict responses to ensure no exceptions and that ChatMessage objects contain the expected subset of fields.
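For the request-composition check, a couple of quick assertions against the hypothetical build_completion_kwargs sketch above (illustrative values, not actual test code from the PR):

```python
kwargs = build_completion_kwargs("gpt-5", prompt_tokens=1_000, token_limit=128_000)
assert "max_tokens" not in kwargs      # no legacy token fields for GPT-5

kwargs = build_completion_kwargs("gpt-3.5-turbo", prompt_tokens=1_000, token_limit=16_384)
assert kwargs["max_tokens"] == 2048    # capped by the conservative safe_cap
```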
Files changed (high level)

typing.py — add GPT-5 enum
utils.py — include GPT-5 in token counting and token limit lookup
model_backend.py — robust encoding fallback, client init errors handled, GPT-5 friendly request composition, safe completion caps, logging
web_spider.py — safe OpenAI client init and runtime guard
chat_agent.py — defensive response parsing for OpenAI outputs
Checklist

Add GPT-5 to model types
Add token handling and token limit awareness for GPT-5
Harden OpenAI client init and fallback encoding
Send GPT-5-compatible request body (avoid legacy fields)
Add defensive parsing in ChatAgent for new/old OpenAI responses
Protect web_spider from client initialization failures
