fix(ner): resolve silent pattern fallback when LLM method fails on custom gateways #556
Merged
…stom gateways (#554)

Three bugs caused NERExtractor to silently return pattern-based entities even when method="llm" was configured:

1. exc_info=True missing on method-failure warning in NERExtractor: the root exception was swallowed, making the gateway error invisible in logs even with DEBUG enabled.
2. OpenAIProvider.generate_structured always sent response_format=json_object to the API. Custom/enterprise gateways (Qwen, LLaMA proxies, internal gateways) often reject this parameter, causing both the instructor path and the manual repair loop to fail with the same error on every retry.
3. generate_typed manual repair loop had no fallback when generate_structured itself raised: it retried the same failing call up to max_retries times, then propagated the error, triggering _extract_fallback (pattern extraction).

Fixes:
- Add exc_info=True to the method-failure warning so the full traceback appears in logs and users can diagnose the root cause.
- Skip response_format=json_object in OpenAIProvider.generate_structured when base_url is set (custom endpoint), since standard OpenAI gateways don't require it and third-party ones reject it.
- In the generate_typed manual repair loop, catch generate_structured failures and immediately retry via plain generate() + _parse_json, breaking the retry-the-same-failing-call loop for custom gateways.

Also adds 17 targeted regression tests covering all three bug paths, including the exact gateway configuration reported in the issue.
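The second and third fixes can be illustrated with a minimal sketch. This is not the code from the PR: `build_create_kwargs` and `typed_with_fallback` are hypothetical names, the providers are passed in as plain callables, and `json.loads` stands in for `_parse_json`. It only shows the shape of the idea: omit `response_format` when a custom `base_url` is configured, and try plain generation when the structured call raises.

```python
import json
from typing import Any, Callable, Dict, List, Optional


def build_create_kwargs(
    messages: List[Dict[str, str]], base_url: Optional[str]
) -> Dict[str, Any]:
    """Hypothetical helper: request JSON mode only on the default endpoint."""
    kwargs: Dict[str, Any] = {"messages": messages}
    if not base_url:
        # No custom endpoint configured, so JSON mode is assumed to be supported.
        kwargs["response_format"] = {"type": "json_object"}
    return kwargs


def typed_with_fallback(
    generate_structured: Callable[[str], Dict[str, Any]],
    generate: Callable[[str], str],
    prompt: str,
    max_retries: int = 3,
) -> Dict[str, Any]:
    """Try the structured call; if it raises, fall back to plain text + JSON parsing."""
    last_error: Optional[Exception] = None
    for _ in range(max_retries):
        try:
            return generate_structured(prompt)
        except Exception as structured_error:
            last_error = structured_error
            try:
                # Fallback path: plain generation, then parse the text as JSON.
                return json.loads(generate(prompt))
            except Exception as fallback_error:
                last_error = fallback_error
    raise RuntimeError("structured and plain generation both failed") from last_error
```

With a gateway that rejects the structured call, the first attempt already reaches the plain-generation path instead of repeating the same failing request.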
Four issues raised in code review:
- Mode.JSON retry now strips response_format from create_kwargs before calling json_client.chat.completions.create, preventing incompatible kwargs from being forwarded to a client configured for a different mode.
- Add exc_info=True to the generate_structured fallback warning in the manual repair loop so the gateway rejection traceback is visible in production logs, consistent with the other warnings added in this PR.
- Remove the duplicate is_available definition in GroqProvider. Python silently kept only the second definition; the first (with diagnostic branching) was dead code and could cause confusion on future edits.
- Validate base_url scheme in OpenAIProvider._init_client. Non-HTTP(S) schemes (file://, ftp://, javascript:, etc.) are now rejected with a ValueError at init time, preventing SSRF if base_url originates from configuration rather than hardcoded values.

Add 3 new tests: SSRF scheme rejection, valid-URL acceptance, and exc_info presence on the generate_structured fallback warning (20/20 pass).

Update CHANGELOG.md with full description of all fixes under [Unreleased].
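The base_url scheme check described above could look roughly like the following. This is a sketch under assumptions, not the validation added to OpenAIProvider._init_client: `validate_base_url` is a hypothetical standalone helper, and requiring a non-empty host is an extra assumption beyond what the commit message states.

```python
from typing import Optional
from urllib.parse import urlparse


def validate_base_url(base_url: Optional[str]) -> Optional[str]:
    """Hypothetical helper: accept only http(s) URLs that name a host."""
    if base_url is None:
        # No custom endpoint configured; nothing to validate.
        return None
    parsed = urlparse(base_url)
    # urlparse lowercases the scheme, so "HTTPS://..." is accepted as well.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(
            f"base_url must be an http(s) URL, got scheme {parsed.scheme!r}"
        )
    return base_url
```

Raising at construction time means a misconfigured endpoint fails before any request is attempted.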
Summary
Fixes #554: NERExtractor with method=llm silently returned pattern-based entities (extraction_method='pattern') even when the LLM was correctly generating tool-call output.

Three root causes identified and fixed:
1. exc_info=True was missing from the method-failure warning in NERExtractor.extract_entities. The full traceback was never logged, so the real error (gateway rejection) was invisible even with DEBUG logging enabled.
2. response_format=json_object sent to custom gateways: OpenAIProvider.generate_structured always included response_format={"type": "json_object"} in every API call. Custom/enterprise gateways (internal proxies, Qwen, LLaMA-based routers) frequently reject this parameter, causing both the instructor path and the manual repair loop to fail with the same error on every retry attempt.
3. generate_typed's manual repair loop called generate_structured exclusively. When generate_structured itself raised (same rejection), the loop retried the identical failing call up to max_retries times before giving up, which then triggered _extract_fallback (pattern extraction).

Changes
semantica/semantic_extract/ner_extractor.py
- Add exc_info=True to the method-failure warning() call so the complete traceback is surfaced in logs.

semantica/semantic_extract/providers.py
- OpenAIProvider.generate_structured: skip response_format=json_object when self.base_url is set. Standard OpenAI endpoints are unaffected; custom gateways no longer receive the unsupported parameter.
- BaseProvider.generate_typed manual repair loop: wrap the generate_structured call in an inner try/except. On failure, immediately fall back to plain generate() + _parse_json before giving up, breaking the retry-same-failure loop.

Test plan
tests/test_issue_554_fixes.py (17 new regression tests, all pass):
- llm_typed metadata on success
- response_format omitted for custom gateways, still included for standard OpenAI, JSON parsed correctly in both paths
- generate() called when generate_structured fails, generate_structured is still primary path, errors propagate when both fail, malformed JSON errors surface, bare-list auto-wrap works via fallback path
- exc_info present on failure, [llm, pattern] fallback chain works end-to-end

test_ner_configurations and test_llm_extraction_fixes are identical before and after this branch (verified via git stash)
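As an illustration of the "exc_info present on failure" check, here is a minimal, self-contained sketch using only the standard logging module. It does not reproduce the tests in tests/test_issue_554_fixes.py: `RecordCollector`, `extract_with_fallback`, `failing_llm`, and the logger name are all hypothetical stand-ins for the real extractor and its test fixtures.

```python
import logging


class RecordCollector(logging.Handler):
    """Collects log records so their attributes can be inspected afterwards."""

    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def extract_with_fallback(llm_extract, pattern_extract, text, logger):
    """Hypothetical [llm, pattern] chain: log the LLM failure with its traceback."""
    try:
        return llm_extract(text), "llm"
    except Exception:
        # exc_info=True attaches the active exception to the log record.
        logger.warning("Method 'llm' failed, falling back to pattern", exc_info=True)
        return pattern_extract(text), "pattern"


def failing_llm(text):
    # Stand-in for a gateway that rejects the request.
    raise RuntimeError("gateway rejected response_format")


logger = logging.getLogger("issue_554_sketch")
logger.setLevel(logging.WARNING)
logger.propagate = False
collector = RecordCollector()
logger.addHandler(collector)

entities, method = extract_with_fallback(
    failing_llm, lambda text: ["Alice"], "Alice works here", logger
)
```

After this runs, the single collected record carries the RuntimeError in its exc_info tuple, which is what a regression test of this kind would assert on.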