Skip to content

fix(ner): resolve silent pattern fallback when LLM method fails on custom gateways#556

Merged
KaifAhmad1 merged 2 commits into
mainfrom
fix/issue-554-ner-llm-gateway-fallback
May 15, 2026
Merged

fix(ner): resolve silent pattern fallback when LLM method fails on custom gateways#556
KaifAhmad1 merged 2 commits into
mainfrom
fix/issue-554-ner-llm-gateway-fallback

Conversation

@KaifAhmad1
Copy link
Copy Markdown
Contributor

Summary

Fixes #554NERExtractor with method=llm silently returned pattern-based entities (extraction_method='pattern') even when the LLM was correctly generating tool-call output.

Three root causes identified and fixed:

  • Silent exception swallowingexc_info=True was missing from the method-failure warning in NERExtractor.extract_entities. The full traceback was never logged, so the real error (gateway rejection) was invisible even with DEBUG logging enabled.
  • response_format=json_object sent to custom gatewaysOpenAIProvider.generate_structured always included response_format={type: json_object} in every API call. Custom/enterprise gateways (internal proxies, Qwen, LLaMA-based routers) frequently reject this parameter, causing both the instructor path and the manual repair loop to fail with the same error on every retry attempt.
  • No fallback in the manual repair loopgenerate_typed's manual repair loop called generate_structured exclusively. When generate_structured itself raised (same rejection), the loop retried the identical failing call up to max_retries times before giving up, which then triggered _extract_fallback (pattern extraction).

Changes

semantica/semantic_extract/ner_extractor.py

  • Added exc_info=True to the method-failure warning() call so the complete traceback is surfaced in logs.

semantica/semantic_extract/providers.py

  • OpenAIProvider.generate_structured: skip response_format=json_object when self.base_url is set. Standard OpenAI endpoints are unaffected; custom gateways no longer receive the unsupported parameter.
  • BaseProvider.generate_typed manual repair loop: wrap the generate_structured call in an inner try/except. On failure, immediately fall back to plain generate() + _parse_json before giving up, breaking the retry-same-failure loop.

Test plan

  • tests/test_issue_554_fixes.py — 17 new regression tests (all pass):
    • Bug 1 (4 tests): traceback present in log, method name in message, pattern fallback still works, llm_typed metadata on success
    • Bug 2 (5 tests): response_format omitted for custom gateways, still included for standard OpenAI, JSON parsed correctly in both paths
    • Bug 3 (5 tests): plain generate() called when generate_structured fails, generate_structured is still primary path, errors propagate when both fail, malformed JSON errors surface, bare-list auto-wrap works via fallback path
    • Integration (3 tests): reproduces harshalizode's exact gateway config, exc_info present on failure, [llm, pattern] fallback chain works end-to-end
  • Zero regressions — pre-existing failures in test_ner_configurations and test_llm_extraction_fixes are identical before and after this branch (verified via git stash)

…stom gateways (#554)

Three bugs caused NERExtractor to silently return pattern-based entities
even when method="llm" was configured:

1. exc_info=True missing on method-failure warning in NERExtractor —
   the root exception was swallowed, making the gateway error invisible
   in logs even with DEBUG enabled.

2. OpenAIProvider.generate_structured always sent response_format=json_object
   to the API. Custom/enterprise gateways (Qwen, LLaMA proxies, internal
   gateways) often reject this parameter, causing both the instructor path
   and the manual repair loop to fail with the same error on every retry.

3. generate_typed manual repair loop had no fallback when generate_structured
   itself raised — it retried the same failing call up to max_retries times,
   then propagated the error, triggering _extract_fallback (pattern extraction).

Fixes:
- Add exc_info=True to the method-failure warning so the full traceback
  appears in logs and users can diagnose the root cause.
- Skip response_format=json_object in OpenAIProvider.generate_structured
  when base_url is set (custom endpoint), since standard OpenAI gateways
  don't require it and third-party ones reject it.
- In the generate_typed manual repair loop, catch generate_structured
  failures and immediately retry via plain generate() + _parse_json,
  breaking the retry-the-same-failing-call loop for custom gateways.

Also adds 17 targeted regression tests covering all three bug paths,
including the exact gateway configuration reported in the issue.
@qodo-code-review
Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

Four issues raised in code review:

- Mode.JSON retry now strips response_format from create_kwargs before
  calling json_client.chat.completions.create, preventing incompatible
  kwargs from being forwarded to a client configured for a different mode.

- Add exc_info=True to the generate_structured fallback warning in the
  manual repair loop so the gateway rejection traceback is visible in
  production logs, consistent with the other warnings added in this PR.

- Remove the duplicate is_available definition in GroqProvider. Python
  silently kept only the second definition; the first (with diagnostic
  branching) was dead code and could cause confusion on future edits.

- Validate base_url scheme in OpenAIProvider._init_client. Non-HTTP(S)
  schemes (file://, ftp://, javascript:, etc.) are now rejected with a
  ValueError at init time, preventing SSRF if base_url originates from
  configuration rather than hardcoded values.

Add 3 new tests: SSRF scheme rejection, valid-URL acceptance, and
exc_info presence on the generate_structured fallback warning (20/20 pass).

Update CHANGELOG.md with full description of all fixes under [Unreleased].
@KaifAhmad1 KaifAhmad1 self-assigned this May 15, 2026
@KaifAhmad1 KaifAhmad1 merged commit d3ffbad into main May 15, 2026
10 checks passed
@KaifAhmad1 KaifAhmad1 deleted the fix/issue-554-ner-llm-gateway-fallback branch May 15, 2026 14:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant