
[MEDIUM] Bug: Token estimation uses inaccurate character division method #16

@unn-Known1

Description


Bug Description

The gateway uses a naive token estimation based on character count:

prompt_tokens = len(input_str) // 4

This is a medium severity issue because:

  1. Inaccurate counting: the number of characters per token varies widely across languages and vocabularies
  2. Cost calculation errors: Billing estimates will be incorrect
  3. Model compatibility: Different models use different tokenization
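
A quick way to see the skew (the sample strings below are illustrative only):

```python
# The 4-characters-per-token rule of thumb is tuned to English text;
# for CJK text, where a single character is often a whole token, it
# badly undercounts.
english = "The quick brown fox jumps over the lazy dog."
cjk = "こんにちは世界、これはテストです。"

naive_en = len(english) // 4   # heuristic estimate for English
naive_cjk = len(cjk) // 4      # same heuristic applied to CJK
print(naive_en, naive_cjk)
```

Real BPE tokenizers typically map the CJK string to roughly one token per character, so the heuristic can be off by a factor of about four for such inputs.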

Severity

  • Severity: Medium
  • Impact: Incorrect token usage reporting and billing
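
To make the billing impact concrete, here is a hypothetical sketch (function and field names are assumptions, not taken from the gateway source) of how a usage block might be assembled; any error in the counter propagates directly into the billed figures:

```python
# Hypothetical sketch: 'count' is any token-counting callable
# (tiktoken-backed or a character heuristic). An inaccurate counter
# skews every field in the reported usage payload.
def build_usage(prompt: str, completion: str, count) -> dict:
    prompt_tokens = count(prompt)
    completion_tokens = count(completion)
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }

usage = build_usage("Hello, world", "Hi there!", lambda t: max(1, len(t) // 4))
```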

Recommended Fix

  1. Use the tiktoken library for accurate token counting:
pip install tiktoken
  2. Implement proper tokenization:
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    return len(enc.encode(text))

prompt_tokens = count_tokens(input_str)
  3. Fallback for when tiktoken is unavailable:
def is_cjk(text: str) -> bool:
    # Heuristic: treat the text as CJK if most of its characters fall in
    # common CJK ranges (punctuation, kana, and unified ideographs)
    cjk_chars = sum(1 for ch in text if "\u3000" <= ch <= "\u9fff")
    return cjk_chars > len(text) // 2

def estimate_tokens(text: str) -> int:
    # Rough approximation: ~4 characters per token for English, ~2 for CJK
    if is_cjk(text):
        return len(text) // 2
    return len(text) // 4
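
Steps 2 and 3 can be combined so the gateway degrades gracefully when the dependency is missing. A minimal sketch, assuming tiktoken may or may not be installed:

```python
# Sketch: prefer tiktoken when it is installed, fall back to the character
# heuristic otherwise, so the gateway keeps working without the dependency.
try:
    import tiktoken

    _enc = tiktoken.get_encoding("cl100k_base")

    def count_tokens(text: str) -> int:
        return len(_enc.encode(text))
except ImportError:
    def count_tokens(text: str) -> int:
        # Fallback: ~4 characters per token (English-biased approximation)
        return max(1, len(text) // 4)
```

Either branch exposes the same `count_tokens` signature, so call sites do not need to know which backend is active.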

Files Affected

  • nvidia-ai-gateway.py (lines ~330-350)
