Bug Description
The gateway uses a naive token estimate based on character count:

```python
prompt_tokens = len(input_str) // 4
```
This is a medium severity issue because:
- Inaccurate counting: Different characters/words have varying token lengths
- Cost calculation errors: Billing estimates will be incorrect
- Model compatibility: Different models use different tokenization
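To illustrate the inaccuracy, the current heuristic can be exercised directly (a minimal sketch; `naive_estimate` is a hypothetical name mirroring the gateway's `len(input_str) // 4` logic):

```python
def naive_estimate(text: str) -> int:
    # Mirrors the gateway's current heuristic: ~4 characters per token
    return len(text) // 4

# Plausible for short English prose...
print(naive_estimate("The quick brown fox"))  # 4
# ...but a single CJK character is estimated at 0 tokens, even though
# real tokenizers emit at least one token for it
print(naive_estimate("猫"))  # 0
```

The estimate is only loosely correlated with real tokenizer output, so both usage reporting and billing drift with the input language.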
Severity
- Severity: Medium
- Impact: Incorrect token usage reporting and billing
Recommended Fix
- Use the `tiktoken` library for accurate token counting; implement proper tokenization:
```python
import tiktoken

# cl100k_base is the encoding used by the GPT-3.5/GPT-4 model family
enc = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    return len(enc.encode(text))

prompt_tokens = count_tokens(input_str)
```
- Fallback for when tiktoken is unavailable:
```python
def is_cjk(text: str) -> bool:
    # Heuristic: treat the text as CJK when most characters fall in
    # the CJK Unified Ideographs range
    cjk_chars = sum("\u4e00" <= ch <= "\u9fff" for ch in text)
    return cjk_chars > len(text) // 2

def estimate_tokens(text: str) -> int:
    # Rough approximation: ~4 characters per token for English,
    # ~2 characters per token for CJK text
    if is_cjk(text):
        return len(text) // 2
    return len(text) // 4
```
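The two paths can be combined into a single helper that prefers `tiktoken` and degrades gracefully (a sketch; `count_tokens_safe` is a hypothetical name, and the broad `except` is deliberate so that a missing package or an unreachable encoding file both trigger the fallback):

```python
def count_tokens_safe(text: str) -> int:
    # Prefer exact counting via tiktoken; fall back to the
    # character-based heuristic when it is unavailable.
    try:
        import tiktoken
        enc = tiktoken.get_encoding("cl100k_base")
        return len(enc.encode(text))
    except Exception:  # tiktoken missing or encoding data unreachable
        return max(1, len(text) // 4)  # crude English-biased fallback

print(count_tokens_safe("hello world"))  # 2 on either path
```

Keeping the fallback behind one function means the rest of the gateway never needs to know which counting strategy was used.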
Files Affected
`nvidia-ai-gateway.py` (lines ~330-350)