
[WIP] Minimal Tokenizer Implementation #513

Draft: snico432 wants to merge 11 commits into main

Conversation

@snico432 commented Nov 28, 2024

Overview

In response to #262 and #263, and building off of #271, I've been working on
a minimal tokenizer for exo. I initially aimed to remove the transformers dependency,
but discovered it's a transitive dependency of mlx_lm, so removing it will require more extensive changes.

So far I've tested on:

  • llama-3.2-1b
  • qwen-2.5-coder-1.5b
    (both with the MLX inference engine)

Questions

  1. Cut down exo package size #263 mentions removing both the Jinja2 and tiktoken dependencies. Since my implementation
    is currently a wrapper around these packages (see the sketch below), I wanted to confirm that removing them
    is still the desired direction before proceeding.
  2. Should we tackle the transformers dependency in a separate PR, given that it comes
    in transitively through mlx_lm?
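
For context, here's a minimal sketch of the wrapper approach described in question 1, assuming tiktoken's `Encoding` class for BPE encoding/decoding and a Jinja2 template for chat formatting. The class name, constructor parameters, and `apply_chat_template` helper are illustrative placeholders, not the actual code in this PR:

```python
# Minimal sketch of the wrapper approach (illustrative only; names and
# parameters are placeholders, not the implementation in this PR).
import tiktoken
from jinja2 import Template

class MinimalTokenizer:
    def __init__(self, mergeable_ranks, special_tokens, pat_str, chat_template):
        # tiktoken handles BPE encode/decode given the model's merge table,
        # special tokens, and pre-tokenization regex.
        self.enc = tiktoken.Encoding(
            name="minimal",
            pat_str=pat_str,                  # pre-tokenizer split regex
            mergeable_ranks=mergeable_ranks,  # {bytes: rank} BPE merge table
            special_tokens=special_tokens,    # e.g. {"<|begin_of_text|>": 128000}
        )
        # Jinja2 renders the model's chat template over a message list.
        self.chat_template = Template(chat_template)

    def encode(self, text, allowed_special=frozenset()):
        return self.enc.encode(text, allowed_special=allowed_special)

    def decode(self, token_ids):
        return self.enc.decode(token_ids)

    def apply_chat_template(self, messages, add_generation_prompt=True):
        return self.chat_template.render(
            messages=messages,
            add_generation_prompt=add_generation_prompt,
        )
```

Callers would then do something like `tok.encode(tok.apply_chat_template(messages), allowed_special="all")`, which should keep the surface area close to what the inference engines already expect from a transformers tokenizer.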

@AlexCheema (Contributor) commented Nov 28, 2024

This is awesome! Much needed addition.
I'm going to assign a $500 retrospective bounty for this if we can get a Minimal Tokenizer implementation working for all models without any dependency on tokenizers.

To answer your questions:

  1. Jinja2 and tiktoken are fine to keep.
  2. Yeah we can tackle getting rid of transformers completely in a separate PR.
