The Llama Android example doesn't generate tokens when trying to run local inference. #158
Hi @BakungaBronson, a few ideas on debugging this:

cc @Riandy
Hey @WuhanMonkey,
I suspect it has to do with the missing log you mentioned. Do any of the files in the example folder need to be changed? I left the two main files as they were and ran the app in Android Studio.
You shouldn't need to do anything besides downloading
How can I do this? I have found the AppLogging class, but I do not know how to enable it.
@BakungaBronson It is enabled by default. So in your Logcat, filter the logs by the app's package name, for example as in the sketch below.
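A minimal sketch of that filter, assuming a hypothetical application ID of `com.example.llamastackandroiddemo` (substitute the demo app's actual package name):

```sh
# In Android Studio's Logcat search bar the filter would be:
#   package:com.example.llamastackandroiddemo
# From a terminal, the equivalent is to filter by the running app's PID:
adb logcat --pid="$(adb shell pidof -s com.example.llamastackandroiddemo)"
```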
@WuhanMonkey, I can see the logs now, and the specific log you mentioned earlier does exist.
However, there are also some error lines from ExecuTorch that appear multiple times:

There are also two lone errors:
@BakungaBronson It looks very much like your ExecuTorch model has an issue. Does the same model and tokenizer work with ET itself (e.g. via the runner sketched below)? If so, what is the ET version? By the way, we also launched the Llama Stack Kotlin SDK v0.1 along with an updated demo app. Please try it out.
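One way to test a .pte against ExecuTorch directly is its example Llama runner. A sketch; the build output path, file names, and prompt here are assumptions, not the exact commands from this thread:

```sh
# Run ExecuTorch's example Llama runner against the exported model;
# a healthy model/tokenizer pair should stream tokens for the prompt.
cmake-out/examples/models/llama/llama_main \
  --model_path=llama3_2_1b.pte \
  --tokenizer_path=tokenizer.bin \
  --prompt="Hello, how are you?"
```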
@WuhanMonkey you're right, the model can't run when I use ExecuTorch on it. My ExecuTorch version is executorch-0.6.0a0+e78ed83. I tried running it and got the same long error shown above. What could I be doing wrong? This is the output from me running ExecuTorch on the Llama 3.2 1B Instruct model.

Script to convert:

Output:

Congratulations on the SDK launch. I will definitely give it a go.
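For reference, a Llama 3.2 1B export along these lines usually looks something like the following sketch, based on ExecuTorch's documented Llama export flow rather than the exact script used above (checkpoint paths, flag choices, and the output name are assumptions):

```sh
# Illustrative export of Llama 3.2 1B Instruct to a .pte with the XNNPACK
# backend and KV cache enabled; run from the executorch repository root.
python -m examples.models.llama.export_llama \
  --checkpoint ~/.llama/checkpoints/Llama3.2-1B-Instruct/consolidated.00.pth \
  -p ~/.llama/checkpoints/Llama3.2-1B-Instruct/params.json \
  -kv -X -d fp32 \
  --metadata '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}' \
  --output_name llama3_2_1b.pte
```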
@BakungaBronson The current Kotlin SDK uses ExecuTorch v0.5.0. For your reference, this is the commit it is pinned against. My guess is that there is some issue in v0.6.0a0. In general, ExecuTorch itself should be able to run your model after export. My suggestions are:
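A natural first step in that direction is re-exporting from the same ExecuTorch release the SDK pins. A sketch, assuming the repository tags releases as `v0.5.0` and ships a root-level `install_requirements.sh` at that release:

```sh
# Check out the ExecuTorch release matching the Kotlin SDK's pin,
# set up its dependencies, then re-run the export from this tree.
git clone https://github.com/pytorch/executorch.git
cd executorch
git checkout v0.5.0
git submodule update --init --recursive
./install_requirements.sh
```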
System Info
Since this is running on a phone, I will share the phone details instead.
Samsung Galaxy S24 Ultra - Running Android 14.
🐛 Describe the bug
The Llama Android example doesn't generate tokens when trying to run local inference.
I have followed the documentation as written, preparing the model using ExecuTorch and pushing the .pte and the renamed tokenizer.bin file with adb, along the lines sketched below.
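A minimal sketch of that push step, assuming the on-device directory is /data/local/tmp/llama (the directory and file names are assumptions; they should match whatever the app's settings point at):

```sh
# Create the on-device folder, then push the exported model and tokenizer.
adb shell mkdir -p /data/local/tmp/llama
adb push llama3_2_1b.pte /data/local/tmp/llama/
adb push tokenizer.bin /data/local/tmp/llama/
```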
For some reason, though, no tokens are produced. I have tried the Llama3.2-1B-Instruct model and the Llama3.2-1B-Instruct-int4-spinquant-eo8 model; both display the same empty message. There are no issues running ExecuTorch on the models.

Error logs
These are not errors, but I'm posting the logs from the app:
Expected behavior
When the model and tokenizer are loaded and the settings point to them, the application should perform on-device inference.