Would you consider supporting the use of a model's multimodal capabilities to read PDFs? #4729
isCopyman started this conversation in 1. Feature requests
Both the Gemini web app and AI Studio can use the model's multimodal capabilities to read PDFs directly instead of relying on text parsing. Supporting this would save tokens and let the model read images embedded in the PDF. Claude Code already supports reading PDFs through its readfile tool, and Gemini CLI seems to have similar functionality.