LlamaParse Roadmap #19

jerryjliu · 2024-02-19T19:12:48Z

jerryjliu
Feb 19, 2024
Maintainer

Currently LlamaParse supports complex PDF documents as input. We will extend LlamaParse in the coming weeks / months to support the following:

More file formats, starting with .docx and .pptx, coming this month.
Ability to retrieve images embedded in documents, coming this month.
More output formats better suited to retrieval-augmented generation than Markdown.
Better support for complex document layouts, including graphs and pie charts.
Vertical-specific document parsers, starting with financial documents, invoices, forms, and contracts.

Of course, we will also work hard to address any issues that may arise. Please drop them in the Issues tab!

jamesqiyucai · 2024-02-23T01:56:04Z

jamesqiyucai
Feb 23, 2024

When will LlamaParse support Chinese?

1 reply

hexapode Feb 26, 2024
Maintainer

LlamaParse already supports Chinese. You should specify language=ch_sim or language=ch_trad in the API parameters. Once #37 is merged, you should be able to do it with the Python plugin.
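
As a rough illustration of the reply above, the sketch below maps a script name to the language code to send. Only the `language` parameter and the values `ch_sim` and `ch_trad` come from this thread; the helper name `build_parse_params` and the idea of collecting the parameters in a dict are assumptions, so check the actual request shape against the LlamaParse API documentation.

```python
# Language codes quoted in the reply above.
CHINESE_LANGUAGE_CODES = {
    "simplified": "ch_sim",
    "traditional": "ch_trad",
}


def build_parse_params(script: str) -> dict:
    """Hypothetical helper: return the API parameters for a Chinese document.

    `script` is "simplified" or "traditional". The returned dict only carries
    the `language` parameter; any other parameters are out of scope here.
    """
    try:
        code = CHINESE_LANGUAGE_CODES[script]
    except KeyError:
        raise ValueError(f"unknown script {script!r}; expected one of "
                         f"{sorted(CHINESE_LANGUAGE_CODES)}") from None
    return {"language": code}


params = build_parse_params("simplified")
```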

DMAgit · 2024-02-27T09:55:58Z

DMAgit
Feb 27, 2024

Is the source code of the parsing logic itself available, or will it be made available? As far as I can see, only the code that calls the remote endpoint is currently available, not any of the parsing logic.

0 replies

relsas · 2024-03-19T18:06:52Z

relsas
Mar 19, 2024

Testing on Academic Papers

Hi, I very much appreciate the initiative on LlamaParse as pdf-parsing is indeed a challenge. Putting this as an API is also cool and solves many installation issues. The idea of using LLM instructions to steer the parser is great, as is the separation of objects for tables and figures.
However, I've now spent quite some time testing it on academic papers, with only mediocre success. In particular, the following issues occur:

A new page in the pdf always breaks the current paragraph, such that often sentences are interrupted and continued in the next paragraph.
If the pdf has footer or header information, this is repeated over and over, and is sometimes even classified as a heading. Such footer info is then placed in the body text wherever a page break occurs, so that one cannot reconstruct interrupted sentences/paragraphs (see above).
Footnotes are not systematically recognized, their numbering is lost in most cases, and sometimes footnotes appear in the body text.
Formulas are rarely translated to LaTeX.

In most of these aspects, parsing by Nougat works much better (though it has problems of its own, e.g. with references or left-out pages).
Any ideas (or maybe plans) how to improve model performance on such complex types of pdfs?
Kind regards
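
The first two issues above can be partly worked around in post-processing. The following is a minimal sketch, assuming the parsed output is available as one text string per page with paragraphs separated by blank lines. The function names, the 60% threshold, and the end-of-sentence heuristic are illustrative assumptions, not part of LlamaParse.

```python
from collections import Counter

# Heuristic: a paragraph ending in one of these is treated as complete.
SENTENCE_END = (".", "!", "?", ":", '"', ")")


def strip_repeated_lines(pages, min_fraction=0.6):
    """Drop lines that recur on many pages (likely running headers/footers)."""
    counts = Counter()
    for page in pages:
        # Count each distinct non-empty line at most once per page.
        counts.update({ln.strip() for ln in page.splitlines() if ln.strip()})
    # Never treat a line seen on a single page as repeated.
    threshold = max(2, int(len(pages) * min_fraction))
    repeated = {ln for ln, n in counts.items() if n >= threshold}
    return [
        "\n".join(ln for ln in page.splitlines() if ln.strip() not in repeated)
        for page in pages
    ]


def merge_pages(pages):
    """Return paragraphs, rejoining one that was cut by a page break."""
    paragraphs = []
    for page in pages:
        page_paras = [p.strip() for p in page.split("\n\n") if p.strip()]
        # If the previous page ended mid-sentence, glue on the continuation.
        if paragraphs and page_paras and not paragraphs[-1].endswith(SENTENCE_END):
            paragraphs[-1] += " " + page_paras.pop(0)
        paragraphs.extend(page_paras)
    return paragraphs


pages = [
    "Journal of X\n\nThe model is trained on",
    "Journal of X\n\nlarge corpora. It works.\n\nNext paragraph.",
    "Journal of X\n\nFinal.",
]
paragraphs = merge_pages(strip_repeated_lines(pages))
```

This only helps when headers and footers are textually identical across pages; page numbers or varying running titles would need a looser match.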

0 replies

gittar · 2024-04-27T23:37:39Z

gittar
Apr 27, 2024

Will there be LlamaParse on-prem, i.e. a version which can be run locally without API calls? This would make it usable for confidential documents. A "No" answer would also be helpful for decision making.

2 replies

martincollado Apr 30, 2024

I think that the "No" answer is an essential aspect. I've been searching forums, Google, etc. for the decisions you have made about data privacy in this product, and it is difficult to find a clear answer.

Could you provide more information about it @hexapode ? I don't think that any company out there will throw private and sensitive information on this API before being aware of what's happening with their data.

jordanparker6 Aug 13, 2024

@hexapode would love to know if this feature is roadmapped. Would help to understand potential licensing costs if this were to be part of an on-prem stack for enterprise.

guillaume-millot · 2024-05-02T07:11:43Z

guillaume-millot
May 2, 2024

I have tried LlamaParse to parse complex financial documents. I found it to be a great and innovative tool.
However, it often makes basic mistakes such as completely ignoring a big portion of the PDF.
It'd really be the best tool if it were more robust. Curious to know how much you're investing in LlamaParse core improvements vs other things.
Thank you!

0 replies

LDD19 · 2024-05-21T22:14:02Z

LDD19
May 21, 2024

Better support for complex document layouts, including graphs and pie charts.

Any plans to integrate this in the near future? Especially with GPT-4o being multimodal, filling the missing data in parsed charts/images shouldn't be an issue.

0 replies

aifirstd3v · 2024-05-23T20:45:38Z

aifirstd3v
May 23, 2024

My PDF only has images :) I hope LlamaParse can read images :)

0 replies

kinthaiofficial · 2026-04-29T00:36:09Z

kinthaiofficial
Apr 29, 2026

A few things that would be high-value additions to LlamaParse from an agent pipeline perspective:

Structured extraction with schema — beyond OCR and layout parsing, the ability to extract into a specified Pydantic/JSON Schema directly. "Parse this invoice and return an InvoiceSchema object" rather than "return markdown, then I'll extract." Reduces one LLM hop in the pipeline.

Incremental / delta parsing — when a document is updated (new page added, table modified), parse only the changed sections and return a delta. Avoids re-parsing stable pages. Crucial for long-lived knowledge bases where documents evolve.

Parse quality confidence — a confidence score per extracted element (this table was cleanly parsed: 0.97; this blurry scan section: 0.61). Agents can use this to decide whether to re-fetch the source, flag for human review, or proceed with reduced confidence.

Extraction lineage — for agent systems, knowing "this fact came from page 4, table 2, row 3 of document X (version 2026-01-15)" is essential for citation and audit. Parse results should carry enough metadata for the agent to build a provenance trail.

Batch prioritization — when an agent submits 50 documents to parse, it should be able to flag 5 as urgent (needed for the current task) and 45 as background. Priority queue semantics rather than FIFO.
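
To make the confidence, lineage, and prioritization ideas above concrete, here is a small sketch of what such a result and queue might look like on the agent side. Every name here (`Provenance`, `ParsedElement`, `route`, `ParseQueue`) and both thresholds are hypothetical; none of this is an existing LlamaParse API.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Provenance:
    """Where an extracted element came from (hypothetical shape)."""
    document_id: str
    version: str
    page: int
    element: str  # e.g. "table 2, row 3"


@dataclass(frozen=True)
class ParsedElement:
    text: str
    confidence: float  # 0.0 to 1.0, as proposed above
    source: Provenance


def route(element, accept_at=0.9, review_at=0.5):
    """Decide what an agent does with an element, based on its confidence."""
    if element.confidence >= accept_at:
        return "proceed"
    if element.confidence >= review_at:
        return "human_review"
    return "refetch"


class ParseQueue:
    """Priority queue: urgent documents first, FIFO within each class."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker that preserves submission order

    def submit(self, document_id, urgent=False):
        priority = 0 if urgent else 1
        heapq.heappush(self._heap, (priority, next(self._order), document_id))

    def next_document(self):
        return heapq.heappop(self._heap)[2]


source = Provenance("X", "2026-01-15", page=4, element="table 2, row 3")
clean = ParsedElement("Total: 42", 0.97, source)
blurry = ParsedElement("T0tal: 4?", 0.61, source)

queue = ParseQueue()
queue.submit("background-1")
queue.submit("urgent-1", urgent=True)
queue.submit("background-2")
```

With these thresholds the 0.97 table is accepted and the 0.61 scan is flagged for review, and the urgent document is dequeued before the two background ones regardless of submission order.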

We've been building document-aware agents in KinthAI that need reliable parsing as input to their knowledge consolidation step: https://blog.kinthai.ai/why-character-ai-forgets-you-persistent-memory-architecture covers how parsing quality affects downstream memory reliability.

Are the roadmap priorities driven more by enterprise document processing needs or by developer/agent pipeline needs?

0 replies
