Commit

I changed the prompt as suggested in the PR comments.
dzemeuksis committed Jan 17, 2025
1 parent 3b8ecac commit ca5a251
Showing 1 changed file with 32 additions and 13 deletions.
45 changes: 32 additions & 13 deletions src/markitdown/_markitdown.py
@@ -1048,19 +1048,38 @@ def convert(self, local_path, **kwargs) -> Union[None, DocumentConverterResult]:
     def _get_llm_description(self, local_path, extension, client, model, prompt=None):
         if prompt is None or prompt.strip() == "":
             prompt = '''
-Analyze the image and extract all visible text in the original language.
-Reproduce the extracted text in a structured Markdown format, preserving
-any formatting such as headings, bullet points, and highlights. Ensure
-the output accurately reflects the structure and style of the original
-document.
-Additionally, if the image includes any visual elements (e.g., diagrams,
-logos, or specific layouts) that cannot be represented directly in Markdown,
-describe them in plain text as part of the Markdown document under a section
-titled "Visual Notes."
-Output only the converted Markdown text without any additional commentary
-or explanations.
+Analyze the image and extract all visible text in the original language. Reproduce the extracted text in a structured Markdown format, preserving any formatting such as headings, bullet points, and highlights. Ensure the output accurately reflects the structure and style of the original document.
+Follow these additional guidelines based on the content type:
+**Tables:**
+* Create exact markdown representation of the table using markdown syntax (|column1|column2|)
+* Create a separator row (|---|---|) after the header
+* Transcribe all values exactly as they appear in the table
+**Mathematical Formulas:**
+* Use LaTeX notation within markdown delimiters, e.g., `$$ y = mx + b $$`
+**Charts and Graphs:**
+* Identify the graph type (bar, line, pie, etc.)
+* Extract data points into a markdown table
+* Include axis labels, units, and scale information
+* Describe patterns (e.g., linear, exponential) under markdown headers
+* Record maximums, minimums, and important values
+**Flowcharts and Diagrams:**
+* Use mermaid markdown syntax where possible:
+```mermaid
+graph LR
+A-->B
+B-->C
+```
+* For process flows, create a numbered list with clear step progression and any branching conditions
+* For technical diagrams, list components and their relationships in a structured way, preserving measurements/specifications in tables
+For any visual elements that cannot be represented directly in Markdown, describe them in plain text under a section titled "Visual Notes."
+Maintain numerical precision exactly as shown, preserve all labels and annotations as markdown text, and structure the output for both human and machine readability. Output only the converted Markdown text without any additional commentary or explanations.
 '''

         data_uri = ""
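
For readers who want to try the fallback logic in isolation, here is a minimal sketch of the two steps visible around this hunk: choosing the default prompt when the caller passes `None` or a blank string, and preparing a `data_uri` for the image. Only the `prompt is None or prompt.strip() == ""` check and the `data_uri` variable name come from the diff. `DEFAULT_PROMPT`, `resolve_prompt`, `build_data_uri`, and the JPEG fallback are hypothetical names and choices made for illustration; the rest of `_get_llm_description` is collapsed in this view, so this may not match what the file actually does.

```python
import base64
import mimetypes
from typing import Optional

# Hypothetical stand-in for the long default prompt added in this commit.
DEFAULT_PROMPT = "Analyze the image and extract all visible text in the original language."


def resolve_prompt(prompt: Optional[str]) -> str:
    """Return the caller's prompt, or the default when it is None or blank."""
    # Same condition as in the diff: None or whitespace-only triggers the default.
    if prompt is None or prompt.strip() == "":
        return DEFAULT_PROMPT
    return prompt


def build_data_uri(image_bytes: bytes, extension: str) -> str:
    """Encode raw image bytes as a base64 data URI (an assumed downstream step)."""
    # Guess the MIME type from the file extension, e.g. ".png" -> "image/png".
    content_type, _ = mimetypes.guess_type("placeholder" + extension)
    if content_type is None:
        content_type = "image/jpeg"  # assumed fallback, not taken from the diff
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{content_type};base64,{encoded}"
```

With this shape, a caller-supplied prompt always wins over the default, so anyone who prefers a short caption can still pass their own prompt.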
