Commit

chore: rename to ocr-llm
arshad-yaseen committed Nov 6, 2024
1 parent 6724059 commit 2cad42d
Showing 16 changed files with 930 additions and 3,527 deletions.
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE.md
@@ -7,7 +7,7 @@ Please fill out this template with all the relevant information so we can
understand what's going on and fix the issue.
-->

- `ocra` version:
- `ocr-llm` version:
- `node` version:

Relevant code or config
34 changes: 17 additions & 17 deletions README.md
@@ -1,4 +1,4 @@
# Ocra
# OcrLLM

Fast, ultra-accurate text extraction from images and PDFs with structured markdown output.

@@ -28,7 +28,7 @@ Fast, ultra-accurate text extraction from images and PDFs with structured markdo

## Prerequisites

Ocra requires GraphicsMagick and Ghostscript to be installed on your system for PDF processing.
OcrLLM requires GraphicsMagick and Ghostscript to be installed on your system for PDF processing.

### macOS

@@ -46,33 +46,33 @@ Download and install the following:
## Installation

```bash
npm install ocra
npm install ocr-llm
```

## Quick Start

```typescript
import {Ocra} from 'ocra';
import {OcrLLM} from 'ocr-llm';

const ocra = new Ocra({
const ocrllm = new OcrLLM({
provider: 'openai',
key: 'your-api-key',
});

// Extract text from an image
const imageResult = await ocra.image('path/to/image.jpg');
const imageResult = await ocrllm.image('path/to/image.jpg');
console.log(imageResult.content);

// Process a PDF document
const pdfResults = await ocra.pdf('path/to/document.pdf');
const pdfResults = await ocrllm.pdf('path/to/document.pdf');
pdfResults.forEach(page => {
console.log(`Page ${page.page}:`, page.content);
});
```
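
Because `pdf()` resolves to one result per page, callers often want a single markdown string. The sketch below shows one way to do that, assuming each result exposes only the `page` and `content` fields used in the Quick Start. The `PageLike` type and the `joinPages` helper are hypothetical names for illustration, not part of the package.

```typescript
// Hypothetical minimal shape, inferred from the Quick Start usage;
// the library's real PageResult type may carry more fields.
interface PageLike {
  page: number;
  content: string;
}

// Sort a copy by page number and join the pages into one markdown string.
function joinPages(pages: PageLike[]): string {
  return pages
    .slice()
    .sort((a, b) => a.page - b.page)
    .map(p => `<!-- page ${p.page} -->\n${p.content.trim()}`)
    .join('\n\n');
}

// Stand-in data; in practice this would be the array returned by ocrllm.pdf(...).
const samplePages: PageLike[] = [
  {page: 2, content: 'Second page\n'},
  {page: 1, content: '# Title'},
];
console.log(joinPages(samplePages));
```

Sorting a copy leaves the original array untouched, and the HTML comment markers keep page boundaries visible without affecting rendered markdown.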

## Input Sources

Ocra accepts multiple input formats:
OcrLLM accepts multiple input formats:

| Input Type | Example |
| -------------- | ---------------------------------------------------------------- |
@@ -83,19 +83,19 @@ Ocra accepts multiple input formats:

## API Reference

| Method | Description | Parameters | Return Type | Details |
| ------------------- | --------------------------- | ------------------------------------------------------------------------------------------------------ | ----------------------- | ---------------------------------------------------------------------------------------------------------- |
| `new Ocra(config)` | Creates a new Ocra instance | `config`: Object containing:<br>- `provider`: OCR provider ('openai')<br>- `key`: API key for provider | `Ocra` | Initializes Ocra with specified provider and credentials |
| `ocra.image(input)` | Processes a single image | `input`: File path, URL, base64 string, or Buffer | `Promise<ImageResult>` | Returns object containing:<br>- `content`: Extracted text in markdown<br>- `metadata`: Processing metadata |
| `ocra.pdf(input)` | Processes a PDF document | `input`: File path, URL, base64 string, or Buffer | `Promise<PageResult[]>` | Returns array of results with:<br>- Page number<br>- Content<br>- Metadata |
| Method | Description | Parameters | Return Type | Details |
| --------------------- | ----------------------------- | ------------------------------------------------------------------------------------------------------ | ----------------------- | ---------------------------------------------------------------------------------------------------------- |
| `new OcrLLM(config)` | Creates a new OcrLLM instance | `config`: Object containing:<br>- `provider`: OCR provider ('openai')<br>- `key`: API key for provider | `OcrLLM` | Initializes OcrLLM with specified provider and credentials |
| `ocrllm.image(input)` | Processes a single image | `input`: File path, URL, base64 string, or Buffer | `Promise<ImageResult>` | Returns object containing:<br>- `content`: Extracted text in markdown<br>- `metadata`: Processing metadata |
| `ocrllm.pdf(input)` | Processes a PDF document | `input`: File path, URL, base64 string, or Buffer | `Promise<PageResult[]>` | Returns array of results with:<br>- Page number<br>- Content<br>- Metadata |

## Error Handling

Ocra includes built-in error handling with detailed error messages and automatic retries for transient failures.
OcrLLM includes built-in error handling with detailed error messages and automatic retries for transient failures.
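
Under TypeScript's `strict` settings (4.4 and later), the variable in a `catch` clause is typed `unknown`, so an `error.message` access needs narrowing before it compiles. A minimal sketch of that narrowing follows; `errorMessage` is a hypothetical helper, not something the package exports, and the thrown error is a stand-in for a failed `ocrllm.image(...)` call.

```typescript
// Hypothetical helper: turn any thrown value into a printable message.
function errorMessage(error: unknown): string {
  // Error instances carry a message; anything else is stringified.
  return error instanceof Error ? error.message : String(error);
}

try {
  // Stand-in for a failing call such as ocrllm.image(...).
  throw new Error('simulated failure');
} catch (error) {
  console.error('Processing failed:', errorMessage(error));
}
```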

```typescript
try {
const result = await ocra.image('path/to/image.jpg');
const result = await ocrllm.image('path/to/image.jpg');
} catch (error) {
console.error('Processing failed:', error.message);
}
@@ -109,6 +109,6 @@ try {

## Contributing

For guidelines on contributing, please read the [contributing guide](https://github.com/arshad-yaseen/ocra/blob/main/CONTRIBUTING.md).
For guidelines on contributing, please read the [contributing guide](https://github.com/arshad-yaseen/ocr-llm/blob/main/CONTRIBUTING.md).

We welcome contributions from the community to enhance Ocra's capabilities and make it even more powerful. ❤️
We welcome contributions from the community to enhance OcrLLM's capabilities and make it even more powerful. ❤️
4 changes: 2 additions & 2 deletions package.json
@@ -1,5 +1,5 @@
{
"name": "ocrllm",
"name": "ocr-llm",
"version": "0.1.0",
"description": "Fast, ultra-accurate text extraction from any image or PDF, even challenging ones, with structured markdown output.",
"main": "./build/index.js",
@@ -43,7 +43,7 @@
],
"repository": {
"type": "git",
"url": "git+https://github.com/arshad-yaseen/ocra.git"
"url": "git+https://github.com/arshad-yaseen/ocr-llm.git"
},
"maintainers": [
{