Simplifying README, adding library mode example #393

Draft
wants to merge 3 commits into
base: main
from

Conversation

randerzander
Collaborator

No description provided.

copy-pr-bot bot commented Jan 30, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@randerzander randerzander marked this pull request as draft January 30, 2025 22:08
@sosahi sosahi self-requested a review January 30, 2025 22:12
@@ -37,322 +37,137 @@ A service that:
- Runs a static pipeline or fixed set of operations on every submitted document.
- Acts as a wrapper for any specific document parsing library.

For production level performance and scalability, we recommend deploying the pipeline and supporting NIMs via docker-compose or kubernetes (via the provided helm charts).
Collaborator

Suggested change
-For production level performance and scalability, we recommend deploying the pipeline and supporting NIMs via docker-compose or kubernetes (via the provided helm charts).
+For production-level performance and scalability, we recommend that you deploy the pipeline and supporting NIMs by using Docker Compose or Kubernetes. You can use the helm charts provided.

Collaborator

Also, can you link from here to the helm charts?


## Prerequisites
For hardware and software pre-requisites for container and kubernetes (helm) based deployments, please find [our comprehensive doc site](https://docs.nvidia.com/nv-ingest/user-guide/getting-started/prerequisites/).
Collaborator

Suggested change
-For hardware and software pre-requisites for container and kubernetes (helm) based deployments, please find [our comprehensive doc site](https://docs.nvidia.com/nv-ingest/user-guide/getting-started/prerequisites/).
+For hardware and software requirements for container- and Kubernetes-based deployments, refer to [Prerequisites](https://docs.nvidia.com/nv-ingest/user-guide/getting-started/prerequisites/).


### Software
To facilitate an easier evaluation experience, and for small scale (<100 PDFs) workloads, you can use our "library mode" setup, which depends on NIMs either already self hosted, or, by default, NIMs hosted on build.nvidia.com.
Collaborator

Suggested change
-To facilitate an easier evaluation experience, and for small scale (<100 PDFs) workloads, you can use our "library mode" setup, which depends on NIMs either already self hosted, or, by default, NIMs hosted on build.nvidia.com.
+For small-scale workloads, such as workloads of fewer than 100 PDFs, you can use library mode setup. Library mode setup depends on NIMs that are already self-hosted, or, by default, NIMs that are hosted on build.nvidia.com.

randerzander and others added 2 commits January 31, 2025 11:01
Co-authored-by: nkmcalli <[email protected]>
Co-authored-by: nkmcalli <[email protected]>
> From the bash prompt above, you can run nv-ingest-cli and Python examples described below.

### Step 3: Ingesting Documents
Note: To use library mode with self hosted NIMs: TODO
Collaborator

TODO


> [!IMPORTANT]
> `extract_tables` controls extraction for both tables and charts. You can optionally disable chart extraction by setting `extract_charts` to false.
5. Start ingesting and retrieving documents:
Collaborator

Generally seems fine, although I do think including the associated CLI command would be a good idea for anyone who wants a code-free approach.
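
For reference, the `extract_tables` / `extract_charts` note in the hunk above can be read as a small piece of flag logic. The sketch below is only an illustration of that reading: `resolve_extraction_flags` is a hypothetical helper and not part of nv-ingest, and it assumes that an unset `extract_charts` follows `extract_tables`. The note does not say what happens when `extract_tables` is false and `extract_charts` is explicitly true; the sketch simply honors the explicit value in that case.

```python
from typing import Optional


def resolve_extraction_flags(
    extract_tables: bool, extract_charts: Optional[bool] = None
) -> dict:
    """Hypothetical helper, not part of nv-ingest.

    Models the README note: `extract_tables` drives both table and chart
    extraction, and `extract_charts=False` opts out of charts only.
    """
    # Assumption: when extract_charts is left unset, it follows extract_tables.
    charts = extract_tables if extract_charts is None else extract_charts
    return {"tables": extract_tables, "charts": charts}


if __name__ == "__main__":
    # Tables and charts together (extract_charts left unset).
    print(resolve_extraction_flags(True))
    # Tables only: chart extraction explicitly disabled.
    print(resolve_extraction_flags(True, extract_charts=False))
```

If the real option handling differs from this assumption, the README wording is the place to make that explicit.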

3 participants