Extract hyperlinks found in PDF and add them to a new field #44

khalidhibrahim · 2024-08-07T15:05:27Z

Problem

It looks like hyperlinks aren't indexed at the moment: if a PDF contains a link with the text "click here", only the text "click here" is indexed, so there's no way to search for the actual URL.

Solution

In this PR, I extract the links (using PdfPig's Page.GetHyperlinks()) and save them in a new index field. I thought this might be useful.
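For illustration, the extraction step could look roughly like the sketch below. It is not the code from this PR. The class name `PdfLinkExtractor`, the helpers `NormalizeLinks` and `ToFieldValue`, and the field name `links` are all hypothetical. The only PdfPig pieces assumed are `PdfDocument.Open`, `GetPages()`, `Page.GetHyperlinks()`, and the `Uri` property on each returned hyperlink.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using UglyToad.PdfPig;

public static class PdfLinkExtractor
{
    // Hypothetical field name; the PR text does not say what the new field is called.
    public const string LinksFieldName = "links";

    // Collects the distinct, non-empty target URIs of every hyperlink in the PDF.
    public static IReadOnlyList<string> ExtractLinks(string pdfPath)
    {
        using (PdfDocument document = PdfDocument.Open(pdfPath))
        {
            // NormalizeLinks materializes the list before the document is disposed.
            return NormalizeLinks(
                document.GetPages()
                        .SelectMany(page => page.GetHyperlinks())
                        .Select(link => link.Uri));
        }
    }

    // Trims each URI, drops null or blank entries, and removes exact duplicates.
    public static IReadOnlyList<string> NormalizeLinks(IEnumerable<string> uris)
    {
        return uris
            .Where(uri => !string.IsNullOrWhiteSpace(uri))
            .Select(uri => uri.Trim())
            .Distinct(StringComparer.Ordinal)
            .ToList();
    }

    // Joins the URIs into one space-separated string for storing in a single index field.
    public static string ToFieldValue(IEnumerable<string> links)
    {
        return string.Join(" ", links);
    }
}
```

The result of `ToFieldValue(ExtractLinks(path))` would then be stored under the new field alongside the extracted text. Whether the field should be tokenized or stored as-is depends on how the index is expected to match URLs, which this sketch leaves open.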

links extracted and added to a new index field

6566e71

khalidhibrahim changed the title ~~Extract hyper links found in PDF and add them to a new field~~ Extract hyperlinks found in PDF and add them to a new field Aug 7, 2024
