I'm working on a tool to find the top x books from Project Gutenberg matching a search term or topic, and I'm having trouble with false positives: books that appear in both the Project Gutenberg catalog and the LibriVox catalog, but that don't match when I search LibriVox by their title. For example, when I run the tool for all shelves that contain the substring `children`, the first result is "A Christmas Carol in Prose; Being a Ghost Story of Christmas by Dickens, Charles (https://www.gutenberg.org/ebooks/46.html.images)". However, when I search LibriVox for this title, I don't get any results, I think because it's listed as "A Christmas Carol" rather than "A Christmas Carol in Prose; Being a Ghost Story of Christmas".
I would like to request a new feature: a new search param in the URL called `projectgutenbergid`. I would be able to make a request with that param and get a response like:
{
  "books": [
    {
      "id": "140",
      "title": "Christmas Carol",
      "description": "A classic tale of what comes to those whose hearts are hard. In a series of ghostly visits, Scrooge visits his happy past, sees the difficulties of the present, views a bleak future, and in the end amends his mean ways. (Summary written by Kristen McQuillin)",
      "url_text_source": "https://www.gutenberg.org/etext/46",
      "language": "English",
      "copyright_year": "1843",
      "num_sections": "5",
      "url_rss": "https://librivox.org/rss/140",
      "url_zip_file": "https://www.archive.org/download/A_Christmas_Carol/A_Christmas_Carol_64kb_mp3.zip",
      "url_project": "https://en.wikipedia.org/wiki/A_Christmas_Carol",
      "url_librivox": "https://librivox.org/a-christmas-carol-by-charles-dickens/",
      "url_other": "",
      "totaltime": "3:14:29",
      "totaltimesecs": 11669,
      "projectgutenbergid": "46",
      "authors": [
        {
          "id": "91",
          "first_name": "Charles",
          "last_name": "Dickens",
          "dob": "1812",
          "dod": "1870"
        }
      ]
    }
  ]
}
I was looking at the LibriVox recording details page (for example this one), and I see that under "links" there is an "online text" link pointing to Project Gutenberg, which iiuc means we have the data in the database to support such an option. It might not be 100% accurate, since the LibriVox folks may have linked to a different version of the online text. (edit: also, it's already in the API as `url_text_source` 🤣)
If the project is willing to support this feature, I'd be interested in contributing.
Well, we don't have much in terms of supporting features, and I can't say how much of a priority it would be for other volunteers, but let's talk.
Just to be sure this angle is covered: I see your edit. Is there any chance you could reasonably parse and reconstruct the url_text_source to search by? To the best of my (limited) knowledge, Gutenberg IDs don't exist as separate objects in the database, so they would need to be parsed out on one end or the other. 😄
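For what it's worth, the parsing side could be fairly small. Here is a minimal sketch in Python, assuming the Gutenberg link always has one of the two shapes that appear in this thread (`.../etext/46` or `.../ebooks/46.html.images`). The helper names `gutenberg_id` and `find_by_gutenberg_id` are made up for illustration and are not part of the LibriVox API or codebase.

```python
import re
from typing import Optional

# Assumption: Gutenberg links look like .../etext/<id> or .../ebooks/<id>...,
# the two URL shapes seen in this thread. Other shapes are not handled.
_GUTENBERG_ID = re.compile(r"gutenberg\.org/(?:etext|ebooks)/(\d+)")


def gutenberg_id(url: Optional[str]) -> Optional[str]:
    """Return the Gutenberg ebook number found in url, or None if there is none."""
    match = _GUTENBERG_ID.search(url or "")
    return match.group(1) if match else None


def find_by_gutenberg_id(books, wanted):
    """Keep the records whose url_text_source points at the wanted Gutenberg ID.

    books is a list of dicts shaped like the entries under "books" in the
    API response; wanted may be an int or a string.
    """
    wanted = str(wanted)
    return [
        book for book in books
        if gutenberg_id(book.get("url_text_source")) == wanted
    ]
```

With something like this, a client could fetch candidate records by author or a loose title search and then filter them locally by ID. Records whose "online text" link points at a different edition, or somewhere other than Gutenberg, would still be missed.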