Provide breadcrumbs / code hooks for "partial re-ingestion" of content #152
Comments
@jchill-git I've been working on this today and will hopefully circle back tomorrow. Ping me here or on Slack if there is anything else I can help with as far as getting the texts / alignments in.
If you have been able to get your "version" into the database, this might be a helpful code snippet for getting out the tokens:
```python
from pathlib import Path

from scaife_viewer.atlas.parallel_tokenizers import tokenize_text_parts

outdir = Path(".")  # the current directory, e.g. backend/
version_urn = "urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:"  # replace with your version URN
outf = tokenize_text_parts(outdir, version_urn)  # writes out to urnctsgreeklittlg0012tlg001perseus-grc2.csv
```
Snippet for that CSV file: https://gist.github.com/jacobwegner/3a96e1763b7bc22d827680db1351a377
This would give you a CSV that could be useful in a dataframe that has calculated that
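For anyone who wants to peek at that CSV without pulling in a dataframe library, here is a minimal sketch using only the standard library. The helper names (`load_token_rows`, `count_by`) are made up for illustration, and it assumes nothing about the columns beyond the file having a header row; check the gist above for the actual column names.

```python
import csv
from collections import Counter
from pathlib import Path


def load_token_rows(csv_path):
    """Read the tokens CSV into a list of dicts keyed by the header row."""
    with Path(csv_path).open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def count_by(rows, column):
    """Count rows per distinct value of `column`.

    `column` must be a header that really exists in the CSV; this sketch
    does not assume any particular column name.
    """
    return Counter(row[column] for row in rows)
```

Something like `count_by(load_token_rows("urnctsgreeklittlg0012tlg001perseus-grc2.csv"), "<some column>")` would then give a quick per-value token count, where the column is whichever one identifies the text part.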
I'll keep working on the backend branch and provide updates on my progress on this issue.
My commit in scaife-viewer/backend@f4b4ecf wasn't working for the Arabic content in the Codespace today; need to take a closer look.
The other thing I'd like to capture here and add a hook / to documentation is how the If we made it an environment variable, that would allow folks to use a subset of the data in a
./manage.py prepare_atlas_db --force
Files could be worked on from within Once the file was ready for promotion to
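A rough sketch of the environment-variable idea, in case it helps the discussion. The variable name `EXAMPLE_ATLAS_DATA_DIR` and the helper `resolve_data_dir` are hypothetical and are not existing settings in the backend; the point is only that an override falls back to the default directory when unset.

```python
import os
from pathlib import Path

# Hypothetical variable name, used only for this sketch.
DATA_DIR_ENV_VAR = "EXAMPLE_ATLAS_DATA_DIR"


def resolve_data_dir(default_dir, environ=None):
    """Return the data directory, preferring the environment override if set.

    `environ` defaults to os.environ; passing a plain dict makes this easy
    to try out without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    override = environ.get(DATA_DIR_ENV_VAR, "").strip()
    return Path(override) if override else Path(default_dir)
```

If the settings module read its data directory through something like this, a working copy holding only a few files could be selected by exporting the variable before running `./manage.py prepare_atlas_db --force`.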
The current ingestion process is idempotent; it assumes that we're always building up data from scratch, because that's what we do when we deploy the site.
During local development, I have a few shortcuts that I use to give a tighter "feedback loop" when working on a particular annotation.
I'd like to have this to support @jchill-git, @gregorycrane and others who may be doing a lot more content previewing / editing than I have been in the past. It will also help us to be better at incremental updates to content when content moves out of this "code" repo and into content repos like https://github.com/PerseusDL/canonical-greekLit or https://github.com/scaife-viewer/ogl-pdl-annotations.
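One possible breadcrumb for partial re-ingestion, sketched under assumptions: keep a small manifest of content hashes and only re-ingest files whose hash changed since the last run. This is not how the current pipeline works; `changed_files` and the manifest layout are hypothetical, and the actual re-ingestion step for each changed file is left out.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def changed_files(data_dir, manifest_path):
    """List files under data_dir that are new or modified since the last call.

    The manifest (a JSON map of relative path -> digest) is rewritten on
    every call, so it should live outside data_dir.
    """
    data_dir = Path(data_dir)
    manifest_path = Path(manifest_path)
    previous = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    current = {
        p.relative_to(data_dir).as_posix(): file_digest(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }
    changed = [name for name, digest in current.items() if previous.get(name) != digest]
    manifest_path.write_text(json.dumps(current, indent=2))
    return changed
```

This only reports new or modified files; deleted files drop out of the manifest silently, so a real hook would need to handle removals as well.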