Renaming/fixing of content-item schema #42
Comments
I propose that renaming the title property (e.g., from
Addressing these issues alongside the property renaming would improve consistency and reliability in title handling.
I agree, that's a good idea. We first need to figure out how we want to handle each of these situations, though.
I think it should be in the first rebuilt.
But if this information is obtained thanks to linguistic processing, wouldn't it be more logical or practical to make this change in the consolidated rebuilt? The process creating the rebuilt is already quite complex and resource-demanding, as it needs to handle many issue and page documents at the same time. Since titles are already inherited values, I think it makes more sense to fix this either from the start (canonical) or in the consolidated rebuilt.
Yes, sure. The lingproc was just the place where the issue caused a bit of a headache; the fix needs to be done earlier. I am currently collecting stats on the different issues for all processed lingproc items (meaning the stats will be restricted to the supported languages de/fr). I'll put the results on S3 once the processing has finished (which takes a bit of time).
The content-item schema is meant to represent the rebuilt content-items, but it's not very clear or explicit, and it might be outdated. It should be updated and made clearer.
In addition, as pointed out by @simon-clematide, the property "t" is present twice in the schema, once for title and once for token. The two occurrences are not at the same level of the schema hierarchy, so this has not caused too many issues so far, but it is not ideal and should probably be changed.
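To make the duplicate-key problem concrete, here is a minimal sketch that lists every key appearing at more than one location in a nested document. The `item` structure and the names `regions`, `tx`, `key_paths`, and `repeated_keys` are hypothetical and chosen only for illustration; they are not taken from the actual content-item schema.

```python
from collections import defaultdict


def key_paths(node, path=()):
    """Yield (key, path) for every key in a nested dict/list structure."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield key, path + (key,)
            yield from key_paths(value, path + (key,))
    elif isinstance(node, list):
        # List items share one path segment so that siblings are not
        # counted as distinct locations.
        for element in node:
            yield from key_paths(element, path + ("[]",))


def repeated_keys(doc):
    """Return keys that occur at more than one distinct path."""
    locations = defaultdict(set)
    for key, path in key_paths(doc):
        locations[key].add("/".join(path))
    return {k: sorted(v) for k, v in locations.items() if len(v) > 1}


# Hypothetical content item: "t" is a title at the top level
# and a token list deeper in the hierarchy.
item = {
    "id": "example-ci-1",
    "t": "Example title",
    "regions": [
        {"t": [{"tx": "Example"}, {"tx": "token"}]},
    ],
}

print(repeated_keys(item))
```

Running a check like this over a sample of rebuilt items could help confirm which keys are overloaded before deciding on new names.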
Action points for this issue are thus: