Implement Rich Indexing Specification #3369
Comments
What happens if there are two instances of adding an index marker of the same topic/subtopic on the same page? Currently, both would receive an anchor of the same (BTW, I don't anticipate any duplicate collisions with other minted |
The attribute names of |
Invalid? No. Non-standard? Sure. The standard sucks. I don't think that existed as a standard the last time I did attribute injection. Time for new tennis balls on the walker... Updating the original accordingly |
We could probably find a way to only emit the anchor for the first one and just ignore the duplicate. |
Alternately, we can track and instance mark, or we can consider it user error. Dunno what is entirely best. |
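A minimal sketch of the "only emit the anchor for the first one" option, assuming markers arrive as `(page_number, slug)` pairs in document order. The function name and the `p{page}_{slug}` id format are illustrative assumptions modelled on the anchor form in the issue body, not the project's actual code.

```python
def assign_anchor_ids(markers):
    """Return one anchor id per marker, or None for a repeat on the same page.

    `markers` is a list of (page_number, slug) tuples in document order.
    The id format "p{page}_{slug}" is assumed for illustration only.
    """
    seen = set()
    ids = []
    for page, slug in markers:
        anchor_id = f"p{page}_{slug}"
        if anchor_id in seen:
            # Same topic already anchored on this page: skip the duplicate.
            ids.append(None)
        else:
            seen.add(anchor_id)
            ids.append(anchor_id)
    return ids
```

The same topic on a different page still gets its own anchor; only same-page repeats are dropped. Tracking an instance counter instead would be the alternative mentioned in this comment.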
It doesn't pass W3C's validation testing. The |
Some more comments:
I like the idea of generating a header based on the label. Should we support multi-word labels?
This would allow for no label as well as multi-word labels without ambiguity.
Or is this something we can already infer logically using our current syntax somehow? |
I wasn't sure at the time but thought it would be better to have them present, then decide they weren't needed than the other way around. If we are forced for (currently unguessable ) reasons to parse the
I mostly constructed it that way because I could not recall ever seeing more than two depths in an RPG index. If we did that, I think we'd want to remove the "This subtopic can belong to multiple topics" code and force 1:1:1... for Topics and subtopic children.
I don't see a problem with it.
OOh. I like this.
I don't think we can infer it reasonably, but it is something we could see about adding as another delimited column.
Maybe? |
index names
Alternative syntax might be When you have lots of indexing entries it's nicer to have the consistent string be at the front for readability.
cross references
With the "see also", do note that there are usually two kinds: "see also topic", and "see topic". This is indexing convention. (I know people that index books as their profession). |
A couple of random thoughts.
|
The See/See Also implementation can wait for a future PR .. it would be prudent to reserve a syntax though, |
For the reservation, I suggest appending examples: See Also Only Both |
An index entry wouldn't have both See and See also. If an index entry does have a preferred term (i.e. See ...), then it is that referred index entry that would have the related concepts See also cross references. |
The documentation I saw while doing a quick "what are these concepts" search to refresh my brain on the specific meaning did both, though I think that may have been a case of poor examples. This is clearer, from https://ugapress.org/resources/for-authors/indexing-guidelines/:
With that in mind, I suggest the following See: postfix This limits the prohibited characters from the Subtopic and postfix to a single character which I feel is better than three separators. Examples. See: See Also: See Also Under: |
Edit: Cleaned up for clarity (hopefully?)
Index entries are as atomic as possible, which means each index anchor will need a distinct Markdown element. No more creating anchors for multiple topics in the same line. Then, determining
Page anchors are created for See
Also,
See also
See under
See under also
Then users don't need to know the ins and outs of index formatting to select the correct option of |
I still prefer a single
In any case, I don't think mirroring the
I guess that should technically work but I would leave that up to the user if they want to add that step. I wouldn't make it the default.
I agree. I think
With the guidelines you posted, I think we can logic it out automatically. |
I agree having things line up does help legibility. Though I will point out we can use spacing to line things up too kind of like a table, in which case a post-label I think ends up looking more legible overall, since it allows a default "blank" label to align with all others: Post-label
vs Pre-label
vs Pre-label aligned
In these examples, the post-label looks more legible to me if you are working with multiple labels nearby (even trying to line up the last one as neatly as possible). But this may just be a personal style preference. Maybe we can do a poll on the gitter chat. |
Hmm .. whether a cross reference shows as a See or a See also shouldn't be determined by whether the referenced topic has subtopics. They should be determined by the author, as suits their intended purposes. One is a cross reference that says "this concept is more commonly known as X, go look there", and the other says "this concept is interesting in its own right, but also go look at this other related concept for more context". Is it possible that an author might use a See also when they mean See (and vice versa)? Yes. That's on them. |
This is correct, and indeed, consistent with the rules I listed above. Subtopics influence only whether "see under" or "see also under" applies.
This is not correct. Each has a specific defined use case according to established rules.
This is correct conceptually, but there are actual rules that define these more precisely: "See: Follows an entry with no locators of its own—it simply refers the reader to another part of the index." "See also: Follows an entry with locators; it refers the reader to additional information in another entry."
But it shouldn't need to be on them if we just generate the correct version for them, which we can do. We have a calculator and a clear set of rules. If the user gives us 1 + 2 we should give them 3. If they really want to do their own thing and change it to 5 later, they can, but most people would probably want our calculator to give them the right answer to start with. |
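To make the quoted rule concrete, here is a small sketch of choosing the label from whether the entry has locators of its own. It only illustrates the rule as quoted in this comment; whether the generator should apply it automatically is the open question in this thread. The function name and the output format are assumptions.

```python
def format_entry(topic, pages, targets):
    """Format one index line, choosing "see" vs "see also" from the locators.

    Rule as quoted above: "see" follows an entry with no locators of its own,
    "see also" follows an entry that has locators.
    """
    label = "see also" if pages else "see"
    line = ", ".join([topic] + [str(p) for p in sorted(pages)])
    if targets:
        line += f", {label} " + "; ".join(targets)
    return line
```

For example, `format_entry("Defense", [], ["Armor"])` gives `Defense, see Armor`, while `format_entry("Armor", [2], ["Encumbrance"])` gives `Armor, 2, see also Encumbrance`.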
Yes. I'm referring to the conventional use of the types of references as being the "intended purpose", to which the author should conform. I'm not suggesting the author determines the purpose.
No. These rules define (precisely) how the two uses are effected. The causality is purpose-of-reference → index entries. A See entry that references an index entry that does not have a page number is an error. (The princess is in another castle. Which castle? Oh, it doesn't exist.) One we could likely fix by minting an index entry for the referred item, using the page the syntax was found on. (Technically fraught, as it would also be possible to write up a whole bunch of See references as a list of synonyms independent of the actual content. Indexes were traditionally constructed separate from the actual content.) And then there's also the edge case of
(no comment) |
Right. And with our syntax as I defined above, this should not be possible. There is no way to create a "see" entry that points to something that doesn't exist. The correct conventions are baked in to the syntax.
Thus, I would amend this to: Syntax → purpose-of-reference → index entries. The whole point being that if we follow the rules I outlined above, our syntax will naturally enforce correct purpose of reference, and hence correct indices. We can take advantage of that to save users the work of figuring that out themselves. |
This is a no-value added requirement. |
correct so far.
This is absolutely not how the code works at present and I see no value in this addition. Code should, at most, warn the user about unfound/incorrect/incomplete references and make zero guesses about layout intent.
Gods no.
No. This is not a wizard. |
It is not a requirement at all. I'm not sure where you got that impression. Sometimes adding spaces to make things line up just looks nicer. |
Hence my proposal above to add this new functionality. I am suggesting a change to how the code works.
Fortunately, our syntax doesn't suffer from any of these issues, so there is no need to warn/guess anything. Using Reusing my analogy from before: We have a clear set of rules and provide our users with a calculator. If a user gives us 1 + 2, should we not respond with 3? Seems a lot more valuable than responding "I don't know what answer you wanted."
This is exactly what it is doing. It is using multiple separate entries to do this. I suspect you may have misread something here.
Are we not creating an auto-generator for indices? We have all the tools at our disposal to make this work automatically. Why not? |
If I understand the preceding proposal correctly, this is because the instance of the indexing syntax that calls for a See reference will also insert an index entry for the page the See indexing syntax occurs on if there isn't a paged index entry yet. That's an unsafe assumption. I could, for example, get to the end of writing my text, navigate to the page the index will be generated onto, and add a bunch of "X, see Y" syntax instances as an attempt to ensure I've covered off the likely synonyms. If the text regarding topic "Y" (which had its own direct indexing syntax) was later deleted .. then I certainly don't want to have "X, see Y" point to an index entry for "X" with a page number of the index page itself. The alternative workflow is that I have to go hunt down an actual page containing X to add the "see X" indexing syntax. Given that there might be many pages that are indexed for "X" then the "see X" might end up on any of them. I certainly don't want to be putting them on each of them just to be sure (and to guard against later text deletion). |
Okay. I read what you are suggesting backwards. I don't care for this idea, definitions should go in one direction.
Yes, I did.
Index Markdown entries should be as atomic as possible, IMO, and that means you'll have one markdown entry for the distinct entry being cross-referenced and another for the entry that cross-references it. |
Ok, This might be where the disconnect is coming from. Let me try to clarify how I'm imagining this syntax working a little more. When you write an index marker Example:
This would generate:
Armor, 2, see also Encumbrance
    shields, 3
Carry Capacity, see Encumbrance
Daggers, see under Weapons
Defense, see Armor
Dual-wielding, see Weapons : Daggers
Equipment, 4 see also Armor; Weapons
Horses, 5
Saddle bags, see Encumbrance
Shields, see under Armor
Spears, see under Weapons
Swords, see under Weapons
Weapons, 1 see also Encumbrance; Repairs and Upgrades
    daggers, 2
    spears, 2
    swords, 1
Yeah, you wouldn't do that with this approach, since that would place a bunch of index entries pointing to the index page itself. You would instead add synonyms as relevant to a given page. For example, on page 4 above, I decided "Carry Capacity" would be a term someone might search for to find this paragraph about encumbrance. On page 5, I decided that "Saddle Bags" would be a likely term someone would search to reach that paragraph. So both "Carry Capacity" and "Saddle Bags" will now point to "see Encumbrance".
Luckily you wouldn't need to put the I don't expect this to convince you, but at a minimum I would hope this puts us on the same page about what I am proposing, so we can continue the discussion from the same understanding. "I think we should buy a house." "I don't think we should buy a horse for these reasons." "No, a house." I also totally re-wrote my earlier proposal to try to clear this up. #3369 (comment) |
My proposed syntax is going only in one direction. Topics on left, synonyms (or references) on the right. The reference only points to the Topic and not vice-versa. What am I missing?
This is already part of my proposal, unless I am again totally missing something.
A distinct page anchor for only one Topic, plus an optional cross-reference that should point back to it. If you also want a page anchor for
I also totally re-wrote my earlier proposal to try to clear some of this up. #3369 (comment) |
That is, however, a very likely workflow, one which this tool actively interferes with. When I'm looking at the index itself, either to review or even to actually use it ... that is when I would realise that the term I have for the concept I'm looking for does not appear in the index, and that a See cross-reference is called for. And I'd rather just add it in as a synonym cross reference right there, vs navigating to one of many possible pages the preferred term is indexed on and inserting there. Especially if there's a risk that that section of text might get revised/deleted (removing the See cross-reference, despite the concept still being indexed on multiple other pages).
Only sufficient if there is no risk of that particular section of the text (with attached cross references) getting removed.
Again, there might be multiple instances of the preferred term indexed across multiple pages .. and we only need one instance of the see cross-reference .. so which page has the cross reference index marker? And when I'm editing text that does have that cross-reference index marker, and I know that the preferred term is indexed on multiple other pages .. is this instance a duplicate I can safely delete (e.g. if the text is revised and is no longer referencing the preferred term concept)? The site where I'm likely to search for particular terms is the index. That is what an index is for. Concept → term → page number | redirect to preferred term. Inserting the synonym redirections onto individual pages is fraught with either fragility (one synonym instance might get deleted despite the referent concept also appearing on alternative pages), or redundant duplication. The tools should support the workflow. The workflow shouldn't have to bend to suit the tool. |
I can see I am still not being understood on several points, so I apologize that I am not able to communicate my idea in an effective way. Rather than continue trying to clarify things, I will step back from this one for now and focus on some of the other PRs for a while until some other proposal comes through. |
This is what I'm working with.
#[IndexMarker][LocationMarker]
Where:
IndexMarker is [IndexNameMarker][Topic]/[Subtopic]
IndexNameMarker is
Topic is a string.
LocationMarker is
SeeCrossReference is |[IndexMarker]
SeeAlsoCrossReference is ||[IndexMarker]
SeeUnderCrossReference is |+[IndexMarker]
SeeAlsoUnderCrossReference is ||+[IndexMarker]
If a Cross Reference's IndexAddress references the same index, the index will be omitted from the formatted address.
If the IndexAddress of a SeeUnderCrossReference or SeeAlsoUnderCrossReference points to a subtopic that does not exist on the topic, or the topic has fewer than 2 subtopics, it will be redirected to the topic and reported to the user as an inline comment.
Styling/ordering of the cross-references is a next-stage topic, and if it can't be handled purely by CSS then we want to consider having more than one index generator function so that more than one style can be generated. IMO, use the Chicago manual first.
Reasoning: An index entry should be as minimal as possible for default use cases. |
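A rough sketch of how the four cross-reference prefixes and the redirect rule above might be handled. It covers only the `|`, `||`, `|+` and `||+` prefixes and the `Topic/Subtopic` split; index-name and location handling are left out because those parts of the grammar are not shown in this comment. All function names and the `{topic: set_of_subtopics}` index shape are assumptions for illustration.

```python
# Longest prefixes first, so "||+" is not mistaken for "||" or "|".
PREFIXES = [
    ("||+", "see also under"),
    ("|+", "see under"),
    ("||", "see also"),
    ("|", "see"),
]

def classify_cross_reference(text):
    """Return (kind, target) for a cross-reference such as "||+Weapons/daggers"."""
    text = text.strip()
    for prefix, kind in PREFIXES:
        if text.startswith(prefix):
            return kind, text[len(prefix):].strip()
    raise ValueError(f"not a cross-reference: {text!r}")

def resolve_under_target(topic, subtopic, index):
    """Apply the redirect rule for "see under" style references.

    `index` maps each topic to the set of its subtopics. If the subtopic is
    missing, or the topic has fewer than 2 subtopics, the reference falls back
    to the topic and a note is returned for the user.
    """
    subtopics = index.get(topic, set())
    if subtopic in subtopics and len(subtopics) >= 2:
        return (topic, subtopic), None
    note = f"cross-reference to '{topic}/{subtopic}' redirected to '{topic}'"
    return (topic, None), note
```

Checking the longest prefix first is the only ordering constraint; a regular expression with alternation would work equally well. What happens when the topic itself does not exist is not addressed here.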
Looks good. I'm assuming white space (tabs and spaces) is permitted on either side of the separator symbols (but not within, e.g. Also, the Similarly, the Lastly, the wording for SeeUnder and SeeAlsoUnder references should be clear that they both can only point to topics, and not subtopics. And then that if a) the topic does not exist, or b) the topic has fewer than 2 subtopics, then the error will be reported (via a |
I still think doubling of the pipe character |
Implements `#Index Name:Topic/Subtopic<Crossreference>` per naturalcrit#3369 (comment)
We sure we don't want to just deduce which one to use (see or see also) by looking if the cross reference has a page anchor of its own? Then we only need one | for both cases. |
Whether a "see X" vs a "see also X" cross-reference is called for is entirely an editorial decision. It can not be deduced.
At best you could try to short-circuit forwarding chains (e.g. Philtres → Elixirs → Potions). |
These examples can all be deduced accurately. Elixirs has no page number of its own, so it uses "see". Oils and Potions do have page numbers where further information can be found, so they use "see also". Not an editorial decision (never was, still not sure where this idea comes from). It follows a defined rule that every style guide I can find is pretty explicit on. I don't understand the eagerness to ignore the rules when we can use them to avoid errors so easily. Using "see also" with Elixirs would be an error. Using "see" with potions or oils would be an error as well. We have the means to avoid these errors, with a bonus of simplifying the syntax. Is that such a bad thing? I see only upsides. |
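For reference, short-circuiting a forwarding chain could look something like the sketch below. It assumes the "see" redirections have already been collected into a plain `{term: preferred_term}` mapping; that mapping and the function name are hypothetical, not part of the current code.

```python
def resolve_see_chain(term, see_map):
    """Follow "see" redirections until a term with no further redirect is reached.

    Raises ValueError on a circular chain rather than looping forever.
    """
    visited = [term]
    while term in see_map:
        term = see_map[term]
        if term in visited:
            chain = " -> ".join(visited + [term])
            raise ValueError(f"circular see reference: {chain}")
        visited.append(term)
    return term
```

With `{"Philtres": "Elixirs", "Elixirs": "Potions"}`, both Philtres and Elixirs resolve straight to Potions.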
The base problem is that form should follow function, not function following form. Remember too though that the mapping of X to Y in a "X .. see Y" referral is not always a perfect mapping of concepts — sometimes Y is a broader term, sometimes Y is a narrower term, and sometimes there are multiple meanings of the term text (e.g. "pole"). In addition to the mapping of these synonyms and homonyms, an author might wish to relate a given term to related content. That related content might not exist or be indexed at the same concept level, and so the relationship to the broader/narrower concept might not fit within the homonym referrals .. and so one might end up with this:
Another example: Let's say you have a term "Coding Languages" in your index. Now, "Coding Languages" could be a broad term that encompasses various specific languages. In this case, you might have:
|
The homonym example is interesting; technically two (or more) different entries that happen to appear in the same place in the index, hence the mixing of
not
I.e., each entry should use only one type of see, see under, or some combo of multiple see also/see also under. Mixing "see" and "see also" is a sign of an error.
Even in this case, mixing both see and see also would be an error. Either "Coding Languages" has its own pages or not. Using "see" requires that the current entry has no page number of its own. "See also" requires that the current entry has page numbers of its own.
We must be clear that, again, "see" and "see also" are not interchangeable labels that can be optionally chosen at the editor's choice. "See" and "see also" have quite literally the same purpose (pointing to another entry), with the only distinction being whether the current index topic has its own pages. Full stop. There is no other hidden meaning separating the two. As much as an editor might want to ascribe separate personal meanings to these two labels and so use them in some other way, doing so would be an error. |
If we want to go out of our way to provide a more complex syntax for the sole purpose of allowing users to create erroneous indices (this logic is baffling to me), I might suggest the following:
Though this makes me consider proposing a second snippet for the index generator, so users have the option:
|
I think there's some unintentionally loaded language here. What I am trying to provide in my PR is a system that builds indexes based on the user syntax and subscribes to no particular writing style guide rules. I think this makes for a better tool long term and we can solve for some things down the road. As a first-pass compromise, I've put in dropping references to non-existent targets and collating subtopic targets to the parent topic when the parent has two or fewer subtopics. What I'd like to do in later iterations is have a couple or three "preconfigs" and then some prompts for collations and error handling (Drop errors? Report errors? etc.) coupled with an output log that informs on broken references and rules-based cross-reference redirects (and whatever else we want along the way).
What precisely do you envision here? |
Your idea:
The rich indexing specification is designed for building an arbitrary number of Book Index style lists consisting of alphabetized Topics and subtopics.
The system involved is two-part. In the markdown parsing, index tokens are processed and either converted to HREF links or consumed. The snippet parses the brew source, collates duplicated entries, and inserts an alphabetized index list for each list.
The markdown takes one of two forms - each is a single line.
#Index2 List|Of|Topics
- One or more Topics, separated by a `|` character, that will be included in the user-specified index `Index2`
#Index2 List|Of|Topics // Subtopic
- One or more Topics, separated by a `|` character, with a subtopic, that will be included in the user-specified index `Index2`
While there may be multiple topics, there may only be one subtopic in a link.
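As a rough illustration of the two forms, the sketch below splits a marker line into its index name, topics and optional subtopic. The regular expression and function name are assumptions; a real tokenizer would also need to tell these markers apart from ordinary Markdown headings.

```python
import re

# "#" immediately followed by the index name, then whitespace, then the body.
MARKER = re.compile(r"^#(?P<index>\S+)\s+(?P<body>.+)$")

def parse_index_marker(line):
    """Parse "#Index2 List|Of|Topics // Subtopic" into its parts.

    Returns None when the line is not an index marker.
    """
    match = MARKER.match(line.strip())
    if match is None:
        return None
    # Everything after the first "//" is the (single) subtopic.
    topics_part, _, subtopic = match.group("body").partition("//")
    topics = [t.strip() for t in topics_part.split("|") if t.strip()]
    if not topics:
        return None
    return {
        "index": match.group("index"),
        "topics": topics,
        "subtopic": subtopic.strip() or None,
    }
```

For the first form the `subtopic` value is `None`; for the second it is the text after `//`.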
Parsing Details:
In case 1, an HREF anchor link in the form
<a href="#p{pageNumber}_{sluggified_subtopic}" data-topic="{topic}" data-index="{index}"></a>
is generated and the marker is consumed from the markup.
In case 2, an HREF anchor link in the form
<a href="#p{pageNumber}_{sluggified_subtopic}" data-topic="{topic}" data-subtopic="{subtopic}" data-index="{index}"></a>
is generated and the marker is consumed from the markup.
Snippet details:
This assumes that the marked run created proper anchors.
The snippet runs its own parsing against the brew text. This parsing will collect all the indexing markdown tags and the associated page number.
Duplicate Topic and Subtopic entries will be collated so that references to the same index text on multiple pages will yield an ordered list of pages in relation to the indexed text.
Topic entries will be alphabetized.
Subtopic entries will be alphabetized.
Each list will be rendered with the list's name as an H2 header, followed by the ordered contents of the index, followed by a `\page`.
An active PR for an earlier version of this exists as #3113 and will be updated to the new markdown shortly.
Styling will be derived from the existing Indexing snippet styles.
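The collation and ordering steps in the snippet details could be sketched roughly as follows. The input tuple shape, the function name and the case-insensitive sort are assumptions made for illustration; the snippet itself would work from the parsed brew text.

```python
from collections import defaultdict

def build_index(entries):
    """Collate (index_name, topic, subtopic_or_None, page) tuples.

    Returns {index_name: [(topic, pages, [(subtopic, pages), ...]), ...]} with
    topics and subtopics alphabetized and duplicate pages merged.
    """
    lists = defaultdict(
        lambda: defaultdict(lambda: {"pages": set(), "subtopics": defaultdict(set)})
    )
    for index_name, topic, subtopic, page in entries:
        node = lists[index_name][topic]
        if subtopic is None:
            node["pages"].add(page)
        else:
            node["subtopics"][subtopic].add(page)

    def by_name(item):
        # Case-insensitive ordering (an assumption, not stated in the spec).
        return item[0].lower()

    result = {}
    for index_name, topics in lists.items():
        result[index_name] = [
            (
                topic,
                sorted(node["pages"]),
                [(sub, sorted(pages)) for sub, pages in sorted(node["subtopics"].items(), key=by_name)],
            )
            for topic, node in sorted(topics.items(), key=by_name)
        ]
    return result
```

Rendering each list under an H2 header and appending `\page` would then be a straightforward walk over this structure.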