Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi valued headers #974

Merged

Conversation

TheNailDev
Copy link
Contributor

@TheNailDev TheNailDev commented Feb 16, 2025

The UltraStar Format specification 1.1.0 introduced the (backwards-compatible) concept of multi valued headers.
A single header can contain multiple values split by comma (multi valued approach 1) or multiple occurrences of the same header can each contain one or multiple values of the header (multi valued approach 2).
As of Format 1.1.0 the headers supporting multiple values are: GENRE, CREATOR, EDITION, LANGUAGE and the newly introduced TAGS.
The old implementation of the headers already supported multi valued approach 1, as the use of the headers for textual comparisons works the same regardless of the number of comma separated values in the header.
This PR extends the implementation to also support multi valued approach 2.
Under the hood a multi-valued header still is a single string, where all values are separated by comma. (Simple approach that does not require major refactoring while still achieving the desired filtering capabilities.)

Assume a file contains following lines:

GENRE: Charts,Mainstream
GENRE: Club,Party

The old implementation detects the genre as Club,Party
The new implementation detects the genre as Charts,Mainstream,Club,Party

I also implemented the new TAGS header in the context of this PR, as it is closely related to the other multi valued headers, all being used for song filtering.

There are still three open points that we might want to discuss with regard to the new implementation:

1: Currently multi valued headers are saved like they are stored internally (multi valued approach 1). The specification says "Header values may not exceed 255 characters.". There might be rare cases were each value of a multi-valued header is shorter than 255 characters, but the comma separated list might be longer than 255 characters. Possible approaches to handle such situations are:

  • Keep as it is and allow the cases where the comma separated list is longer than 255 characters. After all the spec says "MAY" and not "MUST", so it is not an actual violation of the spec. Yet, it might reduce compatibility with games having a restricted value length.
  • Always split multi valued headers into separate headers containing only a single value (multi valued approach 2). This might decrease compatibility to older games not (completely) supporting multi valued headers. Additionally this slightly increases the size of the song file, as each value would be prepended by the header.
  • Use a comma separated list if it is shorter then 255 characters, otherwise split the list into multiple sub lists and write each sub list to the file. This would be a compromise between the other two approaches but would increase the complexity of the saving logic only to address a rare edge case without major consequences.

2: With the introduction of multi valued headers the question arises how sorting for those headers should be handled. The current implementation sorts by the entire comma separated list. As such the first value determines how a song is sorted. I don't know if this is the best behavior for sorting, but I'm also unsure what other sorting mechanisms should be used for multi valued headers.

3: Currently the TAGS header cannot be edited in the editor (just like the VERSION header also cannot be edited currently). In my opinion this is fine for the time being. The editor should probably be updated after fully implementing format 1.1.0 to enable all new features (VERSION, TAGS, INSTRUMENTAL, VOCALS) in the editor at once. This way themes only need to be updated once. If you are of a different opinion we can discuss about that further.

@barbeque-squared
Copy link
Member

re: 1 "Header values may not exceed 255 characters."
This isn't an RFC2119 value nor is it capitalized. This is way too open to interpretation anyway. I mean, I get what it's trying to do ("please don't put the entirety of a book on a single line") but right now I can also interpret it as "Header values are always longer than 255 characters, but you might sometimes encounter a shorter one" which obviously doesn't make any sense.

We should probably open an issue in the format repo for clarification on what it's really trying to say. If it's more of an aesthetic thing (long lines in text editors are annoying) but not technically wrong, I'd just go for option 1 until someone complains and comes with an actual usecase why they need 200+ TAGS in the first place, and only then maybe think about option 3.

re: 2 sorting
The spec says "In this way the order of multi-valued headers is significant." Also just seeing it as a single giant string for sorting purposes is fine by me. Can't wait to see what mess this is going to create for people that sort by EDITION but then also have Categories on, but that's a combo I'm not touching with a million foot pole.

re: 3 editor
Yeah not touching that is fine. The editor is more geared towards editing something that already exists than creating a txt from nothing, so as long as whatever external program makes the initial txt sets them, they will be there.
The only sort-of usecase is to quickly check why something is turning up in the search results, but since only TAGS is the new searchable thing, and we don't really know how that's going to be used yet, yeah just don't bother with the editor for now. I sometimes want the notes grid to be bigger already so I really don't know where we would put the extra fields anyway.

I'll try to have a better look at this somewhere during this week or the weekend, but probably can't really do much until the 255 limit has been cleared up.

@barbeque-squared
Copy link
Member

I've thought of a much better reason to go for option 1 (at least for the time being): if you edit+save a file that has repeated headers in an old version (especially 2017 and 2020 are very common), options 2 and 3 would cause destructive behaviour, whereas option 1 is completely fine.

I have created an issue in the format repo for this: UltraStar-Deluxe/format#78

Copy link
Member

@barbeque-squared barbeque-squared left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not particularly a fan of strings like SONG_JUMPTO_TYPE9 buuuut this is only the translation (not the actual theme files), and as such not really something many people would need to customize themselves.

Tested it and works fine.

@barbeque-squared barbeque-squared merged commit 3031826 into UltraStar-Deluxe:master Feb 23, 2025
5 checks passed
@TheNailDev
Copy link
Contributor Author

I'm not particularly a fan of strings like SONG_JUMPTO_TYPE9

I understand that. I simply used that name to keep it in line with the remaining translation strings. Might be something that can be changed to more sensible names in the future, although it is probably not of a high priority as you already noted.

@TheNailDev TheNailDev deleted the multi-valued-headers branch February 24, 2025 21:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants