Commit f7eefb8

Squashed merge of develop for Release 4.1.
Squashed commit of the following:

commit fd962ee
Author: Greg Chapman
Date: Thu Jan 30 11:16:17 2025 -0800

    Update READMEs, setup.py, and LICENSE for Release 4.1.

commit 9f0cf12
Author: Greg Chapman
Date: Thu Jan 30 10:51:07 2025 -0800

    Update tests to know about new symbol counting changes.

commit 08e6185
Author: Greg Chapman
Date: Thu Jan 30 10:39:39 2025 -0800

    Tests: add SER output to command line tests.

commit 2188808
Author: Greg Chapman
Date: Thu Jan 30 10:14:47 2025 -0800

    AnnExtra symbol count has len(content) now and AnnExtra symbol error count includes Levenshtein distance of content. AnnStaffGroups are sorted now, and instead of comparing all the part indices, we compare lowest and highest.

commit 7dc9ef0
Author: Greg Chapman
Date: Tue Jan 28 16:45:47 2025 -0800

    Make sure metadata item value ends up being a string.

commit 3f78626
Author: Greg Chapman
Date: Tue Jan 28 16:37:47 2025 -0800

    More symbol count (notation_size) and symbol error count (cost) changes. Trying to make them match each other better, and make more sense.

commit a58acf3
Author: Greg Chapman
Date: Tue Jan 28 12:33:35 2025 -0800

    Release Notes again.

commit a757ad7
Author: Greg Chapman
Date: Tue Jan 28 12:32:04 2025 -0800

    Stop assuming that the two different extras are both either a Spanner or not. They could be one of each.

commit 3e08d22
Author: Greg Chapman
Date: Mon Jan 27 15:57:25 2025 -0800

    ReleaseNotes update.

commit 1904528
Author: Greg Chapman
Date: Sun Jan 26 12:10:21 2025 -0800

    Update ReleaseNotes 4.1

commit b060304
Author: Greg Chapman
Date: Sat Jan 25 14:01:25 2025 -0800

    musicdiff text output expected results have changed a little due to symbol counting (notation_size and cost) changes.

commit b618d74
Author: Greg Chapman
Date: Fri Jan 24 14:31:06 2025 -0800

    AnnLyric.notation_size: identifiers are only worth 1, not len(identifier). Print SER even if cost == 0. Handle numSymbolsInGroundTruth being 0 without dividing by 0.

commit a6f76e9
Author: Greg Chapman
Date: Fri Jan 24 14:27:30 2025 -0800

    New release notes for v4.1.0

commit d05187d
Author: Greg Chapman
Date: Thu Jan 23 15:18:44 2025 -0800

    Another lyrics and extras adjustment (lower the costs).

commit 09e92e1
Author: Greg Chapman
Date: Thu Jan 23 15:03:42 2025 -0800

    For extras and lyrics, notation_size does not include offset/duration, and diff cost is incremented by only 1 for differences in each of those fields.

commit 80a630a
Author: Greg Chapman
Date: Thu Jan 23 12:33:15 2025 -0800

    Better notation_size and comparison cost for extras and lyrics.

commit 45b54e6
Author: Greg Chapman
Date: Tue Jan 21 14:14:03 2025 -0800

    Ignore SenzaMisuraTimeSignature (since it is displayed as no timesig at all).

commit 1d08dd9
Author: Greg Chapman
Date: Tue Jan 21 11:37:49 2025 -0800

    Refactor SER output into Visualization, and return a dict[str, str]. To print it as text, we convert to JSON and print that.

commit a176105
Author: Greg Chapman
Date: Thu Jan 16 09:03:42 2025 -0800

    Compute SER = symbolic errors/num symbols in ground truth (i.e. file2).

commit 067a96d
Author: Greg Chapman
Date: Thu Jan 16 08:56:17 2025 -0800

    Add to cost any syntax errors fixed by converter21 parse code. Some lint, too.

commit bfdcf32
Author: Greg Chapman
Date: Mon Dec 2 17:19:53 2024 -0800

    New output format "ser" that prints num errors/max num syms of the two scores.

commit 16d2603
Author: Greg Chapman
Date: Mon Dec 2 12:23:37 2024 -0800

    Always return cost in symbol errors from diff() and from musicdiff command.

commit 31b31e7
Author: Greg Chapman
Date: Sun Dec 1 21:46:53 2024 -0800

    First cut at fixing Humdrum syntax errors.

commit e54c259
Author: Greg Chapman
Date: Sun Dec 1 19:49:01 2024 -0800

    Back out that AnnStaffGroup cost change; I don't like the results, and I wasn't convinced to begin with.

commit eafb018
Author: Greg Chapman
Date: Sun Dec 1 19:40:38 2024 -0800

    More notation_size tweaks: AnnMeasure should include lyric sizes, and AnnStaffGroup should add 1 for each enclosed part/staff.

commit 8f903d9
Author: Greg Chapman
Date: Sun Dec 1 19:29:09 2024 -0800

    Fix comment typo.

commit e6922a7
Author: Greg Chapman
Date: Sun Dec 1 19:22:43 2024 -0800

    Don't precompute notation_size, cache it if it is ever computed. Many objects never are asked their notation size, especially if the scores are very similar, so don't pay the price unless you have to (but only pay it once).

commit 352bc95
Author: Greg Chapman
Date: Wed Nov 27 17:49:58 2024 -0800

    First cut at comparing different number of parts.

1 parent 0ba675f commit f7eefb8

27 files changed, with 567 additions and 195 deletions.

.pylintrc

Lines changed: 1 addition & 0 deletions
@@ -325,6 +325,7 @@ exclude-protected=_asdict,_fields,_replace,_source,_make
 
 # Maximum number of arguments for function / method
 max-args=5
+max-positional-arguments=10
 
 # maximum boolean expressions in a line (too-many-boolean-expressions)
 max-bool-expr=10

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 
 The MIT License (MIT)
-Copyright (c) 2022-2024 Francesco Foscarin, Greg Chapman
+Copyright (c) 2022-2025 Francesco Foscarin, Greg Chapman
 
 Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

README.md

Lines changed: 5 additions & 4 deletions
@@ -7,7 +7,7 @@ musicdiff is derived from: [music-score-diff](https://github.com/fosfrancesco/mu
 by [Francesco Foscarin](https://github.com/fosfrancesco).
 
 ## Setup
-Depends on [music21](https://pypi.org/project/music21) (version 9.1+), [numpy](https://pypi.org/project/numpy), and [converter21](https://pypi.org/project/converter21) (version 3.2+). You also will need to configure music21 (instructions [here](https://web.mit.edu/music21/doc/usersGuide/usersGuide_01_installing.html)) to display a musical score (e.g. with MuseScore). Requires Python 3.10+.
+Depends on [music21](https://pypi.org/project/music21) (version 9.1+), [numpy](https://pypi.org/project/numpy), and [converter21](https://pypi.org/project/converter21) (version 3.3+). You also will need to configure music21 (instructions [here](https://web.mit.edu/music21/doc/usersGuide/usersGuide_01_installing.html)) to display a musical score (e.g. with MuseScore). Requires Python 3.10+.
 
 ## Usage
 On the command line:
@@ -26,9 +26,10 @@ On the command line:
     default this is ignored).
 -x/--exclude one or more named details to exclude from comparison. Can be any of the
     named details accepted by -i/--include.
--o/--output one or both of two output formats: text (or t) or visual (or v); the default
-    is visual). visual (or v) requests production of marked-up score PDFs; text
-    (or t) requests production of diff-like text output.
+-o/--output one or more of three output formats: text (or t), visual (or v), or ser (or s)
+    (the default is visual). visual (or v) requests production of marked-up score
+    PDFs; text (or t) requests production of diff-like text output; ser (or s)
+    requests a JSON text output containing Symbolic Error Rate information.
 
 file1 first music score file to compare (any format music21 or converter21 can parse)
 file2 second music score file to compare (any format music21 or converter21 can parse)

ReleaseNotes_4.1.0.txt

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+Changes since 4.0.0:
+    Add new output option that prints JSON containing the symbolic error rate (SER =
+        numSymbolErrors / numSymbolsInGroundTruth) to stdout (the JSON actually
+        contains all three numbers). Ground truth is assumed to be the second file.
+        If numSymbolsInGroundTruth == 0, SER will be numSymbolErrors, to avoid divide
+        by zero.
+    Add new API Visualization.get_ser_output() that returns a dict containing the
+        symbolic error rate.
+    In support of SER, notation_sizes (a.k.a. symbol counts) and diff costs (a.k.a.
+        symbolic error counts) have been reviewed and updated:
+            AnnNote.notation_size(): add 1 symbol for slash on grace note
+            AnnExtra.notation_size(): 1 symbol for the text, add 1 symbol if there is any
+                style specified
+            AnnExtra diff error count: text diff is 1 symbol error, offset diff is 1 symbol
+                error, duration diff is 1 symbol error, style diff is 1 symbol error
+            AnnLyric.notation_size(): use len(text) as symbol count instead of 1;
+                add 1 symbol if there's a verse number;
+                add 1 symbol if there's a verse identifier different from the number;
+                add 1 symbol if styled
+            AnnLyric diff cost: text diff symbol error count is the Levenshtein distance,
+                verse number diff is 1 symbol error, verse identifier diff is 1 symbol
+                error, offset diff is 1 symbol error, style diff is 1 symbol error
+            AnnMeasure.notation_size(): not just notes' symbols and extras' symbols, add in
+                the lyrics' symbols
+            AnnScore.notation_size(): not just parts' symbols, add in staff_groups' symbols
+                and metadata_items' symbols
+    Add support for comparing scores that have different number of parts (this previously
+        caused a failure). The existing parts are assumed to line up by index (as before,
+        score1 part 0 is compared with score2 part 0), and then we generate edits that
+        either delete the extra parts in score1, or add the extra parts in score2. The
+        number of symbol errors for those edits is simply the notation_size of (the
+        number of symbols in) the added or deleted parts.
+    Several smallish bugfixes.
+
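
The counting rules in these release notes can be illustrated with a short Python sketch. It is not musicdiff's implementation: the helper names `levenshtein` and `symbolic_error_rate` are hypothetical, and the dictionary keys are assumptions that simply reuse the names mentioned in the notes. It shows a lyric text difference costed by Levenshtein distance, and the SER computation with its guard against an empty ground truth.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free when equal)
            ))
        prev = curr
    return prev[-1]


def symbolic_error_rate(num_symbol_errors: int,
                        num_symbols_in_ground_truth: int) -> dict:
    """Hypothetical helper: SER = errors / symbols in the ground truth."""
    if num_symbols_in_ground_truth == 0:
        # Per the release notes, fall back to the raw error count
        # instead of dividing by zero.
        ser = float(num_symbol_errors)
    else:
        ser = num_symbol_errors / num_symbols_in_ground_truth
    # Key names are assumed; the notes only say the JSON holds all three numbers.
    return {
        "numSymbolErrors": num_symbol_errors,
        "numSymbolsInGroundTruth": num_symbols_in_ground_truth,
        "SER": ser,
    }


# A lyric whose text differs by one character contributes one symbol error.
lyric_cost = levenshtein("glo-ria", "glo-ri")
result = symbolic_error_rate(lyric_cost, 40)
```

Because the ground truth is taken to be the second file, swapping the two inputs generally changes the denominator and therefore the SER.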

musicdiff/__init__.py

Lines changed: 40 additions & 12 deletions
@@ -14,6 +14,7 @@
 
 import sys
 import os
+import json
 import typing as t
 from pathlib import Path
 
@@ -52,6 +53,8 @@ def diff(
     force_parse: bool = True,
     visualize_diffs: bool = True,
     print_text_output: bool = False,
+    print_ser_output: bool = False,
+    fix_first_file_syntax: bool = False,
     detail: DetailLevel | int = DetailLevel.Default
 ) -> int | None:
     '''
@@ -77,6 +80,16 @@ def diff(
         visualize_diffs (bool): Whether or not to render diffs as marked up PDFs. If False,
             the only result of the call will be the return value (the number of differences).
             (default is True)
+        print_text_output (bool): Whether or not to print diffs in diff-like text to stdout.
+            (default is False)
+        print_ser_output (bool): Whether or not to print the symbolic error rate (SER),
+            which is computed as number of symbolic errors divided by the number of
+            symbols in the ground truth (the second score).
+            (default is False)
+        fix_first_file_syntax (bool): Whether to attempt to fix syntax errors in the first
+            file (and add the number of such fixes to the returned number of edits/cost in
+            symbol errors).
+            (default is False)
         detail (DetailLevel | int): What level of detail to use during the diff.
             Can be DecoratedNotesAndRests, OtherObjects, AllObjects, Default (currently
             AllObjects), or any combination (with | or &~) of those or NotesAndRests,
@@ -85,8 +98,9 @@ def diff(
             Style, Metadata, or Voicing.
 
     Returns:
-        int | None: The number of differences found (0 means the scores were identical,
-            None means the diff failed)
+        int | None: The total cost of the edits, i.e. the number of individual symbols
+            that must be added or deleted. (0 means that the scores were identical, and
+            None means that one or more of the input files failed to parse.)
     '''
     # Use the Humdrum/MEI importers from converter21 in place of the ones in music21...
     # Comment out this line to go back to music21's built-in Humdrum/MEI importers.
@@ -130,7 +144,11 @@ def diff(
         if not badArg1:
             # pylint: disable=broad-except
             try:
-                sc = m21.converter.parse(score1, forceSource=force_parse)
+                sc = m21.converter.parse(
+                    score1,
+                    forceSource=force_parse,
+                    acceptSyntaxErrors=fix_first_file_syntax
+                )
                 if t.TYPE_CHECKING:
                     assert isinstance(sc, m21.stream.Score)
                 score1 = sc
@@ -176,11 +194,10 @@ def diff(
     annotated_score2: AnnScore = AnnScore(score2, detail)
 
     diff_list: list
-    _cost: int
-    diff_list, _cost = Comparison.annotated_scores_diff(annotated_score1, annotated_score2)
+    cost: int
+    diff_list, cost = Comparison.annotated_scores_diff(annotated_score1, annotated_score2)
 
-    numDiffs: int = len(diff_list)
-    if numDiffs != 0:
+    if cost != 0:
         if visualize_diffs:
             # you can change these three colors as you like...
             # Visualization.INSERTED_COLOR = 'red'
@@ -194,10 +211,21 @@ def diff(
             # 'score1 ' and 'score2 ', respectively, so you can see which is which.
             Visualization.show_diffs(score1, score2, out_path1, out_path2)
 
-        if print_text_output:
-            text_output: str = Visualization.get_text_output(
-                score1, score2, diff_list, score1Name=score1Name, score2Name=score2Name
-            )
+    if print_ser_output:
+        ser_output: dict = Visualization.get_ser_output(
+            cost, annotated_score2
+        )
+        jsonStr: str = json.dumps(ser_output, indent=4)
+        print(jsonStr)
+
+    if print_text_output:
+        text_output: str = Visualization.get_text_output(
+            score1, score2, diff_list, score1Name=score1Name, score2Name=score2Name
+        )
+        if text_output:
+            if print_ser_output and print_text_output:
+                # put a blank line between them
+                print('')
             print(text_output)
 
-    return numDiffs
+    return cost
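
The new tail of `diff()` emits up to two stdout payloads in a fixed order. The sketch below isolates that ordering as a pure function so it can be checked without music21 or score files. `render_outputs` is a hypothetical helper, not part of musicdiff, and the sketch assumes (following the "Print SER even if cost == 0" commit) that the SER JSON is produced regardless of the cost.

```python
import json


def render_outputs(
    ser_output: dict,
    text_output: str,
    print_ser_output: bool,
    print_text_output: bool,
) -> str:
    """Return the text that the end of diff() would write to stdout (sketch)."""
    chunks: list[str] = []
    if print_ser_output:
        # The SER JSON comes first, pretty-printed with a 4-space indent.
        chunks.append(json.dumps(ser_output, indent=4))
    if print_text_output and text_output:
        if print_ser_output:
            # A blank line separates the JSON from the diff-like text.
            chunks.append('')
        chunks.append(text_output)
    return '\n'.join(chunks)
```

Note that an empty text diff produces no separator line, so a consumer that requests both formats on identical scores receives only the JSON.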

musicdiff/__main__.py

Lines changed: 24 additions & 7 deletions
@@ -106,10 +106,23 @@
         "--output",
         default=["visual"],
         nargs="*",
-        choices=["visual", "v", "text", "t"],
+        choices=["visual", "v", "text", "t", "ser", "s"],
         help="'visual'/'v' is marked up scores, rendered to PDFs;"
-        + " 'text'/'t' is diff-like, written to stdout."
-        + " Either, both, or neither can be requested."
+        + " 'text'/'t' is diff-like, written to stdout;"
+        + " 'ser'/'s' is the symbolic error rate (symbol errors/ground truth symbols),"
+        + " written to stdout."
+        + " Any, all, or none of these can be requested."
+    )
+
+    parser.add_argument(
+        "--fix_first_file_syntax",
+        action='store_true',
+        help="If set, syntax errors in the first input file will be fixed"
+        + " (if possible) so the diff can continue. Any fixes will be"
+        + " added to the returned cost in symbol errors. Note that errors"
+        + " in the second file (assumed to be the ground truth) are never"
+        + " corrected. Note also that this currently only works for Humdrum"
+        + " **kern files."
     )
 
     args = parser.parse_args()
@@ -222,16 +235,20 @@
 
     visualize_diffs: bool = "visual" in args.output or "v" in args.output
     print_text_output: bool = "text" in args.output or "t" in args.output
+    print_ser_output: bool = "ser" in args.output or "s" in args.output
+    fix_first_file_syntax: bool = args.fix_first_file_syntax is True
 
-    numDiffs: int | None = diff(
+    cost: int | None = diff(
         args.file1,
         args.file2,
         detail=detail,
         visualize_diffs=visualize_diffs,
-        print_text_output=print_text_output
+        print_text_output=print_text_output,
+        print_ser_output=print_ser_output,
+        fix_first_file_syntax=fix_first_file_syntax,
     )
 
-    if numDiffs is None:
+    if cost is None:
         print('musicdiff failed.', file=sys.stderr)
-    elif numDiffs == 0:
+    elif cost == 0:
         print(f'Scores in {args.file1} and {args.file2} are identical.', file=sys.stderr)
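
The option handling in this file can be tried in isolation with the standard-library `argparse` module. The following is a reduced sketch, not the real `__main__.py`: `parse_cli` and the program name are hypothetical, the positional `file1`/`file2` arguments and the detail options are omitted, and only the `--output` aliases and the new `--fix_first_file_syntax` flag are modeled.

```python
import argparse


def parse_cli(argv: list[str]) -> dict:
    """Parse a reduced musicdiff-like command line into diff() keyword flags."""
    parser = argparse.ArgumentParser(prog="musicdiff-sketch")
    parser.add_argument(
        "-o", "--output",
        default=["visual"],
        nargs="*",
        choices=["visual", "v", "text", "t", "ser", "s"],
    )
    parser.add_argument("--fix_first_file_syntax", action="store_true")
    args = parser.parse_args(argv)

    # Each format can be requested by its long name or its one-letter alias.
    return {
        "visualize_diffs": "visual" in args.output or "v" in args.output,
        "print_text_output": "text" in args.output or "t" in args.output,
        "print_ser_output": "ser" in args.output or "s" in args.output,
        "fix_first_file_syntax": args.fix_first_file_syntax,
    }


flags = parse_cli(["-o", "ser", "t", "--fix_first_file_syntax"])
```

Because `default=["visual"]` applies only when `-o` is absent, explicitly requesting `ser` or `text` turns the PDF rendering off unless `visual` is listed as well.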
