fix(lesson): handle uppercase letters in word filtering #575
Open
Dronakurl wants to merge 4 commits into
The filterWordList and Word.matches functions used case-sensitive character matching, which filtered out words with uppercase first letters (German nouns, proper nouns, sentence starters). Fixed by converting characters to lowercase before checking against the codePoints set, which contains only lowercase letter code points from the language alphabet.
Fixes aradzie#555
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Dictionary constructor was building the word index using original-case codePoints, but focusedCodePoint lookups always use lowercase letters from the language alphabet. This caused words with uppercase first letters (German nouns) to be missed when filtering by focusedCodePoint, even though the general filter (Word.matches) was already fixed. Fixed by converting codePoints to lowercase when building the index.
Fixes aradzie#555
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
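For readers skimming the commits, the index change described above might look roughly like the sketch below. It is not the actual keybr code: `buildLowercaseIndex` and its data shapes are hypothetical names chosen for illustration, and the real Dictionary constructor may be structured differently.

```typescript
// Hypothetical sketch: build a per-letter word index keyed by LOWERCASE code points,
// so that lookups by a lowercase focused letter also find capitalized words.
function buildLowercaseIndex(words: readonly string[]): Map<number, string[]> {
  const index = new Map<number, string[]>();
  for (const word of words) {
    const seen: number[] = []; // code points already indexed for this word
    for (const ch of Array.from(word)) {
      // Lowercase each character before taking its code point.
      const lower = ch.toLowerCase().codePointAt(0);
      if (lower === undefined || seen.indexOf(lower) !== -1) {
        continue;
      }
      seen.push(lower);
      let list = index.get(lower);
      if (list === undefined) {
        list = [];
        index.set(lower, list);
      }
      list.push(word);
    }
  }
  return index;
}

const demoIndex = buildLowercaseIndex(["Hexe", "hexe", "Mexiko"]);
// Lookup with the lowercase letter "h" finds both spellings.
const wordsWithH = demoIndex.get("h".codePointAt(0)!) ?? [];
// No entry is stored under the uppercase code point.
const wordsWithUpperH = demoIndex.get("H".codePointAt(0)!);
```

With an index keyed this way, a lookup for the focused letter "h" returns "Hexe" as well as "hexe", which is the behavior the commit message describes.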
Please merge this pull request. I tried it, and it really is a huge improvement for German training! Not only for the rare letters but for all the other letters too, it shows so many more words that were hidden. German inherently has many words with an initial uppercase letter, so including these words and showing them capitalized does exactly what keybr offers as a unique feature: practicing words and letter combinations as they occur in daily writing.
PR Message Template: Uppercase Words Fix
Summary
Fix case-sensitivity bug in dictionary filtering that prevented uppercase words from being used in guided lessons.
Related Issue
Fixes #555
Current State
When "Prefer natural words" is enabled, words with uppercase first letters are filtered out because the dictionary uses case-sensitive character matching. The German alphabet used for filtering contains only lowercase letters, so words like "Hexe", "Mexiko", and "Loyalität" are excluded from lessons.
This results in:
Proposed State
Convert characters to lowercase before checking against the codePoints set, allowing all words in the dictionary to be used regardless of capitalization.
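A minimal sketch of the proposed matching, assuming the alphabet is available as a set of lowercase code points. The names `matchesAlphabet` and `filterByAlphabet` are hypothetical and stand in for the real `Word.matches()` and `filterWordList()`, whose actual signatures are not shown here.

```typescript
// Hypothetical sketch of case-insensitive word filtering against a lowercase alphabet.
function matchesAlphabet(word: string, allowed: Set<number>): boolean {
  for (const ch of Array.from(word)) {
    // Lowercase the character first, since `allowed` holds only lowercase code points.
    const lower = ch.toLowerCase().codePointAt(0);
    if (lower === undefined || !allowed.has(lower)) {
      return false;
    }
  }
  return true;
}

function filterByAlphabet(words: readonly string[], allowed: Set<number>): string[] {
  return words.filter((word) => matchesAlphabet(word, allowed));
}

// Assumed German alphabet, lowercase only.
const germanLetters = new Set<number>(
  Array.from("abcdefghijklmnopqrstuvwxyzäöüß").map((c) => c.codePointAt(0)!),
);
const kept = filterByAlphabet(["Hexe", "Mexiko", "Loyalität", "hexe", "Año"], germanLetters);
// kept is ["Hexe", "Mexiko", "Loyalität", "hexe"]; "Año" is dropped because "ñ" is not in the set.
```

The words keep their original capitalization in the result; only the comparison is done in lowercase.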
Changes
Modified Files
packages/keybr-lesson/lib/dictionary.ts
filterWordList(): convert characters to lowercase before filtering
Word.matches(): convert characters to lowercase before matching
Code Changes
Same pattern applied to Word.matches().
Impact
Testing
Related
Limitations
guided.ts:31)