Description
The file is stored as UTF-8 in git, and I'm using .gitattributes to set the working-tree encoding to ibm-1047. When the repo is cloned, git converts the file and tags it as IBM-1047. The issue is that some characters, such as ® (the registered symbol), appear as ▒ after the conversion.
Reproduce
- Create a new repository.
- Create file.txt containing the ® character.
- Create a .gitattributes file containing either file.txt zos-working-tree-encoding=ibm-1047 git-encoding=iso8859-1 or file.txt working-tree-encoding=ibm-1047
- Clone the repository.
- Read file.txt using vim or cat. It will display ▒ instead of ®.
Additional info
Examining the file with a hex editor shows that the ® character is converted from C2 AE (its UTF-8 encoding) to the single byte AF, which makes it unreadable. For it to be readable, the bytes would have to be 62 AF.
This is consistent with the behavior of iconv when converting from UTF-8 to IBM-1047, whereas converting from ISO8859-1 to IBM-1047 seems to produce the correct result.
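The two byte sequences can be reproduced outside git with a short Python sketch. Python's standard library does not ship a cp1047 codec, so this uses cp037 as a stand-in, on the assumption that the two EBCDIC code pages assign the same bytes to ® and Â. The function names are hypothetical and this is not how git or iconv implement the conversion.

```python
# Stand-in codec: cp037 is ASSUMED to match IBM-1047 for the characters
# used here (® and Â). Python has no built-in cp1047 codec.
EBCDIC = "cp037"


def utf8_to_ebcdic(data: bytes) -> bytes:
    """Decode the input as UTF-8, then encode it as EBCDIC."""
    return data.decode("utf-8").encode(EBCDIC)


def latin1_to_ebcdic(data: bytes) -> bytes:
    """Treat every input byte as one ISO8859-1 character, then encode as EBCDIC."""
    return data.decode("iso8859-1").encode(EBCDIC)


# The ® character as it is stored in git (UTF-8): C2 AE
stored = "®".encode("utf-8")

print(utf8_to_ebcdic(stored).hex(" "))    # af
print(latin1_to_ebcdic(stored).hex(" "))  # 62 af
```

Under that assumption, the UTF-8 path collapses C2 AE into the single byte AF, while the ISO8859-1 path converts each of the two bytes separately and yields 62 AF, matching the two results described in this report.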