Reading an untagged file <= 8 bytes in size causes output encoding differences #84

@chrishodgins

With the following Perl program, the output is corrupted when the file is 8 bytes or smaller; it prints correctly only when the file is larger than 8 bytes. The file untagged-file-with-ebcdic.txt is untagged and contains only EBCDIC characters.

Perl test program:

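# Open the file read-only and stop with the system error if that fails.
open(my $fh, '<', 'untagged-file-with-ebcdic.txt') or die "Cannot open file: $!";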
while (my $row = <$fh>) {
	chomp $row;
	print "$row\n";
}
close($fh);

Shell example:

$ chtag -r untagged-file-with-ebcdic.txt
$ od -Ax -xc untagged-file-with-ebcdic.txt
0000000000      F1F2    F3F4    F5F6    F715
               1   2   3   4   5   6   7  \n
0000000008
$ perl test.pl
�������

### Now try again with slightly bigger contents
$ od -Ax -xc untagged-file-with-ebcdic.txt
0000000000      F1F2    F3F4    F5F6    F7F8    1500
               1   2   3   4   5   6   7   8  \n
0000000009
$ perl test.pl 
12345678

Repeating the same sequence with the file tagged as IBM-1047:

$ chtag -tc IBM-1047 untagged-file-with-ebcdic.txt
$ od -Ax -xc untagged-file-with-ebcdic.txt
0000000000      F1F2    F3F4    F5F6    F715
               1   2   3   4   5   6   7  \n
0000000008
$ perl test.pl
1234567

### Now try again with slightly bigger contents
$ od -Ax -xc untagged-file-with-ebcdic.txt
0000000000      F1F2    F3F4    F5F6    F7F8    1500
               1   2   3   4   5   6   7   8  \n
0000000009
$ perl test.pl 
12345678
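
To make the two inputs above easy to recreate, here is a minimal sketch that writes the same byte sequences shown in the od dumps. The file names ebcdic-8-bytes.txt and ebcdic-9-bytes.txt are hypothetical and not part of the original report. The sketch assumes the shell's printf emits raw bytes for octal escapes and does not tag the files it creates; on z/OS, check the result with ls -T before running the Perl program.

```shell
# Hypothetical file names, chosen only for this sketch.
SHORT=ebcdic-8-bytes.txt
LONG=ebcdic-9-bytes.txt

# Octal \361..\370 are 0xF1..0xF8 (EBCDIC '1'..'8'); \025 is 0x15 (EBCDIC newline).
printf '\361\362\363\364\365\366\367\025' > "$SHORT"      # 8 bytes: the failing case
printf '\361\362\363\364\365\366\367\370\025' > "$LONG"   # 9 bytes: the working case

# Confirm the sizes before pointing test.pl at either file.
wc -c < "$SHORT"   # expect 8
wc -c < "$LONG"    # expect 9
```

Copying either file over untagged-file-with-ebcdic.txt should then reproduce the corresponding run of test.pl shown above.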
