fix: emit valid PDF binary marker and omit empty /Contents by qiwzee · Pull Request #344 · signintech/gopdf

qiwzee · 2026-05-18T11:18:41Z

Fix two PDF compliance defects rejected by Adobe Acrobat / Microsoft Edge

Defect 1: Corrupted PDF binary marker comment

gopdf.go:1051 writes "%PDF-1.7\n%����\n\n" where ���� are four
U+FFFD (replacement) characters embedded in the source code. When Go
writes this string as UTF-8, each U+FFFD becomes the 3-byte sequence
EF BF BD, so the marker in the output file is 12 bytes long
(% EF BF BD EF BF BD EF BF BD EF BF BD \n) instead of the conventional
4-byte binary marker % E2 E3 CF D3 \n. ISO 32000-1:2008 §7.5.2 recommends
a comment line containing at least four bytes with values of 128 or
greater.
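
To make the byte counts above concrete, here is a minimal sketch that compares the two string literals. The constant and function names (corruptedMarker, conventionalMarker, hexBytes) are made up for illustration and do not appear in gopdf.

```go
package main

import "fmt"

// corruptedMarker mirrors the situation described above: four U+FFFD
// replacement characters inside a Go string literal (hypothetical name).
const corruptedMarker = "\uFFFD\uFFFD\uFFFD\uFFFD"

// conventionalMarker holds the four raw bytes E2 E3 CF D3, written with
// \x escapes so that no UTF-8 encoding step is involved (hypothetical name).
const conventionalMarker = "\xe2\xe3\xcf\xd3"

// hexBytes renders the raw bytes of s as space-separated upper-case hex.
func hexBytes(s string) string {
	return fmt.Sprintf("% X", []byte(s))
}

func main() {
	// prints: 12 EF BF BD EF BF BD EF BF BD EF BF BD
	fmt.Println(len(corruptedMarker), hexBytes(corruptedMarker))
	// prints: 4 E2 E3 CF D3
	fmt.Println(len(conventionalMarker), hexBytes(conventionalMarker))
}
```

Because Go string literals are byte sequences, the \x form writes exactly the bytes given, while any character pasted into the source is stored as its UTF-8 encoding.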

Strict PDF parsers (Adobe Acrobat error 110, Microsoft Edge) sniff these
bytes as malformed UTF-8 and refuse to open the file. Lenient parsers like
PDFium (Chrome) tolerate it, which has hidden the bug.

This was reported in issue #225 (and others).

Fix: write the four conventional bytes via \x escape sequences:

fmt.Fprint(writer, "%PDF-1.7\n%\xe2\xe3\xcf\xd3\n\n")

Defect 2: Empty /Contents key in Page objects

page_obj.go:55 writes /Contents unconditionally. When a page has no
content stream (no native drawing commands — e.g. when only an imported
template is used), p.Contents is the empty string, producing the line:

/Contents

This is invalid PDF syntax (a dictionary key with no value). Per
ISO 32000-1:2008 §7.7.3.3, /Contents is OPTIONAL — pages without it
are spec-legal and render blank.

Fix: emit the /Contents line only when there is a value to emit.

- gopdf.go: replace mojibake bytes in the PDF binary-comment marker with the canonical 0xE2 0xE3 0xCF 0xD3 sequence so transfer tools and viewers reliably detect the file as binary.
- page_obj.go: skip the " /Contents %s\n" line when PageObj.Contents is empty, preventing a malformed " /Contents \n" entry (seen e.g. on imported pages that have no content stream).
- Add tests covering both fixes:
  - TestPdfBinaryHeader asserts the exact header bytes and that every marker byte is >= 128 per the PDF spec.
  - TestPageObjWriteOmitsContentsWhenEmpty / ...IncludesContentsWhenSet pin down PageObj.write behavior for both empty and populated Contents.

oneplus1000 · 2026-05-19T16:39:46Z

thank you

oneplus1000 merged commit 325e193 into signintech:master May 19, 2026
2 checks passed

oneplus1000 merged 1 commit into signintech:master from qiwzee:fix/pdf-binary-marker-and-empty-contents
