Commit 35b8321

authored
Fill out BinaryParsing documentation (#16)
This expands on the API documentation and adds a DocC folder with curation and initial articles about using the library.
1 parent 2ddef13 commit 35b8321

27 files changed

+1297
-57
lines changed

README.md

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -32,7 +32,10 @@ import BinaryParsing
3232

3333
extension QOI.Header {
3434
init(parsing input: inout ParserSpan) throws {
35-
try #magicNumber("qoif", parsing: &input)
35+
let magic = try UInt32(parsingBigEndian: &input)
36+
guard magic == 0x71_6f_69_66 else {
37+
throw QOIError()
38+
}
3639

3740
// Loading 'Int' requires a byte count or guaranteed-size storage
3841
self.width = try Int(parsing: &input, storedAsBigEndian: UInt32.self)
Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
# Array Parsers
2+
3+
Parse arrays of bytes or other values.
4+
5+
## Topics
6+
7+
### Byte array parsers
8+
9+
- ``Swift/Array/init(parsingRemainingBytes:)``
10+
- ``Swift/Array/init(parsing:byteCount:)``
11+
12+
### General array parsers
13+
14+
- ``Swift/Array/init(parsingAll:parser:)``
15+
- ``Swift/Array/init(parsing:count:parser:)-(_,FixedWidthInteger,_)``
16+
- ``Swift/Array/init(parsing:count:parser:)-(_,Int,_)``
Lines changed: 150 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,150 @@
1+
# Getting Started with BinaryParsing
2+
3+
Get up to speed with a library designed to make parsing binary data safe, efficient, and easy to understand.
4+
5+
## Overview
6+
7+
The BinaryParsing library provides a comprehensive set of tools for safely parsing binary data in Swift. The library provides the ``ParserSpan`` type, a consumable, memory-safe view into binary data, and defines a convention for writing concise, composable parsing functions.
8+
9+
Using the library's tools — including the span type, parser primitives, and operators for working with newly parsed values — you can prevent common pitfalls like buffer overruns, integer overflows, and type confusion that can lead to security vulnerabilities or crashes.
10+
11+
### A span type for parsing
12+
13+
A ``ParserSpan`` is a view into binary data that tracks your current position and the remaining number of bytes. All the provided parsers consume data from the start of the span, shrinking its size as they produce values. Unlike unsafe pointer operations, `ParserSpan` automatically prevents you from reading past the end of your data.
14+
15+
### Library-provided parsers
16+
17+
The library provides parsers for standard library integers, strings, ranges, and arrays of bytes or custom-parsed types. The convention for these is an initializer with an `inout ParserSpan` parameter, along with any other configuration parameters that are required. These parsers all throw a `ParsingError` when they encounter memory safety, type safety, or integer overflow errors.
18+
19+
For example, the parsing initializers for `Int` take the parser span as well as storage type or storage size and endianness:
20+
21+
```swift
22+
let values = try myData.withParserSpan { input in
23+
let value1 = try Int(parsing: &input, storedAsBigEndian: Int32.self)
24+
let value2 = try Int(parsing: &input, byteCount: 4, endianness: .big)
25+
}
26+
```
27+
28+
Designing parser APIs as initializers is only a convention. If it feels more natural to write some parsers as free functions, static functions, or even as a parsing type, that's okay! You'll find cases of each of these in the project's [Examples directory][examples].
29+
30+
## Example: QOI Header
31+
32+
Let's explore BinaryParsing through a real-world example: parsing the header for an image stored in the QOI ([Quite OK Image][qoi]) format. QOI is a simple lossless image format that demonstrates many common patterns in binary parsing.
33+
34+
### The QOI header structure
35+
36+
A QOI file begins with a 14-byte header, as shown in the specification:
37+
38+
```c
39+
qoi_header {
40+
char magic[4]; // magic bytes "qoif"
41+
uint32_t width; // image width in pixels (BE)
42+
uint32_t height; // image height in pixels (BE)
43+
uint8_t channels; // 3 = RGB, 4 = RGBA
44+
uint8_t colorspace; // 0 = sRGB with linear alpha
45+
// 1 = all channels linear
46+
};
47+
```
48+
49+
### Parser implementation
50+
51+
Our declaration for the header in Swift corresponds to the specification, with `width` and `height` defined as `Int` and custom enumerations for the channels and colorspace:
52+
53+
```swift
54+
extension QOI {
55+
struct Header {
56+
var width: Int
57+
var height: Int
58+
var channels: Channels
59+
var colorspace: ColorSpace
60+
}
61+
62+
enum Channels: UInt8 {
63+
case rgb = 3, rgba = 4
64+
}
65+
66+
enum ColorSpace: UInt8 {
67+
case sRGB = 0, linear = 1
68+
}
69+
}
70+
```
71+
72+
The parsing initializer follows the convention set by the library, with an `inout ParserSpan` parameter:
73+
74+
```swift
75+
extension QOI.Header {
76+
init(parsing input: inout ParserSpan) throws {
77+
// Parsing goes here!
78+
}
79+
}
80+
```
81+
82+
Next, we'll walk through the implementation of that initializer, line by line, to look at the safety and ease of use in the BinaryParsing library APIs.
83+
84+
#### Magic number validation
85+
86+
The first value in the binary data is a "magic number" – a common practice in binary formats that acts as a quick check that you're reading the right kind of file and working with the correct endianness. The code uses a `UInt32` initializer to load a 32-bit big-endian value, and then checks it for correctness using `guard`:
87+
88+
```swift
89+
let magic = try UInt32(parsingBigEndian: &input)
90+
guard magic == 0x71_6f_69_66 else {
91+
throw QOIError()
92+
}
93+
```
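The literal `0x71_6f_69_66` is the four ASCII bytes of "qoif" read as a single big-endian integer. The following is a minimal standard-library sketch of that correspondence; the names `magicBytes` and `expectedMagic` are illustrative and not part of the library:

```swift
// Illustrative only: derive the magic constant from the ASCII bytes of "qoif".
let magicBytes = Array("qoif".utf8)  // [0x71, 0x6f, 0x69, 0x66]

// Fold the bytes together, most significant byte first (big-endian order).
let expectedMagic = magicBytes.reduce(UInt32(0)) { partial, byte in
  (partial << 8) | UInt32(byte)
}

assert(expectedMagic == 0x71_6f_69_66)
```

Reading the same four bytes as little-endian would produce a different integer, which is why the magic number also serves as an endianness check.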
94+
95+
#### Parsing dimensions
96+
97+
Next, the width and height are also stored as 32-bit values, but we want to use them in our type as `Int` values. Instead of parsing `UInt32` values and _then_ converting them to `Int`, we'll use an `Int` parser that specifies the storage type, handling any possible overflow:
98+
99+
```swift
100+
self.width = try Int(parsing: &input, storedAsBigEndian: UInt32.self)
101+
self.height = try Int(parsing: &input, storedAsBigEndian: UInt32.self)
102+
```
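To see why the conversion step matters, consider a platform where `Int` is 32 bits wide: a stored `UInt32` can then exceed the largest representable `Int`. The sketch below uses `Int32` as a stand-in for such an `Int` and relies only on the standard library's failable `init(exactly:)`; it illustrates the overflow case and is not the library's implementation:

```swift
// Stand-in for a platform where `Int` is 32 bits wide.
let stored: UInt32 = 3_000_000_000  // fits in UInt32, but not in Int32

// A failable conversion reports the overflow instead of trapping.
let converted = Int32(exactly: stored)
assert(converted == nil)

// A value that is in range converts without loss.
let smallWidth = Int32(exactly: UInt32(640))
assert(smallWidth == 640)
```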
103+
104+
#### Parsing `RawRepresentable` types
105+
106+
Because the `Channels` and `ColorSpace` enumerations are backed by a `FixedWidthInteger` type, the library provides parsers that load and validate the parsed values. These parsers throw an error if the parsed value isn't one of the type's declared cases:
107+
108+
```swift
109+
self.channels = try Channels(parsing: &input)
110+
self.colorspace = try ColorSpace(parsing: &input)
111+
```
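The validation described above is conceptually similar to the standard library's failable `init?(rawValue:)`, which returns `nil` for a raw value with no matching case. A small sketch using only the standard library (how the library's parsers perform this check internally is not shown in this article, so treat the comparison as an assumption):

```swift
// Same shape as the `Channels` enumeration declared earlier.
enum Channels: UInt8 {
  case rgb = 3, rgba = 4
}

// A declared raw value maps to its case.
assert(Channels(rawValue: 4) == .rgba)

// An undeclared raw value is rejected rather than producing an invalid case.
assert(Channels(rawValue: 5) == nil)
```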
112+
113+
#### Safe arithmetic
114+
115+
After parsing all of the header's values, the last step is to perform some validation. Using the library's optional multiplication operator (`*?`) allows for concise arithmetic while preventing integer overflow errors:
116+
117+
```swift
118+
guard let pixelCount = width *? height,
119+
pixelCount <= maxPixelCount,
120+
width > 0, height > 0
121+
else { throw QOIError() }
122+
```
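For comparison, the same kind of check can be written with the standard library's `multipliedReportingOverflow(by:)`. The `checkedProduct` helper below is hypothetical and only approximates the behavior described for `*?`; it is not the library's operator:

```swift
// Hypothetical helper, not the library's `*?` operator:
// returns nil when the product does not fit in `Int`.
func checkedProduct(_ lhs: Int, _ rhs: Int) -> Int? {
  let (product, overflow) = lhs.multipliedReportingOverflow(by: rhs)
  return overflow ? nil : product
}

assert(checkedProduct(640, 480) == 307_200)
assert(checkedProduct(Int.max, 2) == nil)
```

An optional-producing operator keeps this check inline in the `guard` condition, so no separate helper call is needed.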
123+
124+
### Bringing it together
125+
126+
The full parser implementation, as shown below, protects against buffer overruns, integer overflow, arithmetic overflow, type invalidity, and pointer lifetime errors:
127+
128+
```swift
129+
extension QOI.Header {
130+
init(parsing input: inout ParserSpan) throws {
131+
let magic = try UInt32(parsingBigEndian: &input)
132+
guard magic == 0x71_6f_69_66 else {
133+
throw QOIError()
134+
}
135+
136+
self.width = try Int(parsing: &input, storedAsBigEndian: UInt32.self)
137+
self.height = try Int(parsing: &input, storedAsBigEndian: UInt32.self)
138+
self.channels = try Channels(parsing: &input)
139+
self.colorspace = try ColorSpace(parsing: &input)
140+
141+
guard let pixelCount = width *? height,
142+
pixelCount <= maxPixelCount,
143+
width > 0, height > 0
144+
else { throw QOIError() }
145+
}
146+
}
147+
```
148+
149+
[qoi]: https://qoiformat.org/
150+
[examples]: https://github.com/apple/swift-binary-parsing/tree/main/Examples
Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,43 @@
1+
# Integer Parsers
2+
3+
Parse standard library integer types.
4+
5+
## Overview
6+
7+
The `BinaryParsing` integer parsers provide control over three different aspects of loading integers from raw data:
8+
9+
- _Size_: The size of the data in memory can be specified in three different ways. Use an integer type's direct parsing initializer, like `UInt16(parsingBigEndian:)`, to load from the exact size of the integer; use a parser with a `byteCount` parameter to specify an exact number of bytes; or use a parser like `Int(parsing:storedAsBigEndian:)` to load and convert from another integer's size in memory.
10+
- _Endianness_: The endianness of a value in memory can be specified either by choosing a parsing initializer with the required endianness or by passing an ``Endianness`` value to a parser. Note that endianness is not relevant when parsing a single-byte integer or an integer stored as a single byte.
11+
- _Signedness_: The signedness of the parsed value is chosen by the type being parsed or, for parsers like `Int(parsing:storedAs:)`, by the storage type of the parsed value.
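As a rough illustration of the endianness and signedness aspects listed above, the following standard-library sketch interprets the same raw bytes in different ways. It does not use the library's parsers, and the variable names are illustrative:

```swift
let bytes: [UInt8] = [0x01, 0x02]

// Big-endian: the first byte is the most significant.
let big = (UInt16(bytes[0]) << 8) | UInt16(bytes[1])
// Little-endian: the first byte is the least significant.
let little = (UInt16(bytes[1]) << 8) | UInt16(bytes[0])

assert(big == 0x0102)
assert(little == 0x0201)

// Signedness: the same single byte reads differently as unsigned or signed.
let raw: UInt8 = 0xFF
assert(Int(raw) == 255)
assert(Int8(bitPattern: raw) == -1)
```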
12+
13+
## Topics
14+
15+
### Fixed-size parsers
16+
17+
- ``SingleByteInteger/init(parsing:)``
18+
- ``MultiByteInteger/init(parsingBigEndian:)``
19+
- ``MultiByteInteger/init(parsingLittleEndian:)``
20+
- ``MultiByteInteger/init(parsing:endianness:)``
21+
22+
### Byte count-based parsers
23+
24+
- ``Swift/FixedWidthInteger/init(parsingBigEndian:byteCount:)``
25+
- ``Swift/FixedWidthInteger/init(parsingLittleEndian:byteCount:)``
26+
- ``Swift/FixedWidthInteger/init(parsing:endianness:byteCount:)``
27+
28+
### Parsing and converting
29+
30+
- ``Swift/FixedWidthInteger/init(parsing:storedAs:)``
31+
- ``Swift/FixedWidthInteger/init(parsing:storedAsBigEndian:)``
32+
- ``Swift/FixedWidthInteger/init(parsing:storedAsLittleEndian:)``
33+
- ``Swift/FixedWidthInteger/init(parsing:storedAs:endianness:)``
34+
35+
### Endianness
36+
37+
- ``Endianness``
38+
39+
### Supporting protocols
40+
41+
- ``SingleByteInteger``
42+
- ``MultiByteInteger``
43+
- ``PlatformWidthInteger``
Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
# Miscellaneous Parsers
2+
3+
Parse ranges and custom raw representable types.
4+
5+
## Topics
6+
7+
### Range parsers
8+
9+
- ``Swift/Range/init(parsingStartAndEnd:boundsParser:)-(_,(ParserSpan)(ParsingError)->Bound)``
10+
- ``Swift/Range/init(parsingStartAndCount:parser:)-(_,(ParserSpan)(ParsingError)->Bound)``
11+
- ``Swift/ClosedRange/init(parsingStartAndEnd:boundsParser:)-(_,(ParserSpan)(ParsingError)->Bound)``
12+
13+
### `RawRepresentable` parsers
14+
15+
- ``Swift/RawRepresentable/init(parsing:)``
16+
- ``Swift/RawRepresentable/init(parsingBigEndian:)``
17+
- ``Swift/RawRepresentable/init(parsingLittleEndian:)``
18+
- ``Swift/RawRepresentable/init(parsing:endianness:)``
19+
- ``Swift/RawRepresentable/init(parsing:storedAs:)``
20+
- ``Swift/RawRepresentable/init(parsing:storedAsBigEndian:)``
21+
- ``Swift/RawRepresentable/init(parsing:storedAsLittleEndian:)``
22+
- ``Swift/RawRepresentable/init(parsing:storedAs:endianness:)``
Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,50 @@
1+
# Optional Operations
2+
3+
Safely perform calculations with optional-producing operators.
4+
5+
## Overview
6+
7+
Optional operators provide a way to seamlessly work with newly parsed
8+
values without risk of integer overflow or other common errors that
9+
may result in a runtime error.
10+
11+
For example, the following code parses two values from a ``ParserSpan``,
12+
and then uses them to create a range:
13+
14+
```swift
15+
let start = try UInt16(parsingBigEndian: &input)
16+
let count = try UInt8(parsing: &input)
17+
guard let range = start ..<? (start +? count) else {
18+
throw MyParsingError(...)
19+
}
20+
```
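For comparison, here is a sketch of the same range construction written with the standard library's `addingReportingOverflow(_:)`. The `checkedRange` function is hypothetical and only approximates what the example above achieves with `+?` and `..<?`:

```swift
// Hypothetical helper, not part of the library: builds `start..<start + count`
// and returns nil if the addition overflows `UInt16`.
func checkedRange(start: UInt16, count: UInt8) -> Range<UInt16>? {
  let (end, overflow) = start.addingReportingOverflow(UInt16(count))
  return overflow ? nil : start..<end
}

assert(checkedRange(start: 10, count: 5) == 10..<15)
assert(checkedRange(start: 65_530, count: 10) == nil)
```

The optional-producing operators express the same intent in a single `guard` condition, without the intermediate tuple.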
21+
22+
## Topics
23+
24+
### Arithmetic operators
25+
26+
- ``Swift/Optional/+?(_:_:)``
27+
- ``Swift/Optional/-?(_:_:)``
28+
- ``Swift/Optional/*?(_:_:)``
29+
- ``Swift/Optional//?(_:_:)``
30+
- ``Swift/Optional/%?(_:_:)``
31+
32+
### Assigning arithmetic operators
33+
34+
- ``Swift/Optional/+?=(_:_:)``
35+
- ``Swift/Optional/-?=(_:_:)``
36+
- ``Swift/Optional/*?=(_:_:)``
37+
- ``Swift/Optional//?=(_:_:)``
38+
- ``Swift/Optional/%?=(_:_:)``
39+
40+
### Range operators
41+
42+
- ``Swift/Optional/..<?(_:_:)``
43+
- ``Swift/Optional/...?(_:_:)``
44+
45+
### Collection subscripting
46+
47+
- ``Swift/Collection/subscript(ifInBounds:)-(Index)``
48+
- ``Swift/Collection/subscript(ifInBounds:)-(FixedWidthInteger)``
49+
- ``Swift/Collection/subscript(ifInBounds:)-(Range<Index>)``
50+
- ``Swift/Collection/subscript(ifInBounds:)-(Range<FixedWidthInteger>)``
Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
# String Parsers
2+
3+
Parse strings of different lengths and encodings.
4+
5+
## Topics
6+
7+
### UTF-8 parsers
8+
9+
- ``Swift/String/init(parsingNulTerminated:)``
10+
- ``Swift/String/init(parsingUTF8:)``
11+
- ``Swift/String/init(parsingUTF8:count:)``
12+
13+
### UTF-16 parsers
14+
15+
- ``Swift/String/init(parsingUTF16:)``
16+
- ``Swift/String/init(parsingUTF16:codeUnitCount:)``
Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
1+
# Throwing Operations
2+
3+
Use throwing variations of arithmetic methods, integer conversions, and collection subscripting.
4+
5+
## Overview
6+
7+
## Topics
8+
9+
### Arithmetic operations
10+
11+
- ``Swift/FixedWidthInteger/addingThrowingOnOverflow(_:)``
12+
- ``Swift/FixedWidthInteger/subtractingThrowingOnOverflow(_:)``
13+
- ``Swift/FixedWidthInteger/multipliedThrowingOnOverflow(by:)``
14+
- ``Swift/FixedWidthInteger/dividedThrowingOnOverflow(by:)``
15+
- ``Swift/FixedWidthInteger/remainderThrowingOnOverflow(dividingBy:)``
16+
17+
### Assigning arithmetic operations
18+
19+
- ``Swift/FixedWidthInteger/addThrowingOnOverflow(_:)``
20+
- ``Swift/FixedWidthInteger/subtractThrowingOnOverflow(_:)``
21+
- ``Swift/FixedWidthInteger/multiplyThrowingOnOverflow(by:)``
22+
- ``Swift/FixedWidthInteger/divideThrowingOnOverflow(by:)``
23+
- ``Swift/FixedWidthInteger/formRemainderThrowingOnOverflow(dividingBy:)``
24+
25+
### Integer conversion
26+
27+
- ``Swift/BinaryInteger/init(throwingOnOverflow:)``
28+
29+
### Collection subscripting
30+
31+
- ``Swift/Collection/subscript(throwing:)->Self.Element``
32+
- ``Swift/Collection/subscript(throwing:)->Self.SubSequence``
Lines changed: 48 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,48 @@
1+
# ``BinaryParsing``
2+
3+
A library for building safe, efficient binary parsers in Swift.
4+
5+
## Overview
6+
7+
The `BinaryParsing` library provides a set of tools for safely parsing binary
8+
data, while managing type and memory safety and eliminating common value-based
9+
undefined behavior, such as integer overflow. The library provides:
10+
11+
- ``ParserSpan`` and ``ParserRange``: a raw span that is designed for efficient
12+
consumption of binary data, and a range type that represents a portion of that
13+
span for deferred processing. A `ParserSpan` is most often consumed from the
14+
front, and also supports seeking operations throughout the span.
15+
- Parsing initializers for standard library integer types, strings, arrays, and
16+
ranges, that specifically enable safe parsing practices. The library also
17+
provides parsing initializers that validate the result for `RawRepresentable`
18+
types.
19+
- Optional-producing operators and throwing methods for common arithmetic and
20+
other operations, for calculations with untrusted parsed values.
21+
- Adapters for data and collection types to make parsing simple at the call
22+
site.
23+
24+
25+
## Topics
26+
27+
### Essentials
28+
29+
- <doc:GettingStarted>
30+
- ``ParserSpan``
31+
- ``ParserRange``
32+
33+
### Parsing tools
34+
35+
- <doc:IntegerParsers>
36+
- <doc:StringParsers>
37+
- <doc:ArrayParsers>
38+
- <doc:MiscellaneousParsers>
39+
40+
### Working with untrusted values
41+
42+
- <doc:OptionalOperations>
43+
- <doc:ThrowingOperations>
44+
45+
### Error handling
46+
47+
- ``ParsingError``
48+
- ``ThrownParsingError``
