Fill out BinaryParsing documentation #16

Merged
merged 1 commit into from
Jul 12, 2025
5 changes: 4 additions & 1 deletion README.md
@@ -32,7 +32,10 @@ import BinaryParsing
 
 extension QOI.Header {
   init(parsing input: inout ParserSpan) throws {
-    try #magicNumber("qoif", parsing: &input)
+    let magic = try UInt32(parsingBigEndian: &input)
+    guard magic == 0x71_6f_69_66 else {
+      throw QOIError()
+    }
 
     // Loading 'Int' requires a byte count or guaranteed-size storage
     self.width = try Int(parsing: &input, storedAsBigEndian: UInt32.self)
16 changes: 16 additions & 0 deletions Sources/BinaryParsing/Documentation.docc/Articles/ArrayParsers.md
@@ -0,0 +1,16 @@
# Array Parsers

Parse arrays of bytes or other values.

## Topics

### Byte array parsers

- ``Swift/Array/init(parsingRemainingBytes:)``
- ``Swift/Array/init(parsing:byteCount:)``

### General array parsers

- ``Swift/Array/init(parsingAll:parser:)``
- ``Swift/Array/init(parsing:count:parser:)-(_,FixedWidthInteger,_)``
- ``Swift/Array/init(parsing:count:parser:)-(_,Int,_)``
150 changes: 150 additions & 0 deletions Sources/BinaryParsing/Documentation.docc/Articles/GettingStarted.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,150 @@
# Getting Started with BinaryParsing

Get up to speed with a library designed to make parsing binary data safe, efficient, and easy to understand.

## Overview

The BinaryParsing library provides a comprehensive set of tools for safely parsing binary data in Swift. The library provides the ``ParserSpan`` type, a consumable, memory-safe view into binary data, and defines a convention for writing concise, composable parsing functions.

Using the library's tools — including the span type, parser primitives, and operators for working with newly parsed values — you can prevent common pitfalls like buffer overruns, integer overflows, and type confusion that can lead to security vulnerabilities or crashes.

### A span type for parsing

A ``ParserSpan`` is a view into binary data that tracks your current position and the remaining number of bytes. All the provided parsers consume data from the start of the span, shrinking its size as they produce values. Unlike unsafe pointer operations, `ParserSpan` automatically prevents you from reading past the end of your data.
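To make the consuming behavior concrete, here is a minimal sketch of the idea using only the standard library. The names `ByteCursor`, `CursorError`, and `readByte()` are hypothetical and invented for this illustration; this is not how `ParserSpan` is implemented, and it leaves out the library's lifetime and performance guarantees.

```swift
// Hypothetical illustration only; not the library's ParserSpan.
struct CursorError: Error {}

struct ByteCursor {
  private var bytes: ArraySlice<UInt8>

  init(_ bytes: [UInt8]) { self.bytes = bytes[...] }

  /// The number of bytes that have not been consumed yet.
  var count: Int { bytes.count }

  /// Consumes one byte from the front, throwing instead of reading past the end.
  mutating func readByte() throws -> UInt8 {
    guard let first = bytes.popFirst() else { throw CursorError() }
    return first
  }
}

var cursor = ByteCursor([0x71, 0x6f])
let first = try? cursor.readByte()   // consumes 0x71; one byte remains
let second = try? cursor.readByte()  // consumes 0x6f; the cursor is now empty
let third = try? cursor.readByte()   // nil: the read is rejected, not performed
```

Each successful read shrinks the remaining view, and a read on an empty cursor fails with an error rather than touching memory outside the data.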

### Library-provided parsers

The library provides parsers for standard library integers, strings, ranges, and arrays of bytes or custom-parsed types. The convention for these is an initializer with an `inout ParserSpan` parameter, along with any other configuration parameters that are required. These parsers all throw a `ParsingError` when they encounter memory safety, type safety, or integer overflow errors.

For example, the parsing initializers for `Int` take the parser span as well as storage type or storage size and endianness:

```swift
let values = try myData.withParserSpan { input in
  let value1 = try Int(parsing: &input, storedAsBigEndian: Int32.self)
  let value2 = try Int(parsing: &input, endianness: .big, byteCount: 4)
  return (value1, value2)
}
```

Designing parser APIs as initializers is only a convention. If it feels more natural to write some parsers as free functions, static functions, or even as a parsing type, that's okay! You'll find cases of each of these in the project's [Examples directory][examples].

## Example: QOI Header

Let's explore BinaryParsing through a real-world example: parsing the header for an image stored in the QOI ([Quite OK Image][qoi]) format. QOI is a simple lossless image format that demonstrates many common patterns in binary parsing.

### The QOI header structure

A QOI file begins with a 14-byte header, as shown in the specification:

```c
qoi_header {
    char     magic[4];   // magic bytes "qoif"
    uint32_t width;      // image width in pixels (BE)
    uint32_t height;     // image height in pixels (BE)
    uint8_t  channels;   // 3 = RGB, 4 = RGBA
    uint8_t  colorspace; // 0 = sRGB with linear alpha
                         // 1 = all channels linear
};
```

### Parser implementation

Our declaration for the header in Swift corresponds to the specification, with `width` and `height` defined as `Int` and custom enumerations for the channels and colorspace:

```swift
extension QOI {
  struct Header {
    var width: Int
    var height: Int
    var channels: Channels
    var colorspace: ColorSpace
  }

  enum Channels: UInt8 {
    case rgb = 3, rgba = 4
  }

  enum ColorSpace: UInt8 {
    case sRGB = 0, linear = 1
  }
}
```

The parsing initializer follows the convention set by the library, with an `inout ParserSpan` parameter:

```swift
extension QOI.Header {
  init(parsing input: inout ParserSpan) throws {
    // Parsing goes here!
  }
}
```

Next, we'll walk through the implementation of that initializer, line by line, to look at the safety and ease of use in the BinaryParsing library APIs.

#### Magic number validation

The first value in the binary data is a "magic number" – a common practice in binary formats that acts as a quick check that you're reading the right kind of file and working with the correct endianness. The code uses a `UInt32` initializer to load a 32-bit big-endian value, and then checks it for correctness using `guard`:

```swift
let magic = try UInt32(parsingBigEndian: &input)
guard magic == 0x71_6f_69_66 else {
  throw QOIError()
}
```
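As a quick check on the constant used above, the following standard-library sketch assembles the ASCII bytes of `"qoif"` in big-endian order and compares the result with `0x71_6f_69_66`. The names `expected` and `swapped` are ours, chosen for illustration.

```swift
// Build a big-endian UInt32 from the four ASCII bytes of "qoif".
let expected = "qoif".utf8.reduce(UInt32(0)) { partial, byte in
  (partial << 8) | UInt32(byte)
}
assert(expected == 0x71_6f_69_66)

// The same four bytes read with the opposite byte order give a different value.
let swapped = expected.byteSwapped
assert(swapped == 0x66_69_6f_71)
```

Because the two byte orders produce different integers, comparing against the big-endian constant also catches data that is being read with the wrong endianness.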

#### Parsing dimensions

Next, the width and height are also stored as 32-bit values, but we want to use them in our type as `Int` values. Instead of parsing `UInt32` values and _then_ converting them to `Int`, we'll use an `Int` parser that specifies the storage type, handling any possible overflow:

```swift
self.width = try Int(parsing: &input, storedAsBigEndian: UInt32.self)
self.height = try Int(parsing: &input, storedAsBigEndian: UInt32.self)
```
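The overflow in question arises whenever the stored type can hold values that the destination type cannot. The standard-library sketch below shows that situation with `Int32` standing in for a 32-bit `Int`; the variable names are illustrative, and the library's parser is not used here.

```swift
let stored: UInt32 = 0xFFFF_FFFF  // a width as it might appear in untrusted data

// A trapping conversion such as Int32(stored) would crash at runtime.
// The failable conversion reports the problem instead.
let narrow = Int32(exactly: stored)  // nil: out of range for a 32-bit signed integer
let wide = Int64(exactly: stored)    // 4294967295: fits in 64 bits
```

A parser that names the storage type can perform this check as part of loading the value, so a failed conversion surfaces as a thrown error rather than a crash.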

#### Parsing `RawRepresentable` types

Because the `Channels` and `ColorSpace` enumerations are backed by a `FixedWidthInteger` type, the library provides parsers that load and validate the parsed values. These parsers throw an error if the parsed value isn't one of the type's declared cases:

```swift
self.channels = try Channels(parsing: &input)
self.colorspace = try ColorSpace(parsing: &input)
```
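This validation builds on standard Swift behavior: an enumeration with a raw type has a failable `init(rawValue:)` that returns `nil` for undeclared values. The sketch below uses a standalone copy of `Channels` (not nested in `QOI`) and only the standard library.

```swift
// Standalone copy of the enumeration, for illustration.
enum Channels: UInt8 {
  case rgb = 3, rgba = 4
}

let valid = Channels(rawValue: 4)    // .rgba
let invalid = Channels(rawValue: 5)  // nil: 5 is not a declared case
```

Where the standard initializer returns `nil`, the parsing initializers described above throw, so an invalid byte stops parsing at the point where it is read.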

#### Safe arithmetic

After parsing all of the header's values, the last step is to perform some validation. Using the library's optional multiplication operator (`*?`) allows for concise arithmetic while preventing integer overflow errors:

```swift
guard let pixelCount = width *? height,
  pixelCount <= maxPixelCount,
  width > 0, height > 0
else { throw QOIError() }
```
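For intuition about what an optional-producing multiplication does, here is a sketch built on the standard library's `multipliedReportingOverflow(by:)`. The helper `checkedProduct` is hypothetical; it is not the library's `*?` operator, whose implementation is not shown in this article.

```swift
// Hypothetical helper: returns nil instead of trapping when the product overflows.
func checkedProduct(_ a: Int, _ b: Int) -> Int? {
  let (product, overflow) = a.multipliedReportingOverflow(by: b)
  return overflow ? nil : product
}

let small = checkedProduct(800, 600)   // 480000
let huge = checkedProduct(Int.max, 2)  // nil instead of a runtime trap
```

Returning an optional lets the overflow case join the other conditions in a single `guard`, as in the validation code above.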

### Bringing it together

The full parser implementation, as shown below, protects against buffer overruns, integer overflow, arithmetic overflow, type invalidity, and pointer lifetime errors:

```swift
extension QOI.Header {
  init(parsing input: inout ParserSpan) throws {
    let magic = try UInt32(parsingBigEndian: &input)
    guard magic == 0x71_6f_69_66 else {
      throw QOIError()
    }

    self.width = try Int(parsing: &input, storedAsBigEndian: UInt32.self)
    self.height = try Int(parsing: &input, storedAsBigEndian: UInt32.self)
    self.channels = try Channels(parsing: &input)
    self.colorspace = try ColorSpace(parsing: &input)

    guard let pixelCount = width *? height,
      pixelCount <= maxPixelCount,
      width > 0, height > 0
    else { throw QOIError() }
  }
}
```

[qoi]: https://qoiformat.org/
[examples]: https://github.com/apple/swift-binary-parsing/tree/main/Examples
@@ -0,0 +1,43 @@
# Integer Parsers

Parse standard library integer types.

## Overview

The `BinaryParsing` integer parsers provide control over three different aspects of loading integers from raw data:

- _Size_: The size of the data in memory can be specified in three different ways. Use an integer type's direct parsing initializer, like `UInt16(parsingBigEndian:)`, to load from the exact size of the integer; use a parser with a `byteCount` parameter to specify an exact number of bytes; or use a parser like `Int(parsing:storedAsBigEndian:)` to load and convert from another integer's size in memory.
- _Endianness_: The endianness of a value in memory can be specified either by choosing a parsing initializer with the required endianness or by passing an ``Endianness`` value to a parser. Note that endianness is not relevant when parsing a single-byte integer or an integer stored as a single byte.
- _Signedness_: The signedness of the parsed value is chosen by the type being parsed or, for the parsers like `Int(parsing:storedAs:)`, by the storage type of the parsed value.
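The three aspects above can be seen by interpreting the same two bytes in different ways. This sketch uses only standard library operations, with illustrative variable names; it does not call the parsers listed below.

```swift
let bytes: [UInt8] = [0xFF, 0xFE]

// Endianness: the same two bytes, assembled in either order.
let big = UInt16(bytes[0]) << 8 | UInt16(bytes[1])     // 0xFFFE
let little = UInt16(bytes[1]) << 8 | UInt16(bytes[0])  // 0xFEFF

// Signedness: the same bit pattern reinterpreted as a signed value.
let signed = Int16(bitPattern: big)                    // -2

// Size: widening to Int keeps the value determined by the storage type.
let widenedUnsigned = Int(big)   // 65534
let widenedSigned = Int(signed)  // -2
```

The final two lines show why the storage type matters when converting: the same 16 bits widen to different `Int` values depending on whether they were stored as signed or unsigned.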

## Topics

### Fixed-size parsers

- ``SingleByteInteger/init(parsing:)``
- ``MultiByteInteger/init(parsingBigEndian:)``
- ``MultiByteInteger/init(parsingLittleEndian:)``
- ``MultiByteInteger/init(parsing:endianness:)``

### Byte count-based parsers

- ``Swift/FixedWidthInteger/init(parsingBigEndian:byteCount:)``
- ``Swift/FixedWidthInteger/init(parsingLittleEndian:byteCount:)``
- ``Swift/FixedWidthInteger/init(parsing:endianness:byteCount:)``

### Parsing and converting

- ``Swift/FixedWidthInteger/init(parsing:storedAs:)``
- ``Swift/FixedWidthInteger/init(parsing:storedAsBigEndian:)``
- ``Swift/FixedWidthInteger/init(parsing:storedAsLittleEndian:)``
- ``Swift/FixedWidthInteger/init(parsing:storedAs:endianness:)``

### Endianness

- ``Endianness``

### Supporting protocols

- ``SingleByteInteger``
- ``MultiByteInteger``
- ``PlatformWidthInteger``
@@ -0,0 +1,22 @@
# Miscellaneous Parsers

Parse ranges and custom raw representable types.

## Topics

### Range parsers

- ``Swift/Range/init(parsingStartAndEnd:boundsParser:)-(_,(ParserSpan)(ParsingError)->Bound)``
- ``Swift/Range/init(parsingStartAndCount:parser:)-(_,(ParserSpan)(ParsingError)->Bound)``
- ``Swift/ClosedRange/init(parsingStartAndEnd:boundsParser:)-(_,(ParserSpan)(ParsingError)->Bound)``

### `RawRepresentable` parsers

- ``Swift/RawRepresentable/init(parsing:)``
- ``Swift/RawRepresentable/init(parsingBigEndian:)``
- ``Swift/RawRepresentable/init(parsingLittleEndian:)``
- ``Swift/RawRepresentable/init(parsing:endianness:)``
- ``Swift/RawRepresentable/init(parsing:storedAs:)``
- ``Swift/RawRepresentable/init(parsing:storedAsBigEndian:)``
- ``Swift/RawRepresentable/init(parsing:storedAsLittleEndian:)``
- ``Swift/RawRepresentable/init(parsing:storedAs:endianness:)``
@@ -0,0 +1,50 @@
# Optional Operations

Safely perform calculations with optional-producing operators.

## Overview

Optional operators provide a way to seamlessly work with newly parsed
values without risk of integer overflow or other common errors that
may result in a runtime error.

For example, the following code parses two values from a ``ParserSpan``,
and then uses them to create a range:

```swift
let start = try UInt16(parsingBigEndian: &input)
let count = try UInt8(parsing: &input)
guard let range = start ..<? (start +? count) else {
  throw MyParsingError(...)
}
```
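To show the failure mode that this kind of code guards against, the sketch below builds a range from a start and a count using the standard library's `addingReportingOverflow(_:)`. The function `checkedRange(start:count:)` is a hypothetical stand-in, not one of the library's operators, and it takes both values as `UInt16` for simplicity.

```swift
// Hypothetical helper: returns nil when start + count does not fit in UInt16.
func checkedRange(start: UInt16, count: UInt16) -> Range<UInt16>? {
  let (end, overflow) = start.addingReportingOverflow(count)
  // Without overflow, end >= start, so forming the range cannot trap.
  return overflow ? nil : start..<end
}

let inBounds = checkedRange(start: 10, count: 5)       // 10..<15
let overflows = checkedRange(start: 65_530, count: 10) // nil: the end would exceed UInt16.max
```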

## Topics

### Arithmetic operators

- ``Swift/Optional/+?(_:_:)``
- ``Swift/Optional/-?(_:_:)``
- ``Swift/Optional/*?(_:_:)``
- ``Swift/Optional//?(_:_:)``
- ``Swift/Optional/%?(_:_:)``

### Assigning arithmetic operators

- ``Swift/Optional/+?=(_:_:)``
- ``Swift/Optional/-?=(_:_:)``
- ``Swift/Optional/*?=(_:_:)``
- ``Swift/Optional//?=(_:_:)``
- ``Swift/Optional/%?=(_:_:)``

### Range operators

- ``Swift/Optional/..<?(_:_:)``
- ``Swift/Optional/...?(_:_:)``

### Collection subscripting

- ``Swift/Collection/subscript(ifInBounds:)-(Index)``
- ``Swift/Collection/subscript(ifInBounds:)-(FixedWidthInteger)``
- ``Swift/Collection/subscript(ifInBounds:)-(Range<Index>)``
- ``Swift/Collection/subscript(ifInBounds:)-(Range<FixedWidthInteger>)``
16 changes: 16 additions & 0 deletions Sources/BinaryParsing/Documentation.docc/Articles/StringParsers.md
@@ -0,0 +1,16 @@
# String Parsers

Parse strings of different lengths and encodings.

## Topics

### UTF-8 parsers

- ``Swift/String/init(parsingNulTerminated:)``
- ``Swift/String/init(parsingUTF8:)``
- ``Swift/String/init(parsingUTF8:count:)``

### UTF-16 parsers

- ``Swift/String/init(parsingUTF16:)``
- ``Swift/String/init(parsingUTF16:codeUnitCount:)``
@@ -0,0 +1,32 @@
# Throwing Operations

Use throwing variations of arithmetic methods, integer conversions, and collection subscripting.

## Overview

## Topics

### Arithmetic operations

- ``Swift/FixedWidthInteger/addingThrowingOnOverflow(_:)``
- ``Swift/FixedWidthInteger/subtractingThrowingOnOverflow(_:)``
- ``Swift/FixedWidthInteger/multipliedThrowingOnOverflow(by:)``
- ``Swift/FixedWidthInteger/dividedThrowingOnOverflow(by:)``
- ``Swift/FixedWidthInteger/remainderThrowingOnOverflow(dividingBy:)``

### Assigning arithmetic operations

- ``Swift/FixedWidthInteger/addThrowingOnOverflow(_:)``
- ``Swift/FixedWidthInteger/subtractThrowingOnOverflow(_:)``
- ``Swift/FixedWidthInteger/multiplyThrowingOnOverflow(by:)``
- ``Swift/FixedWidthInteger/divideThrowingOnOverflow(by:)``
- ``Swift/FixedWidthInteger/formRemainderThrowingOnOverflow(dividingBy:)``

### Integer conversion

- ``Swift/BinaryInteger/init(throwingOnOverflow:)``

### Collection subscripting

- ``Swift/Collection/subscript(throwing:)->Self.Element``
- ``Swift/Collection/subscript(throwing:)->Self.SubSequence``
48 changes: 48 additions & 0 deletions Sources/BinaryParsing/Documentation.docc/BinaryParsing.md
@@ -0,0 +1,48 @@
# ``BinaryParsing``

A library for building safe, efficient binary parsers in Swift.

## Overview

The `BinaryParsing` library provides a set of tools for safely parsing binary
data, while managing type and memory safety and eliminating common value-based
undefined behavior, such as integer overflow. The library provides:

- ``ParserSpan`` and ``ParserRange``: a raw span that is designed for efficient
consumption of binary data, and a range type that represents a portion of that
span for deferred processing. A `ParserSpan` is most often consumed from the
front, and also supports seeking operations throughout the span.
- Parsing initializers for standard library integer types, strings, arrays, and
ranges, that specifically enable safe parsing practices. The library also
provides parsing initializers that validate the result for `RawRepresentable`
types.
- Optional-producing operators and throwing methods for common arithmetic and
other operations, for calculations with untrusted parsed values.
- Adapters for data and collection types to make parsing simple at the call
site.


## Topics

### Essentials

- <doc:GettingStarted>
- ``ParserSpan``
- ``ParserRange``

### Parsing tools

- <doc:IntegerParsers>
- <doc:StringParsers>
- <doc:ArrayParsers>
- <doc:MiscellaneousParsers>

### Working with untrusted values

- <doc:OptionalOperations>
- <doc:ThrowingOperations>

### Error handling

- ``ParsingError``
- ``ThrownParsingError``