Add cc.audio.wav module #1928

Open. Wants to merge 2 commits into base: mc-1.20.x
@@ -13,17 +13,17 @@ Typically DFPWM audio is read from [the filesystem][`fs.ReadHandle`] or a [a web
and converted a format suitable for [`speaker.playAudio`].

## Encoding and decoding files
This modules exposes two key functions, [`make_decoder`] and [`make_encoder`], which construct a new decoder or encoder.
This module exposes two key functions, [`make_decoder`] and [`make_encoder`], which construct a new decoder or encoder.
The returned encoder/decoder is itself a function, which converts between the two kinds of data.

These encoders and decoders have lots of hidden state, so you should be careful to use the same encoder or decoder for
a specific audio stream. Typically you will want to create a decoder for each stream of audio you read, and an encoder
for each one you write.

## Converting audio to DFPWM
DFPWM is not a popular file format and so standard audio processing tools will not have an option to export to it.
DFPWM is not a popular file format and so standard audio processing tools may not have an option to export to it.
Instead, you can convert audio files online using [music.madefor.cc], the [LionRay Wav Converter][LionRay] Java
application or development builds of [FFmpeg].
application or [FFmpeg] 5.1 or later.

[music.madefor.cc]: https://music.madefor.cc/ "DFPWM audio converter for Computronics and CC: Tweaked"
[LionRay]: https://github.com/gamax92/LionRay/ "LionRay Wav Converter "
@@ -0,0 +1,224 @@
-- SPDX-FileCopyrightText: 2024 The CC: Tweaked Developers
--
-- SPDX-License-Identifier: MPL-2.0

--[[-
Read WAV audio files into a table, including audio data.

WAV is a common file format used to store audio with metadata, including
information about the type of audio stored inside. WAV can store many different
types of codecs inside, including PCM and [DFPWM][`cc.audio.dfpwm`].

This module exposes a function to parse a WAV file into a table, [`readWAV`].
This function takes in the binary data from a WAV file, and outputs a more
usable table format with all the metadata and file audio inside. It also has a
[`readWAVFile`] function to simplify reading from a single file.

@see speaker.playAudio To play the chunks decoded by this module.
@since 1.113.0
@usage Reads "data/example.wav" into a table, prints its codec, sample rate,
and length in seconds, and plays the audio on a speaker.

```lua
local wav = require("cc.audio.wav")
local speaker = peripheral.find("speaker")

local audio = wav.readWAVFile("data/example.wav")
print("Codec type:", audio.codec)
print("Sample rate:", audio.sampleRate, "Hz")
-- audio.length is the length in samples; divide by sample rate to get seconds
print("Length:", audio.length / audio.sampleRate, "s")

for chunk in audio.read, 131072 do
while not speaker.playAudio(chunk) do
os.pullEvent("speaker_audio_empty")
end
end
```
]]

local expect = require "cc.expect".expect
local dfpwm = require "cc.audio.dfpwm"

local str_unpack, str_sub, math_floor = string.unpack, string.sub, math.floor

local dfpwmUUID = "3ac1fa38-811d-4361-a40d-ce53ca607cd1" -- UUID for DFPWM in WAV files

local function uuidBytes(uuid) return uuid:gsub("-", ""):gsub("%x%x", function(c) return string.char(tonumber(c, 16)) end) end

local wavExtensible = {
dfpwm = uuidBytes(dfpwmUUID),
pcm = uuidBytes "01000000-0000-1000-8000-00aa00389b71",
msadpcm = uuidBytes "02000000-0000-1000-8000-00aa00389b71",
alaw = uuidBytes "06000000-0000-1000-8000-00aa00389b71",
ulaw = uuidBytes "07000000-0000-1000-8000-00aa00389b71",
adpcm = uuidBytes "11000000-0000-1000-8000-00aa00389b71",
pcm_float = uuidBytes "03000000-0000-1000-8000-00aa00389b71",
}

local wavMetadata = {
IPRD = "album",
INAM = "title",
IART = "artist",
IWRI = "author",
IMUS = "composer",
IPRO = "producer",
IPRT = "trackNumber",
ITRK = "trackNumber",
IFRM = "trackCount",
PRT1 = "partNumber",
PRT2 = "partCount",
TLEN = "length",
IRTD = "rating",
ICRD = "date",
ITCH = "encodedBy",
ISFT = "encoder",
ISRF = "media",
IGNR = "genre",
ICMT = "comment",
ICOP = "copyright",
ILNG = "language",
}

--[[- Read WAV data into a table.

The returned table contains the following fields:
- `codec`: A string with information about the codec used in the file (one of `u8`, `s16`, `s24`, `s32`, `f32`, `dfpwm`)
- `sampleRate`: The sample rate of the audio in Hz. If this is not 48000, the file will need to be resampled to play correctly.
- `channels`: The number of channels in the file (1 = mono, 2 = stereo).
- `length`: The number of samples in the file. Divide by sample rate to get seconds.
- `metadata`: If the WAV file contains `INFO` metadata, this table contains the metadata.
Known keys are converted to friendly names like `artist`, `album`, and `trackNumber`, while unknown keys are kept the same.
Otherwise, this table is empty.
- `read(length: number): number[]...`: This is a function that reads the audio data in chunks.
It takes the number of samples to read, and returns each channel chunk as multiple return values.
Channel data is in the same format as `speaker.playAudio` takes: 8-bit signed numbers.

@tparam string data The WAV data to read.
@treturn table The decoded WAV file data table.
]]
local function readWAV(data)
expect(1, data, "string")
local bitDepth, dataType, blockAlign
local temp, pos = str_unpack("c4", data)
if temp ~= "RIFF" then error("bad argument #1 (not a WAV file)", 2) end
Member

I realise CC is a little inconsistent here [1], but I think we should probably follow what Lua does, and return nil + the error message instead of erroring here.

Footnotes

  1. For example, textutils.unserialiseJSON should probably return nil, message rather than erroring.
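
A minimal sketch of the `nil, message` convention suggested in this comment. `checkHeader` is a hypothetical helper, not part of the PR; it validates only the RIFF/WAVE header and assumes `string.unpack` is available, as the PR itself does.

```lua
-- Hypothetical helper (not in the PR): validate only the RIFF/WAVE header and
-- report malformed input as `nil, message` instead of raising an error.
local function checkHeader(data)
    -- Wrong argument types are programmer errors, so they still raise.
    if type(data) ~= "string" then
        error("bad argument #1 (string expected, got " .. type(data) .. ")", 2)
    end
    -- A RIFF header is 12 bytes: "RIFF", a 4-byte size, then "WAVE".
    if #data < 12 then return nil, "not a WAV file" end
    local riff, _, wave = string.unpack("<c4I4c4", data)
    if riff ~= "RIFF" or wave ~= "WAVE" then
        return nil, "not a WAV file"
    end
    return true
end

-- Callers check the first return value instead of wrapping the call in pcall.
local ok, err = checkHeader("OggS" .. ("\0"):rep(8))
if not ok then print("Could not read audio: " .. err) end
```

With this shape, a caller such as the `speaker` program could test the first result directly rather than using `pcall`.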

pos = pos + 4
temp, pos = str_unpack("c4", data, pos)
if temp ~= "WAVE" then error("bad argument #1 (not a WAV file)", 2) end
local retval = { metadata = {} }
while pos <= #data do
local size
temp, pos = str_unpack("c4", data, pos)
size, pos = str_unpack("<I", data, pos)
if temp == "fmt " then
local chunk = str_sub(data, pos, pos + size - 1)
pos = pos + size
local format
format, retval.channels, retval.sampleRate, blockAlign, bitDepth = str_unpack("<HHIxxxxHH", chunk)
if format == 1 then
dataType = bitDepth == 8 and "unsigned" or "signed"
retval.codec = (bitDepth == 8 and "u" or "s") .. bitDepth
elseif format == 3 then
dataType = "float"
retval.codec = "f32"
elseif format == 0xFFFE then
bitDepth = str_unpack("<H", chunk, 19)
local uuid = str_sub(chunk, 25, 40)
if uuid == wavExtensible.pcm then
dataType = bitDepth == 8 and "unsigned" or "signed"
retval.codec = (bitDepth == 8 and "u" or "s") .. bitDepth
elseif uuid == wavExtensible.dfpwm then
dataType = "dfpwm"
retval.codec = "dfpwm"
elseif uuid == wavExtensible.pcm_float then
dataType = "float"
retval.codec = "f32"
else error("unsupported WAV file", 2) end
else error("unsupported WAV file", 2) end
elseif temp == "data" then
local data = str_sub(data, pos, pos + size - 1)
if #data < size then error("invalid WAV file", 2) end
if not retval.length then retval.length = size / blockAlign end
pos = pos + size
local pos = 1
local channels = retval.channels
if dataType == "dfpwm" then
local decoder = dfpwm.make_decoder()
function retval.read(samples)
if pos > #data then return nil end
local chunk = decoder(str_sub(data, pos, pos + math.ceil(samples * channels / 8) - 1))
pos = pos + math.ceil(samples * channels / 8)
local res = {}
for i = 1, channels do
local c = {}
res[i] = c
for j = 1, samples do
c[j] = chunk[(j - 1) * channels + i]
end
end
return table.unpack(res)
end
else
local format = dataType == "unsigned" and "I" .. (bitDepth / 8) or (dataType == "signed" and "i" .. (bitDepth / 8) or "f")
local transform
if dataType == "unsigned" then
function transform(n) return n - 128 end
elseif dataType == "signed" then
if bitDepth == 16 then function transform(n) return math_floor(n / 0x100) end
elseif bitDepth == 24 then function transform(n) return math_floor(n / 0x10000) end
elseif bitDepth == 32 then function transform(n) return math_floor(n / 0x1000000) end end
Member

Should there be an error here for an invalid bit depth?
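
One way the check asked about here could look, as a sketch only. `makeSignedTransform` and the `divisors` table are hypothetical names; the divisors match the ones the PR uses for 16-, 24- and 32-bit signed PCM.

```lua
-- Hypothetical helper (not in the PR): choose the conversion from signed PCM
-- samples to 8-bit signed values, rejecting bit depths that are not handled.
local divisors = { [16] = 0x100, [24] = 0x10000, [32] = 0x1000000 }

local function makeSignedTransform(bitDepth)
    local divisor = divisors[bitDepth]
    if not divisor then
        return nil, "unsupported bit depth: " .. tostring(bitDepth)
    end
    -- Dropping the low-order bits keeps the result within -128..127.
    return function(n) return math.floor(n / divisor) end
end

local transform, err = makeSignedTransform(12)
if not transform then print(err) end
```

Looking the divisor up in a table means an unknown bit depth fails at the point where the transform is chosen, rather than later when `transform` is called while still `nil`.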

elseif dataType == "float" then
function transform(n) return math_floor(n * (n < 0 and 128 or 127)) end
end
function retval.read(samples)
if pos > #data then return nil end
local chunk = { ("<" .. format:rep(math.min(samples * channels, (#data - pos + 1) / (bitDepth / 8)))):unpack(data, pos) }
Member

@SquidDev, Aug 9, 2024

Oh, I don't feel good about this line (the extra format string allocation, and then putting it into a table), but some quick benchmarks do show it's the quickest approach :(. The crimes we must commit sometimes!

Contributor Author

Yeah, I didn't like it either, but I found the same results after a comparison in AUKit. It's gross, but it's for the best.
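
For context, the pattern discussed in this thread can be shown in isolation. This is an illustrative sketch with made-up sample data, not the PR's code: it builds one format string by repeating a single-sample format, unpacks every sample in one call, and pops the trailing "next position" value off the result table.

```lua
-- Three little-endian signed 16-bit samples, packed for illustration.
local data = string.pack("<i2i2i2", 256, -256, 512)

-- Repeat the per-sample format once per sample, then unpack them all at once.
local count = math.floor(#data / 2)
local samples = { string.unpack("<" .. ("i2"):rep(count), data) }

-- string.unpack returns the next read position as its last result, so the
-- final table entry is a position rather than a sample.
local nextPos = table.remove(samples)

print(#samples, nextPos)
```

The cost the comment refers to is the temporary format string and the temporary table created on each read.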

pos = table.remove(chunk)
local res = {}
for i = 1, channels do
local c = {}
res[i] = c
for j = 1, samples do
c[j] = transform(chunk[(j - 1) * channels + i])
end
end
return table.unpack(res)
end
end
elseif temp == "fact" then
retval.length, pos = str_unpack("<I4", data, pos)
elseif temp == "LIST" then
local type = str_unpack("c4", data, pos)
if type == "INFO" then
local e = pos + size
pos = pos + 4
while pos < e do
local str
type, str, pos = str_unpack("!2<c4s4Xh", data, pos)
str = str:gsub("\0+$", "")
if wavMetadata[type] then retval.metadata[wavMetadata[type]] = tonumber(str) or str
else retval.metadata[type] = tonumber(str) or str end
end
else pos = pos + size end
else pos = pos + size end
end
if not retval.read then error("invalid WAV file", 2) end
return retval
end

--- Reads a WAV file from a path.
--
-- This functions identically to [`readWAV`], but reads from a file instead.
--
-- @tparam string path The (absolute) path to read from.
-- @treturn table The decoded WAV file table.
-- @see readWAV To read WAV data from a string.
local function readWAVFile(path)
expect(1, path, "string")
local file = assert(fs.open(path, "rb"))
local data = file.readAll()
Member

It would be nice if we could avoid reading the whole file into memory at once. I wonder if readWAV could take a "reader" function (like function(bytes: number): string|nil), so you can just do readWAV(file.read) directly.

This would require some changes to parsing the INFO section (which uses string.unpack("s")), but I think otherwise should be possible?

I guess this would then mean there needs to be a close method on the WAV handle, to clean up the underlying file handle. Ughr.
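
A rough sketch of the reader-function idea, under stated assumptions. `readChunkHeader` and `stringReader` are hypothetical names; the reader follows the `function(bytes: number): string|nil` shape proposed above, and a string-backed reader stands in for `file.read` so the example runs outside CC: Tweaked.

```lua
-- Hypothetical: read one RIFF chunk header (4-byte id, 4-byte little-endian
-- size) through a reader function instead of indexing into a full string.
local function readChunkHeader(read)
    local header = read(8)
    if not header or #header < 8 then return nil end
    local id, size = string.unpack("<c4I4", header)
    return id, size
end

-- Stand-in for file.read: returns up to n bytes per call, or nil at the end.
local function stringReader(s)
    local pos = 1
    return function(n)
        if pos > #s then return nil end
        local part = s:sub(pos, pos + n - 1)
        pos = pos + n
        return part
    end
end

local read = stringReader("fmt " .. string.pack("<I4", 16) .. ("\0"):rep(16))
local id, size = readChunkHeader(read)
print(id, size)

-- Skip the chunk body by reading and discarding `size` bytes.
read(size)
```

Only the bytes requested so far need to be in memory with this approach; the sketch leaves open the `close` question raised in the comment.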

file.close()
return readWAV(data)
end

return { readWAV = readWAV, readWAVFile = readWAVFile }
@@ -22,14 +22,6 @@ local function get_speakers(name)
end
end

local function pcm_decoder(chunk)
local buffer = {}
for i = 1, #chunk do
buffer[i] = chunk:byte(i) - 128
end
return buffer
end

local function report_invalid_format(format)
printError(("speaker cannot play %s files."):format(format))
local pp = require "cc.pretty"
@@ -63,60 +55,35 @@ elseif cmd == "play" then
end

local start = handle.read(4)
local pcm = false
local wav = false
local size = 16 * 1024 - 4
if start == "RIFF" then
handle.read(4)
if handle.read(8) ~= "WAVEfmt " then
handle.close()
error("Could not play audio: Unsupported WAV file", 0)
end

local fmtsize = ("<I4"):unpack(handle.read(4))
local fmt = handle.read(fmtsize)
local format, channels, rate, _, _, bits = ("<I2I2I4I4I2I2"):unpack(fmt)
if not ((format == 1 and bits == 8) or (format == 0xFFFE and bits == 1)) then
handle.close()
error("Could not play audio: Unsupported WAV file", 0)
end
if channels ~= 1 or rate ~= 48000 then
print("Warning: Only 48 kHz mono WAV files are supported. This file may not play correctly.")
local data = start .. handle.readAll()
handle.close()
local ok
ok, handle = pcall(require("cc.audio.wav").readWAV, data)
if not ok then
printError("Could not play audio:")
error(handle, 0) -- on failure, pcall's second result (the error message) was assigned to handle
end
if format == 0xFFFE then
local guid = fmt:sub(25)
if guid ~= "\x3A\xC1\xFA\x38\x81\x1D\x43\x61\xA4\x0D\xCE\x53\xCA\x60\x7C\xD1" then -- DFPWM format GUID
handle.close()
error("Could not play audio: Unsupported WAV file", 0)
end
size = size + 4
else
pcm = true
size = 16 * 1024 * 8
end

repeat
local chunk = handle.read(4)
if chunk == nil then
handle.close()
error("Could not play audio: Invalid WAV file", 0)
elseif chunk ~= "data" then -- Ignore extra chunks
local size = ("<I4"):unpack(handle.read(4))
handle.read(size)
end
until chunk == "data"

handle.read(4)
wav = true
start = nil
if handle.sampleRate ~= 48000 then error("Could not play audio: Unsupported sample rate") end
if handle.channels ~= 1 then printError("This audio file has more than one channel. It may not play correctly.") end
-- Detect several other common audio files.
elseif start == "OggS" then return report_invalid_format("Ogg")
elseif start == "fLaC" then return report_invalid_format("FLAC")
elseif start:sub(1, 3) == "ID3" then return report_invalid_format("MP3")
elseif start == "<!DO" --[[<!DOCTYPE]] then return report_invalid_format("HTML")
end

print("Playing " .. file)
if handle.metadata and handle.metadata.title and handle.metadata.artist then
print("Playing " .. handle.metadata.artist .. " - " .. handle.metadata.title)
else
print("Playing " .. file)
end

local decoder = pcm and pcm_decoder or require "cc.audio.dfpwm".make_decoder()
local decoder = wav and function(c) return c end or require "cc.audio.dfpwm".make_decoder()
while true do
local chunk = handle.read(size)
if not chunk then break end