-
Notifications
You must be signed in to change notification settings - Fork 2
Type consistency fixes and Cortex-M0+ assembly optimizations #40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Type consistency fixes and Cortex-M0+ assembly optimizations #40
Conversation
- Use size_t for memory function parameters and string utilities - Add const qualifiers to function parameters where appropriate - Use UINT32_MAX sentinel instead of -1 for EEPROM wear-leveling indices - Fix type consistency in rfm69 (retries), USB CDC, and other modules - Simplify __STRUNCATE to avoid sign conversion warning - Clean up usbCDCTask with early continue pattern Co-Authored-By: Claude <[email protected]>
- Replace uint_fast8_t/int_fast8_t with uint8_t/int8_t for storage - Use int32_t for hot-path loop counters to avoid extend instructions - Change function parameters to uint8_t where values are small - Use .bit.FIELD = 1 pattern instead of .reg |= MASK for single bits - Add explicit casts to silence truncation warnings - Mask PMUX values to 4 bits for bit-field assignment Saves RAM by using appropriate sized types while maintaining performance in DSP hot-path functions. Co-Authored-By: Claude Opus 4.5 <[email protected]>
|
I definitely prefer the .reg and mask format, rather than the bit fields. It’s clear exactly what is happening, and it makes the use of the vendor defines rather than magic numbers. As ever, I’m not precious about! Nice work on the typing improvements, the ainRemap alone is worth more than 100 bytes 👌 |
Well, the problem with .reg and mask, you need a cast because the ~ makes a 32bit out of the vendor defines. That's also very bad/annoying that they're so many defines, that's not type-safe😢. |
|
Yes, fair on the 32bit extension for the narrower registers.
Welcome to lovely world of vendor headers 👍 |
- util: Add utilUtoa for unsigned int-to-string, refactor utilItoa to delegate to utilUtoa after handling sign - dataPack: Add strnCatUint using utilUtoa, update call sites for unsigned values (msg, wattHour, pulseCnt, id) - dataPack: Fix int16_t truncation warning with explicit cast - emon32.h: Change PackedDataCommon V[] to uint16_t (voltage is positive) Co-Authored-By: Claude <[email protected]>
- Change wlIdxNxtWr from uint32_t to uint8_t with UINT8_MAX sentinel - Remove unused pIdx parameter from eepromWriteWL() - Fix signed/unsigned type mismatches in eeprom.c - Add explicit casts for i2cDataWrite address.lsb calls - Update loop counters to appropriate unsigned types - Add 'u' suffix to constants for type consistency Co-Authored-By: Claude <[email protected]>
- configTimeToCycles: return uint16_t instead of int32_t - dataPackPacked: return uint8_t instead of int8_t - uartPutsNonBlocking: change len from uint16_t to size_t - Fix loop counter types (int32_t -> uint8_t/uint32_t) Co-Authored-By: Claude <[email protected]>
- Change int32_t loop counters to uint32_t in emon_CM.c - Update related type signatures in emon_CM.h - Fix loop counter types in emon32.c and configuration.c Co-Authored-By: Claude <[email protected]>
- ainRemap: int8_t -> uint8_t (values are 0-11, always positive) - mapCTLog: int8_t -> uint8_t - Fix related loop counter types Co-Authored-By: Claude <[email protected]>
Update parameter types for consistency: - inBufferClear: int32_t -> size_t - dmacChannelDisable: uint32_t -> uint8_t - uartInterruptEnable: uint32_t -> uint8_t - ecmConfigChannel/configChannelCT/configChannelV: int8_t -> int_fast8_t - ds18b20StartSample: int32_t -> uint8_t Co-Authored-By: Claude <[email protected]>
- driver_DMAC: use bit field access instead of mask - emon32: uint32_t -> size_t for loop counter - periph_DS18B20: int -> uint8_t for loop counters - pulse: uint8_t -> size_t for index parameters, add stddef.h Co-Authored-By: Claude <[email protected]>
Update all OneWire functions to use size_t for opaIdx parameter since it is used as an array index. Co-Authored-By: Claude <[email protected]>
- Change offset from int32_t to uint32_t with adjusted logic - Add unsigned suffix to position literals - Use uint8_t for loop counter in tempSetup Co-Authored-By: Claude <[email protected]>
- board_def.h: add unsigned suffix to OVERSAMPLING_RATIO - configuration.c: vCh1/vCh2 to uint8_t, fix qfp_int2float/uint2float usage - emon_CM.c: mapLogCT/discardCycles to uint8_t, samplePeriodus to uint32_t, calibrationPhase idxCT to int_fast8_t, ecmClearEnergyChannel idx to size_t - emon_CM.h: add stddef.h, update vChan1/vChan2 to uint8_t, fix signatures Co-Authored-By: Claude <[email protected]>
|
This PR will also fix #44 (Type mismatch: pulseGetCount returns uint64_t but pulseCnt is uint32_t) |
The pulseCnt fields in Emon32Dataset_t and Emon32Cumulative_t are uint32_t, so align the internal storage and API to match. Fixes openenergymonitor#44 Co-Authored-By: Claude <[email protected]>
configuration.c: - Use size_t for channel indices and buffer positions - Use UINT8_MAX instead of -1 for clearAccumIdx sentinel - Simplify putFloat padding loop - Use uint8_t for accumulator index parsing emon_CM.c: - Change floorf_ return type from int to int32_t - Add explicit q15_t cast in applyCorrection - Add const qualifiers to local variables Co-Authored-By: Claude <[email protected]>
|
BTW, if you plan to merge PR41, that'd be cool to not wait too much... since the utils functions are generating a lot of warnings too :D |
Co-Authored-By: Claude <[email protected]>
|
Haven't forgotten @FredM67, just a lot on :) I'll try and get it merged tomorrow, all being well. |
- getUniqueID: int32_t -> size_t for index parameter - printf: %ld/%d -> %x for size_t channel numbers - driver_USB: use size_t for loop counter - Add stddef.h include for size_t Co-Authored-By: Claude <[email protected]>
- board_def: add unsigned suffix to EEPROM_CONFIG_SIZE - configuration: use uint8_t/size_t for loop counters - configuration: fix printf format specifiers for size_t - configuration: use unsigned literal 1024u - eeprom: add unsigned suffix to literal Co-Authored-By: Claude <[email protected]>
- Add utilUtoa as the core unsigned-to-string conversion - Simplify utilItoa to handle sign and delegate to utilUtoa - Remove unused functions: utilAbs, utilStrReverse, utilStrlen - No code duplication between signed and unsigned variants Co-Authored-By: Claude <[email protected]>
Refactor utilAtoi to use utilAtoui as core function, similar to the utilItoa/utilUtoa pattern. This eliminates code duplication and provides a dedicated function for unsigned conversions. Co-Authored-By: Claude <[email protected]>
- Add union members (u8/u16/u32, i8/i16/i32) to ConvInt_t and ConvUint_t for direct access to different bit widths without casts - Replace utilAtoi with utilAtoui throughout configuration.c where unsigned values are expected - Fix floorf_ conversion warnings using union type-punning instead of casts for IEEE 754 bit manipulation - Suppress warnings for third-party TinyUSB dcd_samd.c Co-Authored-By: Claude <[email protected]>
- configuration.c: Use union to extract uint8_t from char subtraction for ze1-12 digit parsing - emon_CM.c: Use union to extract q15_t from int for absolute value calculation in zero-crossing detection Co-Authored-By: Claude <[email protected]>
- util.c: Add const to base parameter in utilUtoa and utilItoa to match declarations in util.h - dataPack.c: Add const to json parameter in dataPackSerial to match declaration in dataPack.h Co-Authored-By: Claude <[email protected]>
Mark pCfgV as const since it is only read from, not modified. This clarifies the parameter is input-only. Co-Authored-By: Claude <[email protected]>
Replace int32_t, uint8_t, and other integer types with size_t for loop counters used as array indices. This improves type consistency and avoids potential sign comparison warnings. Co-Authored-By: Claude <[email protected]>
…tency-fixes # Conflicts: # Makefile
Replace inline squaring with optimized Cortex-M0+ assembly routines: - usqr64 for unsigned 32->64 bit square (numSamples in calcRMS) - ssqr64 for signed 32->64 bit square (V and CT samples) Change ssqr64 return type from int64_t to uint64_t since x² is always non-negative. These are used in the performance-critical RAMFUNC paths: - ecmInjectSample: voltage and current squaring for RMS - calcRMS: sample count squaring Co-Authored-By: Claude <[email protected]>
Add optimized Cortex-M0+ assembly routine for signed multiplication with 64-bit result, ~2x faster than compiler-generated code. Performance on target (100k iterations): - smul64 (ASM): 121,093 us - (i64)a * b: 244,260 us (2.02x slower) Also add validation and performance tests for smul64, and syscall stubs to silence linker warnings in test firmware. Co-Authored-By: Claude <[email protected]>
Replace (int64_t)(thisCT * V) with smul64(thisCT, V) for the CT×voltage power accumulations. This is ~2x faster on Cortex-M0+. 4 call sites updated: - sumPA[0], sumPB[0] for primary voltage - sumPA[1], sumPB[1] for L-L load (secondary voltage) Co-Authored-By: Claude <[email protected]>
Co-Authored-By: Claude <[email protected]>
Move usqr64, ssqr64, and smul64 to .ramfunc section to avoid flash wait states in the hot DSP path. Co-Authored-By: Claude <[email protected]>
Channel numbers were printed in hex (iCala, iCalb, iCalc) instead of decimal (iCal10, iCal11, iCal12) for channels 10-12. Co-Authored-By: Claude <[email protected]>
Summary
This PR improves type consistency across the codebase and adds optimized assembly routines for performance-critical arithmetic operations.
Type Consistency
size_tfor array indices and memory function parametersuint8_t,uint32_t,int32_t) based on value rangesconstqualifiers to function parameters where values are read-onlyConvInt_t,ConvUint_t) for type-safe integer width accessUtility Functions
utilUtoa/utilAtouifor unsigned conversionsutilItoa/utilAtoito delegate to unsigned variantsAssembly Optimizations (Cortex-M0+)
usqr64: unsigned 32→64 bit square (~2.3x faster)ssqr64: signed 32→64 bit square (~2.5x faster)smul64: signed 32×32→64 bit multiply (~2x faster).ramfuncsection) to avoid flash wait statesApplied to hot-path DSP functions in
ecmInjectSampleandecmProcessSet.Other Fixes
uint64_ttouint32_tTest plan
Closes #36, closes #39, closes #44, closes #45
🤖 Generated with Claude Code