@FredM67 FredM67 commented Jan 8, 2026

Summary

  • Optimize util conversion functions using shift-based operations instead of software division
  • Add benchmark test for measuring performance on ARM Cortex-M0+ target

Performance Results

Measured on emon32 hardware (SAMD21, 1000 iterations per test value):

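| Function | Original   | Optimized | Improvement | vs newlib stdlib |
|----------|------------|-----------|-------------|------------------|
| utilItoa | 89,795 µs  | 33,784 µs | 2.7x        | 2.0x faster      |
| utilAtoi | 33,768 µs  | 12,643 µs | 2.7x        | 5.1x faster      |
| utilFtoa | 84,628 µs  | 63,126 µs | 25%         | N/A              |
| utilAtof | 159,728 µs | 95,508 µs | 40%         | ~4% slower*      |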

*utilAtof supports comma as decimal separator (European locales), which stdlib atof does not.
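
To illustrate the footnote, here is a minimal sketch of a parser that accepts either `.` or `,` as the decimal separator and only reads its input. The name `sketchAtof`, its signature, and the `ok` flag are assumptions made for this example; the actual `utilAtof` implementation is not shown in this conversation and may differ (for instance, it may avoid the floating-point divide used here).

```c
#include <stdbool.h>

/* Hypothetical sketch, not the emon32 utilAtof: parses [+-]digits[(.|,)digits].
 * The input is taken as const char* and is never modified.
 * No exponent support and no overflow handling. */
static float sketchAtof(const char *s, bool *ok) {
    bool  neg      = false;
    bool  anyDigit = false;
    float value    = 0.0f;
    float divisor  = 1.0f;

    if (*s == '-' || *s == '+') {
        neg = (*s == '-');
        s++;
    }

    /* Integer part, consumed left to right. */
    while (*s >= '0' && *s <= '9') {
        value = value * 10.0f + (float)(*s - '0');
        anyDigit = true;
        s++;
    }

    /* Optional fraction, introduced by '.' or ','. */
    if (*s == '.' || *s == ',') {
        s++;
        while (*s >= '0' && *s <= '9') {
            value = value * 10.0f + (float)(*s - '0');
            divisor *= 10.0f; /* one power of ten per fractional digit */
            anyDigit = true;
            s++;
        }
    }

    /* Valid only if at least one digit was seen and the whole string was used. */
    *ok = anyDigit && (*s == '\0');

    value /= divisor;
    return neg ? -value : value;
}
```

With this sketch, `"3,14"` and `"3.14"` parse to the same value, and a string literal can be passed directly because the parameter is `const`.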

Key Optimizations

  • fastDiv10() using shifts/adds only - avoids expensive software divide on M0+
  • Process strings left-to-right to eliminate string reversal pass
  • Multiply by 10 using (x<<3)+(x<<1) instead of *10
  • utilAtoi and utilAtof no longer modify their input buffers
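
As a rough illustration of the first two points, the sketch below shows a well-known shift-and-add divide-by-10 (the `divu10` routine from Hacker's Delight) and an integer-to-string conversion that writes each digit straight into its final position, so no reversal pass is needed. The names `sketchDiv10` and `sketchItoa` are hypothetical; the PR's actual `fastDiv10()` and `utilItoa` code is not shown in this conversation and may be implemented differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Unsigned divide by 10 using only shifts and adds.
 * q approximates n * 0.8, is scaled down by 8, then corrected by the remainder. */
static uint32_t sketchDiv10(uint32_t n) {
    uint32_t q = (n >> 1) + (n >> 2);
    q += q >> 4;
    q += q >> 8;
    q += q >> 16;
    q >>= 3;
    uint32_t r = n - (((q << 2) + q) << 1); /* r = n - q * 10 */
    return q + (r > 9u);
}

/* Writes value as decimal text into buf (at least 12 bytes) and returns the
 * number of characters written, excluding the terminating NUL. */
static size_t sketchItoa(int32_t value, char *buf) {
    /* Magnitude as unsigned, so INT32_MIN is handled without overflow. */
    uint32_t u = (value < 0) ? (0u - (uint32_t)value) : (uint32_t)value;

    /* Count digits by comparing against powers of ten (no division). */
    size_t   len   = 1;
    uint32_t limit = 10u;
    while (len < 10 && u >= limit) {
        limit *= 10u;
        len++;
    }

    size_t pos = 0;
    if (value < 0) {
        buf[pos++] = '-';
    }
    size_t end = pos + len;
    buf[end] = '\0';

    /* Each digit goes directly to its final slot, so nothing is reversed. */
    for (size_t i = end; i > pos; i--) {
        uint32_t q = sketchDiv10(u);
        buf[i - 1] = (char)('0' + (u - q * 10u));
        u = q;
    }
    return end;
}
```

Counting digits first costs a few comparisons but lets the buffer be filled in place; whether that beats a reversal pass on a given target would need measuring.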

Test plan

  • Benchmark shows correctness (ITOA outputs match between util and stdlib)
  • All conversions produce correct results
  • Run full firmware build and test

🤖 Generated with Claude Code

FredM67 and others added 2 commits January 8, 2026 17:29
Use shift-based operations instead of software division.

Performance comparison (1000 iterations x test values):

| Function  | Original   | Optimized  | vs stdlib        |
|-----------|------------|------------|------------------|
| utilItoa  | 89,795 us  | 33,784 us  | 2.0x faster      |
| utilAtoi  | 33,768 us  | 12,643 us  | 5.1x faster      |
| utilFtoa  | 84,628 us  | 63,126 us  | N/A              |
| utilAtof  | 159,728 us | 95,508 us  | ~4% slower*      |

*utilAtof supports comma as decimal separator (European locales)

Key optimizations:
- fastDiv10() using shifts/adds instead of software divide
- Process strings left-to-right to eliminate reversal
- Multiply by 10 using (x<<3)+(x<<1) instead of *10
- utilAtoi/utilAtof no longer modify input buffer

Also adds benchmark test for measuring performance on target.

Co-Authored-By: Claude <[email protected]>
Add const correctness to utilAtof() and utilAtoi() functions since they
only read from their string buffer parameters and never modify them.
This improves API safety by preventing accidental modifications and
allows these functions to accept string literals and const char*
arguments without requiring casts.

Co-Authored-By: Claude <[email protected]>
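
A small sketch of what the const-qualified parameter allows. `sketchAtoi` is a hypothetical stand-in, since the real `utilAtoi` signature is not shown in this conversation; it assumes decimal input only and does no overflow checking.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a const-correct string-to-integer conversion.
 * The buffer is read left to right and never written. */
static int32_t sketchAtoi(const char *s) {
    bool    neg = false;
    int32_t acc = 0;

    if (*s == '-') {
        neg = true;
        s++;
    }
    while (*s >= '0' && *s <= '9') {
        acc = acc * 10 + (int32_t)(*s - '0'); /* no overflow check in this sketch */
        s++;
    }
    return neg ? -acc : acc;
}
```

Because the parameter is `const char *`, a call such as `sketchAtoi("-42")` compiles without a cast, and the compiler rejects any accidental write through `s` inside the function.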
SAMD21 has a single-cycle hardware multiplier, so the compiler
optimizes `* 10` to a single `muls` instruction. The shift-based
approach ((x << 3) + (x << 1)) was not faster and hurt readability.

Co-Authored-By: Claude <[email protected]>
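
For reference, the two forms discussed in this commit compute the same value for every 32-bit unsigned input, including on wrap-around. The helper names below are hypothetical and exist only for this comparison.

```c
#include <stdint.h>

/* Shift-based form from the original optimization: 8x + 2x. */
static inline uint32_t mul10Shift(uint32_t x) {
    return (x << 3) + (x << 1);
}

/* Plain form; per the commit message above, the compiler emits a single
 * `muls` for this on the SAMD21. */
static inline uint32_t mul10Plain(uint32_t x) {
    return x * 10u;
}
```

Since the results are identical, the choice comes down to generated code and readability, which is the trade-off the commit describes.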

FredM67 commented Jan 11, 2026

Hi @awjlogan
Will you merge this PR or do you prefer to leave the code as it is in main?

@awjlogan
Collaborator

> Hi @awjlogan Will you merge this PR or do you prefer to leave the code as it is in main?

Almost certainly - just need to focus a bit on the CM part at the moment :)

Replace shift-based *100 with direct multiplication.
The SAMD21 has a single-cycle 32-bit multiplier, making shifts unnecessary.

Co-Authored-By: Claude <[email protected]>

FredM67 commented Jan 13, 2026

@awjlogan
Beep ;-)

@awjlogan awjlogan merged commit 3f93624 into openenergymonitor:main Jan 14, 2026