Fix _str_to_int precision loss above 2^53 and add math_eval tests by chenchenpan · Pull Request #29 · google-deepmind/simply

chenchenpan · 2026-06-14T05:39:21Z

Summary

math_eval._str_to_int converted strings to int via float(x), which
silently loses precision for integers larger than 2^53, the point beyond which
a float64 can no longer represent every integer exactly. This PR resolves the
existing TODO on that function and adds the first test coverage for math_eval.py.

The bug

float('9007199254740993') (2^53 + 1) rounds to 9007199254740992.0, so
_str_to_int returned 9007199254740992 — off by one. Larger values drift
further (e.g. '12345678901234567890' -> 12345678901234567168).
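
The drift can be reproduced with a few lines of plain Python. This is a minimal sketch: `via_float` is a hypothetical helper that stands in for the old conversion path and is not the repository's code.

```python
def via_float(s: str) -> int:
    """Hypothetical stand-in for the old path: parse as float, then convert to int."""
    return int(float(s))


samples = ['9007199254740992', '9007199254740993', '12345678901234567890']

for s in samples:
    exact = int(s)        # arbitrary-precision parse
    lossy = via_float(s)  # float64 round trip
    print(f'{s}: exact={exact} via_float={lossy} error={exact - lossy}')
```

The first sample (2^53 itself) survives the round trip; the other two do not.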

This matters for MATH-style answer grading: _str_to_int feeds _normalize,
so two distinct large integers could normalize to the same string and be graded
as equal.
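
A rough illustration of the grading consequence follows. It assumes, for the sake of the example only, that normalization ends in a string comparison of the converted integers; `old_str_to_int` and `normalized_equal` are hypothetical names and do not reproduce the actual `_normalize` logic.

```python
def old_str_to_int(x: str) -> int:
    """Assumed shape of the pre-fix conversion: a float64 round trip."""
    return int(float(x))


def normalized_equal(a: str, b: str) -> bool:
    """Hypothetical comparison: two answers match if they convert to the same integer string."""
    return str(old_str_to_int(a)) == str(old_str_to_int(b))


reference = '9007199254740992'  # 2^53
candidate = '9007199254740993'  # 2^53 + 1, a different integer

# The two answers differ as integers, yet the float round trip makes them collide.
print(int(reference) == int(candidate))
print(normalized_equal(reference, candidate))
```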

The fix

Parse integer-formatted strings directly with int(x) (arbitrary precision),
and fall back to float() only for non-integer formats:

def _str_to_int(x: str) -> int:
  x = x.replace(',', '')
  try:
    return int(x)
  except ValueError:
    return int(float(x))

'1.0' and '5e3' raise ValueError on int() and fall through to the float
path, so existing behavior for those is preserved.

Tests

Adds simply/utils/math_eval_test.py (absltest + parameterized, matching
the repo convention) — the first tests for this module. Covers answer
extraction, \boxed{}/\fbox{} parsing, normalization helpers, eval gating,
split_tuple, the >2^53 regression cases, and an end-to-end match test
(skipped when sympy is unavailable). 66 tests, all passing.
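
For readers who want to see the shape of the regression cases, here is a hedged sketch. It deliberately swaps in the standard library's unittest with subTest in place of absltest and parameterized, so that it runs without extra packages. The class name, method names, and inputs are hypothetical and are not copied from math_eval_test.py.

```python
import unittest


def _str_to_int(x: str) -> int:
    """Local copy of the fixed function, so the sketch is self-contained."""
    x = x.replace(',', '')
    try:
        return int(x)
    except ValueError:
        return int(float(x))


class StrToIntRegressionTest(unittest.TestCase):
    """Hypothetical test class; names are illustrative only."""

    def test_large_integers_keep_full_precision(self):
        cases = {
            '9007199254740993': 9007199254740993,       # 2^53 + 1
            '12345678901234567890': 12345678901234567890,
            '9,007,199,254,740,993': 9007199254740993,  # commas are stripped first
        }
        for text, expected in cases.items():
            with self.subTest(text=text):
                self.assertEqual(_str_to_int(text), expected)

    def test_non_integer_formats_use_float_fallback(self):
        for text, expected in {'1.0': 1, '5e3': 5000}.items():
            with self.subTest(text=text):
                self.assertEqual(_str_to_int(text), expected)


# Run the suite without calling sys.exit(), so the snippet can be pasted anywhere.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(StrToIntRegressionTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```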

Notes

No public API change; _str_to_int is module-private.
No new dependencies.

google-cla · 2026-06-14T05:39:32Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

`_str_to_int` converted strings via `int(float(x))`, which silently lost precision for integers larger than 2^53 (beyond which a float64 cannot represent every integer exactly). For example '9007199254740993' (2^53 + 1) returned 9007199254740992.

Parse integer-formatted strings directly with `int(x)`, which has arbitrary precision, and fall back to `float()` only for non-integer formats like '1.0' or '5e3'. This resolves the existing TODO.

Also add simply/utils/math_eval_test.py, the first test coverage for this module: answer extraction, boxed-answer parsing, normalization helpers, eval gating, the >2^53 regression cases, and an end-to-end `match` test that skips when sympy is unavailable. 66 tests, all pass.


chenchenpan force-pushed the fix/math-eval-int-precision branch from 6e73c05 to 069ee68 Compare June 14, 2026 05:54

chenchenpan added a commit to chenchenpan/simply-googledeepmind that referenced this pull request Jun 14, 2026

Add 2026-06-14 experiment log: upstream rebase, onboarding notes, PR g…

2d7667c

…oogle-deepmind#29

crazydonkey200 merged commit b1064e6 into google-deepmind:main Jun 14, 2026
5 checks passed

chenchenpan added a commit to chenchenpan/simply-googledeepmind that referenced this pull request Jun 14, 2026

Add 2026-06-14 experiment log: upstream rebase, onboarding notes, PR g…

bd2fa61

…oogle-deepmind#29

chenchenpan deleted the fix/math-eval-int-precision branch June 14, 2026 23:47

chenchenpan added a commit to chenchenpan/simply-googledeepmind that referenced this pull request Jun 15, 2026

Add 2026-06-14 experiment log: upstream rebase, onboarding notes, PR g…

45a2cb9

…oogle-deepmind#29

crazydonkey200 merged 1 commit into google-deepmind:main from chenchenpan:fix/math-eval-int-precision
