kayabaNerve
Contributor

Provides a FieldElement backed by the underlying curve25519-dalek implementation.

Unfortunately, the underlying implementation is unsafe to directly expose, which can be near-immediately observed when tested. The solution here is to only rely on the underlying implementation for a minimal amount of operations, encoding and decoding at each step to effect a reduction.
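As a rough illustration of that approach, here is a minimal sketch over a toy field (modulus 2^31 - 1 standing in for 2^255 - 19). `LazyFe` and `EagerFe` are hypothetical names invented for this example, not types from curve25519-dalek, and the real backends use multi-limb representations rather than a single `u64`.

```rust
use core::ops::Add;

/// Toy modulus (the Mersenne prime 2^31 - 1), standing in for 2^255 - 19.
pub const P: u64 = (1 << 31) - 1;

/// Hypothetical lazily reduced element: the stored value may exceed `P`.
/// The derived equality compares raw values, so it is only meaningful
/// when both sides are canonical.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct LazyFe(u64);

impl LazyFe {
    /// Wrap a raw value without reducing it.
    pub fn from_raw(raw: u64) -> Self {
        LazyFe(raw)
    }

    /// Canonical little-endian encoding; the reduction happens here.
    pub fn to_bytes(self) -> [u8; 4] {
        ((self.0 % P) as u32).to_le_bytes()
    }

    /// Decode four bytes, reducing so the toy accepts any input.
    pub fn from_bytes(bytes: [u8; 4]) -> Self {
        LazyFe(u64::from(u32::from_le_bytes(bytes)) % P)
    }

    /// Lazy addition: no reduction, so the result may be non-canonical.
    pub fn lazy_add(self, rhs: Self) -> Self {
        LazyFe(self.0 + rhs.0)
    }
}

/// Hypothetical wrapper that re-encodes after every operation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct EagerFe(LazyFe);

impl EagerFe {
    /// Build a canonical element from any raw value.
    pub fn new(raw: u64) -> Self {
        EagerFe(LazyFe::from_bytes(LazyFe::from_raw(raw).to_bytes()))
    }
}

impl Add for EagerFe {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        let sum = self.0.lazy_add(rhs.0);
        // Round-trip through the canonical encoding to force a reduction.
        EagerFe(LazyFe::from_bytes(sum.to_bytes()))
    }
}
```

In this toy, adding `P - 1` and `2` lazily leaves a raw value of `P + 1`, which compares unequal to `1` until it is encoded; the wrapper never exposes that intermediate state.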

Note there is a reduce function provided by each backend, yet their type signatures are non-uniform so they could not be called as an alternative to to_repr/from_repr. That would be an acceptable alternative.

This branch includes a test over the newly introduced wrapper, which defers to a library I wrote to test implementers of the ff/group APIs. While I did not introduce it as a dev-dependency, this ad-hoc offering is suitable to:

  1. Verify baseline correctness
  2. Observe how the wrapper fails when the to_repr/from_repr calls after every operation are removed

Resolves #389.

These changes are largely straightforward. Note that previously, the only
arithmetic supported was for borrows. Now that we've added support
for non-borrowed addition, the existing code is 'unnecessarily borrowing',
hence the added lint.

The larger question applicable is how this exposes FieldElement from several
different backends, assuming they're all correct for arbitrary usage. The
implementation of ConstantTimeEq has the following notes.

```rs
    /// Test equality between two `FieldElement`s.  Since the
    /// internal representation is not canonical, the field elements
    /// are normalized to wire format before comparison.
```

While that doesn't suggest the mathematical operations are incomplete, it
should be double-checked that no backend bounded the number of operations
it remains correct for, on the presumption the curve formulas would never
exceed that many operations.
…y have

inconsistent internal representations
Also fixes the feature flagging for the wide reduction function.
This forces a reduction after every operation via converting to bytes
(canonical) and re-reading it. There's probably a better way...
@kayabaNerve
Contributor Author

The no-std test failures appear entirely unrelated to my commits.

Run cargo hack build -p curve25519-dalek --target thumbv7em-none-eabi --release --each-feature --exclude-features default,std,os_rng
warning: specified feature `std` not found in package `curve25519-dalek`
warning: specified feature `os_rng` not found in package `curve25519-dalek`

I am happy to see this work with all backends however.

/// implementations. Its size and internals are not guaranteed to have
/// any specific properties and are not covered by semver.
#[derive(Clone, Copy, PartialEq, Eq)]
pub struct FfFieldElement(FieldElement);
Contributor

Not sure about adding another field element type, especially with a name like FfFieldElement (Finite field field element?)

Contributor Author

It's a thin wrapper around the existing FieldElement, adapting it to the ff API. It's exported publicly as FieldElement, but as it's a distinct type, it obviously needed its own name internally. The alternative would be this in its own module (as now), with the underlying FieldElement imported under the alias UnderlyingFieldElement, which an earlier draft of mine did. I'm happy to make that change, though I'll note I don't substantially believe this to be new.

Contributor Author

Sorry. I realize now my comment doesn't address the primary issue. FieldElement does not satisfy the ff API and will not without invasive changes and performance regressions, due to its policy of lazy reduction.

A new type, isolating from the rest of dalek while ensuring proper fulfillment of the intended API, truly seemed optimal.

Alternatively, exposing everything under a hazmat feature and then a wrapper crate may define this ff-compliant type with a pin to an exact version of dalek, including patch, would work.

Contributor

There are so many field element types in this codebase that I worry about introducing another, especially one with the same name as a different FieldElement type (but only when re-exported). It just seems very confusing.

Contributor Author

Heard. While I can point out it's higher-level on the stack and feature-gated, it'd be better to discuss solutions.

How about its own file, plus the nomenclature Ff25519 or anything of similar distance? The alternatives would be:

  1. Exposing the unsafe FieldElement type, under a hazmat feature, letting crates define this ff wrapper itself. I'm fine with that but consider it more damaging than any complexity of this patch.
  2. Not exposing the 2**255-19 field. As someone who has maintained my own field impl, and who is now planning to maintain my own fork of dalek with this patch if not upstreamed, I'd continue as I have but be disappointed. I truly believe this represents a real-world use and this patch can be done incredibly non-intrusively.

@tarcieri
Contributor

There are similar issues with lazy reductions in crates like k256 and p521.

Perhaps support for lazy reductions should be added to ff, e.g. with a separate type to represent unreduced field elements?

@kayabaNerve
Contributor Author

The Monero project recently held a public competition for optimized implementations of two elliptic curves, replacing the ones I presented as reference deferring to crypto-bigint Residue.

Multiple implementations attempted lazy arithmetic, and all were incorrect in doing so. Lazy arithmetic, by bounding correctness, shoves the footgun to the caller for what can be marginal performance gains. While I won't say it's impossible to leverage the Rust type system to check at compile-time the number of correct operations remaining, I believe the effort entirely outstrips the gains, except within immediate, tight scopes such as point addition. The one potential alternative for which I can see any reasonable merit is a field element which allows a single addition (so for a 255-bit field, producing an unreduced 256-bit value) before either reducing explicitly or being used in a multiplication (producing a 512-bit value to be wide-reduced as existing). The bound on a single addition prevents having to encode an entire counter system into the type system, and should reasonably fit real-life parameters. I'm unsure how marginal the pair of saved reductions for combinations of the form (a + b) * (c + d) would be, however. While interesting, I'll ask that be revisited elsewhere than this PR.
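To make the single-addition idea concrete, a minimal sketch on a toy 31-bit field follows (so the unreduced sum fits in 32 bits and the product in 64, mirroring 255/256/512). The names `Fe` and `Sum` are hypothetical and nothing here is taken from dalek or ff; it only shows how the type system can permit exactly one addition before a multiplication or an explicit reduction.

```rust
use core::ops::{Add, Mul};

/// Toy modulus 2^31 - 1, standing in for a 255-bit prime.
pub const P: u32 = (1 << 31) - 1;

/// Fully reduced element, always less than `P`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Fe(u32);

/// Unreduced sum of exactly two reduced elements (at most 2P - 2, fits in 32 bits).
#[derive(Clone, Copy, Debug)]
pub struct Sum(u32);

impl Fe {
    /// Build a reduced element from any 32-bit value.
    pub fn new(value: u32) -> Self {
        Fe(value % P)
    }
}

impl Add for Fe {
    type Output = Sum;
    /// One addition is allowed without reducing; the result is a distinct type.
    fn add(self, rhs: Fe) -> Sum {
        Sum(self.0 + rhs.0)
    }
}

impl Sum {
    /// Explicit reduction back to a canonical element.
    pub fn reduce(self) -> Fe {
        Fe(self.0 % P)
    }
}

impl Mul for Sum {
    type Output = Fe;
    /// Multiply two unreduced sums and reduce the 64-bit product once.
    fn mul(self, rhs: Sum) -> Fe {
        let wide = u64::from(self.0) * u64::from(rhs.0);
        Fe((wide % u64::from(P)) as u32)
    }
}
```

Since there is no `Add` impl for `Sum`, a second addition without a reduction fails to compile, which is the whole bound. For example, `(Fe::new(P - 1) + Fe::new(P - 1)) * (Fe::new(5) + Fe::new(7))` performs a single reduction and yields the canonical form of -24.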

My PR here wraps the FieldElement into a type which is no longer lazy, reducing after every single operation, to ensure safe public usage. While that comes at a performance penalty compared to the underlying FieldElement as used within dalek, dalek itself suffers no performance penalty and users no longer have to implement the field themselves (which will statistically be the real performance penalty. I'm present here BECAUSE my own implementation premised on crypto-bigint is notably too slow in comparison).

Residue held up amazingly well in Monero's contest, for the record :) Overall, the best submissions fared ~30% better for point addition.

@tarcieri
Contributor

tarcieri commented Jul 15, 2025

> While I won't say it's impossible to leverage the Rust type system to check at compile-time the number of correct operations remaining, I believe the effort entirely outstrips the gains

I think type-safe abstractions around lazy reduction are definitely possible. The footgun exists because implementations reuse the same types for reduced and unreduced field elements, easily leading to caller confusion due to the need to "remember to reduce" or the code is mathematically broken.

Beyond that, I actually don't think such type-safe abstractions are conceptually that hard to implement. The "loose" field element representation in p521 is a similar idea, for example: https://github.com/RustCrypto/elliptic-curves/blob/master/p521/src/arithmetic/field/loose.rs

However, having attempted it in the past in k256 (and failed), trying to retrofit type-safe lazy reduction into a codebase which didn't use it initially is somewhat grueling due to the need to retroactively audit the codebase, locate all of the places where there is an unnamed unreduced field element, and change its type accordingly, as opposed to writing everything with separate types in the first place where you could diligently factor the relevant code into the correct type (not that this is a dig at curve25519-dalek or k256, more like "hard won knowledge").

All that said, it's abundantly clear to me in 20/20 hindsight that lazy reduction without such type safety is hugely problematic like you're saying @kayabaNerve, and makes it very easy to introduce subtle bugs without diligent code review.

See also: relevant k256 issue: RustCrypto/elliptic-curves#531 (perhaps we should open a similar one on this repo?)

@kayabaNerve
Contributor Author

For what it's worth, after calling another developer and as someone who has a 255-bit field, I'm on board with introducing a LazyScalar type, to a codebase currently entirely using full reductions, which allows computing (a + b) * (c + d) while only reducing the 512-bit product. It's when the capacity of the additions exceeds 1 that the design gets annoying, but one could potentially take a solution like typenum and define capacity recursively?

Definitely worth a discussion elsewhere. I may also posit a wrapper usable...
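For readers wondering what "define capacity recursively" might look like, here is a hedged sketch using hand-rolled Peano numerals in place of typenum (a deliberate substitution to keep the example dependency-free). `Z`, `S`, `Lazy`, and `Fresh` are hypothetical names, the field is a toy one with modulus 2^31 - 1, and the capacity of 2 is an arbitrary choice for illustration.

```rust
use core::marker::PhantomData;
use core::ops::Add;

/// Toy modulus 2^31 - 1; values are held in a `u64` with ample headroom.
pub const P: u64 = (1 << 31) - 1;

/// Type-level naturals: `Z` is zero and `S<N>` is N + 1.
pub struct Z;
pub struct S<N>(PhantomData<N>);

/// Element carrying, in its type, how many more additions it may take part in.
pub struct Lazy<N> {
    value: u64,
    _capacity: PhantomData<N>,
}

/// Capacity of a freshly reduced element (two additions, chosen arbitrarily).
pub type Fresh = S<S<Z>>;

impl Lazy<Fresh> {
    /// Build a reduced element with full capacity.
    pub fn new(value: u64) -> Self {
        Lazy { value: value % P, _capacity: PhantomData }
    }
}

impl<N> Lazy<N> {
    /// Reduce and restore full capacity.
    pub fn reduce(self) -> Lazy<Fresh> {
        Lazy::<Fresh>::new(self.value)
    }

    /// Inspect the possibly unreduced value.
    pub fn raw(&self) -> u64 {
        self.value
    }
}

/// Adding two elements with capacity N + 1 yields one with capacity N.
/// `Lazy<Z>` has no `Add` impl, so exceeding the bound is a compile error.
impl<N> Add for Lazy<S<N>> {
    type Output = Lazy<N>;
    fn add(self, rhs: Self) -> Lazy<N> {
        Lazy { value: self.value + rhs.value, _capacity: PhantomData }
    }
}
```

With this, `(a + b) + (c + d)` type-checks for fresh elements and produces a `Lazy<Z>`, which must be reduced before it can be added again. Requiring both operands to share a capacity is a simplification; a fuller design would need a type-level minimum.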

@kayabaNerve
Contributor Author

Sorry to bump this issue, but what is needed to action this? I can accept if it won't be actioned (even though I believe there is a clear use-case for it), but I'm unclear, if this would be accepted, on what has to be done to get it merged. My last comment was:

> Heard. While I can point out it's higher-level on the stack and feature-gated, it'd be better to discuss solutions.
>
> How about its own file, plus the nomenclature Ff25519 or anything of similar distance? The alternatives would be:
>
>   1. Exposing the unsafe FieldElement type, under a hazmat feature, letting crates define this ff wrapper itself. I'm fine with that but consider it more damaging than any complexity of this patch.
>   2. Not exposing the 2**255-19 field. As someone who has maintained my own field impl, and who is now planning to maintain my own fork of dalek with this patch if not upstreamed, I'd continue as I have but be disappointed. I truly believe this represents a real-world use and this patch can be done incredibly non-intrusively.

@tarcieri
Contributor

tarcieri commented Aug 8, 2025

I remain curious about whether we could put together traits to be upstreamed to ff which could capture lazy reduction and avoid the need to introduce an intermediate wrapper type. It's a problem for k256 as well. The core problem here, IMO, is the traits in ff don't map to how these types of field elements operate.

The approach in this PR seems like one worth avoiding if possible due to its numerous drawbacks (need for an intermediate wrapper type, performance loss due to eager reductions).

> I'm unclear if this would be accepted, what has to be done to get it merged.

It's something we can consider if the other options don't work out. That said, the way it's factored/implemented now, it also seems like something that could alternatively remain in a third party crate.

@kayabaNerve
Contributor Author

kayabaNerve commented Aug 8, 2025

Oh. I didn't realize that was still being included under this issue. Sorry.

In that case, my work would be:

  1. Upstreaming trait LazyField into ff, with a struct NonLazy: Field which immediately normalizes
  2. For each backend, correctly defining a struct Ff25519 as each backend will have its own bounds on potential uses
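A rough sketch of what item 1 could look like is below. To be clear, `LazyField` and `NonLazy` are hypothetical shapes invented here for discussion; no such trait exists in ff, and a real proposal would need to cover the rest of the `Field` surface (multiplication, inversion, constant-time selection, and so on). `Toy` is a stand-in backend over the modulus 2^31 - 1.

```rust
use core::ops::Add;

/// Hypothetical trait for backends whose operations may skip reduction.
pub trait LazyField: Copy {
    /// Add without reducing; the result may be non-canonical.
    fn lazy_add(self, rhs: Self) -> Self;
    /// Bring the element to its canonical representation.
    fn normalize(self) -> Self;
}

/// Hypothetical adapter that normalizes immediately after every operation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NonLazy<F: LazyField>(F);

impl<F: LazyField> NonLazy<F> {
    /// Wrap a backend element, normalizing it first.
    pub fn new(inner: F) -> Self {
        NonLazy(inner.normalize())
    }
}

impl<F: LazyField> Add for NonLazy<F> {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        NonLazy(self.0.lazy_add(rhs.0).normalize())
    }
}

/// Toy modulus and backend used only to exercise the adapter.
pub const P: u64 = (1 << 31) - 1;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Toy(pub u64);

impl LazyField for Toy {
    fn lazy_add(self, rhs: Self) -> Self {
        Toy(self.0 + rhs.0)
    }
    fn normalize(self) -> Self {
        Toy(self.0 % P)
    }
}
```

Because `NonLazy` only ever holds normalized values, its derived equality is sound in this sketch, whereas comparing raw `Toy` values after a `lazy_add` would not be.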

I don't hate that as a long-term goal. I do worry about how long that may take. If I knew that:

  1. If I submit to ff/curve25519-dalek a PR within a week
  2. It'll be in the next release of ff/curve25519-dalek within the next three months
  3. It'll support some specific Rust version (IIRC 1.84)

I would, because that's what my goal for usage within the Monero project requires. I'm just concerned there is no and cannot be such a guarantee, which leaves me to:

  1. Argue this as-is, as I truly don't believe Lazy should be a blocker. While this does have overhead by constantly reducing, it shouldn't be too much worse than anything which reduces on every step (eh, maybe some of the vector backends will have a non-trivial performance hit...).
  2. Accept the sooner I start, the sooner it'll be merged and released.

I'll accept option two, but I will ask you for the following compromise:

If I make Lazy PRs to both ff and curve25519-dalek, meaning we will eventually have the optimal solution, can we agree this is suboptimal yet merge it as-is (or with edits such as its own file, with a better name) until the Lazy PRs filter through the publication pipeline? If they do so in a timely fashion, and this never makes it into any release, great. If they are sufficiently slow to propagate, then at least I have this as desired.

Regardless, yep, I'll move forward with lazy PRs when I next have time. Sorry for misunderstanding what was required on my end.


Also, again, this can't be done in a third-party crate UNLESS:

  1. We publish the underlying impls under a hazmat feature, due to the unsafe API. I am fine with this, as prior stated.
  2. The third-party crate has a build script to download curve25519-dalek, rip the relevant source files out, copying them into the local source tree, and then defining this wrapper.

@tarcieri
Contributor

tarcieri commented Aug 8, 2025

I was thinking more like: try to create a zero-cost abstraction which avoids the need to have a suboptimal wrapper type to fit an API it diverges from.

And if we can't do that, consider other options, i.e. this PR.

I thought you already had this implemented in a third party crate? So perhaps I'm confused.

> We publish the underlying impls under a hazmat feature, due to the unsafe API. I am fine with this, as prior stated.

What exactly do you mean by this? We can put more functionality under hazmat if need be, I'm just unclear what "underlying impls" means. You mean the various field backends?

This PR is basically additive, so can you spell out specifically what functionality is needed under hazmat?

@kayabaNerve
Contributor Author

No, such a wrapper wasn't already implemented.

Monero held a contest for EC impls. Some impls premised themselves on lazy reduction, as curve25519-dalek does, and all trusted the caller to handle it correctly. All were also unsafe.

The solution would be a type-safe wrapper. None were submitted and I did not have one. Since then, I prototyped one premised on typenum and could work on upstreaming to ff. I just have to put in the work there, and it'll take time to get published in ff before being adopted here.

Re: hazmat, for all of the underlying backends:

  • If not hazmat, pub(crate) struct FieldElement, as current.
  • If hazmat, pub struct FieldElement.

This would allow a crate which depends on curve25519-dalek to enable the hazmat feature and locally define the exact ff wrapper I did in this PR. That can't be done right now though, as the various FE structs and functions are entirely private.
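In code, the gating described above might look roughly like the following. This is only a sketch: it assumes a Cargo feature named `hazmat` is what controls it, and the struct body (`[u64; 5]`) is illustrative rather than copied from any particular backend.

```rust
/// With the assumed `hazmat` feature enabled, the backend type is public.
#[cfg(feature = "hazmat")]
pub struct FieldElement(pub(crate) [u64; 5]);

/// Without it, the type stays crate-private, as it is today.
#[cfg(not(feature = "hazmat"))]
pub(crate) struct FieldElement(pub(crate) [u64; 5]);

impl FieldElement {
    /// Illustrative constructor, available to the crate either way.
    pub(crate) fn zero() -> Self {
        FieldElement([0; 5])
    }
}
```

Only the visibility of the type changes between the two configurations; its methods would need the same treatment for a downstream wrapper to be able to call them.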

@tarcieri
Contributor

Okay, apologies, I work on too many elliptic curve crates and forgot that curve25519-dalek doesn't actually expose a FieldElement type yet. Will have to think about that one.

Ideally I think it would be nice if there could be a single common FieldElement type for both external and internal use (i.e. replacing the existing FieldElement type aliases with a newtype whose definition doesn't change depending on which backend is enabled), but even exporting it under the hazmat API it seems unsafe/unwise to do so without first having a type-safe API for lazy reduction.

All that said, talking to @rozbb he sounded potentially interested in this approach of defining an entire separate type for external hazmat consumption which always eagerly reduces, if a more parsimonious approach proves too difficult.

@kayabaNerve
Contributor Author

I'll poke at updating this for optional lazy reduction, and the other item in this repo I was pinged on, shortly. I'd like to maintain this as a MINIMAL PR to finally achieve a publicly accessible FieldElement however, hence why I pushed another commit.

As for whether FromUniformBytes<64> is minimal, it is for my use-case 😅 And it's a proper utilization of the ff traits, so I don't believe it to be problematic.

@kayabaNerve
Contributor Author

I'm doing the work now on this. The traits don't make sense for curve25519-dalek but I'll do the end-to-end here before we discuss where to upstream (ff/elliptic-curve?).

Development

Successfully merging this pull request may close these issues.

Exposing the field element API