@kkysen kkysen commented Aug 2, 2025

We do this by recursively checking whether an expression is const. The check is generally conservative, modulo the ExplicitCast bug (#853), non-const sizeof(VLA)s, and f128's non-const methods (#1262). For #853, this does still generate incorrect code even in conservative mode, but I'll work on fixing that next.
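To illustrate the idea of a recursive, conservative constness check, here is a minimal sketch over a toy expression tree. The `Expr` enum and its variants are hypothetical stand-ins and not c2rust's real `CExprKind`; only the name `is_const_expr` comes from this discussion, and its real signature may differ.

```rust
/// Hypothetical, heavily simplified expression tree (assumption for illustration;
/// the real `CExprKind` has many more variants).
#[allow(dead_code)]
enum Expr {
    Literal(i64),
    Unary(Box<Expr>),
    Binary(Box<Expr>, Box<Expr>),
    Call(String, Vec<Expr>),
    SizeOfVla,
}

/// Conservative check: return `true` only for forms that are known to be
/// usable in a Rust `const`, and recurse into all sub-expressions.
fn is_const_expr(expr: &Expr) -> bool {
    match expr {
        Expr::Literal(_) => true,
        Expr::Unary(inner) => is_const_expr(inner),
        Expr::Binary(lhs, rhs) => is_const_expr(lhs) && is_const_expr(rhs),
        // Anything not proven const is rejected rather than guessed at.
        Expr::Call(..) | Expr::SizeOfVla => false,
    }
}
```

Because the `match` has no wildcard arm, adding a new variant to the toy `Expr` forces a decision about its constness at compile time, which is one way to keep such a check conservative.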

Also, because we now allow certain operations that are unsafe, like ptr arithmetic, we wrap all consts in an unsafe block. This is similar to how all fns we generate are marked unsafe fns even if they don't contain any unsafe operations within them. We can improve this, but for now, this works and is correct.
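As a rough illustration of what wrapping every const in an unsafe block means, the sketch below shows one const that does not need the block and one that does. The names `FOO` and `SECOND_BYTE` are made up for this example, and the `#[allow(unused_unsafe)]` attribute is an addition of this sketch, not something the discussion says the transpiler emits.

```rust
use std::ffi::c_int;

// Simplest case: nothing in this initializer is unsafe, so without the
// `allow` rustc would warn that the `unsafe` block is unnecessary.
#[allow(unused_unsafe)]
pub const FOO: c_int = unsafe { 5 as c_int };

// Pointer arithmetic is an unsafe operation, so this initializer
// genuinely requires the `unsafe` block to compile.
pub const SECOND_BYTE: *const u8 = unsafe { b"abc".as_ptr().add(1) };
```

Both items compile, which is why the blanket wrapper is correct; the cost is that trivially safe constants also carry an `unsafe` block.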

Also, the output was non-deterministic due to the use of HashMaps for macro maps like macro_invocations, so I changed these to IndexMaps, which are order-preserving.

@kkysen kkysen requested a review from fw-immunant August 2, 2025 08:52
@fw-immunant
Contributor

What variation in output did we see due to nondeterministic iteration? Was it reordered output definitions or something more subtle/messy? Definitely worthwhile to make our output deterministic here, I'm just curious how this manifested.

Contributor Author

@kkysen kkysen left a comment

> What variation in output did we see due to nondeterministic iteration? Was it reordered output definitions or something more subtle/messy? Definitely worthwhile to make our output deterministic here, I'm just curious how this manifested.

It affected the order of macro expansions that we folded over, which ended up meaning sometimes a macro expansion would be replaced by the macro definition, but not always.

@@ -29,287 +29,245 @@ pub struct fn_ptrs {
pub fn2: Option<unsafe extern "C" fn(std::ffi::c_int) -> std::ffi::c_int>,
}
pub type zstd_platform_dependent_type = std::ffi::c_long;
pub const NESTED_INT: std::ffi::c_int = 0xffff as std::ffi::c_int;
pub const true_0: std::ffi::c_int = unsafe { 1 as std::ffi::c_int };
Contributor Author

@fw-immunant, note that this does achieve a lot of what #290 does. It's similar to the portable types stuff: there, we'll now emit a uint32_t type alias instead of a u32 directly, and here we'll emit a true_0 const instead of a true literal directly. Both could then be refactored pretty easily. I'm not exactly sure how #290 works, but the advantage here is that it matches where the C used the true and false macros, and if the C uses 0 and non-zero directly, then it keeps that behavior. I'm also not sure how #290 handles ints vs. _Bools.

We can also resurrect #295 to use r#true raw identifiers instead if that's helpful.

Contributor

There definitely is some overlap in spirit but I think it's probably worthwhile to address this from both sides--this PR will clean up some uses of #define symbols, but it seems like we're still mostly translating them as integers.

Do you know what might help us translate LITERAL_BOOL as const LITERAL_BOOL: bool = ...;? It seems like recreate_const_macro_from_expansions might be able to realize that all uses are boolifying it.

Contributor Author

> Do you know what might help us translate LITERAL_BOOL as const LITERAL_BOOL: bool = ...;? It seems like recreate_const_macro_from_expansions might be able to realize that all uses are boolifying it.

I'm not sure, but I'll look into it.

@kkysen kkysen force-pushed the kkysen/translate-const-macros-minimal branch from 00730ae to a962605 Compare August 7, 2025 09:16
@kkysen kkysen force-pushed the kkysen/expanded-translate-const-macros-conservative branch 2 times, most recently from a4708f4 to da528f1 Compare August 7, 2025 09:30
@kkysen kkysen force-pushed the kkysen/translate-const-macros-minimal branch from 469e4c3 to 6c689e1 Compare August 7, 2025 09:40
@kkysen kkysen force-pushed the kkysen/expanded-translate-const-macros-conservative branch 2 times, most recently from f96631d to 2e42fc9 Compare August 7, 2025 10:00
@fw-immunant
Contributor

One thing I'd like to include here is documentation (just a comment probably) relating the conservative/pessimistic analysis in is_const_expr with the less-conservative (aka unsound) analysis used by --translate-const-macros=experimental: in the latter case, we simply call ctx.set_const(true) and translate the expression. This has the effect of making various checks of (effectively) the form if ctx.is_const() { return error } trigger at various points in translating the expression where we recognize it to be incompatible with const. I don't think we need to merge the is_const_expr approach with this one, as the latter is fail-open (and does so in practice, e.g. in the translation of function call expressions which results in #803). But we should note for readers that there are effectively two implementations of this logic, and if/when we attempt to improve the experimental mode it would be by adding more logic of this form.

Contributor

@fw-immunant fw-immunant left a comment

First 3 commits look good, but please add the comment/documentation I mention relating is_const_expr to ctx.is_const.

4th commit should have some test coverage before merging.

Base automatically changed from kkysen/translate-const-macros-minimal to master August 8, 2025 04:43
@kkysen kkysen force-pushed the kkysen/expanded-translate-const-macros-conservative branch 2 times, most recently from b3e290b to a6ce89e Compare August 8, 2025 09:44
@kkysen kkysen changed the base branch from master to kkysen/reapply-translate-const-macros-conservative-default August 8, 2025 09:44
Contributor Author

@kkysen kkysen left a comment

> 4th commit should have some test coverage before merging.

Done in a6ce89e.

@kkysen
Contributor Author

kkysen commented Aug 8, 2025

> This has the effect of making various checks of (effectively) the form if ctx.is_const() { return error } trigger at various points in translating the expression where we recognize it to be incompatible with const. I don't think we need to merge the is_const_expr approach with this one, as the latter is fail-open (and does so in practice, e.g. in the translation of function call expressions which results in #803).

So you're saying the existing approach of checking if ctx.is_const() is incorrect, right?

@fw-immunant
Contributor

> > This has the effect of making various checks of (effectively) the form if ctx.is_const() { return error } trigger at various points in translating the expression where we recognize it to be incompatible with const. I don't think we need to merge the is_const_expr approach with this one, as the latter is fail-open (and does so in practice, e.g. in the translation of function call expressions which results in #803).
>
> So you're saying the existing approach of checking if ctx.is_const() is incorrect, right?

It's currently known to be incorrect, e.g. it doesn't check function calls for constness, resulting in #803. But it's not fundamentally wrong to implement constness checking that way if we covered every possible case during translation, and for an implementation that wants to cover all cases I think it's probably the right way to implement it--we don't want a separate full AST traversal for each analysis because they would need to stay synchronized with respect to the particular way we translate every idiom so they can reason about the Rust we're generating. For now I think it's fine to have two parallel approaches (a conservative one and an aspirationally precise one) as long as we document what's going on.
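The fail-open behavior described above can be sketched with a toy translator. Everything here is hypothetical (the `Expr` enum, the `Ctx` struct with a plain `is_const` field, and the `translate` function); it only mirrors the described pattern of `if ctx.is_const() { return error }` checks placed at individual translation sites, and is not c2rust's actual code.

```rust
/// Hypothetical toy expression tree (assumption for illustration).
#[allow(dead_code)]
enum Expr {
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
    Call(String),
    StmtExpr,
}

/// Hypothetical translation context carrying the "we are inside a const" flag.
struct Ctx {
    is_const: bool,
}

/// Translate an expression, rejecting const-incompatible forms only at the
/// sites that remember to check the flag.
fn translate(ctx: &Ctx, expr: &Expr) -> Result<String, String> {
    match expr {
        Expr::Literal(n) => Ok(n.to_string()),
        Expr::Add(lhs, rhs) => Ok(format!(
            "({} + {})",
            translate(ctx, lhs)?,
            translate(ctx, rhs)?
        )),
        Expr::StmtExpr => {
            // This site checks the flag, so it fails as intended in a const.
            if ctx.is_const {
                return Err("statement expression in const context".to_string());
            }
            Ok("{ /* ... */ }".to_string())
        }
        // This site has no check, so a call slips through even when
        // `ctx.is_const` is set: the analysis fails open.
        Expr::Call(name) => Ok(format!("{}()", name)),
    }
}
```

With `is_const` set, translating `1 + f()` still succeeds in this sketch, whereas a conservative pre-check would reject the whole expression because it contains a call it cannot prove const.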

@kkysen kkysen force-pushed the kkysen/expanded-translate-const-macros-conservative branch from 2654c30 to a6ce89e Compare August 9, 2025 00:27
@kkysen kkysen changed the base branch from kkysen/reapply-translate-const-macros-conservative-default to kkysen/const-macro-expansions-override_ty-cast August 13, 2025 08:23
@kkysen kkysen force-pushed the kkysen/expanded-translate-const-macros-conservative branch from a6ce89e to 0f97e7d Compare August 13, 2025 08:23
Base automatically changed from kkysen/const-macro-expansions-override_ty-cast to master August 16, 2025 06:39
@kkysen kkysen force-pushed the kkysen/expanded-translate-const-macros-conservative branch 3 times, most recently from d0c85d0 to 650e7de Compare August 19, 2025 09:00
@kkysen kkysen force-pushed the kkysen/expanded-translate-const-macros-conservative branch 3 times, most recently from 38064aa to 8b6aad4 Compare August 21, 2025 06:14
@kkysen kkysen requested a review from fw-immunant August 21, 2025 06:23
kkysen added a commit that referenced this pull request Aug 21, 2025
kkysen added a commit that referenced this pull request Aug 21, 2025
…sion_test}` into `IndexMap`s instead of `HashMap`s (#1334)

* Split out of #1306.

Previously, `macro_invocations` was a `HashMap`, and thus iterating
through it was unordered, which populated the `Vec<CExprId>` of
`macro_expansions` non-deterministically, which then resulted in
non-deterministic output from `--translate-const-macros conservative`.

I also changed the other `macro_*` maps to `IndexMap`, as many other
maps in `TypedAstContext` are already `IndexMap`s, too, and it's likely
that we want these stably ordered and deterministic.
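To show why insertion order matters here, the following sketch uses a tiny insertion-ordered map built only from the standard library. `OrderedMap` is a stand-in written for this example so that it compiles without dependencies; it is not the `indexmap` crate's `IndexMap`, and the `u32` keys and values are placeholders for the real ID types.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Minimal insertion-ordered map (illustrative stand-in, not `IndexMap`).
struct OrderedMap<K, V> {
    index: HashMap<K, usize>,
    entries: Vec<(K, V)>,
}

impl<K: Hash + Eq + Clone, V> OrderedMap<K, V> {
    fn new() -> Self {
        Self { index: HashMap::new(), entries: Vec::new() }
    }

    /// Insert or overwrite; a key keeps the position of its first insertion.
    fn insert(&mut self, key: K, value: V) {
        if let Some(&i) = self.index.get(&key) {
            self.entries[i].1 = value;
            return;
        }
        self.index.insert(key.clone(), self.entries.len());
        self.entries.push((key, value));
    }

    /// Iterate in insertion order, independent of any hashing.
    fn iter(&self) -> std::slice::Iter<'_, (K, V)> {
        self.entries.iter()
    }
}

/// Flatten per-key expression lists into one `Vec`. The result order depends
/// on the map's iteration order, which is why an unordered map made it vary.
fn collect_expansions(invocations: &OrderedMap<u32, Vec<u32>>) -> Vec<u32> {
    invocations
        .iter()
        .flat_map(|(_, exprs)| exprs.iter().copied())
        .collect()
}
```

With a plain `HashMap` in place of `OrderedMap`, the flattened `Vec` could come out in a different order from run to run, since the standard library's default hasher is randomly seeded.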
Contributor Author

@kkysen kkysen left a comment

> It's currently known to be incorrect, e.g. it doesn't check function calls for constness, resulting in #803. But it's not fundamentally wrong to implement constness checking that way if we covered every possible case during translation, and for an implementation that wants to cover all cases I think it's probably the right way to implement it--we don't want a separate full AST traversal for each analysis because they would need to stay synchronized with respect to the particular way we translate every idiom so they can reason about the Rust we're generating. For now I think it's fine to have two parallel approaches (a conservative one and an aspirationally precise one) as long as we document what's going on.

Fixed in 604c58a.

@kkysen kkysen force-pushed the kkysen/expanded-translate-const-macros-conservative branch from 604c58a to cdb7bd0 Compare August 21, 2025 19:18
@kkysen kkysen changed the base branch from master to kkysen/const-macros-unsafe-block August 21, 2025 19:19
Contributor

@fw-immunant fw-immunant left a comment

The comments in the last commit need some fixes (see my inline comments at 604c58a), but that can be addressed in a follow up. These changes look good, but we should avoid making all translations of const macros unsafe now that we're doing them by default. The goal of the conservative const macro translation effort is to generate fully idiomatic code for cases simple enough for us to do so, so we should avoid regressing translation of the simplest cases like #define FOO 5.

So this can merge into kkysen/const-macros-unsafe-block if you like, but I'd rather not merge that branch into master while it adds unsafe to the simplest const macro translations. Merging fw/const-less-unsafe into this branch (or into kkysen/const-macros-unsafe-block after this merges into it) would be my suggested route to that.

The stacked PR workflow is confusing but hopefully the above makes sense.

Contributor Author

@kkysen kkysen left a comment

> These changes look good, but we should avoid making all translations of const macros unsafe now that we're doing them by default. The goal of the conservative const macro translation effort is to generate fully idiomatic code for cases simple enough for us to do so, so we should avoid regressing translation of the simplest cases like #define FOO 5.

This discussion kind of got split over multiple PRs, so I just wanted to link to the other parts here in case you missed them: #1335 (review), #1339 (review).

> So this can merge into kkysen/const-macros-unsafe-block if you like, but I'd rather not merge that branch into master while it adds unsafe to the simplest const macro translations. Merging fw/const-less-unsafe into this branch (or into kkysen/const-macros-unsafe-block after this merges into it) would be my suggested route to that.
>
> The stacked PR workflow is confusing but hopefully the above makes sense.

I'm not sure what the point of avoiding a temporary change to master is. We're not publishing a separate version in the meantime. From what I understand, the goal is not to keep things perfect all the time, but to make steady improvements quickly. We also haven't published --translate-const-macros conservative, so I don't see how changing what it emits would be a regression. So I'd much prefer to merge these PRs in order.

kkysen added a commit that referenced this pull request Aug 29, 2025
…ranslate-const-macros conservative` (#1335)

* Split out of #1306.

Some operations, such as ptr arithmetic, are `unsafe` and can be done in
const macros. So as an initially overly conservative implementation,
just make all `const`s use `unsafe` blocks in case `unsafe` operations
are used. This is what we already do for `fn`s, for example, even if all
of the operations in a `fn` are safe. We can improve this, but for now,
this works and is correct.
Base automatically changed from kkysen/const-macros-unsafe-block to master August 29, 2025 20:58
…re `CExprKind`s

We do this by recursively checking whether an expression is `const`.
This is generally done in a conservative manner,
modulo a few bugs and unhandled things, namely:

* the `ExplicitCast` bug (#853)
* non-`const` `sizeof(VLA)`s
* `sizeof`s and other `UnaryType` exprs (e.g. `alignof`)
* `f128`'s non-`const` methods (#1262)
* statements
@kkysen kkysen force-pushed the kkysen/expanded-translate-const-macros-conservative branch from cdb7bd0 to 81551c0 Compare September 2, 2025 05:37
…re of the conservative and experimental const expr checks
@kkysen kkysen force-pushed the kkysen/expanded-translate-const-macros-conservative branch from 81551c0 to ecddc95 Compare September 2, 2025 06:00
Contributor Author

@kkysen kkysen left a comment

> The comments in the last commit need some fixes (see my inline comments at 604c58a), but that can be addressed in a follow up.

Fixed in 81551c0..ecddc95.

@kkysen kkysen merged commit c458496 into master Sep 2, 2025
5 checks passed
@kkysen kkysen deleted the kkysen/expanded-translate-const-macros-conservative branch September 2, 2025 06:28
Development

Successfully merging this pull request may close these issues.

Problematic macro translation with --translate-const-macros