Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

debug-fmt-detail option #123940

Merged
merged 1 commit into from
Aug 29, 2024
Merged

debug-fmt-detail option #123940

merged 1 commit into from
Aug 29, 2024

Conversation

kornelski
Copy link
Contributor

@kornelski kornelski commented Apr 14, 2024

I'd like to propose a new option that makes #[derive(Debug)] generate no-op implementations that don't print anything, and makes {:?} in format strings a no-op.

There are a couple of motivations for this:

  1. A more thorough stripping of debug symbols. Binaries stripped of debug symbols still retain some of them through Debug implementations. It's hard to avoid that without compiler's help, because debug formatting can be used in many places, including dependencies, and their loggers, asserts, panics, etc.
    • In my testing it gives about 2% binary size reduction on top of all other binary-minimizing best practices (including panic_immediate_abort). There are targets like Web WASM or embedded where users pay attention to binary sizes.
    • Users distributing closed-source binaries may not want to "leak" any symbol names as a matter of principle.
  2. Adds ability to test whether code depends on specifics of the Debug format implementation in unwise ways (e.g. trying to get data unavailable via public interface, or using it as a serialization format). Because current Rust's debug implementation doesn't change, there's a risk of it becoming a fragile de-facto API that won't be possible to change in the future. An option that "breaks" it can act as a grease.

This implementation is a -Z fmt-debug=opt flag that takes:

  • full — the default, current state.
  • none — makes derived Debug and {:?} no-ops. Explicit impl Debug for T implementations are left unharmed, but {:?} format won't use them, so they may get dead-code eliminated if they aren't invoked directly.
  • shallow — makes derived Debug print only the type's name, without recursing into fields. Fieldless enums print their variant names. {:?} works.

The shallow option is a compromise between minimizing the Debug code, and compatibility. There are popular proc-macro crates that use Debug::fmt as a way to convert enum values into their Rust source code.

There's a corresponding cfg flag: #[cfg(fmt_debug = "none")] that can be used in user code to react to this setting to minimize custom Debug implementations or remove unnecessary formatting helper functions.

@rustbot
Copy link
Collaborator

rustbot commented Apr 14, 2024

r? @estebank

rustbot has assigned @estebank.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Apr 14, 2024
@rustbot
Copy link
Collaborator

rustbot commented Apr 14, 2024

Some changes occurred in cfg and check-cfg configuration

cc @Urgau

@rust-log-analyzer

This comment has been minimized.

@kpreid
Copy link
Contributor

kpreid commented Apr 14, 2024

I'd love to see this happen.

I have a suggestion for a refinement: In the none case, I think it might be better to write a constant string rather than the empty string. While this does slightly negate the code-size benefit, it will aid debuggability to get a recognizable string rather than a mysteriously empty one (because many different bugs can cause empty strings, and the empty string might even be used intentionally with the expectation that no value's Debug representation is the empty string).

This constant string should be "[object Object]" — just kidding. Maybe something like "<disabled>", "{debug-fmt-detail=none}", or perhaps, for minimality while still using Rust idiom,"_".

@kornelski kornelski force-pushed the remove-derived-debug branch from 7437d70 to aabf2f6 Compare April 14, 2024 22:29
@rust-log-analyzer

This comment has been minimized.

@kornelski kornelski force-pushed the remove-derived-debug branch from aabf2f6 to 1721c63 Compare April 15, 2024 00:40
@rustbot
Copy link
Collaborator

rustbot commented Apr 15, 2024

Some changes occurred in tests/ui/check-cfg

cc @Urgau

@rust-log-analyzer

This comment has been minimized.

@kornelski
Copy link
Contributor Author

@kpreid But none isn't merely writing an empty string. It makes Debug::fmt an empty function that doesn't touch the Formatter, which makes it possible to optimize it entirely.

There is LoweringContext::lower_format_args that maybe could replace {:?} with a string literal. However, it needs to keep the arg expressions for their side effects. Perhaps the args could be changed to something like drop(&arg) or (&arg, "").1, but I wasn't able to figure out how to transform the expressions. It needs to modify FormatArgs with ast::Expr, but the lower_format_args seems to have access only to hir::Expr.

@kornelski kornelski force-pushed the remove-derived-debug branch from 1721c63 to bceb534 Compare April 15, 2024 01:46
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@kornelski kornelski force-pushed the remove-derived-debug branch 2 times, most recently from bf36e51 to 3905795 Compare April 20, 2024 15:45
@rustbot
Copy link
Collaborator

rustbot commented Apr 20, 2024

Some changes occurred in src/doc/rustc/src/check-cfg.md

cc @Urgau

@rust-log-analyzer

This comment has been minimized.

@kornelski kornelski force-pushed the remove-derived-debug branch from 3905795 to 92cac91 Compare April 20, 2024 18:12
@rust-log-analyzer

This comment has been minimized.

@kornelski kornelski force-pushed the remove-derived-debug branch from 7736b47 to 88b9edc Compare August 28, 2024 22:32
@compiler-errors
Copy link
Member

@rfcbot fcp cancel

This does not need FCP approval now, since it will need another FCP on stabilization (if ever). This should count as having been MCP'd though, since it's been like 3 months and nobody has spoken up.

@rfcbot
Copy link
Collaborator

rfcbot commented Aug 29, 2024

@compiler-errors proposal cancelled.

@rfcbot rfcbot removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Aug 29, 2024
@compiler-errors
Copy link
Member

This should be ready for review.

@jieyouxu jieyouxu added -Zfmt-debug Unstable option: fmt-debug F-fmt_debug `#![feature(fmt_debug)]` labels Aug 29, 2024
@Urgau
Copy link
Member

Urgau commented Aug 29, 2024

Thanks for the new option, and sorry it took so long for the PR be ready for final approval.

r? @Urgau
@bors r+

@bors
Copy link
Collaborator

bors commented Aug 29, 2024

📌 Commit 88b9edc has been approved by Urgau

It is now in the queue for this repository.

@rustbot rustbot assigned Urgau and unassigned estebank Aug 29, 2024
@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 29, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 29, 2024
…llaumeGomez

Rollup of 7 pull requests

Successful merges:

 - rust-lang#123940 (debug-fmt-detail option)
 - rust-lang#128166 (Improved `checked_isqrt` and `isqrt` methods)
 - rust-lang#128970 (Add `-Zlint-llvm-ir`)
 - rust-lang#129316 (riscv64imac: allow shadow call stack sanitizer)
 - rust-lang#129690 (Add `needs-unwind` compiletest directive to `libtest-thread-limit` and replace some `Path` with `path` in `run-make`)
 - rust-lang#129732 (Add `unreachable_pub`, round 3)
 - rust-lang#129743 (Fix rustdoc clippy lints)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 29, 2024
…llaumeGomez

Rollup of 7 pull requests

Successful merges:

 - rust-lang#123940 (debug-fmt-detail option)
 - rust-lang#128166 (Improved `checked_isqrt` and `isqrt` methods)
 - rust-lang#128970 (Add `-Zlint-llvm-ir`)
 - rust-lang#129316 (riscv64imac: allow shadow call stack sanitizer)
 - rust-lang#129690 (Add `needs-unwind` compiletest directive to `libtest-thread-limit` and replace some `Path` with `path` in `run-make`)
 - rust-lang#129732 (Add `unreachable_pub`, round 3)
 - rust-lang#129743 (Fix rustdoc clippy lints)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 015e937 into rust-lang:master Aug 29, 2024
6 checks passed
@rustbot rustbot added this to the 1.82.0 milestone Aug 29, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Aug 29, 2024
Rollup merge of rust-lang#123940 - kornelski:remove-derived-debug, r=Urgau

debug-fmt-detail option

I'd like to propose a new option that makes `#[derive(Debug)]` generate no-op implementations that don't print anything, and makes `{:?}` in format strings a no-op.

There are a couple of motivations for this:

1. A more thorough stripping of debug symbols. Binaries stripped of debug symbols still retain some of them through `Debug` implementations. It's hard to avoid that without compiler's help, because debug formatting can be used in many places, including dependencies, and their loggers, asserts, panics, etc.
   * In my testing it gives about 2% binary size reduction on top of all other binary-minimizing best practices (including `panic_immediate_abort`). There are targets like Web WASM or embedded where users pay attention to binary sizes.
   * Users distributing closed-source binaries may not want to "leak" any symbol names as a matter of principle.
2. Adds ability to test whether code depends on specifics of the `Debug` format implementation in unwise ways (e.g. trying to get data unavailable via public interface, or using it as a serialization format). Because current Rust's debug implementation doesn't change, there's a risk of it becoming a fragile de-facto API that [won't be possible to change in the future](https://www.hyrumslaw.com/). An option that "breaks" it can act as a [grease](https://www.rfc-editor.org/rfc/rfc8701.html).

This implementation is a `-Z fmt-debug=opt` flag that takes:

* `full` — the default, current state.
* `none` — makes derived `Debug` and `{:?}` no-ops. Explicit `impl Debug for T` implementations are left unharmed, but `{:?}` format won't use them, so they may get dead-code eliminated if they aren't invoked directly.
* `shallow` — makes derived `Debug` print only the type's name, without recursing into fields. Fieldless enums print their variant names. `{:?}` works.

The `shallow` option is a compromise between minimizing the `Debug` code, and compatibility. There are popular proc-macro crates that use `Debug::fmt` as a way to convert enum values into their Rust source code.

There's a corresponding `cfg` flag: `#[cfg(fmt_debug = "none")]` that can be used in user code to react to this setting to minimize custom `Debug` implementations or remove unnecessary formatting helper functions.
@kornelski kornelski deleted the remove-derived-debug branch August 29, 2024 23:33
@kornelski
Copy link
Contributor Author

Thank you all for helping to merge this.

@sffc
Copy link

sffc commented Dec 27, 2024

I'm really happy to see progress on binary size in Rust. Many of our clients in ICU4X care about this subject, and anything we can do here increases our value proposition.

I would like to see more documentation about how this is intended to work with custom Debug impls. There are a number of reasons why we can't always use #[derive(Debug)]:

  1. #[derive(Debug)] adds T: Debug bounds to all generics, which is sometimes not necessary, such as when the type is used in a PhantomData. We have numerous custom Debug impls that do what #[derive(Debug)] "should do" but without the extra, unnecessary bound.

  2. Fields that should be omitted from the debug impl. One example I've encountered before was a Mutex field pointing to a pre-computed value in a cache. I wrote a custom Debug impl in an effort to make the Debug impl not deadlock, since the value in the Mutex didn't change the data identity of the object.

  3. In general when we can provide a more useful debug string than derive(Debug)

Would something like #[cfg(fmt_debug = "shallow")] work to allow size reduction of custom Debug impls? If so, it would be nice to document this so that a size-conscious library like ICU4X can instrument our custom Debug impls to leverage the new flag.


Additional question: would it be possible for format!("a {b:?} c") to compile to something that concat!s the prefix, suffix, and type_name of b? This seems maybe more friendly than mutating the Debug impls to have different behavior based on compile flags.

@Manishearth
Copy link
Member

Manishearth commented Dec 27, 2024

Would something like #[cfg(fmt_debug = "shallow")] work

This does currently work, it should be documented.

Additional question: would it be possible for format!("a {b:?} c") to compile to something that concat!s the prefix, suffix, and type_name of b? This seems maybe more friendly than mutating the Debug impls to have different behavior based on compile flags.

An important feature of -Zfmt-debug=shallow is that it still formats primitives: strings and integers work fine, it's just that wrappers around them don't. This option is more like an alternate version of fmt-debug=none where instead of interpolating empty strings it would have a little bit more context, keeping debug strings somewhat distinguishable.

Worth sketching out what you're looking for exactly. Currently, the behavior of println!("{:?} / {:?} / {:?} / {:?}", 1, "foo", WrapsU8(1), HasCustomDebug); is

  • -Zfmt-debug=full: 1 / "foo" / WrapsU8(1) / (this has a custom debug impl)
  • -Zfmt-debug=shallow: 1 / "foo" / WrapsU8 / (this has a custom debug impl) (This works by changing the behavior of derive(Debug))
  • -Zfmt-debug=none: / / / (This works by changing the behavior of format!())

In your proposal, what would you have -Zfmt-debug=type-names-only produce? u8 / str / WrapsU8 / HasCustomDebug? That could work, though I'm not sure if it's much better than what "none" produces. This is the approach you have described with concat!().

Trying to get the behavior of 1 / "foo" / WrapsU8 / HasCustomDebug is trickier, though: which debug impls get passed through, and which get erased? The "shallow" and "none" approaches operate on different parts of compilation: Shallow impacts custom derives, and none impacts formatting, but formatting doesn't know anything about the types and can't make any distinction between primitives and not-primitives.

@Manishearth
Copy link
Member

Really, for the custom Debug issue, a thing I've wanted for ages is Rust to have a uniform list of attributes that can be applied to all of its custom derives that allow for skipping and replace-with-function-ing derive behavior on a field. Should write that RFC at some point.

@sffc
Copy link

sffc commented Dec 28, 2024

Since fmt-debug=none builds on fmt-debug=shallow, then shouldn't #[cfg(fmt_debug = "shallow")] really be #[cfg(any(fmt_debug = "shallow", fmt_debug="none"))]? That doesn't seem very readable or future-proof.

Yeah, u8 / str / WrapsU8 / HasCustomDebug seems modestly more useful than / / / but also more foolproof than 1 / "foo" / WrapsU8(1) / (this has a custom debug impl). The other thing I like about the concat! approach is that it eliminates the use of core::fmt::Formatter which further reduces binary size.

Thought: a format syntax character could prevent the debug impl from being removed by the fmt-debug config. Something like {:?!} (not sure which syntax characters are available).

I think a well-balanced setting would be:

  1. Change format! and all other standard library formatting macros to substitute type_of instead of impl Debug and concatenate the strings to reduce the need for core::fmt::Formatter
  2. Add a syntax character such as in {:?!} to keep using impl Debug
  3. No change to derived or custom Debug impls. Hopefully they will be more likely to be DCE'd.

I don't think the current proposal has an option for this.

@Manishearth
Copy link
Member

Since fmt-debug=none builds on fmt-debug=shallow

...kiiiind of, fmt-debug=none does blank out derive(Debug) (it's unclear to me why, it seems redundant) but that is not its primary mechanism, it primarily works by making format!() no longer call Debug at all. As such you should not need to use cfg to change custom derive impls to fit the none case unless you care about derive(Debug) specifically, which seems unnecessary for a fmt-debug=none build. Documentation should probably be written somewhere to suggest that people writing custom Debug impls do something special for cfg(fmt-debug=shallow) (but nothing more).

Yeah, u8 / str / WrapsU8 / HasCustomDebug seems modestly more useful than / / / but also more foolproof than 1 / "foo" / WrapsU8(1) / (this has a custom debug impl).

Can you expand on what you mean here?

A real format string would probably have more than just / and would be traceable to its source, especially as a panic. / / / is not too clear and knowing the types would be nice, but a real format string like number of iterations: {:?} does not particularly benefit from knowing the type of the number. but not the value. Some generic cases, maybe, but if that matters I'd rather ask people to include the type name in the debug string.

I think "use the type name instead of blank" is probably a viable way to improve fmt-debug=none without adding much cost1. It's not clear to me if having a separate mode for that behavior brings much value.

Add a syntax character such as in {:?!} to keep using impl Debug

n.b. fmt-debug=none does continue to support Display. To a large part the primitives vs non primitives distinction is already handled by using {}. This just doesn't help for types which are only Debug.

use of a DebugAsDisplay wrapper object would solve that problem. Path has such a wrapper type but it's not generic and reusable.

Footnotes

  1. There are potential problems with large types here. Rust could use an internal std::any::type_name_truncated() for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-Zfmt-debug Unstable option: fmt-debug F-fmt_debug `#![feature(fmt_debug)]` S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.