Add `must-use-output` attribute #3773

Kixunil · 2025-02-16T12:10:15Z

This adds an RFC proposing a new attribute that would help detect mistakes when a parameter of a function is not used after the function returns. E.g. when inserting into a collection that is then immediately dropped.
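As an illustration of the kind of mistake described above, here is a minimal sketch. The function names are hypothetical and the proposed attribute is not used, since it does not exist yet. As far as I can tell, the buggy version compiles without any warning today because the local vector counts as "used" by the `push` call.

```rust
/// Hypothetical buggy helper: the entry is pushed into a private copy that is
/// dropped when the function returns, so the push has no observable effect.
fn record_buggy(log: &[String], entry: &str) -> usize {
    let mut copy = log.to_vec();
    copy.push(entry.to_string()); // `copy` is never read after this call
    log.len()
}

/// Intended version: the caller's vector is mutated and then read.
fn record_fixed(log: &mut Vec<String>, entry: &str) -> usize {
    log.push(entry.to_string());
    log.len()
}

fn main() {
    let mut log = vec![String::from("start")];
    println!("buggy: {}", record_buggy(&log, "a"));
    println!("fixed: {}", record_fixed(&mut log, "a"));
}
```

With an attribute like the one proposed on the receiver of `Vec::push`, the call in `record_buggy` is the sort of place where a lint could presumably fire.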

Cc @joshtriplett

Rendered

The PR number is 3773.

joshtriplett · 2025-02-16T12:19:09Z

text/0000-must-use-output.md

+
+Note also that some functions may be useless to call even when they take in truly immutable (`Freeze`) references. For instance `clone` is wasteful if the original is not accessed.


Suggested change

Note also that some functions may be useless to call even when they take in truly immutable (`Freeze`) references. For instance `clone` is wasteful if the original is not accessed.

This is true, but I don't think it's on-topic for this RFC, and addressing it would need some additional care to avoid the annoying ripple-effect problem of "don't write the clone on the last use". Can we drop it from this RFC?

How about we mention it but don't propose to mark it just yet? Maybe some library has a use for it in some other context and I don't think adding code to ban Freeze references is worth it.

@Kixunil I'm not suggesting adding code to ban it, just that it's something different and may not want to use the same handling. Could you move it to one of the later notes sections (e.g. future work), then, along with the note that this isn't proposing to mark such values?

I'm confused. If we allow it right away and there's a clear use case with clone-like functions, which people could write, how is it a "future possibility"? Note that while I wrote clone in the list below, I can simply remove it from the list and reword this to:

Note also that some functions may be useless to call even when they take in truly immutable (Freeze) references. For instance clone is wasteful if the original is not accessed, however this RFC is not proposing adding the attribute to core's Clone::clone just yet.

And put "Add the attribute to clone" in "Future possibilities"
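To make the `clone` case concrete, here is a small sketch with hypothetical function names. It only shows the wasteful pattern under discussion; it does not use the proposed attribute.

```rust
/// Wasteful: `names` is cloned, but the original is never accessed again,
/// so the clone only costs an extra allocation and copy.
fn total_len_wasteful(names: Vec<String>) -> usize {
    let copy = names.clone();
    copy.iter().map(String::len).sum()
}

/// Equivalent result without the clone.
fn total_len(names: Vec<String>) -> usize {
    names.iter().map(String::len).sum()
}

fn main() {
    let names = vec![String::from("ab"), String::from("cde")];
    println!("{}", total_len_wasteful(names.clone()));
    println!("{}", total_len(names));
}
```

Note that the `names.clone()` in `main` is the legitimate kind: the original is still used afterwards. Telling these two situations apart is what makes marking `clone` more delicate than marking `push`.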

text/0000-must-use-output.md

This was overlooked in initial PR.

text/3773-must-use-output.md

These address the review comments where the author completely agrees with the reviewer.

Kixunil · 2025-02-16T12:55:41Z

I've incorporated all suggestions I completely agree with.

text/3773-must-use-output.md

This addresses all comments raised by Kevin Reid and one by Josh Triplett.

Kixunil · 2025-02-16T19:26:42Z

Thanks @kpreid! I've addressed your feedback. Also @joshtriplett, thanks for your review so far; I have also addressed the other rewording.

joshtriplett · 2025-02-26T23:28:52Z

text/3773-must-use-output.md

+- Make it a `clippy` lint instead. However not everyone uses `clippy` and the need to communicate which arguments are considered important would diminish its effect. `#[must_use]` is a native rustc lint, and this should be as well, using the same machinery.
+- Try to somehow analyze the code and not require the attribute. This seems hard and it could lead into libraries accidentally triggering warnings in users code if they change the body of the function.
+- Implement a more full feature — e.g. by doing some things in "Future possibilities" section. However, this feature should be useful even without them.
+- Have the attribute on the function instead listing the names of paramters. This could make it nicer to extend to support the "or" relationship described in "Future possibilities".


Suggested change

- Have the attribute on the function instead listing the names of paramters. This could make it nicer to extend to support the "or" relationship described in "Future possibilities".

- Have the attribute on the function instead listing the names of parameters. This could make it nicer to extend to support the "or" relationship described in "Future possibilities".

joshtriplett · 2025-02-26T23:29:21Z

text/3773-must-use-output.md

+[rationale-and-alternatives]: #rationale-and-alternatives
+
+- We could also simply not do this but the potential mistakes it catches are real.
+- The name could be something different and preferably shorter. The name used here was suggested by Josh Triplett and is pretty readable and it can be used before stabilization in the absence of better one. A pretty short and clear `#[must_read]` was also suggested by Kevin Reid.


For the record, while I mildly prefer must_use_output, I would check a box for must_read. I do think it somewhat has the problem that must_use would (it could be read as applying to the body of the function rather than to the caller), but it does seem slightly better.

maybe #[caller_must_use]?

@kennytm Sadly still pretty long but I like it better than must_use_output. It at least doesn't pose "what do you mean by the output?" question.

@Kixunil Agreed. must_use and must_read don't answer "who". must_use_output specifies that indirectly: "the one that gets this as an output must use it". But caller_must_use specifies it directly. That seems pretty clear to me.

clarfonthey · 2025-02-27T02:37:33Z

While I like the idea of this feature, I do wonder how many custom attributes we're going to have to add to the language before we just decide to reintroduce the idea of labeling functions as pure.

Kixunil · 2025-02-27T07:25:43Z

@clarfonthey note that if we had pure attribute/keyword it wouldn't help here because, according to Wikipedia:

the function has no side effects (no mutation of local static variables, non-local variables, mutable reference arguments or input/output streams)

An attribute that disallows side effects except for allocation and mutation of parameters might be interesting but we would need to invent a name for it. :)

clarfonthey · 2025-02-28T00:29:33Z

I personally think that it's okay to consider a function pure if its mutations are explicitly defined; the issue with purity is when they're not defined, i.e. external state.

There isn't much of a distinction between taking and returning a value versus taking a mutable reference, other than calling convention. The idea of purity really boils down to whether a function call can be removed if its output is not used, which is what's relevant here.

Sure, maybe calling it pure is a bad idea, and maybe const isn't good enough either, but we're definitely approaching the point where typing pure fn is a whole lot easier than adding multiple attributes.

Note also that under this model, attributes like must_use are still useful, but for cases where ignoring the value does not create dead code and the lint instead suggests to the user that they should handle extra cases. So, pushing to a vector that isn't used means the entire lifetime of the vector can be deleted, whereas ignoring an error still means the function is run.

The other benefit of this is that, with an actual language feature for this, we can do complete dead code analysis, whereas at the moment, the best we can do is emit one lint at the final call site if the code path ends.
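The distinction drawn above between dead code and an ignored result can be sketched as follows. The names are hypothetical and a `Cell` stands in for any externally observable state.

```rust
use std::cell::Cell;

/// Dead code in the sense above: nothing ever observes `v`, so the whole
/// lifetime of the vector could be deleted without changing behavior.
fn dead_push() {
    let mut v = Vec::new();
    v.push(1);
}

/// A fallible operation with an observable side effect.
fn bump(counter: &Cell<u32>) -> Result<u32, String> {
    counter.set(counter.get() + 1);
    Ok(counter.get())
}

/// Ignoring the result is not dead code: `bump` still runs and still
/// changes the counter. The caller merely skips handling the outcome.
fn ignore_result(counter: &Cell<u32>) {
    let _ = bump(counter);
}

fn main() {
    dead_push();
    let counter = Cell::new(0);
    ignore_result(&counter);
    println!("counter = {}", counter.get());
}
```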

traviscross · 2025-03-07T08:48:50Z

text/3773-must-use-output.md

+
+- We could also simply not do this but the potential mistakes it catches are real.
+- The name could be something different and preferably shorter. The name used here was suggested by Josh Triplett and is pretty readable and it can be used before stabilization in the absence of better one. A pretty short and clear `#[must_read]` was also suggested by Kevin Reid.
+- We could write `#[must_use]` on a reference parameter instead. The downside would be that this could be mistaken for saying that the *callee* must use the parameter, rather than the *caller*.


This doesn't seem like much of a problem to me. We already lint on function parameters that are unused within the callee¹, so if I saw #[must_use] on a &mut parameter, I'd expect this would refer to an obligation on the caller, especially as that's the existing connotation of #[must_use].

Barring better proposals (and hopefully more concise ones than #[must_use_output]), my own preference is to just use #[must_use] for this.

Footnotes

1. Except for self, unfortunately, due to not having a good way to suppress it, but realistically we would never use #[must_use] &self in this way, to require something of the callee.

Yeah, I also feel like it wouldn't be that big a deal and people would figure it out quickly. But at the same time I'm not bothered by the name too much and I respect whatever decision will be made by the appropriate team.

I also prefer #[must_use]. When applied to a function, it also means "the output at the call site must be used", and people don't have a problem understanding it.

Maybe the attribute could be named differently, something along the lines of the function having a side effect that needs to be observed: #[must_use(side_effect)] or #[is_a_setter]?

We have a couple of parallel discussions occurring about the name, one here and one at #3773 (comment).

From that discussion: caller_must_use seems like it removes the ambiguity.

Kixunil · 2025-03-07T16:36:33Z

@clarfonthey I agree pretty much completely and I think excluding mutation through pointer arguments was a bad idea when someone coined the term pure. But the term exists and, to avoid confusion, I'd prefer a different one if possible.

Aloso · 2025-03-09T10:14:48Z

text/3773-must-use-output.md

+
+Have a way to explain to the compiler how smart pointers work so that this can be used on `Pin` as well.
+
+Have an attribute that can express "or" relationship between parameters and return types. For instance, `Vec::pop` is sometimes used to remove the last element and then pass the vec to something else (e.g. trim `\n` at the end) and sometimes it's used to get the last value and the vec is no longer relevant. Someting like `#[must_use_at_least_one(self, return)]` on function could handle this case.


This should just be the default behaviour when a function has multiple outputs (e.g. an output parameter as well as a return type) that should be used. You have to use at least one, not all of them.

P.S. This is another reason why I prefer the name #[must_use] instead of #[must_use_output], because it is conceptually the same thing.

This should just be the default behaviour when a function has multiple outputs (e.g. an output parameter as well as a return type) that should be used. You have to use at least one, not all of them.

Then we would need a way to override it in the other direction, because there are absolutely functions for which you need to use all the output parameters.

Yes, to give a specific example: core::mem::swap.

But also for write and read you should use the returned value specifically. Just using the writer is not enough.
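A short sketch of the cases mentioned in this sub-thread, using hypothetical helper names: `Vec::pop` where only the vector matters, `Vec::pop` where only the returned value matters, and `core::mem::swap` where both outputs are needed.

```rust
/// Only the receiver is used after `pop`: strip one trailing newline.
fn trim_newline(mut line: Vec<u8>) -> Vec<u8> {
    if line.last() == Some(&b'\n') {
        line.pop(); // return value deliberately ignored
    }
    line
}

/// Only the return value is used after `pop`: the vector is dropped.
fn last_byte(mut line: Vec<u8>) -> Option<u8> {
    line.pop()
}

/// Both outputs of `swap` are used: reading only one of them would
/// usually indicate a mistake.
fn sorted_pair(mut a: i32, mut b: i32) -> (i32, i32) {
    if a > b {
        std::mem::swap(&mut a, &mut b);
    }
    (a, b)
}

fn main() {
    println!("{:?}", trim_newline(b"hi\n".to_vec()));
    println!("{:?}", last_byte(b"hi".to_vec()));
    println!("{:?}", sorted_pair(3, 1));
}
```

The first two functions are why an "at least one" rule fits `pop`, while the third is an example where "all of them" is the more useful requirement.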

Aloso · 2025-03-09T10:26:43Z

Related: #[must_use] should go to the return type:

fn foo() -> #[must_use] Output;

This feels more consistent with #[must_use_output] on function parameters.

This would also address the problem that you can't mark part of a return type as #[must_use]:

#[must_use]
fn read() -> Result<usize> {...}

read()?; // does not trigger the lint

What we want is this:

fn read() -> Result<#[must_use] usize> {...}
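A compilable sketch of the gap described here, with a hypothetical `read_some` standing in for `read`. The `?` operator counts as a use of the `Result`, so the byte count is dropped without any lint.

```rust
/// Hypothetical reader: fills up to three bytes and reports how many
/// bytes were written into `buf`.
#[must_use]
fn read_some(buf: &mut [u8]) -> Result<usize, String> {
    let n = buf.len().min(3);
    buf[..n].fill(b'x');
    Ok(n)
}

/// The error is propagated, but the `usize` is silently discarded.
fn caller(buf: &mut [u8]) -> Result<(), String> {
    read_some(buf)?; // no `unused_must_use` warning here
    Ok(())
}

fn main() {
    let mut buf = [0u8; 8];
    println!("{:?}", caller(&mut buf));
}
```

After `caller` returns, the caller cannot know how many bytes of `buf` are valid, which is the situation the `Result<#[must_use] usize>` spelling is meant to catch.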

Kixunil · 2025-03-10T07:31:28Z

@Aloso in principle yes but there isn't much we can do now about it apart from having it as an alternative way to specify the attribute.

ahicks92 · 2025-03-11T03:40:50Z

I'm not sure where specifically in the RFC to comment but the RFC and this thread both don't seem to be mentioning it. As presented this will warn on data structures which impl Drop (it has to, otherwise the vec example at the top cannot work), but it is entirely possible to use a data structure to hold RAII guards (as one example). That is:

struct FancyDataStructure {
    something_atomic: AtomicU64,
    lock: Mutex<()>,
}

fn foo() {
    // snip...
    let mut guards = vec![];
    for thing in structures {
        guards.push(thing.lock.lock().unwrap());
    }

    // Don't touch guards, but when we drop they'll unlock.
}

As far as I can tell that is roughly equivalent to the first example but we actually do for real rely on the drop only. If you're wondering why you'd do something like this, it can come up in concurrent data structures when it is safe for readers to access the other fields.

As another possible write-only do-not-consume case, consider File where you are passing raw fds off to something else but need the File objects to stay around.

The trouble is that in order to do this generically, at least as far as I see, the rule has to become more complex: something like "warn if the container's type parameters do not impl Drop". Otherwise, you get false positives in these cases. Consider a crate like smallvec, which is what I would really use for cases like these; it has to extend to that.

There's the obvious rule "if any type parameter impls Drop then ignore it" but that immediately fails out on Vec<Box<Anything>> or Vec<Vec<Anything>>.

I kind of agree with @clarfonthey in that this is sort of poor man's pure. As written you can basically only use it on pure functions. If it can't handle the Drop case then even putting it in std doesn't work unless the function is pure.

Of course I imagine there'd be some way to turn this off if it's a false positive. Maybe people feel that we should make people do that. I have weak opinions on this whole thing in general: I see no harm in it but have never personally run into a case where it matters.

kennytm · 2025-03-12T06:21:50Z

@ahicks92 in typical Rust that example will be written not as a for loop but a map+collect:

let guards = structures.iter()
    .map(|s| s.lock.lock().unwrap())
    .collect::<Vec<_>>();

for which rustc currently warns with #[warn(unused_variables)]. The standard solution is to name the variable with a leading underscore:

let _guards = structures.iter()
    .map(|s| s.lock.lock().unwrap())
    .collect::<Vec<_>>();

I think the situation should be the same here. If we need to use Vec::push and want neither the warning nor #[expect] or drop(guards), maybe the RFC should support recognizing variables with a leading underscore:

    let mut _guards = vec![];
    for thing in structures {
        _guards.push(thing.lock.lock().unwrap());
        // suppress warning since `Vec::push` is called on `&mut _guards`?
    }
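For comparison, here is a self-contained sketch of the `drop(guards)` option mentioned above. The struct is a trimmed-down, hypothetical version of the one in the earlier comment, and the `try_lock` check exists only to make the effect of the guards observable.

```rust
use std::sync::Mutex;

struct FancyDataStructure {
    lock: Mutex<()>,
}

/// Locks every structure, counts how many are held, then releases them.
fn with_all_locked(structures: &[FancyDataStructure]) -> usize {
    let mut guards = Vec::new();
    for thing in structures {
        guards.push(thing.lock.lock().unwrap());
    }

    // While the guards are alive, `try_lock` fails for each structure.
    let held = structures
        .iter()
        .filter(|s| s.lock.try_lock().is_err())
        .count();

    // Explicit use of `guards` after the last `push`: this both documents
    // the intent and would presumably count as a use for the proposed lint.
    drop(guards);
    held
}

fn main() {
    let structures = vec![
        FancyDataStructure { lock: Mutex::new(()) },
        FancyDataStructure { lock: Mutex::new(()) },
    ];
    println!("held: {}", with_all_locked(&structures));
}
```

Whether an explicit `drop` should be required in such code is exactly the open question in this part of the thread; the sketch only shows that the option works without new language support.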

ahicks92 · 2025-03-12T20:58:19Z

_ prefixes don't play well with this, especially since people are taught that they mark unused variables:

let mut v = vec![];
v.push(something);
if let Some(front) = v.first() {
    // stuff.
}
v.push(another_thing);

Where does the _ go?

let _ = whatever.push(...), under the same logic as for Result, might be the best option; otherwise, as far as I can tell, you have to allow() it somewhere, at least with current mechanisms. For this I'd at least like the mitigation/override to be on or near the line that triggers it.

I also feel the need to push back on the "not idiomatic" response. I started with significant compiler contributions in 2017 and have spent the last 4 years doing Rust full-time. I do understand that iterators are in some sense better and sometimes faster, I do understand that most experienced Rust people use them, but I could also make an argument for why I don't always do that rooted primarily in interacting with polyglot teams. In this case though, the goal is illustrating a problem I see with the RFC in a straightforward manner, not coming up with something idiomatic or arguing about it. In fact "learn to use iterator chains, and it won't warn" is exactly the opposite of what I would want to do to anyone--that's actively hostile if it's the solution, whether that hostility is intended or not, and in addition isn't something that Rustc could advise people about.

kennytm · 2025-03-13T13:37:11Z

@ahicks92 I didn't use the word "idiomatic" in my response ("standard solution" refers to suppressing the unused_variables lint). I'm just pointing out that if this is written in terms of iterators instead of a for loop, we see that rustc provides one escape hatch, which this RFC may copy. I'm sorry if that was confusing.

In more complex situations one should #[expect] the lint to clearly convey why it's OK that v is no longer used.

let mut v = vec![];
v.push(something);
if let Some(front) = v.first() {
    // stuff.
}
#[expect(unused_must_use, reason="v takes ownership of another_thing which is a lock, ensuring it won't be released until v is dropped at the end of function")]
v.push(another_thing);

joshtriplett · 2025-05-28T18:30:13Z

We discussed this in today's @rust-lang/lang meeting.

We agreed that #[must_use] is less ambiguous than I previously thought, for a few different reasons. One of them is that we already check that parameters are used, and require that they're named with a leading `_` if they aren't.

Given that, we'd like to see this updated to use #[must_use] after all. Thank you!

traviscross · 2025-05-28T19:41:33Z

@rustbot labels -I-lang-nominated -P-lang-drag-3 +I-lang-radar
@rustbot author

rustbot · 2025-05-28T19:41:36Z

Reminder, once the PR becomes ready for a review, use @rustbot ready.

Kixunil added 2 commits February 16, 2025 13:04

Add initial version of must-use-output

d51e8a8

This adds an RFC proposing a new attribute that would help detect mistakes when a parameter of a function is not used after the function returns. E.g. when inserting into a collection that is then immediately dropped.

Update the must-use-output RFC with PR number

0b18cb1

The PR number is 3773.