-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add must-use-output
attribute
#3773
base: master
Are you sure you want to change the base?
Conversation
This adds an RFC proposing a new attribute that would help detect mistakes when a parameter of a function is not used after the function returns. E.g. when inserting into a collection that is then immediately dropped.
The PR number is 3773.
text/0000-must-use-output.md
Outdated
|
||
Note also that some functions may be useless to call even when they take in truly immutable (`Freeze`) references. For instance `clone` is wasteful if the original is not accessed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note also that some functions may be useless to call even when they take in truly immutable (`Freeze`) references. For instance `clone` is wasteful if the original is not accessed. |
This is true, but I don't think it's on-topic for this RFC, and addressing it would need some additional care to avoid the annoying ripple-effect problem of "don't write the clone on the last use". Can we drop it from this RFC?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we mention it but don't propose to mark it just yet? Maybe some library has a use for it in some other context and I don't think adding code to ban Freeze
references is worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Kixunil I'm not suggesting adding code to ban it, just that it's something different and may not want to use the same handling. Could you move it to one of the later notes sections (e.g. future work), then, along with the note that this isn't proposing to mark such values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused. If we allow it right away and there's a clear use case with clone
-like functions, which people could write how is it a "future possibility"? Note that while I wrote clone
in the list below, I can simply remove it from the list and reword this to:
Note also that some functions may be useless to call even when they take in truly immutable (
Freeze
) references. For instanceclone
is wasteful if the original is not accessed, however this RFC is not proposing adding the attribute tocore
'sClone::clone
just yet.
And put "Add the attribute to clone
" in "Future possibilities"
This was overlooked in initial PR.
These address the review comments where the author completely agrees with the reviewer.
I've incorporated all suggestions I completely agree with. |
This addresses all comments raised by Kevin Reid and one by Josh Triplett.
Thanks @kpreid ! I've addressed your feedback. Also @joshtriplett thanks for your review so far, I have also addressed the other rewording. |
- Make it a `clippy` lint instead. However not everyone uses `clippy` and the need to communicate which arguments are considered important would diminish its effect. `#[must_use]` is a native rustc lint, and this should be as well, using the same machinery. | ||
- Try to somehow analyze the code and not require the attribute. This seems hard and it could lead into libraries accidentally triggering warnings in users code if they change the body of the function. | ||
- Implement a more full feature — e.g. by doing some things in "Future possibilities" section. However, this feature should be useful even without them. | ||
- Have the attribute on the function instead listing the names of paramters. This could make it nicer to extend to support the "or" relationship described in "Future possibilities". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Have the attribute on the function instead listing the names of paramters. This could make it nicer to extend to support the "or" relationship described in "Future possibilities". | |
- Have the attribute on the function instead listing the names of parameters. This could make it nicer to extend to support the "or" relationship described in "Future possibilities". |
[rationale-and-alternatives]: #rationale-and-alternatives | ||
|
||
- We could also simply not do this but the potential mistakes it catches are real. | ||
- The name could be something different and preferably shorter. The name used here was suggested by Josh Triplett and is pretty readable and it can be used before stabilization in the absence of better one. A pretty short and clear `#[must_read]` was also suggested by Kevin Reid. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the record, while I mildly prefer must_use_output
, I would check a box for must_read
. I do think it somewhat has the problem that must_use
would (it could be read as applying to the body of the function rather than to the caller), but it does seem slightly better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe #[caller_must_use]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kennytm Sadly still pretty long but I like it better than must_use_output
. It at least doesn't pose "what do you mean by the output?" question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Kixunil Agreed. must_use
and must_read
don't answer "who". must_use_output
specifies that indirectly: "the one that gets this as an output must use it". But caller_must_use
specifies it directly. That seems pretty clear to me.
While I like the idea of this feature, I do wonder how many custom attributes we're going to have to add to the language before we just decide to reintroduce the idea of labeling functions as pure. |
@clarfonthey note that if we had
An attribute that disallows side effects except for allocation and mutation of parameters might be interesting but we would need to invent a name for it. :) |
I personally think that it's okay to consider a function pure if its mutations are explicitly defined; the issue with purity is when they're not defined, i.e. external state. There isn't much of a distinction between taking and returning a value versus taking a mutable reference, other than calling convention. The idea of purity really boils down to whether a function call can be removed if its output is not used, which is what's relevant here. Sure, maybe calling it pure is a bad idea, and maybe Note also that under this model, stuff like The other benefit of this is that, with an actual language feature for this, we can do complete dead code analysis, whereas at the moment, the best we can do is emit one lint at the final call site if the code path ends. |
|
||
- We could also simply not do this but the potential mistakes it catches are real. | ||
- The name could be something different and preferably shorter. The name used here was suggested by Josh Triplett and is pretty readable and it can be used before stabilization in the absence of better one. A pretty short and clear `#[must_read]` was also suggested by Kevin Reid. | ||
- We could write `#[must_use]` on a reference parameter instead. The downside would be that this could be mistaken for saying that the *callee* must use the parameter, rather than the *caller*. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem like much of a problem to me. We already lint on function parameters that are unused within the callee1, so if I saw #[must_use]
on a &mut
parameter, I'd expect this would refer to an obligation on the caller, especially as that's the existing connotation of #[must_use]
.
Barring better proposals (and hopefully more concise ones than #[must_use_output]
), my own preference is to just use #[must_use]
for this.
Footnotes
-
Except for
self
, unfortunately, due to not having a good way to suppress it, but realistically we would never use#[must_use] &self
in this way, to require something of the callee. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I also feel like it wouldn't be that big deal and people would figure it out quickly. But at the same time I'm not bothered by the name too much and I respect whatever decision will be made by the appropriate team.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also prefer #[must_use]
. When applied to a function, it also means "the output at the call site must be used", and people don't have a problem understanding it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the attribute could be named differently, something along the lines of the function having side effect that needs to be observed #[must_use(side_effect)]
or #[is_a_setter]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a couple of parallel discussions occurring about the name, one here and one at #3773 (comment) .
From that discussion: caller_must_use
seems like it removes the ambiguity.
@clarfonthey I agree pretty much completely and I think excluding mutation through pointer arguments was a bad idea when someone coined the term |
|
||
Have a way to explain to the compiler how smart pointers work so that this can be used on `Pin` as well. | ||
|
||
Have an attribute that can express "or" relationship between parameters and return types. For instance, `Vec::pop` is sometimes used to remove the last element and then pass the vec to something else (e.g. trim `\n` at the end) and sometimes it's used to get the last value and the vec is no longer relevant. Someting like `#[must_use_at_least_one(self, return)]` on function could handle this case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should just be the default behaviour when a function has multiple outputs (e.g. an output parameter as well as a return type) that should be used. You have to use at least one, not all of them.
P.S. This is another reason why I prefer the name #[must_use]
instead of #[must_use_output]
, because it is conceptually the same thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should just be the default behaviour when a function has multiple outputs (e.g. an output parameter as well as a return type) that should be used. You have to use at least one, not all of them.
Then we would need a way to override it in the other direction, because there are absolutely functions for which you need to use all the output parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, to give a specific example: core::mem::swap
.
But also for write
and read
you should use the returned value specifically. Just using the writer is not enough.
Related: fn foo() -> #[must_use] Output; This feels more consistent with This would also address the problem that you can't mark part of a return type as #[must_use]
fn read() -> Result<usize> {...}
read()?; // does not trigger the lint What we want is this: fn read() -> Result<#[must_use] usize> {...} |
@Aloso in principle yes but there isn't much we can do now about it apart from having it as an alternative way to specify the atribute. |
I'm not sure where specifically in the RFC to comment but the RFC and this thread both don't seem to be mentioning it. As presented this will warn on data structures which impl Drop (it has to, otherwise the vec example at the top cannot work), but it is entirely possible to use a data structure to hold RAII guards (as one example). That is:
As far as I can tell that is roughly equivalent to the first example but we actually do for real rely on the drop only. If you're wondering why you'd do something like this, it can come up in concurrent data structures when it is safe for readers to access the other fields. As another possible write-only do-not-consume case, consider The trouble is that in order to do this generically--at least as far as I see--the rule has to become more complex. Something like "warn if the container's type parameters do not impl Drop" or something. Otherwise, you get false positives in these cases. Consider a crate like smallvec, which is what I would really use for cases like these; it has to extend to that. There's the obvious rule "if any type parameter impls Drop then ignore it" but that immediately fails out on I kind of agree with @clarfonthey in that this is sort of poor man's pure. As written you can basically only use it on pure functions. If it can't handle the Drop case then even putting it in std doesn't work unless the function is pure. Of course I imagine there'd be some way to turn this off if it's a false positive. Maybe people feel that we should make people do that. I have weak opinions on this whole thing in general: I see no harm in it but have never personally run into a case where it matters. |
@ahicks92 in typical Rust that example will be written not as a for loop but a map+collect: let guards = structures.iter()
.map(|s| s.lock.lock().unwrap())
.collect::<Vec<_>>(); which currently rustc will warn let _guards = structures.iter()
.map(|s| s.lock.lock().unwrap())
.collect::<Vec<_>>(); I think the situation should be the same here. If we require to use let mut _guards = vec![];
for thing in structures{
_guards.push(thing.lock.lock().unwrap());
// suppress warning since `Vec::push` is called on `&mut _guards`?
} |
Where does the
I also feel the need to push back on the "not idiomatic" response. I started with significant compiler contributions in 2017 and have spent the last 4 years doing Rust full-time. I do understand that iterators are in some sense better and sometimes faster, I do understand that most experienced Rust people use them, but I could also make an argument for why I don't always do that rooted primarily in interacting with polyglot teams. In this case though, the goal is illustrating a problem I see with the RFC in a straightforward manner, not coming up with something idiomatic or arguing about it. In fact "learn to use iterator chains, and it won't warn" is exactly the opposite of what I would want to do to anyone--that's actively hostile if it's the solution, whether that hostility is intended or not, and in addition isn't something that Rustc could advise people about. |
@ahicks92 I haven't used the word "idiomatic" in my response ("standard solution" is talking about suppressing the In more complex situation one should let mut v = vec![];
v.push(something);
if let Some(front) = v.front() {
// stuff.
}
#[expect(unused_must_use, reason="v takes ownership of another_thing which is a lock, ensuring it won't be released until v is dropped at the end of function")]
v.push(another_thing); |
This adds an RFC proposing a new attribute that would help detect mistakes when a parameter of a function is not used after the function returns. E.g. when inserting into a collection that is then immediately dropped.
Cc @joshtriplett
Rendered