On Mon, Sep 19, 2022 at 4:50 PM Alex Gaynor <[email protected]> wrote:
>
> Rust's rules are that a function that's safe must not exhibit UB, no
> matter what arguments they're called with. This can be done with
> static checking or dynamic checking, with obvious trade offs between
> the two.
I think you are missing just how many things are "unsafe" in certain
contexts and *cannot* be validated.
This is not some kind of "a few special things".
This is things like absolutely _anything_ that allocates memory, or
takes a lock, or does a number of other things.
Those things are simply not "safe" if you hold a spinlock, or if you
are in a RCU read-locked region.
And there is literally no way to check for it in certain configurations. None.
So are you going to mark every single function that takes a mutex as
being "unsafe"?
Or are you just going to accept and understand that "hey, exactly like
with integer overflows, sometimes it will be checked, and sometimes it
just won't be".
Because that is literally the reality of the kernel. Sometimes you
WILL NOT have the checks, and you literally CANNOT have the checks.
This is just how reality is. You don't get to choose the universe you live in.
Linus
On Mon, Sep 19, 2022 at 4:58 PM Linus Torvalds
<[email protected]> wrote:
>
> This is not some kind of "a few special things".
>
> This is things like absolutely _anything_ that allocates memory, or
> takes a lock, or does a number of other things.
Examples of "number of other things" ends up being things like
"accessing user memory", which depending on what you are doing may be
very common too.
And btw, it's not only about the (multiple kinds of) atomic regions.
We have other context rules in the kernel too, like "does floating
point or vector unit calculations". Which you can actually do, but
only in a kernel_fpu_begin/kernel_fpu_end region.
Now, the floating point thing is rare enough that it's probably fine
to just say "no floating point at all in Rust code". It tends to be
very special code, so you'd write it in C or inline assembly, because
you're doing special things like using the vector unit for crypto
hashes using special CPU instructions.
Linus
On Mon, Sep 19, 2022 at 04:58:06PM -0700, Linus Torvalds wrote:
> On Mon, Sep 19, 2022 at 4:50 PM Alex Gaynor <[email protected]> wrote:
> >
> > Rust's rules are that a function that's safe must not exhibit UB, no
> > matter what arguments they're called with. This can be done with
> > static checking or dynamic checking, with obvious trade offs between
> > the two.
>
> I think you are missing just how many things are "unsafe" in certain
> contexts and *cannot* be validated.
>
> This is not some kind of "a few special things".
>
> This is things like absolutely _anything_ that allocates memory, or
> takes a lock, or does a number of other things.
>
> Those things are simply not "safe" if you hold a spinlock, or if you
> are in a RCU read-locked region.
>
> And there is literally no way to check for it in certain configurations. None.
>
> So are you going to mark every single function that takes a mutex as
> being "unsafe"?
>
In this early experimental stage, if something is unsafe per Rust safety
requirements, maybe we should mark it as "unsafe"? Not because Rust
safety needs trump kernel needs, but because we need to showcase to the
Rust language people the things that are not working very well in Rust
today.
I definitely agree that we CANNOT change kernel behaviors solely to
fulfil Rust safety requirements; we (Rust-for-Linux people) should
either find a way to check at compile time or just mark it as "unsafe".
Maybe I'm naive ;-) But keeping Rust safety requirements as they are
helps communication with the people on the other side (Rust
language/compiler): "Hey, I did everything per your safety requirements,
and it ends up like this, I'm not happy about it, could you figure out
something helpful? After all, Rust is a *systems programming* language,
it should be able to handle things like these".
Or do we want to say "the kernel is special, so please give me an option
so that I don't need to worry about these UBs and can deal with my real
problems"?
I don't have first-hand experience, but it seems this is what we have
been doing with C for many years. Do we want to try a new strategy? ;-)
But perhaps it's not new; maybe it's been done a few times already but
didn't end well...
Anyway, if I really want to tell the Rust language/compiler people "I
know what I'm doing; the problem is that the language is not ready",
what should I do?
Regards,
Boqun
> Or are you just going to accept and understand that "hey, exactly like
> with integer overflows, sometimes it will be checked, and sometimes it
> just won't be".
>
> Because that is literally the reality of the kernel. Sometimes you
> WILL NOT have the checks, and you literally CANNOT have the checks.
>
> This is just how reality is. You don't get to choose the universe you live in.
>
> Linus
Linus Torvalds <[email protected]> writes:
> On Mon, Sep 19, 2022 at 4:58 PM Linus Torvalds
> <[email protected]> wrote:
>>
>> This is not some kind of "a few special things".
>>
>> This is things like absolutely _anything_ that allocates memory, or
>> takes a lock, or does a number of other things.
>
> Examples of "number of other things" ends up being things like
> "accessing user memory", which depending on what you are doing may be
> very common too.
>
> And btw, it's not only about the (multiple kinds of) atomic regions.
>
> We have other context rules in the kernel too, like "does floating
> point or vector unit calculations". Which you can actually do, but
> only in a kernel_fpu_begin/kernel_fpu_end region.
>
> Now, the floating point thing is rare enough that it's probably fine
> to just say "no floating point at all in Rust code". It tends to be
> very special code, so you'd write it in C or inline assembly, because
> you're doing special things like using the vector unit for crypto
> hashes using special CPU instructions.
I just want to point out that there are ways of representing things like
the context you are running in during compile time. I won't argue they
are necessarily practical, but there are ways and by exploring those
ways some practical solutions may result.
Instead of saying:
	spin_lock(&lock);
	do_something();
	spin_unlock(&lock);
It is possible to say:
	with_spin_lock(&lock, do_something());
This can be taken a step further, and with_spin_lock can pass a ``token'',
say a structure with no members that disappears at compile time, that
lets the code know it has the spinlock held.
In C I would do:
	struct have_spin_lock_x {
		// nothing
	};

	void do_something(struct have_spin_lock_x context_guarantee)
	{
		...;
	}
I think most of the special contexts in the kernel can be represented in
a similar manner. A special parameter that can be passed and will
compile out.
I don't recall seeing anything like that tried in the kernel so I don't
know if it makes sense or if it would get too wordy to live, but it is
possible. If passing a free context parameter is not too wordy it would
catch silly errors, and would hopefully leave more mental space for
developers to pay attention to the other details of the problems they
are solving.
*Shrug* I don't know if Rust allows for free parameters like that, and
if it does, I don't know if it would be cheap enough to live.
Eric
On Tue, 20 Sep 2022 10:55:27 -0500
"Eric W. Biederman" <[email protected]> wrote:
> Linus Torvalds <[email protected]> writes:
>
> > On Mon, Sep 19, 2022 at 4:58 PM Linus Torvalds
> > <[email protected]> wrote:
> >>
> >> This is not some kind of "a few special things".
> >>
> >> This is things like absolutely _anything_ that allocates memory, or
> >> takes a lock, or does a number of other things.
> >
> > Examples of "number of other things" end up being things like
> > "accessing user memory", which, depending on what you are doing, may
> > be very common too.
> >
> > And btw, it's not only about the (multiple kinds of) atomic regions.
> >
> > We have other context rules in the kernel too, like "does floating
> > point or vector unit calculations". Which you can actually do, but
> > only in a kernel_fpu_begin/kernel_fpu_end region.
> >
> > Now, the floating point thing is rare enough that it's probably fine
> > to just say "no floating point at all in Rust code". It tends to be
> > very special code, so you'd write it in C or inline assembly,
> > because you're doing special things like using the vector unit for
> > crypto hashes using special CPU instructions.
>
> I just want to point out that there are ways of representing things
> like the context you are running in during compile time. I won't
> argue they are necessarily practical, but there are ways and by
> exploring those ways some practical solutions may result.
>
> Instead of saying:
> spin_lock(&lock);
> do_something();
> spin_unlock(&lock);
>
> It is possible to say:
> with_spin_lock(&lock, do_something());
>
> This can be taken a step further, and with_spin_lock can pass a
> ``token'', say a structure with no members that disappears at compile
> time, that lets the code know it has the spinlock held.
>
> In C I would do:
> 	struct have_spin_lock_x {
> 		// nothing
> 	};
>
> 	void do_something(struct have_spin_lock_x context_guarantee)
> 	{
> 		...;
> 	}
>
> I think most of the special contexts in the kernel can be represented
> in a similar manner. A special parameter that can be passed and will
> compile out.
>
> I don't recall seeing anything like that tried in the kernel so I
> don't know if it makes sense or if it would get too wordy to live,
> but it is possible. If passing a free context parameter is not too
> wordy it would catch silly errors, and would hopefully leave more
> mental space for developers to pay attention to the other details of
> the problems they are solving.
>
> *Shrug* I don't know if Rust allows for free parameters like that, and
> if it does, I don't know if it would be cheap enough to live.
I believe this was mentioned by Wedson in one of his previous emails.
This pattern is quite common in Rust code. It looks like this:
use core::marker::PhantomData;

#[derive(Clone, Copy)]
pub struct Token<'a>(PhantomData<&'a ()>);

pub fn with_token<T>(f: impl for<'a> FnOnce(Token<'a>) -> T) -> T {
    f(Token(PhantomData))
}
Any function that requires something can just take the token by value,
e.g. with `token: Token<'_>`. Since Token is a zero-sized type (ZST),
this parameter will be omitted during code generation, so it won't
affect the ABI and has no runtime cost.
Example on godbolt: https://godbolt.org/z/9n954cG4d, showing that the
token is actually all optimised out.
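For illustration, a caller might look something like this (hypothetical
names, just to show the shape, building on the definitions above):

// Can only be called by code that already holds a Token.
fn needs_token(_token: Token<'_>, x: u32) -> u32 {
    x + 1
}

fn caller() -> u32 {
    // with_token hands the closure a fresh Token that cannot escape it.
    with_token(|token| needs_token(token, 41))
}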
It should be noted, however, that atomic context is not something that a
token can represent. You can only use tokens to restrict what you *can*
do, not what you *can't* do. There is no negative reasoning with
tokens; you can't create a function that can only be called when you
don't have a token.
You can use tokens to represent non-atomic contexts, but that'll be
really painful because it requires carrying a token through almost all
functions. This kind of API does work well for FPU contexts, though.
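For example, a scoped FPU region could hand out such a token. A rough
sketch only (all names here are made up; the extern declarations stand
in for the real kernel_fpu_begin/kernel_fpu_end helpers, and a real
version would want a drop guard so the region ends even on panic):

use core::marker::PhantomData;

extern "C" {
    fn kernel_fpu_begin();
    fn kernel_fpu_end();
}

pub struct FpuToken<'a>(PhantomData<&'a ()>);

pub fn with_kernel_fpu<T>(f: impl for<'a> FnOnce(FpuToken<'a>) -> T) -> T {
    // SAFETY: begin/end are paired, and the token cannot escape the closure.
    unsafe { kernel_fpu_begin() };
    let ret = f(FpuToken(PhantomData));
    unsafe { kernel_fpu_end() };
    ret
}

// Anything doing FP/vector work would then take the token by value:
fn uses_simd(_fpu: FpuToken<'_>) {
    // vector code goes here
}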
Best,
Gary
> On Sep 20, 2022, at 6:39 PM, Gary Guo <[email protected]> wrote:
>
> It should be noted, however, that atomic context is not something that a
> token can represent. You can only use tokens to restrict what you *can*
> do, not what you *can't* do. There is no negative reasoning with
> tokens; you can't create a function that can only be called when you
> don't have a token.
On the other hand, it ought to be feasible to implement that kind of 'negative reasoning' as a custom lint. It might not work as well as something built into the language, but it should work decently well, and could serve as a prototype for a future built-in feature.
On Wed, Sep 21, 2022 at 02:42:18AM -0400, comex wrote:
>
>
> > On Sep 20, 2022, at 6:39 PM, Gary Guo <[email protected]> wrote:
> >
> > It should be noted, however, that atomic context is not something that a
> > token can represent. You can only use tokens to restrict what you *can*
> > do, not what you *can't* do. There is no negative reasoning with
> > tokens; you can't create a function that can only be called when you
> > don't have a token.
>
> On the other hand, it ought to be feasible to implement that kind of
> 'negative reasoning' as a custom lint. It might not work as well as
> something built into the language, but it should work decently well,
> and could serve as a prototype for a future built-in feature.
Interesting, do you have an example somewhere?
Regards,
Boqun
>> On the other hand, it ought to be feasible to implement that kind of
>> 'negative reasoning' as a custom lint. It might not work as well as
>> something built into the language, but it should work decently well,
>> and could serve as a prototype for a future built-in feature.
>
> Interesting, do you have an example somewhere?
>
> Regards,
> Boqun
After some searching, I found this, which someone wrote several years ago for a
very similar purpose:
https://github.com/thepowersgang/tag_safe/
> This is a linter designed originally for use with a kernel, where functions
> need to be marked as "IRQ safe" (meaning they are safe to call within an IRQ
> handler, and handle the case where they may interrupt themselves).
> If a function is annotated with #[req_safe(ident)] (where ident can be
> anything, and defines the type of safety) this linter will check that all
> functions called by that function are either annotated with the same
> annotation or #[is_safe(ident)], OR they do not call functions with the
> reverse #[is_unsafe(ident)] annotation.
Note that the code won't work as-is with recent rustc. rustc's API for custom
lints is not stable, and in fact rustc has deprecated linter plugins entirely
[1], though there are alternative approaches to using custom lints [2]. Still,
it's a good example of the approach.
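To make the intended usage concrete, the annotations described in the
README would look roughly like this (illustrative only; I haven't run
this against the current version of the lint):

#[req_safe(irq)]
fn irq_handler() {
    helper(); // ok: helper is tagged #[is_safe(irq)]
}

#[is_safe(irq)]
fn helper() {
    // nothing here that is unsafe in IRQ context
}

#[is_unsafe(irq)]
fn takes_sleeping_lock() {
    // calling this from irq_handler() would be flagged by the lint
}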
One fundamental caveat is that it doesn't seem to have the sophistication
needed to be sound with respect to indirect calls.
For example, suppose you have a function that fetches a callback from some
structure and calls it. Whether this function is IRQ-safe depends on whether
the callback is expected to be IRQ-safe, so in order to safety-check this, you
would need an annotation on either the callback field or the function pointer
type. This is more complex than just putting annotations on function
definitions.
Or suppose you have the following code:
    fn foo() {
        bar(|| do_something_not_irq_safe());
    }
If `foo` is expected to be IRQ-safe, this may or may not be sound, depending on
whether `bar` calls the callback immediately or saves it for later. If `bar`
saves it for later, then it could be marked unconditionally IRQ-safe. But if
`bar` calls it immediately, then it's neither IRQ-safe nor IRQ-unsafe, but
effectively generic over IRQ safety. You could pessimistically mark it
IRQ-unsafe, but Rust has tons of basic helper methods that accept callbacks and
call them immediately; not being able to use any of them in an IRQ-safe context
would be quite limiting.
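To make that concrete (hypothetical sketch, reusing the placeholder from
the example above):

fn do_something_not_irq_safe() { /* e.g. takes a sleeping lock */ }

// Whether this is IRQ-safe depends entirely on the closure, so a helper
// like Option::map is effectively generic over IRQ safety.
fn lookup(v: Option<u32>) -> Option<u32> {
    v.map(|x| {
        do_something_not_irq_safe();
        x + 1
    })
}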
In short, a fully sound approach requires not just checking which functions
call which, but having some kind of integration with the type system. This is
the kind of issue that I was thinking of when I said a custom lint may not work
as well as something built into the language.
However, I do think it's *possible* to handle it soundly from a lint,
especially if it focuses on typical use cases and relies on manual annotations
for the rest. Alternately, even an unsound lint would be a good first step.
It wouldn't really comport with Rust's ethos of making safety guarantees
ironclad rather than heuristic, but it would serve as a good proof of concept
for a future language feature, while likely being helpful in practice in the
short term.
[1] https://github.com/rust-lang/rust/pull/64675/files
[2] https://www.trailofbits.com/post/write-rust-lints-without-forking-clippy
I don't know if there was ever a direct answer to this question; I was
hoping someone with more involvement in the project would jump in.
> In this early experimental stage, if something is unsafe per Rust safety
> requirements, maybe we should mark it as "unsafe"? Not because Rust
> safety needs trump kernel needs, but because we need to showcase to the
> Rust language people the things that are not working very well in Rust
> today.
>
> I definitely agree that we CANNOT change kernel behaviors solely to
> fulfil Rust safety requirements; we (Rust-for-Linux people) should
> either find a way to check at compile time or just mark it as "unsafe".
From the perspective of a Linux outsider, marking huge swaths of Rust
code "unsafe" doesn't actually sound incorrect to me. If a function
may exhibit undefined behavior, it's unsafe; thus, it should be marked
as such. True, you lose some of Rust's guarantees within the code
marked "unsafe", but this is what permits Rust to enforce those
guarantees elsewhere. Having lots of uses of "unsafe" in kernel code
is probably to be expected, and (in my opinion) not actually a sign
that things "are not working very well in Rust." It's not even
particularly verbose, if the callers of unsafe functions are generally
also marked "unsafe"; in these cases, only the signatures are
affected.
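For what it's worth, the kind of change being described might look
something like this (a userspace-flavoured sketch with made-up names,
not actual kernel bindings):

/// # Safety
///
/// The caller must not hold a spinlock and must not be inside an RCU
/// read-side critical section, because this function may sleep.
pub unsafe fn might_sleep_helper(len: usize) -> Vec<u8> {
    // The body is ordinary safe Rust; the `unsafe` in the signature
    // only records the context requirement stated above.
    vec![0u8; len]
}

fn process_context_caller() -> Vec<u8> {
    // SAFETY: we are in process context with no spinlocks held and
    // outside any RCU read-side critical section.
    unsafe { might_sleep_helper(16) }
}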
The last sentence of Boqun's quote above captures the intent of
"unsafe" perfectly: it simply indicates that code cannot be proven at
compile time not to exhibit undefined behavior. It's important for
systems languages to enable such code to be written, which is
precisely why Rust has the "unsafe" keyword.
On Mon, Sep 19, 2022 at 6:45 PM Boqun Feng <[email protected]> wrote:
>
> On Mon, Sep 19, 2022 at 04:58:06PM -0700, Linus Torvalds wrote:
> > On Mon, Sep 19, 2022 at 4:50 PM Alex Gaynor <[email protected]> wrote:
> > >
> > > Rust's rules are that a function that's safe must not exhibit UB, no
> > > matter what arguments they're called with. This can be done with
> > > static checking or dynamic checking, with obvious trade offs between
> > > the two.
> >
> > I think you are missing just how many things are "unsafe" in certain
> > contexts and *cannot* be validated.
> >
> > This is not some kind of "a few special things".
> >
> > This is things like absolutely _anything_ that allocates memory, or
> > takes a lock, or does a number of other things.
> >
> > Those things are simply not "safe" if you hold a spinlock, or if you
> > are in a RCU read-locked region.
> >
> > And there is literally no way to check for it in certain configurations. None.
> >
> > So are you going to mark every single function that takes a mutex as
> > being "unsafe"?
> >
>
> In this early experimental stage, if something is unsafe per Rust safety
> requirements, maybe we should mark it as "unsafe"? Not because Rust
> safety needs trump kernel needs, but because we need to showcase to the
> Rust language people the things that are not working very well in Rust
> today.
>
> I definitely agree that we CANNOT change kernel behaviors solely to
> fulfil Rust safety requirements; we (Rust-for-Linux people) should
> either find a way to check at compile time or just mark it as "unsafe".
>
> Maybe I'm naive ;-) But keeping Rust safety requirements as they are
> helps communication with the people on the other side (Rust
> language/compiler): "Hey, I did everything per your safety requirements,
> and it ends up like this, I'm not happy about it, could you figure out
> something helpful? After all, Rust is a *systems programming* language,
> it should be able to handle things like these".
>
> Or do we want to say "the kernel is special, so please give me an option
> so that I don't need to worry about these UBs and can deal with my real
> problems"?
> I don't have first-hand experience, but it seems this is what we have
> been doing with C for many years. Do we want to try a new strategy? ;-)
> But perhaps it's not new; maybe it's been done a few times already but
> didn't end well...
>
> Anyway, if I really want to tell the Rust language/compiler people "I
> know what I'm doing; the problem is that the language is not ready",
> what should I do?
>
> Regards,
> Boqun
>
> > Or are you just going to accept and understand that "hey, exactly like
> > with integer overflows, sometimes it will be checked, and sometimes it
> > just won't be".
> >
> > Because that is literally the reality of the kernel. Sometimes you
> > WILL NOT have the checks, and you literally CANNOT have the checks.
> >
> > This is just how reality is. You don't get to choose the universe you live in.
> >
> > Linus