LinuxLists.cc - Re: [PATCH v4 7/7] rust: workqueue: add examples

2023-10-03 20:13:40

Subject: Re: [PATCH v4 7/7] rust: workqueue: add examples

+//! #[pin_data]
+//! struct MyStruct {
+//! value: i32,
+//! #[pin]
+//! work: Work<MyStruct>,
+//! }
+//!
+//! impl_has_work! {
+//! impl HasWork<Self> for MyStruct { self.work }
+//! }
+//!
+//! impl MyStruct {
+//! fn new(value: i32) -> Result<Arc<Self>> {
+//! Arc::pin_init(pin_init!(MyStruct {
+//! value,
+//! work <- new_work!("MyStruct::work"),
+//! }))
+//! }
+//! }
+//!
+//! impl WorkItem for MyStruct {
+//! type Pointer = Arc<MyStruct>;
+//!
+//! fn run(this: Arc<MyStruct>) {
+//! pr_info!("The value is: {}", this.value);
+//! }
+//! }
+//!
+//! /// This method will enqueue the struct for execution on the system workqueue, where its value
+//! /// will be printed.
+//! fn print_later(val: Arc<MyStruct>) {
+//! let _ = workqueue::system().enqueue(val);
+//! }

I understand that this is highly opionated, but is it possible to make
the initialization less verbose?

Because the C version looks much, much cleaner and easier to grasp:

struct my_struct {
i32 value;
struct work_struct work;
};

void log_value(struct work_struct *work)
{
struct my_struct *s = container_of(work, struct my_struct, work);
pr_info("The value is: %d\n", s->value);
}

void print_later(struct my_struct &s)
{
INIT_WORK(&s->work, log_value);
schedule_work(&s->work);
}

2023-10-03 22:30:01

by Alice Ryhl

[permalink] [raw]

Subject: Re: [PATCH v4 7/7] rust: workqueue: add examples

On Tue, Oct 3, 2023 at 10:13PM Konstantin Shelekhin <[email protected]> wrote:
> +//! #[pin_data]
> +//! struct MyStruct {
> +//! value: i32,
> +//! #[pin]
> +//! work: Work<MyStruct>,
> +//! }
> +//!
> +//! impl_has_work! {
> +//! impl HasWork<Self> for MyStruct { self.work }
> +//! }
> +//!
> +//! impl MyStruct {
> +//! fn new(value: i32) -> Result<Arc<Self>> {
> +//! Arc::pin_init(pin_init!(MyStruct {
> +//! value,
> +//! work <- new_work!("MyStruct::work"),
> +//! }))
> +//! }
> +//! }
> +//!
> +//! impl WorkItem for MyStruct {
> +//! type Pointer = Arc<MyStruct>;
> +//!
> +//! fn run(this: Arc<MyStruct>) {
> +//! pr_info!("The value is: {}", this.value);
> +//! }
> +//! }
> +//!
> +//! /// This method will enqueue the struct for execution on the system workqueue, where its value
> +//! /// will be printed.
> +//! fn print_later(val: Arc<MyStruct>) {
> +//! let _ = workqueue::system().enqueue(val);
> +//! }
>
> I understand that this is highly opionated, but is it possible to make
> the initialization less verbose?

The short answer is yes. There are safe alternatives that are much less
verbose. Unfortunately, those alternatives give up some of the features
that this design has. Specifically, they give up the feature that allows
you to embed the work_struct inside custom structs. I need to be able to
embed the work_struct inside of custom structs, so I did not go that
route.

There are also some parts of this (mainly `impl_has_work!`) that I am
unhappy with. I would be happy to see a solution that doesn't need it,
but I couldn't figure out how to avoid it.

> Because the C version looks much, much cleaner and easier to grasp:
>
> struct my_struct {
> i32 value;
> struct work_struct work;
> };
>
> void log_value(struct work_struct *work)
> {
> struct my_struct *s = container_of(work, struct my_struct, work);
> pr_info("The value is: %d\n", s->value);
> }
>
> void print_later(struct my_struct &s)
> {
> INIT_WORK(&s->work, log_value);
> schedule_work(&s->work);
> }

Although I think that a part of this is just whether you are familiar
with Rust syntax, there is definitely some truth to this. Your code is a
lot closer to the machine code of what actually happens here. Perhaps it
would be interesting to see what you get if you just unsafely do exactly
the same thing in Rust? It would look something like this:

struct MyStruct {
value: i32,
work: bindings::work_struct,
}

unsafe fn log_value(work: *mut bindings::work_struct) {
unsafe {
let s = container_of!(work, MyStruct, work);
pr_info!("The value is: {}", (*s).value);
}
}

unsafe fn print_later(s: *mut bindings::work_struct) {
unsafe {
bindings::INIT_WORK(&mut (*s).work, log_value);
bindings::schedule_work(&mut (*s).work);
}
}

(I didn't try to compile this.)

The problem with this approach is that it uses unsafe in driver code,
but the goal behind Rust abstractions is to isolate all of the related
unsafe code. The idea being that drivers using the workqueue do not need
any unsafe code to use it. This means that, assuming these workqueue
abstractions are correct, no driver can accidentally cause memory
unsafety by using the workqueue wrong.

The main difficult part of making this happen is the container_of
operation. We need to somehow verify *at compile time* that the
container_of in log_value really is given a pointer to the work field of
a MyStruct. Other than the things that are just how Rust looks, most of
the verbosity is necessary to make this compile-time check possible.

Another thing it does is handle proper transfer of ownership. In my
original example, MyStruct is reference counted (due to the use of Arc),
so the workqueue passes ownership of one refcount to the workqueue,
which eventually passes the refcount to run. When `this` goes out of
scope at the end of `run`, the refcount is decremented and the struct is
freed if the refcount dropped to zero.

If you wanted to just have exclusive ownership of my_struct, you could
do that by using Box instead of Arc. In either case, the ownership is
correctly passed to run, and you cannot accidentally forget to free it
at the end of log_value.

So, ultimately there's a tradeoff here. The code corresponds less
directly to what the machine code will be. On the other hand, it will be
*more* difficult to use incorrectly since incorrect uses will usually
result in compilation errors. The claim of Rust is that this tradeoff is
worth it.

Alice

2023-10-04 11:06:31

by Konstantin Shelekhin

[permalink] [raw]

Subject: Re: [PATCH v4 7/7] rust: workqueue: add examples

On Wed Oct 4, 2023 at 1:29 AM MSK, Alice Ryhl wrote:
> On Tue, Oct 3, 2023 at 10:13PM Konstantin Shelekhin <[email protected]> wrote:
> > +//! #[pin_data]
> > +//! struct MyStruct {
> > +//! value: i32,
> > +//! #[pin]
> > +//! work: Work<MyStruct>,
> > +//! }
> > +//!
> > +//! impl_has_work! {
> > +//! impl HasWork<Self> for MyStruct { self.work }
> > +//! }
> > +//!
> > +//! impl MyStruct {
> > +//! fn new(value: i32) -> Result<Arc<Self>> {
> > +//! Arc::pin_init(pin_init!(MyStruct {
> > +//! value,
> > +//! work <- new_work!("MyStruct::work"),
> > +//! }))
> > +//! }
> > +//! }
> > +//!
> > +//! impl WorkItem for MyStruct {
> > +//! type Pointer = Arc<MyStruct>;
> > +//!
> > +//! fn run(this: Arc<MyStruct>) {
> > +//! pr_info!("The value is: {}", this.value);
> > +//! }
> > +//! }
> > +//!
> > +//! /// This method will enqueue the struct for execution on the system workqueue, where its value
> > +//! /// will be printed.
> > +//! fn print_later(val: Arc<MyStruct>) {
> > +//! let _ = workqueue::system().enqueue(val);
> > +//! }
> >
> > I understand that this is highly opionated, but is it possible to make
> > the initialization less verbose?
>
> The short answer is yes. There are safe alternatives that are much less
> verbose. Unfortunately, those alternatives give up some of the features
> that this design has. Specifically, they give up the feature that allows
> you to embed the work_struct inside custom structs. I need to be able to
> embed the work_struct inside of custom structs, so I did not go that
> route.

My concern with the approach of using traits instead of calling an
initialization function is that a some of existing code uses the
following pattern:

static void nvmet_file_submit_buffered_io(struct nvmet_req *req)
{
INIT_WORK(&req->f.work, nvmet_file_buffered_io_work);
queue_work(buffered_io_wq, &req->f.work);
}

static void nvmet_file_execute_flush(struct nvmet_req *req)
{
if (!nvmet_check_transfer_len(req, 0))
return;
INIT_WORK(&req->f.work, nvmet_file_flush_work);
queue_work(nvmet_wq, &req->f.work);
}

static void nvmet_file_execute_dsm(struct nvmet_req *req)
{
if (!nvmet_check_data_len_lte(req, nvmet_dsm_len(req)))
return;
INIT_WORK(&req->f.work, nvmet_file_dsm_work);
queue_work(nvmet_wq, &req->f.work);
}

As you can see a single work struct is used here, and dispatching
happens beforehands. While it's possible to do the dispatching later in
run(), it's IMO cleaner to do this earlier.

> There are also some parts of this (mainly `impl_has_work!`) that I am
> unhappy with. I would be happy to see a solution that doesn't need it,
> but I couldn't figure out how to avoid it.
>
> > Because the C version looks much, much cleaner and easier to grasp:
> >
> > struct my_struct {
> > i32 value;
> > struct work_struct work;
> > };
> >
> > void log_value(struct work_struct *work)
> > {
> > struct my_struct *s = container_of(work, struct my_struct, work);
> > pr_info("The value is: %d\n", s->value);
> > }
> >
> > void print_later(struct my_struct &s)
> > {
> > INIT_WORK(&s->work, log_value);
> > schedule_work(&s->work);
> > }
>
> Although I think that a part of this is just whether you are familiar
> with Rust syntax, there is definitely some truth to this. Your code is a
> lot closer to the machine code of what actually happens here. Perhaps it
> would be interesting to see what you get if you just unsafely do exactly
> the same thing in Rust? It would look something like this:
>
> struct MyStruct {
> value: i32,
> work: bindings::work_struct,
> }
>
> unsafe fn log_value(work: *mut bindings::work_struct) {
> unsafe {
> let s = container_of!(work, MyStruct, work);
> pr_info!("The value is: {}", (*s).value);
> }
> }
>
> unsafe fn print_later(s: *mut bindings::work_struct) {
> unsafe {
> bindings::INIT_WORK(&mut (*s).work, log_value);
> bindings::schedule_work(&mut (*s).work);
> }
> }
>
> (I didn't try to compile this.)
>
> The problem with this approach is that it uses unsafe in driver code,
> but the goal behind Rust abstractions is to isolate all of the related
> unsafe code. The idea being that drivers using the workqueue do not need
> any unsafe code to use it. This means that, assuming these workqueue
> abstractions are correct, no driver can accidentally cause memory
> unsafety by using the workqueue wrong.
>
> The main difficult part of making this happen is the container_of
> operation. We need to somehow verify *at compile time* that the
> container_of in log_value really is given a pointer to the work field of
> a MyStruct. Other than the things that are just how Rust looks, most of
> the verbosity is necessary to make this compile-time check possible.
>
> Another thing it does is handle proper transfer of ownership. In my
> original example, MyStruct is reference counted (due to the use of Arc),
> so the workqueue passes ownership of one refcount to the workqueue,
> which eventually passes the refcount to run. When `this` goes out of
> scope at the end of `run`, the refcount is decremented and the struct is
> freed if the refcount dropped to zero.
>
> If you wanted to just have exclusive ownership of my_struct, you could
> do that by using Box instead of Arc. In either case, the ownership is
> correctly passed to run, and you cannot accidentally forget to free it
> at the end of log_value.
>
> So, ultimately there's a tradeoff here. The code corresponds less
> directly to what the machine code will be. On the other hand, it will be
> *more* difficult to use incorrectly since incorrect uses will usually
> result in compilation errors. The claim of Rust is that this tradeoff is
> worth it.

I get where all this coming from, I just really dislike the idea to
write all this code every time I need to pass something down the
workqueue. Maybe it's possible to hide most of the boilerplate behind a
derive.

Something like this, for example:

#[pin_data, derive(WorkContainer)]
struct MyStruct {
value: i32,
#[pin, work(fn = log_value)]
work: Work,
}

fn log_value(s: Arc<MyStruct>) {
pr_info!("The value is: {}", s.value);
}

fn print_later(s: Arc<MyStruct>) {
workqueue::system().enqueue(s);
}

2023-10-04 14:39:39

by Boqun Feng

[permalink] [raw]

Subject: Re: [PATCH v4 7/7] rust: workqueue: add examples

On Wed, Oct 04, 2023 at 02:06:05PM +0300, Konstantin Shelekhin wrote:
> On Wed Oct 4, 2023 at 1:29 AM MSK, Alice Ryhl wrote:
> > On Tue, Oct 3, 2023 at 10:13PM Konstantin Shelekhin <[email protected]> wrote:
> > > +//! #[pin_data]
> > > +//! struct MyStruct {
> > > +//! value: i32,
> > > +//! #[pin]
> > > +//! work: Work<MyStruct>,
> > > +//! }
> > > +//!
> > > +//! impl_has_work! {
> > > +//! impl HasWork<Self> for MyStruct { self.work }
> > > +//! }
> > > +//!
> > > +//! impl MyStruct {
> > > +//! fn new(value: i32) -> Result<Arc<Self>> {
> > > +//! Arc::pin_init(pin_init!(MyStruct {
> > > +//! value,
> > > +//! work <- new_work!("MyStruct::work"),
> > > +//! }))
> > > +//! }
> > > +//! }
> > > +//!
> > > +//! impl WorkItem for MyStruct {
> > > +//! type Pointer = Arc<MyStruct>;
> > > +//!
> > > +//! fn run(this: Arc<MyStruct>) {
> > > +//! pr_info!("The value is: {}", this.value);
> > > +//! }
> > > +//! }
> > > +//!
> > > +//! /// This method will enqueue the struct for execution on the system workqueue, where its value
> > > +//! /// will be printed.
> > > +//! fn print_later(val: Arc<MyStruct>) {
> > > +//! let _ = workqueue::system().enqueue(val);
> > > +//! }
> > >
> > > I understand that this is highly opionated, but is it possible to make
> > > the initialization less verbose?
> >
> > The short answer is yes. There are safe alternatives that are much less
> > verbose. Unfortunately, those alternatives give up some of the features
> > that this design has. Specifically, they give up the feature that allows
> > you to embed the work_struct inside custom structs. I need to be able to
> > embed the work_struct inside of custom structs, so I did not go that
> > route.
>
> My concern with the approach of using traits instead of calling an
> initialization function is that a some of existing code uses the
> following pattern:
>
> static void nvmet_file_submit_buffered_io(struct nvmet_req *req)
> {
> INIT_WORK(&req->f.work, nvmet_file_buffered_io_work);
> queue_work(buffered_io_wq, &req->f.work);
> }
>
> static void nvmet_file_execute_flush(struct nvmet_req *req)
> {
> if (!nvmet_check_transfer_len(req, 0))
> return;
> INIT_WORK(&req->f.work, nvmet_file_flush_work);
> queue_work(nvmet_wq, &req->f.work);
> }
>
> static void nvmet_file_execute_dsm(struct nvmet_req *req)
> {
> if (!nvmet_check_data_len_lte(req, nvmet_dsm_len(req)))
> return;
> INIT_WORK(&req->f.work, nvmet_file_dsm_work);
> queue_work(nvmet_wq, &req->f.work);
> }
>
> As you can see a single work struct is used here, and dispatching
> happens beforehands. While it's possible to do the dispatching later in
> run(), it's IMO cleaner to do this earlier.

This is not a problem until nvmet actually uses/switches to Rust, right?
;-) We can certainly improve the API when a real user needs something.
Or you know someone is already working on this?

[...]
>
> I get where all this coming from, I just really dislike the idea to
> write all this code every time I need to pass something down the
> workqueue. Maybe it's possible to hide most of the boilerplate behind a
> derive.
>
> Something like this, for example:
>
> #[pin_data, derive(WorkContainer)]
> struct MyStruct {
> value: i32,
> #[pin, work(fn = log_value)]
> work: Work,
> }
>
> fn log_value(s: Arc<MyStruct>) {
> pr_info!("The value is: {}", s.value);
> }
>
> fn print_later(s: Arc<MyStruct>) {
> workqueue::system().enqueue(s);
> }

All of your suggestions make senses to me, but because we don't have
many users right now, it's actually hard to determine a "best" API. I
like what we have right now because it's explicit: people won't need to
learn much about procedure macros to understand how it works, and it
also provides better opportunities for people who's yet not familiar
with Rust to give some reviews. So starting with something relatively
simple and verbose may not be a bad idea ;-)

Again, I like your idea, we need to explore that direction, but one
dragon at a time ;-)

Regards,
Boqun

2023-10-04 14:57:04

by Konstantin Shelekhin

[permalink] [raw]

Subject: Re: [PATCH v4 7/7] rust: workqueue: add examples

> This is not a problem until nvmet actually uses/switches to Rust, right?
> ;-) We can certainly improve the API when a real user needs something.
> Or you know someone is already working on this?

Nope, not at this moment. I have an itch to experiment with Rust and
iSCSI, but that's my personal toy without any plans to at least propose
it to the subsystem maintainers yet.

> All of your suggestions make senses to me, but because we don't have
> many users right now, it's actually hard to determine a "best" API. I
> like what we have right now because it's explicit: people won't need to
> learn much about procedure macros to understand how it works, and it
> also provides better opportunities for people who's yet not familiar
> with Rust to give some reviews. So starting with something relatively
> simple and verbose may not be a bad idea ;-)
>
> Again, I like your idea, we need to explore that direction, but one
> dragon at a time ;-)

Oh yeah, completely understand :)

2023-10-04 15:53:34

by Andreas Hindborg

[permalink] [raw]

Subject: Re: [PATCH v4 7/7] rust: workqueue: add examples

Hi,

"Konstantin Shelekhin" <[email protected]> writes:

>> This is not a problem until nvmet actually uses/switches to Rust, right?
>> ;-) We can certainly improve the API when a real user needs something.
>> Or you know someone is already working on this?
>
> Nope, not at this moment. I have an itch to experiment with Rust and
> iSCSI, but that's my personal toy without any plans to at least propose
> it to the subsystem maintainers yet.

If you are so inclined, I would suggest you take a look at the blk-mq
bindings and the nvme (pci) [1] and null_blk [2] Rust drivers. I am
available if you have any questions!

BR Andreas

[1] https://github.com/metaspace/linux/tree/nvme-next-for-6.6
[2] https://github.com/metaspace/linux/tree/null_blk-next-for-6.6

2023-10-05 14:17:22

by Trevor Gross

[permalink] [raw]

Subject: Re: [PATCH v4 7/7] rust: workqueue: add examples

On Tue, Oct 3, 2023 at 6:30 PM Alice Ryhl <[email protected]> wrote:
> On Tue, Oct 3, 2023 at 10:13PM Konstantin Shelekhin <[email protected]> wrote:
> > +//! #[pin_data]
> > +//! struct MyStruct {
> > +//! value: i32,
> > +//! #[pin]
> > +//! work: Work<MyStruct>,
> > +//! }
> > +//!
> > +//! impl_has_work! {
> > +//! impl HasWork<Self> for MyStruct { self.work }
> > +//! }
> > +//!
> > +//! impl MyStruct {
> > +//! fn new(value: i32) -> Result<Arc<Self>> {
> > +//! Arc::pin_init(pin_init!(MyStruct {
> > +//! value,
> > +//! work <- new_work!("MyStruct::work"),
> > +//! }))
> > +//! }
> > +//! }
> > +//!

For what it's worth, I think that using a binding for return items
usually looks ever so slightly more clear than passing a multiline
argument

fn new(value: i32) -> Result<Arc<Self>> {
let tmp = pin_init!(MyStruct {
value,
work <- new_work!("MyStruct::work")
});
Arc::pin_init(tmp)
}