2024-05-07 13:16:50

by Bartosz Golaszewski

[permalink] [raw]
Subject: [PATCH] gpiolib: fix the speed of descriptor label setting with SRCU

From: Bartosz Golaszewski <[email protected]>

Commit 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
caused a massive drop in performance of requesting GPIO lines due to the
call to synchronize_srcu() on each label change. Rework the code to not
wait until all read-only users are done with reading the label but
instead atomically replace the label pointer and schedule its release
after all read-only critical sections are done.

To that end wrap the descriptor label in a struct that also contains the
rcu_head struct required for deferring tasks using call_srcu() and stop
using kstrdup_const() as we're required to allocate memory anyway. Just
allocate enough for the label string and rcu_head in one go.

Reported-by: Neil Armstrong <[email protected]>
Closes: https://lore.kernel.org/linux-gpio/CAMRc=Mfig2oooDQYTqo23W3PXSdzhVO4p=G4+P8y1ppBOrkrJQ@mail.gmail.com/
Fixes: 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
Suggested-by: Paul E. McKenney <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>
---
drivers/gpio/gpiolib.c | 31 ++++++++++++++++++++++++-------
drivers/gpio/gpiolib.h | 7 ++++++-
2 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 94903fc1c145..2fa3756c9073 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -101,6 +101,7 @@ static bool gpiolib_initialized;

const char *gpiod_get_label(struct gpio_desc *desc)
{
+ struct gpio_desc_label *label;
unsigned long flags;

flags = READ_ONCE(desc->flags);
@@ -108,23 +109,36 @@ const char *gpiod_get_label(struct gpio_desc *desc)
!test_bit(FLAG_REQUESTED, &flags))
return "interrupt";

- return test_bit(FLAG_REQUESTED, &flags) ?
- srcu_dereference(desc->label, &desc->srcu) : NULL;
+ if (!test_bit(FLAG_REQUESTED, &flags))
+ return NULL;
+
+ label = srcu_dereference_check(desc->label, &desc->srcu,
+ srcu_read_lock_held(&desc->srcu));
+
+ return label->str;
+}
+
+static void desc_free_label(struct rcu_head *rh)
+{
+ kfree(container_of(rh, struct gpio_desc_label, rh));
}

static int desc_set_label(struct gpio_desc *desc, const char *label)
{
- const char *new = NULL, *old;
+ struct gpio_desc_label *new = NULL, *old;

if (label) {
- new = kstrdup_const(label, GFP_KERNEL);
+ new = kzalloc(struct_size(new, str, strlen(label) + 1),
+ GFP_KERNEL);
if (!new)
return -ENOMEM;
+
+ strcpy(new->str, label);
}

old = rcu_replace_pointer(desc->label, new, 1);
- synchronize_srcu(&desc->srcu);
- kfree_const(old);
+ if (old)
+ call_srcu(&desc->srcu, &old->rh, desc_free_label);

return 0;
}
@@ -697,8 +711,11 @@ static void gpiodev_release(struct device *dev)
struct gpio_device *gdev = to_gpio_device(dev);
unsigned int i;

- for (i = 0; i < gdev->ngpio; i++)
+ for (i = 0; i < gdev->ngpio; i++) {
+ /* Free pending label. */
+ synchronize_srcu(&gdev->descs[i].srcu);
cleanup_srcu_struct(&gdev->descs[i].srcu);
+ }

ida_free(&gpio_ida, gdev->id);
kfree_const(gdev->label);
diff --git a/drivers/gpio/gpiolib.h b/drivers/gpio/gpiolib.h
index f67d5991ab1c..69a353c789f0 100644
--- a/drivers/gpio/gpiolib.h
+++ b/drivers/gpio/gpiolib.h
@@ -137,6 +137,11 @@ int gpiod_set_transitory(struct gpio_desc *desc, bool transitory);

void gpiod_line_state_notify(struct gpio_desc *desc, unsigned long action);

+struct gpio_desc_label {
+ struct rcu_head rh;
+ char str[];
+};
+
/**
* struct gpio_desc - Opaque descriptor for a GPIO
*
@@ -177,7 +182,7 @@ struct gpio_desc {
#define FLAG_EVENT_CLOCK_HTE 19 /* GPIO CDEV reports hardware timestamps in events */

/* Connection label */
- const char __rcu *label;
+ struct gpio_desc_label __rcu *label;
/* Name of the GPIO */
const char *name;
#ifdef CONFIG_OF_DYNAMIC
--
2.40.1



2024-05-07 14:23:28

by Neil Armstrong

[permalink] [raw]
Subject: Re: [PATCH] gpiolib: fix the speed of descriptor label setting with SRCU

On 07/05/2024 14:13, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <[email protected]>
>
> Commit 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
> caused a massive drop in performance of requesting GPIO lines due to the
> call to synchronize_srcu() on each label change. Rework the code to not
> wait until all read-only users are done with reading the label but
> instead atomically replace the label pointer and schedule its release
> after all read-only critical sections are done.
>
> To that end wrap the descriptor label in a struct that also contains the
> rcu_head struct required for deferring tasks using call_srcu() and stop
> using kstrdup_const() as we're required to allocate memory anyway. Just
> allocate enough for the label string and rcu_head in one go.
>
> Reported-by: Neil Armstrong <[email protected]>
> Closes: https://lore.kernel.org/linux-gpio/CAMRc=Mfig2oooDQYTqo23W3PXSdzhVO4p=G4+P8y1ppBOrkrJQ@mail.gmail.com/
> Fixes: 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
> Suggested-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Bartosz Golaszewski <[email protected]>
> ---
> drivers/gpio/gpiolib.c | 31 ++++++++++++++++++++++++-------
> drivers/gpio/gpiolib.h | 7 ++++++-
> 2 files changed, 30 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> index 94903fc1c145..2fa3756c9073 100644
> --- a/drivers/gpio/gpiolib.c
> +++ b/drivers/gpio/gpiolib.c
> @@ -101,6 +101,7 @@ static bool gpiolib_initialized;
>
> const char *gpiod_get_label(struct gpio_desc *desc)
> {
> + struct gpio_desc_label *label;
> unsigned long flags;
>
> flags = READ_ONCE(desc->flags);
> @@ -108,23 +109,36 @@ const char *gpiod_get_label(struct gpio_desc *desc)
> !test_bit(FLAG_REQUESTED, &flags))
> return "interrupt";
>
> - return test_bit(FLAG_REQUESTED, &flags) ?
> - srcu_dereference(desc->label, &desc->srcu) : NULL;
> + if (!test_bit(FLAG_REQUESTED, &flags))
> + return NULL;
> +
> + label = srcu_dereference_check(desc->label, &desc->srcu,
> + srcu_read_lock_held(&desc->srcu));
> +
> + return label->str;
> +}
> +
> +static void desc_free_label(struct rcu_head *rh)
> +{
> + kfree(container_of(rh, struct gpio_desc_label, rh));
> }
>
> static int desc_set_label(struct gpio_desc *desc, const char *label)
> {
> - const char *new = NULL, *old;
> + struct gpio_desc_label *new = NULL, *old;
>
> if (label) {
> - new = kstrdup_const(label, GFP_KERNEL);
> + new = kzalloc(struct_size(new, str, strlen(label) + 1),
> + GFP_KERNEL);
> if (!new)
> return -ENOMEM;
> +
> + strcpy(new->str, label);
> }
>
> old = rcu_replace_pointer(desc->label, new, 1);
> - synchronize_srcu(&desc->srcu);
> - kfree_const(old);
> + if (old)
> + call_srcu(&desc->srcu, &old->rh, desc_free_label);
>
> return 0;
> }
> @@ -697,8 +711,11 @@ static void gpiodev_release(struct device *dev)
> struct gpio_device *gdev = to_gpio_device(dev);
> unsigned int i;
>
> - for (i = 0; i < gdev->ngpio; i++)
> + for (i = 0; i < gdev->ngpio; i++) {
> + /* Free pending label. */
> + synchronize_srcu(&gdev->descs[i].srcu);
> cleanup_srcu_struct(&gdev->descs[i].srcu);
> + }
>
> ida_free(&gpio_ida, gdev->id);
> kfree_const(gdev->label);
> diff --git a/drivers/gpio/gpiolib.h b/drivers/gpio/gpiolib.h
> index f67d5991ab1c..69a353c789f0 100644
> --- a/drivers/gpio/gpiolib.h
> +++ b/drivers/gpio/gpiolib.h
> @@ -137,6 +137,11 @@ int gpiod_set_transitory(struct gpio_desc *desc, bool transitory);
>
> void gpiod_line_state_notify(struct gpio_desc *desc, unsigned long action);
>
> +struct gpio_desc_label {
> + struct rcu_head rh;
> + char str[];
> +};
> +
> /**
> * struct gpio_desc - Opaque descriptor for a GPIO
> *
> @@ -177,7 +182,7 @@ struct gpio_desc {
> #define FLAG_EVENT_CLOCK_HTE 19 /* GPIO CDEV reports hardware timestamps in events */
>
> /* Connection label */
> - const char __rcu *label;
> + struct gpio_desc_label __rcu *label;
> /* Name of the GPIO */
> const char *name;
> #ifdef CONFIG_OF_DYNAMIC

Tested-by: Neil Armstrong <[email protected]> # on SM8650-QRD

Reduces the penalty from 100x / 2985x to only ~2x

Thanks,
Neil

2024-05-07 14:25:29

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] gpiolib: fix the speed of descriptor label setting with SRCU

On Tue, May 07, 2024 at 02:13:46PM +0200, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <[email protected]>
>
> Commit 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
> caused a massive drop in performance of requesting GPIO lines due to the
> call to synchronize_srcu() on each label change. Rework the code to not
> wait until all read-only users are done with reading the label but
> instead atomically replace the label pointer and schedule its release
> after all read-only critical sections are done.
>
> To that end wrap the descriptor label in a struct that also contains the
> rcu_head struct required for deferring tasks using call_srcu() and stop
> using kstrdup_const() as we're required to allocate memory anyway. Just
> allocate enough for the label string and rcu_head in one go.
>
> Reported-by: Neil Armstrong <[email protected]>
> Closes: https://lore.kernel.org/linux-gpio/CAMRc=Mfig2oooDQYTqo23W3PXSdzhVO4p=G4+P8y1ppBOrkrJQ@mail.gmail.com/
> Fixes: 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
> Suggested-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Bartosz Golaszewski <[email protected]>

Looks good to me!

Acked-by: Paul E. McKenney <[email protected]>

One semi-related question... Why the per-descriptor srcu_struct?
Please see below for more on this question.

Thanx, Paul

> ---
> drivers/gpio/gpiolib.c | 31 ++++++++++++++++++++++++-------
> drivers/gpio/gpiolib.h | 7 ++++++-
> 2 files changed, 30 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> index 94903fc1c145..2fa3756c9073 100644
> --- a/drivers/gpio/gpiolib.c
> +++ b/drivers/gpio/gpiolib.c
> @@ -101,6 +101,7 @@ static bool gpiolib_initialized;
>
> const char *gpiod_get_label(struct gpio_desc *desc)
> {
> + struct gpio_desc_label *label;
> unsigned long flags;
>
> flags = READ_ONCE(desc->flags);
> @@ -108,23 +109,36 @@ const char *gpiod_get_label(struct gpio_desc *desc)
> !test_bit(FLAG_REQUESTED, &flags))
> return "interrupt";
>
> - return test_bit(FLAG_REQUESTED, &flags) ?
> - srcu_dereference(desc->label, &desc->srcu) : NULL;
> + if (!test_bit(FLAG_REQUESTED, &flags))
> + return NULL;
> +
> + label = srcu_dereference_check(desc->label, &desc->srcu,
> + srcu_read_lock_held(&desc->srcu));
> +
> + return label->str;
> +}
> +
> +static void desc_free_label(struct rcu_head *rh)
> +{
> + kfree(container_of(rh, struct gpio_desc_label, rh));
> }
>
> static int desc_set_label(struct gpio_desc *desc, const char *label)
> {
> - const char *new = NULL, *old;
> + struct gpio_desc_label *new = NULL, *old;
>
> if (label) {
> - new = kstrdup_const(label, GFP_KERNEL);
> + new = kzalloc(struct_size(new, str, strlen(label) + 1),
> + GFP_KERNEL);
> if (!new)
> return -ENOMEM;
> +
> + strcpy(new->str, label);
> }
>
> old = rcu_replace_pointer(desc->label, new, 1);
> - synchronize_srcu(&desc->srcu);
> - kfree_const(old);
> + if (old)
> + call_srcu(&desc->srcu, &old->rh, desc_free_label);
>
> return 0;
> }
> @@ -697,8 +711,11 @@ static void gpiodev_release(struct device *dev)
> struct gpio_device *gdev = to_gpio_device(dev);
> unsigned int i;
>
> - for (i = 0; i < gdev->ngpio; i++)
> + for (i = 0; i < gdev->ngpio; i++) {
> + /* Free pending label. */
> + synchronize_srcu(&gdev->descs[i].srcu);
> cleanup_srcu_struct(&gdev->descs[i].srcu);
> + }

If the srcu_struct was shared among all of these, you could just do one
synchronize_srcu() and one cleanup_srcu_struct() instead of needing to
do one per gdev->desc[] entry.

You might be able to go further and have one srcu_struct for all the
gpio devices.

Or did you guys run tests and find some performance problem with sharing
srcu_struct structures? (I wouldn't expect one, but sometimes the
hardware has a better imagination than I do.)

> ida_free(&gpio_ida, gdev->id);
> kfree_const(gdev->label);
> diff --git a/drivers/gpio/gpiolib.h b/drivers/gpio/gpiolib.h
> index f67d5991ab1c..69a353c789f0 100644
> --- a/drivers/gpio/gpiolib.h
> +++ b/drivers/gpio/gpiolib.h
> @@ -137,6 +137,11 @@ int gpiod_set_transitory(struct gpio_desc *desc, bool transitory);
>
> void gpiod_line_state_notify(struct gpio_desc *desc, unsigned long action);
>
> +struct gpio_desc_label {
> + struct rcu_head rh;
> + char str[];
> +};
> +
> /**
> * struct gpio_desc - Opaque descriptor for a GPIO
> *
> @@ -177,7 +182,7 @@ struct gpio_desc {
> #define FLAG_EVENT_CLOCK_HTE 19 /* GPIO CDEV reports hardware timestamps in events */
>
> /* Connection label */
> - const char __rcu *label;
> + struct gpio_desc_label __rcu *label;
> /* Name of the GPIO */
> const char *name;
> #ifdef CONFIG_OF_DYNAMIC
> --
> 2.40.1
>

2024-05-07 14:48:27

by Bartosz Golaszewski

[permalink] [raw]
Subject: Re: [PATCH] gpiolib: fix the speed of descriptor label setting with SRCU

On Tue, May 7, 2024 at 4:24 PM Paul E. McKenney <[email protected]> wrote:
>
> On Tue, May 07, 2024 at 02:13:46PM +0200, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <[email protected]>
> >
> > Commit 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
> > caused a massive drop in performance of requesting GPIO lines due to the
> > call to synchronize_srcu() on each label change. Rework the code to not
> > wait until all read-only users are done with reading the label but
> > instead atomically replace the label pointer and schedule its release
> > after all read-only critical sections are done.
> >
> > To that end wrap the descriptor label in a struct that also contains the
> > rcu_head struct required for deferring tasks using call_srcu() and stop
> > using kstrdup_const() as we're required to allocate memory anyway. Just
> > allocate enough for the label string and rcu_head in one go.
> >
> > Reported-by: Neil Armstrong <[email protected]>
> > Closes: https://lore.kernel.org/linux-gpio/CAMRc=Mfig2oooDQYTqo23W3PXSdzhVO4p=G4+P8y1ppBOrkrJQ@mail.gmail.com/
> > Fixes: 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
> > Suggested-by: Paul E. McKenney <[email protected]>
> > Signed-off-by: Bartosz Golaszewski <[email protected]>
>
> Looks good to me!
>
> Acked-by: Paul E. McKenney <[email protected]>
>
> One semi-related question... Why the per-descriptor srcu_struct?
>
> If the srcu_struct was shared among all of these, you could just do one
> synchronize_srcu() and one cleanup_srcu_struct() instead of needing to
> do one per gdev->desc[] entry.
>
> You might be able to go further and have one srcu_struct for all the
> gpio devices.
>
> Or did you guys run tests and find some performance problem with sharing
> srcu_struct structures? (I wouldn't expect one, but sometimes the
> hardware has a better imagination than I do.)
>

I guess my goal was not to make synchronize_srcu() for descriptor X
wait for read-only operations on descriptor Y. But with that gone, I
suppose you're right, we can improve this patch further by switching
to a single SRCU descriptor.

I'll send a v2.

Bart

2024-05-07 16:15:01

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] gpiolib: fix the speed of descriptor label setting with SRCU

On Tue, May 07, 2024 at 04:48:04PM +0200, Bartosz Golaszewski wrote:
> On Tue, May 7, 2024 at 4:24 PM Paul E. McKenney <[email protected]> wrote:
> >
> > On Tue, May 07, 2024 at 02:13:46PM +0200, Bartosz Golaszewski wrote:
> > > From: Bartosz Golaszewski <[email protected]>
> > >
> > > Commit 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
> > > caused a massive drop in performance of requesting GPIO lines due to the
> > > call to synchronize_srcu() on each label change. Rework the code to not
> > > wait until all read-only users are done with reading the label but
> > > instead atomically replace the label pointer and schedule its release
> > > after all read-only critical sections are done.
> > >
> > > To that end wrap the descriptor label in a struct that also contains the
> > > rcu_head struct required for deferring tasks using call_srcu() and stop
> > > using kstrdup_const() as we're required to allocate memory anyway. Just
> > > allocate enough for the label string and rcu_head in one go.
> > >
> > > Reported-by: Neil Armstrong <[email protected]>
> > > Closes: https://lore.kernel.org/linux-gpio/CAMRc=Mfig2oooDQYTqo23W3PXSdzhVO4p=G4+P8y1ppBOrkrJQ@mail.gmail.com/
> > > Fixes: 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
> > > Suggested-by: Paul E. McKenney <[email protected]>
> > > Signed-off-by: Bartosz Golaszewski <[email protected]>
> >
> > Looks good to me!
> >
> > Acked-by: Paul E. McKenney <[email protected]>
> >
> > One semi-related question... Why the per-descriptor srcu_struct?
> >
> > If the srcu_struct was shared among all of these, you could just do one
> > synchronize_srcu() and one cleanup_srcu_struct() instead of needing to
> > do one per gdev->desc[] entry.
> >
> > You might be able to go further and have one srcu_struct for all the
> > gpio devices.
> >
> > Or did you guys run tests and find some performance problem with sharing
> > srcu_struct structures? (I wouldn't expect one, but sometimes the
> > hardware has a better imagination than I do.)
> >
>
> I guess my goal was not to make synchronize_srcu() for descriptor X
> wait for read-only operations on descriptor Y. But with that gone, I
> suppose you're right, we can improve this patch further by switching
> to a single SRCU descriptor.
>
> I'll send a v2.

My guess is that a separate patch for each of the two changes would be
best, but I must defer to you guys on that.

Thanx, Paul

2024-05-07 16:39:31

by Bartosz Golaszewski

[permalink] [raw]
Subject: Re: [PATCH] gpiolib: fix the speed of descriptor label setting with SRCU

On Tue, May 7, 2024 at 6:01 PM Paul E. McKenney <[email protected]> wrote:
>
> On Tue, May 07, 2024 at 04:48:04PM +0200, Bartosz Golaszewski wrote:
> > On Tue, May 7, 2024 at 4:24 PM Paul E. McKenney <paulmck@kernelorg> wrote:
> > >
> > > On Tue, May 07, 2024 at 02:13:46PM +0200, Bartosz Golaszewski wrote:
> > > > From: Bartosz Golaszewski <[email protected]>
> > > >
> > > > Commit 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
> > > > caused a massive drop in performance of requesting GPIO lines due to the
> > > > call to synchronize_srcu() on each label change. Rework the code to not
> > > > wait until all read-only users are done with reading the label but
> > > > instead atomically replace the label pointer and schedule its release
> > > > after all read-only critical sections are done.
> > > >
> > > > To that end wrap the descriptor label in a struct that also contains the
> > > > rcu_head struct required for deferring tasks using call_srcu() and stop
> > > > using kstrdup_const() as we're required to allocate memory anyway. Just
> > > > allocate enough for the label string and rcu_head in one go.
> > > >
> > > > Reported-by: Neil Armstrong <[email protected]>
> > > > Closes: https://lore.kernel.org/linux-gpio/CAMRc=Mfig2oooDQYTqo23W3PXSdzhVO4p=G4+P8y1ppBOrkrJQ@mail.gmail.com/
> > > > Fixes: 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
> > > > Suggested-by: Paul E. McKenney <[email protected]>
> > > > Signed-off-by: Bartosz Golaszewski <[email protected]>
> > >
> > > Looks good to me!
> > >
> > > Acked-by: Paul E. McKenney <[email protected]>
> > >
> > > One semi-related question... Why the per-descriptor srcu_struct?
> > >
> > > If the srcu_struct was shared among all of these, you could just do one
> > > synchronize_srcu() and one cleanup_srcu_struct() instead of needing to
> > > do one per gdev->desc[] entry.
> > >
> > > You might be able to go further and have one srcu_struct for all the
> > > gpio devices.
> > >
> > > Or did you guys run tests and find some performance problem with sharing
> > > srcu_struct structures? (I wouldn't expect one, but sometimes the
> > > hardware has a better imagination than I do.)
> > >
> >
> > I guess my goal was not to make synchronize_srcu() for descriptor X
> > wait for read-only operations on descriptor Y. But with that gone, I
> > suppose you're right, we can improve this patch further by switching
> > to a single SRCU descriptor.
> >
> > I'll send a v2.
>
> My guess is that a separate patch for each of the two changes would be
> best, but I must defer to you guys on that.
>
> Thanx, Paul

Ok, makes sense, I'll apply this and send a follow-up.

Bart

2024-05-07 17:25:10

by Bartosz Golaszewski

[permalink] [raw]
Subject: Re: [PATCH] gpiolib: fix the speed of descriptor label setting with SRCU

From: Bartosz Golaszewski <[email protected]>


On Tue, 07 May 2024 14:13:46 +0200, Bartosz Golaszewski wrote:
> Commit 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
> caused a massive drop in performance of requesting GPIO lines due to the
> call to synchronize_srcu() on each label change. Rework the code to not
> wait until all read-only users are done with reading the label but
> instead atomically replace the label pointer and schedule its release
> after all read-only critical sections are done.
>
> [...]

Applied, thanks!

[1/1] gpiolib: fix the speed of descriptor label setting with SRCU
commit: a86d27693066a34a29be86f394bbad847b2d1749

Best regards,
--
Bartosz Golaszewski <[email protected]>

2024-05-07 17:38:37

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH] gpiolib: fix the speed of descriptor label setting with SRCU

On Tue, May 07, 2024 at 02:13:46PM +0200, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <[email protected]>
>
> Commit 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
> caused a massive drop in performance of requesting GPIO lines due to the
> call to synchronize_srcu() on each label change. Rework the code to not
> wait until all read-only users are done with reading the label but
> instead atomically replace the label pointer and schedule its release
> after all read-only critical sections are done.

> To that end wrap the descriptor label in a struct that also contains the
> rcu_head struct required for deferring tasks using call_srcu() and stop
> using kstrdup_const() as we're required to allocate memory anyway.

If there is no label and we assign something like "?" (two bytes) we got
with your patch the allocation of most likely 32 bytes (as next power of
two for the SLAB) instead of 18..24.

OTOH, I dunno if SLAB supports 24-bytes. If not, it means that up to 16 bytes
label there would be no difference. In any case, with a new update (as far
as I understood the next move) it might return to kstrdup_const() or so.

> Just allocate enough for the label string and rcu_head in one go.

--
With Best Regards,
Andy Shevchenko



2024-05-09 12:39:19

by Bartosz Golaszewski

[permalink] [raw]
Subject: Re: [PATCH] gpiolib: fix the speed of descriptor label setting with SRCU

On Tue, May 7, 2024 at 7:38 PM Andy Shevchenko
<[email protected]> wrote:
>
> On Tue, May 07, 2024 at 02:13:46PM +0200, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <[email protected]>
> >
> > Commit 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
> > caused a massive drop in performance of requesting GPIO lines due to the
> > call to synchronize_srcu() on each label change. Rework the code to not
> > wait until all read-only users are done with reading the label but
> > instead atomically replace the label pointer and schedule its release
> > after all read-only critical sections are done.
>
> > To that end wrap the descriptor label in a struct that also contains the
> > rcu_head struct required for deferring tasks using call_srcu() and stop
> > using kstrdup_const() as we're required to allocate memory anyway.
>
> If there is no label and we assign something like "?" (two bytes) we got
> with your patch the allocation of most likely 32 bytes (as next power of
> two for the SLAB) instead of 18..24.
>
> OTOH, I dunno if SLAB supports 24-bytes. If not, it means that up to 16 bytes
> label there would be no difference. In any case, with a new update (as far
> as I understood the next move) it might return to kstrdup_const() or so.
>

Memory is cheap. This is just done for requested lines of which there
are never that many. I wouldn't stress about it. The rcu_head struct
is already 32 bytes on its own.

Bart