2018-03-10 00:13:37

by Laura Abbott

[permalink] [raw]
Subject: [PATCH 0/4] VLA removal from the GPIO subsystem

Hi,

For those who haven't seen it, there's an effort to remove VLAs from the
kernel so we can turn on -Wvla in the name of security. See
https://lkml.org/lkml/2018/3/7/621 for more details and discussion.

This is a series to remove a few VLAs from the gpio subsystem. These are
compile tested only so I'd appreciate reviews and tests from
maintainers. When these get Acked, I expect these to just go through the
GPIO tree like usual.

Thanks,
Laura

Laura Abbott (4):
gpio: Remove VLA from gpiolib
gpio: Remove VLA from MAX3191X driver
gpio: Remove VLA from xra1403 driver
gpio: Remove VLA from stmpe driver

drivers/gpio/gpio-max3191x.c | 7 +++++-
drivers/gpio/gpio-stmpe.c | 7 +++++-
drivers/gpio/gpio-xra1403.c | 8 ++++++-
drivers/gpio/gpiolib.c | 55 ++++++++++++++++++++++++++++++++++++++------
4 files changed, 67 insertions(+), 10 deletions(-)

--
2.14.3



2018-03-10 00:12:25

by Laura Abbott

[permalink] [raw]
Subject: [PATCH 3/4] gpio: Remove VLA from xra1403 driver


The new challenge is to remove VLAs from the kernel
(see https://lkml.org/lkml/2018/3/7/621)

This patch replaces a VLA with an appropriate call to kmalloc_array.

Signed-off-by: Laura Abbott <[email protected]>
---
drivers/gpio/gpio-xra1403.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpio/gpio-xra1403.c b/drivers/gpio/gpio-xra1403.c
index 0230e4b7a2fb..8d4c8e99b251 100644
--- a/drivers/gpio/gpio-xra1403.c
+++ b/drivers/gpio/gpio-xra1403.c
@@ -126,11 +126,16 @@ static void xra1403_dbg_show(struct seq_file *s, struct gpio_chip *chip)
{
int reg;
struct xra1403 *xra = gpiochip_get_data(chip);
- int value[xra1403_regmap_cfg.max_register];
+ int *value;
int i;
unsigned int gcr;
unsigned int gsr;

+ value = kmalloc_array(xra1403_regmap_cfg.max_register, sizeof(*value),
+ GFP_KERNEL);
+ if (!value)
+ return;
+
seq_puts(s, "xra reg:");
for (reg = 0; reg <= xra1403_regmap_cfg.max_register; reg++)
seq_printf(s, " %2.2x", reg);
@@ -154,6 +159,7 @@ static void xra1403_dbg_show(struct seq_file *s, struct gpio_chip *chip)
(gcr & BIT(i)) ? "in" : "out",
(gsr & BIT(i)) ? "hi" : "lo");
}
+ kfree(value);
}
#else
#define xra1403_dbg_show NULL
--
2.14.3


2018-03-10 00:12:43

by Laura Abbott

[permalink] [raw]
Subject: [PATCH 1/4] gpio: Remove VLA from gpiolib



The new challenge is to remove VLAs from the kernel
(see https://lkml.org/lkml/2018/3/7/621)

This patch replaces several VLAs with an appropriate call to
kmalloc_array.

Signed-off-by: Laura Abbott <[email protected]>
---
drivers/gpio/gpiolib.c | 55 +++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 48 insertions(+), 7 deletions(-)

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index d66de67ef307..124727c74931 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -2662,16 +2662,33 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep,

while (i < array_size) {
struct gpio_chip *chip = desc_array[i]->gdev->chip;
- unsigned long mask[BITS_TO_LONGS(chip->ngpio)];
- unsigned long bits[BITS_TO_LONGS(chip->ngpio)];
+ unsigned long *mask;
+ unsigned long *bits;
int first, j, ret;

+ mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
+ sizeof(*mask),
+ can_sleep ? GFP_KERNEL : GFP_ATOMIC);
+
+ if (!mask)
+ return -ENOMEM;
+
+ bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
+ sizeof(*bits),
+ can_sleep ? GFP_KERNEL : GFP_ATOMIC);
+
+ if (!bits) {
+ kfree(mask);
+ return -ENOMEM;
+ }
+
+
if (!can_sleep)
WARN_ON(chip->can_sleep);

/* collect all inputs belonging to the same chip */
first = i;
- memset(mask, 0, sizeof(mask));
+ memset(mask, 0, sizeof(*mask));
do {
const struct gpio_desc *desc = desc_array[i];
int hwgpio = gpio_chip_hwgpio(desc);
@@ -2682,8 +2699,11 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep,
(desc_array[i]->gdev->chip == chip));

ret = gpio_chip_get_multiple(chip, mask, bits);
- if (ret)
+ if (ret) {
+ kfree(bits);
+ kfree(mask);
return ret;
+ }

for (j = first; j < i; j++) {
const struct gpio_desc *desc = desc_array[j];
@@ -2695,6 +2715,8 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep,
value_array[j] = value;
trace_gpio_value(desc_to_gpio(desc), 1, value);
}
+ kfree(bits);
+ kfree(mask);
}
return 0;
}
@@ -2887,14 +2909,30 @@ void gpiod_set_array_value_complex(bool raw, bool can_sleep,

while (i < array_size) {
struct gpio_chip *chip = desc_array[i]->gdev->chip;
- unsigned long mask[BITS_TO_LONGS(chip->ngpio)];
- unsigned long bits[BITS_TO_LONGS(chip->ngpio)];
+ unsigned long *mask;
+ unsigned long *bits;
int count = 0;

+ mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
+ sizeof(*mask),
+ can_sleep ? GFP_KERNEL : GFP_ATOMIC);
+
+ if (!mask)
+ return;
+
+ bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
+ sizeof(*bits),
+ can_sleep ? GFP_KERNEL : GFP_ATOMIC);
+
+ if (!bits) {
+ kfree(mask);
+ return;
+ }
+
if (!can_sleep)
WARN_ON(chip->can_sleep);

- memset(mask, 0, sizeof(mask));
+ memset(mask, 0, sizeof(*mask));
do {
struct gpio_desc *desc = desc_array[i];
int hwgpio = gpio_chip_hwgpio(desc);
@@ -2925,6 +2963,9 @@ void gpiod_set_array_value_complex(bool raw, bool can_sleep,
/* push collected bits to outputs */
if (count != 0)
gpio_chip_set_multiple(chip, mask, bits);
+
+ kfree(mask);
+ kfree(bits);
}
}

--
2.14.3


2018-03-10 00:12:50

by Laura Abbott

[permalink] [raw]
Subject: [PATCH 2/4] gpio: Remove VLA from MAX3191X driver


The new challenge is to remove VLAs from the kernel
(see https://lkml.org/lkml/2018/3/7/621)

This patch replaces several a VLA with an appropriate call to
kmalloc_array.

Signed-off-by: Laura Abbott <[email protected]>
---
drivers/gpio/gpio-max3191x.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpio/gpio-max3191x.c b/drivers/gpio/gpio-max3191x.c
index f74b1072e84b..b5b9cb1fda50 100644
--- a/drivers/gpio/gpio-max3191x.c
+++ b/drivers/gpio/gpio-max3191x.c
@@ -315,12 +315,17 @@ static void gpiod_set_array_single_value_cansleep(unsigned int ndescs,
struct gpio_desc **desc,
int value)
{
- int i, values[ndescs];
+ int i, *values;
+
+ values = kmalloc_array(ndescs, sizeof(*values), GFP_KERNEL);
+ if (!values)
+ return;

for (i = 0; i < ndescs; i++)
values[i] = value;

gpiod_set_array_value_cansleep(ndescs, desc, values);
+ kfree(values);
}

static struct gpio_descs *devm_gpiod_get_array_optional_count(
--
2.14.3


2018-03-10 00:13:03

by Laura Abbott

[permalink] [raw]
Subject: [PATCH 4/4] gpio: Remove VLA from stmpe driver

The new challenge is to remove VLAs from the kernel
(see https://lkml.org/lkml/2018/3/7/621)

This patch replaces a VLA with an appropriate call to kmalloc_array.

Signed-off-by: Laura Abbott <[email protected]>
---
drivers/gpio/gpio-stmpe.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpio/gpio-stmpe.c b/drivers/gpio/gpio-stmpe.c
index f8d7d1cd8488..b7854850bcdb 100644
--- a/drivers/gpio/gpio-stmpe.c
+++ b/drivers/gpio/gpio-stmpe.c
@@ -369,10 +369,14 @@ static irqreturn_t stmpe_gpio_irq(int irq, void *dev)
struct stmpe *stmpe = stmpe_gpio->stmpe;
u8 statmsbreg;
int num_banks = DIV_ROUND_UP(stmpe->num_gpios, 8);
- u8 status[num_banks];
+ u8 *status;
int ret;
int i;

+ status = kmalloc_array(num_banks, sizeof(*status), GFP_ATOMIC);
+ if (!status)
+ return IRQ_NONE;
+
/*
* the stmpe_block_read() call below, imposes to set statmsbreg
* with the register located at the lowest address. As STMPE1600
@@ -424,6 +428,7 @@ static irqreturn_t stmpe_gpio_irq(int irq, void *dev)
}
}

+ kfree(status);
return IRQ_HANDLED;
}

--
2.14.3


2018-03-12 07:24:02

by Nandor Han

[permalink] [raw]
Subject: Re: EXT: [PATCH 3/4] gpio: Remove VLA from xra1403 driver

On 10/03/18 02:10, Laura Abbott wrote:
>
> The new challenge is to remove VLAs from the kernel
> (see https://lkml.org/lkml/2018/3/7/621)
>
> This patch replaces a VLA with an appropriate call to kmalloc_array.
>
> Signed-off-by: Laura Abbott <[email protected]>
> ---

This looks good to me.
Reviewed-by: Nandor Han <[email protected]>

Nandor

> drivers/gpio/gpio-xra1403.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpio/gpio-xra1403.c b/drivers/gpio/gpio-xra1403.c
> index 0230e4b7a2fb..8d4c8e99b251 100644
> --- a/drivers/gpio/gpio-xra1403.c
> +++ b/drivers/gpio/gpio-xra1403.c
> @@ -126,11 +126,16 @@ static void xra1403_dbg_show(struct seq_file *s, struct gpio_chip *chip)
> {
> int reg;
> struct xra1403 *xra = gpiochip_get_data(chip);
> - int value[xra1403_regmap_cfg.max_register];
> + int *value;
> int i;
> unsigned int gcr;
> unsigned int gsr;
>
> + value = kmalloc_array(xra1403_regmap_cfg.max_register, sizeof(*value),
> + GFP_KERNEL);
> + if (!value)
> + return;
> +
> seq_puts(s, "xra reg:");
> for (reg = 0; reg <= xra1403_regmap_cfg.max_register; reg++)
> seq_printf(s, " %2.2x", reg);
> @@ -154,6 +159,7 @@ static void xra1403_dbg_show(struct seq_file *s, struct gpio_chip *chip)
> (gcr & BIT(i)) ? "in" : "out",
> (gsr & BIT(i)) ? "hi" : "lo");
> }
> + kfree(value);
> }
> #else
> #define xra1403_dbg_show NULL
>


2018-03-12 15:02:30

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH 1/4] gpio: Remove VLA from gpiolib

On 2018-03-10 01:10, Laura Abbott wrote:
> /* collect all inputs belonging to the same chip */
> first = i;
> - memset(mask, 0, sizeof(mask));
> + memset(mask, 0, sizeof(*mask));

see below

> @@ -2887,14 +2909,30 @@ void gpiod_set_array_value_complex(bool raw, bool can_sleep,
>
> while (i < array_size) {
> struct gpio_chip *chip = desc_array[i]->gdev->chip;
> - unsigned long mask[BITS_TO_LONGS(chip->ngpio)];
> - unsigned long bits[BITS_TO_LONGS(chip->ngpio)];
> + unsigned long *mask;
> + unsigned long *bits;
> int count = 0;
>
> + mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
> + sizeof(*mask),
> + can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> +
> + if (!mask)
> + return;
> +
> + bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
> + sizeof(*bits),
> + can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> +
> + if (!bits) {
> + kfree(mask);
> + return;
> + }
> +
> if (!can_sleep)
> WARN_ON(chip->can_sleep);
>
> - memset(mask, 0, sizeof(mask));
> + memset(mask, 0, sizeof(*mask));

Hm, it seems you're now only clearing the first word of mask, not the
entire allocation. Why not just use kcalloc() instead of kmalloc_array
to have it automatically cleared?

Other random thoughts: maybe two allocations for each loop iteration is
a bit much. Maybe do a first pass over the array and collect the maximal
chip->ngpio, do the memory allocation and freeing outside the loop (then
you'd of course need to preserve the memset() with appropriate length
computed). And maybe even just do one allocation, making bits point at
the second half.

Does the set function need to be updated to return an int to be able to
inform the caller that memory allocation failed?

Rasmus

2018-03-12 23:42:07

by Laura Abbott

[permalink] [raw]
Subject: Re: [PATCH 1/4] gpio: Remove VLA from gpiolib

On 03/12/2018 08:00 AM, Rasmus Villemoes wrote:
> On 2018-03-10 01:10, Laura Abbott wrote:
>> /* collect all inputs belonging to the same chip */
>> first = i;
>> - memset(mask, 0, sizeof(mask));
>> + memset(mask, 0, sizeof(*mask));
>
> see below
>
>> @@ -2887,14 +2909,30 @@ void gpiod_set_array_value_complex(bool raw, bool can_sleep,
>>
>> while (i < array_size) {
>> struct gpio_chip *chip = desc_array[i]->gdev->chip;
>> - unsigned long mask[BITS_TO_LONGS(chip->ngpio)];
>> - unsigned long bits[BITS_TO_LONGS(chip->ngpio)];
>> + unsigned long *mask;
>> + unsigned long *bits;
>> int count = 0;
>>
>> + mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
>> + sizeof(*mask),
>> + can_sleep ? GFP_KERNEL : GFP_ATOMIC);
>> +
>> + if (!mask)
>> + return;
>> +
>> + bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
>> + sizeof(*bits),
>> + can_sleep ? GFP_KERNEL : GFP_ATOMIC);
>> +
>> + if (!bits) {
>> + kfree(mask);
>> + return;
>> + }
>> +
>> if (!can_sleep)
>> WARN_ON(chip->can_sleep);
>>
>> - memset(mask, 0, sizeof(mask));
>> + memset(mask, 0, sizeof(*mask));
>
> Hm, it seems you're now only clearing the first word of mask, not the
> entire allocation. Why not just use kcalloc() instead of kmalloc_array
> to have it automatically cleared?
>

Bleh, I didn't think through that carefully. I'll just switch
to kcalloc, especially since it calls kmalloc_array.

> Other random thoughts: maybe two allocations for each loop iteration is
> a bit much. Maybe do a first pass over the array and collect the maximal
> chip->ngpio, do the memory allocation and freeing outside the loop (then
> you'd of course need to preserve the memset() with appropriate length
> computed). And maybe even just do one allocation, making bits point at
> the second half.
>

I was trying to make minimal changes and match the existing code. Is this
likely to be an actual hot path to optimize?

> Does the set function need to be updated to return an int to be able to
> inform the caller that memory allocation failed?
>

That would involve changing the public API. I don't have a problem
doing so if that's what you want.

> Rasmus
>

Thanks,
Laura

2018-03-13 07:24:49

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH 1/4] gpio: Remove VLA from gpiolib

On 2018-03-13 00:40, Laura Abbott wrote:
> On 03/12/2018 08:00 AM, Rasmus Villemoes wrote:
>>
>> Hm, it seems you're now only clearing the first word of mask, not the
>> entire allocation. Why not just use kcalloc() instead of kmalloc_array
>> to have it automatically cleared?
>
> Bleh, I didn't think through that carefully. I'll just switch
> to kcalloc, especially since it calls kmalloc_array.
>
>> Other random thoughts: maybe two allocations for each loop iteration is
>> a bit much. Maybe do a first pass over the array and collect the maximal
>> chip->ngpio, do the memory allocation and freeing outside the loop (then
>> you'd of course need to preserve the memset() with appropriate length
>> computed). And maybe even just do one allocation, making bits point at
>> the second half.
>>
>
> I was trying to make minimal changes and match the existing code.

Well, textually you may be making the minimal changes (though the error
handling adds some "boilerplate" kfree()s etc.), but semantically adding
two kmallocs in a loop could be a big deal. Dunno, maybe in practice all
the gpios (almost always) belong to the same chip, so there's only one
iteration through the loop anyway.

> Is this likely to be an actual hot path to optimize?

No idea, it was just a drive-by comment, so also...

>> Does the set function need to be updated to return an int to be able to
>> inform the caller that memory allocation failed?
>
> That would involve changing the public API. I don't have a problem
> doing so if that's what you want.

... I don't "want" anything, that'll be for the gpio maintainers to decide.

Rasmus

2018-03-13 09:14:26

by Phil Reid

[permalink] [raw]
Subject: Re: [PATCH 4/4] gpio: Remove VLA from stmpe driver

On 10/03/2018 08:10, Laura Abbott wrote:
> The new challenge is to remove VLAs from the kernel
> (see https://lkml.org/lkml/2018/3/7/621)
>
> This patch replaces a VLA with an appropriate call to kmalloc_array.
>
> Signed-off-by: Laura Abbott <[email protected]>
> ---
> drivers/gpio/gpio-stmpe.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpio/gpio-stmpe.c b/drivers/gpio/gpio-stmpe.c
> index f8d7d1cd8488..b7854850bcdb 100644
> --- a/drivers/gpio/gpio-stmpe.c
> +++ b/drivers/gpio/gpio-stmpe.c
> @@ -369,10 +369,14 @@ static irqreturn_t stmpe_gpio_irq(int irq, void *dev)
> struct stmpe *stmpe = stmpe_gpio->stmpe;
> u8 statmsbreg;
> int num_banks = DIV_ROUND_UP(stmpe->num_gpios, 8);
> - u8 status[num_banks];
> + u8 *status;
> int ret;
> int i;
>
> + status = kmalloc_array(num_banks, sizeof(*status), GFP_ATOMIC);
> + if (!status)
> + return IRQ_NONE;
> +
> /*
> * the stmpe_block_read() call below, imposes to set statmsbreg
> * with the register located at the lowest address. As STMPE1600
> @@ -424,6 +428,7 @@ static irqreturn_t stmpe_gpio_irq(int irq, void *dev)
> }
> }
>
> + kfree(status);
> return IRQ_HANDLED;
> }
>
>

Doing this in an irq handler seems wrong.
Perhaps better if a buffer is pre-allocated in stmpe_gpio


--
Regards
Phil Reid



2018-03-13 09:44:19

by Linus Walleij

[permalink] [raw]
Subject: Re: [PATCH 0/4] VLA removal from the GPIO subsystem

On Sat, Mar 10, 2018 at 1:10 AM, Laura Abbott <[email protected]> wrote:

> For those who haven't seen it, there's an effort to remove VLAs from the
> kernel so we can turn on -Wvla in the name of security. See
> https://lkml.org/lkml/2018/3/7/621 for more details and discussion.

OK I read up on it, I'm on board! Less is more.

> This is a series to remove a few VLAs from the gpio subsystem. These are
> compile tested only so I'd appreciate reviews and tests from
> maintainers. When these get Acked, I expect these to just go through the
> GPIO tree like usual.

Sure I will just queue it up with ACKs, or if I just think it looks nice
and reliable.

Thank you for doing this!

Yours,
Linus Walleij

2018-03-14 00:19:44

by Laura Abbott

[permalink] [raw]
Subject: Re: [PATCH 4/4] gpio: Remove VLA from stmpe driver

On 03/13/2018 02:13 AM, Phil Reid wrote:
> On 10/03/2018 08:10, Laura Abbott wrote:
>> The new challenge is to remove VLAs from the kernel
>> (see https://lkml.org/lkml/2018/3/7/621)
>>
>> This patch replaces a VLA with an appropriate call to kmalloc_array.
>>
>> Signed-off-by: Laura Abbott <[email protected]>
>> ---
>>   drivers/gpio/gpio-stmpe.c | 7 ++++++-
>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpio/gpio-stmpe.c b/drivers/gpio/gpio-stmpe.c
>> index f8d7d1cd8488..b7854850bcdb 100644
>> --- a/drivers/gpio/gpio-stmpe.c
>> +++ b/drivers/gpio/gpio-stmpe.c
>> @@ -369,10 +369,14 @@ static irqreturn_t stmpe_gpio_irq(int irq, void *dev)
>>       struct stmpe *stmpe = stmpe_gpio->stmpe;
>>       u8 statmsbreg;
>>       int num_banks = DIV_ROUND_UP(stmpe->num_gpios, 8);
>> -    u8 status[num_banks];
>> +    u8 *status;
>>       int ret;
>>       int i;
>> +    status = kmalloc_array(num_banks, sizeof(*status), GFP_ATOMIC);
>> +    if (!status)
>> +        return IRQ_NONE;
>> +
>>       /*
>>        * the stmpe_block_read() call below, imposes to set statmsbreg
>>        * with the register located at the lowest address. As STMPE1600
>> @@ -424,6 +428,7 @@ static irqreturn_t stmpe_gpio_irq(int irq, void *dev)
>>           }
>>       }
>> +    kfree(status);
>>       return IRQ_HANDLED;
>>   }
>>
>
> Doing this in an irq handler seems wrong.
> Perhaps better if a buffer is pre-allocated in stmpe_gpio
>
>

Sure, I can pre-allocate the buffer in the probe.

Thanks,
Laura

2018-03-14 01:18:09

by Laura Abbott

[permalink] [raw]
Subject: Re: [PATCH 4/4] gpio: Remove VLA from stmpe driver

On 03/13/2018 05:18 PM, Laura Abbott wrote:
> On 03/13/2018 02:13 AM, Phil Reid wrote:
>> On 10/03/2018 08:10, Laura Abbott wrote:
>>> The new challenge is to remove VLAs from the kernel
>>> (see https://lkml.org/lkml/2018/3/7/621)
>>>
>>> This patch replaces a VLA with an appropriate call to kmalloc_array.
>>>
>>> Signed-off-by: Laura Abbott <[email protected]>
>>> ---
>>>   drivers/gpio/gpio-stmpe.c | 7 ++++++-
>>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpio/gpio-stmpe.c b/drivers/gpio/gpio-stmpe.c
>>> index f8d7d1cd8488..b7854850bcdb 100644
>>> --- a/drivers/gpio/gpio-stmpe.c
>>> +++ b/drivers/gpio/gpio-stmpe.c
>>> @@ -369,10 +369,14 @@ static irqreturn_t stmpe_gpio_irq(int irq, void *dev)
>>>       struct stmpe *stmpe = stmpe_gpio->stmpe;
>>>       u8 statmsbreg;
>>>       int num_banks = DIV_ROUND_UP(stmpe->num_gpios, 8);
>>> -    u8 status[num_banks];
>>> +    u8 *status;
>>>       int ret;
>>>       int i;
>>> +    status = kmalloc_array(num_banks, sizeof(*status), GFP_ATOMIC);
>>> +    if (!status)
>>> +        return IRQ_NONE;
>>> +
>>>       /*
>>>        * the stmpe_block_read() call below, imposes to set statmsbreg
>>>        * with the register located at the lowest address. As STMPE1600
>>> @@ -424,6 +428,7 @@ static irqreturn_t stmpe_gpio_irq(int irq, void *dev)
>>>           }
>>>       }
>>> +    kfree(status);
>>>       return IRQ_HANDLED;
>>>   }
>>>
>>
>> Doing this in an irq handler seems wrong.
>> Perhaps better if a buffer is pre-allocated in stmpe_gpio
>>
>>
>
> Sure, I can pre-allocate the buffer in the probe.
>
> Thanks,
> Laura

Actually I wonder if there would be concurrency problems if we
tried to pre-allocate a global buffer. But the IRQ handler
calls stmpe_block_read which takes a mutex to sleep so I think
doing the (small) kmalloc should be fine and I can change it to
a GFP_KERNEL.

Thanks,
Laura

2018-03-14 02:56:52

by Phil Reid

[permalink] [raw]
Subject: Re: [PATCH 4/4] gpio: Remove VLA from stmpe driver

On 14/03/2018 09:16, Laura Abbott wrote:
> On 03/13/2018 05:18 PM, Laura Abbott wrote:
>> On 03/13/2018 02:13 AM, Phil Reid wrote:
>>> On 10/03/2018 08:10, Laura Abbott wrote:
>>>> The new challenge is to remove VLAs from the kernel
>>>> (see https://lkml.org/lkml/2018/3/7/621)
>>>>
>>>> This patch replaces a VLA with an appropriate call to kmalloc_array.
>>>>
>>>> Signed-off-by: Laura Abbott <[email protected]>
>>>> ---
>>>>   drivers/gpio/gpio-stmpe.c | 7 ++++++-
>>>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpio/gpio-stmpe.c b/drivers/gpio/gpio-stmpe.c
>>>> index f8d7d1cd8488..b7854850bcdb 100644
>>>> --- a/drivers/gpio/gpio-stmpe.c
>>>> +++ b/drivers/gpio/gpio-stmpe.c
>>>> @@ -369,10 +369,14 @@ static irqreturn_t stmpe_gpio_irq(int irq, void *dev)
>>>>       struct stmpe *stmpe = stmpe_gpio->stmpe;
>>>>       u8 statmsbreg;
>>>>       int num_banks = DIV_ROUND_UP(stmpe->num_gpios, 8);
>>>> -    u8 status[num_banks];
>>>> +    u8 *status;
>>>>       int ret;
>>>>       int i;
>>>> +    status = kmalloc_array(num_banks, sizeof(*status), GFP_ATOMIC);
>>>> +    if (!status)
>>>> +        return IRQ_NONE;
>>>> +
>>>>       /*
>>>>        * the stmpe_block_read() call below, imposes to set statmsbreg
>>>>        * with the register located at the lowest address. As STMPE1600
>>>> @@ -424,6 +428,7 @@ static irqreturn_t stmpe_gpio_irq(int irq, void *dev)
>>>>           }
>>>>       }
>>>> +    kfree(status);
>>>>       return IRQ_HANDLED;
>>>>   }
>>>>
>>>
>>> Doing this in an irq handler seems wrong.
>>> Perhaps better if a buffer is pre-allocated in stmpe_gpio
>>
>> Sure, I can pre-allocate the buffer in the probe.
>
> Actually I wonder if there would be concurrency problems if we
> tried to pre-allocate a global buffer. But the IRQ handler
> calls stmpe_block_read which takes a mutex to sleep so I think
> doing the (small) kmalloc should be fine and I can change it to
> a GFP_KERNEL.
>
I'm no expert on this driver, but I wouldn't think there'd be any concurrency
problem if the buffer is only used by the irq handler.
It should never get called concurrently for the same device.

As to the impact, I'll admit I've really got no idea how much potential overhead a
kmalloc has compared to a mutex for the linux kernel.
Just a general rule of thumb, that memory allocations are usually expensive.
--
Regards
Phil Reid



2018-03-17 08:29:08

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH 1/4] gpio: Remove VLA from gpiolib

On Mon, Mar 12, 2018 at 04:00:36PM +0100, Rasmus Villemoes wrote:
> On 2018-03-10 01:10, Laura Abbott wrote:
> > @@ -2887,14 +2909,30 @@ void gpiod_set_array_value_complex(bool raw, bool can_sleep,
> >
> > while (i < array_size) {
> > struct gpio_chip *chip = desc_array[i]->gdev->chip;
> > - unsigned long mask[BITS_TO_LONGS(chip->ngpio)];
> > - unsigned long bits[BITS_TO_LONGS(chip->ngpio)];
> > + unsigned long *mask;
> > + unsigned long *bits;
> > int count = 0;
> >
> > + mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
> > + sizeof(*mask),
> > + can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> > +
> > + if (!mask)
> > + return;
> > +
> > + bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
> > + sizeof(*bits),
> > + can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> > +
> > + if (!bits) {
> > + kfree(mask);
> > + return;
> > + }
> > +
> > if (!can_sleep)
> > WARN_ON(chip->can_sleep);
> >
> > - memset(mask, 0, sizeof(mask));
> > + memset(mask, 0, sizeof(*mask));
>
> Other random thoughts: maybe two allocations for each loop iteration is
> a bit much. Maybe do a first pass over the array and collect the maximal
> chip->ngpio, do the memory allocation and freeing outside the loop (then
> you'd of course need to preserve the memset() with appropriate length
> computed). And maybe even just do one allocation, making bits point at
> the second half.

I think those are great ideas because the function is kind of a hotpath
and usage of VLAs was motivated by the desire to make it fast.

I'd go one step further and store the maximum ngpio of all registered
chips in a global variable (and update it in gpiochip_add_data_with_key()),
then allocate 2 * max_ngpio once before entering the loop (as you've
suggested). That would avoid the first pass to determine the maximum
chip->ngpio. In most systems max_ngpio will be < 64, so one or two
unsigned longs depending on the arch's bitness.

FWIW, to achieve a stack overflow the platform or a driver need to specify
a huge number of GPIOs for a chip. So the exploitability is limited,
but of course it's still better to get rid of the VLAs.

Running v2 of this patch through checkpatch --strict results in a few
"Alignment should match open parenthesis" and one "Please don't use
multiple blank lines" complaint, granted those are nits but it may
be worth fixing them up front lest the usual suspects come along and
submit bikeshedding patches.

Thanks,

Lukas

2018-03-18 14:26:00

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH 1/4] gpio: Remove VLA from gpiolib

On Sat, Mar 17, 2018 at 09:25:09AM +0100, Lukas Wunner wrote:
> On Mon, Mar 12, 2018 at 04:00:36PM +0100, Rasmus Villemoes wrote:
> > On 2018-03-10 01:10, Laura Abbott wrote:
> > > @@ -2887,14 +2909,30 @@ void gpiod_set_array_value_complex(bool raw, bool can_sleep,
> > >
> > > while (i < array_size) {
> > > struct gpio_chip *chip = desc_array[i]->gdev->chip;
> > > - unsigned long mask[BITS_TO_LONGS(chip->ngpio)];
> > > - unsigned long bits[BITS_TO_LONGS(chip->ngpio)];
> > > + unsigned long *mask;
> > > + unsigned long *bits;
> > > int count = 0;
> > >
> > > + mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
> > > + sizeof(*mask),
> > > + can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> > > +
> > > + if (!mask)
> > > + return;
> > > +
> > > + bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
> > > + sizeof(*bits),
> > > + can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> > > +
> > > + if (!bits) {
> > > + kfree(mask);
> > > + return;
> > > + }
> > > +
> > > if (!can_sleep)
> > > WARN_ON(chip->can_sleep);
> > >
> > > - memset(mask, 0, sizeof(mask));
> > > + memset(mask, 0, sizeof(*mask));
> >
> > Other random thoughts: maybe two allocations for each loop iteration is
> > a bit much. Maybe do a first pass over the array and collect the maximal
> > chip->ngpio, do the memory allocation and freeing outside the loop (then
> > you'd of course need to preserve the memset() with appropriate length
> > computed). And maybe even just do one allocation, making bits point at
> > the second half.
>
> I think those are great ideas because the function is kind of a hotpath
> and usage of VLAs was motivated by the desire to make it fast.
>
> I'd go one step further and store the maximum ngpio of all registered
> chips in a global variable (and update it in gpiochip_add_data_with_key()),
> then allocate 2 * max_ngpio once before entering the loop (as you've
> suggested). That would avoid the first pass to determine the maximum
> chip->ngpio. In most systems max_ngpio will be < 64, so one or two
> unsigned longs depending on the arch's bitness.

Actually, scratch that. If ngpio is usually smallish, we can just
allocate reasonably sized space for mask and bits on the stack,
and fall back to the kcalloc slowpath only if chip->ngpio exceeds
that limit. Basically the below (likewise compile-tested only),
this is on top of Laura's patch, could be squashed together.
Let me know what you think, thanks.

-- >8 --
Subject: [PATCH] gpio: Add fastpath to gpiod_get/set_array_value_complex()

Signed-off-by: Lukas Wunner <[email protected]>
---
drivers/gpio/gpiolib.c | 76 ++++++++++++++++++++++++--------------------------
1 file changed, 37 insertions(+), 39 deletions(-)

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 429bc251392b..ffc67b0b866c 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -2432,6 +2432,8 @@ static int gpio_chip_get_multiple(struct gpio_chip *chip,
return -EIO;
}

+#define FASTPATH_NGPIO 256
+
int gpiod_get_array_value_complex(bool raw, bool can_sleep,
unsigned int array_size,
struct gpio_desc **desc_array,
@@ -2441,27 +2443,24 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep,

while (i < array_size) {
struct gpio_chip *chip = desc_array[i]->gdev->chip;
- unsigned long *mask;
- unsigned long *bits;
+ unsigned long fastpath[2 * BITS_TO_LONGS(FASTPATH_NGPIO)];
+ unsigned long *slowpath = NULL, *mask, *bits;
int first, j, ret;

- mask = kcalloc(BITS_TO_LONGS(chip->ngpio),
- sizeof(*mask),
- can_sleep ? GFP_KERNEL : GFP_ATOMIC);
-
- if (!mask)
- return -ENOMEM;
-
- bits = kcalloc(BITS_TO_LONGS(chip->ngpio),
- sizeof(*bits),
- can_sleep ? GFP_KERNEL : GFP_ATOMIC);
-
- if (!bits) {
- kfree(mask);
- return -ENOMEM;
+ if (likely(chip->ngpio <= FASTPATH_NGPIO)) {
+ memset(fastpath, 0, sizeof(fastpath));
+ mask = fastpath;
+ bits = fastpath + BITS_TO_LONGS(FASTPATH_NGPIO);
+ } else {
+ slowpath = kcalloc(2 * BITS_TO_LONGS(chip->ngpio),
+ sizeof(*slowpath),
+ can_sleep ? GFP_KERNEL : GFP_ATOMIC);
+ if (!slowpath)
+ return -ENOMEM;
+ mask = slowpath;
+ bits = slowpath + BITS_TO_LONGS(chip->ngpio);
}

-
if (!can_sleep)
WARN_ON(chip->can_sleep);

@@ -2478,8 +2477,8 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep,

ret = gpio_chip_get_multiple(chip, mask, bits);
if (ret) {
- kfree(bits);
- kfree(mask);
+ if (slowpath)
+ kfree(slowpath);
return ret;
}

@@ -2493,8 +2492,9 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep,
value_array[j] = value;
trace_gpio_value(desc_to_gpio(desc), 1, value);
}
- kfree(bits);
- kfree(mask);
+
+ if (slowpath)
+ kfree(slowpath);
}
return 0;
}
@@ -2699,24 +2699,22 @@ int gpiod_set_array_value_complex(bool raw, bool can_sleep,

while (i < array_size) {
struct gpio_chip *chip = desc_array[i]->gdev->chip;
- unsigned long *mask;
- unsigned long *bits;
+ unsigned long fastpath[2 * BITS_TO_LONGS(FASTPATH_NGPIO)];
+ unsigned long *slowpath = NULL, *mask, *bits;
int count = 0;

- mask = kcalloc(BITS_TO_LONGS(chip->ngpio),
- sizeof(*mask),
- can_sleep ? GFP_KERNEL : GFP_ATOMIC);
-
- if (!mask)
- return -ENOMEM;
-
- bits = kcalloc(BITS_TO_LONGS(chip->ngpio),
- sizeof(*bits),
- can_sleep ? GFP_KERNEL : GFP_ATOMIC);
-
- if (!bits) {
- kfree(mask);
- return -ENOMEM;
+ if (likely(chip->ngpio <= FASTPATH_NGPIO)) {
+ memset(fastpath, 0, sizeof(fastpath));
+ mask = fastpath;
+ bits = fastpath + BITS_TO_LONGS(FASTPATH_NGPIO);
+ } else {
+ slowpath = kcalloc(2 * BITS_TO_LONGS(chip->ngpio),
+ sizeof(*slowpath),
+ can_sleep ? GFP_KERNEL : GFP_ATOMIC);
+ if (!slowpath)
+ return -ENOMEM;
+ mask = slowpath;
+ bits = slowpath + BITS_TO_LONGS(chip->ngpio);
}

if (!can_sleep)
@@ -2753,8 +2751,8 @@ int gpiod_set_array_value_complex(bool raw, bool can_sleep,
if (count != 0)
gpio_chip_set_multiple(chip, mask, bits);

- kfree(mask);
- kfree(bits);
+ if (slowpath)
+ kfree(slowpath);
}
return 0;
}
--
2.16.2


2018-03-18 20:35:23

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH 1/4] gpio: Remove VLA from gpiolib

On 2018-03-18 15:23, Lukas Wunner wrote:
>>>
>>> Other random thoughts: maybe two allocations for each loop iteration is
>>> a bit much. Maybe do a first pass over the array and collect the maximal
>>> chip->ngpio, do the memory allocation and freeing outside the loop (then
>>> you'd of course need to preserve the memset() with appropriate length
>>> computed). And maybe even just do one allocation, making bits point at
>>> the second half.
>>
>> I think those are great ideas because the function is kind of a hotpath
>> and usage of VLAs was motivated by the desire to make it fast.
>>
>> I'd go one step further and store the maximum ngpio of all registered
>> chips in a global variable (and update it in gpiochip_add_data_with_key()),
>> then allocate 2 * max_ngpio once before entering the loop (as you've
>> suggested). That would avoid the first pass to determine the maximum
>> chip->ngpio. In most systems max_ngpio will be < 64, so one or two
>> unsigned longs depending on the arch's bitness.
>
> Actually, scratch that. If ngpio is usually smallish, we can just
> allocate reasonably sized space for mask and bits on the stack,

Yes.

> and fall back to the kcalloc slowpath only if chip->ngpio exceeds
> that limit.

Well, I'd suggest not adding that fallback code now, but simply add a
check in gpiochip_add_data_with_key to ensure ngpio is sane (and refuse
to register the chip otherwise), at least if we know that every
currently supported/known chip is covered by the 256 (?). That keeps the
code simple and fast, and then if somebody has a chip with 40000 gpio
lines, we can add a fallback path. Or we could consider alternative
solutions, to avoid a 10000 byte GFP_ATOMIC allocation (maybe hang a
pre-allocation off the gpio_chip; that's only two more bits per
descriptor, and there's already a whole gpio_desc for each - but not
sure about the locking in that case).

Rasmus

2018-03-19 07:02:52

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH 1/4] gpio: Remove VLA from gpiolib

On Sun, Mar 18, 2018 at 09:34:12PM +0100, Rasmus Villemoes wrote:
> On 2018-03-18 15:23, Lukas Wunner wrote:
> >>> Other random thoughts: maybe two allocations for each loop iteration is
> >>> a bit much. Maybe do a first pass over the array and collect the maximal
> >>> chip->ngpio, do the memory allocation and freeing outside the loop (then
> >>> you'd of course need to preserve the memset() with appropriate length
> >>> computed). And maybe even just do one allocation, making bits point at
> >>> the second half.
> >>
> >> I think those are great ideas because the function is kind of a hotpath
> >> and usage of VLAs was motivated by the desire to make it fast.
> >>
> >> I'd go one step further and store the maximum ngpio of all registered
> >> chips in a global variable (and update it in gpiochip_add_data_with_key()),
> >> then allocate 2 * max_ngpio once before entering the loop (as you've
> >> suggested). That would avoid the first pass to determine the maximum
> >> chip->ngpio. In most systems max_ngpio will be < 64, so one or two
> >> unsigned longs depending on the arch's bitness.
> >
> > Actually, scratch that. If ngpio is usually smallish, we can just
> > allocate reasonably sized space for mask and bits on the stack,
>
> Yes.
>
> > and fall back to the kcalloc slowpath only if chip->ngpio exceeds
> > that limit.
>
> Well, I'd suggest not adding that fallback code now, but simply add a
> check in gpiochip_add_data_with_key to ensure ngpio is sane (and refuse
> to register the chip otherwise), at least if we know that every
> currently supported/known chip is covered by the 256 (?).

The number 256 was arbitrarily chosen. I really wouldn't be surprised
if gpiochips with more pins exist, but they're probably rare, so using
the slowpath seems fine, but dropping support for them completely would
be a regression.

E.g. many serially attached chips such as MAX3191X are daisy-chainable
and the driver deliberately doesn't impose an upper limit on the number
of chips because the spec doesn't contain one either. To the OS a
daisy-chain of such chips appears as a single gpiochip with many pins.

Thanks,

Lukas

2018-03-19 15:11:29

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 1/4] gpio: Remove VLA from gpiolib

On Mon, Mar 19, 2018 at 9:00 AM, Lukas Wunner <[email protected]> wrote:
> On Sun, Mar 18, 2018 at 09:34:12PM +0100, Rasmus Villemoes wrote:
>> On 2018-03-18 15:23, Lukas Wunner wrote:

>> > Actually, scratch that. If ngpio is usually smallish, we can just
>> > allocate reasonably sized space for mask and bits on the stack,
>> > and fall back to the kcalloc slowpath only if chip->ngpio exceeds
>> > that limit.

>> Well, I'd suggest not adding that fallback code now, but simply add a
>> check in gpiochip_add_data_with_key to ensure ngpio is sane (and refuse
>> to register the chip otherwise), at least if we know that every
>> currently supported/known chip is covered by the 256 (?).
>
> The number 256 was arbitrarily chosen. I really wouldn't be surprised
> if gpiochips with more pins exist, but they're probably rare, so using
> the slowpath seems fine, but dropping support for them completely would
> be a regression.

All modern Intel SoCs have GPIO count in between of ~230-380.
Though, few of them are split to communities by (much) less than 256
pin in each when there is a 1:1 mapping between community and
gpiochip.

OTOH, the function you are fixing is most likely is not used together
with the drivers for x86.

--
With Best Regards,
Andy Shevchenko

2018-03-26 09:09:09

by Linus Walleij

[permalink] [raw]
Subject: Re: [PATCH 2/4] gpio: Remove VLA from MAX3191X driver

On Sat, Mar 10, 2018 at 1:10 AM, Laura Abbott <[email protected]> wrote:

>
> The new challenge is to remove VLAs from the kernel
> (see https://lkml.org/lkml/2018/3/7/621)
>
> This patch replaces several a VLA with an appropriate call to
> kmalloc_array.
>
> Signed-off-by: Laura Abbott <[email protected]>

Patch applied.

It's very simple and straight-forward after all.

Yours,
Linus Walleij

2018-03-26 09:10:27

by Linus Walleij

[permalink] [raw]
Subject: Re: [PATCH 3/4] gpio: Remove VLA from xra1403 driver

On Sat, Mar 10, 2018 at 1:10 AM, Laura Abbott <[email protected]> wrote:

> The new challenge is to remove VLAs from the kernel
> (see https://lkml.org/lkml/2018/3/7/621)
>
> This patch replaces a VLA with an appropriate call to kmalloc_array.
>
> Signed-off-by: Laura Abbott <[email protected]>

Patch applied with Nandor's review tag.

Yours,
Linus Walleij

2018-03-28 00:38:32

by Laura Abbott

[permalink] [raw]
Subject: Re: [PATCH 1/4] gpio: Remove VLA from gpiolib

On 03/18/2018 07:23 AM, Lukas Wunner wrote:
> On Sat, Mar 17, 2018 at 09:25:09AM +0100, Lukas Wunner wrote:
>> On Mon, Mar 12, 2018 at 04:00:36PM +0100, Rasmus Villemoes wrote:
>>> On 2018-03-10 01:10, Laura Abbott wrote:
>>>> @@ -2887,14 +2909,30 @@ void gpiod_set_array_value_complex(bool raw, bool can_sleep,
>>>>
>>>> while (i < array_size) {
>>>> struct gpio_chip *chip = desc_array[i]->gdev->chip;
>>>> - unsigned long mask[BITS_TO_LONGS(chip->ngpio)];
>>>> - unsigned long bits[BITS_TO_LONGS(chip->ngpio)];
>>>> + unsigned long *mask;
>>>> + unsigned long *bits;
>>>> int count = 0;
>>>>
>>>> + mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
>>>> + sizeof(*mask),
>>>> + can_sleep ? GFP_KERNEL : GFP_ATOMIC);
>>>> +
>>>> + if (!mask)
>>>> + return;
>>>> +
>>>> + bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
>>>> + sizeof(*bits),
>>>> + can_sleep ? GFP_KERNEL : GFP_ATOMIC);
>>>> +
>>>> + if (!bits) {
>>>> + kfree(mask);
>>>> + return;
>>>> + }
>>>> +
>>>> if (!can_sleep)
>>>> WARN_ON(chip->can_sleep);
>>>>
>>>> - memset(mask, 0, sizeof(mask));
>>>> + memset(mask, 0, sizeof(*mask));
>>>
>>> Other random thoughts: maybe two allocations for each loop iteration is
>>> a bit much. Maybe do a first pass over the array and collect the maximal
>>> chip->ngpio, do the memory allocation and freeing outside the loop (then
>>> you'd of course need to preserve the memset() with appropriate length
>>> computed). And maybe even just do one allocation, making bits point at
>>> the second half.
>>
>> I think those are great ideas because the function is kind of a hotpath
>> and usage of VLAs was motivated by the desire to make it fast.
>>
>> I'd go one step further and store the maximum ngpio of all registered
>> chips in a global variable (and update it in gpiochip_add_data_with_key()),
>> then allocate 2 * max_ngpio once before entering the loop (as you've
>> suggested). That would avoid the first pass to determine the maximum
>> chip->ngpio. In most systems max_ngpio will be < 64, so one or two
>> unsigned longs depending on the arch's bitness.
>
> Actually, scratch that. If ngpio is usually smallish, we can just
> allocate reasonably sized space for mask and bits on the stack,
> and fall back to the kcalloc slowpath only if chip->ngpio exceeds
> that limit. Basically the below (likewise compile-tested only),
> this is on top of Laura's patch, could be squashed together.
> Let me know what you think, thanks.
>

It seems like there's general consensus this is okay so I'm going
to fold it into the next version. If not, we can discuss again.

> -- >8 --
> Subject: [PATCH] gpio: Add fastpath to gpiod_get/set_array_value_complex()
>
> Signed-off-by: Lukas Wunner <[email protected]>
> ---
> drivers/gpio/gpiolib.c | 76 ++++++++++++++++++++++++--------------------------
> 1 file changed, 37 insertions(+), 39 deletions(-)
>
> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> index 429bc251392b..ffc67b0b866c 100644
> --- a/drivers/gpio/gpiolib.c
> +++ b/drivers/gpio/gpiolib.c
> @@ -2432,6 +2432,8 @@ static int gpio_chip_get_multiple(struct gpio_chip *chip,
> return -EIO;
> }
>
> +#define FASTPATH_NGPIO 256
> +
> int gpiod_get_array_value_complex(bool raw, bool can_sleep,
> unsigned int array_size,
> struct gpio_desc **desc_array,
> @@ -2441,27 +2443,24 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep,
>
> while (i < array_size) {
> struct gpio_chip *chip = desc_array[i]->gdev->chip;
> - unsigned long *mask;
> - unsigned long *bits;
> + unsigned long fastpath[2 * BITS_TO_LONGS(FASTPATH_NGPIO)];
> + unsigned long *slowpath = NULL, *mask, *bits;
> int first, j, ret;
>
> - mask = kcalloc(BITS_TO_LONGS(chip->ngpio),
> - sizeof(*mask),
> - can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> -
> - if (!mask)
> - return -ENOMEM;
> -
> - bits = kcalloc(BITS_TO_LONGS(chip->ngpio),
> - sizeof(*bits),
> - can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> -
> - if (!bits) {
> - kfree(mask);
> - return -ENOMEM;
> + if (likely(chip->ngpio <= FASTPATH_NGPIO)) {
> + memset(fastpath, 0, sizeof(fastpath));
> + mask = fastpath;
> + bits = fastpath + BITS_TO_LONGS(FASTPATH_NGPIO);
> + } else {
> + slowpath = kcalloc(2 * BITS_TO_LONGS(chip->ngpio),
> + sizeof(*slowpath),
> + can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> + if (!slowpath)
> + return -ENOMEM;
> + mask = slowpath;
> + bits = slowpath + BITS_TO_LONGS(chip->ngpio);
> }
>
> -
> if (!can_sleep)
> WARN_ON(chip->can_sleep);
>
> @@ -2478,8 +2477,8 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep,
>
> ret = gpio_chip_get_multiple(chip, mask, bits);
> if (ret) {
> - kfree(bits);
> - kfree(mask);
> + if (slowpath)
> + kfree(slowpath);
> return ret;
> }
>
> @@ -2493,8 +2492,9 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep,
> value_array[j] = value;
> trace_gpio_value(desc_to_gpio(desc), 1, value);
> }
> - kfree(bits);
> - kfree(mask);
> +
> + if (slowpath)
> + kfree(slowpath);
> }
> return 0;
> }
> @@ -2699,24 +2699,22 @@ int gpiod_set_array_value_complex(bool raw, bool can_sleep,
>
> while (i < array_size) {
> struct gpio_chip *chip = desc_array[i]->gdev->chip;
> - unsigned long *mask;
> - unsigned long *bits;
> + unsigned long fastpath[2 * BITS_TO_LONGS(FASTPATH_NGPIO)];
> + unsigned long *slowpath = NULL, *mask, *bits;
> int count = 0;
>
> - mask = kcalloc(BITS_TO_LONGS(chip->ngpio),
> - sizeof(*mask),
> - can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> -
> - if (!mask)
> - return -ENOMEM;
> -
> - bits = kcalloc(BITS_TO_LONGS(chip->ngpio),
> - sizeof(*bits),
> - can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> -
> - if (!bits) {
> - kfree(mask);
> - return -ENOMEM;
> + if (likely(chip->ngpio <= FASTPATH_NGPIO)) {
> + memset(fastpath, 0, sizeof(fastpath));
> + mask = fastpath;
> + bits = fastpath + BITS_TO_LONGS(FASTPATH_NGPIO);
> + } else {
> + slowpath = kcalloc(2 * BITS_TO_LONGS(chip->ngpio),
> + sizeof(*slowpath),
> + can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> + if (!slowpath)
> + return -ENOMEM;
> + mask = slowpath;
> + bits = slowpath + BITS_TO_LONGS(chip->ngpio);
> }
>
> if (!can_sleep)
> @@ -2753,8 +2751,8 @@ int gpiod_set_array_value_complex(bool raw, bool can_sleep,
> if (count != 0)
> gpio_chip_set_multiple(chip, mask, bits);
>
> - kfree(mask);
> - kfree(bits);
> + if (slowpath)
> + kfree(slowpath);
> }
> return 0;
> }
>


2018-03-28 03:56:31

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH 1/4] gpio: Remove VLA from gpiolib

On Tue, Mar 27, 2018 at 05:37:18PM -0700, Laura Abbott wrote:
> On 03/18/2018 07:23 AM, Lukas Wunner wrote:
> > Actually, scratch that. If ngpio is usually smallish, we can just
> > allocate reasonably sized space for mask and bits on the stack,
> > and fall back to the kcalloc slowpath only if chip->ngpio exceeds
> > that limit. Basically the below (likewise compile-tested only),
> > this is on top of Laura's patch, could be squashed together.
> > Let me know what you think, thanks.
> >
>
> It seems like there's general consensus this is okay so I'm going
> to fold it into the next version. If not, we can discuss again.

Yes, feel free to squash into your original patch with my S-o-b,
keep your authorship.

You may want to raise FASTPATH_NGPIO to something like 384, 448 or 512
to accommodate for the Intel chips Andy mentioned. It's kind of a
"640k should be enough for everyone" thing but I'd expect the performance
impact of the extra bytes on the stack / memsetting them to zero
to be absolutely negligible.

Thanks!

Lukas

2018-03-28 07:29:29

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH 3/4] gpio: Remove VLA from xra1403 driver

Hi Laura,

On Sat, Mar 10, 2018 at 1:10 AM, Laura Abbott <[email protected]> wrote:
> The new challenge is to remove VLAs from the kernel
> (see https://lkml.org/lkml/2018/3/7/621)
>
> This patch replaces a VLA with an appropriate call to kmalloc_array.
>
> Signed-off-by: Laura Abbott <[email protected]>

Thanks for your patch!

> --- a/drivers/gpio/gpio-xra1403.c
> +++ b/drivers/gpio/gpio-xra1403.c
> @@ -126,11 +126,16 @@ static void xra1403_dbg_show(struct seq_file *s, struct gpio_chip *chip)
> {
> int reg;
> struct xra1403 *xra = gpiochip_get_data(chip);
> - int value[xra1403_regmap_cfg.max_register];

Apparently xra1403_regmap_cfg.max_register is always 0x15?

What about adding

#define XRA_LAST 15

at the top, and replacing both "XRA_IFR | 0x01" and
xra1403_regmap_cfg.max_register by XRA_LAST instead?
That would avoid doing yet another memory allocation over and over.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2018-03-28 17:28:30

by Laura Abbott

[permalink] [raw]
Subject: Re: [PATCH 3/4] gpio: Remove VLA from xra1403 driver



On 03/28/2018 12:27 AM, Geert Uytterhoeven wrote:
> Hi Laura,
>
> On Sat, Mar 10, 2018 at 1:10 AM, Laura Abbott <[email protected]> wrote:
>> The new challenge is to remove VLAs from the kernel
>> (see https://lkml.org/lkml/2018/3/7/621)
>>
>> This patch replaces a VLA with an appropriate call to kmalloc_array.
>>
>> Signed-off-by: Laura Abbott <[email protected]>
>
> Thanks for your patch!
>
>> --- a/drivers/gpio/gpio-xra1403.c
>> +++ b/drivers/gpio/gpio-xra1403.c
>> @@ -126,11 +126,16 @@ static void xra1403_dbg_show(struct seq_file *s, struct gpio_chip *chip)
>> {
>> int reg;
>> struct xra1403 *xra = gpiochip_get_data(chip);
>> - int value[xra1403_regmap_cfg.max_register];
>
> Apparently xra1403_regmap_cfg.max_register is always 0x15?
>
> What about adding
>
> #define XRA_LAST 15
>
> at the top, and replacing both "XRA_IFR | 0x01" and
> xra1403_regmap_cfg.max_register by XRA_LAST instead?
> That would avoid doing yet another memory allocation over and over.
>
> Gr{oetje,eeting}s,
>
> Geert
>

I'm okay with making the change but I think Linus already picked
up the patch into his gpio trees. Linus, do you want a patch on
top of your -devel branch or should I just send a new patch?

Thanks,
Laura

2018-04-04 12:55:09

by Linus Walleij

[permalink] [raw]
Subject: Re: [PATCH 3/4] gpio: Remove VLA from xra1403 driver

On Wed, Mar 28, 2018 at 7:27 PM, Laura Abbott <[email protected]> wrote:
> On 03/28/2018 12:27 AM, Geert Uytterhoeven wrote:
>>
>> Hi Laura,
>>
>> On Sat, Mar 10, 2018 at 1:10 AM, Laura Abbott <[email protected]> wrote:
>>>
>>> The new challenge is to remove VLAs from the kernel
>>> (see https://lkml.org/lkml/2018/3/7/621)
>>>
>>> This patch replaces a VLA with an appropriate call to kmalloc_array.
>>>
>>> Signed-off-by: Laura Abbott <[email protected]>
>>
>>
>> Thanks for your patch!
>>
>>> --- a/drivers/gpio/gpio-xra1403.c
>>> +++ b/drivers/gpio/gpio-xra1403.c
>>> @@ -126,11 +126,16 @@ static void xra1403_dbg_show(struct seq_file *s,
>>> struct gpio_chip *chip)
>>> {
>>> int reg;
>>> struct xra1403 *xra = gpiochip_get_data(chip);
>>> - int value[xra1403_regmap_cfg.max_register];
>>
>>
>> Apparently xra1403_regmap_cfg.max_register is always 0x15?
>>
>> What about adding
>>
>> #define XRA_LAST 15
>>
>> at the top, and replacing both "XRA_IFR | 0x01" and
>> xra1403_regmap_cfg.max_register by XRA_LAST instead?
>> That would avoid doing yet another memory allocation over and over.
>>
>> Gr{oetje,eeting}s,
>>
>> Geert
>>
>
> I'm okay with making the change but I think Linus already picked
> up the patch into his gpio trees. Linus, do you want a patch on
> top of your -devel branch or should I just send a new patch?

Yeah a patch on top is fine, I sent my pull request to Torvalds
today so we can take this as a fix for the -rc cycle simply.

Yours,
Linus Walleij