Hi Linus,
Thanks for your feedback. I apologize for the late response, I didn’t receive any email notification due to the invalid email address "[email protected]". I have added the right address "[email protected]" and I am replying to your comments below (found and copied from https://www.spinics.net/lists/linux-gpio/msg58692.html)
Please send GPIO related patches to linux-gpio@xxxxxxxxxxxxxxx!
I will.
On Tue, Feb 23, 2021 at 11:51 PM Asmaa Mnebhi <Asmaa@xxxxxxxxxxxx> wrote:
>
> From: Asmaa Mnebhi <asmaa@xxxxxxxxxx>
>
> There are 3 possible GPIO interrupts which can be
> supported on BlueField-2 boards. Some BlueField boards support:
> 1) PHY interrupt only
> 2) PHY interrupt and Reset interrupt
> 3) Low power interrupt only
Is this a property of the GPIO block or more of a property of the
chip?
A bit of both. There are some boards which have only some of the above interrupts, some that don’t have any of those.
For example, the GPIO pin used for the PHY interrupt can vary from one board to another.
> There is one hardware line shared among all GPIOs among other
> things. So the interrupt controller checks whether the
> hardware interrupt is from a GPIO first. Then it checks which
> GPIO block it is for. And within the GPIO block, it checks
> which GPIO pin it is for.
> The "reset interrupt" and "low power interrupt" trigger a
> user space program.
Then they should be doing so using drivers in the proper kernel
subsystems, such as
drivers/power/reset/gpio-poweroff.c
drivers/power/reset/gpio-restart.c
Userspace has no business trying to intercept and take control
over such low level tasks as machine reset.
I am sorry I didn’t use the right words for this. the interrupt handler in this driver triggers an ACPI event which will in turn take care of initiating a script. The ACPI code does its magic independently of this driver. The function below " acpi_bus_generate_netlink_event" does that.
> The PHY interrupt is mapped to a linux
> IRQ and passed down to a PHY driver.
Then the phy driver should obtain its IRQ just like any other IRQ
in the system, the fact that it happens to be on a GPIO line
does not matter.
The issue here is if I pass the HW interrupt line (from the ACPI) driver to the PHY driver, the PHY interrupt will be triggered every time the HW interrupt happens, which could be coming from an I2C interrupt, an MDIO interrupt or another GPIO interrupt. I only want to be triggering this interrupt after we check that it is specific to GPIO pin X.
> The GPIO pin responsible for these interrupts may also change
> across boards.
That's fine, the hardware description model (I guess in your case
ACPI) should take care of that.
We cannot really pass it through the ACPI table because the ACPI table is common to all BlueField-2 boards. And each board may have a different GPIO pin associated with a particular function. This is why we use ACPI properties instead of GpioInt(). So that the bootloader can change the GPIO pin value based on the board id detected at boot time.
> The ACPI table contains a property which is assigned a valid
> GPIO pin number if the interrupt is supported on a particular
> BlueField-2 board. The bootloader will change that property
> based on the board id.
OK and then the kernel uses ACPI.
We have some ACPI experts on GPIO, and you must have noticed
that we have a special ACPI layer for gpiolib.
This is what should provide IRQ resources to your other drivers so they
can look them up with a simple
struct gpio_desc *gpiod = devm_gpiod_get(dev, ...)
please see my comment above.
Please
NOTE: you are using GPIOLIB_IRQCHIP so you need to
add
select GPIOLIB_IRQCHIP
to Kconfig for this driver.
Ok!
> -// SPDX-License-Identifier: GPL-2.0
> +// SPDX-License-Identifier: GPL-2.0-only or BSD-3-Clause
> +
> +/*
> + * Copyright (c) 2020-2021 NVIDIA Corporation.
> + */
Send this as a separate patch as it is only administrativa.
It's fine I think, you mostly wrote this driver.
Ok!
> +#include <linux/gpio/consumer.h>
This looks weird but let's see!
> +#define DRV_VERSION "1.2"
We usually don't make this kind of stuff, skip this.
It's not like the API changes.
ok!
> +#define YU_CAUSE_GPIO_ADDR 0x2801530
Shouldn't this address come from ACPI?
This resource is shared among all GPIO blocks (4 of them) while we want to map it only once. The ACPI table has an entry for each GPIO block along with its start address and offset.
We could pass YU_CAUSE_GPIO_ADDR through the ACPI table but we would need to invoke it from one GPIO block only (which is not consistent in my opinion). And we will need to add extra logic to make sure it is not mapped 4 times (since the probe is invoked 4 times, once for each GPIO block).
> +#define YU_CAUSE_GPIO_ADDR_SIZE 0x4
Does this mean it is four bytes? That should be implied I think.
I should have named it "OFFSET" instead of "SIZE".
> +static bool mlxbf2_gpio_is_acpi_event(u32 gpio_block, unsigned long gpio_pin,
> + struct mlxbf2_gpio_context *gs)
> +{
> + if (gpio_block & BIT(GPIO_BLOCK0)) {
> + if (gpio_pin & BIT(gs->rst_pin))
> + return true;
> + }
> + if (gpio_block & BIT(GPIO_BLOCK16)) {
> + if (gpio_pin & BIT(gs->low_pwr_pin))
> + return true;
> + }
> +
> + return false;
> +}
(...)
> + if (mlxbf2_gpio_is_acpi_event(gpio_block, gpio_pin, gs))
> + schedule_work(&gs->send_work);
So you determine that some line is an "ACPI event" using some hardware
registers? I don't know, this looks fragile.
Yes it is part of the interrupt controller. We don’t want to be triggering these events for all HW interrupts we receive. Every time an interrupt is triggered, the HW will write 3 register:
1) one which tells us whether the interrupt is from an I2C block, or MDIO block or GPIO block
2) one which tells us which GPIO block it is
3) one which tell us which GPIO pin within a block it is.
Once we have made the above check, we clear the interrupt in the handler, and the HW then knows to automatically clear the 3 above registers for us.
> + spin_lock_init(&gs->gc.bgpio_lock);
Why? This should be initialized by the core as an effect
of bgpio_init().
Will remove.
> + if (ret) {
> + dev_err(dev, "bgpio_init failed\n");
> + return ret;
> + }
Maybe a separate patch adding some debug print? Not the biggest thing but...
Ok!
> gc->direction_input = mlxbf2_gpio_direction_input;
> gc->direction_output = mlxbf2_gpio_direction_output;
> gc->ngpio = npins;
> gc->owner = THIS_MODULE;
> + gc->to_irq = mlxbf2_gpio_to_irq;
Drop this.
Ok!
> + /*
> + * PHY interrupt
> + */
> + ret = device_property_read_u32(dev, "phy-int-pin", &phy_int_pin);
> + if (ret < 0)
> + phy_int_pin = MLXBF2_GPIO_MAX_PINS_PER_BLOCK;
> +
> + /*
> + * OCP3.0 supports the low power mode interrupt.
> + */
> + ret = device_property_read_u32(dev, "low-pwr-pin", &low_pwr_pin);
> + if (ret < 0)
> + low_pwr_pin = MLXBF2_GPIO_MAX_PINS_PER_BLOCK;
> +
> + /*
> + * BlueSphere and the PRIS boards support the reset interrupt.
> + */
> + ret = device_property_read_u32(dev, "rst-pin", &rst_pin);
> + if (ret < 0)
> + rst_pin = MLXBF2_GPIO_MAX_PINS_PER_BLOCK;
> +
> + gs->phy_int_pin = phy_int_pin;
> + gs->low_pwr_pin = low_pwr_pin;
> + gs->rst_pin = rst_pin;
I see what you are doing but why on earth are these resources tied to
this interrupt controller? They should be resources for something else
in my mind, however I don't know much about ACPI.
Yes. It would belong in the ACPI table if we had a different ACPI table for each board. But unfortunately that is not the case. We have one ACPI table for several boards. And our HW pin selection is not consistent across board. So we need to use ACPI properties instead because that is the only ACPI table parameter that our bootloader can freely modify at boot time after detecting which board it is operating on☹.
> + irq = platform_get_irq(pdev, 0);
> + ret = devm_request_irq(dev, irq, mlxbf2_gpio_irq_handler,
> + IRQF_ONESHOT | IRQF_SHARED, name, gs);
Why is it oneshot? That is usually just useful for threaded IRQs.
This is because this HW interrupt is shared with the I2C interrupts. And our i2c driver needs it to be oneshot (already upstreamed as i2c-mlxbf2.c).
So we need to keep the flags consistent to be able to request the HW interrupt from here.
> + if (phy_int_pin != MLXBF2_GPIO_MAX_PINS_PER_BLOCK) {
> + /* Create phy irq mapping */
> + mlxbf2_gpio_to_irq(&gs->gc, phy_int_pin);
> + /* Enable sharing the irq domain with the PHY driver */
> + irq_set_default_host(gs->gc.irq.domain);
> + }
Ugh this is messy, we need to provide the IRQs out of the driver in
a clean way.
I couldn’t find a better example to pass a software interrupt to another driver. What would you suggest? Would you prefer to move this interrupt controller altogether to the PHY driver? and also have the corresponding controller in a new /driver/power/reset/ driver for the reset/low power pins?
Yours,
Linus Walleij
Hi Linus,
Did you have a chance to take a look at my reply below? I would greatly appreciate your feedback to decide how I should restructure this code moving forward.
Thank you.
Asmaa
-----Original Message-----
From: Asmaa Mnebhi
Sent: Wednesday, March 10, 2021 3:38 PM
To: Asmaa Mnebhi <[email protected]>; [email protected]; [email protected]
Cc: [email protected]
Subject: RE: [PATCH v1 1/1] gpio: Support interrupts in gpio-mlxbf2.c
Hi Linus,
Thanks for your feedback. I apologize for the late response, I didn’t receive any email notification due to the invalid email address "[email protected]". I have added the right address "[email protected]" and I am replying to your comments below (found and copied from https://www.spinics.net/lists/linux-gpio/msg58692.html)
Please send GPIO related patches to linux-gpio@xxxxxxxxxxxxxxx!
I will.
On Tue, Feb 23, 2021 at 11:51 PM Asmaa Mnebhi <Asmaa@xxxxxxxxxxxx> wrote:
>
> From: Asmaa Mnebhi <asmaa@xxxxxxxxxx>
>
> There are 3 possible GPIO interrupts which can be supported on
> BlueField-2 boards. Some BlueField boards support:
> 1) PHY interrupt only
> 2) PHY interrupt and Reset interrupt
> 3) Low power interrupt only
Is this a property of the GPIO block or more of a property of the chip?
A bit of both. There are some boards which have only some of the above interrupts, some that don’t have any of those.
For example, the GPIO pin used for the PHY interrupt can vary from one board to another.
> There is one hardware line shared among all GPIOs among other things.
> So the interrupt controller checks whether the hardware interrupt is
> from a GPIO first. Then it checks which GPIO block it is for. And
> within the GPIO block, it checks which GPIO pin it is for.
> The "reset interrupt" and "low power interrupt" trigger a user space
> program.
Then they should be doing so using drivers in the proper kernel subsystems, such as drivers/power/reset/gpio-poweroff.c
drivers/power/reset/gpio-restart.c
Userspace has no business trying to intercept and take control over such low level tasks as machine reset.
I am sorry I didn’t use the right words for this. the interrupt handler in this driver triggers an ACPI event which will in turn take care of initiating a script. The ACPI code does its magic independently of this driver. The function below " acpi_bus_generate_netlink_event" does that.
> The PHY interrupt is mapped to a linux IRQ and passed down to a PHY
> driver.
Then the phy driver should obtain its IRQ just like any other IRQ in the system, the fact that it happens to be on a GPIO line does not matter.
The issue here is if I pass the HW interrupt line (from the ACPI) driver to the PHY driver, the PHY interrupt will be triggered every time the HW interrupt happens, which could be coming from an I2C interrupt, an MDIO interrupt or another GPIO interrupt. I only want to be triggering this interrupt after we check that it is specific to GPIO pin X.
> The GPIO pin responsible for these interrupts may also change across
> boards.
That's fine, the hardware description model (I guess in your case
ACPI) should take care of that.
We cannot really pass it through the ACPI table because the ACPI table is common to all BlueField-2 boards. And each board may have a different GPIO pin associated with a particular function. This is why we use ACPI properties instead of GpioInt(). So that the bootloader can change the GPIO pin value based on the board id detected at boot time.
> The ACPI table contains a property which is assigned a valid GPIO pin
> number if the interrupt is supported on a particular
> BlueField-2 board. The bootloader will change that property based on
> the board id.
OK and then the kernel uses ACPI.
We have some ACPI experts on GPIO, and you must have noticed that we have a special ACPI layer for gpiolib.
This is what should provide IRQ resources to your other drivers so they can look them up with a simple struct gpio_desc *gpiod = devm_gpiod_get(dev, ...)
please see my comment above.
Please
NOTE: you are using GPIOLIB_IRQCHIP so you need to add
select GPIOLIB_IRQCHIP
to Kconfig for this driver.
Ok!
> -// SPDX-License-Identifier: GPL-2.0
> +// SPDX-License-Identifier: GPL-2.0-only or BSD-3-Clause
> +
> +/*
> + * Copyright (c) 2020-2021 NVIDIA Corporation.
> + */
Send this as a separate patch as it is only administrativa.
It's fine I think, you mostly wrote this driver.
Ok!
> +#include <linux/gpio/consumer.h>
This looks weird but let's see!
> +#define DRV_VERSION "1.2"
We usually don't make this kind of stuff, skip this.
It's not like the API changes.
ok!
> +#define YU_CAUSE_GPIO_ADDR 0x2801530
Shouldn't this address come from ACPI?
This resource is shared among all GPIO blocks (4 of them) while we want to map it only once. The ACPI table has an entry for each GPIO block along with its start address and offset.
We could pass YU_CAUSE_GPIO_ADDR through the ACPI table but we would need to invoke it from one GPIO block only (which is not consistent in my opinion). And we will need to add extra logic to make sure it is not mapped 4 times (since the probe is invoked 4 times, once for each GPIO block).
> +#define YU_CAUSE_GPIO_ADDR_SIZE 0x4
Does this mean it is four bytes? That should be implied I think.
I should have named it "OFFSET" instead of "SIZE".
> +static bool mlxbf2_gpio_is_acpi_event(u32 gpio_block, unsigned long gpio_pin,
> + struct mlxbf2_gpio_context *gs) {
> + if (gpio_block & BIT(GPIO_BLOCK0)) {
> + if (gpio_pin & BIT(gs->rst_pin))
> + return true;
> + }
> + if (gpio_block & BIT(GPIO_BLOCK16)) {
> + if (gpio_pin & BIT(gs->low_pwr_pin))
> + return true;
> + }
> +
> + return false;
> +}
(...)
> + if (mlxbf2_gpio_is_acpi_event(gpio_block, gpio_pin, gs))
> + schedule_work(&gs->send_work);
So you determine that some line is an "ACPI event" using some hardware registers? I don't know, this looks fragile.
Yes it is part of the interrupt controller. We don’t want to be triggering these events for all HW interrupts we receive. Every time an interrupt is triggered, the HW will write 3 register:
1) one which tells us whether the interrupt is from an I2C block, or MDIO block or GPIO block
2) one which tells us which GPIO block it is
3) one which tell us which GPIO pin within a block it is.
Once we have made the above check, we clear the interrupt in the handler, and the HW then knows to automatically clear the 3 above registers for us.
> + spin_lock_init(&gs->gc.bgpio_lock);
Why? This should be initialized by the core as an effect of bgpio_init().
Will remove.
> + if (ret) {
> + dev_err(dev, "bgpio_init failed\n");
> + return ret;
> + }
Maybe a separate patch adding some debug print? Not the biggest thing but...
Ok!
> gc->direction_input = mlxbf2_gpio_direction_input;
> gc->direction_output = mlxbf2_gpio_direction_output;
> gc->ngpio = npins;
> gc->owner = THIS_MODULE;
> + gc->to_irq = mlxbf2_gpio_to_irq;
Drop this.
Ok!
> + /*
> + * PHY interrupt
> + */
> + ret = device_property_read_u32(dev, "phy-int-pin", &phy_int_pin);
> + if (ret < 0)
> + phy_int_pin = MLXBF2_GPIO_MAX_PINS_PER_BLOCK;
> +
> + /*
> + * OCP3.0 supports the low power mode interrupt.
> + */
> + ret = device_property_read_u32(dev, "low-pwr-pin", &low_pwr_pin);
> + if (ret < 0)
> + low_pwr_pin = MLXBF2_GPIO_MAX_PINS_PER_BLOCK;
> +
> + /*
> + * BlueSphere and the PRIS boards support the reset interrupt.
> + */
> + ret = device_property_read_u32(dev, "rst-pin", &rst_pin);
> + if (ret < 0)
> + rst_pin = MLXBF2_GPIO_MAX_PINS_PER_BLOCK;
> +
> + gs->phy_int_pin = phy_int_pin;
> + gs->low_pwr_pin = low_pwr_pin;
> + gs->rst_pin = rst_pin;
I see what you are doing but why on earth are these resources tied to this interrupt controller? They should be resources for something else in my mind, however I don't know much about ACPI.
Yes. It would belong in the ACPI table if we had a different ACPI table for each board. But unfortunately that is not the case. We have one ACPI table for several boards. And our HW pin selection is not consistent across board. So we need to use ACPI properties instead because that is the only ACPI table parameter that our bootloader can freely modify at boot time after detecting which board it is operating on☹.
> + irq = platform_get_irq(pdev, 0);
> + ret = devm_request_irq(dev, irq, mlxbf2_gpio_irq_handler,
> + IRQF_ONESHOT | IRQF_SHARED,
> + name, gs);
Why is it oneshot? That is usually just useful for threaded IRQs.
This is because this HW interrupt is shared with the I2C interrupts. And our i2c driver needs it to be oneshot (already upstreamed as i2c-mlxbf2.c).
So we need to keep the flags consistent to be able to request the HW interrupt from here.
> + if (phy_int_pin != MLXBF2_GPIO_MAX_PINS_PER_BLOCK) {
> + /* Create phy irq mapping */
> + mlxbf2_gpio_to_irq(&gs->gc, phy_int_pin);
> + /* Enable sharing the irq domain with the PHY driver */
> + irq_set_default_host(gs->gc.irq.domain);
> + }
Ugh this is messy, we need to provide the IRQs out of the driver in a clean way.
I couldn’t find a better example to pass a software interrupt to another driver. What would you suggest? Would you prefer to move this interrupt controller altogether to the PHY driver? and also have the corresponding controller in a new /driver/power/reset/ driver for the reset/low power pins?
Yours,
Linus Walleij
> We cannot really pass it through the ACPI table because the ACPI
> table is common to all BlueField-2 boards. And each board may have
> a different GPIO pin associated with a particular function. This is
> why we use ACPI properties instead of GpioInt(). So that the
> bootloader can change the GPIO pin value based on the board id
> detected at boot time.
That sounds very broken.
ACPI describes the hardware. If the hardware is different, you need
different ACPI. And i assume the ACPI spec says GpioInt() is the
correct way to do this, and does not say you can hack around
limitations of your bootloader using properties?
Andrew
On Fri, Mar 19, 2021 at 3:24 PM Andrew Lunn <[email protected]> wrote:
>
> > We cannot really pass it through the ACPI table because the ACPI
> > table is common to all BlueField-2 boards. And each board may have
> > a different GPIO pin associated with a particular function. This is
> > why we use ACPI properties instead of GpioInt(). So that the
> > bootloader can change the GPIO pin value based on the board id
> > detected at boot time.
>
> That sounds very broken.
>
> ACPI describes the hardware. If the hardware is different, you need
> different ACPI. And i assume the ACPI spec says GpioInt() is the
> correct way to do this, and does not say you can hack around
> limitations of your bootloader using properties?
It seems my reply didn't make the mailing list, but I'm on the same
page with you.
On x86 boards the difference is usually provided by firmware via NVS
and corresponding macro(s).
One may google for any DSDT for x86 and check for, for instance,
GNUM() macro and Co.
--
With Best Regards,
Andy Shevchenko
On Wed, Mar 10, 2021 at 9:38 PM Asmaa Mnebhi <[email protected]> wrote:
> > That's fine, the hardware description model (I guess in your case
> > ACPI) should take care of that.
> >
> We cannot really pass it through the ACPI table because the ACPI
> table is common to all BlueField-2 boards. And each board may have
> a different GPIO pin associated with a particular function. This is
> why we use ACPI properties instead of GpioInt(). So that the
> bootloader can change the GPIO pin value based on the board
> id detected at boot time.
(...)
> Yes. It would belong in the ACPI table if we had a different ACPI
> table for each board. But unfortunately that is not the case.
You have to agree with Andy about all ACPI details.
Andy is the ACPI GPIO maintainer and we cannot merge
a patch with any kind of ACPI support without his ACK,
so hash it out as he wants it. The only people on the
planet that can make me think otherwise is if Rafael
Wysocki and Mika Westerberg say something else,
which is *extremely* unlikely.
If you need to do workarounds because of broken ACPI
tables, you need to convince the ACPI maintainers that
there is no way you can fix these tables so it needs
to be fixed in the kernel. It is not something for the
GPIO maintainers to decide.
To continue that argument please mail these people in
the MAINTAINERS file, Andy and Mika Westerberg and have
a discussion with the kernel ACPI community:
ACPI
M: "Rafael J. Wysocki" <[email protected]>
M: Len Brown <[email protected]>
L: [email protected]
Yours,
Linus Walleij
On Mon, Mar 22, 2021 at 01:41:58PM +0100, Linus Walleij wrote:
> On Wed, Mar 10, 2021 at 9:38 PM Asmaa Mnebhi <[email protected]> wrote:
>
> > > That's fine, the hardware description model (I guess in your case
> > > ACPI) should take care of that.
> > >
> > We cannot really pass it through the ACPI table because the ACPI
> > table is common to all BlueField-2 boards. And each board may have
> > a different GPIO pin associated with a particular function. This is
> > why we use ACPI properties instead of GpioInt(). So that the
> > bootloader can change the GPIO pin value based on the board
> > id detected at boot time.
> (...)
> > Yes. It would belong in the ACPI table if we had a different ACPI
> > table for each board. But unfortunately that is not the case.
>
> You have to agree with Andy about all ACPI details.
>
> Andy is the ACPI GPIO maintainer and we cannot merge
> a patch with any kind of ACPI support without his ACK,
> so hash it out as he wants it. The only people on the
> planet that can make me think otherwise is if Rafael
> Wysocki and Mika Westerberg say something else,
> which is *extremely* unlikely.
+1
And given this is burried inside a network driver, you are also going
to get push back from the networking maintainers to do this correctly.
Andrew