The i801 controller provides a locking mechanism that the OS is supposed
to use to safely share the SMBus with ACPI AML or other firmware.
Previously, Linux attempted to get out of the way of ACPI AML entirely,
but left the bus locked if it used it before the first AML access. This
causes AML implementations that *do* attempt to safely share the bus
to time out if Linux uses it first; notably, this regressed ACPI video
backlight controls on 2015 iMacs after 01590f361e started instantiating
SPD EEPROMs on boot.
Commit 065b6211a8 fixed the immediate problem of leaving the bus locked,
but we can do better. The controller does have a proper locking mechanism,
so let's use it as intended. Since we can't rely on the BIOS doing this
properly, we implement the following logic:
- If ACPI AML uses the bus at all, we make a note and disable power
management. The latter matches already existing behavior.
- When we want to use the bus, we attempt to lock it first. If the
locking attempt times out, *and* ACPI hasn't tried to use the bus at
all yet, we cautiously go ahead and assume the BIOS forgot to unlock
the bus after boot. This preserves existing behavior.
- We always unlock the bus after a transfer.
- If ACPI AML tries to use the bus (except trying to lock it) while
we're in the middle of a transfer, or after we've determined
locking is broken, we know we cannot safely share the bus and give up.
Upon first usage of SMBus by ACPI AML, if nothing has gone horribly
wrong so far, users will see:
i801_smbus 0000:00:1f.4: SMBus controller is shared with ACPI AML. This seems safe so far.
If locking the SMBus times out, users will see:
i801_smbus 0000:00:1f.4: BIOS left SMBus locked
And if ACPI AML tries to use the bus concurrently with Linux, or it
previously used the bus and we failed to subsequently lock it as
above, the driver will give up and users will get:
i801_smbus 0000:00:1f.4: BIOS uses SMBus unsafely
i801_smbus 0000:00:1f.4: Driver SMBus register access inhibited
This fixes the regression introduced by 01590f361e, and further allows
safely sharing the SMBus on 2015 iMacs. Tested by running `i2cdump` in a
loop while changing backlight levels via the ACPI video device.
Fixes: 01590f361e ("i2c: i801: Instantiate SPD EEPROMs automatically")
Cc: <[email protected]>
Signed-off-by: Hector Martin <[email protected]>
---
drivers/i2c/busses/i2c-i801.c | 96 ++++++++++++++++++++++++++++-------
1 file changed, 79 insertions(+), 17 deletions(-)
diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c
index 04a1e38f2a6f..03be6310d6d7 100644
--- a/drivers/i2c/busses/i2c-i801.c
+++ b/drivers/i2c/busses/i2c-i801.c
@@ -287,11 +287,18 @@ struct i801_priv {
#endif
struct platform_device *tco_pdev;
+ /* BIOS left the controller marked busy. */
+ bool inuse_stuck;
/*
- * If set to true the host controller registers are reserved for
- * ACPI AML use. Protected by acpi_lock.
+ * If set to true, ACPI AML uses the host controller registers.
+ * Protected by acpi_lock.
*/
- bool acpi_reserved;
+ bool acpi_usage;
+ /*
+ * If set to true, ACPI AML uses the host controller registers in an
+ * unsafe way. Protected by acpi_lock.
+ */
+ bool acpi_unsafe;
struct mutex acpi_lock;
};
@@ -854,10 +861,37 @@ static s32 i801_access(struct i2c_adapter *adap, u16 addr,
int hwpec;
int block = 0;
int ret = 0, xact = 0;
+ int timeout = 0;
struct i801_priv *priv = i2c_get_adapdata(adap);
+ /*
+ * The controller provides a bit that implements a mutex mechanism
+ * between users of the bus. First, try to lock the hardware mutex.
+ * If this doesn't work, we give up trying to do this, but then
+ * bail if ACPI uses SMBus at all.
+ */
+ if (!priv->inuse_stuck) {
+ while (inb_p(SMBHSTSTS(priv)) & SMBHSTSTS_INUSE_STS) {
+ if (++timeout >= MAX_RETRIES) {
+ dev_warn(&priv->pci_dev->dev,
+ "BIOS left SMBus locked\n");
+ priv->inuse_stuck = true;
+ break;
+ }
+ usleep_range(250, 500);
+ }
+ }
+
mutex_lock(&priv->acpi_lock);
- if (priv->acpi_reserved) {
+ if (priv->acpi_usage && priv->inuse_stuck && !priv->acpi_unsafe) {
+ priv->acpi_unsafe = true;
+
+ dev_warn(&priv->pci_dev->dev, "BIOS uses SMBus unsafely\n");
+ dev_warn(&priv->pci_dev->dev,
+ "Driver SMBus register access inhibited\n");
+ }
+
+ if (priv->acpi_unsafe) {
mutex_unlock(&priv->acpi_lock);
return -EBUSY;
}
@@ -1639,6 +1673,16 @@ static bool i801_acpi_is_smbus_ioport(const struct i801_priv *priv,
address <= pci_resource_end(priv->pci_dev, SMBBAR);
}
+static acpi_status
+i801_acpi_do_access(u32 function, acpi_physical_address address,
+ u32 bits, u64 *value)
+{
+ if ((function & ACPI_IO_MASK) == ACPI_READ)
+ return acpi_os_read_port(address, (u32 *)value, bits);
+ else
+ return acpi_os_write_port(address, (u32)*value, bits);
+}
+
static acpi_status
i801_acpi_io_handler(u32 function, acpi_physical_address address, u32 bits,
u64 *value, void *handler_context, void *region_context)
@@ -1648,17 +1692,38 @@ i801_acpi_io_handler(u32 function, acpi_physical_address address, u32 bits,
acpi_status status;
/*
- * Once BIOS AML code touches the OpRegion we warn and inhibit any
- * further access from the driver itself. This device is now owned
- * by the system firmware.
+ * Non-i801 accesses pass through.
*/
- mutex_lock(&priv->acpi_lock);
+ if (!i801_acpi_is_smbus_ioport(priv, address))
+ return i801_acpi_do_access(function, address, bits, value);
- if (!priv->acpi_reserved && i801_acpi_is_smbus_ioport(priv, address)) {
- priv->acpi_reserved = true;
+ if (!mutex_trylock(&priv->acpi_lock)) {
+ mutex_lock(&priv->acpi_lock);
+ /*
+ * This better be a read of the status register to acquire
+ * the lock...
+ */
+ if (!priv->acpi_unsafe &&
+ !(address == SMBHSTSTS(priv) &&
+ (function & ACPI_IO_MASK) == ACPI_READ)) {
+ /*
+ * Uh-oh, ACPI AML is trying to do something with the
+ * controller without locking it properly.
+ */
+ priv->acpi_unsafe = true;
+
+ dev_warn(&pdev->dev, "BIOS uses SMBus unsafely\n");
+ dev_warn(&pdev->dev,
+ "Driver SMBus register access inhibited\n");
+ }
+ }
- dev_warn(&pdev->dev, "BIOS is accessing SMBus registers\n");
- dev_warn(&pdev->dev, "Driver SMBus register access inhibited\n");
+ if (!priv->acpi_usage) {
+ priv->acpi_usage = true;
+
+ if (!priv->acpi_unsafe)
+ dev_info(&pdev->dev,
+ "SMBus controller is shared with ACPI AML. This seems safe so far.\n");
/*
* BIOS is accessing the host controller so prevent it from
@@ -1667,10 +1732,7 @@ i801_acpi_io_handler(u32 function, acpi_physical_address address, u32 bits,
pm_runtime_get_sync(&pdev->dev);
}
- if ((function & ACPI_IO_MASK) == ACPI_READ)
- status = acpi_os_read_port(address, (u32 *)value, bits);
- else
- status = acpi_os_write_port(address, (u32)*value, bits);
+ status = i801_acpi_do_access(function, address, bits, value);
mutex_unlock(&priv->acpi_lock);
@@ -1706,7 +1768,7 @@ static void i801_acpi_remove(struct i801_priv *priv)
ACPI_ADR_SPACE_SYSTEM_IO, i801_acpi_io_handler);
mutex_lock(&priv->acpi_lock);
- if (priv->acpi_reserved)
+ if (priv->acpi_usage)
pm_runtime_put(&priv->pci_dev->dev);
mutex_unlock(&priv->acpi_lock);
}
--
2.32.0
On Sat, Jun 26, 2021 at 02:41:13PM +0900, Hector Martin wrote:
> The i801 controller provides a locking mechanism that the OS is supposed
> to use to safely share the SMBus with ACPI AML or other firmware.
>
> Previously, Linux attempted to get out of the way of ACPI AML entirely,
> but left the bus locked if it used it before the first AML access. This
> causes AML implementations that *do* attempt to safely share the bus
> to time out if Linux uses it first; notably, this regressed ACPI video
> backlight controls on 2015 iMacs after 01590f361e started instantiating
> SPD EEPROMs on boot.
>
> Commit 065b6211a8 fixed the immediate problem of leaving the bus locked,
> but we can do better. The controller does have a proper locking mechanism,
> so let's use it as intended. Since we can't rely on the BIOS doing this
> properly, we implement the following logic:
>
> - If ACPI AML uses the bus at all, we make a note and disable power
> management. The latter matches already existing behavior.
> - When we want to use the bus, we attempt to lock it first. If the
> locking attempt times out, *and* ACPI hasn't tried to use the bus at
> all yet, we cautiously go ahead and assume the BIOS forgot to unlock
> the bus after boot. This preserves existing behavior.
> - We always unlock the bus after a transfer.
> - If ACPI AML tries to use the bus (except trying to lock it) while
> we're in the middle of a transfer, or after we've determined
> locking is broken, we know we cannot safely share the bus and give up.
>
> Upon first usage of SMBus by ACPI AML, if nothing has gone horribly
> wrong so far, users will see:
>
> i801_smbus 0000:00:1f.4: SMBus controller is shared with ACPI AML. This seems safe so far.
>
> If locking the SMBus times out, users will see:
>
> i801_smbus 0000:00:1f.4: BIOS left SMBus locked
>
> And if ACPI AML tries to use the bus concurrently with Linux, or it
> previously used the bus and we failed to subsequently lock it as
> above, the driver will give up and users will get:
>
> i801_smbus 0000:00:1f.4: BIOS uses SMBus unsafely
> i801_smbus 0000:00:1f.4: Driver SMBus register access inhibited
>
> This fixes the regression introduced by 01590f361e, and further allows
> safely sharing the SMBus on 2015 iMacs. Tested by running `i2cdump` in a
> loop while changing backlight levels via the ACPI video device.
>
> Fixes: 01590f361e ("i2c: i801: Instantiate SPD EEPROMs automatically")
> Cc: <[email protected]>
> Signed-off-by: Hector Martin <[email protected]>
Jean, Heiner, what do we do with this topic?
> ---
> drivers/i2c/busses/i2c-i801.c | 96 ++++++++++++++++++++++++++++-------
> 1 file changed, 79 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c
> index 04a1e38f2a6f..03be6310d6d7 100644
> --- a/drivers/i2c/busses/i2c-i801.c
> +++ b/drivers/i2c/busses/i2c-i801.c
> @@ -287,11 +287,18 @@ struct i801_priv {
> #endif
> struct platform_device *tco_pdev;
>
> + /* BIOS left the controller marked busy. */
> + bool inuse_stuck;
> /*
> - * If set to true the host controller registers are reserved for
> - * ACPI AML use. Protected by acpi_lock.
> + * If set to true, ACPI AML uses the host controller registers.
> + * Protected by acpi_lock.
> */
> - bool acpi_reserved;
> + bool acpi_usage;
> + /*
> + * If set to true, ACPI AML uses the host controller registers in an
> + * unsafe way. Protected by acpi_lock.
> + */
> + bool acpi_unsafe;
> struct mutex acpi_lock;
> };
>
> @@ -854,10 +861,37 @@ static s32 i801_access(struct i2c_adapter *adap, u16 addr,
> int hwpec;
> int block = 0;
> int ret = 0, xact = 0;
> + int timeout = 0;
> struct i801_priv *priv = i2c_get_adapdata(adap);
>
> + /*
> + * The controller provides a bit that implements a mutex mechanism
> + * between users of the bus. First, try to lock the hardware mutex.
> + * If this doesn't work, we give up trying to do this, but then
> + * bail if ACPI uses SMBus at all.
> + */
> + if (!priv->inuse_stuck) {
> + while (inb_p(SMBHSTSTS(priv)) & SMBHSTSTS_INUSE_STS) {
> + if (++timeout >= MAX_RETRIES) {
> + dev_warn(&priv->pci_dev->dev,
> + "BIOS left SMBus locked\n");
> + priv->inuse_stuck = true;
> + break;
> + }
> + usleep_range(250, 500);
> + }
> + }
> +
> mutex_lock(&priv->acpi_lock);
> - if (priv->acpi_reserved) {
> + if (priv->acpi_usage && priv->inuse_stuck && !priv->acpi_unsafe) {
> + priv->acpi_unsafe = true;
> +
> + dev_warn(&priv->pci_dev->dev, "BIOS uses SMBus unsafely\n");
> + dev_warn(&priv->pci_dev->dev,
> + "Driver SMBus register access inhibited\n");
> + }
> +
> + if (priv->acpi_unsafe) {
> mutex_unlock(&priv->acpi_lock);
> return -EBUSY;
> }
> @@ -1639,6 +1673,16 @@ static bool i801_acpi_is_smbus_ioport(const struct i801_priv *priv,
> address <= pci_resource_end(priv->pci_dev, SMBBAR);
> }
>
> +static acpi_status
> +i801_acpi_do_access(u32 function, acpi_physical_address address,
> + u32 bits, u64 *value)
> +{
> + if ((function & ACPI_IO_MASK) == ACPI_READ)
> + return acpi_os_read_port(address, (u32 *)value, bits);
> + else
> + return acpi_os_write_port(address, (u32)*value, bits);
> +}
> +
> static acpi_status
> i801_acpi_io_handler(u32 function, acpi_physical_address address, u32 bits,
> u64 *value, void *handler_context, void *region_context)
> @@ -1648,17 +1692,38 @@ i801_acpi_io_handler(u32 function, acpi_physical_address address, u32 bits,
> acpi_status status;
>
> /*
> - * Once BIOS AML code touches the OpRegion we warn and inhibit any
> - * further access from the driver itself. This device is now owned
> - * by the system firmware.
> + * Non-i801 accesses pass through.
> */
> - mutex_lock(&priv->acpi_lock);
> + if (!i801_acpi_is_smbus_ioport(priv, address))
> + return i801_acpi_do_access(function, address, bits, value);
>
> - if (!priv->acpi_reserved && i801_acpi_is_smbus_ioport(priv, address)) {
> - priv->acpi_reserved = true;
> + if (!mutex_trylock(&priv->acpi_lock)) {
> + mutex_lock(&priv->acpi_lock);
> + /*
> + * This better be a read of the status register to acquire
> + * the lock...
> + */
> + if (!priv->acpi_unsafe &&
> + !(address == SMBHSTSTS(priv) &&
> + (function & ACPI_IO_MASK) == ACPI_READ)) {
> + /*
> + * Uh-oh, ACPI AML is trying to do something with the
> + * controller without locking it properly.
> + */
> + priv->acpi_unsafe = true;
> +
> + dev_warn(&pdev->dev, "BIOS uses SMBus unsafely\n");
> + dev_warn(&pdev->dev,
> + "Driver SMBus register access inhibited\n");
> + }
> + }
>
> - dev_warn(&pdev->dev, "BIOS is accessing SMBus registers\n");
> - dev_warn(&pdev->dev, "Driver SMBus register access inhibited\n");
> + if (!priv->acpi_usage) {
> + priv->acpi_usage = true;
> +
> + if (!priv->acpi_unsafe)
> + dev_info(&pdev->dev,
> + "SMBus controller is shared with ACPI AML. This seems safe so far.\n");
>
> /*
> * BIOS is accessing the host controller so prevent it from
> @@ -1667,10 +1732,7 @@ i801_acpi_io_handler(u32 function, acpi_physical_address address, u32 bits,
> pm_runtime_get_sync(&pdev->dev);
> }
>
> - if ((function & ACPI_IO_MASK) == ACPI_READ)
> - status = acpi_os_read_port(address, (u32 *)value, bits);
> - else
> - status = acpi_os_write_port(address, (u32)*value, bits);
> + status = i801_acpi_do_access(function, address, bits, value);
>
> mutex_unlock(&priv->acpi_lock);
>
> @@ -1706,7 +1768,7 @@ static void i801_acpi_remove(struct i801_priv *priv)
> ACPI_ADR_SPACE_SYSTEM_IO, i801_acpi_io_handler);
>
> mutex_lock(&priv->acpi_lock);
> - if (priv->acpi_reserved)
> + if (priv->acpi_usage)
> pm_runtime_put(&priv->pci_dev->dev);
> mutex_unlock(&priv->acpi_lock);
> }
> --
> 2.32.0
>
On Mon, 29 Nov 2021 10:00:55 +0100, Wolfram Sang wrote:
> On Sat, Jun 26, 2021 at 02:41:13PM +0900, Hector Martin wrote:
> > The i801 controller provides a locking mechanism that the OS is supposed
> > to use to safely share the SMBus with ACPI AML or other firmware.
> >
> > Previously, Linux attempted to get out of the way of ACPI AML entirely,
> > but left the bus locked if it used it before the first AML access. This
> > causes AML implementations that *do* attempt to safely share the bus
> > to time out if Linux uses it first; notably, this regressed ACPI video
> > backlight controls on 2015 iMacs after 01590f361e started instantiating
> > SPD EEPROMs on boot.
> >
> > Commit 065b6211a8 fixed the immediate problem of leaving the bus locked,
> > but we can do better. The controller does have a proper locking mechanism,
> > so let's use it as intended. Since we can't rely on the BIOS doing this
> > properly, we implement the following logic:
> >
> > - If ACPI AML uses the bus at all, we make a note and disable power
> > management. The latter matches already existing behavior.
> > - When we want to use the bus, we attempt to lock it first. If the
> > locking attempt times out, *and* ACPI hasn't tried to use the bus at
> > all yet, we cautiously go ahead and assume the BIOS forgot to unlock
> > the bus after boot. This preserves existing behavior.
> > - We always unlock the bus after a transfer.
> > - If ACPI AML tries to use the bus (except trying to lock it) while
> > we're in the middle of a transfer, or after we've determined
> > locking is broken, we know we cannot safely share the bus and give up.
> >
> > Upon first usage of SMBus by ACPI AML, if nothing has gone horribly
> > wrong so far, users will see:
> >
> > i801_smbus 0000:00:1f.4: SMBus controller is shared with ACPI AML. This seems safe so far.
> >
> > If locking the SMBus times out, users will see:
> >
> > i801_smbus 0000:00:1f.4: BIOS left SMBus locked
> >
> > And if ACPI AML tries to use the bus concurrently with Linux, or it
> > previously used the bus and we failed to subsequently lock it as
> > above, the driver will give up and users will get:
> >
> > i801_smbus 0000:00:1f.4: BIOS uses SMBus unsafely
> > i801_smbus 0000:00:1f.4: Driver SMBus register access inhibited
> >
> > This fixes the regression introduced by 01590f361e, and further allows
> > safely sharing the SMBus on 2015 iMacs. Tested by running `i2cdump` in a
> > loop while changing backlight levels via the ACPI video device.
> >
> > Fixes: 01590f361e ("i2c: i801: Instantiate SPD EEPROMs automatically")
> > Cc: <[email protected]>
> > Signed-off-by: Hector Martin <[email protected]>
>
> Jean, Heiner, what do we do with this topic?
I like the idea, I need to give it a try and review the code.
--
Jean Delvare
SUSE L3 Support
Tested-by: Alex Henrie <[email protected]>
Tested on a SECO SBC-B68 and a UDOO X86.
The BIOS AML code queries the Embedded Controller over SMBus,
respecting the hardware semaphore implementation.
I get this line on kernel log and everything works as expected.
[ 7.270172] i801_smbus 0000:00:1f.3: SMBus controller is shared with ACPI AML. This seems safe so far.
Tested with continous use of i2c-tools (i2cdump) with temperature reads
in thermal_zone (that triggers AML code).
Tested-by: Ettore Chimenti <[email protected]>
Hi all,
I know it's been a while, and I have to apologize for that. The
original plan was to merge the patch series from Heiner Kallweit first,
but as it turns out, it was never resubmitted after my (admittedly
slow) review and things got stuck.
Anyway, I finally took some time to review this patch. I have a good
headache now, and some comments below.
On Sat, 26 Jun 2021 14:41:13 +0900, Hector Martin wrote:
> The i801 controller provides a locking mechanism that the OS is supposed
> to use to safely share the SMBus with ACPI AML or other firmware.
>
> Previously, Linux attempted to get out of the way of ACPI AML entirely,
> but left the bus locked if it used it before the first AML access. This
> causes AML implementations that *do* attempt to safely share the bus
> to time out if Linux uses it first; notably, this regressed ACPI video
> backlight controls on 2015 iMacs after 01590f361e started instantiating
> SPD EEPROMs on boot.
>
> Commit 065b6211a8 fixed the immediate problem of leaving the bus locked,
> but we can do better. The controller does have a proper locking mechanism,
> so let's use it as intended. Since we can't rely on the BIOS doing this
> properly, we implement the following logic:
>
> - If ACPI AML uses the bus at all, we make a note and disable power
> management. The latter matches already existing behavior.
> - When we want to use the bus, we attempt to lock it first. If the
> locking attempt times out, *and* ACPI hasn't tried to use the bus at
> all yet, we cautiously go ahead and assume the BIOS forgot to unlock
> the bus after boot. This preserves existing behavior.
> - We always unlock the bus after a transfer.
If I'm not mistaken, this also matches already existing behavior (since
commit 065b6211a8).
> - If ACPI AML tries to use the bus (except trying to lock it) while
> we're in the middle of a transfer, or after we've determined
> locking is broken, we know we cannot safely share the bus and give up.
>
> Upon first usage of SMBus by ACPI AML, if nothing has gone horribly
> wrong so far, users will see:
>
> i801_smbus 0000:00:1f.4: SMBus controller is shared with ACPI AML. This seems safe so far.
>
> If locking the SMBus times out, users will see:
>
> i801_smbus 0000:00:1f.4: BIOS left SMBus locked
>
> And if ACPI AML tries to use the bus concurrently with Linux, or it
> previously used the bus and we failed to subsequently lock it as
> above, the driver will give up and users will get:
>
> i801_smbus 0000:00:1f.4: BIOS uses SMBus unsafely
> i801_smbus 0000:00:1f.4: Driver SMBus register access inhibited
>
> This fixes the regression introduced by 01590f361e, and further allows
> safely sharing the SMBus on 2015 iMacs. Tested by running `i2cdump` in a
> loop while changing backlight levels via the ACPI video device.
>
> Fixes: 01590f361e ("i2c: i801: Instantiate SPD EEPROMs automatically")
> Cc: <[email protected]>
I object to this patch going to stable, as the original regression has
already been addressed by an earlier patch. The "Fixes" tag doesn't
seem appropriate, for the same reason.
This patch is way too intrusive and risky to be considered for stable
kernels, it clearly fails to meet the first two acceptance criteria. It
should live for some time upstream before distributions may consider
backporting it.
> Signed-off-by: Hector Martin <[email protected]>
> ---
> drivers/i2c/busses/i2c-i801.c | 96 ++++++++++++++++++++++++++++-------
> 1 file changed, 79 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c
> index 04a1e38f2a6f..03be6310d6d7 100644
> --- a/drivers/i2c/busses/i2c-i801.c
> +++ b/drivers/i2c/busses/i2c-i801.c
> @@ -287,11 +287,18 @@ struct i801_priv {
> #endif
> struct platform_device *tco_pdev;
>
> + /* BIOS left the controller marked busy. */
> + bool inuse_stuck;
> /*
> - * If set to true the host controller registers are reserved for
> - * ACPI AML use. Protected by acpi_lock.
> + * If set to true, ACPI AML uses the host controller registers.
> + * Protected by acpi_lock.
> */
> - bool acpi_reserved;
> + bool acpi_usage;
> + /*
> + * If set to true, ACPI AML uses the host controller registers in an
> + * unsafe way. Protected by acpi_lock.
> + */
> + bool acpi_unsafe;
> struct mutex acpi_lock;
> };
>
> @@ -854,10 +861,37 @@ static s32 i801_access(struct i2c_adapter *adap, u16 addr,
> int hwpec;
> int block = 0;
> int ret = 0, xact = 0;
> + int timeout = 0;
> struct i801_priv *priv = i2c_get_adapdata(adap);
>
> + /*
> + * The controller provides a bit that implements a mutex mechanism
> + * between users of the bus. First, try to lock the hardware mutex.
> + * If this doesn't work, we give up trying to do this, but then
> + * bail if ACPI uses SMBus at all.
> + */
> + if (!priv->inuse_stuck) {
> + while (inb_p(SMBHSTSTS(priv)) & SMBHSTSTS_INUSE_STS) {
Do I understand correctly that there are 3 cases where we would be
looping here:
1* The SMBus was used by the BIOS at boot time and they forgot to clear
SMBHSTSTS_INUSE_STS. The expected outcome is a timeout.
2* ACPI is using the SMBus, and the implementation is a good citizen
(acquires the hardware lock before starting a transfer, releases it
when done), and we are in the middle of an ACPI-side SMBus transfer.
The expectation is that the ACPI-side transfer will complete and the
hardware lock be released before we reach the timeout.
3* ACPI is using the SMBus, the implementation is a bad citizen and did
not release the hardware lock after a transfer. The expected outcome
is a timeout.
Case 1 wouldn't be a problem, but we can't differentiate it from case
3, which is a problem, therefore we always consider a timeout to be the
problematic case 3 and inhibit driver access when that happens. Right?
> + if (++timeout >= MAX_RETRIES) {
Note: I had to change this to use time_is_before_jiffies(), as
MAX_RETRIES has been removed from the i2c-i801 driver by commit
44c54c4ec391 ("i2c: i801: Improve status polling") meanwhile.
> + dev_warn(&priv->pci_dev->dev,
> + "BIOS left SMBus locked\n");
> + priv->inuse_stuck = true;
> + break;
> + }
> + usleep_range(250, 500);
> + }
> + }
> +
> mutex_lock(&priv->acpi_lock);
> - if (priv->acpi_reserved) {
> + if (priv->acpi_usage && priv->inuse_stuck && !priv->acpi_unsafe) {
> + priv->acpi_unsafe = true;
> +
> + dev_warn(&priv->pci_dev->dev, "BIOS uses SMBus unsafely\n");
> + dev_warn(&priv->pci_dev->dev,
> + "Driver SMBus register access inhibited\n");
> + }
> +
> + if (priv->acpi_unsafe) {
> mutex_unlock(&priv->acpi_lock);
> return -EBUSY;
> }
> @@ -1639,6 +1673,16 @@ static bool i801_acpi_is_smbus_ioport(const struct i801_priv *priv,
> address <= pci_resource_end(priv->pci_dev, SMBBAR);
> }
>
> +static acpi_status
> +i801_acpi_do_access(u32 function, acpi_physical_address address,
> + u32 bits, u64 *value)
> +{
> + if ((function & ACPI_IO_MASK) == ACPI_READ)
> + return acpi_os_read_port(address, (u32 *)value, bits);
> + else
> + return acpi_os_write_port(address, (u32)*value, bits);
> +}
> +
> static acpi_status
> i801_acpi_io_handler(u32 function, acpi_physical_address address, u32 bits,
> u64 *value, void *handler_context, void *region_context)
> @@ -1648,17 +1692,38 @@ i801_acpi_io_handler(u32 function, acpi_physical_address address, u32 bits,
> acpi_status status;
>
> /*
> - * Once BIOS AML code touches the OpRegion we warn and inhibit any
> - * further access from the driver itself. This device is now owned
> - * by the system firmware.
> + * Non-i801 accesses pass through.
Technically, this is "Non-SMBus access". It could still be an access to
other functions of the i82801 chipset, but we don't care about these.
> */
> - mutex_lock(&priv->acpi_lock);
> + if (!i801_acpi_is_smbus_ioport(priv, address))
> + return i801_acpi_do_access(function, address, bits, value);
>
> - if (!priv->acpi_reserved && i801_acpi_is_smbus_ioport(priv, address)) {
> - priv->acpi_reserved = true;
> + if (!mutex_trylock(&priv->acpi_lock)) {
> + mutex_lock(&priv->acpi_lock);
I was pretty confused by this construct at first, I think it deserves a
comment. My understanding of it is:
1* We first try to acquire the software lock without waiting. If it
works, it means that no transfer is in progress on the Linux side,
and we can proceed with the ACPI I/O request, as we hold the lock.
2* If immediate acquisition of the software lock failed, we wait for
the lock to be released by i801_access(). Once we have the lock,
there are two options. Either the first thing done by the ACPI side
is reading SMBHSTSTS_INUSE_STS, it is a good citizen and we can
continue (we hold both the software lock and the hardware lock). Or
the ACPI side accesses another register directly, which means it
does not check SMBHSTSTS_INUSE_STS, it is a bad citizen and there is
no safe way to access the SMBus on the Linux side. We prevent
further access from the Linux side, as we did before this patch.
Please correct me if I'm wrong.
> + /*
> + * This better be a read of the status register to acquire
> + * the lock...
> + */
> + if (!priv->acpi_unsafe &&
> + !(address == SMBHSTSTS(priv) &&
> + (function & ACPI_IO_MASK) == ACPI_READ)) {
> + /*
> + * Uh-oh, ACPI AML is trying to do something with the
> + * controller without locking it properly.
> + */
> + priv->acpi_unsafe = true;
> +
> + dev_warn(&pdev->dev, "BIOS uses SMBus unsafely\n");
> + dev_warn(&pdev->dev,
> + "Driver SMBus register access inhibited\n");
> + }
> + }
>
> - dev_warn(&pdev->dev, "BIOS is accessing SMBus registers\n");
> - dev_warn(&pdev->dev, "Driver SMBus register access inhibited\n");
> + if (!priv->acpi_usage) {
> + priv->acpi_usage = true;
> +
> + if (!priv->acpi_unsafe)
> + dev_info(&pdev->dev,
> + "SMBus controller is shared with ACPI AML. This seems safe so far.\n");
>
> /*
> * BIOS is accessing the host controller so prevent it from
I see a problem with this implementation. It is possible that
i801_access() gets called in the middle of an ACPI-side SMBus transfer,
because the software lock is not held for the whole duration of an
ACPI-side transfer (contrary to Linux-side transfers). Before this
patch, the first step of the ACPI-side transfer would have blocked any
future access from the Linux side, so we were safe. After this patch,
acpi_usage has been set to true at this point, but that alone won't
prevent access from the Linux side. Access will only be prevented if
SMBHSTSTS_INUSE_STS is considered stuck. If the ACPI implementation is
a bad citizen, I'm afraid there is no guarantee of that happening.
SMBHSTSTS will eventually be read during the ACPI-side transfer, and
that will acquire the hardware lock as a side effect, but not as the
first step.
Specifically, the scenario that would cause problem is:
ACPI side | Linux side
---------------------------------------+----------------------------------
[first step of a transfer] |
mutex_trylock(&priv->acpi_lock) |
priv->acpi_usage = true |
i801_acpi_do_access(random register) |
mutex_unlock(&priv->acpi_lock) |
| take hardware lock
| mutex_lock(&priv->acpi_lock)
| priv->acpi_unsafe is false -> OK
| perform transfer
| release hardware lock
| mutex_unlock(&priv->acpi_lock)
[second step of a transfer] |
mutex_trylock(&priv->acpi_lock) |
i801_acpi_do_access(random register) |
mutex_unlock(&priv->acpi_lock) |
(...) |
Obviously the ACPI-side transfer will be screwed, as whatever register
value was set during the first step will inevitably have been
overwritten by the Linux-side transfer.
I admit that the probability of this scenario happening is extremely
low, because it will take only one completed ACPI-side transfer (bad
citizen case) to take the hardware lock and not release it, which in
turn will prevent any further access from the Linux side. The above
scenario can therefore only happen if the very first SMBus transfer
from the bad citizen ACPI side happens at exactly the same time as a
transfer from the Linux side.
Question: why do you check that the first action of ACPI AML is to
acquire the hardware lock *only* if that action was attempted while the
Linux side was performing a transfer? That event is quite unlikely to
happen. Can't we perform that test unconditionally the very first time
i801_acpi_io_handler() is called? Unless the i2c-i801 driver gets
initialized in the middle of an ACPI-side SMBus transfer, the first
time i801_acpi_io_handler() is called, and i801_acpi_is_smbus_ioport()
returns true, should correspond to the first step of an ACPI-side SMBus
transfer. If that's not a SMBHSTSTS read then we know we can't trust
the ACPI AML implementation, and Linux-side access should be
prohibited. No need to wait for a transfer collision to make that
decision. What do you think?
--
Jean Delvare
SUSE L3 Support
On 15.12.2022 15:26, Jean Delvare wrote:
> Hi all,
>
> I know it's been a while, and I have to apologize for that. The
> original plan was to merge the patch series from Heiner Kallweit first,
> but as it turns out, it was never resubmitted after my (admittedly
> slow) review and things got stuck.
>
Indeed, this has been stuck. I have to set finishing this series on
my task list again.
> Anyway, I finally took some time to review this patch. I have a good
> headache now, and some comments below.
>
[...]
Hi,
I'm going to be honest - it's been so long I completely lost all mental
state of this patch, so at this point I'm mostly at the same place you
are here.
On 15/12/2022 23.26, Jean Delvare wrote:
> I object to this patch going to stable, as the original regression has
> already been addressed by an earlier patch. The "Fixes" tag doesn't
> seem appropriate, for the same reason.
>
> This patch is way too intrusive and risky to be considered for stable
> kernels, it clearly fails to meet the first two acceptance criteria. It
> should live for some time upstream before distributions may consider
> backporting it.
That's fair. The original intent was to fix the regression and implement
it properly in one go.
>> + /*
>> + * The controller provides a bit that implements a mutex mechanism
>> + * between users of the bus. First, try to lock the hardware mutex.
>> + * If this doesn't work, we give up trying to do this, but then
>> + * bail if ACPI uses SMBus at all.
>> + */
>> + if (!priv->inuse_stuck) {
>> + while (inb_p(SMBHSTSTS(priv)) & SMBHSTSTS_INUSE_STS) {
>
> Do I understand correctly that there are 3 cases where we would be
> looping here:
>
> 1* The SMBus was used by the BIOS at boot time and they forgot to clear
> SMBHSTSTS_INUSE_STS. The expected outcome is a timeout.
>
> 2* ACPI is using the SMBus, and the implementation is a good citizen
> (acquires the hardware lock before starting a transfer, releases it
> when done), and we are in the middle of an ACPI-side SMBus transfer.
> The expectation is that the ACPI-side transfer will complete and the
> hardware lock be released before we reach the timeout.
>
> 3* ACPI is using the SMBus, the implementation is a bad citizen and did
> not release the hardware lock after a transfer. The expected outcome
> is a timeout.
>
> Case 1 wouldn't be a problem, but we can't differentiate it from case
> 3, which is a problem, therefore we always consider a timeout to be the
> problematic case 3 and inhibit driver access when that happens. Right?
Not quite. inuse_stuck does not inhibit driver access by itself until we
see *some* access from ACPI (acpi_usage).
It is impossible for BIOS to forget to clear SMBHSTSTS_INUSE_STS *and*
for ACPI to be a good citizen with the bus, because that would
immediately deadlock on the ACPI access regardless of what the OS does
(unless there is some weird logic/workaround to clear the lock on first
access in the ACPI AML, but let's hope nobody does that). So the
expected cases are:
1. SMBus was left locked by the BIOS but is *not* used by ACPI
(inuse_stuck == true, acpi_usage == false). In this case we expect a
timeout once, then we give up on checking the hardware lock altogether
in the future.
2. SMBus is/was busy on a well-behaved ACPI transfer (inuse_stuck ==
false, acpi_usage == true). In this case we make note of the ACPI usage
but nothing bad happens (no timeout).
3. ACPI is/has used SMBus and left it locked (inuse_stuck == true,
acpi_usage == true). In this case we give up on future driver accesses
entirely.
Only case 3 leads to the driver bailing out on further accesses.
> Note: I had to change this to use time_is_before_jiffies(), as
> MAX_RETRIES has been removed from the i2c-i801 driver by commit
> 44c54c4ec391 ("i2c: i801: Improve status polling") meanwhile.
Ack.
> Technically, this is "Non-SMBus access". It could still be an access to
> other functions of the i82801 chipset, but we don't care about these.
Right, makese sense.
>
>> */
>> - mutex_lock(&priv->acpi_lock);
>> + if (!i801_acpi_is_smbus_ioport(priv, address))
>> + return i801_acpi_do_access(function, address, bits, value);
>>
>> - if (!priv->acpi_reserved && i801_acpi_is_smbus_ioport(priv, address)) {
>> - priv->acpi_reserved = true;
>> + if (!mutex_trylock(&priv->acpi_lock)) {
>> + mutex_lock(&priv->acpi_lock);
>
> I was pretty confused by this construct at first, I think it deserves a
> comment. My understanding of it is:
>
> 1* We first try to acquire the software lock without waiting. If it
> works, it means that no transfer is in progress on the Linux side,
> and we can proceed with the ACPI I/O request, as we hold the lock.
>
> 2* If immediate acquisition of the software lock failed, we wait for
> the lock to be released by i801_access(). Once we have the lock,
> there are two options. Either the first thing done by the ACPI side
> is reading SMBHSTSTS_INUSE_STS, it is a good citizen and we can
> continue (we hold both the software lock and the hardware lock). Or
> the ACPI side accesses another register directly, which means it
> does not check SMBHSTSTS_INUSE_STS, it is a bad citizen and there is
> no safe way to access the SMBus on the Linux side. We prevent
> further access from the Linux side, as we did before this patch.
>
> Please correct me if I'm wrong.
Yes, I believe that was the intent.
>
> I see a problem with this implementation. It is possible that
> i801_access() gets called in the middle of an ACPI-side SMBus transfer,
> because the software lock is not held for the whole duration of an
> ACPI-side transfer (contrary to Linux-side transfers). Before this
> patch, the first step of the ACPI-side transfer would have blocked any
> future access from the Linux side, so we were safe. After this patch,
> acpi_usage has been set to true at this point, but that alone won't
> prevent access from the Linux side. Access will only be prevented if
> SMBHSTSTS_INUSE_STS is considered stuck. If the ACPI implementation is
> a bad citizen, I'm afraid there is no guarantee of that happening.
> SMBHSTSTS will eventually be read during the ACPI-side transfer, and
> that will acquire the hardware lock as a side effect, but not as the
> first step.
>
> Specifically, the scenario that would cause problem is:
>
> ACPI side | Linux side
> ---------------------------------------+----------------------------------
> [first step of a transfer] |
> mutex_trylock(&priv->acpi_lock) |
> priv->acpi_usage = true |
> i801_acpi_do_access(random register) |
> mutex_unlock(&priv->acpi_lock) |
> | take hardware lock
> | mutex_lock(&priv->acpi_lock)
> | priv->acpi_unsafe is false -> OK
> | perform transfer
> | release hardware lock
> | mutex_unlock(&priv->acpi_lock)
> [second step of a transfer] |
> mutex_trylock(&priv->acpi_lock) |
> i801_acpi_do_access(random register) |
> mutex_unlock(&priv->acpi_lock) |
> (...) |
>
> Obviously the ACPI-side transfer will be screwed, as whatever register
> value was set during the first step will inevitably have been
> overwritten by the Linux-side transfer.
>
> I admit that the probability of this scenario happening is extremely
> low, because it will take only one completed ACPI-side transfer (bad
> citizen case) to take the hardware lock and not release it, which in
> turn will prevent any further access from the Linux side. The above
> scenario can therefore only happen if the very first SMBus transfer
> from the bad citizen ACPI side happens at exactly the same time as a
> transfer from the Linux side.
Agreed, this is a problem.
> Question: why do you check that the first action of ACPI AML is to
> acquire the hardware lock *only* if that action was attempted while the
> Linux side was performing a transfer? That event is quite unlikely to
> happen. Can't we perform that test unconditionally the very first time
> i801_acpi_io_handler() is called? Unless the i2c-i801 driver gets> initialized in the middle of an ACPI-side SMBus transfer [...]
I wonder if there's some way to close this race? This doesn't sound all
that unlikely (consider: ACPI backlight is also a module, device init
happens in parallel, so we could well end up probing i2c-i801 in the
middle of an ACPI SMBus transfer more often than you'd expect at boot time).
How about this: instead of checking for the lock access on the *first*
call to i801_acpi_io_handler, we add an acpi_must_lock flag. This is
initially false, but it is set on completion of every Linux-side i2c
transfer, and cleared once we see ACPI take the lock properly. The ACPI
handler then checks it and expects a lock access if it is true.
So as soon as Linux does an i2c transfer, we expect ACPI to be taking
the lock next time it touches the hardware, and we know to bail on the
Linux side in the future if it does not.
* If the i2c driver probes in the middle of a well-behaved ACPI SMBus
transfer, nothing bad happens, since if we try to do a Linux-side
transfer it will block on the lock until ACPI is done. Further ACPI
SMBus transfers after a Linux transfer will pass the locking check.
* If the driver probes in the middle of a misbehaved ACPI transfer but
is otherwise idle, nothing happens until the first Linux transfer, then
the next ACPI access after that will have us bail and disable driver access.
* If we probe *and* try to make a transfer all in the middle of a
misbehaved ACPI transfer, then we are going to step on its toes and
break it, but at least we will notice as soon as the Linux side is done
(or possibly failed due to the collision) and ACPI tries to touch the
controller again, so we will get out of its way in the future and
there's at least a chance it will recover for future accesses.
Further closing that last edge case to avoid ever conflicting with
broken ACPI implementations would require some sort of mechanism to know
whether ACPI AML started running an i2c transfer method before the
i2c-i801 driver loaded, which might be too intrusive a change to be wrth
it for such a corner case. Though maybe there's an easy way? If there's
something like a global AML lock we could just have the probe sequence be:
1. Register the ACPI IO handler
2. Take the AML lock
3. Set acpi_must_lock = true
4. Release the AML lock
5. Finally register the i2c controller
That makes sure we serialize on any ongoing ACPI shenanigans at probe
time and allows us to truly check that the first access from ACPI after
that is a lock, before Linux has a chance to do anything itself. But I
don't know off the top of my head whether there is such a lock.
- Hector
On Sat, 17 Dec 2022 22:28:25 +0900, Hector Martin wrote:
> On 15/12/2022 23.26, Jean Delvare wrote:
> > Question: why do you check that the first action of ACPI AML is to
> > acquire the hardware lock *only* if that action was attempted while the
> > Linux side was performing a transfer? That event is quite unlikely to
> > happen. Can't we perform that test unconditionally the very first time
> > i801_acpi_io_handler() is called? Unless the i2c-i801 driver gets> initialized in the middle of an ACPI-side SMBus transfer [...]
>
> I wonder if there's some way to close this race? This doesn't sound all
> that unlikely (consider: ACPI backlight is also a module, device init
> happens in parallel, so we could well end up probing i2c-i801 in the
> middle of an ACPI SMBus transfer more often than you'd expect at boot time).
>
> How about this: instead of checking for the lock access on the *first*
> call to i801_acpi_io_handler, we add an acpi_must_lock flag. This is
> initially false, but it is set on completion of every Linux-side i2c
> transfer, and cleared once we see ACPI take the lock properly. The ACPI
> handler then checks it and expects a lock access if it is true.
>
> So as soon as Linux does an i2c transfer, we expect ACPI to be taking
> the lock next time it touches the hardware, and we know to bail on the
> Linux side in the future if it does not.
>
> * If the i2c driver probes in the middle of a well-behaved ACPI SMBus
> transfer, nothing bad happens, since if we try to do a Linux-side
> transfer it will block on the lock until ACPI is done. Further ACPI
> SMBus transfers after a Linux transfer will pass the locking check.
> * If the driver probes in the middle of a misbehaved ACPI transfer but
> is otherwise idle, nothing happens until the first Linux transfer, then
> the next ACPI access after that will have us bail and disable driver access.
> * If we probe *and* try to make a transfer all in the middle of a
> misbehaved ACPI transfer, then we are going to step on its toes and
> break it, but at least we will notice as soon as the Linux side is done
> (or possibly failed due to the collision) and ACPI tries to touch the
> controller again, so we will get out of its way in the future and
> there's at least a chance it will recover for future accesses.
I see what you want to do and I think it should work.
> Further closing that last edge case to avoid ever conflicting with
> broken ACPI implementations would require some sort of mechanism to know
> whether ACPI AML started running an i2c transfer method before the
> i2c-i801 driver loaded, which might be too intrusive a change to be wrth
> it for such a corner case. Though maybe there's an easy way? If there's
> something like a global AML lock we could just have the probe sequence be:
>
> 1. Register the ACPI IO handler
> 2. Take the AML lock
> 3. Set acpi_must_lock = true
> 4. Release the AML lock
> 5. Finally register the i2c controller
>
> That makes sure we serialize on any ongoing ACPI shenanigans at probe
> time and allows us to truly check that the first access from ACPI after
> that is a lock, before Linux has a chance to do anything itself. But I
> don't know off the top of my head whether there is such a lock.
I don't think there is, but that would be a question for the ACPI list.
Anyway, for it to be useful, it would have a to be a high-level lock
(taken before starting an ACPI-side SMBus transfer, released after the
ACPI-side SMBus transfer has been fully processes). If it is taken for
every I/O, or even if it isn't held while waiting for the transfer to
complete, it won't solve the problem. And I suspect that if such a
high-level lock existed, we would have been using it in the first place
to guarantee exclusive access to the SMBus controller.
The most important thing is to get exclusive access to work properly
for well-behaved ACPI implementations. I know that the Linux driver
hasn't been a good citizen with the hardware lock until you partially
fixed that 1.5 year ago, but I hope that other operating systems did
that earlier, which would have encouraged well-behaved ACPI
implementations. So hopefully misbehaving ACPI implementations aren't
many on recent systems.
--
Jean Delvare
SUSE L3 Support