Commit 6098475d4cb4 ("spi: Fix deadlock when adding SPI controllers on
SPI buses") introduced a per-controller mutex. But mutex_unlock() of
said lock is called after the controller is already freed:
spi_unregister_controller(ctlr)
-> put_device(&ctlr->dev)
-> spi_controller_release(dev)
mutex_unlock(&ctrl->add_lock)
Move the put_device() after the mutex_unlock().
Fixes: 6098475d4cb4 ("spi: Fix deadlock when adding SPI controllers on SPI buses")
Signed-off-by: Michael Walle <[email protected]>
---
I'm not sure if this is the correct fix. I don't know if the put_device()
will have to be protected by the add_lock (remember before, the add_lock
was a global lock).
For reference, the kernel oops is:
[ 20.242505] Unable to handle kernel paging request at virtual address 0042a2203dc65260
[ 20.250468] Mem abort info:
[ 20.253270] ESR = 0x96000044
[ 20.256332] EC = 0x25: DABT (current EL), IL = 32 bits
[ 20.261662] SET = 0, FnV = 0
[ 20.264723] EA = 0, S1PTW = 0
[ 20.267869] FSC = 0x04: level 0 translation fault
[ 20.272764] Data abort info:
[ 20.275646] ISV = 0, ISS = 0x00000044
[ 20.279494] CM = 0, WnR = 1
[ 20.282469] [0042a2203dc65260] address between user and kernel address ranges
[ 20.289632] Internal error: Oops: 96000044 [#1] SMP
[ 20.294525] Modules linked in:
[ 20.297586] CPU: 1 PID: 1546 Comm: init Not tainted 5.15.0-next-20211109+ #1307
[ 20.304919] Hardware name: Kontron KBox A-230-LS (DT)
[ 20.309983] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 20.316968] pc : queued_spin_lock_slowpath+0x1a8/0x3a0
[ 20.322127] lr : __mutex_unlock_slowpath.constprop.0+0x180/0x190
[ 20.328156] sp : ffff800013c23b80
[ 20.331475] x29: ffff800013c23b80 x28: ffff002003941040 x27: 0000000000000000
[ 20.338639] x26: ffff002000e49c90 x25: ffff80001167de48 x24: ffff800011f55578
[ 20.345802] x23: ffff002002b1d350 x22: 0000000000000002 x21: 0000000000000001
[ 20.352964] x20: ffff800013c23bb8 x19: ffff002002b1d350 x18: 00000000fffffffb
[ 20.360126] x17: 0000000000000001 x16: 0000000000000001 x15: 0000000000000020
[ 20.367288] x14: 0000000000000001 x13: 0000000000000000 x12: ffff002002c27938
[ 20.374450] x11: ffff002002c27908 x10: 0000000000000000 x9 : 0000000000080000
[ 20.381612] x8 : 0000000000000000 x7 : ffff00207f7b7a00 x6 : ffff8000119d1a00
[ 20.388774] x5 : ffff00207f7b7a00 x4 : 504322202c293830 x3 : ffff002002b1d358
[ 20.395936] x2 : ffff8000119d1a30 x1 : ffff8000119d1a30 x0 : ffff00207f7b7a08
[ 20.403098] Call trace:
[ 20.405546] queued_spin_lock_slowpath+0x1a8/0x3a0
[ 20.410352] mutex_unlock+0x5c/0x70
[ 20.413849] spi_unregister_controller+0xf0/0x170
[ 20.418568] dspi_remove+0x28/0xb0
[ 20.421978] dspi_shutdown+0x1c/0x30
[ 20.425561] platform_shutdown+0x30/0x40
[ 20.429495] device_shutdown+0x14c/0x420
[ 20.433427] __do_sys_reboot+0x214/0x29c
[ 20.437361] __arm64_sys_reboot+0x30/0x40
[ 20.441380] invoke_syscall+0x50/0x120
[ 20.445140] el0_svc_common.constprop.0+0x58/0x190
[ 20.449944] do_el0_svc+0x30/0x94
[ 20.453267] el0_svc+0x20/0x90
[ 20.456327] el0t_64_sync_handler+0x1a4/0x1b0
[ 20.460694] el0t_64_sync+0x1a0/0x1a4
[ 20.464368] Code: 910020e0 8b0200c2 f861d884 aa0203e1 (f8246827)
[ 20.470479] ---[ end trace 9ee0c21f9e4bc5d5 ]---
[ 21.475251] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000
[ 21.482940] Kernel Offset: disabled
[ 21.486433] CPU features: 0x2,000001c2,40000846
[ 21.490976] Memory Limit: none
[ 21.494036] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000 ]---
drivers/spi/spi.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index b23e675953e1..fdd530b150a7 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -3099,12 +3099,6 @@ void spi_unregister_controller(struct spi_controller *ctlr)
device_del(&ctlr->dev);
- /* Release the last reference on the controller if its driver
- * has not yet been converted to devm_spi_alloc_master/slave().
- */
- if (!ctlr->devm_allocated)
- put_device(&ctlr->dev);
-
/* free bus id */
mutex_lock(&board_lock);
if (found == ctlr)
@@ -3113,6 +3107,12 @@ void spi_unregister_controller(struct spi_controller *ctlr)
if (IS_ENABLED(CONFIG_SPI_DYNAMIC))
mutex_unlock(&ctlr->add_lock);
+
+ /* Release the last reference on the controller if its driver
+ * has not yet been converted to devm_spi_alloc_master/slave().
+ */
+ if (!ctlr->devm_allocated)
+ put_device(&ctlr->dev);
}
EXPORT_SYMBOL_GPL(spi_unregister_controller);
--
2.30.2
On Wed, Nov 10, 2021 at 05:08:36PM +0100, Michael Walle wrote:
> Commit 6098475d4cb4 ("spi: Fix deadlock when adding SPI controllers on
> SPI buses") introduced a per-controller mutex. But mutex_unlock() of
> said lock is called after the controller is already freed:
>
> spi_unregister_controller(ctlr)
> -> put_device(&ctlr->dev)
> -> spi_controller_release(dev)
> mutex_unlock(&ctrl->add_lock)
>
> Move the put_device() after the mutex_unlock().
Do we even need to unlock the mutex here?
>
> For reference, the kernel oops is:
> [ 20.242505] Unable to handle kernel paging request at virtual address 0042a2203dc65260
> [ 20.250468] Mem abort info:
> [ 20.253270] ESR = 0x96000044
Please think hard before including complete backtraces in upstream
reports, they are very large and contain almost no useful information
relative to their size so often obscure the relevant content in your
message. If part of the backtrace is usefully illustrative (it often is
for search engines if nothing else) then it's usually better to pull out
the relevant sections.
Am 2021-11-10 17:27, schrieb Mark Brown:
>> For reference, the kernel oops is:
>> [ 20.242505] Unable to handle kernel paging request at virtual
>> address 0042a2203dc65260
>> [ 20.250468] Mem abort info:
>> [ 20.253270] ESR = 0x96000044
>
> Please think hard before including complete backtraces in upstream
> reports, they are very large and contain almost no useful information
> relative to their size so often obscure the relevant content in your
> message. If part of the backtrace is usefully illustrative (it often is
> for search engines if nothing else) then it's usually better to pull
> out
> the relevant sections.
It was in the comments section of the patch, for exactly this purpose.
That's how you're doing it, no?
-michael
On Wed, Nov 10, 2021 at 05:30:48PM +0100, Michael Walle wrote:
> Am 2021-11-10 17:27, schrieb Mark Brown:
> > > For reference, the kernel oops is:
> > > [ 20.242505] Unable to handle kernel paging request at virtual
> > > address 0042a2203dc65260
> > > [ 20.250468] Mem abort info:
> > > [ 20.253270] ESR = 0x96000044
> > Please think hard before including complete backtraces in upstream
> > reports, they are very large and contain almost no useful information
> > relative to their size so often obscure the relevant content in your
> > message. If part of the backtrace is usefully illustrative (it often is
> > for search engines if nothing else) then it's usually better to pull out
> > the relevant sections.
> It was in the comments section of the patch, for exactly this purpose.
> That's how you're doing it, no?
That helps with what ends up in git but it's still including multiple
screenfuls of noise in the email which is where the usability problem
is.
On Wed, Nov 10, 2021 at 6:09 PM Michael Walle <[email protected]> wrote:
>
> Commit 6098475d4cb4 ("spi: Fix deadlock when adding SPI controllers on
> SPI buses") introduced a per-controller mutex. But mutex_unlock() of
> said lock is called after the controller is already freed:
>
> spi_unregister_controller(ctlr)
> -> put_device(&ctlr->dev)
> -> spi_controller_release(dev)
> mutex_unlock(&ctrl->add_lock)
>
> Move the put_device() after the mutex_unlock().
I have a déjà-vu feeling about this code (Cc Lukas).
>
> Fixes: 6098475d4cb4 ("spi: Fix deadlock when adding SPI controllers on SPI buses")
> Signed-off-by: Michael Walle <[email protected]>
> ---
> I'm not sure if this is the correct fix. I don't know if the put_device()
> will have to be protected by the add_lock (remember before, the add_lock
> was a global lock).
>
> For reference, the kernel oops is:
> [ 20.242505] Unable to handle kernel paging request at virtual address 0042a2203dc65260
> [ 20.250468] Mem abort info:
> [ 20.253270] ESR = 0x96000044
> [ 20.256332] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 20.261662] SET = 0, FnV = 0
> [ 20.264723] EA = 0, S1PTW = 0
> [ 20.267869] FSC = 0x04: level 0 translation fault
> [ 20.272764] Data abort info:
> [ 20.275646] ISV = 0, ISS = 0x00000044
> [ 20.279494] CM = 0, WnR = 1
> [ 20.282469] [0042a2203dc65260] address between user and kernel address ranges
> [ 20.289632] Internal error: Oops: 96000044 [#1] SMP
> [ 20.294525] Modules linked in:
> [ 20.297586] CPU: 1 PID: 1546 Comm: init Not tainted 5.15.0-next-20211109+ #1307
> [ 20.304919] Hardware name: Kontron KBox A-230-LS (DT)
> [ 20.309983] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 20.316968] pc : queued_spin_lock_slowpath+0x1a8/0x3a0
> [ 20.322127] lr : __mutex_unlock_slowpath.constprop.0+0x180/0x190
> [ 20.328156] sp : ffff800013c23b80
> [ 20.331475] x29: ffff800013c23b80 x28: ffff002003941040 x27: 0000000000000000
> [ 20.338639] x26: ffff002000e49c90 x25: ffff80001167de48 x24: ffff800011f55578
> [ 20.345802] x23: ffff002002b1d350 x22: 0000000000000002 x21: 0000000000000001
> [ 20.352964] x20: ffff800013c23bb8 x19: ffff002002b1d350 x18: 00000000fffffffb
> [ 20.360126] x17: 0000000000000001 x16: 0000000000000001 x15: 0000000000000020
> [ 20.367288] x14: 0000000000000001 x13: 0000000000000000 x12: ffff002002c27938
> [ 20.374450] x11: ffff002002c27908 x10: 0000000000000000 x9 : 0000000000080000
> [ 20.381612] x8 : 0000000000000000 x7 : ffff00207f7b7a00 x6 : ffff8000119d1a00
> [ 20.388774] x5 : ffff00207f7b7a00 x4 : 504322202c293830 x3 : ffff002002b1d358
> [ 20.395936] x2 : ffff8000119d1a30 x1 : ffff8000119d1a30 x0 : ffff00207f7b7a08
> [ 20.403098] Call trace:
> [ 20.405546] queued_spin_lock_slowpath+0x1a8/0x3a0
> [ 20.410352] mutex_unlock+0x5c/0x70
> [ 20.413849] spi_unregister_controller+0xf0/0x170
> [ 20.418568] dspi_remove+0x28/0xb0
> [ 20.421978] dspi_shutdown+0x1c/0x30
> [ 20.425561] platform_shutdown+0x30/0x40
> [ 20.429495] device_shutdown+0x14c/0x420
> [ 20.433427] __do_sys_reboot+0x214/0x29c
> [ 20.437361] __arm64_sys_reboot+0x30/0x40
> [ 20.441380] invoke_syscall+0x50/0x120
> [ 20.445140] el0_svc_common.constprop.0+0x58/0x190
> [ 20.449944] do_el0_svc+0x30/0x94
> [ 20.453267] el0_svc+0x20/0x90
> [ 20.456327] el0t_64_sync_handler+0x1a4/0x1b0
> [ 20.460694] el0t_64_sync+0x1a0/0x1a4
> [ 20.464368] Code: 910020e0 8b0200c2 f861d884 aa0203e1 (f8246827)
> [ 20.470479] ---[ end trace 9ee0c21f9e4bc5d5 ]---
> [ 21.475251] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000
> [ 21.482940] Kernel Offset: disabled
> [ 21.486433] CPU features: 0x2,000001c2,40000846
> [ 21.490976] Memory Limit: none
> [ 21.494036] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000 ]---
>
> drivers/spi/spi.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
> index b23e675953e1..fdd530b150a7 100644
> --- a/drivers/spi/spi.c
> +++ b/drivers/spi/spi.c
> @@ -3099,12 +3099,6 @@ void spi_unregister_controller(struct spi_controller *ctlr)
>
> device_del(&ctlr->dev);
>
> - /* Release the last reference on the controller if its driver
> - * has not yet been converted to devm_spi_alloc_master/slave().
> - */
> - if (!ctlr->devm_allocated)
> - put_device(&ctlr->dev);
> -
> /* free bus id */
> mutex_lock(&board_lock);
> if (found == ctlr)
> @@ -3113,6 +3107,12 @@ void spi_unregister_controller(struct spi_controller *ctlr)
>
> if (IS_ENABLED(CONFIG_SPI_DYNAMIC))
> mutex_unlock(&ctlr->add_lock);
> +
> + /* Release the last reference on the controller if its driver
> + * has not yet been converted to devm_spi_alloc_master/slave().
> + */
> + if (!ctlr->devm_allocated)
> + put_device(&ctlr->dev);
> }
> EXPORT_SYMBOL_GPL(spi_unregister_controller);
>
> --
> 2.30.2
>
--
With Best Regards,
Andy Shevchenko
Hello,
On Wed, Nov 10, 2021 at 05:08:36PM +0100, Michael Walle wrote:
> Commit 6098475d4cb4 ("spi: Fix deadlock when adding SPI controllers on
> SPI buses") introduced a per-controller mutex. But mutex_unlock() of
> said lock is called after the controller is already freed:
>
> spi_unregister_controller(ctlr)
> -> put_device(&ctlr->dev)
> -> spi_controller_release(dev)
> mutex_unlock(&ctrl->add_lock)
This is indented in a misleading way. mutex_unlock() has to be on the
same level as put_device().
> Move the put_device() after the mutex_unlock().
>
> Fixes: 6098475d4cb4 ("spi: Fix deadlock when adding SPI controllers on SPI buses")
> Signed-off-by: Michael Walle <[email protected]>
I first thought this was wrong, and the put_device must be dropped
altogether, but after some code reading I agree this is the right fix.
Reviewed-by: Uwe Kleine-K?nig <[email protected]>
> ---
> I'm not sure if this is the correct fix. I don't know if the put_device()
> will have to be protected by the add_lock (remember before, the add_lock
> was a global lock).
No, put_device doesn't need to be protected by this lock.
Best regards and thanks for the report and diagnosis,
Uwe
--
Pengutronix e.K. | Uwe Kleine-K?nig |
Industrial Linux Solutions | https://www.pengutronix.de/ |
On Wed, Nov 10, 2021 at 05:08:36PM +0100, Michael Walle wrote:
> Commit 6098475d4cb4 ("spi: Fix deadlock when adding SPI controllers on
> SPI buses") introduced a per-controller mutex. But mutex_unlock() of
> said lock is called after the controller is already freed:
>
> spi_unregister_controller(ctlr)
> -> put_device(&ctlr->dev)
> -> spi_controller_release(dev)
> mutex_unlock(&ctrl->add_lock)
>
> Move the put_device() after the mutex_unlock().
>
> Fixes: 6098475d4cb4 ("spi: Fix deadlock when adding SPI controllers on SPI buses")
> Signed-off-by: Michael Walle <[email protected]>
Reviewed-by: Lukas Wunner <[email protected]>
Cc: [email protected] # v5.15