2022-07-25 09:09:03

by Fengfei Xi

Subject: [PATCH] scsi: mpt3sas: fix kernel panic in scsih_qcmd after shutdown/unload

We encountered a kernel crash after the user performed a shutdown
operation. Analysis of the vmcore file confirmed that scsih_qcmd
called memset on ioc->request resources that had already been
released in the shutdown/module unload path.

crash> struct MPT3SAS_ADAPTER 0xffff00ff85806880
struct MPT3SAS_ADAPTER {
  list = {
    next = 0xffff800008eb8038 <mpt3sas_ioc_list>,
    prev = 0xffff800008eb8038 <mpt3sas_ioc_list>
  },
  ...
  name = "mpt3sas_cm0\000\000\000\000\000\000\000\
  ...
  remove_host = 1 '\001',
  ...
  request_sz = 128,
  request = 0x0,
  ...
  sense = 0x0,

The SCSI queuecommand handler (scsih_qcmd) may be invoked after
shutdown/unload, depending on other components. Add a check for
'ioc->remove_host' in scsih_qcmd so that it does not access
pointers/resources potentially freed in the PCI shutdown/module
unload path.

Just like the following commit:
9ff549ffb4fb4cc9a4b24d1de9dc3e68287797c4
scsi: mpt3sas: fix oops in error handlers after shutdown/unload

Signed-off-by: Fengfei Xi <[email protected]>
---
drivers/scsi/mpt3sas/mpt3sas_scsih.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index b519f4b59..d8994eaec 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -5140,7 +5140,8 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
 		scsi_print_command(scmd);

 	sas_device_priv_data = scmd->device->hostdata;
-	if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
+	if (!sas_device_priv_data || !sas_device_priv_data->sas_target ||
+	    ioc->remove_host) {
 		scmd->result = DID_NO_CONNECT << 16;
 		scsi_done(scmd);
 		return 0;
--
2.17.1
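
For illustration only, the guard proposed in the patch above can be
modelled in plain userspace C. This is a minimal sketch, not driver
code: struct model_ioc, struct model_cmd and model_qcmd are
hypothetical stand-ins, and the command is assumed to complete
immediately. Only the DID_NO_CONNECT value and the "complete the
command and return 0" pattern are taken from the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Host byte code; 0x01 matches the kernel's DID_NO_CONNECT. */
#define DID_NO_CONNECT 0x01

/* Hypothetical stand-in for the adapter state seen in the vmcore. */
struct model_ioc {
	bool remove_host;        /* set once shutdown/unload has started */
	size_t request_sz;       /* size of one request frame */
	unsigned char *request;  /* NULL once the pool has been released */
};

/* Hypothetical stand-in for a SCSI command. */
struct model_cmd {
	int result;
	bool done;
};

/*
 * Model of a queuecommand handler with the proposed guard: when
 * remove_host is set, fail the command before touching ioc->request.
 */
int model_qcmd(struct model_ioc *ioc, struct model_cmd *cmd)
{
	if (ioc->remove_host) {
		cmd->result = DID_NO_CONNECT << 16;
		cmd->done = true;
		return 0;
	}

	/* Only reached while the request pool is still valid. */
	memset(ioc->request, 0, ioc->request_sz);
	cmd->result = 0;
	cmd->done = true;
	return 0;
}
```

With remove_host set and request == NULL (the state shown in the
crash output), model_qcmd() completes the command with DID_NO_CONNECT
and never reaches the memset.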


2022-07-25 10:10:03

by Sreekanth Reddy

Subject: Re: [PATCH] scsi: mpt3sas: fix kernel panic in scsih_qcmd after shutdown/unload

Hi Fengfei,

The driver already returns SCSI IO commands (except for
SYNCHRONIZE_CACHE & START_STOP) with DID_NO_CONNECT when remove_host
is set to one.

Also, during shutdown the driver does not free any controller memory
pools. During driver unload, the driver frees the memory pools only
after the target devices are unregistered with the SML.
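
As a rough userspace sketch of the filtering described above (not the
driver source; model_allow_cmd is a hypothetical name, and the real
check may depend on further adapter state), the rule "fail everything
except SYNCHRONIZE_CACHE and START_STOP once remove_host is set"
looks like this:

```c
#include <stdbool.h>

/* Standard SCSI opcodes for the two commands that are let through. */
#define START_STOP        0x1b
#define SYNCHRONIZE_CACHE 0x35

/*
 * Hypothetical model: returns true if a command with this opcode may
 * still be sent to the device, false if it should be completed with
 * DID_NO_CONNECT.
 */
bool model_allow_cmd(bool remove_host, unsigned char opcode)
{
	if (!remove_host)
		return true;

	switch (opcode) {
	case SYNCHRONIZE_CACHE:
	case START_STOP:
		return true;   /* still needed to flush and stop the device */
	default:
		return false;  /* everything else is failed back */
	}
}
```

For example, model_allow_cmd(true, 0x28) (a READ(10)) returns false,
while model_allow_cmd(true, SYNCHRONIZE_CACHE) returns true.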

Can you please share the kernel panic call trace? Also, please let
me know which driver version is used.

Thanks,
Sreekanth

On Mon, Jul 25, 2022 at 1:01 PM Fengfei Xi <[email protected]> wrote:
>
> We encountered a kernel crash problem after the user performed a
> shutdown operation. By analyzing the vmcore file, it is confirmed
> that it is scsih_qcmd called memset to access ioc->request resources
> that have been released in shutdown/module unload path.
>
> crash> struct MPT3SAS_ADAPTER 0xffff00ff85806880
> struct MPT3SAS_ADAPTER {
> list = {
> next = 0xffff800008eb8038 <mpt3sas_ioc_list>,
> prev = 0xffff800008eb8038 <mpt3sas_ioc_list>
> },
> ...
> name = "mpt3sas_cm0\000\000\000\000\000\000\000\
> ...
> remove_host = 1 '\001',
> ...
> request_sz = 128,
> request = 0x0,
> ...
> sense = 0x0,
>
> The SCSI queuecommand handlers(scsih_qcmd) may be invoked after
> shutdown/unload, depending on other components. So we should add
> checks for 'ioc->remove_host' in scsih_qcmd, so not to access
> pointers/resources potentially freed in the PCI shutdown/module
> unload path.
>
> Just like the following commit:
> 9ff549ffb4fb4cc9a4b24d1de9dc3e68287797c4
> scsi: mpt3sas: fix oops in error handlers after shutdown/unload
>
> Signed-off-by: Fengfei Xi <[email protected]>
> ---
> drivers/scsi/mpt3sas/mpt3sas_scsih.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index b519f4b59..d8994eaec 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -5140,7 +5140,8 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
> scsi_print_command(scmd);
>
> sas_device_priv_data = scmd->device->hostdata;
> - if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
> + if (!sas_device_priv_data || !sas_device_priv_data->sas_target ||
> + ioc->remove_host) {
> scmd->result = DID_NO_CONNECT << 16;
> scsi_done(scmd);
> return 0;
> --
> 2.17.1
>

