2018-12-06 23:50:38

by Anatoliy Glagolev

[permalink] [raw]
Subject: [PATCH] scsi: qla2xxx: deadlock by configfs_depend_item

The intent of invoking configfs_depend_item in commit 7474f52a82d51
("tcm_qla2xxx: Perform configfs depend/undepend for base_tpg")
was to prevent a physical Fibre Channel port removal when
virtual (NPIV) ports announced through that physical port are active.
The change does not work as expected: it makes enabled physical port
dependent on target configfs subsystem (the port's parent), something
the configfs guarantees anyway.

Besides, scheduling work in a worker thread and waiting for the work's
completion is not really a valid workaround for the requirement not
to call configfs_depend_item from a configfs callback: the call
occasionally deadlocks.

Thus, removing configfs_depend_item calls does not break anything
and fixes the deadlock problem.

Signed-off-by: Anatoliy Glagolev <[email protected]>
---
drivers/scsi/qla2xxx/tcm_qla2xxx.c | 48 +++++++-------------------------------
drivers/scsi/qla2xxx/tcm_qla2xxx.h | 3 ---
2 files changed, 8 insertions(+), 43 deletions(-)

diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
index 7732e93..1c7ca84 100644
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
@@ -929,38 +929,14 @@ static ssize_t tcm_qla2xxx_tpg_enable_show(struct config_item *item,
atomic_read(&tpg->lport_tpg_enabled));
}

-static void tcm_qla2xxx_depend_tpg(struct work_struct *work)
-{
- struct tcm_qla2xxx_tpg *base_tpg = container_of(work,
- struct tcm_qla2xxx_tpg, tpg_base_work);
- struct se_portal_group *se_tpg = &base_tpg->se_tpg;
- struct scsi_qla_host *base_vha = base_tpg->lport->qla_vha;
-
- if (!target_depend_item(&se_tpg->tpg_group.cg_item)) {
- atomic_set(&base_tpg->lport_tpg_enabled, 1);
- qlt_enable_vha(base_vha);
- }
- complete(&base_tpg->tpg_base_comp);
-}
-
-static void tcm_qla2xxx_undepend_tpg(struct work_struct *work)
-{
- struct tcm_qla2xxx_tpg *base_tpg = container_of(work,
- struct tcm_qla2xxx_tpg, tpg_base_work);
- struct se_portal_group *se_tpg = &base_tpg->se_tpg;
- struct scsi_qla_host *base_vha = base_tpg->lport->qla_vha;
-
- if (!qlt_stop_phase1(base_vha->vha_tgt.qla_tgt)) {
- atomic_set(&base_tpg->lport_tpg_enabled, 0);
- target_undepend_item(&se_tpg->tpg_group.cg_item);
- }
- complete(&base_tpg->tpg_base_comp);
-}
-
static ssize_t tcm_qla2xxx_tpg_enable_store(struct config_item *item,
const char *page, size_t count)
{
struct se_portal_group *se_tpg = to_tpg(item);
+ struct se_wwn *se_wwn = se_tpg->se_tpg_wwn;
+ struct tcm_qla2xxx_lport *lport = container_of(se_wwn,
+ struct tcm_qla2xxx_lport, lport_wwn);
+ struct scsi_qla_host *vha = lport->qla_vha;
struct tcm_qla2xxx_tpg *tpg = container_of(se_tpg,
struct tcm_qla2xxx_tpg, se_tpg);
unsigned long op;
@@ -979,24 +955,16 @@ static ssize_t tcm_qla2xxx_tpg_enable_store(struct config_item *item,
if (atomic_read(&tpg->lport_tpg_enabled))
return -EEXIST;

- INIT_WORK(&tpg->tpg_base_work, tcm_qla2xxx_depend_tpg);
+ atomic_set(&tpg->lport_tpg_enabled, 1);
+ qlt_enable_vha(vha);
} else {
if (!atomic_read(&tpg->lport_tpg_enabled))
return count;

- INIT_WORK(&tpg->tpg_base_work, tcm_qla2xxx_undepend_tpg);
+ atomic_set(&tpg->lport_tpg_enabled, 0);
+ qlt_stop_phase1(vha->vha_tgt.qla_tgt);
}
- init_completion(&tpg->tpg_base_comp);
- schedule_work(&tpg->tpg_base_work);
- wait_for_completion(&tpg->tpg_base_comp);

- if (op) {
- if (!atomic_read(&tpg->lport_tpg_enabled))
- return -ENODEV;
- } else {
- if (atomic_read(&tpg->lport_tpg_enabled))
- return -EPERM;
- }
return count;
}

diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.h b/drivers/scsi/qla2xxx/tcm_qla2xxx.h
index 7550ba2..147cf6c 100644
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.h
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.h
@@ -48,9 +48,6 @@ struct tcm_qla2xxx_tpg {
struct tcm_qla2xxx_tpg_attrib tpg_attrib;
/* Returned by tcm_qla2xxx_make_tpg() */
struct se_portal_group se_tpg;
- /* Items for dealing with configfs_depend_item */
- struct completion tpg_base_comp;
- struct work_struct tpg_base_work;
};

struct tcm_qla2xxx_fc_loopid {
--
1.9.1



2018-12-13 21:56:06

by Anatoliy Glagolev

[permalink] [raw]
Subject: Re: [PATCH] scsi: qla2xxx: deadlock by configfs_depend_item

Any thoughts on this change?

Thanks.


2018-12-20 02:49:44

by Martin K. Petersen

[permalink] [raw]
Subject: Re: [PATCH] scsi: qla2xxx: deadlock by configfs_depend_item


Anatoliy,

> The intent of invoking configfs_depend_item in commit 7474f52a82d51
> ("tcm_qla2xxx: Perform configfs depend/undepend for base_tpg") was to
> prevent a physical Fibre Channel port removal when virtual (NPIV)
> ports announced through that physical port are active. The change
> does not work as expected: it makes enabled physical port dependent on
> target configfs subsystem (the port's parent), something the configfs
> guarantees anyway.
>
> Besides, scheduling work in a worker thread and waiting for the work's
> completion is not really a valid workaround for the requirement not to
> call configfs_depend_item from a configfs callback: the call
> occasionally deadlocks.
>
> Thus, removing configfs_depend_item calls does not break anything and
> fixes the deadlock problem.

Applied to 4.21/scsi-queue, thank you!

--
Martin K. Petersen Oracle Linux Engineering