From: Nicholas Bellinger <[email protected]>
Hi all,
Here are two additional target bug-fixes that have been found
by the DATERA Q/A + automation team during extended longevity
and stress testing atop v4.1.y stable code.
The first is a iscsi-target specific TMR reference leak during
session shutdown when se_cmd quiesce occured before hand-off
back to iscsi-target, that results in se_tmr_req descriptors
being left after se_cmd was freed.
This would manifest as kernel paging OOPsen during LUN_RESET
after the leak occured, because the associated se_tmr_req had
not been released even though the pre-allocated parent se_cmd
descriptor had already been freed during session shutdown.
The second is a race between se_lun configfs unlink shutdown
when se_lun->lun_ref percpu_ref RCU grace period is happening,
and user-space attempts to add a new NodeACL mappedlun configfs
symlink while se_lun shutdown is occuring.
This would manifest as NULL pointer dereference OOPsen within
target_stat_scsi_att_intr_port_show_attr_dev() after a new
NodeACL mappedlun was added to a se_lun that had already
been shutdown.
Both have been verified on v4.1.y stable code, and forward
ported to v4.11-rc.
Please review.
--nab
Nicholas Bellinger (2):
iscsi-target: Fix TMR reference leak during session shutdown
target: Avoid mappedlun symlink creation during lun shutdown
drivers/target/iscsi/iscsi_target_util.c | 12 +++++++-----
drivers/target/target_core_fabric_configfs.c | 5 +++++
drivers/target/target_core_tpg.c | 4 ++++
include/target/target_core_base.h | 1 +
4 files changed, 17 insertions(+), 5 deletions(-)
--
1.9.1
From: Nicholas Bellinger <[email protected]>
This patch fixes a iscsi-target specific TMR reference leak
during session shutdown, that could occur when a TMR was
quiesced before the hand-off back to iscsi-target code
via transport_cmd_check_stop_to_fabric().
The reference leak happens because iscsit_free_cmd() was
incorrectly skipping the final target_put_sess_cmd() for
TMRs when transport_generic_free_cmd() returned zero because
the se_cmd->cmd_kref did not reach zero, due to the missing
se_cmd assignment in original code.
The result was iscsi_cmd and it's associated se_cmd memory
would be freed once se_sess->sess_cmd_map where released,
but the associated se_tmr_req was leaked and remained part
of se_device->dev_tmr_list.
This bug would manfiest itself as kernel paging request
OOPsen in core_tmr_lun_reset(), when a left-over se_tmr_req
attempted to dereference it's se_cmd pointer that had
already been released during normal session shutdown.
To address this bug, go ahead and treat ISCSI_OP_SCSI_CMD
and ISCSI_OP_SCSI_TMFUNC the same when there is an extra
se_cmd->cmd_kref to drop in iscsit_free_cmd(), and use
op_scsi to signal __iscsit_free_cmd() when the former
needs to clear any further iscsi related I/O state.
Reported-by: Rob Millner <[email protected]>
Cc: Rob Millner <[email protected]>
Reported-by: Chu Yuan Lin <[email protected]>
Cc: Chu Yuan Lin <[email protected]>
Tested-by: Chu Yuan Lin <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
---
drivers/target/iscsi/iscsi_target_util.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
index 5041a9c..b464033 100644
--- a/drivers/target/iscsi/iscsi_target_util.c
+++ b/drivers/target/iscsi/iscsi_target_util.c
@@ -737,21 +737,23 @@ void iscsit_free_cmd(struct iscsi_cmd *cmd, bool shutdown)
{
struct se_cmd *se_cmd = NULL;
int rc;
+ bool op_scsi = false;
/*
* Determine if a struct se_cmd is associated with
* this struct iscsi_cmd.
*/
switch (cmd->iscsi_opcode) {
case ISCSI_OP_SCSI_CMD:
- se_cmd = &cmd->se_cmd;
- __iscsit_free_cmd(cmd, true, shutdown);
+ op_scsi = true;
/*
* Fallthrough
*/
case ISCSI_OP_SCSI_TMFUNC:
- rc = transport_generic_free_cmd(&cmd->se_cmd, shutdown);
- if (!rc && shutdown && se_cmd && se_cmd->se_sess) {
- __iscsit_free_cmd(cmd, true, shutdown);
+ se_cmd = &cmd->se_cmd;
+ __iscsit_free_cmd(cmd, op_scsi, shutdown);
+ rc = transport_generic_free_cmd(se_cmd, shutdown);
+ if (!rc && shutdown && se_cmd->se_sess) {
+ __iscsit_free_cmd(cmd, op_scsi, shutdown);
target_put_sess_cmd(se_cmd);
}
break;
--
1.9.1
From: Nicholas Bellinger <[email protected]>
This patch closes a race between se_lun deletion during configfs
unlink in target_fabric_port_unlink() -> core_dev_del_lun()
-> core_tpg_remove_lun(), when transport_clear_lun_ref() blocks
waiting for percpu_ref RCU grace period to finish, but a new
NodeACL mappedlun is added before the RCU grace period has
completed.
This can happen in target_fabric_mappedlun_link() because it
only checks for se_lun->lun_se_dev, which is not cleared until
after transport_clear_lun_ref() percpu_ref RCU grace period
finishes.
This bug originally manifested as NULL pointer dereference
OOPsen in target_stat_scsi_att_intr_port_show_attr_dev() on
v4.1.y code, because it dereferences lun->lun_se_dev without
a explicit NULL pointer check.
In post v4.1 code with target-core RCU conversion, the code
in target_stat_scsi_att_intr_port_show_attr_dev() no longer
uses se_lun->lun_se_dev, but the same race still exists.
To address the bug, go ahead and set se_lun>lun_shutdown as
early as possible in core_tpg_remove_lun(), and ensure new
NodeACL mappedlun creation in target_fabric_mappedlun_link()
fails during se_lun shutdown.
Reported-by: James Shen <[email protected]>
Cc: James Shen <[email protected]>
Tested-by: James Shen <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
---
drivers/target/target_core_fabric_configfs.c | 5 +++++
drivers/target/target_core_tpg.c | 4 ++++
include/target/target_core_base.h | 1 +
3 files changed, 10 insertions(+)
diff --git a/drivers/target/target_core_fabric_configfs.c b/drivers/target/target_core_fabric_configfs.c
index d8a16ca..d1e6cab 100644
--- a/drivers/target/target_core_fabric_configfs.c
+++ b/drivers/target/target_core_fabric_configfs.c
@@ -92,6 +92,11 @@ static int target_fabric_mappedlun_link(
pr_err("Source se_lun->lun_se_dev does not exist\n");
return -EINVAL;
}
+ if (lun->lun_shutdown) {
+ pr_err("Unable to create mappedlun symlink because"
+ " lun->lun_shutdown=true\n");
+ return -EINVAL;
+ }
se_tpg = lun->lun_tpg;
nacl_ci = &lun_acl_ci->ci_parent->ci_group->cg_item;
diff --git a/drivers/target/target_core_tpg.c b/drivers/target/target_core_tpg.c
index 6fb1919..dfaef4d 100644
--- a/drivers/target/target_core_tpg.c
+++ b/drivers/target/target_core_tpg.c
@@ -642,6 +642,8 @@ void core_tpg_remove_lun(
*/
struct se_device *dev = rcu_dereference_raw(lun->lun_se_dev);
+ lun->lun_shutdown = true;
+
core_clear_lun_from_tpg(lun, tpg);
/*
* Wait for any active I/O references to percpu se_lun->lun_ref to
@@ -663,6 +665,8 @@ void core_tpg_remove_lun(
}
if (!(dev->se_hba->hba_flags & HBA_FLAGS_INTERNAL_USE))
hlist_del_rcu(&lun->link);
+
+ lun->lun_shutdown = false;
mutex_unlock(&tpg->tpg_lun_mutex);
percpu_ref_exit(&lun->lun_ref);
diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
index 4b784b6..2e28246 100644
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -705,6 +705,7 @@ struct se_lun {
u64 unpacked_lun;
#define SE_LUN_LINK_MAGIC 0xffff7771
u32 lun_link_magic;
+ bool lun_shutdown;
bool lun_access_ro;
u32 lun_index;
--
1.9.1
On Thu, 2017-03-30 at 08:29 +0000, Nicholas A. Bellinger wrote:
> diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
> index 5041a9c..b464033 100644
> --- a/drivers/target/iscsi/iscsi_target_util.c
> +++ b/drivers/target/iscsi/iscsi_target_util.c
> @@ -737,21 +737,23 @@ void iscsit_free_cmd(struct iscsi_cmd *cmd, bool shutdown)
> {
> struct se_cmd *se_cmd = NULL;
> int rc;
> + bool op_scsi = false;
> /*
> * Determine if a struct se_cmd is associated with
> * this struct iscsi_cmd.
> */
> switch (cmd->iscsi_opcode) {
> case ISCSI_OP_SCSI_CMD:
> - se_cmd = &cmd->se_cmd;
> - __iscsit_free_cmd(cmd, true, shutdown);
> + op_scsi = true;
> /*
> * Fallthrough
> */
> case ISCSI_OP_SCSI_TMFUNC:
> - rc = transport_generic_free_cmd(&cmd->se_cmd, shutdown);
> - if (!rc && shutdown && se_cmd && se_cmd->se_sess) {
> - __iscsit_free_cmd(cmd, true, shutdown);
> + se_cmd = &cmd->se_cmd;
> + __iscsit_free_cmd(cmd, op_scsi, shutdown);
> + rc = transport_generic_free_cmd(se_cmd, shutdown);
> + if (!rc && shutdown && se_cmd->se_sess) {
> + __iscsit_free_cmd(cmd, op_scsi, shutdown);
> target_put_sess_cmd(se_cmd);
> }
> break;
Hello Nic,
I agree that this patch fixes a leak. However, an existing bug in
iscsit_free_cmd() is not addressed by this patch. Before the TMF code started
using kref_get() / kref_put() it was possible for transport_generic_free_cmd()
to determine whether or not iscsit_free_cmd() should call __iscsit_free_cmd()
by checking the command reference count. I think that since the TMF code
manipulates the command reference count it is no longer possible for
transport_generic_free_cmd() to determine this. If iscsit_free_cmd() is called
while a LUN RESET is in progress then the return value of
transport_generic_free_cmd() will be wrong. I will post a few patches that not
only address what I just described but also the leak fixed by this patch.
Bart.
On Thu, 2017-03-30 at 17:08 +0000, Bart Van Assche wrote:
> On Thu, 2017-03-30 at 08:29 +0000, Nicholas A. Bellinger wrote:
> > diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
> > index 5041a9c..b464033 100644
> > --- a/drivers/target/iscsi/iscsi_target_util.c
> > +++ b/drivers/target/iscsi/iscsi_target_util.c
> > @@ -737,21 +737,23 @@ void iscsit_free_cmd(struct iscsi_cmd *cmd, bool shutdown)
> > {
> > struct se_cmd *se_cmd = NULL;
> > int rc;
> > + bool op_scsi = false;
> > /*
> > * Determine if a struct se_cmd is associated with
> > * this struct iscsi_cmd.
> > */
> > switch (cmd->iscsi_opcode) {
> > case ISCSI_OP_SCSI_CMD:
> > - se_cmd = &cmd->se_cmd;
> > - __iscsit_free_cmd(cmd, true, shutdown);
> > + op_scsi = true;
> > /*
> > * Fallthrough
> > */
> > case ISCSI_OP_SCSI_TMFUNC:
> > - rc = transport_generic_free_cmd(&cmd->se_cmd, shutdown);
> > - if (!rc && shutdown && se_cmd && se_cmd->se_sess) {
> > - __iscsit_free_cmd(cmd, true, shutdown);
> > + se_cmd = &cmd->se_cmd;
> > + __iscsit_free_cmd(cmd, op_scsi, shutdown);
> > + rc = transport_generic_free_cmd(se_cmd, shutdown);
> > + if (!rc && shutdown && se_cmd->se_sess) {
> > + __iscsit_free_cmd(cmd, op_scsi, shutdown);
> > target_put_sess_cmd(se_cmd);
> > }
> > break;
>
> Hello Nic,
>
> I agree that this patch fixes a leak. However, an existing bug in
> iscsit_free_cmd() is not addressed by this patch. Before the TMF code started
> using kref_get() / kref_put() it was possible for transport_generic_free_cmd()
> to determine whether or not iscsit_free_cmd() should call __iscsit_free_cmd()
> by checking the command reference count. I think that since the TMF code
> manipulates the command reference count it is no longer possible for
> transport_generic_free_cmd() to determine this. If iscsit_free_cmd() is called
> while a LUN RESET is in progress then the return value of
> transport_generic_free_cmd() will be wrong.
No. Your assumption is incorrect wrt transport_generic_free_cmd()
having a wrong return value during LUN_RESET.
It's incorrect because when iscsit_free_cmd() is called with
shutdown=true resulting in transport_generic_free_cmd() with
wait_for_tasks=true, target_wait_free_cmd() checks for CMD_T_ABORTED to
determine if se_cmd has been aborted by target_core_tmr.c logic, and
returns aborted=true back up to transport_generic_free_cmd().
When transport_generic_free_cmd() gets aborted=true, it waits for
se_cmd->cmd_wait_comp to finish and calls cmd->se_tfo->release_cmd()
to release se_cmd back to the pre-allocated session pool.
At this point, transport_generic_free_cmd() with aborted=true must
return '1' it's caller, signaling iscsit_free_cmd() to not attempt to
perform the extra __iscsi_free_cmd() or target_put_sess_cmd(), because
se_cmd has already been released back to the session pool.