2022-12-14 07:19:27

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH 0/2] scsi:donot skip lun if inquiry returns PQ=1 for all hosts

commit 948e922fc4461 ("scsi: core: map PQ=1, PDT=other values to
SCSI_SCAN_TARGET_PRESENT") returns SCSI_SCAN_TARGET_PRESENT if inquiry
returns PQ=1.

According to the SPC, PQ=1 means the addressed logical unit having the
indicated device type is not accessible, it does not mean the addressed
logical unit is invalid. We still can map this lun to an sg device.

In some conditions, we do not want to skip these devices, for example
with iSCSI:

When iSCSI initiator logged in target, the target attached none valid
lun but lun0. lun0 is not an valid disk, while it would response
inquiry command with PQ=1 and other general scsi commands like probe lun.
The others luns of target is added/removed dynamicly.

We want the lun0 to be mapped to an sg device in initiator, so we can
probe luns of target based on lun0.

In first patch, I add an interface to control if to skip luns return
PQ=1 for inquiry.

In second patch, make iscsi_tcp do not skip luns return PQ=1 as default,
since I do not have iscsi_tcp environment, so here just modified the
iscsi_tcp.

Wenchao Hao (2):
scsi:core:Add sysfs interface to control if skip lun with PQ=1
scsi:iscsi_tcp:Do not skip lun inquiry returns PQ=1

drivers/scsi/iscsi_tcp.c | 1 +
drivers/scsi/scsi_scan.c | 9 ++++++---
drivers/scsi/scsi_sysfs.c | 29 +++++++++++++++++++++++++++++
include/scsi/scsi_host.h | 3 +++
4 files changed, 39 insertions(+), 3 deletions(-)

--
2.32.0


2022-12-14 07:22:01

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH 1/2] scsi:core:Add sysfs interface to control if skip lun with PQ=1

commit 948e922fc4461 ("scsi: core: map PQ=1, PDT=other values to
SCSI_SCAN_TARGET_PRESENT") returns SCSI_SCAN_TARGET_PRESENT if inquiry
returns PQ=1.

According to the SPC, PQ=1 means the addressed logical unit having the
indicated device type is not accessible, it does not mean the addressed
logical unit is invalid. We still can map this lun to an sg device.

In some conditions, we do not want to skip these devices, for example
with iSCSI:

When iSCSI initiator logged in target, the target attached none valid
lun but lun0. lun0 is not an valid disk, while it would response
inquiry command with PQ=1 and other general scsi commands like probe lun.
The others luns of target is added/removed dynamicly.

We want the lun0 to be mapped to an sg device in initiator, so we can
probe luns of target based on lun0.

I add an sysfs interface named no_skip_pq1 in each Scsi_Host to
control if to skip lun which return PQ=1 for inquiry.

The default behavior is not changed, which means we would still skip
add lun if inquiry returns PQ=1. We can set host's no_skip_pq1 in
specific drivers or via sysfs.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_scan.c | 9 ++++++---
drivers/scsi/scsi_sysfs.c | 29 +++++++++++++++++++++++++++++
include/scsi/scsi_host.h | 3 +++
3 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 920b145f80b7..bd4faaabee8c 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1233,10 +1233,13 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
* that no LUN is present, so don't add sdev in these cases.
* Two specific examples are:
* 1) NetApp targets: return PQ=1, PDT=0x1f
- * 2) IBM/2145 targets: return PQ=1, PDT=0
- * 3) USB UFI: returns PDT=0x1f, with the PQ bits being "reserved"
+ * 2) USB UFI: returns PDT=0x1f, with the PQ bits being "reserved"
* in the UFI 1.0 spec (we cannot rely on reserved bits).
*
+ * Targets set PQ=1 would be skipped if shost->no_skip_pq1 is not set
+ * For example:
+ * 1) IBM/2145 targets: return PQ=1, PDT=0
+ *
* References:
* 1) SCSI SPC-3, pp. 145-146
* PQ=1: "A peripheral device having the specified peripheral
@@ -1248,7 +1251,7 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
* PDT=00h Direct-access device (floppy)
* PDT=1Fh none (no FDD connected to the requested logical unit)
*/
- if (((result[0] >> 5) == 1 ||
+ if ((((result[0] >> 5) == 1 && !shost->no_skip_pq1) ||
(starget->pdt_1f_for_no_lun && (result[0] & 0x1f) == 0x1f)) &&
!scsi_is_wlun(lun)) {
SCSI_LOG_SCAN_BUS(3, sdev_printk(KERN_INFO, sdev,
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index f2a345cc0f8a..a72466c7e3c4 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -369,6 +369,34 @@ store_shost_eh_deadline(struct device *dev, struct device_attribute *attr,

static DEVICE_ATTR(eh_deadline, S_IRUGO | S_IWUSR, show_shost_eh_deadline, store_shost_eh_deadline);

+static ssize_t
+show_no_skip_pq1(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(dev);
+
+ return sysfs_emit(buf, "%s\n", shost->no_skip_pq1 ? "Y" : "N");
+}
+
+static ssize_t
+store_no_skip_pq1(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct Scsi_Host *shost = class_to_shost(dev);
+ int ret = -EINVAL;
+ bool store_val;
+
+ ret = kstrtobool(buf, &store_val);
+ if (ret)
+ return ret;
+
+ shost->no_skip_pq1 = store_val;
+
+ return count;
+}
+
+static DEVICE_ATTR(no_skip_pq1, S_IRUGO | S_IWUSR, show_no_skip_pq1, store_no_skip_pq1);
+
shost_rd_attr(unique_id, "%u\n");
shost_rd_attr(cmd_per_lun, "%hd\n");
shost_rd_attr(can_queue, "%d\n");
@@ -421,6 +449,7 @@ static struct attribute *scsi_sysfs_shost_attrs[] = {
&dev_attr_host_reset.attr,
&dev_attr_eh_deadline.attr,
&dev_attr_nr_hw_queues.attr,
+ &dev_attr_no_skip_pq1.attr,
NULL
};

diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 587cc767bb67..56bb11d9a886 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -659,6 +659,9 @@ struct Scsi_Host {
/* The transport requires the LUN bits NOT to be stored in CDB[1] */
unsigned no_scsi2_lun_in_cdb:1;

+ /* Do not skip adding lun if inquiry command returns PQ == 1 */
+ unsigned no_skip_pq1:1;
+
/*
* Optional work queue to be utilized by the transport
*/
--
2.32.0

2022-12-14 07:35:27

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH 2/2] scsi:iscsi_tcp:Do not skip lun inquiry returns PQ=1

When luns inquiry return PQ=1, do not skip this lun and try to
map these luns to an sg device.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/iscsi_tcp.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 5fb1f364e815..316e2e17c72d 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -941,6 +941,7 @@ iscsi_sw_tcp_session_create(struct iscsi_endpoint *ep, uint16_t cmds_max,
shost->max_id = 0;
shost->max_channel = 0;
shost->max_cmd_len = SCSI_MAX_VARLEN_CDB_SIZE;
+ shost->no_skip_pq1 = 1;

rc = iscsi_host_get_max_scsi_cmds(shost, cmds_max);
if (rc < 0)
--
2.32.0

2022-12-15 07:19:09

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 0/2] scsi:donot skip lun if inquiry returns PQ=1 for all hosts

On Wed, Dec 14, 2022 at 03:08:44PM +0800, Wenchao Hao wrote:
> When iSCSI initiator logged in target, the target attached none valid
> lun but lun0. lun0 is not an valid disk, while it would response
> inquiry command with PQ=1 and other general scsi commands like probe lun.
> The others luns of target is added/removed dynamicly.

I can't find any special casing of LUN0 in RFC7144, can you clarify
where you think that treats LUN0 any differently than other transports?

2022-12-15 08:13:10

by Ulrich Windl

[permalink] [raw]
Subject: Antw: [EXT] Re: [PATCH 0/2] scsi:donot skip lun if inquiry returns PQ=1 for all hosts

>>> Christoph Hellwig <[email protected]> schrieb am 15.12.2022 um 08:06 in
Nachricht <[email protected]>:
> On Wed, Dec 14, 2022 at 03:08:44PM +0800, Wenchao Hao wrote:
>> When iSCSI initiator logged in target, the target attached none valid
>> lun but lun0. lun0 is not an valid disk, while it would response
>> inquiry command with PQ=1 and other general scsi commands like probe lun.
>> The others luns of target is added/removed dynamicly.
>
> I can't find any special casing of LUN0 in RFC7144, can you clarify
> where you think that treats LUN0 any differently than other transports?

Actusally I have no idea, but as a user of FC SAN systems I can remember a case when a storage system had to present a dummy LUN0 to enable hosts to find other LUNs (while LUN0 was never actually used). Maybe the client code was imperfect, I don't know.

>
> --
> You received this message because you are subscribed to the Google Groups
> "open-iscsi" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/open-iscsi/Y5rHX95Vvl1aLhbp%40infradead.org
> .




2022-12-15 10:27:25

by Wenchao Hao

[permalink] [raw]
Subject: Re: [PATCH 0/2] scsi:donot skip lun if inquiry returns PQ=1 for all hosts


On 2022/12/15 15:06, Christoph Hellwig wrote:
> On Wed, Dec 14, 2022 at 03:08:44PM +0800, Wenchao Hao wrote:
>> When iSCSI initiator logged in target, the target attached none valid
>> lun but lun0. lun0 is not an valid disk, while it would response
>> inquiry command with PQ=1 and other general scsi commands like probe lun.
>> The others luns of target is added/removed dynamicly.
>
> I can't find any special casing of LUN0 in RFC7144, can you clarify
> where you think that treats LUN0 any differently than other transports?
> .
This is not described in RFC7144. The sense described above
aims to tell that a dummy lun is useful.

In my opinion, if the addressed lun still response the
inquiry and other commands, we should not skip it,
maybe let the scsi drivers like sd/st/sg to determine
how to handle this lun accordint to the PQ value.

As discussed in following mail, another drivers would
be broken too.

https://lore.kernel.org/linux-scsi/CA+PODjqrRzyJnOKoabMOV4EPByNnL1LgTi+QAKENP3NwUq5YCw@mail.gmail.com/

The key point of my patch is to add one way to map
dummy lun to an sg device.

2022-12-16 07:31:59

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 0/2] scsi:donot skip lun if inquiry returns PQ=1 for all hosts

On Thu, Dec 15, 2022 at 05:09:31PM +0800, Wenchao Hao wrote:
> In my opinion, if the addressed lun still response the
> inquiry and other commands, we should not skip it,
> maybe let the scsi drivers like sd/st/sg to determine
> how to handle this lun accordint to the PQ value.
>
> As discussed in following mail, another drivers would
> be broken too.

So why do you force a specific behavior for iSCSI?

2022-12-16 07:56:41

by Christoph Hellwig

[permalink] [raw]
Subject: Re: Antw: [EXT] Re: [PATCH 0/2] scsi:donot skip lun if inquiry returns PQ=1 for all hosts

On Thu, Dec 15, 2022 at 09:07:28AM +0100, Ulrich Windl wrote:
> Actusally I have no idea, but as a user of FC SAN systems I can remember a case when a storage system had to present a dummy LUN0 to enable hosts to find other LUNs (while LUN0 was never actually used). Maybe the client code was imperfect, I don't know.

Ignoring some of the well known LU bits that never really became
practically relevant, lun0 is needed to use the REPORT_LUNS command
to scane for the other logical units. But unless the PQ says it
actually is a valid logic unit, we never add a sdev for it.

2022-12-16 11:54:31

by Wenchao Hao

[permalink] [raw]
Subject: Re: [PATCH 0/2] scsi:donot skip lun if inquiry returns PQ=1 for all hosts


On 2022/12/16 15:12, Christoph Hellwig wrote:
> On Thu, Dec 15, 2022 at 05:09:31PM +0800, Wenchao Hao wrote:
>> In my opinion, if the addressed lun still response the
>> inquiry and other commands, we should not skip it,
>> maybe let the scsi drivers like sd/st/sg to determine
>> how to handle this lun accordint to the PQ value.
>>
>> As discussed in following mail, another drivers would
>> be broken too.
>
> So why do you force a specific behavior for iSCSI?
> .

For nothing, I want the iscsi_tcp transport do not skip PQ=1 default
as what it did before commit 948e922fc4461 ("scsi: core: map PQ=1,
PDT=other values to SCSI_SCAN_TARGET_PRESENT").

2022-12-18 22:29:30

by Mike Christie

[permalink] [raw]
Subject: Re: [PATCH 1/2] scsi:core:Add sysfs interface to control if skip lun with PQ=1

On 12/14/22 1:08 AM, Wenchao Hao wrote:
>
> When iSCSI initiator logged in target, the target attached none valid
> lun but lun0. lun0 is not an valid disk, while it would response
> inquiry command with PQ=1 and other general scsi commands like probe lun.
> The others luns of target is added/removed dynamicly.
>
> We want the lun0 to be mapped to an sg device in initiator, so we can
> probe luns of target based on lun0.

What do you want to do exactly? Is the idea with your patch we would create
an sg device, then in userpsace you would do some scan related commands. If
you find devices then you use sysfs to have scsi-ml scan/add a specific device
like

echo 5 0 0 8 > host5/scan

?

It's not really clear to me why you need the sg device, and can't just do?

echo - - - > .../hostN/scan

? Do you only want to add specific devices like you are doing some sort of
LUN masking on the initiator side?

Is the issue that you need the sg device there, so you can detect when a device
is no longer present on the target and then you will have userspace remove the
device via the sysfs delete file?

2022-12-23 16:04:48

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 0/2] scsi:donot skip lun if inquiry returns PQ=1 for all hosts

On Fri, Dec 16, 2022 at 07:41:26PM +0800, Wenchao Hao wrote:
> For nothing, I want the iscsi_tcp transport do not skip PQ=1 default
> as what it did before commit 948e922fc4461 ("scsi: core: map PQ=1,
> PDT=other values to SCSI_SCAN_TARGET_PRESENT").

Well, that commit was very much intentional and is now three an a half
years old, so we've not just going to partially revert it on a
per-transport basis when it is in no way transport related.

If you can come up with a good enough rationale we could do the
sysfs override, but so far the reason mostly seems to be "I want"
and not anctual explanation of why it is useful.

2022-12-28 09:29:17

by Wenchao Hao

[permalink] [raw]
Subject: Re: [PATCH 1/2] scsi:core:Add sysfs interface to control if skip lun with PQ=1

On 2022/12/19 5:37, Mike Christie wrote:
> On 12/14/22 1:08 AM, Wenchao Hao wrote:
>>
>> When iSCSI initiator logged in target, the target attached none valid
>> lun but lun0. lun0 is not an valid disk, while it would response
>> inquiry command with PQ=1 and other general scsi commands like probe lun.
>> The others luns of target is added/removed dynamicly.
>>
>> We want the lun0 to be mapped to an sg device in initiator, so we can
>> probe luns of target based on lun0.
>
> What do you want to do exactly? Is the idea with your patch we would create
> an sg device, then in userpsace you would do some scan related commands. If
> you find devices then you use sysfs to have scsi-ml scan/add a specific device
> like
>
> echo 5 0 0 8 > host5/scan
>
> ?
>
> It's not really clear to me why you need the sg device, and can't just do?
>
> echo - - - > .../hostN/scan
>
We do not directly echo - - - > .../hostN/scan. Instead, we use rescan-scsi-bus.sh
defined in sg3_utils to scan luns. If no sg device mapped to this host, the default
scan operations could not recognize luns.

We can simulate this with scsi_debug by following steps:

1. build scsi debug with following changes;

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 8553277effb3..36dcdd2c3fe4 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -836,6 +836,7 @@ static int dif_errors;
static bool sdeb_zbc_in_use; /* true for host-aware and host-managed disks */
static int sdeb_zbc_zone_cap_mb;
static int sdeb_zbc_zone_size_mb;
+static int sdeb_pq_type;
static int sdeb_zbc_max_open = DEF_ZBC_MAX_OPEN_ZONES;
static int sdeb_zbc_nr_conv = DEF_ZBC_NR_CONV_ZONES;

@@ -1583,6 +1584,7 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
u32 alloc_len, n;
int ret;
bool have_wlun, is_disk, is_zbc, is_disk_zbc;
+ struct scsi_device *sdev = scp->device;

alloc_len = get_unaligned_be16(cmd + 3);
arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_ATOMIC);
@@ -1598,7 +1600,11 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
pq_pdt = 0x7f; /* not present, PQ=3, PDT=0x1f */
else
pq_pdt = (sdebug_ptype & 0x1f);
- arr[0] = pq_pdt;
+ if (sdev->lun == 0) {
+ arr[0] = pq_pdt | (sdeb_pq_type << 5);
+ } else {
+ arr[0] = pq_pdt;
+ }
if (0x2 & cmd[1]) { /* CMDDT bit set */
mk_sense_invalid_fld(scp, SDEB_IN_CDB, 1, 1);
kfree(arr);
@@ -5883,6 +5889,7 @@ module_param_named(zone_cap_mb, sdeb_zbc_zone_cap_mb, int, S_IRUGO);
module_param_named(zone_max_open, sdeb_zbc_max_open, int, S_IRUGO);
module_param_named(zone_nr_conv, sdeb_zbc_nr_conv, int, S_IRUGO);
module_param_named(zone_size_mb, sdeb_zbc_zone_size_mb, int, S_IRUGO);
+module_param_named(pq_type, sdeb_pq_type, int, S_IRUGO);

MODULE_AUTHOR("Eric Youngdale + Douglas Gilbert");
MODULE_DESCRIPTION("SCSI debug adapter driver");
@@ -7825,6 +7832,7 @@ static int sdebug_driver_probe(struct device *dev)
error = -ENODEV;
return error;
}
+ //hpnt->no_skip_pq1 = 1;
if (submit_queues > nr_cpu_ids) {
pr_warn("%s: trim submit_queues (was %d) to nr_cpu_ids=%u\n",
my_name, submit_queues, nr_cpu_ids);

2. insmod scsi_debug with modparam max_luns=4 pq_type=1, we would get following scsi devices

# lsscsi -g
[0:0:0:1] disk Linux scsi_debug 0191 /dev/sda /dev/sg0
[0:0:0:2] disk Linux scsi_debug 0191 /dev/sdb /dev/sg1
[0:0:0:3] disk Linux scsi_debug 0191 /dev/sdc /dev/sg2

3. remove the scsi_debug disks

echo 1 > /sys/block/sda/device/delete
echo 1 > /sys/block/sdb/device/delete
echo 1 > /sys/block/sdc/device/delete

4. scan with rescan-scsi-bus.sh would failed to discovery luns.

# rescan-scsi-bus.sh
Scanning SCSI subsystem for new devices
Scanning host 0 for SCSI target IDs 0 1 2 3 4 5 6 7, all LUNs
0 new or changed device(s) found.
0 remapped or resized device(s) found.
0 device(s) removed.

Did not add the luns. If do not comment the line hpnt->no_skip_pq1 = 1 in
scsi_debug.c, the scan operations would found these luns succeed.

> ? Do you only want to add specific devices like you are doing some sort of
> LUN masking on the initiator side?
>
> Is the issue that you need the sg device there, so you can detect when a device
> is no longer present on the target and then you will have userspace remove the
> device via the sysfs delete file?
> .

Yes, this is one scenarios. luns on the target are removed or added dynamically,
the only fixed one is LUN0 which response PQ=1.

If luns are removed from target after been added by initiator, we could know which
one should be removed from initiator via LUN0.

We can think LUN0 with PQ=1 is a dummy lun, which is fixed and can tell luns info on
the target.

According to the SPC, PQ=1 means the addressed logical unit having the
indicated device type is not accessible, it does not mean the addressed
logical unit is invalid. So I think we still can map this lun to an sg device.