2002-06-15 13:36:19

by Kurt Garloff

Subject: /proc/scsi/map

Hi SCSI users,

From people using SCSI devices, one question turns up again
and again: how can one assign stable device names to SCSI devices when
some devices may or may not be switched on or connected?

There are a couple of ways to address this problem:
(a) mount by uuid
(b) userspace programs that collect information to create
alternative and persistent names / device nodes, such
as Eric Youngdale's scsidev[1], Doug Gilbert's sg_map[2], scsimon[3],
or Mike Sullivan's scsiname/devnaming[4]
(c) devfs

[1] http://www.garloff.de/kurt/linux/scsidev/
[2] http://www.torque.net/sg/
[3] http://www.torque.net/scsi/scsimon.html
[4] http://oss.software.ibm.com/devreg/

Unfortunately, those approaches all have some deficiencies.
Ad (a): Only works for ext2 filesystems. Locating / requires
additional initrd work.
Ad (b): A considerable amount of work needs to be done in userspace:
- For all device types, you need to probe the possible devices
- You need to do SCSI_IOCTL_GET_IDLUN to get the controller, bus,
target and unit numbers
The problem is that collecting this information is
not always successful. If no medium is inserted, the
open() fails for some device types, despite O_NONBLOCK.
Assumptions about the order of devices, OTOH, are not safe,
as remove-single-device/add-single-device may result in a
non-straightforward ordering.
Ad (c): devfs is currently not (yet?) an option for distributions
due to security and stability considerations.

Life would be easier if the scsi subsystem would just report which SCSI
device (uniquely identified by the controller,bus,target,unit tuple) belongs
to which high-level device. The information is available in the kernel.

Attached patch does this:
garloff@pckurt:/raid5/Kernel/src $ cat /proc/scsi/map
# C,B,T,U  Type  onl  sg_nm  sg_dev   nm     dev(hex)
0,0,00,00  0x05  1    sg0    c:15:00  sr0    b:0b:00
1,0,01,00  0x05  1    sg1    c:15:01  sr1    b:0b:01
1,0,02,00  0x01  1    sg2    c:15:02  osst0  c:ce:00
1,0,03,00  0x05  1    sg3    c:15:03  sr2    b:0b:02
1,0,05,00  0x00  1    sg4    c:15:04  sda    b:08:00
1,0,09,00  0x00  1    sg5    c:15:05  sdb    b:08:10
2,0,01,00  0x05  1    sg6    c:15:06  sr3    b:0b:03
2,0,02,00  0x01  1    sg7    c:15:07  osst1  c:ce:01
2,0,03,00  0x05  1    sg8    c:15:08  sr4    b:0b:04
2,0,05,00  0x00  1    sg9    c:15:09  sdc    b:08:20
2,0,09,00  0x00  1    sg10   c:15:0a  sdd    b:08:30
3,0,10,00  0x00  1    sg11   c:15:0b  sde    b:08:40
3,0,12,00  0x00  1    sg12   c:15:0c  sdf    b:08:50

This allows a simple script to parse the map and create device nodes as
needed.
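
To illustrate, a minimal sketch of such a script might look like the
following. It is not the script shipped with the patch. The function names
map_to_mknod and strip0, the /dev/scsi default directory and the rule used
to derive the name prefix are assumptions made for this example; the name
pattern follows the /dev/scsi/sdc2b0t9u0 style mentioned in this mail. The
sketch only prints mknod commands and creates nothing.

```sh
#!/bin/sh
# Sketch only: print mknod commands for the devices listed in a map in the
# format shown above. Nothing is created; the commands are only printed.

# Drop leading zeros so that "09" becomes "9" and "00" becomes "0".
strip0() {
    n=$1
    while [ "${#n}" -gt 1 ] && [ "${n#0}" != "$n" ]; do
        n=${n#0}
    done
    printf '%s\n' "$n"
}

# Read map lines on stdin; $1 is the (assumed) target directory.
map_to_mknod() {
    dir=${1:-/dev/scsi}
    while read -r cbtu type onl sgnm sgdev nm dev rest; do
        case $cbtu in
            '#'*|'') continue ;;            # header/comment or blank line
        esac
        # No high-level driver besides sg attached: nothing to create.
        [ -n "$nm" ] && [ -n "$dev" ] || continue

        oldifs=$IFS
        IFS=,; set -- $cbtu                 # $1..$4 = C B T U
        IFS=$oldifs
        c=$(strip0 "$1"); b=$(strip0 "$2"); t=$(strip0 "$3"); u=$(strip0 "$4")

        # Assumed prefix rule: "sd" for disks, else the name minus its number.
        case $nm in
            sd*) prefix=sd ;;
            *)   prefix=${nm%%[0-9]*} ;;
        esac

        IFS=:; set -- $dev                  # $1 = b|c, $2 = major, $3 = minor (hex)
        IFS=$oldifs
        printf 'mknod %s/%sc%sb%st%su%s %s %d %d\n' \
            "$dir" "$prefix" "$c" "$b" "$t" "$u" "$1" "$((0x$2))" "$((0x$3))"
    done
}

# Demo with two lines taken from the output above.
map_to_mknod <<'EOF'
# C,B,T,U Type onl sg_nm sg_dev nm dev(hex)
1,0,09,00 0x00 1 sg5 c:15:05 sdb b:08:10
1,0,02,00 0x01 1 sg2 c:15:02 osst0 c:ce:00
EOF
```

On a real system the input would come from /proc/scsi/map instead of the
here-document, and the printed commands could be reviewed before running
them as root.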

The patch works as follows:
- Add a find_kdev() function pointer to the high-level driver template
structure. The function takes a Scsi_Device pointer (which points to a
low-level device) and returns a name and a kdev_t if the device
is attached to this high-level driver.
- Implement the function for sg, sd, sr, st, osst
- Make scsi/scsi_proc iterate over all devices and call each high-level
driver's find_kdev() to find out about it.

Obviously, it can only report the assignment of high-level drivers
if they are loaded; otherwise the last two columns will stay empty.
(sg is handled specially, as we know it supports all devices.)
If we attach a third high-level device driver, two more columns would show
up.
(Is this variable column number format a problem?)

The patch also includes a simple shell script that assigns names of the
form /dev/scsi/sdc2b0t9u0 to those devices and creates device nodes
(or, optionally, symlinks to the old device names) with these names.

The design allows for two more things:
* using root=/dev/scsi/sdc1b0t9u0p5 without much additional code
(patch follows in another mail)
* in case we want to support more than 128 scsi disks, the information
about additional majors can be reported by /proc/scsi/map without further
change

Patch is against 2.4.19pre10.

I'd like to get it accepted into the kernel.
So please give your criticism ...
I already got some from Doug Gilbert :-)

A patch for 2.5 should be done as well, if the design is OK, of course.

Regards,
--
Kurt Garloff <[email protected]> [Eindhoven, NL]
Physics: Plasma simulations <[email protected]> [TU Eindhoven, NL]
Linux: SCSI, Security <[email protected]> [SuSE Nuernberg, DE]
(See mail header or public key servers for PGP2 and GPG public keys.)



2002-06-16 19:24:44

by Albert D. Cahalan

Subject: Re: /proc/scsi/map

Kurt Garloff writes:
> garloff@pckurt:/raid5/Kernel/src $ cat /proc/scsi/map
> # C,B,T,U Type onl sg_nm sg_dev nm dev(hex)
> 0,0,00,00 0x05 1 sg0 c:15:00 sr0 b:0b:00
> 1,0,01,00 0x05 1 sg1 c:15:01 sr1 b:0b:01
> 1,0,02,00 0x01 1 sg2 c:15:02 osst0 c:ce:00
> 1,0,03,00 0x05 1 sg3 c:15:03 sr2 b:0b:02
> 1,0,05,00 0x00 1 sg4 c:15:04 sda b:08:00
> 1,0,09,00 0x00 1 sg5 c:15:05 sdb b:08:10
> 2,0,01,00 0x05 1 sg6 c:15:06 sr3 b:0b:03
> 2,0,02,00 0x01 1 sg7 c:15:07 osst1 c:ce:01
> 2,0,03,00 0x05 1 sg8 c:15:08 sr4 b:0b:04
> 2,0,05,00 0x00 1 sg9 c:15:09 sdc b:08:20
> 2,0,09,00 0x00 1 sg10 c:15:0a sdd b:08:30
> 3,0,10,00 0x00 1 sg11 c:15:0b sde b:08:40
> 3,0,12,00 0x00 1 sg12 c:15:0c sdf b:08:50
>
> This allows a simple script to parse the map and create device
> nodes as needed.
...
> Obviously, it can only report the assignment of high-level drivers,
> if they are loaded, otherwise the last two columns will stay empty.
> (sg is handled especially, as we know it supports all devices.)
> If we attach a third high-level device driver, two more columns
> would show up. (Is this variable column number format a problem?)

The variable column format is of course annoying, but use
it if you must. The also-annoying alternative is to pick
a fill character that would be easy for a beginner to
handle in a script. Maybe one of: @ - . / ?

The header line is far worse. It's too terse to be very helpful.
It gets in the way of every person writing a parser. Even in
your example script, you had to hack your way around it:

> +while read cbtu tp onl sgnm sgdev othnm othdev oothnm oothdev rest; do
> + # Skip comment line(s)
> + if test "${cbtu:0:1}" = "#"; then continue; fi
> + # If we're just dealing with one device, do skip the others
> + if test ! -z "$CMPAGAINST" -a "$CMPAGAINST" != "$cbtu"; then continue; fi

2002-06-16 19:41:13

by Kurt Garloff

Subject: Re: /proc/scsi/map

Hi Richard,

I was in no way intending to trigger a discussion about devfs.
Some of the things addressed by the scsi/map patch indeed are no issue if
you use devfs; that's why I mentioned devfs at all.
I don't want to bash devfs and I think it's nice that it's in the kernel,
so users have the choice to use it and the motivation to improve it.

But the problem that I wanted to address IMHO should also be solved
for those people that for one or another reason decided not to use
devfs.

And face it: I do not think that all major Linux distributions will
start to use devfs in the near future. The example you mentioned
(Mandrake) is certainly not a good one: Look at their update kernel.

Best regards,
--
Kurt Garloff <[email protected]> Eindhoven, NL
GPG key: See mail header, key servers Linux kernel development
SuSE Linux AG, Nuernberg, DE SCSI, Security



2002-06-16 20:36:16

by Kurt Garloff

Subject: Re: /proc/scsi/map

Hi Sancho,

On Sun, Jun 16, 2002 at 12:40:50AM +0200, Sancho Dauskardt wrote:
> >In lk 2.5 we are hoping that driverfs will give us an
> >"information bridge" between scsi pseudo devices
> >and other driver subsystems such as ide, usb and ieee1394.
> >Mike Sullivan's persistent naming patch (that I mentioned
> >in my previous post on this thread) adds driverfs capability
> >into the scsi subsystem. Driverfs capability is already
> >in the ide and usb subsystems.
> Driverfs will hopefully solve the problem, of "oh there's a SCSI device.
> how is it connected ?".
>
> But to date, SCSI doesn't know about the GUID's, right ?
> And without this, we won't get a uniform way of creating stable names for
> hot-plugable devices...

For the SCSI code, a device is uniquely identified by the CBTU tuple.
Every device has one and it's known to the SCSI code.
If you don't change your SCSI IDs (which rarely happens in practice) and
make sure the SCSI host adapters are always loaded in the same order (by
using scsihosts= or by just doing the modprobe in a certain order), you
even have a stable way to address devices.

The SCSI code does not know about GUIDs or such things. It's up to the
lowlevel code to report such things. Or change the SCSI code to collect
such information.

Low-level drivers that are supposed to have a lot of SCSI devices attached,
such as FC drivers, should provide a way of making a stable mapping from the
identifiers in FC Land (Unique IDs) to the identifiers in SCSI land (CBTU).
If not, they should at least report the mapping in some proc file.

A more general approach could be to build a unique identifier from a host
adapter identifier and the BTU and use this unique identifier (which, for
FC devices, would be the FC UID) inside the SCSI code instead of
CBTU. Then the first column of /proc/scsi/map would report this identifier
instead ...

But that would be a more invasive change than the simple code that I added
to report the SCSI CBTU to SCSI high-level devices mapping.

Regards,
--
Kurt Garloff <[email protected]> [Eindhoven, NL]
Physics: Plasma simulations <[email protected]> [TU Eindhoven, NL]
Linux: SCSI, Security <[email protected]> [SuSE Nuernberg, DE]
(See mail header or public key servers for PGP2 and GPG public keys.)



2002-06-16 21:04:27

by Kurt Garloff

Subject: Re: /proc/scsi/map

Hi Andries,

On Sat, Jun 15, 2002 at 06:04:19PM +0200, Andries Brouwer wrote:
> > How can one assign stable device names to SCSI devices in
> > case there are devices that may or may not be switched on or connected.
>
> An interesting unsolved problem.
> [Your discussion confuses a few things, especially in the context
> of removable devices: a uuid lives on the disk, the C,B,T,U tends
> to identify the drive rather than the disk.]

Sure. But those are the things that are normally proposed to relieve the
situation. (And yes, I got the uuid thing confused.)
I actually forgot one. LVM. There, signatures are stored on the disks.

> > Life would be easier if the scsi subsystem would just report which
> > SCSI device (uniquely identified by the controller,bus,target,unit tuple)
> > belongs to which high-level device.
>
> Yes. I took your patch, ported it to 2.5, and tried it out.

Oh, I expected to do it myself if it does not get bashed too much ...
THANKS!
And, yes, I agree, I would prefer to know it's accepted in 2.5 before it's
applied to 2.4. It's just that I develop on 2.4 currently :-(

> Very good - in combination with /proc/scsi/scsi this gives
> good information. I like it.
>
> But just "cat /proc/scsi/map" is not good enough.

I did not want to duplicate the information from /proc/scsi/scsi. The
idea was that it should be straightforward to make device nodes from
the information provided in map and to allow more elaborate code in
userspace to know where to collect information from.

> From the above output alone one cannot easily guess which is which.
> One would need a small utility that reads /proc/scsi/map and
> /proc/scsi/scsi and produces something readable.
> Will add sth to util-linux in case this gets accepted.

That would be great!
Because the mapping between sg and the other device is known, you can use
the sg device to do ioctls on it (or even send SCSI commands) or to use
the information from /proc/scsi/sg/ as well.
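
A small sketch of that lookup, assuming the map format posted at the start
of this thread: given a high-level device name, print the sg name that
shares its SCSI address. The function name sg_for is made up for this
example.

```sh
#!/bin/sh
# Sketch: print the sg name that belongs to a given high-level device
# name (e.g. "sdb" -> "sg5"). Reads the map on stdin.
sg_for() {
    want=$1
    while read -r cbtu type onl sgnm sgdev rest; do
        case $cbtu in
            '#'*|'') continue ;;        # header/comment or blank line
        esac
        # The remaining columns come in name/dev pairs, one per driver.
        set -- $rest
        while [ $# -ge 2 ]; do
            if [ "$1" = "$want" ]; then
                printf '%s\n' "$sgnm"
                return 0
            fi
            shift 2
        done
    done
    return 1                            # not found
}

# Demo with lines taken from the posted output.
sg_for sdb <<'EOF'
# C,B,T,U Type onl sg_nm sg_dev nm dev(hex)
1,0,05,00 0x00 1 sg4 c:15:04 sda b:08:00
1,0,09,00 0x00 1 sg5 c:15:05 sdb b:08:10
EOF
```

The resulting name can then be used to open the corresponding sg node,
which works even when no medium is inserted.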

Regards,
--
Kurt Garloff <[email protected]> Eindhoven, NL
GPG key: See mail header, key servers Linux kernel development
SuSE Linux AG, Nuernberg, DE SCSI, Security



2002-06-16 21:22:43

by Kurt Garloff

Subject: Re: /proc/scsi/map

Hi Albert,

On Sun, Jun 16, 2002 at 03:24:33PM -0400, Albert D. Cahalan wrote:
> Kurt Garloff writes:
> > If we attach a third high-level device driver, two more columns
> > would show up. (Is this variable column number format a problem?)
>
> The variable column format is of course annoying, but use
> it if you must. The also-annoying alternative is to pick
> a fill character that would be easy for a beginner to
> handle in a script. Maybe one of: @ - . / ?

Yes, as you correctly mention in your other mail, this would make it easier
to add more columns later.
But the problem then would be that we would need to fix (and limit) the
number of high-level devices that may be reported this way, which is not so
nice either. At this moment it's not a problem, of course, AFAIK.

> The header line is far worse. It's too terse to be very helpful.
> It gets in the way of every person writing a parser. Even in
> your example script, you had to hack your way around it:

I would not call it a hack. Ignoring comment lines is one of the basic
things each parser needs to do. Defining a format that does not allow
for comments actually would not be a very clever move.

But for a file exported from kernel, you may have a valid point.

Actually, the exact format of /proc/scsi/map is certainly something
that can be discussed separately from the basic idea of adding a file
that exposes the mapping between a SCSI address (CBTU) and the attached
high-level drivers.
The way I designed it just should make it easy for a shell script to use
it. And keeping it simple certainly is a good thing.

Regards,
--
Kurt Garloff <[email protected]> Eindhoven, NL
GPG key: See mail header, key servers Linux kernel development
SuSE Linux AG, Nuernberg, DE SCSI, Security



2002-06-17 11:33:30

by Kurt Garloff

Subject: Re: /proc/scsi/map

Hi John,

On Sat, Jun 15, 2002 at 10:08:54PM +0800, John Summerfield wrote:
> > Life would be easier if the scsi subsystem would just report which SCSI
> > device (uniquely identified by the controller,bus,target,unit tuple) belongs
> > to which high-level device. The information is available in the kernel.
>
> Does this not fail if I pull a device off, change its ID (perhaps to fit
> into another system), then plug it in again? Or if I move it from one
> adaptor to another?

Sure it does.
The kernel can offer you the knowledge of a hardware path to your device,
which is given by the controller, bus, SCSI target and unit numbers. This is
pretty stable in most configurations.

If you want to have more, you will probably use some sort of signatures.

But that's nothing which happens at SCSI layer.
For plain old SCSI devices, you may e.g. inquire the serial number (INQUIRY,
page code 0x80), which gives you a unique identifier if combined with the
vendor and model strings. scsidev does this for you. But it occasionally
fails, as the open() on a SCSI device may fail and we don't know the relation
between the sg devices (which can be used reliably to collect such
information) and the other high-level devices. /proc/scsi/map solves this.

Regards,
--
Kurt Garloff <[email protected]> Eindhoven, NL
GPG key: See mail header, key servers Linux kernel development
SuSE Linux AG, Nuernberg, DE SCSI, Security



2002-06-17 17:56:26

by Ingo Oeser

Subject: Re: /proc/scsi/map

On Sun, Jun 16, 2002 at 09:41:12PM +0200, Kurt Garloff wrote:
> I don't want to bash devfs and I think it's nice that it's in the kernel,
> so users have the choice to use it and the motivation to improve it.
>
> But the problem that I wanted to address IMHO should also be solved
> for those people that for one or another reason decided not to use
> devfs.

So make it optional too. I don't need this kind of code
duplication, because I use devfs and I consider it bloat.

And if you do this kind of change, why not implement
persistent naming for ALL devices?

Other devices have the very same problem.

Regards

Ingo Oeser
--
Science is what we can tell a computer. Art is everything else. --- D.E.Knuth

2002-06-17 20:36:54

by Patrick Mansfield

Subject: Re: /proc/scsi/map

On Sat, Jun 15, 2002 at 03:36:06PM +0200, Kurt Garloff wrote:
> Life would be easier if the scsi subsystem would just report which SCSI
> device (uniquely identified by the controller,bus,target,unit tuple) belongs
> to which high-level device. The information is available in the kernel.

I prefer we refer to the tuple as host, channel, id, lun (H, C, I, L), so
as to more closely match /proc/scsi/scsi, /proc/scsi/sg, and attached
messages:

[root@elm3a50 root]# cat /proc/scsi/scsi | head -4
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: IBM-PSG Model: ST318203LC !# Rev: B222
Type: Direct-Access ANSI SCSI revision: 02

root # cat /proc/scsi/sg/device_hdr /proc/scsi/sg/devices | head -2
host chan id lun type opens qdepth busy online
0 0 0 0 0 4 253 0 1

Attached messages are of the form:

Attached scsi disk sdn at scsi2, channel 0, id 2, lun 0

> Attached patch does this:
> garloff@pckurt:/raid5/Kernel/src $ cat /proc/scsi/map
> # C,B,T,U Type onl sg_nm sg_dev nm dev(hex)
> 0,0,00,00 0x05 1 sg0 c:15:00 sr0 b:0b:00
> 1,0,01,00 0x05 1 sg1 c:15:01 sr1 b:0b:01
> 1,0,02,00 0x01 1 sg2 c:15:02 osst0 c:ce:00
> 1,0,03,00 0x05 1 sg3 c:15:03 sr2 b:0b:02
> 1,0,05,00 0x00 1 sg4 c:15:04 sda b:08:00
> 1,0,09,00 0x00 1 sg5 c:15:05 sdb b:08:10
> 2,0,01,00 0x05 1 sg6 c:15:06 sr3 b:0b:03
> 2,0,02,00 0x01 1 sg7 c:15:07 osst1 c:ce:01
> 2,0,03,00 0x05 1 sg8 c:15:08 sr4 b:0b:04
> 2,0,05,00 0x00 1 sg9 c:15:09 sdc b:08:20
> 2,0,09,00 0x00 1 sg10 c:15:0a sdd b:08:30
> 3,0,10,00 0x00 1 sg11 c:15:0b sde b:08:40
> 3,0,12,00 0x00 1 sg12 c:15:0c sdf b:08:50

Why not treat each upper layer driver the same? Type is already
in /proc/scsi/scsi, or implied by the upper level drivers attached.
Online should really be part of /proc/scsi/scsi.

Then, each line is a path followed by a list of upper level devices.
This would also simplify the code, although the ordering of the upper
level devices becomes link or module load order dependent.

And similar to sg (someone commented on parsing '^#'), have a _hdr
entry; something like:

$ cat /proc/scsi/map_hdr /proc/scsi/map

H:C:I:L online type:name:block/char:maj:min
00:00:00:00 1 sg:sg0:c:15:00 sr:sr0:b:0b:00
01:00:01:00 1 sg:sg1:c:15:01 sr:sr1:b:0b:01
01:00:02:00 1 sg:sg2:c:15:02 osst:osst0:c:ce:00
02:00:09:00 1 sg:sg3:c:15:03 sd:sdd:b:08:30

Or:

H:C:I:L online type:enumeration:block/char:maj:min
00:00:00:00 1 sg:0:c:15:00 sr:0:b:0b:00
01:00:01:00 1 sg:1:c:15:01 sr:1:b:0b:01
01:00:02:00 1 sg:2:c:15:02 osst:0:c:ce:00
02:00:09:00 1 sg:3:c:15:03 sd:d:b:08:30
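
For illustration, the first of these two layouts (type:name:block/char:maj:min)
could be consumed by a script roughly as sketched below. This assumes the
proposed format exactly as written above, with major and minor in hex and
the header kept in a separate map_hdr file; the function name
parse_hcil_map is made up for this example.

```sh
#!/bin/sh
# Sketch: split each "type:name:b|c:maj:min" entry of the proposed format
# and print one line per upper level device, with major/minor in decimal.
parse_hcil_map() {
    while read -r hcil online entries; do
        [ -n "$hcil" ] || continue          # skip blank lines
        for e in $entries; do
            oldifs=$IFS
            IFS=:; set -- $e                # $1=type $2=name $3=b|c $4=maj $5=min
            IFS=$oldifs
            [ $# -eq 5 ] || continue        # ignore malformed entries
            printf '%s %s %s %s %d %d\n' \
                "$hcil" "$1" "$2" "$3" "$((0x$4))" "$((0x$5))"
        done
    done
}

# Demo with one line from the example above.
parse_hcil_map <<'EOF'
02:00:09:00 1 sg:sg3:c:15:03 sd:sdd:b:08:30
EOF
```

Because every entry names its own driver, the script does not depend on
the order or number of columns.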


> A patch for 2.5 should be done as well, if the design is OK, of course.
>

IMO, we should use driverfs for this in 2.5. Mike Sullivan's scsi driverfs
patch currently ends up with a driverfs layout (showing one Scsi_Device
with two partitions, sg and sd attached) like this:

[root@elm3a50 devices]# tree ./root/pci1/01\:06.0/scsi2/2\:0\:2\:0/
./root/pci1/01:06.0/scsi2/2:0:2:0/
|-- 2:0:2:0:disc
| |-- kdev
| |-- name
| |-- power
| `-- type
|-- 2:0:2:0:gen
| |-- kdev
| |-- name
| |-- power
| `-- type
|-- 2:0:2:0:p1
| |-- kdev
| |-- name
| |-- power
| `-- type
|-- 2:0:2:0:p2
| |-- kdev
| |-- name
| |-- power
| `-- type
|-- name
|-- power
`-- type

Right now, the name is storing an ID; the ID is retrieved in the kernel (using
page 0x80 or page 0x83 or the path). For example disc has:

[root@elm3a50 2:0:2:0:disc]# pwd
/devices/root/pci1/01:06.0/scsi2/2:0:2:0/2:0:2:0:disc
[root@elm3a50 2:0:2:0:disc]# ls
kdev name power type
[root@elm3a50 2:0:2:0:disc]# cat *
8d0
U20000020371719e8disc
0
BLK
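
A userspace script could read such a node roughly as sketched below. This
assumes that the kdev file holds the device number in hex with the major in
the high byte and the minor in the low byte; that reading is an assumption,
although it is consistent with the output above (8d0 gives major 8, minor
208, which is sdn). The function names kdev_to_majmin and describe_node are
made up for this example.

```sh
#!/bin/sh
# Sketch: decode a kdev value, assuming hex with an 8-bit major in the
# high byte and an 8-bit minor in the low byte.
kdev_to_majmin() {
    printf '%d %d\n' "$((0x$1 >> 8))" "$((0x$1 & 0xff))"
}

# Print "<type> major=<n> minor=<n>" for one node directory that contains
# kdev and type files as in the listing above.
describe_node() {
    dir=$1
    [ -r "$dir/kdev" ] && [ -r "$dir/type" ] || return 1
    read -r kdev < "$dir/kdev"
    read -r type < "$dir/type"
    set -- $(kdev_to_majmin "$kdev")
    printf '%s major=%s minor=%s\n' "$type" "$1" "$2"
}

# Demo with the kdev value shown above.
kdev_to_majmin 8d0
```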

-- Patrick Mansfield

2002-06-17 20:57:53

by Kurt Garloff

Subject: Re: /proc/scsi/map

Hi Patrick,

On Mon, Jun 17, 2002 at 01:35:34PM -0700, Patrick Mansfield wrote:
> On Sat, Jun 15, 2002 at 03:36:06PM +0200, Kurt Garloff wrote:
> > Life would be easier if the scsi subsystem would just report which SCSI
> > device (uniquely identified by the controller,bus,target,unit tuple) belongs
> > to which high-level device. The information is available in the kernel.
>
> I prefer we refer to the tuple as host, channel, id, lun (H, C, I, L), so
> as to more closely match /proc/scsi/scsi, /proc/scsi/sg, and attached
> messages:

You are referring to the naming of this 4-tuple, right: HCIL vs. CBTU?
I chose CBTU because that one's used in devfs. Actually, as you can see
from scsidev, I like HCIL more. But that's a detail the kernel should not
care about. The header line should be removed anyway, as Albert remarked.
That also helps those people who think that 200 bytes is unacceptable bloat.

[...]
> > 3,0,12,00 0x00 1 sg12 c:15:0c sdf b:08:50
>
> Why not treat each upper layer driver the same? Type is already
> in /proc/scsi/scsi, or implied by the upper level drivers attached.
> Online should really be part of /proc/scsi/scsi.

I'm not sure I know what you mean. The fact that I decided to put
the sg device name first independently of the (potentially) random
order in which high-level drivers are assigned?

> Then, each line is a path followed by a list of upper level devices.

It is.

> This would also simplify the code, although the ordering of the upper
> level devices becomes link or module load order dependent.

I just decided to report sg first. This has a very practical reason:
If you want to use userspace tools to collect more advanced (and maybe
type-dependent) information, you will always want to use the sg device, which
you can use to send SCSI commands and which you can open even if there is
no medium or if the device is in use.

> And similar to sg (someone commented on parsing '^#'), have a _hdr
> entry; something like:
>
> $ cat /proc/scsi/map_hdr /proc/scsi/map
>
> H:C:I:L online type:name:block/char:maj:min
> 00:00:00:00 1 sg:sg0:c:15:00 sr:sr0:b:0b:00
> 01:00:01:00 1 sg:sg1:c:15:01 sr:sr1:b:0b:01
> 01:00:02:00 1 sg:sg2:c:15:02 osst:osst0:c:ce:00
> 02:00:09:00 1 sg:sg3:c:15:03 sd:sdd:b:08:30

This looks fine to me as well, by the way.
The reason why I chose to additionally report the device type reported by
inquiry is that you will only see the attached (and thus only the loaded)
high-level drivers of a device. With the device type, a userspace tool could
easily decide whether to trigger a modprobe and start again ...

> Or:
>
> H:C:I:L online type:enumeration:block/char:maj:min
> 00:00:00:00 1 sg:0:c:15:00 sr:0:b:0b:00
> 01:00:01:00 1 sg:1:c:15:01 sr:1:b:0b:01
> 01:00:02:00 1 sg:2:c:15:02 osst:0:c:ce:00
> 02:00:09:00 1 sg:3:c:15:03 sd:d:b:08:30
>
>
> > A patch for 2.5 should be done as well, if the design is OK, of course.
>
> IMO, we should use driverfs for this in 2.5. Mike Sullivan's scsi driverfs
> patch currently ends up with a driverfs layout (showing one Scsi_Device
> with two partitions, sg and sd attached) like this:

I still think the easy /proc/scsi/map format would be a nice basis to
inquire more information on the SCSI devices from userspace, even if you add
hierarchical attachment information via driverfs. And I think a solution
that works with both 2.4 and 2.5 would help most users, of course.

Regards,
--
Kurt Garloff <[email protected]> Eindhoven, NL
GPG key: See mail header, key servers Linux kernel development
SuSE Linux AG, Nuernberg, DE SCSI, Security



2002-06-17 21:47:53

by Patrick Mansfield

Subject: Re: /proc/scsi/map

Kurt -

On Mon, Jun 17, 2002 at 10:57:50PM +0200, Kurt Garloff wrote:
> Hi Patrick,
>
> > I prefer we refer to the tuple as host, channel, id, lun (H, C, I, L), so
> > as to more closely match /proc/scsi/scsi, /proc/scsi/sg, and attached
> > messages:
>
> You are referring to the naming of this 4-tuple, right: HCIL vs. CBTU?

Yes.

> I chose CBTU because that one's used in devfs. Actually, as you can see
> from scsidev, I like HCIL more. But that's a detail the kernel should not
> care about. The header line should be removed anyway, as Albert remarked.
> That also helps those people who think that 200 bytes is unacceptable bloat.
>
> [...]
> > > 3,0,12,00 0x00 1 sg12 c:15:0c sdf b:08:50
> >
> > Why not treat each upper layer driver the same? Type is already
> > in /proc/scsi/scsi, or implied by the upper level drivers attached.
> > Online should really be part of /proc/scsi/scsi.
>
> I'm not sure I know what you mean. The fact that I decided to put
> the sg device name first independently of the (potentially) random
> order in which high-level drivers are assigned?

Yes, I don't know why I took that to mean that sg was displayed differently.

> I just decided to report sg first. This has a very practical reason:
> If you want to use userspace tools to collect more advanced (and maybe
> type-dependent) information, you will always want to use the sg device, which
> you can use to send SCSI commands and which you can open even if there is
> no medium or if the device is in use.

No matter the column position sg can be found if each column includes
the upper level name (sg, sd etc.). Then you do not need to know or
check the scsi_type of the template, or explicitly locate the sg
template in scsi_proc_map().

And then without the header scsi_proc_map() gets really simple.
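
As a sketch of the userspace side of this, assuming the
type:name:block/char:maj:min entry format proposed earlier in this thread,
a script can pick out the sg entry by its type prefix wherever it appears
on the line. The function name find_by_type is made up for this example.

```sh
#!/bin/sh
# Sketch: print "<H:C:I:L> <name>" for the entry of the requested driver
# type, independent of the column in which that entry appears.
find_by_type() {
    want=$1
    while read -r hcil online entries; do
        for e in $entries; do
            case $e in
                "$want":*)
                    rest=${e#*:}                    # drop the "type:" part
                    printf '%s %s\n' "$hcil" "${rest%%:*}"
                    ;;
            esac
        done
    done
}

# Demo: sg is listed last on the first line and first on the second.
find_by_type sg <<'EOF'
02:00:09:00 1 sd:sdd:b:08:30 sg:sg3:c:15:03
01:00:02:00 1 sg:sg2:c:15:02 osst:osst0:c:ce:00
EOF
```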

> Regards,
> --
> Kurt Garloff <[email protected]> Eindhoven, NL
> GPG key: See mail header, key servers Linux kernel development
> SuSE Linux AG, Nuernberg, DE SCSI, Security

-- Patrick Mansfield

2002-06-17 22:08:24

by Doug Ledford

Subject: Re: /proc/scsi/map

On Sat, Jun 15, 2002 at 03:36:06PM +0200, Kurt Garloff wrote:
> Hi SCSI users,
>
> from people using SCSI devices, there's one question that turns up again
> and again: How can one assign stable device names to SCSI devices in
> case there are devices that may or may not be switched on or connected.


> Life would be easier if the scsi subsystem would just report which SCSI
> device (uniquely identified by the controller,bus,target,unit tuple) belongs
> to which high-level device. The information is available in the kernel.

Umm, this patently fails to meet the criteria you posted of "stable device
name". Adding a controller to a system is just as likely to blow this
naming scheme to hell as it is to blow the traditional linux /dev/sd?
scheme. IOW, even though the /proc/scsi/map file looks nice and useful,
it fails to solve the very problem you are trying to solve.


--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2002-06-17 23:06:55

by Kurt Garloff

Subject: Re: /proc/scsi/map

Hi Doug,

On Mon, Jun 17, 2002 at 06:08:18PM -0400, Doug Ledford wrote:
> On Sat, Jun 15, 2002 at 03:36:06PM +0200, Kurt Garloff wrote:
> > Life would be easier if the scsi subsystem would just report which SCSI
> > device (uniquely identified by the controller,bus,target,unit tuple) belongs
> > to which high-level device. The information is available in the kernel.
>
> Umm, this patently fails to meet the criteria you posted of "stable device
> name". Adding a controller to a system is just as likely to blow this
> naming scheme to hell as it is to blow the traditional linux /dev/sd?
> scheme. IOW, even though the /proc/scsi/map file looks nice and useful,
> it fails to solve the very problem you are trying to solve.

In case you just add controllers, you just need to make sure you get them the
same numbers again. A solution for this exists already:
* For a kernel where SCSI low-level drivers are loaded as modules,
you just need to keep the order constant
* For compiled in SCSI drivers, use scsihosts=


But actually, the patch is not meant to be the holy grail of persistent
device naming. But it enables userspace tools to collect information
* reliably
(fails so far due to possible open() failures with unknown
relation to the corresponding sg device (which could be opened))
* without too much trouble

Both things I consider important and useful.

The patch basically does provide two pieces of information:
* mapping between sg vs. other high level devices
* mapping CBTU to high-level devices
The latter one is enough for many setups, and the former can be used for
more elaborate solutions involving userspace tools more advanced than the
simple script I included in the patch.

If you want to go for the holy grail, you may either come up with a
unique address at hardware level (which currently does not exist for all
types dealt with by the SCSI subsystem) and make it available to the SCSI mid
level, or use signatures that allow you to find devices again. LVM, e.g.,
does the latter.
But at this moment, I fear, neither of them is possible in all cases.

Regards,
--
Kurt Garloff <[email protected]> [Eindhoven, NL]
Physics: Plasma simulations <[email protected]> [TU Eindhoven, NL]
Linux: SCSI, Security <[email protected]> [SuSE Nuernberg, DE]
(See mail header or public key servers for PGP2 and GPG public keys.)



2002-06-18 02:40:48

by Doug Ledford

Subject: Re: /proc/scsi/map

On Tue, Jun 18, 2002 at 01:06:48AM +0200, Kurt Garloff wrote:
> Hi Doug,
>
> On Mon, Jun 17, 2002 at 06:08:18PM -0400, Doug Ledford wrote:
> > On Sat, Jun 15, 2002 at 03:36:06PM +0200, Kurt Garloff wrote:
> > > Life would be easier if the scsi subsystem would just report which SCSI
> > > device (uniquely identified by the controller,bus,target,unit tuple) belongs
> > > to which high-level device. The information is available in the kernel.
> >
> > Umm, this patently fails to meet the criteria you posted of "stable device
> > name". Adding a controller to a system is just as likely to blow this
> > naming scheme to hell as it is to blow the traditional linux /dev/sd?
> > scheme. IOW, even though the /proc/scsi/map file looks nice and useful,
> > it fails to solve the very problem you are trying to solve.
>
> In case you just add controllers, you just need to make sure you get them the
> same numbers again. A solution for this exists already:
> * For a kernel where SCSI low-level drivers are loaded as modules,
> you just need to keep the order constant
> * For compiled in SCSI drivers, use scsihosts=

No, this is not true. If you add a new controller (for some new disks in
a new external enclosure or whatever), and that controller uses the same
driver as other controller(s) in your system, then you have no guarantee
of order. For example, adding a 4th aic7xxx controller to your system
might or might not place the new controller at the end of the list
depending on PCI scan order, etc. There simply is *no* guarantee here of
any consistent naming, so don't bother trying to claim there is.

Now don't get me wrong, I'm not saying the patch isn't useful, but the
patch doesn't provide *any* guarantee of consistent device naming and so
using that as a reason to put the patch into the mainstream kernel is
utter crap. Go ahead and make your case for why it should be in the
kernel, but don't use reasons that aren't correct.

> But actually, the patch is not meant to be the holy grail of persistent
> device naming. But it enables userspace tools to collect information
> * reliably
> (fails so far due to possible open() failures with unknown
> relation to the corresponding sg device (which could be opened))

This can be done without your patch (the mapping from /dev/sg to /dev/sd?
or /dev/st? or /dev/scd? or whatever is not impossible from user space
without your patch, it just requires a user space tool to open the files
and start comparing host/bus/id/lun combinations from dev file to dev
file).

> * without too much trouble

This part is true enough, it is easier to read the map file than to
program the information retrieval I mentioned above.

> Both things I consider important and useful.
>
> The patch basically does provide two pieces of information:
> * mapping between sg vs. other high level devices

This I think is useful.

> * mapping CBTU to high-level devices

This I don't think is useful at all. It's no more reliable than our
current system, and people who are depending on this to solve their "I
can't tell what device is what" dilemma are going to be sorely upset when
they realize that hardware changes can change this stuff around just as
fast as it changes around the /dev/sd? mappings.

> The latter one is enough for many setups,

The latter one is just as broken in design as the original /dev/sd?
enumeration problem (which stands to reason since this method also is an
enumeration method, it's just that instead of enumerating the disks
starting at 0, we are enumerating the SCSI controllers starting at the
first one we find and going from there).

> and the former can be used for
> more elaborate solutions involving userspace tools more advanced than the
> simple script I included in the patch.

Which is the much better way to go.

> If you want to go for the holy grail, you may either come up with a
> unique address at hardware level (which does currently not exist for all
> types dealt with by the SCSI subsystem) and make it available to SCSI mid
> level or use signatures that allows you to find devices back. LVM, e.g.
> does the latter.
> But at this moment, I fear, neither of them are possible in all cases.
>
> Regards,
> --
> Kurt Garloff <[email protected]> [Eindhoven, NL]
> Physics: Plasma simulations <[email protected]> [TU Eindhoven, NL]
> Linux: SCSI, Security <[email protected]> [SuSE Nuernberg, DE]
> (See mail header or public key servers for PGP2 and GPG public keys.)



--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2002-06-18 03:24:47

by Austin Gonyou

Subject: [Possibly OT] Re: /proc/scsi/map


On Mon, 2002-06-17 at 21:40, Doug Ledford wrote:
> On Tue, Jun 18, 2002 at 01:06:48AM +0200, Kurt Garloff wrote:
> > Hi Doug,
> >
> ....
> > In case you just add controllers, you just need to make sure you get them the
> > same numbers again. A solution for this exists already:
> > * For a kernel where SCSI low-level drivers are loaded as modules,
> > you just need to keep the order constant
> > * For compiled in SCSI drivers, use scsihosts=
>
....
> No, this is not true. If you add a new controller (for some new disks in
> a new external enclosure or whatever), and that controller uses the same
> driver as other controller(s) in your system, then you have no guarantee
> of order. For example, adding a 4th aic7xxx controller to your system
> might or might not place the new controller at the end of the list
> depending on PCI scan order, etc. There simply is *no* guarantee here of
> any consistent naming, so don't bother trying to claim there is.
>

Taking a bit of an example from Veritas, would it be at all feasible to
use n+ blocks at the end of the disk or partition (or maybe the beginning)
to write a specific identifier that is unique to a specific
controller, or to make note of the drive serial number and store that on
the disk somewhere in some agreed-upon, understood way?

Much like the private region on a veritas disk or volume. With the extra
accounting, which should only be needed during boot, or during
disk/volume manipulation, one could conceivably always have a sane
device map, all the time.

As to the rest of the comments lower down on the original mail, I'd say
that this is *a lot* of trouble, versus the opposite, but if implemented
properly would be highly useful. Using LVM and the like, which does
something like this, seems to be fine for most people (even when moving
disks around, etc), but this ability, without the overhead of LVM in the
mix would seem a good idea for some.

Just my $.02
TIA
--
Austin Gonyou <[email protected]>

2002-06-18 05:18:50

by Doug Ledford

Subject: Re: [Possibly OT] Re: /proc/scsi/map

On Mon, Jun 17, 2002 at 10:24:40PM -0500, Austin Gonyou wrote:
> Taking a bit of an example from Veritas, would it be at all feasible to
> use n+ blocks at the end of the disk or partition (or maybe the beginning)
> to write a specific identifier that is unique to a specific
> controller, or to make note of the drive serial number and store that on
> the disk somewhere in some agreed-upon, understood way?

Both LVM and the md code already do this. Ext2 and ext3 also have volume
labels that can be used for this purpose. As much as I hate to admit it,
this is the one area where I think Microsoft did the right thing and
snagged an unused byte in the partition table to mark the disk's ordering
(although we would need more than one byte). By putting it in the
partition table, it would only need to be dealt with by one area of code
(the partition reading code), would work for all filesystems, would work
for all LVM and md types of code, and would be universal on linux systems
and provide consistent, persistent device naming. Of course, if a disk
dies and you put a new one in, then you have to give the new disk the
old disk's name when you partition it, but you would have to do that or
something similar to that with all such possible solutions.

The simple fact of the matter is that to provide truly consistent,
persistent device naming requires that the naming be "end-to-end". You
can not rely on *any* ordering issues (such as controllers, PCI busses,
devices, etc), you have to read the name from the device itself and the
name has to be completely independent of the device's current location on
whatever bus it uses.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2002-06-18 09:03:46

by Kurt Garloff

Subject: Re: /proc/scsi/map

Hi Doug,

On Mon, Jun 17, 2002 at 10:40:47PM -0400, Doug Ledford wrote:
> On Tue, Jun 18, 2002 at 01:06:48AM +0200, Kurt Garloff wrote:
> > * For compiled in SCSI drivers, use scsihosts=
>
> No, this is not true. If you add a new controller (for some new disks in
> a new external enclosure or whatever), and that controller uses the same
> driver as other controller(s) in your system, then you have no guarantee
> of order. For example, adding a 4th aic7xxx controller to your system
> might or might not place the new controller at the end of the list
> depending on PCI scan order, etc. There simply is *no* guarantee here of
> any consistent naming, so don't bother trying to claim there is.

You're right; there is no guarantee.

In scsidev, the IO port is added as identifier to the host number, but even
that may change in case your Plug'n'Pray BIOS assigns a different IO port
upon PCI configuration ...

> > But actually, the patch is not meant to be the holy grail of persistent
> > device naming. But it enables userspace tools to collect information
> > * reliably
> > (fails so far due to possible open() failures with unknown
> > relation to the corresponding sg device (which could be opened))
>
> This can be done without your patch (the mapping from /dev/sg to /dev/sd?
> or /dev/st? or /dev/scd? or whatever is not impossible from user space
> without your patch, it just requires a user space tool to open the files
> and start comparing host/bus/id/lun combinations from dev file to dev
> file).

No, it unfortunately can't. The reason is you just can't reliably open a
device to do the SCSI_IOCTL_GET_IDLUN to find out about CBTU.
Devices with removable media are a good example. The open on a tape
drive with no tape inserted just fails. Or the device may be in use.

Or the open may cause very nasty side effects, like stalling for a minute
because the tape driver rewinds the drive and looks for meta-information (osst).

This problem is really the reason why scsidev is not foolproof.
sg_map does not do magic either.

root@pckurt:~ $ sg_map
sg_map: close error: Input/output error
/dev/sg0 /dev/scd0
/dev/sg1 /dev/sda
/dev/sg2 /dev/sdb
/dev/sg3 /dev/sdc
/dev/sg4 /dev/sdd
/dev/sg5 /dev/sde
/dev/sg6 /dev/sdf
/dev/sg7 /dev/scd1
/dev/sg8 /dev/osst0
/dev/sg9 /dev/scd2
/dev/sg10 /dev/scd3
/dev/sg11
/dev/sg12 /dev/scd4

The /proc/scsi/map does provide the missing piece of information.

root@pckurt:~ $ cat /proc/scsi/map
# C,B,T,U Type onl sg_nm sg_dev nm dev(hex)
0,0,00,00 0x05 1 sg0 c:15:00 sr0 b:0b:00
1,0,09,00 0x00 1 sg2 c:15:02 sdb b:08:10
1,0,05,00 0x00 1 sg1 c:15:01 sda b:08:00
1,0,01,00 0x05 1 sg7 c:15:07 sr1 b:0b:01
1,0,02,00 0x01 1 sg8 c:15:08 osst0 c:ce:00
1,0,03,00 0x05 1 sg9 c:15:09 sr2 b:0b:02
2,0,05,00 0x00 1 sg3 c:15:03 sdc b:08:20
2,0,09,00 0x00 1 sg4 c:15:04 sdd b:08:30
2,0,01,00 0x05 1 sg10 c:15:0a sr3 b:0b:03
2,0,02,00 0x01 1 sg11 c:15:0b osst1 c:ce:01
2,0,03,00 0x05 1 sg12 c:15:0c sr4 b:0b:04
3,0,10,00 0x00 1 sg5 c:15:05 sde b:08:40
3,0,12,00 0x00 1 sg6 c:15:06 sdf b:08:50
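
For illustration, a minimal Python sketch of how a userspace tool might read
such a table. It assumes the seven-column layout shown above, decimal C,B,T,U
fields, and hexadecimal major/minor numbers in the two device columns; the
names MapEntry and parse_scsi_map are made up for this example and are not
part of the patch.

```python
from typing import List, NamedTuple, Tuple

Dev = Tuple[str, int, int]  # ('b' or 'c', major, minor)


class MapEntry(NamedTuple):
    cbtu: Tuple[int, ...]   # controller, bus, target, unit
    scsi_type: int          # SCSI peripheral device type, e.g. 0x05
    online: bool
    sg_name: str
    sg_dev: Dev
    name: str               # high-level device name, e.g. "sda"
    dev: Dev


def _parse_dev(field: str) -> Dev:
    # "b:08:10" -> ('b', 8, 16); major and minor are assumed to be hex.
    kind, major, minor = field.split(":")
    return (kind, int(major, 16), int(minor, 16))


def parse_scsi_map(text: str) -> List[MapEntry]:
    entries = []
    for line in text.splitlines():
        cols = line.split()
        # Skip the header, blank lines, and anything that does not match
        # the assumed seven-column layout.
        if len(cols) != 7 or cols[0].startswith("#"):
            continue
        entries.append(MapEntry(
            cbtu=tuple(int(x) for x in cols[0].split(",")),
            scsi_type=int(cols[1], 16),
            online=cols[2] == "1",
            sg_name=cols[3],
            sg_dev=_parse_dev(cols[4]),
            name=cols[5],
            dev=_parse_dev(cols[6]),
        ))
    return entries
```

Fed the table above, the entry for 2,0,02,00 pairs sg11 with osst1, which is
the one mapping that sg_map left blank.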

> > * without too much trouble
>
> This part is true enough, it is easier to read the map file than to
> program the information retrieval I mentioned above.
>
> > Both things I consider important and useful.
> >
> > The patch basically does provide two pieces of information:
> > * mapping between sg vs. other high level devices
>
> This I think is useful.
>
> > * mapping CBTU to high-level devices
>
> This I don't think is useful at all. It's no more reliable than our
> current system, and people who are depending on this to solve their "I
> can't tell what device is what" dilemma are going to be sorely upset when
> they realize that hardware changes can change this stuff around just as
> fast as it changes around the /dev/sd? mappings.

You install new host adapters or re-jumper your drives' IDs less often
than you switch an external device (e.g. a ZIP drive or scanner) on or off.

And if you take a disk from one controller to another one, you arguably even
want the address to change as your path to the device changed. At least you
can expect it.

CBTU still gives you reasonable information about the path to a device.

The "C" identification may need improvement ... and I'm certainly open
for suggestions.

And I personally do think that at SCSI midlayer in kernel, you want to
report about this path to a device.

> > The latter one is enough for many setups,
>
> The latter one is just as broken in design as the original /dev/sd?
> enumeration problem (which stands to reason since this method also is an
> enumeration method, it's just that instead of enumerating the disks
> starting at 0, we are enumerating the SCSI controllers starting at the
> first one we find and going from there).

I don't think the original /dev/sd? design is broken. It's a choice that
makes some sense if you think about the fact that you only have a limited
number of majors reserved for SCSI disks.
But it has some disadvantages ...

> > and the former can be used for
> > more elaborate solutions involving userspace tools more advanced than the
> > simple script I included in the patch.
>
> Which is the much better way to go.

Expect a scsidev release (2.25) that does use /proc/scsi/map (if present) to
present a somewhat better mapping than the little script included.
It optionally assigns aliases based on the serial number
(INQUIRY, EVPD=1, page code 0x80), which is rather stable, provided your
devices report this piece of information. It is not unique, though, if there
are multiple paths to your devices (like having more than one SCSI host
adapter on a bus, which I BTW do for testing purposes).

Regards,
--
Kurt Garloff <[email protected]> Eindhoven, NL
GPG key: See mail header, key servers Linux kernel development
SuSE Linux AG, Nuernberg, DE SCSI, Security



2002-06-15 14:11:02

by John Summerfield

Subject: Re: /proc/scsi/map


>
> Life would be easier if the scsi subsystem would just report which SCSI
> device (uniquely identified by the controller,bus,target,unit tuple) belongs
> to which high-level device. The information is available in the kernel.


Does this not fail if I pull a device off, change its ID (perhaps to fit into another system), then plug it in again? Or if I move it from one adaptor to another?





--
Cheers
John Summerfield

Microsoft's most solid OS: http://www.geocities.com/rcwoolley/

Note: mail delivered to me is deemed to be intended for me, for my disposition.

==============================
If you don't like being told you're wrong,
be right!



2002-06-15 15:52:22

by Richard Gooch

Subject: Re: /proc/scsi/map

Kurt Garloff writes:
> Hi SCSI users,
>
> from people using SCSI devices, there's one question that turns up again
> and again: How can one assign stable device names to SCSI devices in
> case there are devices that may or may not be switched on or connected.
>
> There are a couple of ways to address this problem:
[...]
> (c) devfs
[...]
> Unfortunately, those approaches all have some deficiencies.
[...]
> Ad (c): devfs is currently not (yet?) an option for distributions
> due to security and stability considerations.

Mandrake is using devfs. And the security and stability issues were
fixed many months ago. The "devfs races" that Al used to complain
about regularly have been fixed. I haven't heard from Al for many
months (I see that as a positive sign:-). The current devfs code is in
maintenance mode. The next release of code will be a new devfs core
which uses the VFS for tree maintenance, making the code much smaller.
I.e. not a bugfixing release.

If there *are* remaining bugs with the devfs core (or devfsd for that
matter), I've not been made aware of them. If you know something I
don't, please let me know. AFAICT, all the bugs are long since solved.

Regards,

Richard....
Permanent: [email protected]
Current: [email protected]

2002-06-15 16:04:19

by Andries E. Brouwer

Subject: Re: /proc/scsi/map

> How can one assign stable device names to SCSI devices in
> case there are devices that may or may not be switched on or connected.

An interesting unsolved problem.
[Your discussion confuses a few things, especially in the context
of removable devices: a uuid lives on the disk, the C,B,T,U tends
to identify the drive rather than the disk.]

> Life would be easier if the scsi subsystem would just report which
> SCSI device (uniquely identified by the controller,bus,target,unit tuple)
> belongs to which high-level device.

Yes. I took your patch, ported it to 2.5, and tried it out.

# cat /proc/scsi/map
# C,B,T,U Type onl sg_nm sg_dev nm dev(hex)
1,0,06,00 0x00 1 sg0 c:15:00 sda b:08:00
2,0,00,00 0x00 1 sg1 c:15:01 sdb b:08:10
2,0,00,01 0x00 1 sg2 c:15:02 sdc b:08:20
3,0,00,00 0x00 1 sg3 c:15:03 sdd b:08:30
3,0,00,01 0x00 1 sg4 c:15:04 sde b:08:40

Very good - in combination with /proc/scsi/scsi this gives
good information. I like it.

But just "cat /proc/scsi/map" is not good enough.
From the above output alone one cannot easily guess which is which.
One would need a small utility that reads /proc/scsi/map and
/proc/scsi/scsi and produces something readable.
Will add sth to util-linux in case this gets accepted.
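
A rough Python sketch of what such a utility might do: join the two files on
the (host, channel, id, lun) tuple and print one readable line per device. It
assumes the usual "Host: scsiN Channel: .. Id: .. Lun: .." and "Vendor: ..
Model: .. Rev: .." lines in /proc/scsi/scsi, and that the C column of
/proc/scsi/map is the same host number. All function names here are
hypothetical.

```python
import re

HOST_RE = re.compile(
    r"Host:\s*scsi(\d+)\s+Channel:\s*(\d+)\s+Id:\s*(\d+)\s+Lun:\s*(\d+)")
INQ_RE = re.compile(r"Vendor:\s*(.*?)\s+Model:\s*(.*?)\s+Rev:\s*(.*)")


def parse_proc_scsi(text):
    """Map (host, channel, id, lun) -> 'vendor model' from /proc/scsi/scsi."""
    devices = {}
    current = None
    for line in text.splitlines():
        m = HOST_RE.search(line)
        if m:
            current = tuple(int(g) for g in m.groups())
            continue
        m = INQ_RE.search(line)
        if m and current is not None:
            vendor, model, _rev = (g.strip() for g in m.groups())
            devices[current] = f"{vendor} {model}".strip()
            current = None
    return devices


def parse_map_names(text):
    """Map (C, B, T, U) -> high-level device name from /proc/scsi/map."""
    names = {}
    for line in text.splitlines():
        cols = line.split()
        if len(cols) != 7 or cols[0].startswith("#"):
            continue  # header, blank line, or unexpected layout
        names[tuple(int(x) for x in cols[0].split(","))] = cols[5]
    return names


def describe(map_text, scsi_text):
    """Return one 'name: vendor model at C,B,T,U' line per mapped device."""
    inquiry = parse_proc_scsi(scsi_text)
    names = parse_map_names(map_text)
    return [
        f"{name}: {inquiry.get(cbtu, 'unknown')} at {','.join(map(str, cbtu))}"
        for cbtu, name in sorted(names.items())
    ]
```

Reading the two files one after the other is not atomic, so a hotplug event in
between can leave a device in one file but not the other; the sketch reports
such a device as "unknown" rather than failing.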

Andries


2002-06-15 19:50:12

by Sancho Dauskardt

Subject: Re: /proc/scsi/map


>Life would be easier if the scsi subsystem would just report which SCSI
>device (uniquely identified by the controller,bus,target,unit tuple) belongs
>to which high-level device. The information is available in the kernel.
>
>Attached patch does this:
>garloff@pckurt:/raid5/Kernel/src $ cat /proc/scsi/map
># C,B,T,U Type onl sg_nm sg_dev nm dev(hex)
>0,0,00,00 0x05 1 sg0 c:15:00 sr0 b:0b:00
[...]

Great, this was really missing badly.

But how about adding another column: GUID.
Most usb-storage and (all?) FireWire devices have such a unique identity.
In contrast to native SCSI devices, these emulated SCSI devices on
hot-plugging busses will change their LUNs/IDs. Therefore the GUID is really
a must to be able to create stable names (laptop suspend, etc.).

Both usb-storage and ieee1394-sbp2 know the GUID. It only needs to be
communicated.

I'd guess that FibreChannel has similar problems ?

- sda

2002-06-15 21:10:07

by Douglas Gilbert

Subject: Re: /proc/scsi/map

[email protected] wrote:

> > How can one assign stable device names to SCSI devices in
> > case there are devices that may or may not be switched on or connected.
>
> An interesting unsolved problem.
> [Your discussion confuses a few things, especially in the context
> of removable devices: a uuid lives on the disk, the C,B,T,U tends
> to identify the drive rather than the disk.]

In the lsml "[RFC] Persistent naming of scsi devices" thread
Martin K. Petersen <[email protected]> summarized scsi addressing issues
thus:
#
# What we want is (at least) three ways of addressing a device:
#
# 1. By content. This is the persistent naming. Think
# filesystem/MD/LVM UUID. This is what you put in /etc/fstab and
# what metadisk systems use to assemble logical volumes.
#
# Content referencing is used for accessing data.
#
# 2. By physical path. This naming is not persistent. Not even runtime
# because hotplug, iSCSI and whatnot may mess things up.
#
# Path naming is for discovery and recovery. When you add an
# unlabeled disk you want to reference it by path to give it a name.
# When you have a failed disk on your system you want to know which
# physical device to pull from the array.
#
# 3. By enumeration. This is what the kernel happens to be using to
# reference the device. diskN. Certainly not persistent.
#
# Enumeration is for the kernel.

The /proc/scsi/map facility proposed by Kurt does a very
good job at tying together points 2) and 3). As a bonus
it also shows the sg device to primary device mapping
(which is a major headache for me as the sg maintainer).

As for the physical path getting fooled by a re-plug, I would
like to present the idea of a "device attach event number"
which would be uniquely issued (ascending order) to each newly
attached scsi device [an attach timestamp could be useful as well].
This is easy to implement, has a many-to-one
relationship with the physical path (i.e. C,B,T,U) but, at
any instant, has a one-to-one relationship. So the attach
event number can be used as an alias for C,B,T,U** and makes
no pretensions to be persistent from one kernel lifetime
to the next (on the same machine). The scsimon driver supports
a device "attach event number". If the idea is workable the
number should probably be put in the scsi mid-level (i.e.
Scsi_Device structure).
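
To make the idea concrete, here is a small userspace model in Python (not
kernel code, and not how scsimon implements it): every attach gets the next
number from an ascending counter, so a re-plugged device at the same C,B,T,U
is distinguishable from its predecessor. The class and method names are
invented for this sketch.

```python
import itertools


class AttachRegistry:
    """Toy model of 'device attach event numbers' keyed by C,B,T,U."""

    def __init__(self):
        self._next = itertools.count(1)   # ascending, never reused
        self._by_path = {}                # (C,B,T,U) -> current event number
        self._by_event = {}               # event number -> (C,B,T,U)

    def attach(self, cbtu):
        if cbtu in self._by_path:
            raise ValueError("path already occupied: %r" % (cbtu,))
        event = next(self._next)
        self._by_path[cbtu] = event
        self._by_event[event] = cbtu
        return event

    def detach(self, cbtu):
        event = self._by_path.pop(cbtu)
        del self._by_event[event]

    def event_of(self, cbtu):
        return self._by_path.get(cbtu)

    def path_of(self, event):
        # None means the event number refers to a device that is gone.
        return self._by_event.get(event)
```

Over time several event numbers map to one path (many-to-one), but at any
instant each occupied path has exactly one live event number.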

** The C,B,T,U tuple (or nexus) is already a handful and
will probably get uglier with 8-byte luns and iSCSI extensions.
The SCSI_IOCTL_GET_IDLUN tries to pack C,B,T,U into one integer
and obviously won't scale to an 8 byte lun. A new ioctl such as
SCSI_IOCTL_GET_PHYSICAL_PATH (for example) could fix "GET_IDLUN"'s
shortcomings and yield the device attach event number.
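
The scaling problem can be seen in a short Python sketch of the packing. It
assumes the layout used by the Linux SCSI ioctl code as far as I know it:
host in bits 24-31, channel in bits 16-23, lun in bits 8-15 and target id in
bits 0-7, each masked to 8 bits. Please check this against your kernel source
before relying on it; the function names are made up.

```python
def pack_idlun(host, channel, target, lun):
    """Pack C,B,T,U into one 32-bit word (assumed GET_IDLUN layout)."""
    return (((host & 0xFF) << 24)
            | ((channel & 0xFF) << 16)
            | ((lun & 0xFF) << 8)
            | (target & 0xFF))


def unpack_idlun(dev_id):
    """Return (host, channel, target, lun) from a packed 32-bit word."""
    return (
        (dev_id >> 24) & 0xFF,
        (dev_id >> 16) & 0xFF,
        dev_id & 0xFF,
        (dev_id >> 8) & 0xFF,
    )
```

Each field gets only 8 bits, so any value above 255 is silently truncated: a
lun of 0x100 packs to the same word as lun 0.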


Mike Sullivan (who started the above-mentioned "[RFC] Persistent
naming of scsi devices" thread) recently presented patches on
lsml for a devnaming (and scsiname) utility. These patches
address Martin's point 1) [content addressing].


There remains the issue of removable media (i.e. device stays
put but the media is changed). In the absence of asynchronous
event notification, the next (normal) scsi command receives a
"unit attention" to alert the kernel. I'm not sure if Mike's
patch addresses this issue.

> > Life would be easier if the scsi subsystem would just report which
> > SCSI device (uniquely identified by the controller,bus,target,unit tuple)
> > belongs to which high-level device.
>
> Yes. I took your patch, ported it to 2.5, and tried it out.
>
> # cat /proc/scsi/map
> # C,B,T,U Type onl sg_nm sg_dev nm dev(hex)
> 1,0,06,00 0x00 1 sg0 c:15:00 sda b:08:00
> 2,0,00,00 0x00 1 sg1 c:15:01 sdb b:08:10
> 2,0,00,01 0x00 1 sg2 c:15:02 sdc b:08:20
> 3,0,00,00 0x00 1 sg3 c:15:03 sdd b:08:30
> 3,0,00,01 0x00 1 sg4 c:15:04 sde b:08:40
>
> Very good - in combination with /proc/scsi/scsi this gives
> good information. I like it.

Adding INQUIRY strings would make it harder to parse and more
cluttered for humans to read.

> But just "cat /proc/scsi/map" is not good enough.
> From the above output alone one cannot easily guess which is which.
> One would need a small utility that reads /proc/scsi/map and
> /proc/scsi/scsi and produces something readable.

Note the possible race condition: reading /proc/scsi/scsi
then /proc/scsi/map (or vice versa) when a hotplug is
occurring...

BTW In lk 2.5 we store the whole INQUIRY response in the
Scsi_Device structure. Currently we have no ioctl to yield
this information to the user space :-(

> Will add sth to util-linux in case this gets accepted.

IMO as soon as lk 2.4.19 comes out, this patch should
be presented to Marcelo (for inclusion in lk 2.4.20). The
sooner we get it into the lk 2.5 series the better. Could you
forward your lk 2.5 version of Kurt's patch to Linus (and the
linux-scsi list)?

Doug Gilbert

2002-06-15 21:54:23

by Andries E. Brouwer

Subject: Re: /proc/scsi/map

> >Life would be easier if the scsi subsystem would just report which SCSI
> >device (uniquely identified by the controller,bus,target,unit tuple) belongs
> >to which high-level device. The information is available in the kernel.
> >
> >Attached patch does this:
> >garloff@pckurt:/raid5/Kernel/src $ cat /proc/scsi/map
> ># C,B,T,U Type onl sg_nm sg_dev nm dev(hex)
> >0,0,00,00 0x05 1 sg0 c:15:00 sr0 b:0b:00
> [...]
>
> Great, this was really missing badly.
>
> But how about adding another column: GUID.
> Most usb-storage and (all?) FireWire devices have such a unique identity.
> In contrast to native SCSI devices, these emulated SCSI devices on
> hot-plugging busses will change their LUNs/IDs. Therefore the GUID is
> really a must to be able to create stable names (laptop suspend, etc.).
>
> Both usb-storage and ieee1394-sbp2 know the GUID. It only needs to be
> communicated.

The usb-storage GUID is just one random item of information.
One might wish for much more.

And: this information is already somewhere:

% cat /proc/scsi/sg/host_strs
SCSI host adapter emulation for IDE ATAPI devices
Iomega VPI2 (imm) interface
SCSI emulation for USB Mass Storage devices
SCSI emulation for USB Mass Storage devices
%

This tells me that host 0 will be in ide-scsi, host 1 in imm,
host 2 in usb-storage-0, host 3 in usb-storage-1.
And

% cat /proc/scsi/ide-scsi/0
SCSI host adapter emulation for IDE ATAPI devices
% cat /proc/scsi/imm/1
Version : 2.05 (for Linux 2.4.0)
Parport : parport0
Mode : SPP
% cat /proc/scsi/usb-storage-0/2
Host scsi2: usb-storage
Vendor: DataFab Systems Inc.
Product: USB CF+SM
Serial Number: 5DC69477C6
Protocol: Transparent SCSI
Transport: Datafab Bulk-Only
GUID: 07c4a1090000005dc69477c6
Attached: Yes
% cat /proc/scsi/usb-storage-1/3
Host scsi3: usb-storage
Vendor: SCM Microsystems Inc.
Product: eUSB SmartMedia / CompactFlash
Serial Number: None
Protocol: Transparent SCSI
Transport: Control/Bulk-EUSB/SDDR09
GUID: 04e600050000000000000000
Attached: Yes
%

A small utility that looks around in /proc is able to
find the GUID. Of course it would be better when fewer
heuristics were required.
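
Such a utility could look roughly like the following Python sketch. It relies
only on what the listings above show: per-host files named
/proc/scsi/usb-storage-N/<host number> containing "key: value" lines, one of
which is GUID. The function names and the proc_root parameter are made up for
this example.

```python
import glob
import os


def parse_fields(text):
    """Turn 'key: value' lines into a dict; lines without ':' are ignored."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def usb_storage_guids(proc_root="/proc/scsi"):
    """Return {scsi host number: GUID} for all usb-storage hosts found."""
    guids = {}
    for path in glob.glob(os.path.join(proc_root, "usb-storage-*", "*")):
        try:
            host = int(os.path.basename(path))
        except ValueError:
            continue  # not a per-host file
        try:
            with open(path) as f:
                fields = parse_fields(f.read())
        except OSError:
            continue  # device vanished between listing and reading
        if "GUID" in fields:
            guids[host] = fields["GUID"]
    return guids
```

The result is keyed by host number, so it can be matched against the first
column of /proc/scsi/map, but it says nothing about individual LUNs.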

Finally, the GUIDs you see here do not determine the LUN.
So, there is no well-defined line in /proc/scsi/map
where they would belong.

Andries

2002-06-15 22:29:12

by Douglas Gilbert

Subject: Re: /proc/scsi/map

[email protected] wrote:
>
> >Life would be easier if the scsi subsystem would just report which SCSI
> >device (uniquely identified by the controller,bus,target,unit tuple) belongs
> >to which high-level device. The information is available in the kernel.
> >
> >Attached patch does this:
> >garloff@pckurt:/raid5/Kernel/src $ cat /proc/scsi/map
> ># C,B,T,U Type onl sg_nm sg_dev nm dev(hex)
> >0,0,00,00 0x05 1 sg0 c:15:00 sr0 b:0b:00
> [...]
>
> Great, this was really missing badly.
>
> But how about adding another column: GUID.
> Most usb-storage and (all?) FireWire devices have such a unique identity.
> In contrast to native SCSI devices, these emulated SCSI devices on
> hot-plugging busses will change their LUNs/IDs. Therefore the GUID is
> really a must to be able to create stable names (laptop suspend, etc.).
>
> Both usb-storage and ieee1394-sbp2 know the GUID. It only needs to be
> communicated.
>
> The usb-storage GUID is just one random item of information.
> One might wish for much more.
>
> And: this information is already somewhere:
>
> % cat /proc/scsi/sg/host_strs
> SCSI host adapter emulation for IDE ATAPI devices
> Iomega VPI2 (imm) interface
> SCSI emulation for USB Mass Storage devices
> SCSI emulation for USB Mass Storage devices
> %
>
> This tells me that host 0 will be in ide-scsi, host 1 in imm,
> host 2 in usb-storage-0, host 3 in usb-storage-1.
> And
>
> % cat /proc/scsi/ide-scsi/0
> SCSI host adapter emulation for IDE ATAPI devices
> % cat /proc/scsi/imm/1
> Version : 2.05 (for Linux 2.4.0)
> Parport : parport0
> Mode : SPP
> % cat /proc/scsi/usb-storage-0/2
> Host scsi2: usb-storage
> Vendor: DataFab Systems Inc.
> Product: USB CF+SM
> Serial Number: 5DC69477C6
> Protocol: Transparent SCSI
> Transport: Datafab Bulk-Only
> GUID: 07c4a1090000005dc69477c6
> Attached: Yes
> % cat /proc/scsi/usb-storage-1/3
> Host scsi3: usb-storage
> Vendor: SCM Microsystems Inc.
> Product: eUSB SmartMedia / CompactFlash
> Serial Number: None
> Protocol: Transparent SCSI
> Transport: Control/Bulk-EUSB/SDDR09
> GUID: 04e600050000000000000000
> Attached: Yes
> %
>
> A small utility that looks around in /proc is able to
> find the GUID. Of course it would be better when fewer
> heuristics were required.
>
> Finally, the GUIDs you see here do not determine the LUN.
> So, there is no well-defined line in /proc/scsi/map
> where they would belong.

In lk 2.5 we are hoping that driverfs will give us an
"information bridge" between scsi pseudo devices
and other driver subsystems such as ide, usb and ieee1394.
Mike Sullivan's persistent naming patch (that I mentioned
in my previous post on this thread) adds driverfs capability
into the scsi subsystem. Driverfs capability is already
in the ide and usb subsystems.

procfs, driverfs and devfs ...

Doug Gilbert

2002-06-15 22:28:57

by Sancho Dauskardt

Subject: Re: /proc/scsi/map


>
> Both usb-storage and ieee1394-sbp2 know the GUID. It only needs to be
> communicated.
>
>The usb-storage GUID is just one random item of information.

Why is this 'one random item' ??
For usb-storage devices the GUID is built using Vendor ID & Device ID and
the device's serial number.

For identification purposes, the serial number is useless without Vendor ID
& Device ID.
Of course we'll never have a chance of creating stable names for devices that
don't have a serial number.
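
Judging only from the two GUIDs quoted in this thread, the layout appears to
be 4 hex digits of vendor ID, 4 hex digits of product ID, and the serial
number left-padded with zeros to 16 hex digits. A Python sketch under that
assumption (it is inferred from the examples, not taken from the usb-storage
source, and the function names are made up):

```python
def make_guid(vendor_id, product_id, serial=None):
    """Build a 24-hex-digit GUID string from vendor, product and serial."""
    serial_hex = "" if serial is None else serial.lower()
    if len(serial_hex) > 16:
        # How longer serials are handled is not visible from the examples.
        raise ValueError("serial longer than 16 hex digits not covered here")
    int(serial_hex or "0", 16)  # raises ValueError if serial is not hex
    return "%04x%04x%s" % (vendor_id, product_id, serial_hex.rjust(16, "0"))


def split_guid(guid):
    """Split a GUID string into (vendor_id, product_id, serial_hex)."""
    if len(guid) != 24:
        raise ValueError("expected 24 hex digits")
    return int(guid[0:4], 16), int(guid[4:8], 16), guid[8:]


def has_serial(guid):
    """True if the serial part is non-zero, i.e. the GUID can be unique."""
    return int(split_guid(guid)[2], 16) != 0
```

A GUID whose serial part is all zeros only identifies the device model, so
two identical readers without serial numbers cannot be told apart this way.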


>One might wish for much more.
>
>And: this information is already somewhere:
Sure. But
a) not easily readable
b) totally different for FireWire devices
c) race-conditions (reading multiple files).



>% cat /proc/scsi/usb-storage-0/2
> Host scsi2: usb-storage
> Vendor: DataFab Systems Inc.
> Product: USB CF+SM
>Serial Number: 5DC69477C6
> Protocol: Transparent SCSI
> Transport: Datafab Bulk-Only
> GUID: 07c4a1090000005dc69477c6
> Attached: Yes
Exactly, but finding this out at the moment involves reading:
/proc/scsi/scsi, scanning /proc/scsi/usb-storage-*, scanning
/proc/scsi/usb-storage-X/*, reading /proc/scsi/usb-storage-X/Y.


>% cat /proc/scsi/usb-storage-1/3
> Host scsi3: usb-storage
> Vendor: SCM Microsystems Inc.
> Product: eUSB SmartMedia / CompactFlash
>Serial Number: None
> Protocol: Transparent SCSI
> Transport: Control/Bulk-EUSB/SDDR09
> GUID: 04e600050000000000000000
> Attached: Yes

Well that's just SCM, huh ? The newer SCM-Orca chipset has a S/N.



>Finally, the GUIDs you see here do not determine the LUN.
>So, there is no well-defined line in /proc/scsi/map
>where they would belong.

But the LUN-map for such a device will never change ?
In case a device has 4 LUNs, we'll have 4 /proc/scsi/map entries with the
same GUID.

The hotplug agent will indeed be watching GUID+Lun.

- sda

2002-06-15 22:41:08

by Sancho Dauskardt

Subject: Re: /proc/scsi/map


>
>In lk 2.5 we are hoping that driverfs will give us an
>"information bridge" between scsi pseudo devices
>and other driver subsystems such as ide, usb and ieee1394.
>Mike Sullivan's persistent naming patch (that I mentioned
>in my previous post on this thread) adds driverfs capability
>into the scsi subsystem. Driverfs capability is already
>in the ide and usb subsystems.
Driverfs will hopefully solve the problem of "oh, there's a SCSI device,
how is it connected?".

But to date, SCSI doesn't know about the GUIDs, right ?
And without this, we won't get a uniform way of creating stable names for
hot-pluggable devices...

- sda

2002-06-15 23:00:55

by Andries E. Brouwer

Subject: Re: /proc/scsi/map

From [email protected] Sat Jun 15 23:10:06 2002

> In the lsml "[RFC] Persistent naming of scsi devices" thread
> Martin K. Petersen <[email protected]> summarized scsi addressing issues
> thus:
> #
> # What we want is (at least) three ways of addressing a device:
> #
> # 1. By content. This is the persistent naming. Think
> # filesystem/MD/LVM UUID. This is what you put in /etc/fstab and
> # what metadisk systems use to assemble logical volumes.
> #
> # Content referencing is used for accessing data.
> #
> # 2. By physical path. This naming is not persistent. Not even runtime
> # because hotplug, iSCSI and whatnot may mess things up.
> #
> # Path naming is for discovery and recovery. When you add an
> # unlabeled disk you want to reference it by path to give it a name.
> # When you have a failed disk on your system you want to know which
> # physical device to pull from the array.
> #
> # 3. By enumeration. This is what the kernel happens to be using to
> # reference the device. diskN. Certainly not persistent.
> #
> # Enumeration is for the kernel.

Yes, a very nice summary. Maybe also

# 4. By fingerprint. This naming is persistent, but need not
# identify the device uniquely. Think device type, vendor,
# serial number, capacity.
#
# Fingerprints are convenient for the user. "My ZIP drive".


> The /proc/scsi/map facility proposed by Kurt does a very
> good job at tying together points 2) and 3).

Yes. Or, more precisely, it ties together the name sde and the
path (C,B,T,U) = (3,0,00,01). However, this path is not very
"physical". Indeed, these four numbers all arose by enumeration -
they are kernel numbers for some usb-storage device.

I can find the physical path, but that requires quite some digging in
/proc/scsi/scsi and /proc/scsi/sg/host_strs and /proc/scsi/usb-storage*/*
and /proc/bus/usb/devices.


> > Very good - in combination with /proc/scsi/scsi this gives
> > good information. I like it.
>
> Adding INQUIRY strings would make it harder to parse and more
> cluttered for humans to read.

Yes, I would like to leave that to a user-space utility.
Writing that will also show what kernel facilities are missing.

> > But just "cat /proc/scsi/map" is not good enough.
> > From the above output alone one cannot easily guess which is which.
> > One would need a small utility that reads /proc/scsi/map and
> > /proc/scsi/scsi and produces something readable.
>
> Note the possible race condition: reading /proc/scsi/scsi
> then /proc/scsi/map (or vice versa) when a hotplug is
> occurring...

Already reading /proc/scsi/map has a race: no locking is done.

> > Will add sth to util-linux in case this gets accepted.
>
> IMO as soon as lk 2.4.19 comes out, this patch should
> be presented to Marcelo (for inclusion in lk 2.4.20).

Yes, perhaps. I would like to get some experience first,
play with it, see what information is missing.

I have to admit that 2.5 is not very suitable for development
these days, it is not stable enough, but the appropriate path
is to try things out in 2.5 first, and when something crystallizes
out, backport.

> The sooner we get it into the lk 2.5 series the better. Could you
> forward your lk 2.5 version of Kurt's patch to Linus (and the
> linux-scsi list)?

Yes, but I'll wait for 2.5.22 so that it is clear against
what tree to diff.

Andries