2011-06-15 08:23:34

by Nao Nishijima

Subject: [PATCH 0/3] [RFC] Persistent device name using preferred name

Hi,

This patch series adds a preferred name to kernel and procfs
messages. A preferred name is the name a user chooses for a device.

The purpose of this feature is to solve the persistent device
naming issues which were discussed here:

http://marc.info/?l=linux-scsi&m=130200794615884&w=2

There are four issues.
1. Kernel messages don't show persistent device names
2. procfs messages don't show persistent device names
3. Some commands don't support persistent device names in arguments
4. Some command messages don't show persistent device names

I then suggested intermediate device naming, which changes
the naming scheme, but it was rejected. I realized that we should
use udev to provide persistent device names instead of changing the
naming scheme.

In the LKML discussion, James Bottomley suggested a new idea:
add a new attribute to a device and have kernel messages print that
attribute, so that they show preferred names. The advantage of this
idea is that it does not change the current naming scheme.

I tried implementing preferred names, and there are two
discussion points.

(a) Which devices need support?
The preferred name is stored in struct device. Therefore it is
available to all devices once preferred name support is added for
other device types.

This patch series only supports SCSI block devices. Are there other
devices which need support? (e.g. network devices, generic SCSI devices, etc.)

(b) What kind of procfs form is good?
I implemented the preferred name output something like this:

(preferred name assigned foo to sda)
#cat /proc/partitions
major minor  #blocks  name

   8        0  488386584 foo
   8        1     194560 foo1
...

Do you need a device name field?
Something like this:

(preferred name assigned foo to sda)
#cat /proc/partitions
major minor  #blocks  name preferred

   8        0  488386584 sda  foo
   8        1     194560 sda1 foo1
...


Issues 3 and 4 are command-related issues. Commands have to be
modified to use the preferred name. We need to create a library for
preferred names.
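
As a rough illustration of what one helper in such a library might look
like, here is a minimal user-space sketch. It assumes the preferred_name
attribute proposed in patch 1/3 exists under <sysfs>/block/<name>/ and
that an unset name reads as empty or is absent; the function name and
the sysfs_root parameter are invented for this example and are not part
of the series.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up the preferred name of a block device, falling back to the
 * kernel name when none is available.
 *
 * sysfs_root is normally "/sys"; it is a parameter only so the helper
 * can be exercised against a scratch directory. The preferred_name
 * attribute is the one proposed in this series, not an existing ABI.
 */
static const char *preferred_or_kernel_name(const char *sysfs_root,
					    const char *kname,
					    char *buf, size_t len)
{
	char path[512];
	FILE *f;

	if (len == 0)
		return kname;

	snprintf(path, sizeof(path), "%s/block/%s/preferred_name",
		 sysfs_root, kname);

	f = fopen(path, "r");
	if (!f)
		return kname;		/* attribute missing: keep kernel name */

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return kname;		/* empty or unreadable attribute */
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';	/* sysfs values end with a newline */

	return buf[0] ? buf : kname;
}
```

A command would call this once per device it is about to print, so the
fallback keeps output unchanged on kernels without the attribute.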

Our goal is to solve those issues so that users can use and see
preferred names anywhere.

TODO:
- Change kernel messages
I'm going to replace the device name with the preferred name via
dev_name() in mmc, blk-core, sg, sr, st, fs, etc.

I would welcome any thoughts, comments and suggestions.

Thanks,

---

Nao Nishijima (3):
      [RFC] fs: print preferred name in procfs messages
      [RFC] sd: print preferred name in kernel messages.
      [RFC] genhd: add a new attribute in device structure


 block/genhd.c              |   28 ++++++++++++++++++++++++++++
 drivers/scsi/sd.c          |    2 +-
 drivers/scsi/sd.h          |    2 +-
 fs/partitions/check.c      |    8 +++++---
 include/linux/device.h     |    9 +++++++++
 include/scsi/scsi_device.h |    2 +-
 6 files changed, 45 insertions(+), 6 deletions(-)

--
Nao Nishijima ([email protected])


2011-06-15 08:22:57

by Nao Nishijima

Subject: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

Allow users to set the preferred name of a device via the sysfs interface.

(Example) sda -> foo
# echo foo > /sys/block/sda/preferred_name

Suggested-by: James Bottomley <[email protected]>
Suggested-by: Jon Masters <[email protected]>
Signed-off-by: Nao Nishijima <[email protected]>
---

 block/genhd.c          |   28 ++++++++++++++++++++++++++++
 include/linux/device.h |    9 +++++++++
 2 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 95822ae..79b97f6 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -909,6 +909,31 @@ static int __init genhd_device_init(void)

subsys_initcall(genhd_device_init);

+static ssize_t preferred_name_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ return snprintf(buf, PREFERRED_NAME_LEN, "%s\n", dev->preferred_name);
+}
+
+static ssize_t preferred_name_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct gendisk *disk = dev_to_disk(dev);
+ if (!count)
+ return -EINVAL;
+ if (strlen(buf) >= PREFERRED_NAME_LEN) {
+ printk(KERN_ERR "preferred_name: %s is too long\n", buf);
+ return -EINVAL;
+ }
+ dev->preferred_name = kasprintf(GFP_KERNEL, "%s", buf);
+ if (!dev->preferred_name)
+ return -ENOMEM;
+ printk(KERN_INFO "preferred_name: assigned %s to %s\n",
+ buf, disk->disk_name);
+ return count;
+}
+
static ssize_t disk_range_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
@@ -968,6 +993,8 @@ static ssize_t disk_discard_alignment_show(struct device *dev,
return sprintf(buf, "%d\n", queue_discard_alignment(disk->queue));
}

+static DEVICE_ATTR(preferred_name, S_IRUGO|S_IWUSR, preferred_name_show,
+ preferred_name_store);
static DEVICE_ATTR(range, S_IRUGO, disk_range_show, NULL);
static DEVICE_ATTR(ext_range, S_IRUGO, disk_ext_range_show, NULL);
static DEVICE_ATTR(removable, S_IRUGO, disk_removable_show, NULL);
@@ -990,6 +1017,7 @@ static struct device_attribute dev_attr_fail_timeout =
#endif

static struct attribute *disk_attrs[] = {
+ &dev_attr_preferred_name.attr,
&dev_attr_range.attr,
&dev_attr_ext_range.attr,
&dev_attr_removable.attr,
diff --git a/include/linux/device.h b/include/linux/device.h
index c66111a..0486644 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -490,6 +490,9 @@ struct device_dma_parameters {
unsigned long segment_boundary_mask;
};

+/* maximum length of preferred name */
+#define PREFERRED_NAME_LEN 256
+
/**
* struct device - The basic device structure
* @parent: The device's "parent" device, the device to which it is attached.
@@ -500,6 +503,7 @@ struct device_dma_parameters {
* See the comment of the struct device_private for detail.
* @kobj: A top-level, abstract class from which other classes are derived.
* @init_name: Initial name of the device.
+ * @preferred_name: Preferred name of the device.
* @type: The type of device.
* This identifies the device type and carries type-specific
* information.
@@ -556,6 +560,7 @@ struct device {

struct kobject kobj;
const char *init_name; /* initial name of the device */
+ const char *preferred_name; /* preferred name of the device */
const struct device_type *type;

struct mutex mutex; /* mutex to synchronize calls to
@@ -608,6 +613,10 @@ struct device {

static inline const char *dev_name(const struct device *dev)
{
+ /* Use the preferred name when users set it */
+ if (dev->preferred_name)
+ return dev->preferred_name;
+
/* Use the init name until the kobject becomes available */
if (dev->init_name)
return dev->init_name;
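
Two details of preferred_name_store() above may be worth a look when
reading the patch: a write made with "echo foo >" carries a trailing
newline that is copied into the name, and a second write replaces the
pointer without releasing the previous string. The following is only a
user-space model of one way a handler could deal with both, not the
posted code; struct fake_device and fake_preferred_name_store() are
hypothetical stand-ins, and malloc()/free() take the place of the
kernel allocators.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define PREFERRED_NAME_LEN 256	/* same limit as the patch */

/* Hypothetical stand-in for the one field of struct device used here. */
struct fake_device {
	char *preferred_name;
};

/*
 * Model of a store handler: returns the number of bytes consumed on
 * success or a negative errno value, like a sysfs ->store() method.
 */
static long fake_preferred_name_store(struct fake_device *dev,
				      const char *buf, size_t count)
{
	size_t len = count;
	char *name;

	/* Drop the trailing newline that "echo foo > file" appends. */
	if (len && buf[len - 1] == '\n')
		len--;

	if (len == 0 || len >= PREFERRED_NAME_LEN)
		return -EINVAL;

	name = malloc(len + 1);
	if (!name)
		return -ENOMEM;
	memcpy(name, buf, len);
	name[len] = '\0';

	free(dev->preferred_name);	/* release the previous name, if any */
	dev->preferred_name = name;

	return (long)count;
}
```

An in-kernel version would additionally have to consider readers that
call dev_name() while the name is being replaced; this model ignores
that entirely.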

2011-06-15 08:23:48

by Nao Nishijima

Subject: [PATCH 2/3] [RFC] sd: print preferred name in kernel messages.

This patch modifies sd_printk() and scmd_printk() to use the preferred name.

Suggested-by: James Bottomley <[email protected]>
Suggested-by: Jon Masters <[email protected]>
Signed-off-by: Nao Nishijima <[email protected]>
---

 drivers/scsi/sd.c          |    2 +-
 drivers/scsi/sd.h          |    2 +-
 include/scsi/scsi_device.h |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 953773c..6119760 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1071,7 +1071,7 @@ static int sd_ioctl(struct block_device *bdev, fmode_t mode,
int error;

SCSI_LOG_IOCTL(1, printk("sd_ioctl: disk=%s, cmd=0x%x\n",
- disk->disk_name, cmd));
+ dev_name(disk_to_dev(disk)), cmd));

/*
* If we are in the middle of error recovery, don't let anyone
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 6ad798b..3fa14f8 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -91,7 +91,7 @@ static inline struct scsi_disk *scsi_disk(struct gendisk *disk)
#define sd_printk(prefix, sdsk, fmt, a...) \
(sdsk)->disk ? \
sdev_printk(prefix, (sdsk)->device, "[%s] " fmt, \
- (sdsk)->disk->disk_name, ##a) : \
+ dev_name(disk_to_dev((sdsk)->disk)), ##a) : \
sdev_printk(prefix, (sdsk)->device, fmt, ##a)

/*
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index dd82e02..0fb955a 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -219,7 +219,7 @@ struct scsi_dh_data {
#define scmd_printk(prefix, scmd, fmt, a...) \
(scmd)->request->rq_disk ? \
sdev_printk(prefix, (scmd)->device, "[%s] " fmt, \
- (scmd)->request->rq_disk->disk_name, ##a) : \
+ dev_name(disk_to_dev((scmd)->request->rq_disk)), ##a) : \
sdev_printk(prefix, (scmd)->device, fmt, ##a)

enum scsi_target_state {

2011-06-15 08:23:31

by Nao Nishijima

Subject: [PATCH 3/3] [RFC] fs: print preferred name in procfs messages

Make disk_name() return the preferred name instead of disk_name
when a preferred name is set. disk_name() is used in
/proc/{partitions, diskstats}. Therefore, those files show the
preferred name.

Suggested-by: James Bottomley <[email protected]>
Signed-off-by: Nao Nishijima <[email protected]>
---

fs/partitions/check.c | 8 +++++---
1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/partitions/check.c b/fs/partitions/check.c
index d545e97..4ffdcf4 100644
--- a/fs/partitions/check.c
+++ b/fs/partitions/check.c
@@ -125,11 +125,13 @@ static int (*check_part[])(struct parsed_partitions *) = {
char *disk_name(struct gendisk *hd, int partno, char *buf)
{
if (!partno)
- snprintf(buf, BDEVNAME_SIZE, "%s", hd->disk_name);
+ snprintf(buf, BDEVNAME_SIZE, "%s", dev_name(disk_to_dev(hd)));
else if (isdigit(hd->disk_name[strlen(hd->disk_name)-1]))
- snprintf(buf, BDEVNAME_SIZE, "%sp%d", hd->disk_name, partno);
+ snprintf(buf, BDEVNAME_SIZE, "%sp%d", dev_name(disk_to_dev(hd)),
+ partno);
else
- snprintf(buf, BDEVNAME_SIZE, "%s%d", hd->disk_name, partno);
+ snprintf(buf, BDEVNAME_SIZE, "%s%d", dev_name(disk_to_dev(hd)),
+ partno);

return buf;
}
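
To make the resulting names easier to reason about, here is a small
user-space model of disk_name() as changed above. It is a sketch for
experimentation only: model_disk_name() and its two-string interface
are made up for this example, and NAME_SIZE stands in for the kernel's
BDEVNAME_SIZE (32).

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define NAME_SIZE 32	/* BDEVNAME_SIZE in the kernel */

/*
 * User-space model of disk_name() after this patch. shown is the name
 * that gets printed (the preferred name when set); kernel is the
 * original disk_name, which still decides whether a "p" is inserted.
 */
static char *model_disk_name(const char *shown, const char *kernel,
			     int partno, char *buf)
{
	size_t klen = strlen(kernel);

	if (!partno)
		snprintf(buf, NAME_SIZE, "%s", shown);
	else if (klen && isdigit((unsigned char)kernel[klen - 1]))
		snprintf(buf, NAME_SIZE, "%sp%d", shown, partno);
	else
		snprintf(buf, NAME_SIZE, "%s%d", shown, partno);

	return buf;
}
```

Because the digit test still looks at the kernel name, a preferred name
such as disk0 on sda gives disk01 for partition 1 rather than disk0p1.
Output is also limited to 31 characters, although PREFERRED_NAME_LEN
accepts names up to 255.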

2011-06-15 14:43:30

by James Bottomley

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Wed, 2011-06-15 at 17:16 +0900, Nao Nishijima wrote:
> Allow users to set the preferred name of device via sysfs interface.
>
> (Exsample) sda -> foo
> # echo foo > /sys/block/sda/preferred_name
>
> Suggested-by: James Bottomley <[email protected]>

So, as I said previously, I think this needs to be in struct device not
struct gendisk. It will be an easier infrastructure to use and it
easily generalises to our other naming problems (like net).

James

2011-06-15 15:40:19

by Greg KH

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Wed, Jun 15, 2011 at 05:16:28PM +0900, Nao Nishijima wrote:
> Allow users to set the preferred name of device via sysfs interface.
>
> (Exsample) sda -> foo
> # echo foo > /sys/block/sda/preferred_name
>
> Suggested-by: James Bottomley <[email protected]>
> Suggested-by: Jon Masters <[email protected]>
> Signed-off-by: Nao Nishijima <[email protected]>

You don't document this new sysfs file (which is required), nor do you
explain what it is for and how to use it.

Please do that in this patch, and in a Documentation/ABI/ file for any
new sysfs file you create.

I still fail to understand how a "preferred" file will help anyone out
here at all...

greg k-h

2011-06-15 15:40:21

by Greg KH

Subject: Re: [PATCH 0/3] [RFC] Persistent device name using preferred name

On Wed, Jun 15, 2011 at 05:16:10PM +0900, Nao Nishijima wrote:
> Hi,
>
> This patch series provides preferred name into kernel and procfs
> messages. Preferred name is user's preferred name for a device.
>
> The purpose of this feature is to solve the persistent device
> naming issues which was discussed here:
>
> http://marc.info/?l=linux-scsi&m=130200794615884&w=2
>
> There are four issues.
> 1. kernel messages doesn't show persistent device names

That is because a persistent device name could be anything, there are
multiple ways of defining a device, and the kernel will not know them
all as multiple ones could be in use for the same device.

> 2. procfs messages doesn't show persistent device names

See above.

> 3. Some commands didn't support persistent device name in arguments

Then fix the commands!

Seriously, this could be done by now, it's been over a year since this
was first discussed. All distros could have the updated packages by now
and this would not be an issue.

I still think this is the correct way to solve the problem as it is a
userspace issue, not a kernel one.

> 4. Some commands message didn't show persistent device names

Same as #3.

> Then I suggested the intermediate device naming which changes
> the naming scheme, but it was rejected. I realized that we should
> use udev to provide persistent device names instead of change the
> naming scheme.

Yes.

> In LKML discussion, a new idea was suggested by James Bottomley.
> This idea allows kernel messages show preferred names by adding a
> new attribute to a device, kernel messages show this new attribute.
> This idea's advantage is not to change the current naming scheme.
>
> I tried implementation of preferred name, and then there are two
> discussion points.
>
> (a) Which devices need support?
> Preferred name is stored in struct device. Therefore it is available
> for all devices if we make preferred name support with other device
> types.
>
> This patch series only support scsi block device. Is there the device
> which needs support? (e.g. Ntwork devices, generic SCSI devices, etc.)
>
> (b) What kind of procfs form is good?
> I implemented preferred name something like this,
>
> (preferred name assigned foo to sda)
> #cat /proc/partitions
> major minor #blocks name
>
> 8 0 488386584 foo
> 8 1 194560 foo1
> ...
>
> Do you needs device name filed?
> Something like this,
>
> (preferred name assigned foo to sda)
> #cat /proc/partitions
> major minor #blocks name preferred
>
> 8 0 488386584 sda foo
> 8 1 194560 sda1 foo1
> ...

Sorry, but you can not change the format of procfs files without
breaking a lot of tools, that's no longer allowed.

> Issue 3 and 4 is command releated issue. Commands have to be
> modified to use preferred name. We need to create library for
> preferred name.

Again, this is quite simple and could have been finished by now :(

> Our goal is to solve those issues, and users can use and see
> preferred name anywhere.

I don't see how your proposed solution would solve the issue of
userspace using different persistent names for the same device. How
would it know which one is correct?

Again, this is a userspace thing, not a kernel thing, please solve it in
userspace.

greg k-h

2011-06-16 12:03:59

by Nao Nishijima

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

Hi Greg,

(2011/06/16 0:33), Greg KH wrote:
> On Wed, Jun 15, 2011 at 05:16:28PM +0900, Nao Nishijima wrote:
>> Allow users to set the preferred name of device via sysfs interface.
>>
>> (Exsample) sda -> foo
>> # echo foo > /sys/block/sda/preferred_name
>>
>> Suggested-by: James Bottomley <[email protected]>
>> Suggested-by: Jon Masters <[email protected]>
>> Signed-off-by: Nao Nishijima <[email protected]>
>
> You don't document this new sysfs file (which is required), nor do you
> explain what it is for and how to use it.
>
> Please do that in this patch, and in a Documentation/ABI/ file for any
> new sysfs file you create.
>

I'm afraid that my explanation was not enough.
I will add an explanation to this patch and a Documentation/ABI file.

> I still fail to understand how a "preferred" file will help anyone out
> here at all...
>

Let me explain: users cannot identify a device from its device name
because device names may change at each boot. If the kernel shows
preferred names in kernel messages, users can easily identify a device
from kernel messages.

> greg k-h
>

Thanks,

--
Nao NISHIJIMA
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., YOKOHAMA Research Laboratory
Email: [email protected]

2011-06-16 15:41:58

by Greg KH

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Thu, Jun 16, 2011 at 09:03:43PM +0900, Nao Nishijima wrote:
> Hi Greg,
>
> (2011/06/16 0:33), Greg KH wrote:
> > On Wed, Jun 15, 2011 at 05:16:28PM +0900, Nao Nishijima wrote:
> >> Allow users to set the preferred name of device via sysfs interface.
> >>
> >> (Exsample) sda -> foo
> >> # echo foo > /sys/block/sda/preferred_name
> >>
> >> Suggested-by: James Bottomley <[email protected]>
> >> Suggested-by: Jon Masters <[email protected]>
> >> Signed-off-by: Nao Nishijima <[email protected]>
> >
> > You don't document this new sysfs file (which is required), nor do you
> > explain what it is for and how to use it.
> >
> > Please do that in this patch, and in a Documentation/ABI/ file for any
> > new sysfs file you create.
> >
>
> I'm afraid that my explanation was not enough.
> I will add explanation to this patch and Documentation/ABI file.
>
> > I still fail to understand how a "preferred" file will help anyone out
> > here at all...
> >
>
> Let me explain, users cannot identify a device from a device name
> because device names may change at each boot up time. If kernel show
> preferred names in kernel messages, users can easily identify a device
> from kernel messages.

I understand your request for the kernel to print out these types of
names, but I'm still not sold on this being an issue that is the
kernel's to deal with at all.

Again, how would you handle multiple persistent names for the same
device being used at the same time?

And again, why not just fix the userspace tools? That is trivial to
do and, again, could have been done by now in the years this has been
discussed.

thanks,

greg k-h

2011-06-16 15:50:59

by James Bottomley

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Thu, 2011-06-16 at 08:41 -0700, Greg KH wrote:
> On Thu, Jun 16, 2011 at 09:03:43PM +0900, Nao Nishijima wrote:
> > Hi Greg,
> >
> > (2011/06/16 0:33), Greg KH wrote:
> > > On Wed, Jun 15, 2011 at 05:16:28PM +0900, Nao Nishijima wrote:
> > >> Allow users to set the preferred name of device via sysfs interface.
> > >>
> > >> (Exsample) sda -> foo
> > >> # echo foo > /sys/block/sda/preferred_name
> > >>
> > >> Suggested-by: James Bottomley <[email protected]>
> > >> Suggested-by: Jon Masters <[email protected]>
> > >> Signed-off-by: Nao Nishijima <[email protected]>
> > >
> > > You don't document this new sysfs file (which is required), nor do you
> > > explain what it is for and how to use it.
> > >
> > > Please do that in this patch, and in a Documentation/ABI/ file for any
> > > new sysfs file you create.
> > >
> >
> > I'm afraid that my explanation was not enough.
> > I will add explanation to this patch and Documentation/ABI file.
> >
> > > I still fail to understand how a "preferred" file will help anyone out
> > > here at all...
> > >
> >
> > Let me explain, users cannot identify a device from a device name
> > because device names may change at each boot up time. If kernel show
> > preferred names in kernel messages, users can easily identify a device
> > from kernel messages.
>
> I understand your request for the kernel to print out these types of
> names, but I'm still not sold on this being an issue that is the
> kernel's to deal with at all.
>
> Again, how would you handle multiple persistant names for the same
> device being used at the same time?
>
> And again, why not just fix the userspace tools? That is trivial to do
> so and again, could have been done by now in the years this has been
> discussed.

So I can summarise where I think we are in these discussions:

We provide the ability to give all kernel devices a "preferred name".
By default this will be the device name the kernel would have originally
assigned. The dev_printk's will use the preferred name, and it will be
modifiable from user space. All the kernel will do is print out
whatever it is ... no guarantees of uniqueness or specific format will
be made. Since we're only providing one preferred_name file, the kernel
can only have one preferred name for a device at any given time
(although it is modifiable on the fly as many times as the user
chooses).

The design is to use this preferred name to implement what Hitachi wants
in terms of persistent name, but we don't really care.

All userspace naming will be taken care of by the usual udev rules, so
for disks, something like /dev/disk/by-preferred/<fred> which would be
the usual symbolic link.

This will ensure that kernel output and udev input are consistent. It
will still require that user space utilities which derive a name for a
device will need modifying to print out the preferred name.

James

2011-06-16 16:15:21

by Greg KH

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Thu, Jun 16, 2011 at 11:50:54AM -0400, James Bottomley wrote:
> > And again, why not just fix the userspace tools? That is trivial to do
> > so and again, could have been done by now in the years this has been
> > discussed.
>
> So I can summarise where I think we are in these discussions:
>
> We provide the ability to give all kernel devices a "preferred name".
> By default this will be the device name the kernel would have originally
> assigned. the dev_printk's will use the preferred name, and it will be
> modifiable from user space. All the kernel will do is print out
> whatever it is ... no guarantees of uniqueness or specific format will
> be made. Since we're only providing one preferred_name file, the kernel
> can only have one preferred name for a device at any given time
> (although it is modifiable on the fly as many times as the user
> chooses).
>
> The design is to use this preferred name to implement what Hitachi wants
> in terms of persistent name, but we don't really care.
>
> All userspace naming will be taken care of by the usual udev rules, so
> for disks, something like /dev/disk/by-preferred/<fred> which would be
> the usual symbolic link.

No, udev can not create such a link after the preferred name is set, as
it has no way of knowing that the name was set.

> This will ensure that kernel output and udev input are consistent. It
> will still require that user space utilities which derive a name for a
> device will need modifying to print out the preferred name.

It also doesn't solve the issue of userspace wanting to use such a
"preferred" name in the command line of tools, as there will not be a
link back to the "kernel" name directly in /dev/.

So as userspace tools will still need to be fixed, I don't see how
adding a kernel file for this is going to help any. Well, a bit in that
the kernel log files will look "different", but again, that really isn't
a problem that userspace couldn't also solve with no kernel changes
needed.

thanks,

greg k-h

2011-06-16 16:25:12

by James Bottomley

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Thu, 2011-06-16 at 09:14 -0700, Greg KH wrote:
> On Thu, Jun 16, 2011 at 11:50:54AM -0400, James Bottomley wrote:
> > > And again, why not just fix the userspace tools? That is trivial to do
> > > so and again, could have been done by now in the years this has been
> > > discussed.
> >
> > So I can summarise where I think we are in these discussions:
> >
> > We provide the ability to give all kernel devices a "preferred name".
> > By default this will be the device name the kernel would have originally
> > assigned. the dev_printk's will use the preferred name, and it will be
> > modifiable from user space. All the kernel will do is print out
> > whatever it is ... no guarantees of uniqueness or specific format will
> > be made. Since we're only providing one preferred_name file, the kernel
> > can only have one preferred name for a device at any given time
> > (although it is modifiable on the fly as many times as the user
> > chooses).
> >
> > The design is to use this preferred name to implement what Hitachi wants
> > in terms of persistent name, but we don't really care.
> >
> > All userspace naming will be taken care of by the usual udev rules, so
> > for disks, something like /dev/disk/by-preferred/<fred> which would be
> > the usual symbolic link.
>
> No, udev can not create such a link after the preferred name is set, as
> it has no way of knowing that the name was set.

It can if we trigger a uevent. Note: I'm not advocating this ... I'd be
equally happy having whatever sets the kernel name create the link (or
tickle udev to create it). We definitely require device links, though,
to get this to work.

> > This will ensure that kernel output and udev input are consistent. It
> > will still require that user space utilities which derive a name for a
> > device will need modifying to print out the preferred name.
>
> It also doesn't solve the issue of userspace wanting to use such a
> "preferred" name in the command line of tools, as there will not be a
> link back to the "kernel" name directly in /dev/.

Right ... most tools use the name they're given (and all variants
including the preferred one have links in /dev), which means they will
show the preferred name by default (if they were given that name as
input). The only problem is tools that attempt to derive a device name,
which is quite a small subset.

> So as userspace tools will still need to be fixed, I don't see how
> adding a kernel file for this is going to help any. Well, a bit in that
> the kernel log files will look "different", but again, that really isn't
> a problem that userspace couldn't also solve with no kernel changes
> needed.

This is true, but I think for the small effort it takes to implement the
feature in-kernel compared with what we'd have to do to the
distributions to get it implemented in userspace (we'd need klogd to do
the conversion for dmesg ... I'm entirely unclear what we need to modify
for /proc/partitions, etc.) the benefit outweighs the cost.

Additionally, since renaming is something users seem to want (just look
at net interfaces), if we can make this work, we now have a definitive
answer to point people at.

James

2011-06-16 17:10:08

by Kay Sievers

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Thu, Jun 16, 2011 at 18:25, James Bottomley
<[email protected]> wrote:
> On Thu, 2011-06-16 at 09:14 -0700, Greg KH wrote:
>> On Thu, Jun 16, 2011 at 11:50:54AM -0400, James Bottomley wrote:
>> > > And again, why not just fix the userspace tools?  That is trivial to do
>> > > so and again, could have been done by now in the years this has been
>> > > discussed.
>> >
>> > So I can summarise where I think we are in these discussions:
>> >
>> > We provide the ability to give all kernel devices a "preferred name".
>> > By default this will be the device name the kernel would have originally
>> > assigned.  the dev_printk's will use the preferred name, and it will be
>> > modifiable from user space.  All the kernel will do is print out
>> > whatever it is ... no guarantees of uniqueness or specific format will
>> > be made.  Since we're only providing one preferred_name file, the kernel
>> > can only have one preferred name for a device at any given time
>> > (although it is modifiable on the fly as many times as the user
>> > chooses).
>> >
>> > The design is to use this preferred name to implement what Hitachi wants
>> > in terms of persistent name, but we don't really care.
>> >
>> > All userspace naming will be taken care of by the usual udev rules, so
>> > for disks, something like /dev/disk/by-preferred/<fred> which would be
>> > the usual symbolic link.
>>
>> No, udev can not create such a link after the preferred name is set, as
>> it has no way of knowing that the name was set.
>
> It can if we trigger a uevent.  Note: I'm not advocating this ... I'd be
> equally happy having whatever sets the kernel name create the link (or
> tickle udev to create it).  We definitely require device links, though,
> to get this to work.

The tool which sets the name would be udev, I guess. What would be a
good example where this name would come from?

If these links are to be used in reality, all that must work from the
very first steps during early boot in initramfs, I guess. Adding names
later to existing devices by some other tool doesn't sound too
convincing.

I'm not opposed to the idea of a 'pretty name' in general, but I'd like
to see some real world example that makes sense, is better than what
we have, provides some generally useful infrastructure, solves a real
problem, and see how it can be consistently used.

I mean doing all that in contrast to simply having, for example, udev
always log the current 'kernel name' -> 'all symlinks' mapping to
syslog, and being able to interpret all history after that log entry
from syslog just fine. That can probably be done today already, with
a few lines of udev rules.
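
The user-space side of that alternative can be pictured as a log
post-processor. The sketch below is purely illustrative and assumes the
kernel-name-to-alias mapping has already been obtained from somewhere
(for instance from the udev log entry described above); it relies only
on the "[%s] " tag that sd_printk() puts in front of sd messages, and
rewrite_disk_tag() is a made-up name.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy line into out, replacing every "[<kernel>]" tag with
 * "[<alias>]". Returns 0 on success, -1 if out is too small.
 */
static int rewrite_disk_tag(const char *line, const char *kernel,
			    const char *alias, char *out, size_t outlen)
{
	char tag[64];
	size_t taglen, used = 0;
	const char *hit;
	int n;

	n = snprintf(tag, sizeof(tag), "[%s]", kernel);
	if (n < 0 || (size_t)n >= sizeof(tag))
		return -1;
	taglen = (size_t)n;

	while ((hit = strstr(line, tag)) != NULL) {
		/* text before the tag, then the substituted tag */
		n = snprintf(out + used, outlen - used, "%.*s[%s]",
			     (int)(hit - line), line, alias);
		if (n < 0 || (size_t)n >= outlen - used)
			return -1;
		used += (size_t)n;
		line = hit + taglen;
	}

	/* remainder after the last tag */
	n = snprintf(out + used, outlen - used, "%s", line);
	if (n < 0 || (size_t)n >= outlen - used)
		return -1;

	return 0;
}
```

Matching on the bracketed tag rather than the bare name avoids touching
sdaa when rewriting sda, but it only covers messages that use this
prefix; anything printed in another format would need its own rule.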

I guess the real problem is to finally admit that free-text syslog
is not the way to reliably do things in the future. We need proper
debug/error reporting from the kernel, not 'printk() from driver
hackers to admins' that they read and try to make sense of. The real
answer is probably a 'smart' kernel-syslog and a reliable channel with
structured data from the kernel to userspace. All that 'pretty name'
stuff looks suspiciously like papering over the real problem of a
missing general infrastructure which nobody has wanted to address for
years. Guess it's time to leave the UNIX stone age behind us. Nothing
against fancy text files filled with driver debug output someone
thought to be useful, I added enough of that myself, but I doubt that
'pretty names' are the thing that can solve what enterprise use cases
have been looking for for a very long time.

>> > This will ensure that kernel output and udev input are consistent.  It
>> > will still require that user space utilities which derive a name for a
>> > device will need modifying to print out the preferred name.
>>
>> It also doesn't solve the issue of userspace wanting to use such a
>> "preferred" name in the command line of tools, as there will not be a
>> link back to the "kernel" name directly in /dev/.
>
> Right ... most tools use the name they're given (and all variants
> including the preferred one have links in /dev), which means they will
> show the preferred name by default (if they were given that name as
> input).  The only problem is tools that attempt to derive a device name,
> which is quite a small subset.

The most important tool for disks, mount(8), canonicalizes the link
names to the primary device node name. :)
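
For readers unfamiliar with what canonicalizing means here, the effect
can be reproduced with realpath(3). This is only a sketch of the
general technique, not mount(8)'s actual code; kernel_name_of() is a
made-up helper name.

```c
#define _XOPEN_SOURCE 700	/* for realpath() and PATH_MAX */
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Follow every symlink in path and return the last component of the
 * resolved path, i.e. the primary device node name. resolved must
 * have room for PATH_MAX bytes. Returns NULL on error (errno is set).
 */
static const char *kernel_name_of(const char *path, char *resolved)
{
	const char *slash;

	if (!realpath(path, resolved))
		return NULL;

	slash = strrchr(resolved, '/');
	return slash ? slash + 1 : resolved;
}
```

Given a link such as /dev/disk/by-id/<something> that points at
../../sda, the helper yields sda, so any tool that resolves names this
way reports the kernel name regardless of which link it was handed.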

>> So as userspace tools will still need to be fixed, I don't see how
>> adding a kernel file for this is going to help any.  Well, a bit in that
>> the kernel log files will look "different", but again, that really isn't
>> a problem that userspace couldn't also solve with no kernel changes
>> needed.
>
> This is true, but I think for the small effort it takes to implement the
> feature in-kernel compared with what we'd have to do to the
> distributions to get it implemented in userspace (we'd need klogd to do
> the conversion for dmesg ... I'm entirely unclear what we need to modify
> for /proc/partitions, etc.) the benefit outweighs the cost.
>
> Additionally, since renaming is something users seem to want (just look
> at net interfaces), if we can make this work, we now have a definitive
> answer to point people at.

The way netifs are done today is a pretty good example of how we did
things wrong. We need to step back here, and probably put the naming
of on-board net interfaces right into the kernel, and do nothing for
interfaces which are not explicitly configured. The races that arise
with renaming we just can't handle properly. But anyway, that's a
different story.

Netifs are very different too, in the sense that firewall rules use
the names _in_ the kernel. We don't have such requirements for block
devices, unlike netifs, they are always just a number to the kernel.

And network interfaces have the concept of a 'pretty name' in the
kernel already, it's: /sys/class/net/*/ifalias. And it is not commonly
used, because people want not a single name but multiple names at the
same time, or just want the primary name set. :)

Kay

2011-06-16 17:20:29

by Kay Sievers

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Thu, Jun 16, 2011 at 19:09, Kay Sievers <[email protected]> wrote:
> On Thu, Jun 16, 2011 at 18:25, James Bottomley
> <[email protected]> wrote:
>> On Thu, 2011-06-16 at 09:14 -0700, Greg KH wrote:
>>>> All userspace naming will be taken care of by the usual udev rules, so
>>>> for disks, something like /dev/disk/by-preferred/<fred> which would be
>>>> the usual symbolic link.
>>>
>>> No, udev can not create such a link after the preferred name is set, as
>>> it has no way of knowing that the name was set.
>>
>> It can if we trigger a uevent.  Note: I'm not advocating this ... I'd be
>> equally happy having whatever sets the kernel name create the link (or
>> tickle udev to create it).  We definitely require device links, though,
>> to get this to work.

Guess all that would work now, including mount(8) not canonicalizing.
What would happen if we mount:
/dev/disk/by-pretty/foo
and some tool later thinks the pretty name should better be 'bar', it
writes the name to /sys, we get a uevent, the old link disappears, we
get a new link, mount has no device node anymore for the mounted
device ...

So we basically get a one-shot additional pretty name? I guess that
changing the _single_ name at any later time just asks for serious
problems. We need to set it very early for it to be really useful, but
how, and where is it coming from?

Kay

2011-06-16 17:33:04

by Douglas Gilbert

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On 11-06-16 11:50 AM, James Bottomley wrote:
> On Thu, 2011-06-16 at 08:41 -0700, Greg KH wrote:
>> On Thu, Jun 16, 2011 at 09:03:43PM +0900, Nao Nishijima wrote:
>>> Hi Greg,
>>>
>>> (2011/06/16 0:33), Greg KH wrote:
>>>> On Wed, Jun 15, 2011 at 05:16:28PM +0900, Nao Nishijima wrote:
>>>>> Allow users to set the preferred name of device via sysfs interface.
>>>>>
>>>>> (Example) sda -> foo
>>>>> # echo foo > /sys/block/sda/preferred_name
>>>>>
>>>>> Suggested-by: James Bottomley<[email protected]>
>>>>> Suggested-by: Jon Masters<[email protected]>
>>>>> Signed-off-by: Nao Nishijima<[email protected]>
>>>>
>>>> You don't document this new sysfs file (which is required), nor do you
>>>> explain what it is for and how to use it.
>>>>
>>>> Please do that in this patch, and in a Documentation/ABI/ file for any
>>>> new sysfs file you create.
>>>>
>>>
>>> I'm afraid that my explanation was not enough.
>>> I will add explanation to this patch and Documentation/ABI file.
>>>
>>>> I still fail to understand how a "preferred" file will help anyone out
>>>> here at all...
>>>>
>>>
>>> Let me explain, users cannot identify a device from a device name
>>> because device names may change at each boot up time. If kernel show
>>> preferred names in kernel messages, users can easily identify a device
>>> from kernel messages.
>>
>> I understand your request for the kernel to print out these types of
>> names, but I'm still not sold on this being an issue that is the
>> kernel's to deal with at all.
>>
>> Again, how would you handle multiple persistent names for the same
>> device being used at the same time?
>>
>> And again, why not just fix the userspace tools? That is trivial to do
>> so and again, could have been done by now in the years this has been
>> discussed.
>
> So I can summarise where I think we are in these discussions:
>
> We provide the ability to give all kernel devices a "preferred name".
> By default this will be the device name the kernel would have originally
> assigned. The dev_printk's will use the preferred name, and it will be
> modifiable from user space. All the kernel will do is print out
> whatever it is ... no guarantees of uniqueness or specific format will
> be made. Since we're only providing one preferred_name file, the kernel
> can only have one preferred name for a device at any given time
> (although it is modifiable on the fly as many times as the user
> chooses).
>
> The design is to use this preferred name to implement what Hitachi wants
> in terms of persistent name, but we don't really care.
>
> All userspace naming will be taken care of by the usual udev rules, so
> for disks, something like /dev/disk/by-preferred/<fred> which would be
> the usual symbolic link.
>
> This will ensure that kernel output and udev input are consistent. It
> will still require that user space utilities which derive a name for a
> device will need modifying to print out the preferred name.

I like the idea but wonder about some of the details.

Basically the problem is that you find yourself at the
keyboard of a server with a non-trivial storage set up
where something is wrong. The device names that a tool
like lsscsi shows (by default) are device nodes in
the /dev directory. For whatever reason those names may
not match what dmesg and kernel logs are showing.

Whether Greg likes it or not, pretty well all user space
tools, and those who use them, look around the /dev directory
for devices **.

lsscsi by default shows the first name in the /dev directory
that matches the major and minor in
/sys/block/<kernel_name>/dev ***. With the '--kname' option
lsscsi will show the kernel (/sys/block) device name instead.
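
A rough sketch of that lookup, not lsscsi's actual code: read the "major:minor" string from the sysfs `dev` file, then scan a directory for a block node with the same numbers. The function names are hypothetical, and returning the first match in sorted order is an assumption about what "first" means.

```python
import os
import stat


def parse_dev_numbers(text):
    """Parse the "major:minor" string found in /sys/block/<kernel_name>/dev."""
    major, minor = text.strip().split(":")
    return int(major), int(minor)


def first_matching_node(dev_dir, major, minor):
    """Return the first block node directly under dev_dir with these numbers.

    Only the top level of dev_dir is scanned, and symlinks are not followed.
    """
    for name in sorted(os.listdir(dev_dir)):
        path = os.path.join(dev_dir, name)
        try:
            st = os.lstat(path)
        except OSError:
            continue  # entry vanished between listdir() and lstat()
        if not stat.S_ISBLK(st.st_mode):
            continue
        if (os.major(st.st_rdev), os.minor(st.st_rdev)) == (major, minor):
            return path
    return None


if __name__ == "__main__":
    # Needs a Linux system that really has /sys/block/sda.
    with open("/sys/block/sda/dev") as f:
        numbers = parse_dev_numbers(f.read())
    print(first_matching_node("/dev", *numbers))
```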


As James notes, each storage device can only have one
"preferred name" at a time. But how can that preferred name
be represented in sysfs? If /sys/block/<kernel_name>
suddenly changed to /sys/block/<preferred_name> that would
be very disruptive. I would prefer that something like
/sys/class/block_preferred_name/<preferred_name> be introduced
as a symlink to /sys/block/<kernel_name>, but that would require
that all <preferred_name>s be unique. [Not a bad restriction
IMO]. If the <preferred_name> were only placed in the
/sys/block/<kernel_name> directory that would be ugly from
the user space tool point of view.


** I write several storage user space tools and can only
think of one instance where a sysfs "device" node is
accepted (smp_utils with the sgv4(bsg) interface) and
that is in addition to a /dev/bsg device node.

*** lsscsi could be smarter and drill down through /dev 's
sub-directories until a match is found. It could also
flag that the /dev name and the kernel_name are
different.

Doug Gilbert

2011-06-16 18:01:16

by Douglas Gilbert

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On 11-06-16 01:20 PM, Kay Sievers wrote:
> On Thu, Jun 16, 2011 at 19:09, Kay Sievers<[email protected]> wrote:
>> On Thu, Jun 16, 2011 at 18:25, James Bottomley
>> <[email protected]> wrote:
>>> On Thu, 2011-06-16 at 09:14 -0700, Greg KH wrote:
>>>>> All userspace naming will be taken care of by the usual udev rules, so
>>>>> for disks, something like /dev/disk/by-preferred/<fred> which would be
>>>>> the usual symbolic link.
>>>>
>>>> No, udev can not create such a link after the preferred name is set, as
>>>> it has no way of knowing that the name was set.
>>>
>>> It can if we trigger a uevent. Note: I'm not advocating this ... I'd be
>>> equally happy having whatever sets the kernel name create the link (or
>>> tickle udev to create it). We definitely require device links, though,
>>> to get this to work.
>
> Guess all that would work now, including mount(8) not canonicalizing.
> What would happen if we mount:
> /dev/disk/by-pretty/foo
> and some tool later thinks the pretty name should better be 'bar', it
> writes the name to /sys, we get a uevent, the old link disappears, we
> get a new link, mount has no device node anymore for the mounted
> device ...
>
> So we basically get a one-shot additional pretty name? Guess, the
> _single_ name changed anytime later just asks for serious problems. We
> need to set it very early to be really useful, but how, where is it
> coming from?

One obvious candidate for a preferred block device name
is:
- a SATA disk's WWN (NAA 5 64 bit), or
- a SCSI disk's logical unit name (e.g. SAS: NAA 5)

These names (actually numbers) are meant to be world wide
unique.

The kernel's device naming (following from how devices are
discovered) is topological. However at higher levels
the user is interested in the device identity. So if
unique device names were used as preferred names and
preferred names were unique (in a Linux system at any
given time) then any subsequent path to an existing device
would be highlighted. [That is because subsequent attempts
to create its preferred name would fail because it is
already there.]
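
The uniqueness argument can be modelled in a few lines. This is a toy illustration only, not kernel code: the class, its method, and the identifier string are all hypothetical. A second kernel device that tries to claim a name already in use is rejected, which is what would flag a second path to the same disk.

```python
class PreferredNameRegistry:
    """Toy model: one preferred name maps to at most one kernel device."""

    def __init__(self):
        self._owner = {}  # preferred name -> kernel name

    def claim(self, preferred_name, kernel_name):
        """Register the name; raise if another kernel device already holds it."""
        holder = self._owner.get(preferred_name)
        if holder is not None and holder != kernel_name:
            raise ValueError(
                f"{preferred_name} already names {holder}; "
                f"{kernel_name} looks like a second path to the same device"
            )
        self._owner[preferred_name] = kernel_name


if __name__ == "__main__":
    registry = PreferredNameRegistry()
    registry.claim("naa.5000000000000001", "sda")  # made-up identifier
    try:
        # Same disk showing up again over another transport.
        registry.claim("naa.5000000000000001", "sdb")
    except ValueError as err:
        print(err)
```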

You don't need thousands of dollars of equipment to
demonstrate this point. An external single disk
SATA enclosure with a USB and eSATA interface will do.

Doug Gilbert

2011-06-16 18:02:47

by Al Viro

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Thu, Jun 16, 2011 at 01:32:10PM -0400, Douglas Gilbert wrote:

> As James' notes each storage device can only have one
> "preferred name" at a time. But how can that preferred name
> be represented in sysfs ? If /sys/block/<kernel_name>
> suddenly changed to /sys/block/<preferred_name> that would
> be very disruptive. I would prefer something like
> /sys/class/block_preferred_name/<preferred_name> was introduced
> as a symlink to /sys/block/<kernel_name> but that would require
> that all <preferred_name>s were unique. [Not a bad restriction
> IMO]. If the <preferred_name> was only placed in the
> /sys/block/<kernel_name> directory that would be ugly from
> the user space tool point of view.

Oh, rapture... so now we can expect the container crowd to introduce
yet *another* "namespace" ;-/

2011-06-16 18:05:22

by Kay Sievers

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Thu, Jun 16, 2011 at 20:00, Douglas Gilbert <[email protected]> wrote:
> On 11-06-16 01:20 PM, Kay Sievers wrote:
>> On Thu, Jun 16, 2011 at 19:09, Kay Sievers<[email protected]>  wrote:
>>> On Thu, Jun 16, 2011 at 18:25, James Bottomley
>>> <[email protected]>  wrote:
>>>>
>>>> On Thu, 2011-06-16 at 09:14 -0700, Greg KH wrote:
>>>>>
>>>>>> All userspace naming will be taken care of by the usual udev rules, so
>>>>>> for disks, something like /dev/disk/by-preferred/<fred> which would be
>>>>>> the usual symbolic link.
>>>>>
>>>>> No, udev can not create such a link after the preferred name is set, as
>>>>> it has no way of knowing that the name was set.
>>>>
>>>> It can if we trigger a uevent.  Note: I'm not advocating this ... I'd be
>>>> equally happy having whatever sets the kernel name create the link (or
>>>> tickle udev to create it).  We definitely require device links, though,
>>>> to get this to work.
>>
>> Guess all that would work now, including mount(8) not canonicalizing.
>> What would happen if we mount:
>>   /dev/disk/by-pretty/foo
>> and some tool later thinks the pretty name should better be 'bar', it
>> writes the name to /sys, we get a uevent, the old link disappears, we
>> get a new link, mount has no device node anymore for the mounted
>> device ...
>>
>> So we basically get a one-shot additional pretty name? Guess, the
>> _single_ name changed anytime later just asks for serious problems. We
>> need to set it very early to be really useful, but how, where is it
>> coming from?
>
> One obvious candidate for a preferred block device name
> is:
>  - a SATA disk's WWN (NAA 5 64 bit), or
>  - a SCSI disk's logical unit name (e.g. SAS: NAA 5)
>
> These names (actually numbers) are meant to be world wide
> unique.
>
> The kernel's device naming (following from how devices are
> discovered) is topological. However at higher levels
> the user is interested in the device identity. So if
> unique device names were used as preferred names and
> preferred names were unique (in a Linux system at any
> given time) then any subsequent path to an existing device
> would be highlighted. [That is because subsequent attempts
> to create its preferred name would fail because it is
> already there.]
>
> You don't need thousands of dollars of equipment to
> demonstrate this point. An external single disk
> SATA enclosure with a USB and eSATA interface will do.

Udev has been doing that for quite a while already. This is my cheap laptop:
# find /dev/disk/ -name "wwn*"
/dev/disk/by-id/wwn-0x50015179593f3038-part1
/dev/disk/by-id/wwn-0x50015179593f3038-part4
/dev/disk/by-id/wwn-0x50015179593f3038-part3
/dev/disk/by-id/wwn-0x50015179593f3038-part2
/dev/disk/by-id/wwn-0x50015179593f3038

Kay

2011-06-16 18:16:20

by Douglas Gilbert

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On 11-06-16 02:05 PM, Kay Sievers wrote:
> On Thu, Jun 16, 2011 at 20:00, Douglas Gilbert<[email protected]> wrote:
>> On 11-06-16 01:20 PM, Kay Sievers wrote:
>>> On Thu, Jun 16, 2011 at 19:09, Kay Sievers<[email protected]> wrote:
>>>> On Thu, Jun 16, 2011 at 18:25, James Bottomley
>>>> <[email protected]> wrote:
>>>>>
>>>>> On Thu, 2011-06-16 at 09:14 -0700, Greg KH wrote:
>>>>>>
>>>>>>> All userspace naming will be taken care of by the usual udev rules, so
>>>>>>> for disks, something like /dev/disk/by-preferred/<fred> which would be
>>>>>>> the usual symbolic link.
>>>>>>
>>>>>> No, udev can not create such a link after the preferred name is set, as
>>>>>> it has no way of knowing that the name was set.
>>>>>
>>>>> It can if we trigger a uevent. Note: I'm not advocating this ... I'd be
>>>>> equally happy having whatever sets the kernel name create the link (or
>>>>> tickle udev to create it). We definitely require device links, though,
>>>>> to get this to work.
>>>
>>> Guess all that would work now, including mount(8) not canonicalizing.
>>> What would happen if we mount:
>>> /dev/disk/by-pretty/foo
>>> and some tool later thinks the pretty name should better be 'bar', it
>>> writes the name to /sys, we get a uevent, the old link disappears, we
>>> get a new link, mount has no device node anymore for the mounted
>>> device ...
>>>
>>> So we basically get a one-shot additional pretty name? Guess, the
>>> _single_ name changed anytime later just asks for serious problems. We
>>> need to set it very early to be really useful, but how, where is it
>>> coming from?
>>
>> One obvious candidate for a preferred block device name
>> is:
>> - a SATA disk's WWN (NAA 5 64 bit), or
>> - a SCSI disk's logical unit name (e.g. SAS: NAA 5)
>>
>> These names (actually numbers) are meant to be world wide
>> unique.
>>
>> The kernel's device naming (following from how devices are
>> discovered) is topological. However at higher levels
>> the user is interested in the device identity. So if
>> unique device names were used as preferred names and
>> preferred names were unique (in a Linux system at any
>> given time) then any subsequent path to an existing device
>> would be highlighted. [That is because subsequent attempts
>> to create its preferred name would fail because it is
>> already there.]
>>
>> You don't need thousands of dollars of equipment to
>> demonstrate this point. An external single disk
>> SATA enclosure with a USB and eSATA interface will do.
>
> Udev does that already since quite a while. This is my cheap laptop:
> # find /dev/disk/ -name "wwn*"
> /dev/disk/by-id/wwn-0x50015179593f3038-part1
> /dev/disk/by-id/wwn-0x50015179593f3038-part4
> /dev/disk/by-id/wwn-0x50015179593f3038-part3
> /dev/disk/by-id/wwn-0x50015179593f3038-part2
> /dev/disk/by-id/wwn-0x50015179593f3038

That is my point: if that disk is connected via both eSATA and USB,
which transport is that link pointing to? I would
prefer eSATA over USB any day, but is udev that smart? Or
are we just seeing a symlink to the first (or perhaps last)
path discovered?

Doug Gilbert

2011-06-16 18:22:44

by Greg KH

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Thu, Jun 16, 2011 at 12:25:06PM -0400, James Bottomley wrote:
> On Thu, 2011-06-16 at 09:14 -0700, Greg KH wrote:
> > > All userspace naming will be taken care of by the usual udev rules, so
> > > for disks, something like /dev/disk/by-preferred/<fred> which would be
> > > the usual symbolic link.
> >
> > No, udev can not create such a link after the preferred name is set, as
> > it has no way of knowing that the name was set.
>
> It can if we trigger a uevent. Note: I'm not advocating this ... I'd be
> equally happy having whatever sets the kernel name create the link (or
> tickle udev to create it). We definitely require device links, though,
> to get this to work.

And no, I don't want to trigger a uevent; Kay pointed out where this
will go very wrong very quickly if this is done.

> > > This will ensure that kernel output and udev input are consistent. It
> > > will still require that user space utilities which derive a name for a
> > > device will need modifying to print out the preferred name.
> >
> > It also doesn't solve the issue of userspace wanting to use such a
> > "preferred" name in the command line of tools, as there will not be a
> > link back to the "kernel" name directly in /dev/.
>
> Right ... most tools use the name they're given (and all variants
> including the preferred one have links in /dev), which means they will
> show the preferred name by default (if they were given that name as
> input). The only problem is tools that attempt to derive a device name,
> which is quite a small subset.

Douglas pointed out that those tools look in /dev/ which would not work
properly for this type of thing.

> > So as userspace tools will still need to be fixed, I don't see how
> > adding a kernel file for this is going to help any. Well, a bit in that
> > the kernel log files will look "different", but again, that really isn't
> > a problem that userspace couldn't also solve with no kernel changes
> > needed.
>
> This is true, but I think for the small effort it takes to implement the
> feature in-kernel compared with what we'd have to do to the
> distributions to get it implemented in userspace (we'd need klogd to do
> the conversion for dmesg ... I'm entirely unclear what we need to modify
> for /proc/partitions, etc.) the benefit outweighs the cost.
>
> Additionally, since renaming is something users seem to want (just look
> at net interfaces), if we can make this work, we now have a definitive
> answer to point people at.

Renaming is something that we do NOT want to do, as we have learned our
lesson from the network device renaming mess. And as Kay pointed out, we
already have an "alias" name there, which no one uses.

So again, I really don't like this. Just fix the userspace tools to map
the proper device name that the kernel is using to the userspace name
the tool used, and all is fine. This has been done already today,
successfully, by many of the big "enterprise" monitoring systems that
work quite well on Linux, proving that this is not something that the
kernel needs to provide to implement properly.

thanks,

greg k-h

2011-06-16 18:42:37

by Kay Sievers

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Thu, Jun 16, 2011 at 20:15, Douglas Gilbert <[email protected]> wrote:
> On 11-06-16 02:05 PM, Kay Sievers wrote:
>>> The kernel's device naming (following from how devices are
>>> discovered) is topological. However at higher levels
>>> the user is interested in the device identity. So if
>>> unique device names were used as preferred names and
>>> preferred names were unique (in a Linux system at any
>>> given time) then any subsequent path to an existing device
>>> would be highlighted. [That is because subsequent attempts
>>> to create its preferred name would fail because it is
>>> already there.]
>>>
>>> You don't need thousands of dollars of equipment to
>>> demonstrate this point. An external single disk
>>> SATA enclosure with a USB and eSATA interface will do.
>>
>> Udev does that already since quite a while. This is my cheap laptop:
>>   # find /dev/disk/ -name "wwn*"
>>   /dev/disk/by-id/wwn-0x50015179593f3038-part1
>>   /dev/disk/by-id/wwn-0x50015179593f3038-part4
>>   /dev/disk/by-id/wwn-0x50015179593f3038-part3
>>   /dev/disk/by-id/wwn-0x50015179593f3038-part2
>>   /dev/disk/by-id/wwn-0x50015179593f3038
>
> That is my point, if that disk is eSATA and USB connected
> which transport is that link pointing to? I would
> prefer eSATA over USB any day but is udev that smart? Or
> are we just seeing a symlink to the first (or perhaps last)
> path discovered?

I don't know if any bridge supports both connections at the same time.
Mine doesn't.

If two kernel devices fight for the same link, the last one wins,
unless there are link-priorities specified in udev rules.
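
The rule stated above can be written down as a small model. This sketch only encodes that description (the highest priority wins, and on a tie the most recent claim wins); it is not udev's implementation, and the function name and the tuple layout are assumptions.

```python
def link_owner(claims):
    """Pick the device that ends up owning a contested symlink.

    claims: list of (kernel_name, priority) in the order the devices appeared.
    The highest priority wins; on a tie the most recent claim wins.
    Returns None when nobody claims the link.
    """
    owner = None
    best = None
    for kernel_name, priority in claims:
        if best is None or priority >= best:
            owner, best = kernel_name, priority
    return owner


if __name__ == "__main__":
    # Equal priorities: the device discovered last takes the link.
    print(link_owner([("sdb", 0), ("sdc", 0)]))
    # A higher priority on the first path keeps the link there.
    print(link_owner([("sdb", 10), ("sdc", 0)]))
```

Under this model, preferring eSATA over USB for a dual-connected disk would take an explicit priority on the eSATA path; otherwise discovery order decides.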

Kay

2011-06-16 20:31:37

by James Bottomley

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Thu, 2011-06-16 at 11:19 -0700, Greg KH wrote:
> On Thu, Jun 16, 2011 at 12:25:06PM -0400, James Bottomley wrote:
> > On Thu, 2011-06-16 at 09:14 -0700, Greg KH wrote:
> > > > All userspace naming will be taken care of by the usual udev rules, so
> > > > for disks, something like /dev/disk/by-preferred/<fred> which would be
> > > > the usual symbolic link.
> > >
> > > No, udev can not create such a link after the preferred name is set, as
> > > it has no way of knowing that the name was set.
> >
> > It can if we trigger a uevent. Note: I'm not advocating this ... I'd be
> > equally happy having whatever sets the kernel name create the link (or
> > tickle udev to create it). We definitely require device links, though,
> > to get this to work.
>
> And no, I don't want to trigger a uevent, Kay pointed out where this
> will go very wrong very quickly if this is done.

As I said: we just need a by-preferred type of link.

> > > > This will ensure that kernel output and udev input are consistent. It
> > > > will still require that user space utilities which derive a name for a
> > > > device will need modifying to print out the preferred name.
> > >
> > > It also doesn't solve the issue of userspace wanting to use such a
> > > "preferred" name in the command line of tools, as there will not be a
> > > link back to the "kernel" name directly in /dev/.
> >
> > Right ... most tools use the name they're given (and all variants
> > including the preferred one have links in /dev), which means they will
> > show the preferred name by default (if they were given that name as
> > input). The only problem is tools that attempt to derive a device name,
> > which is quite a small subset.
>
> Douglas pointed out that those tools look in /dev/ which would not work
> properly for this type of thing.

Not all tools try to deduce the name ... most just use what they're
given. So some tools need changing, but it's by no means all tools.

> > > So as userspace tools will still need to be fixed, I don't see how
> > > adding a kernel file for this is going to help any. Well, a bit in that
> > > the kernel log files will look "different", but again, that really isn't
> > > a problem that userspace couldn't also solve with no kernel changes
> > > needed.
> >
> > This is true, but I think for the small effort it takes to implement the
> > feature in-kernel compared with what we'd have to do to the
> > distributions to get it implemented in userspace (we'd need klogd to do
> > the conversion for dmesg ... I'm entirely unclear what we need to modify
> > for /proc/partitions, etc.) the benefit outweighs the cost.
> >
> > Additionally, since renaming is something users seem to want (just look
> > at net interfaces), if we can make this work, we now have a definitive
> > answer to point people at.
>
> Renaming is something that we do NOT want to do, as we have learned our
> lesson of the network device renaming mess. And as Kay pointed out, we
> already have an "alias" name there, which no one uses.

Look at this as an opportunity to get it right. The original proposal
was for renaming. By iterating over the actual requirements, we have it
reduced to simply having the kernel print a preferred name. I think
that's a nice achievement which we can point other proponents of
renaming to as they arise.

> So again, I really don't like this, just fix the userspace tools to map
> the proper device name that the kernel is using to the userspace name
> the tool used, and all is fine. This has been done already today,
> successfully, by many of the big "enterprise" monitoring systems that
> work quite well on Linux, proving that this is not something that the
> kernel needs to provide to implement properly.

Well, it's expediency. Sure we could try to patch the world, but I
think the simple patch of getting the kernel to print a preferred name
solves 90% of the problem. Sure there is a long tail of userspace
components that need fixing, but that can be done gradually if we take
the kernel route. If we go the userspace route, it will be a long while
before we even get to 50% coverage.

James

2011-06-16 21:25:52

by Stefan Richter

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Jun 16 Douglas Gilbert wrote:
> One obvious candidate for a preferred block device name
> is:
> - a SATA disk's WWN (NAA 5 64 bit), or
> - a SCSI disk's logical unit name (e.g. SAS: NAA 5)
>
> These names (actually numbers) are meant to be world wide
> unique.
>
> The kernel's device naming (following from how devices are
> discovered) is topological.

I disagree.

1. The kernel name is not about topology. It is from a flat
namespace where nothing other than order of registration counts.
In some very simple systems there is a deceptively strong correlation
between topology and order of registration. But in the general case
there is no correlation.

2. The persistent worldwide unique name of a device may actually not be
what a particular admin considers a "preferred" name. They may prefer
a topological name ("disk in bay 7") or they may prefer a mnemonic name
from a label on the hardware, invented and tacked on by the vendor
(especially vendor--model name tuples like "Canon MV5i camcorder") or by
the admin themselves, or stored in device memory rather than written
out on a sticker.

Besides, the preference may change from situation to situation. In some
situations, _two_ names are actually required at the same time (besides the
unloved kernel name). So, a "preferred name" as a single datum per device
does not cut it, as Kay already noted with the example of the netif
alias.

The retrieval and mapping of the list of "more or less preferred names"
that people realistically use is probably better implemented in userland.
Any kernel solution is bound to impose arbitrary limitations.
--
Stefan Richter
-=====-==-== -==- =----
http://arcgraph.de/sr/

2011-06-16 22:05:55

by Kay Sievers

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Thu, Jun 16, 2011 at 22:31, James Bottomley
<[email protected]> wrote:
> On Thu, 2011-06-16 at 11:19 -0700, Greg KH wrote:
>> On Thu, Jun 16, 2011 at 12:25:06PM -0400, James Bottomley wrote:
>> > On Thu, 2011-06-16 at 09:14 -0700, Greg KH wrote:

>> > > No, udev can not create such a link after the preferred name is set, as
>> > > it has no way of knowing that the name was set.
>> >
>> > It can if we trigger a uevent.  Note: I'm not advocating this ... I'd be
>> > equally happy having whatever sets the kernel name create the link (or
>> > tickle udev to create it).  We definitely require device links, though,
>> > to get this to work.
>>
>> And no, I don't want to trigger a uevent, Kay pointed out where this
>> will go very wrong very quickly if this is done.
>
> As I said: we just need a by-preferred type of link.

And if the user changes the name, the link and all earlier uses will
be dangling, even /proc/mounts might show non-existing device names.

I honestly don't think there will ever be _the_ name for a device.
We've been there, stuff seems not to work that way in the real world.

I would really like to hear how stuff is supposed to compose _the_ name
in a real world use-case like a multipath setup, in initramfs, in a heavy
hotplug setup, and so on ... and with more details than "udev will set
_the_ name". I really fail to see how that is supposed to happen in a
useful way.

I think any solution that assumes the name can change later, after it
is possibly already used, is just wishful thinking.

We need many names, and we need all of them from the very beginning,
and they should not change during device lifetime unless the device
state changes.

>> > > So as userspace tools will still need to be fixed, I don't see how
>> > > adding a kernel file for this is going to help any.  Well, a bit in that
>> > > the kernel log files will look "different", but again, that really isn't
>> > > a problem that userspace couldn't also solve with no kernel changes
>> > > needed.
>> >
>> > This is true, but I think for the small effort it takes to implement the
>> > feature in-kernel compared with what we'd have to do to the
>> > distributions to get it implemented in userspace (we'd need klogd to do
>> > the conversion for dmesg ... I'm entirely unclear what we need to modify
>> > for /proc/partitions, etc.) the benefit outweighs the cost.
>> >
>> > Additionally, since renaming is something users seem to want (just look
>> > at net interfaces), if we can make this work, we now have a definitive
>> > answer to point people at.
>>
>> Renaming is something that we do NOT want to do, as we have learned our
>> lesson of the network device renaming mess.  And as Kay pointed out, we
>> already have an "alias" name there, which no one uses.
>
> Look at this as an opportunity to get it right.  The original proposal
> was for renaming.  By iterating over the actual requirements, we have it
> reduced to simply having the kernel print a preferred name.  I think
> that's a nice achievement which we can point other proponents of
> renaming to as they arise.

Sure, we absolutely don't want renaming, and we can provide countless
solid technical reasons why we should not allow it to happen. But I'm
also pretty sure we don't want just-another-single-name put
somewhere in the kernel.

>> So again, I really don't like this, just fix the userspace tools to map
>> the proper device name that the kernel is using to the userspace name
>> the tool used, and all is fine.  This has been done already today,
>> successfully, by many of the big "enterprise" monitoring systems that
>> work quite well on Linux, proving that this is not something that the
>> kernel needs to provide to implement properly.
>
> Well, it's expediency.  Sure we could try to patch the world, but I
> think the simple patch of getting the kernel to print a preferred name
> solves 90% of the problem.  Sure there is a long tail of userspace
> components that needs fixing, but that can be done gradually if we take
> the kernel route.  If we go the userspace route, it will be a long while
> before we even get to 50% coverage.

I need to ask again for an explanation of why logging all symlinks at
device discovery from udev does not solve exactly this problem. With
that tag in the syslog message stream, all later kernel names can be
safely associated with _all_ the current device names in question,
until the next tag from udev is found.
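
A sketch of how a log reader could use such tags, assuming udev wrote one line per device listing its current symlinks. The `udev-names:` line format is invented here for illustration; nothing in this thread defines what udev would actually log. Each later message that mentions a kernel name is annotated with the names from the most recent tag for that device.

```python
import re

# Hypothetical tag format: "udev-names: <kernel_name> -> <link> <link> ..."
TAG = re.compile(r"^udev-names: (\S+) -> (.*)$")


def annotate(lines):
    """Append the known symlink names to log lines that mention a kernel name."""
    names = {}  # kernel name -> symlinks from the most recent tag
    out = []
    for line in lines:
        m = TAG.match(line)
        if m:
            # A new tag replaces whatever was known about this device.
            names[m.group(1)] = m.group(2).split()
            out.append(line)
            continue
        extra = []
        for kernel_name, links in names.items():
            # Whole-word match so that "sda" does not match "sda1".
            if re.search(rf"\b{re.escape(kernel_name)}\b", line):
                extra.extend(links)
        out.append(f"{line} [{' '.join(extra)}]" if extra else line)
    return out


if __name__ == "__main__":
    log = [
        "udev-names: sda -> /dev/disk/by-id/example-disk",
        "end_request: I/O error, dev sda, sector 0",
    ]
    print("\n".join(annotate(log)))
```

Because the mapping is rebuilt from the stream itself, the same pass works on old logs, and a device can carry any number of names.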

Kay

2011-06-16 22:45:27

by James Bottomley

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Fri, 2011-06-17 at 00:05 +0200, Kay Sievers wrote:
> On Thu, Jun 16, 2011 at 22:31, James Bottomley
> <[email protected]> wrote:
> > On Thu, 2011-06-16 at 11:19 -0700, Greg KH wrote:
> >> On Thu, Jun 16, 2011 at 12:25:06PM -0400, James Bottomley wrote:
> >> > On Thu, 2011-06-16 at 09:14 -0700, Greg KH wrote:
>
> >> > > No, udev can not create such a link after the preferred name is set, as
> >> > > it has no way of knowing that the name was set.
> >> >
> >> > It can if we trigger a uevent. Note: I'm not advocating this ... I'd be
> >> > equally happy having whatever sets the kernel name create the link (or
> >> > tickle udev to create it). We definitely require device links, though,
> >> > to get this to work.
> >>
> >> And no, I don't want to trigger a uevent, Kay pointed out where this
> >> will go very wrong very quickly if this is done.
> >
> > As I said: we just need a by-preferred type of link.
>
> And if the user changes the name, the link and all earlier uses will
> be dangling, even /proc/mounts might show non-existing device names.

I don't understand this. If a user decides to call sda "fred" by doing
an echo to the preferred_name file, then there should be a link

/dev/disk/by-preferred/fred -> ../../sda

Even if the name later changes to "angela", /dev/disk/by-preferred/fred
will be valid if we don't clean it up.

> I honestly don't think there will ever be _the_ name for a device.
> We've been there, stuff seems not to work that way in the real world.

So that's not really the point ... all we do with it in-kernel is use it
as the device name for log prints and some basic /proc files (again,
mainly as prints); nothing more.

> I really like to hear how stuff is supposed to compose _the_ name in a
> real world use-case like a multipath setup, in initramfs, in a heavy
> hotplug setup, and so on ... And with more details than "udev will set
> _the_ name", I really fail to see how that is supposed to happen to be
> useful.

All that really has to happen is that we get a database of 1:1
correspondence between preferred name and actual name (with device
links).

> I think any solution that assumes the name can change later after it
> is possibly already used, is just wishful thinking.

The ability to change on the fly isn't part of the original Hitachi
proposal, but I don't really see why it can't be ... it just alters the
way the kernel prints out the name, nothing more.

> We need many names, and we need all of them from the very beginning,
> and they should not change during device lifetime unless the device
> state changes.

So that's actually an argument for leaving the links, surely? We can
have many inbound links, but the kernel can only print one name in
messages, which would be the preferred name that was currently set.

> >> > > So as userspace tools will still need to be fixed, I don't see how
> >> > > adding a kernel file for this is going to help any. Well, a bit in that
> >> > > the kernel log files will look "different", but again, that really isn't
> >> > > a problem that userspace couldn't also solve with no kernel changes
> >> > > needed.
> >> >
> >> > This is true, but I think for the small effort it takes to implement the
> >> > feature in-kernel compared with what we'd have to do to the
> >> > distributions to get it implemented in userspace (we'd need klogd to do
> >> > the conversion for dmesg ... I'm entirely unclear what we need to modify
> >> > for /proc/partitions, etc.) the benefit outweighs the cost.
> >> >
> >> > Additionally, since renaming is something users seem to want (just look
> >> > at net interfaces), if we can make this work, we now have a definitive
> >> > answer to point people at.
> >>
> >> Renaming is something that we do NOT want to do, as we have learned our
> >> lesson of the network device renaming mess. And as Kay pointed out, we
> >> already have an "alias" name there, which no one uses.
> >
> > Look at this as an opportunity to get it right. The original proposal
> > was for renaming. By iterating over the actual requirements, we have it
> > reduced to simply having the kernel print a preferred name. I think
> > that's a nice achievement which we can point other proponents of
> > renaming to as they arise.
>
> Sure, we absolutely don't want renaming, and we can provide countless
> solid technical reasons why we should not allow it to happen. But I'm
> also pretty sure, we also don't want just-another-single-name to put
> somewhere in the kernel.

I understand why we don't want renaming. However, the technical reason
why we want a preferred name is that it's often associated with a name
printed somewhere on the box (say a label on the disk enclosure, or
ethernet port). Not being able to use this name to address the device
is a usability issue which annoys the enterprise enormously.

So if we stop there, regardless of solution (in-kernel or fix all
userspace), does everyone see what the actual problem is?

> >> So again, I really don't like this, just fix the userspace tools to map
> >> the proper device name that the kernel is using to the userspace name
> >> the tool used, and all is fine. This has been done already today,
> >> succesfully, by many of the big "enterprise" monitoring systems that
> >> work quite well on Linux, proving that this is not something that the
> >> kernel needs to provide to implement properly.
> >
> > Well, it's expediency. Sure we could try to patch the world, but I
> > think the simple patch of getting the kernel to print a preferred name
> > solves 90% of the problem. Sure there is a long tail of userspace
> > components that needs fixing, but that can be done gradually if we take
> > the kernel route. If we go the userspace route, it will be a long while
> > before we even get to 50% coverage.
>
> I need to ask again ask for an explanation why logging all symlinks at
> device discovery from udev, does not solve exactly this problem. With
> that tag in the syslog message stream, all later kernel names can be
> safely associated with _all_ the current device names in question,
> until the next tag from udev is found.

So if the user has one preferred name, us logging all the names (and we
have quite a few for disks) doesn't really help because the user might
want to choose a different name. However, even if we assume they choose
one of the current names, they still have to do the mapping manually;
even if they have all the information, they can't just cut and paste
from dmesg say, they have to cut, edit the buffer to put in the
preferred name and then paste ... that's just one annoying step too far
for most users. I agree that all the output tools within reason can be
fixed to do this automatically, but fixing cat say, just so
cat /proc/partitions works would never be acceptable upstream.

The reason for storing this in the kernel is just that it's easier than
trying to update all the tools, and it solves 90% of the problem, which
makes the solution usable, even if we have to update tools to get to
100%.
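
(Editorial sketch, not part of the original mail: the manual mapping step discussed above, done by a small userspace filter over /proc/partitions-style text. The mapping argument and the function name `rename_partitions` are assumptions; the suffix handling only suits sdX-style names, since names like loop0 end in digits themselves.)

```python
def rename_partitions(text, mapping):
    """Replace kernel names in the last column of /proc/partitions-style text.

    mapping is {kernel disk name: preferred name}. Partitions keep their
    numeric suffix, so sda1 becomes foo1 when sda maps to foo.
    """
    out = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four fields: major, minor, #blocks, name.
        if len(fields) == 4 and fields[0].isdigit():
            name = fields[3]
            disk = name.rstrip("0123456789")
            if disk in mapping:
                suffix = name[len(disk):]
                line = line[: line.rfind(name)] + mapping[disk] + suffix
        out.append(line)
    return "\n".join(out)
```

This keeps the four-column layout unchanged, which matters because tools parse the file by position.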

James

2011-06-16 22:48:24

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Thu, 2011-06-16 at 13:32 -0400, Douglas Gilbert wrote:
> As James' notes each storage device can only have one
> "preferred name" at a time. But how can that preferred name
> be represented in sysfs ? If /sys/block/<kernel_name>
> suddenly changed to /sys/block/<preferred_name> that would
> be very disruptive. I would prefer something like
> /sys/class/block_preferred_name/<preferred_name> was introduced
> as a symlink to /sys/block/<kernel_name> but that would require
> that all <preferred_name>s were unique. [Not a bad restriction
> IMO]. If the <preferred_name> was only placed in the
> /sys/block/<kernel_name> directory that would be ugly from
> the user space tool point of view.

Right, so the only proposal for sysfs is the addition of the
preferred_name file to devices, nothing more. The entire sysfs tree
structure would be left intact. This is what gets us out of the
problems that a real rename causes.

James

2011-06-16 23:05:14

by Kay Sievers

[permalink] [raw]
Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Fri, Jun 17, 2011 at 00:45, James Bottomley
<[email protected]> wrote:
> On Fri, 2011-06-17 at 00:05 +0200, Kay Sievers wrote:
>> On Thu, Jun 16, 2011 at 22:31, James Bottomley
>> <[email protected]> wrote:
>> > On Thu, 2011-06-16 at 11:19 -0700, Greg KH wrote:
>> >> On Thu, Jun 16, 2011 at 12:25:06PM -0400, James Bottomley wrote:
>> >> > On Thu, 2011-06-16 at 09:14 -0700, Greg KH wrote:
>>
>> >> > > No, udev can not create such a link after the preferred name is set, as
>> >> > > it has no way of knowing that the name was set.
>> >> >
>> >> > It can if we trigger a uevent.  Note: I'm not advocating this ... I'd be
>> >> > equally happy having whatever sets the kernel name create the link (or
>> >> > tickle udev to create it).  We definitely require device links, though,
>> >> > to get this to work.
>> >>
>> >> And no, I don't want to trigger a uevent, Kay pointed out where this
>> >> will go very wrong very quickly if this is done.
>> >
>> > As I said: we just need a by-preferred type of link.
>>
>> And if the user changes the name, the link and all earlier uses will
>> be dangling, even /proc/mounts might show non-existing device names.
>
> I don't understand this.  If a user decides to call sda "fred" by doing
> an echo to the preferred named file, then there should be a link
>
> /dev/disk/by-preferred/fred -> ../../sda
>
> Even if the name later changes to "angela", /dev/disk/by-preferred/fred
> will be valid if we don't clean it up.

Not currently, udev needs to keep track of all symlinks, and removes
the ones not specified in rules. There is no way for udev to track
names that are no longer valid.

We can not just add stuff to /dev without a udev database entry, it
would never get removed on device unplug and leave a real mess behind.

>> I honestly don't think there will ever be _the_ name for a device.
>> We've been there, stuff seems not to work that way in the real world.
>
> So that's not really the point ... all we do with it in-kernel is use it
> as the device name for log prints and some basic /proc files (again,
> mainly as prints); nothing more.
>
>> I really like to hear how stuff is supposed to compose _the_ name in a
>> real world use-case like a multipath setup, in initramfs, in a heavy
>> hotplug setup, and so on ... And with more details than "udev will set
>> _the_ name", I really fail to see how that is supposed to happen to be
>> useful.
>
> All that really has to happen is that we get a database of 1:1
> correspondence between preferred name and actual name (with device
> links).
>
>> I think any solution that assumes the name can change later after it
>> is possibly already used, is just wishful thinking.
>
> The ability to change on the fly isn't part of the original hitachi
> proposal, but I don't really see why it can't ... it just alters the way
> the kernel prints out the name, nothing more.

And creates a huge problem for everything that uses that name.

>> We need many names, and we need all of them from the very beginning,
>> and they should not change during device lifetime unless the device
>> state changes.
>
> So that's actually an argument for leaving the links, surely?  We can
> have many inbound links, but the kernel can only print one name in
> messages, which would be the preferred name that was currently set.

I really question any concept of _the_ name. My take on it: It will
never work in reality.

>> >> > > So as userspace tools will still need to be fixed, I don't see how
>> >> > > adding a kernel file for this is going to help any.  Well, a bit in that
>> >> > > the kernel log files will look "different", but again, that really isn't
>> >> > > a problem that userspace couldn't also solve with no kernel changes
>> >> > > needed.
>> >> >
>> >> > This is true, but I think for the small effort it takes to implement the
>> >> > feature in-kernel compared with what we'd have to do to the
>> >> > distributions to get it implemented in userspace (we'd need klogd to do
>> >> > the conversion for dmesg ... I'm entirely unclear what we need to modify
>> >> > for /proc/partitions, etc.) the benefit outweighs the cost.
>> >> >
>> >> > Additionally, since renaming is something users seem to want (just look
>> >> > at net interfaces), if we can make this work, we now have a definitive
>> >> > answer to point people at.
>> >>
>> >> Renaming is something that we do NOT want to do, as we have learned our
>> >> lesson of the network device renaming mess.  And as Kay pointed out, we
>> >> already have an "alias" name there, which no one uses.
>> >
>> > Look at this as an opportunity to get it right.  The original proposal
>> > was for renaming.  By iterating over the actual requirements, we have it
>> > reduced to simply having the kernel print a preferred name.  I think
>> > that's a nice achievement which we can point other proponents of
>> > renaming to as they arise.
>>
>> Sure, we absolutely don't want renaming, and we can provide countless
>> solid technical reasons why we should not allow it to happen. But I'm
>> also pretty sure, we also don't want just-another-single-name to put
>> somewhere in the kernel.
>
> I understand why we don't want renaming.  However, the technical reason
> why we want a preferred name is that it's often associated with a name
> printed somewhere on the box (say a label on the disk enclosure, or
> ethernet port).  Not being able to use this name to address the device
> is a usability issue which annoys the enterprise enormously.
>
> So if we stop there, regardless of solution (in-kernel or fix all
> userspace), does everyone see what the actual problem is?

I don't think that solves the problem, no. We need _smart_ userspace
with a debug/error message channel from the kernel to userspace that
pops out _structured_ data. Userspace needs to index the data, and
merge a lot of userspace information into it.

Adding just another single-name to the kernel just makes the
much-too-dumb free-text printk() a bit more readable, but still does not
sound like a solution. Pimping up syslog is not the solution to this
problem, and it can't be solved in the kernel alone.

>> >> So again, I really don't like this, just fix the userspace tools to map
>> >> the proper device name that the kernel is using to the userspace name
>> >> the tool used, and all is fine.  This has been done already today,
>> >> succesfully, by many of the big "enterprise" monitoring systems that
>> >> work quite well on Linux, proving that this is not something that the
>> >> kernel needs to provide to implement properly.
>> >
>> > Well, it's expediency.  Sure we could try to patch the world, but I
>> > think the simple patch of getting the kernel to print a preferred name
>> > solves 90% of the problem.  Sure there is a long tail of userspace
>> > components that needs fixing, but that can be done gradually if we take
>> > the kernel route.  If we go the userspace route, it will be a long while
>> > before we even get to 50% coverage.
>>
>> I need to ask again ask for an explanation why logging all symlinks at
>> device discovery from udev, does not solve exactly this problem. With
>> that tag in the syslog message stream, all later kernel names can be
>> safely associated with _all_ the current device names in question,
>> until the next tag from udev is found.
>
> So if the user has one preferred name, us logging all the names (and we
> have quite a few for disks) doesn't really help because the user might
> want to choose a different name.  However, even if we assume they choose
> one of the current names, they still have to do the mapping manually;
> even if they have all the information, they can't just cut and paste
> from dmesg say, they have to cut, edit the buffer to put in the
> preferred name and then paste ... that's just one annoying step too far
> for most users.  I agree that all the output tools within reason can be
> fixed to do this automatically, but fixing cat say, just so
> cat /proc/partitions works would never be acceptable upstream.
>
> The reason for storing this in the kernel is just that it's easier than
> trying to update all the tools, and it solves 90% of the problem, which
> makes the solution usable, even if we have to update tools to get to
> 100%.

I don't think we can even solve 10% of the problems that way. It's
just a hack that makes stuff a bit more pretty, but doesn't provide
any reasonable solution to the problem. I doubt we can even make a
simple use case out of it, what name to put into that field for a
multipath setup.

Kay

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

(2011/06/17 1:14), Greg KH wrote:
> On Thu, Jun 16, 2011 at 11:50:54AM -0400, James Bottomley wrote:
>>> And again, why not just fix the userspace tools? That is trivial to do
>>> so and again, could have been done by now in the years this has been
>>> discussed.
>>
>> So I can summarise where I think we are in these discussions:
>>
>> We provide the ability to give all kernel devices a "preferred name".
>> By default this will be the device name the kernel would have originally
>> assigned. the dev_printk's will use the preferred name, and it will be
>> modifiable from user space. All the kernel will do is print out
>> whatever it is ... no guarantees of uniqueness or specific format will
>> be made. Since we're only providing one preferred_name file, the kernel
>> can only have one preferred name for a device at any given time
>> (although it is modifiable on the fly as many times as the user
>> chooses).
>>
>> The design is to use this preferred name to implement what Hitachi wants
>> in terms of persistent name, but we don't really care.
>>
>> All userspace naming will be taken care of by the usual udev rules, so
>> for disks, something like /dev/disk/by-preferred/<fred> which would be
>> the usual symbolic link.
>
> No, udev can not create such a link after the preferred name is set, as
> it has no way of knowing that the name was set.
>
>> This will ensure that kernel output and udev input are consistent. It
>> will still require that user space utilities which derive a name for a
>> device will need modifying to print out the preferred name.
>
> It also doesn't solve the issue of userspace wanting to use such a
> "preferred" name in the command line of tools, as there will not be a
> link back to the "kernel" name directly in /dev/.

Right, this series just adds a preferred name interface, and changes
a part of the kernel messages.

> So as userspace tools will still need to be fixed, I don't see how
> adding a kernel file for this is going to help any. Well, a bit in that
> the kernel log files will look "different", but again, that really isn't
> a problem that userspace couldn't also solve with no kernel changes
> needed.

Hmm, he didn't say "this can solve all problems". I think the
preferred name is just a starting point for solving these problems.
Actually, he decided to fix those user space tools to accept
persistent symbolic links, and to show them in their output.

It's not complete, but a good starting point, isn't it?

Thank you,

--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2011-06-17 05:26:49

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Fri, Jun 17, 2011 at 12:33:05PM +0900, Masami Hiramatsu wrote:
> (2011/06/17 1:14), Greg KH wrote:
> > On Thu, Jun 16, 2011 at 11:50:54AM -0400, James Bottomley wrote:
> >>> And again, why not just fix the userspace tools? That is trivial to do
> >>> so and again, could have been done by now in the years this has been
> >>> discussed.
> >>
> >> So I can summarise where I think we are in these discussions:
> >>
> >> We provide the ability to give all kernel devices a "preferred name".
> >> By default this will be the device name the kernel would have originally
> >> assigned. the dev_printk's will use the preferred name, and it will be
> >> modifiable from user space. All the kernel will do is print out
> >> whatever it is ... no guarantees of uniqueness or specific format will
> >> be made. Since we're only providing one preferred_name file, the kernel
> >> can only have one preferred name for a device at any given time
> >> (although it is modifiable on the fly as many times as the user
> >> chooses).
> >>
> >> The design is to use this preferred name to implement what Hitachi wants
> >> in terms of persistent name, but we don't really care.
> >>
> >> All userspace naming will be taken care of by the usual udev rules, so
> >> for disks, something like /dev/disk/by-preferred/<fred> which would be
> >> the usual symbolic link.
> >
> > No, udev can not create such a link after the preferred name is set, as
> > it has no way of knowing that the name was set.
> >
> >> This will ensure that kernel output and udev input are consistent. It
> >> will still require that user space utilities which derive a name for a
> >> device will need modifying to print out the preferred name.
> >
> > It also doesn't solve the issue of userspace wanting to use such a
> > "preferred" name in the command line of tools, as there will not be a
> > link back to the "kernel" name directly in /dev/.
>
> Right, this series just add a preferred name interface, and changes
> a part of kernel messages.

So, just a tiny part of what you want to do in the end?

> > So as userspace tools will still need to be fixed, I don't see how
> > adding a kernel file for this is going to help any. Well, a bit in that
> > the kernel log files will look "different", but again, that really isn't
> > a problem that userspace couldn't also solve with no kernel changes
> > needed.
>
> hmm, He didnt say "this can solve all problems". I think
> preferred name is just a starting point to solve these problems.
> Actually, he decided to fix those user space tools to accept
> persistent symbolic links, and to show it in outputs.
>
> It's not complete, but a good starting point, isn't it?

A starting point for what? What is your "end goal" here to accomplish?

As this does not seem to me to really solve what you see as your "real"
problem, perhaps you can explain what your next steps are going to be
after this?

thanks,

greg k-h

2011-06-17 05:26:57

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Thu, Jun 16, 2011 at 04:31:29PM -0400, James Bottomley wrote:
> > So again, I really don't like this, just fix the userspace tools to map
> > the proper device name that the kernel is using to the userspace name
> > the tool used, and all is fine. This has been done already today,
> > succesfully, by many of the big "enterprise" monitoring systems that
> > work quite well on Linux, proving that this is not something that the
> > kernel needs to provide to implement properly.
>
> Well, it's expediency. Sure we could try to patch the world, but I
> think the simple patch of getting the kernel to print a preferred name
> solves 90% of the problem. Sure there is a long tail of userspace
> components that needs fixing, but that can be done gradually if we take
> the kernel route. If we go the userspace route, it will be a long while
> before we even get to 50% coverage.

I do not think that just because some people feel it is easier to change
the kernel than change userspace tools, that we are somehow forced to
accept their changes.

As for "expediency", it has been a full year since the last time this
was proposed. All userspace tools that would need to be changed to
implement this in userspace have had updates released for them in that
year, and the changes they need could have been made already.

So any argument about "quickness" here which requires a kernel change
instead of fixing userspace programs is totally false, sorry.

greg k-h

2011-06-17 05:58:24

by Nao Nishijima

[permalink] [raw]
Subject: Re: [PATCH 0/3] [RFC] Persistent device name using preferred name

Hi Greg,

(2011/06/16 0:37), Greg KH wrote:
> On Wed, Jun 15, 2011 at 05:16:10PM +0900, Nao Nishijima wrote:
>> Hi,
>>
>> This patch series provides preferred name into kernel and procfs
>> messages. Preferred name is user's preferred name for a device.
>>
>> The purpose of this feature is to solve the persistent device
>> naming issues which was discussed here:
>>
>> http://marc.info/?l=linux-scsi&m=130200794615884&w=2
>>
>> There are four issues.
>> 1. kernel messages doesn't show persistent device names
>
> That is because a persistent device name could be anything, there are
> multiple ways of defining a device, and the kernel will not know them
> all as multiple ones could be in use for the same device.
>

Right, so I'd like to solve it by assigning a unique preferred name
to each device. I mean, the user of the preferred name decides on one
way of identifying a device, and the kernel can then show that name as
the persistent name for that user. Since this is based entirely on the
user's own choice, no one will complain about it.


>> 2. procfs messages doesn't show persistent device names
>
> See above.
>
>> 3. Some commands didn't support persistent device name in arguments
>
> Then fix the commands!
>

Yes, of course. I'd like to fix those commands to accept preferred names.


> Seriously, this could be done by now, it's been over a year since this
> was first discussed. All distros could have the updated packages by now
> and this would not be an issue.
>
> I still think this is the correct way to solve the problem as it is a
> userspace issue, not a kernel one.
>

Agreed. For #3 and #4, I don't think they can be solved in kernel space.

>> 4. Some commands message didn't show persistent device names
>
> Same as #3.
>

Yeah. Again, I've changed my mind, I'll try to fix those commands.

>> Then I suggested the intermediate device naming which changes
>> the naming scheme, but it was rejected. I realized that we should
>> use udev to provide persistent device names instead of change the
>> naming scheme.
>
> Yes.
>
>> In LKML discussion, a new idea was suggested by James Bottomley.
>> This idea allows kernel messages show preferred names by adding a
>> new attribute to a device, kernel messages show this new attribute.
>> This idea's advantage is not to change the current naming scheme.
>>
>> I tried implementation of preferred name, and then there are two
>> discussion points.
>>
>> (a) Which devices need support?
>> Preferred name is stored in struct device. Therefore it is available
>> for all devices if we make preferred name support with other device
>> types.
>>
>> This patch series only support scsi block device. Is there the device
>> which needs support? (e.g. Ntwork devices, generic SCSI devices, etc.)
>>
>> (b) What kind of procfs form is good?
>> I implemented preferred name something like this,
>>
>> (preferred name assigned foo to sda)
>> #cat /proc/partitions
>> major minor #blocks name
>>
>> 8 0 488386584 foo
>> 8 1 194560 foo1
>> ...
>>
>> Do you needs device name filed?
>> Something like this,
>>
>> (preferred name assigned foo to sda)
>> #cat /proc/partitions
>> major minor #blocks name preferred
>>
>> 8 0 488386584 sda foo
>> 8 1 194560 sda1 foo1
>> ...
>
> Sorry, but you can not change the format of procfs files without
> breaking a lot of tools, that's no longer allowed.
>

OK, I would not change the format of procfs files.

>> Issue 3 and 4 is command releated issue. Commands have to be
>> modified to use preferred name. We need to create library for
>> preferred name.
>
> Again, this is quite simple and could have been finished by now :(
>

:(

>> Our goal is to solve those issues, and users can use and see
>> preferred name anywhere.
>
> I don't see how your proposed solution would solve the issue of
> userspace using different persistant names for the same device. How
> would it know which one is correct?
>

My proposal doesn't use the current persistent device names for access.
They are used in a udev rule as information to identify a device,
something like this:

(Example: using by-id)
SUBSYSTEM=="block", \
  ENV{ID_SERIAL}=="scsi-SATA_WDC_WD5000AAKS-_WD-WCASY6088049", \
  SYMLINK+="disk/by-preferred/foo", PROGRAM="write_preferred_name %p foo"

To access a device persistently through by-preferred, I create a new
symbolic link from /dev/disk/by-preferred/foo to /dev/sdX.

> Again, this is a userspace thing, not a kernel thing, please solve it in
> userspace.
>

To solve it in userspace, we need to map each device name to a device
at every boot. But if kernel messages can show the preferred name, we
can identify a device from the kernel messages alone.

Thanks,

--
Nao NISHIJIMA
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., YOKOHAMA Research Laboratory
Email: [email protected]

2011-06-17 06:27:34

by Hannes Reinecke

[permalink] [raw]
Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On 06/16/2011 07:20 PM, Kay Sievers wrote:
> On Thu, Jun 16, 2011 at 19:09, Kay Sievers<[email protected]> wrote:
>> On Thu, Jun 16, 2011 at 18:25, James Bottomley
>> <[email protected]> wrote:
>>> On Thu, 2011-06-16 at 09:14 -0700, Greg KH wrote:
>>>> All userspace naming will be taken care of by the usual udev rules, so
>>>>> for disks, something like /dev/disk/by-preferred/<fred> which would be
>>>>> the usual symbolic link.
>>>>
>>>> No, udev can not create such a link after the preferred name is set, as
>>>> it has no way of knowing that the name was set.
>>>
>>> It can if we trigger a uevent. Note: I'm not advocating this ... I'd be
>>> equally happy having whatever sets the kernel name create the link (or
>>> tickle udev to create it). We definitely require device links, though,
>>> to get this to work.
>
> Guess all that would work now, including mount(8) not canonicalizing.
> What would happen if we mount:
> /dev/disk/by-pretty/foo
> and some tool later thinks the pretty name should better be 'bar', it
> writes the name to /sys, we get a uevent, the old link disappears, we
> get a new link, mount has no device node anymore for the mounted
> device ...
>
> So we basically get a one-shot additional pretty name? Guess, the
> _single_ name changed anytime later just asks for serious problems. We
> need to set it very early to be really useful, but how, where is it
> coming from?
>
Well, certain storage arrays are able to print out the user-defined
name for the LUNs:

# sg_vpd -p 0xc8 /dev/sdc
Extended device identification (RDAC) VPD Page:
Volume Unique Identifier: 60080e50001bf1f0000005004ddb05a4
Creation Number: 1280, Timestamp: Tue May 24 03:11:00 2011
Volume User Label: mas-1
Storage Array Unique Identifier: 60080e50001bf1f0000000004d418973
Storage Array User Label: LSI-SAS-DIF
Logical Unit Number: 0000000000000000

where the 'Volume User Label' is the name the administrator has
given to the LUN on the storage array.
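
(Editorial sketch, not part of the original mail: pulling the label out of output shaped like the sg_vpd listing above. It only parses text that has already been captured; it does not run sg_vpd itself, and the function names are hypothetical.)

```python
def parse_vpd(text):
    """Turn 'Key: value' lines of sg_vpd-style output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split at the first ": " only, so values may contain colons.
        key, sep, value = line.strip().partition(": ")
        if sep and value:
            fields[key] = value
    return fields


def volume_label(text):
    """Return the administrator-assigned LUN label, or None if absent."""
    return parse_vpd(text).get("Volume User Label")
```

For the listing above, `volume_label` would return "mas-1", which is the kind of value that could be fed into a naming rule.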

So for these kinds of things it would be useful.

However, a single pretty name is quite a limitation.
And I also fail to see why this can't be handled in userspace.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
[email protected] +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

2011-06-17 06:55:39

by Stefan Richter

[permalink] [raw]
Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Jun 16 James Bottomley wrote:
> I agree that all the output tools within reason can be
> fixed to do this automatically, but fixing cat say, just so
> cat /proc/partitions works would never be acceptable upstream.

There is an alternative to cat, called sed, which already carries the fix. :-)
--
Stefan Richter
-=====-==-== -==- =---=
http://arcgraph.de/sr/

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

(2011/06/17 14:22), Greg KH wrote:
>>>> This will ensure that kernel output and udev input are consistent. It
>>>> will still require that user space utilities which derive a name for a
>>>> device will need modifying to print out the preferred name.
>>>
>>> It also doesn't solve the issue of userspace wanting to use such a
>>> "preferred" name in the command line of tools, as there will not be a
>>> link back to the "kernel" name directly in /dev/.
>>
>> Right, this series just add a preferred name interface, and changes
>> a part of kernel messages.
>
> So, just a tiny part of what you want to do in the end?

Right.

>>> So as userspace tools will still need to be fixed, I don't see how
>>> adding a kernel file for this is going to help any. Well, a bit in that
>>> the kernel log files will look "different", but again, that really isn't
>>> a problem that userspace couldn't also solve with no kernel changes
>>> needed.
>>
>> hmm, He didnt say "this can solve all problems". I think
>> preferred name is just a starting point to solve these problems.
>> Actually, he decided to fix those user space tools to accept
>> persistent symbolic links, and to show it in outputs.
>>
>> It's not complete, but a good starting point, isn't it?
>
> A starting point for what? What is your "end goal" here to accomplish?

The goal is to allow users to access their devices via a uniform
and simple persistent name. "Access" is not only read/write but also
checking the device status (e.g. procfs), finding the device in the log
(dmesg) and using tools.
Of course, the last issue is only for the tools, and we should fix
the individual tools instead of the kernel.

> As this does not seem to me to really solve what you see as your "real"
> problem, perhaps you can explain what your next steps are going to be
> after this?

After this (kernel side change), we'd like to fix those tools to
accept the preferred name, and to show it, as Nao said:

> 1. kernel messages doesn't show persistent device names
> 2. procfs messages doesn't show persistent device names
> 3. Some commands didn't support persistent device name in arguments
> 4. Some commands message didn't show persistent device names
[...]
> Issue 3 and 4 is command releated issue. Commands have to be
> modified to use preferred name. We need to create library for
> preferred name.

Of course, in this step, we will provide a document on how to set up
this feature, a udev rule (Nao sent an example rule in another mail)
and helper scripts, so that anyone who is interested in preferred
device naming can test this.

Thank you,

--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2011-06-17 11:36:45

by Nao Nishijima

[permalink] [raw]
Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

Hi Kay,

Thank you for looking at it.

(2011/06/17 2:20), Kay Sievers wrote:
> On Thu, Jun 16, 2011 at 19:09, Kay Sievers <[email protected]> wrote:
>> On Thu, Jun 16, 2011 at 18:25, James Bottomley
>> <[email protected]> wrote:
>>> On Thu, 2011-06-16 at 09:14 -0700, Greg KH wrote:
>>>> All userspace naming will be taken care of by the usual udev rules, so
>>>>> for disks, something like /dev/disk/by-preferred/<fred> which would be
>>>>> the usual symbolic link.
>>>>
>>>> No, udev can not create such a link after the preferred name is set, as
>>>> it has no way of knowing that the name was set.
>>>
>>> It can if we trigger a uevent. Note: I'm not advocating this ... I'd be
>>> equally happy having whatever sets the kernel name create the link (or
>>> tickle udev to create it). We definitely require device links, though,
>>> to get this to work.
>
> Guess all that would work now, including mount(8) not canonicalizing.
> What would happen if we mount:
> /dev/disk/by-pretty/foo
> and some tool later thinks the pretty name should better be 'bar', it
> writes the name to /sys, we get a uevent, the old link disappears, we
> get a new link, mount has no device node anymore for the mounted
> device ...
>
> So we basically get a one-shot additional pretty name? Guess, the
> _single_ name changed anytime later just asks for serious problems. We
> need to set it very early to be really useful, but how, where is it
> coming from?
>

If that avoids the serious problems, I will implement the preferred name
as write-once.

I have two ideas for making the preferred name work from the very first
steps of early boot in the initramfs.

1. I think we can provide udev rules for the preferred name in the initramfs.
2. (optional) I am considering using the user-defined name for the LUNs (*),
because it is possible to get that name in the initramfs. The user stores
the preferred name in the LUN with an sg command, if available.
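
The write-once behaviour could look roughly like the following. This is a Python model of the semantics only, not the kernel implementation; the class and method names are hypothetical, and the choice of error is just one possibility.

```python
class Device:
    """Toy model: a kernel name plus a write-once preferred name."""

    def __init__(self, kernel_name):
        self.kernel_name = kernel_name
        self._preferred = None

    @property
    def preferred_name(self):
        return self._preferred

    def set_preferred_name(self, name):
        # Reject empty names so "unset" and "set" stay distinguishable.
        if not name:
            raise ValueError("preferred name must not be empty")
        # Write-once: a second write is refused, so links created from
        # the first name can never go stale.
        if self._preferred is not None:
            raise PermissionError("preferred name already set")
        self._preferred = name

    def display_name(self):
        # Messages fall back to the kernel name until a name is set.
        return self._preferred or self.kernel_name


if __name__ == "__main__":
    dev = Device("sda")
    dev.set_preferred_name("foo")
    print(dev.display_name())
```

With this model, the rename-while-mounted scenario Kay describes cannot happen, at the cost of requiring the name to be right the first time.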

(*) Hannes said:
> Well, certain storage arrays are able to print out the user-defined name for the LUNs:
>
> # sg_vpd -p 0xc8 /dev/sdc
> Extended device identification (RDAC) VPD Page:
> Volume Unique Identifier: 60080e50001bf1f0000005004ddb05a4
> Creation Number: 1280, Timestamp: Tue May 24 03:11:00 2011
> Volume User Label: mas-1
> Storage Array Unique Identifier: 60080e50001bf1f0000000004d418973
> Storage Array User Label: LSI-SAS-DIF
> Logical Unit Number: 0000000000000000
>
> where the 'Volume User Label' is the name the administrator has given to the LUN on the storage array.


Thanks,

--
Nao NISHIJIMA
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., YOKOHAMA Research Laboratory
Email: [email protected]

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

(2011/06/17 8:04), Kay Sievers wrote:
[...]
>>>>>>> So as userspace tools will still need to be fixed, I don't see how
>>>>>>> adding a kernel file for this is going to help any. Well, a bit in that
>>>>>>> the kernel log files will look "different", but again, that really isn't
>>>>>>> a problem that userspace couldn't also solve with no kernel changes
>>>>>>> needed.
>>>>>>
>>>>>> This is true, but I think for the small effort it takes to implement the
>>>>>> feature in-kernel compared with what we'd have to do to the
>>>>>> distributions to get it implemented in userspace (we'd need klogd to do
>>>>>> the conversion for dmesg ... I'm entirely unclear what we need to modify
>>>>>> for /proc/partitions, etc.) the benefit outweighs the cost.
>>>>>>
>>>>>> Additionally, since renaming is something users seem to want (just look
>>>>>> at net interfaces), if we can make this work, we now have a definitive
>>>>>> answer to point people at.
>>>>>
>>>>> Renaming is something that we do NOT want to do, as we have learned our
>>>>> lesson of the network device renaming mess. And as Kay pointed out, we
>>>>> already have an "alias" name there, which no one uses.
>>>>
>>>> Look at this as an opportunity to get it right. The original proposal
>>>> was for renaming. By iterating over the actual requirements, we have it
>>>> reduced to simply having the kernel print a preferred name. I think
>>>> that's a nice achievement which we can point other proponents of
>>>> renaming to as they arise.
>>>
>>> Sure, we absolutely don't want renaming, and we can provide countless
>>> solid technical reasons why we should not allow it to happen. But I'm
>>> also pretty sure, we also don't want just-another-single-name to put
>>> somewhere in the kernel.
>>
>> I understand why we don't want renaming. However, the technical reason
>> why we want a preferred name is that it's often associated with a name
>> printed somewhere on the box (say a label on the disk enclosure, or
>> ethernet port). Not being able to use this name to address the device
>> is a usability issue which annoys the enterprise enormously.
>>
>> So if we stop there, regardless of solution (in-kernel or fix all
>> userspace), does everyone see what the actual problem is?
>
> I don't think that solves the problem, no. We need _smart_ userspace
> with a debug/error message channel from the kernel to userspace that
> pops out _structured_ data. Userspace needs to index the data, and
> merge a lot of userspace information into it.

If that is possible, I think it would be very helpful. But most driver
developers don't like that... They tend to continue using
printk() for debug/error notification. (Actually, I hope they will
use some notification API, like traceevent...)

Maybe some kinds of errors, like AER/MCE, can easily move to
such a smarter system. But I doubt other device-specific errors
can do that too. There are so many specific kinds of errors...

> Adding just another single-name to the kernel just makes the
> much-too-dumb free-text printk() a bit more readable, but still sounds
> not like a solution. Pimping up syslog is not the solution to this
> problem, and it can't be solved in the kernel alone.

I agree with you that the _smart_ error notifier can solve our
problem too. However, we can't jump into it directly.
And just making printk() readable helps us A LOT!


>>>>> So again, I really don't like this, just fix the userspace tools to map
>>>>> the proper device name that the kernel is using to the userspace name
>>>>> the tool used, and all is fine. This has been done already today,
>>>>> succesfully, by many of the big "enterprise" monitoring systems that
>>>>> work quite well on Linux, proving that this is not something that the
>>>>> kernel needs to provide to implement properly.
>>>>
>>>> Well, it's expediency. Sure we could try to patch the world, but I
>>>> think the simple patch of getting the kernel to print a preferred name
>>>> solves 90% of the problem. Sure there is a long tail of userspace
>>>> components that needs fixing, but that can be done gradually if we take
>>>> the kernel route. If we go the userspace route, it will be a long while
>>>> before we even get to 50% coverage.
>>>
>>> I need to ask again ask for an explanation why logging all symlinks at
>>> device discovery from udev, does not solve exactly this problem. With
>>> that tag in the syslog message stream, all later kernel names can be
>>> safely associated with _all_ the current device names in question,
>>> until the next tag from udev is found.
>>
>> So if the user has one preferred name, us logging all the names (and we
>> have quite a few for disks) doesn't really help because the user might
>> want to choose a different name. However, even if we assume they choose
>> one of the current names, they still have to do the mapping manually;
>> even if they have all the information, they can't just cut and paste
>> from dmesg say, they have to cut, edit the buffer to put in the
>> preferred name and then paste ... that's just one annoying step too far
>> for most users. I agree that all the output tools within reason can be
>> fixed to do this automatically, but fixing cat say, just so
>> cat /proc/partitions works would never be acceptable upstream.
>>
>> The reason for storing this in the kernel is just that it's easier than
>> trying to update all the tools, and it solves 90% of the problem, which
>> makes the solution usable, even if we have to update tools to get to
>> 100%.
>
> I don't think we can even solve 10% of the problems that way. It's
> just a hack that makes stuff a bit more pretty, but doesn't provide
> any reasonable solution to the problem. I doubt we can even make a
> simple use case out of it, what name to put into that field for a
> multipath setup.

Good point! We have to consider the multipath case in the document.
Perhaps we need a special naming rule for that case. I think
it can be solved if the udev script is smart enough :-)

Thank you,

--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2011-06-17 12:28:37

by Nao Nishijima

[permalink] [raw]
Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

Hi Hannes,

Thank you for looking at it.

(2011/06/17 15:27), Hannes Reinecke wrote:
> On 06/16/2011 07:20 PM, Kay Sievers wrote:
>> On Thu, Jun 16, 2011 at 19:09, Kay Sievers<[email protected]> wrote:
>>> On Thu, Jun 16, 2011 at 18:25, James Bottomley
>>> <[email protected]> wrote:
>>>> On Thu, 2011-06-16 at 09:14 -0700, Greg KH wrote:
>>>>> All userspace naming will be taken care of by the usual udev rules, so
>>>>>> for disks, something like /dev/disk/by-preferred/<fred> which
>>>>>> would be
>>>>>> the usual symbolic link.
>>>>>
>>>>> No, udev can not create such a link after the preferred name is
>>>>> set, as
>>>>> it has no way of knowing that the name was set.
>>>>
>>>> It can if we trigger a uevent. Note: I'm not advocating this ...
>>>> I'd be
>>>> equally happy having whatever sets the kernel name create the link (or
>>>> tickle udev to create it). We definitely require device links, though,
>>>> to get this to work.
>>
>> Guess all that would work now, including mount(8) not canonicalizing.
>> What would happen if we mount:
>> /dev/disk/by-pretty/foo
>> and some tool later thinks the pretty name should better be 'bar', it
>> writes the name to /sys, we get a uevent, the old link disappears, we
>> get a new link, mount has no device node anymore for the mounted
>> device ...
>>
>> So we basically get a one-shot additional pretty name? Guess, the
>> _single_ name changed anytime later just asks for serious problems. We
>> need to set it very early to be really useful, but how, where is it
>> coming from?
>>
> Well, certain storage arrays are able to print out the user-defined name
> for the LUNs:
>
> # sg_vpd -p 0xc8 /dev/sdc
> Extended device identification (RDAC) VPD Page:
> Volume Unique Identifier: 60080e50001bf1f0000005004ddb05a4
> Creation Number: 1280, Timestamp: Tue May 24 03:11:00 2011
> Volume User Label: mas-1
> Storage Array Unique Identifier: 60080e50001bf1f0000000004d418973
> Storage Array User Label: LSI-SAS-DIF
> Logical Unit Number: 0000000000000000
>
> where the 'Volume User Label' is the name the administrator has given to
> the LUN on the storage array.
>
> So for these kind of things it would be useful.
>
> However, a single pretty name is quite a limitation.
> And I also fail to see why this can't be handled in userspace.
>

I know that there are several names for one device.
The preferred name does not eliminate those names, but just adds another
pretty name. Usually those machine-generated names are too long and
unfamiliar to users. The preferred name will provide a human-readable
name and reduce the operational cost.

Thanks,

--
Nao NISHIJIMA
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., YOKOHAMA Research Laboratory
Email: [email protected]

2011-06-17 14:27:31

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Fri, 2011-06-17 at 01:04 +0200, Kay Sievers wrote:
> >> We need many names, and we need all of them from the very beginning,
> >> and they should not change during device lifetime unless the device
> >> state changes.
> >
> > So that's actually an argument for leaving the links, surely? We can
> > have many inbound links, but the kernel can only print one name in
> > messages, which would be the preferred name that was currently set.
>
> I really question any concept of _the_ name. My take on it: It will
> never work in reality.

OK, so let's take the common example: a desktop with three disks and an
enclosure with three slots and labels "fred", "jim", and "betty".

The desired outcome is that whenever the user manipulates those devices
he uses a name related to the label, so whenever dmesg flags a problem,
it says sd betty: device offline or something. Whenever he mounts, he
mounts by /dev/disk/by-preferred/betty (or whatever the current udev
vernacular is). Whenever smartmon says there's an over temp problem, it
says that fred has it; cat /proc/partitions shows how fred, jim and
betty are partitioned and so on.

To do this, we set the preferred name at start of day via a machine
specific customisation. For an enclosure, there's a standard way of
mapping the name to the device, so we'd just use that, but it's not
impossible to imagine systems with stranger entities that require per
motherboard customisations.

Once the name is set in boot up, we, in fact, never alter it.

The proposed kernel patch and a corresponding update to udev
actually make all the above happen. There have to be tweaks to the
startup scripts, like smartd needs a configuration file that lists the
disk by preferred path so that the output is correct.

Obviously, I chose the commands above so there is no need to modify any
of them. There will be utilities (like overly smart san managers) that
do derive the name and will need updating, but they're not among the
standard workstation admin tools.

James

2011-06-17 14:30:51

by Kay Sievers

[permalink] [raw]
Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Fri, Jun 17, 2011 at 13:53, Masami Hiramatsu
<[email protected]> wrote:
> (2011/06/17 8:04), Kay Sievers wrote:

>> I don't think that solves the problem, no. We need _smart_ userspace
>> with a debug/error message channel from the kernel to userspace that
>> pops out _structured_ data. Userspace needs to index the data, and
>> merge a lot of userspace information into it.
>
> If that is possible, I think it's so helpful. But most of driver
> developers doesn't like that... They may tend to continue using
> printk() debug/error notification. (actually I hope them to
> use some notification API, like traceevent...)

Sure they will add printk() for things, that's fine and should not
change. There are things that can not and should not be formalized,
and need a human to read and fix it anyway.

But it's also trivial to wire up the interesting debug/error
information to some _sane_ channel at the same time, one that does not
produce random and chaotic text dumps that are absolutely useless for
programmatic processing, but structured data for programs to consume. And
right, it's very much like the problems tracing tries to solve.

It's really time to leave the unix stone age behind us, and not to
pimp up broken-by-design, or never designed, facilities. Storage
management that needs this

> Maybe, some kind of errors, like AER/MCE, easily move on to
> such smarter system. But I doubt other device-specific errors
> can do that too. There are so much specific kind of errors...
>
>> Adding just another single-name to the kernel just makes the
>> much-too-dumb free-text printk() a bit more readable, but still sounds
>> not like a solution. Pimping up syslog is not the solution to this
>> problem, and it can't be solved in the kernel alone.
>
> I agree with you that the _smart_ error notifier can solve our
> problem too.

Then let's start working on the real fix. We should stop papering over
issues caused by pretty much useless free-text printk() strings. It
just makes something totally dumb a bit more pretty, introduces a lot
of different problems by doing that, and does not have the potential of
solving the underlying problem properly. To me it sounds like a
promise it can't deliver in reality.

> However, we can't jump into it directly.
> And just making printk() readable helps us A LOT!

I'm not convinced. I think it only 'sounds' nice, but in reality, no
'single name' can be composed reliably.

>> I don't think we can even solve 10% of the problems that way. It's
>> just a hack that makes stuff a bit more pretty, but doesn't provide
>> any reasonable solution to the problem. I doubt we can even make a
>> simple use case out of it, what name to put into that field for a
>> multipath setup.
>
> Good point! we have to consider multipath case in document.
> Perhaps, we need a special naming rule in that case. I think
> it can be solved if udev script is enough smart :-)

I don't think that can be solved. You have the problem that all that
needs to be available very early during boot to be useful, and I don't
see how we can really have that policy information during that time.

Kay

2011-06-17 14:40:46

by Kay Sievers

[permalink] [raw]
Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Fri, Jun 17, 2011 at 16:27, James Bottomley
<[email protected]> wrote:
> On Fri, 2011-06-17 at 01:04 +0200, Kay Sievers wrote:
>> >> We need many names, and we need all of them from the very beginning,
>> >> and they should not change during device lifetime unless the device
>> >> state changes.
>> >
>> > So that's actually an argument for leaving the links, surely? We can
>> > have many inbound links, but the kernel can only print one name in
>> > messages, which would be the preferred name that was currently set.
>>
>> I really question any concept of _the_ name. My take on it: It will
>> never work in reality.
>
> OK, so lets take the common example: a desktop with three disks and an
> enclosure with three slots and labels "fred", "jim", and "betty".
>
> The desired outcome is that whenever the user manipulates those devices
> he uses a name related to the label, so whenever dmesg flags a problem,
> it says sd betty:  device offline or something.  Whenever he mounts, he
> mounts by /dev/disk/by-preferred/betty (or whatever the current udev
> vernacular is).  Whenever smartmon says there's an over temp problem. it
> says that fred has it;  cat /proc/partitions shows how fred, jim and
> betty are partitioned and so on.
>
> To do this, we set the preferred name at start of day via a machine
> specific customisation.  For an enclosure, there's a standard way of
> mapping the name to the device, so we'd just use that, but it's not
> impossible to imagine systems with stranger entities that require per
> motherboard customisations.
>
> Once the name is set in boot up, we, in fact, never alter it.
>
> With the kernel patch proposed and a corresponding update to udev
> actually makes all the above happen.  There have to be tweaks to the
> startup scripts, like smartd needs a file configuration that lists the
> disk by preferred path so that the output is correct.
>
> Obviously, I chose the commands above so there is no need to modify any
> of them.  There will be utilities (like overly smart san managers) that
> do derive the name and will need updating, but they're not among the
> standard workstation admin tools.

Ok, the still remaining questions:

Why isn't logging all symlink names during device discovery solving
all the problems we discuss without any change to the kernel? It's
just a single udev rule that can be added to ages old systems today. I
think that solves exactly the same problem and works with many names
at the same time.
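
To make the idea concrete, here is a small Python sketch of such a log post-processor. It assumes a purely hypothetical tag line format ("udev-names: <kernel name> = <links>") written by a udev rule at discovery time; neither that format nor the function name exists anywhere today.

```python
import re

# Hypothetical tag format emitted by a udev rule at device discovery:
#   "udev-names: <kernel name> = <space-separated symlink names>"
TAG = re.compile(r"^udev-names: (\S+) = (.*)$")


def annotate(log_lines):
    """Append the currently known names to lines mentioning a device.

    A later tag for the same kernel name replaces the earlier one, so
    each message is matched against the most recent udev tag.
    """
    names = {}
    out = []
    for line in log_lines:
        m = TAG.match(line)
        if m:
            names[m.group(1)] = m.group(2).split()
            out.append(line)
            continue
        extra = []
        for kname, links in names.items():
            # Whole-word match, so "sda" does not hit "sda1".
            if re.search(r"\b%s\b" % re.escape(kname), line):
                extra.extend(links)
        out.append(line + (" [" + ", ".join(extra) + "]" if extra else ""))
    return out


if __name__ == "__main__":
    log = [
        "udev-names: sda = disk/by-id/ata-X disk/by-label/root",
        "end_request: I/O error, dev sda, sector 0",
    ]
    print("\n".join(annotate(log)))
```

The point of the sketch is that all names are carried along, not just one; nothing here needs kernel support.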

Where is "fred", "jim", and "betty" stored on bootup?

What are the keys to match the pretty names to the disks?

The pretty name is a one-shot setting only? If not how is a change
handled for already used devices?

How will this information be available in the initramfs, where
multiple disks might need to be mounted already? Initramfs images are
generic and not created per host.

How are multipath setups handled where the exact same disk is behind
multiple kernel devices? What to put into these names in this case?

Kay

2011-06-17 14:50:43

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Fri, 2011-06-17 at 16:40 +0200, Kay Sievers wrote:
> On Fri, Jun 17, 2011 at 16:27, James Bottomley
> <[email protected]> wrote:
> > On Fri, 2011-06-17 at 01:04 +0200, Kay Sievers wrote:
> >> >> We need many names, and we need all of them from the very beginning,
> >> >> and they should not change during device lifetime unless the device
> >> >> state changes.
> >> >
> >> > So that's actually an argument for leaving the links, surely? We can
> >> > have many inbound links, but the kernel can only print one name in
> >> > messages, which would be the preferred name that was currently set.
> >>
> >> I really question any concept of _the_ name. My take on it: It will
> >> never work in reality.
> >
> > OK, so lets take the common example: a desktop with three disks and an
> > enclosure with three slots and labels "fred", "jim", and "betty".
> >
> > The desired outcome is that whenever the user manipulates those devices
> > he uses a name related to the label, so whenever dmesg flags a problem,
> > it says sd betty: device offline or something. Whenever he mounts, he
> > mounts by /dev/disk/by-preferred/betty (or whatever the current udev
> > vernacular is). Whenever smartmon says there's an over temp problem. it
> > says that fred has it; cat /proc/partitions shows how fred, jim and
> > betty are partitioned and so on.
> >
> > To do this, we set the preferred name at start of day via a machine
> > specific customisation. For an enclosure, there's a standard way of
> > mapping the name to the device, so we'd just use that, but it's not
> > impossible to imagine systems with stranger entities that require per
> > motherboard customisations.
> >
> > Once the name is set in boot up, we, in fact, never alter it.
> >
> > With the kernel patch proposed and a corresponding update to udev
> > actually makes all the above happen. There have to be tweaks to the
> > startup scripts, like smartd needs a file configuration that lists the
> > disk by preferred path so that the output is correct.
> >
> > Obviously, I chose the commands above so there is no need to modify any
> > of them. There will be utilities (like overly smart san managers) that
> > do derive the name and will need updating, but they're not among the
> > standard workstation admin tools.
>
> Ok, the still remaining questions:
>
> Why isn't logging all symlink names during device discovery solving
> all the problems we discuss without any change to the kernel? It's
> just a single udev rule that can be added to ages old systems today. I
> think that solves exactly the same problem and works with many names
> at the same time.

It could ... but that doesn't solve the printk problem or
the /proc/partitions one. The idea is to allow naive users to identify
their device physically when the system logs something about it. Having
to describe a manual mapping procedure is very complex for them.

> Where is "fred", "jim", and "betty" stored on bootup?

So this is subsystem specific. For the case of a SCSI enclosure, I can
answer that it's actually burned into the enclosure firmware. When you
build an enclosure with labels, the label names are stored in a
diagnostic page. We can actually interrogate the enclosure directly or
use the ses driver to get these names mapped to current devices.

> What are the keys to match the pretty names to the disks?
>
> The pretty name is a one-shot setting only? If not how is a change
> handled for already used devices?

Obviously, one device will be root, and it will already be used
as /dev/root, but the proposal isn't to change any name, it's merely to
set "fred" as the preferred name for /dev/root, which is also /dev/sdc
and /dev/disk/by-id/naa-566dce3ddf etc ...

> How will this information be available in the initramfs, where
> multiple disks might need to be mounted already? Initramfs images are
> generic and not created per host.

That's initramfs specific. However, if, in deference to your new
location, we look at dracut, it has a modules directory for plugin
extensions. The scripts which do the mapping can be inserted there as
an additional rpm.

> How are multipath setups handled where the exact same disk is behind
> multiple kernel devices? What to put into these names in this case?

I'm not sure I understand the question. An md setup of RAID-1 on fred
and betty would assemble using /dev/disk/by-preferred/fred
and /dev/disk/by-preferred/betty. Whether the user wants to
call /dev/md0 something pretty is up to them ... it's not a physically
labelled entity, so I'd tend just to leave it with its default name as
the preferred name.

James

2011-06-17 15:39:40

by Kay Sievers

[permalink] [raw]
Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Fri, Jun 17, 2011 at 16:49, James Bottomley
<[email protected]> wrote:
> On Fri, 2011-06-17 at 16:40 +0200, Kay Sievers wrote:

>> Ok, the still remaining questions:
>>
>> Why isn't logging all symlink names during device discovery solving
>> all the problems we discuss without any change to the kernel? It's
>> just a single udev rule that can be added to ages old systems today. I
>> think that solves exactly the same problem and works with many names
>> at the same time.
>
> It could ... but that doesn't solve the printk problem or
> the /proc/partitions one.  The idea is to allow naive users to identify
> their device physically when the system logs something about it.  Having
> to describe a manual mapping procedure is very complex for them.

How? You see 'sda kaput' then you scroll up to the last udev message
with /dev/disk/by-all-the-many-names/. Anyone who can't do this should
not even get access to the log file. :)

>> Where is "fred", "jim", and "betty" stored on bootup?
>
> So this is subsystem specific.  For the case of a SCSI enclosure, I can
> answer that it's actually burned into the enclosure firmware.  When you
> build an enclosure with labels, the label names are stored in a
> diagnostic page.  We can actually interrogate the enclosure directly or
> use the ses driver to get these names mapped to current devices.

To me this sounds like a nice name on top of the current bunch of
names, not like a 'preferred' name.

I still don't like to introduce any new facility to the kernel that
can handle only one single name. Reality over the last years has taught us
a very different story, and we've walked a long way to get where we
are. I really don't believe single names will ever work, it's just a
nice theory.

>> What are the keys to match the pretty names to the disks?
>>
>> The pretty name is a one-shot setting only? If not how is a change
>> handled for already used devices?
>
> obviously, one device will be root, and it will already be used
> as /dev/root, but the proposal isn't to change any name, it's merely to
> set "fred" as the preferred name for /dev/root, which is also /dev/sdc
> and /dev/disk/by-id/naa-566dce3ddf etc ...

Sure, sounds nice. I just don't see how all that can be done
automatically in a sane way. It's like telling people to name their
network interfaces "internal", "external", "dmz". Almost nobody is
doing this, there is even the netif alias support already, and nobody
wants to use it besides a few SNMP tools. In my experience, stuff
needs to work out-of-the-box, or it is not used.

>> How will this information be available in the initramfs, where
>> multiple disks might need to be mounted already? Initramfs images are
>> generic and not created per host.
>
> That's initramfs specific.  However, if, in deference to your new
> location, we look at dracut, it has a modules directory for plugin
> extensions.  The scripts which do the mapping can be inserted there as
> an additional rpm.

That's not really what I meant. I want to hear how the decision is
made which one of the multiple possibilities for the name composition
is chosen. Ideally it needs to work in a generic setup, not with a
customized host-specific file included even in the initramfs image.

>> How are multipath setups handled where the exact same disk is behind
>> multiple kernel devices? What to put into these names in this case?
>
> I'm not sure I understand the question.  a md setup of RAID-1 on fred
> and betty would assemble using /dev/disk/by-preferred/fred
> and /dev/disk/by-preferred/betty.  Whether the user want's to
> call /dev/md0 something pretty is up to them ... it's not a physically
> labelled entity, so I'd tend just to leave it with its default name as
> the preferred name.

I mean multipath, not RAID. The disks all have the same IDs, same
locations, but we have multiple kernel devices for them.
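
A tiny Python sketch of why this is awkward for a single name. The identifiers and kernel names below are made up for illustration, and the helper names are hypothetical; the point is only that a label keyed on the disk's ID lands on every path to that disk.

```python
from collections import defaultdict


def group_paths(devices):
    """Group kernel device names by the ID of the underlying disk.

    `devices` is an iterable of (kernel_name, disk_id) pairs.
    """
    groups = defaultdict(list)
    for kernel_name, disk_id in devices:
        groups[disk_id].append(kernel_name)
    return dict(groups)


def ambiguous_labels(devices, labels):
    """Return labels that would apply to more than one kernel device.

    `labels` maps disk_id -> pretty name.
    """
    return {
        labels[disk_id]: sorted(names)
        for disk_id, names in group_paths(devices).items()
        if disk_id in labels and len(names) > 1
    }


if __name__ == "__main__":
    # Made-up sample: sdc and sdd are two paths to the same disk.
    devices = [("sdc", "id-A"), ("sdd", "id-A"), ("sde", "id-B")]
    labels = {"id-A": "fred", "id-B": "betty"}
    print(ambiguous_labels(devices, labels))
```

Any naming rule has to decide what "fred" means when it resolves to both sdc and sdd, which is the question being asked here.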

Kay

2011-06-17 15:41:56

by Douglas Gilbert

[permalink] [raw]
Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On 11-06-17 01:25 AM, Greg KH wrote:
> On Thu, Jun 16, 2011 at 04:31:29PM -0400, James Bottomley wrote:
>>> So again, I really don't like this, just fix the userspace tools to map
>>> the proper device name that the kernel is using to the userspace name
>>> the tool used, and all is fine. This has been done already today,
>>> successfully, by many of the big "enterprise" monitoring systems that
>>> work quite well on Linux, proving that this is not something that the
>>> kernel needs to provide to implement properly.
>>
>> Well, it's expediency. Sure we could try to patch the world, but I
>> think the simple patch of getting the kernel to print a preferred name
>> solves 90% of the problem. Sure there is a long tail of userspace
>> components that needs fixing, but that can be done gradually if we take
>> the kernel route. If we go the userspace route, it will be a long while
>> before we even get to 50% coverage.
>
> I do not think that just because some people feel it is easier to change
> the kernel than change userspace tools, that we are somehow forced to
> accept their changes.
>
> As for "expediency", it has been a full year since the last time this
> was proposed. All userspace tools that would need to be changed to
> implement this in userspace have had updates released for them in that
> year, and the changes needed to make to them could have been done
> already.

Could you elaborate taking one user space tool as an
example and tell us what changes you think should be
made?

Are you talking about low level tools such as hdparm,
smartmontools and mine or tools further up the "food
chain"?

Doug Gilbert

2011-06-17 15:57:32

by Kay Sievers

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Fri, Jun 17, 2011 at 17:41, Douglas Gilbert <[email protected]> wrote:
> On 11-06-17 01:25 AM, Greg KH wrote:
>> As for "expediency", it has been a full year since the last time this
>> was proposed.  All userspace tools that would need to be changed to
>> implement this in userspace have had updates released for them in that
>> year, and the changes needed to make to them could have been done
>> already.
>
> Could you elaborate taking one user space tool as an
> example and tell us what changes you think should be
> made?
>
> Are you talking about low level tools such as hdparm,
> smartmontools and mine or tools further up the "food
> chain"?

I think the only thing really missing is a smarter 'syslog' that has
context, and not just a stream of almost random free-text packets.

Ideally the main source would be a _structured_ and properly machine
readable error/debug log from the kernel, maintained in userspace as
an indexable database, and constant merging of userspace state
information into it.

Debug/error messages from the kernel would need proper classification
and if needed can carry additional data structures, even binary data,
that contain dumps of sense results or firmware data.
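
As a rough illustration of what such a structured, indexable record could look like on the userspace side, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `LogRecord` fields, the `LogIndex` class and the category strings are hypothetical and do not correspond to any existing kernel or syslog interface.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """One machine-readable log entry (hypothetical layout)."""
    seq: int              # monotonically increasing sequence number
    device: str           # kernel device name, e.g. "sda"
    category: str         # hypothetical classification, e.g. "io_error"
    message: str          # human-readable text
    payload: bytes = b""  # optional binary data, e.g. raw sense bytes


class LogIndex:
    """Userspace store that indexes records and merges in naming state."""

    def __init__(self):
        self._by_device = defaultdict(list)
        self._by_category = defaultdict(list)
        self._names = {}  # device -> names known to userspace

    def add(self, record):
        self._by_device[record.device].append(record)
        self._by_category[record.category].append(record)

    def merge_names(self, device, names):
        # Userspace state (e.g. udev symlinks) merged into the log view.
        self._names[device] = list(names)

    def for_device(self, device):
        return list(self._by_device.get(device, []))

    def for_category(self, category):
        return list(self._by_category.get(category, []))

    def describe(self, record):
        names = self._names.get(record.device, [])
        label = f"{record.device} ({', '.join(names)})" if names else record.device
        return f"[{record.category}] {label}: {record.message}"
```

The point of the sketch is that the record keeps the kernel name as the key, and any number of userspace names can be attached when the record is displayed, rather than one name being chosen when the message is generated.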

I doubt there can be any really useful kernel only solution to
enterprise storage needs. Especially not based on printk().

I'm sure, we should really start to think in that direction instead of
extending things that are unlikely to solve the underlying problem
ever.

Userspace tools can already query the udev database very easily, it's
almost free, no IPC, no fork() involved, it's just a few libudev calls
which read files out of tmpfs. If they want to show names they can do
that already today. Making the kernel decide which one of the names is
the single one to choose, just can't work, I think.
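
To make the lookup concrete, the following Python sketch maps a kernel name to the symlinks udev has created for it. Note that it does not use libudev at all: as a simplifying assumption it just walks the `disk/by-*` directories under `/dev` and compares symlink targets. The function name `persistent_names` and the `dev_root` parameter are hypothetical and exist only for this illustration.

```python
import os


def persistent_names(kernel_name, dev_root="/dev"):
    """Return symlinks under <dev_root>/disk/by-*/ that resolve to kernel_name.

    This is a plain filesystem walk, not a libudev query; dev_root is a
    parameter only so the function can be tried against a scratch directory.
    """
    target = os.path.realpath(os.path.join(dev_root, kernel_name))
    disk_dir = os.path.join(dev_root, "disk")
    found = []
    if not os.path.isdir(disk_dir):
        return found
    for scheme in sorted(os.listdir(disk_dir)):
        scheme_dir = os.path.join(disk_dir, scheme)
        if not scheme.startswith("by-") or not os.path.isdir(scheme_dir):
            continue
        for link in sorted(os.listdir(scheme_dir)):
            path = os.path.join(scheme_dir, link)
            # Keep only symlinks whose resolved target is our device node.
            if os.path.islink(path) and os.path.realpath(path) == target:
                found.append(os.path.join("disk", scheme, link))
    return found
```

Calling `persistent_names("sda")` on a running system would list every name udev currently maintains for that disk, which is the many-names view argued for here.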

Kay

2011-06-17 16:12:31

by Kay Sievers

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Fri, Jun 17, 2011 at 17:39, Kay Sievers <[email protected]> wrote:
> On Fri, Jun 17, 2011 at 16:49, James Bottomley

>> So this is subsystem specific.  For the case of a SCSI enclosure, I can
>> answer that it's actually burned into the enclosure firmware.  When you
>> build an enclosure with labels, the label names are stored in a
>> diagnostic page.  We can actually interrogate the enclosure directly or
>> use the ses driver to get these names mapped to current devices.
>
> To me this sounds like a nice name on top of the current bunch of
> names, not like a 'preferred' name.
>
> I still don't like to introduce any new facility to the kernel that
> can handle only one single name. Reality the last years has taught us
> a very different story, and we've walked a long way to get where we
> are. I really don't believe single names will ever work, it's just a
> nice theory.

I might need to clarify this a bit.

I have no problem in general to add an 'alias' to every disk, and use
that when stuff is logged. Just the same way the netifs have an alias.
Sure, it might be useful for some use cases. And if that helps to
solve any real problem, we should just do it.
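
For comparison, the existing netif alias is exposed through sysfs as `/sys/class/net/<iface>/ifalias`. A small Python sketch of reading it and using it for display might look as follows; the helper names `read_ifalias` and `display_name`, and the `sysfs_root` parameter, are hypothetical and chosen only for this example.

```python
import os


def read_ifalias(iface, sysfs_root="/sys"):
    """Return the alias set for a network interface, or "" if none."""
    path = os.path.join(sysfs_root, "class", "net", iface, "ifalias")
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        # Interface missing or attribute unreadable: treat as "no alias".
        return ""


def display_name(iface, sysfs_root="/sys"):
    """Show the kernel name, followed by the alias when one is set."""
    alias = read_ifalias(iface, sysfs_root)
    return f"{iface} ({alias})" if alias else iface
```

A disk alias along the same lines would be one more attribute next to the kernel name, not a replacement for it.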

I just want to make clear, that I don't think that it is anywhere near
to a solution for the problems which are described here. And that
nobody should see this as an excuse not to get their stuff together
and work on the problem, which is that we don't have machine-readable
error and debug from the kernel and a smart syslog.

If we had that, I'm very sure nobody would even ask for a 'pretty
name' in the kernel, and I think that is a good indication that we are
not on the right track here.

Kay

2011-06-17 16:23:05

by Greg KH

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Fri, Jun 17, 2011 at 06:12:14PM +0200, Kay Sievers wrote:
> On Fri, Jun 17, 2011 at 17:39, Kay Sievers <[email protected]> wrote:
> > On Fri, Jun 17, 2011 at 16:49, James Bottomley
>
> >> So this is subsystem specific. For the case of a SCSI enclosure, I can
> >> answer that it's actually burned into the enclosure firmware. When you
> >> build an enclosure with labels, the label names are stored in a
> >> diagnostic page. We can actually interrogate the enclosure directly or
> >> use the ses driver to get these names mapped to current devices.
> >
> > To me this sounds like a nice name on top of the current bunch of
> > names, not like a 'preferred' name.
> >
> > I still don't like to introduce any new facility to the kernel that
> > can handle only one single name. Reality the last years has taught us
> > a very different story, and we've walked a long way to get where we
> > are. I really don't believe single names will ever work, it's just a
> > nice theory.
>
> I might need to clarify this a bit.
>
> I have no problem in general to add an 'alias' to every disk, and use
> that when stuff is logged. Just the same way the netifs have an alias.
> Sure, it might be useful for some use cases. And if that helps to
> solve any real problem, we should just do it.
>
> I just want to make clear, that I don't think that it is anywhere near
> to a solution for the problems which are described here. And that
> nobody should see this as an excuse not to get their stuff together
> and work on the problem, which is that we don't have machine-readable
> error and debug from the kernel and a smart syslog.
>
> If we had that, I'm very sure nobody would even ask for a 'pretty
> name' in the kernel, and I think that is a good indication that we are
> not on the right track here.

And I totally agree here, which is why I don't want to accept this
change to the driver core to add this, as it's not the correct solution.

thanks,

greg k-h

2011-06-18 19:40:30

by James Bottomley

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Fri, 2011-06-17 at 09:22 -0700, Greg KH wrote:
> On Fri, Jun 17, 2011 at 06:12:14PM +0200, Kay Sievers wrote:
> > On Fri, Jun 17, 2011 at 17:39, Kay Sievers <[email protected]> wrote:
> > > On Fri, Jun 17, 2011 at 16:49, James Bottomley
> >
> > >> So this is subsystem specific. For the case of a SCSI enclosure, I can
> > >> answer that it's actually burned into the enclosure firmware. When you
> > >> build an enclosure with labels, the label names are stored in a
> > >> diagnostic page. We can actually interrogate the enclosure directly or
> > >> use the ses driver to get these names mapped to current devices.
> > >
> > > To me this sounds like a nice name on top of the current bunch of
> > > names, not like a 'preferred' name.
> > >
> > > I still don't like to introduce any new facility to the kernel that
> > > can handle only one single name. Reality the last years has taught us
> > > a very different story, and we've walked a long way to get where we
> > > are. I really don't believe single names will ever work, it's just a
> > > nice theory.
> >
> > I might need to clarify this a bit.
> >
> > I have no problem in general to add an 'alias' to every disk, and use
> > that when stuff is logged. Just the same way the netifs have an alias.
> > Sure, it might be useful for some use cases. And if that helps to
> > solve any real problem, we should just do it.
> >
> > I just want to make clear, that I don't think that it is anywhere near
> > to a solution for the problems which are described here. And that
> > nobody should see this as an excuse not to get their stuff together
> > and work on the problem, which is that we don't have machine-readable
> > error and debug from the kernel and a smart syslog.
> >
> > If we had that, I'm very sure nobody would even ask for a 'pretty
> > name' in the kernel, and I think that is a good indication that we are
> > not on the right track here.
>
> And I totally agree here, which is why I don't want to accept this
> change to the driver core to add this, as it's not the correct solution.

OK, fine ... we'll do it as gendisk only then. I suppose that is the
95% use case anyway.

Thanks,

James

2011-06-18 19:55:39

by Kay Sievers

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Sat, Jun 18, 2011 at 21:40, James Bottomley
<[email protected]> wrote:
> On Fri, 2011-06-17 at 09:22 -0700, Greg KH wrote:
>> On Fri, Jun 17, 2011 at 06:12:14PM +0200, Kay Sievers wrote:
>> > On Fri, Jun 17, 2011 at 17:39, Kay Sievers <[email protected]> wrote:
>> > > On Fri, Jun 17, 2011 at 16:49, James Bottomley
>> >
>> > >> So this is subsystem specific.  For the case of a SCSI enclosure, I can
>> > >> answer that it's actually burned into the enclosure firmware.  When you
>> > >> build an enclosure with labels, the label names are stored in a
>> > >> diagnostic page.  We can actually interrogate the enclosure directly or
>> > >> use the ses driver to get these names mapped to current devices.
>> > >
>> > > To me this sounds like a nice name on top of the current bunch of
>> > > names, not like a 'preferred' name.
>> > >
>> > > I still don't like to introduce any new facility to the kernel that
>> > > can handle only one single name. Reality the last years has taught us
>> > > a very different story, and we've walked a long way to get where we
>> > > are. I really don't believe single names will ever work, it's just a
>> > > nice theory.
>> >
>> > I might need to clarify this a bit.
>> >
>> > I have no problem in general to add an 'alias' to every disk, and use
>> > that when stuff is logged. Just the same way the netifs have an alias.
>> > Sure, it might be useful for some use cases. And if that helps to
>> > solve any real problem, we should just do it.
>> >
>> > I just want to make clear, that I don't think that it is anywhere near
>> > to a solution for the problems which are described here. And that
>> > nobody should see this as an excuse not to get their stuff together
>> > and work on the problem, which is that we don't have machine-readable
>> > error and debug from the kernel and a smart syslog.
>> >
>> > If we had that, I'm very sure nobody would even ask for a 'pretty
>> > name' in the kernel, and I think that is a good indication that we are
>> > not on the right track here.
>>
>> And I totally agree here, which is why I don't want to accept this
>> change to the driver core to add this, as it's not the correct solution.
>
> OK, fine ... we'll do it as gendisk only then.  I suppose that is the
> 95% use case anyway.

Sounds fine. It's probably easier to have it domain-specific anyway.
Just like the netif alias already is.

I would suggest not to call it 'preferred' though, but something
similar to 'alias'. Having /dev/disk/by-preferred/ doesn't sound too
convincing to me.

A 'change' uevent when the alias is set sounds fine. I don't think a
set-once policy is needed.

Thanks,
Kay

2011-06-19 01:54:39

by Kyle Moffett

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Fri, Jun 17, 2011 at 10:27, James Bottomley
<[email protected]> wrote:
> On Fri, 2011-06-17 at 01:04 +0200, Kay Sievers wrote:
>> >> We need many names, and we need all of them from the very beginning,
>> >> and they should not change during device lifetime unless the device
>> >> state changes.
>> >
>> > So that's actually an argument for leaving the links, surely?  We can
>> > have many inbound links, but the kernel can only print one name in
>> > messages, which would be the preferred name that was currently set.
>>
>> I really question any concept of _the_ name. My take on it: It will
>> never work in reality.
>
> OK, so lets take the common example: a desktop with three disks and an
> enclosure with three slots and labels "fred", "jim", and "betty".
>
> The desired outcome is that whenever the user manipulates those devices
> he uses a name related to the label, so whenever dmesg flags a problem,
> it says sd betty:  device offline or something.  Whenever he mounts, he
> mounts by /dev/disk/by-preferred/betty (or whatever the current udev
> vernacular is).  Whenever smartmon says there's an over temp problem, it
> says that fred has it;  cat /proc/partitions shows how fred, jim and
> betty are partitioned and so on.

Hm...

So there's already all this work going into an event-tracing framework,
and most of the interesting device errors are getting converted to use
functions such as "dev_err()" and the like.

Perhaps the kernel needs a "log" event? You could add a basic unique-id
allocator (64-bit integer) and give each device or other interesting object a
unique "tag". A generic printk without a "tag" field would automatically
get tag 0.

There would be another few special events generated to make it possible
to uniquely map tags to device-model objects (or filesystems or whatever)
long after the fact, including enough information to determine the parent
device or other key attributes.

Then all of the dev_dbg() would automatically generate the necessary
trace events tagged by device, with the log-level and "string" as the
payload.
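
The tagging scheme described above can be modelled in a few lines of Python. This is only a userspace toy under stated assumptions: `TagRegistry`, its method names and the tuple layout of the events are all hypothetical, and a real implementation would live in the kernel's tracing code and use a 64-bit counter.

```python
import itertools

UNTAGGED = 0  # tag carried by a generic printk with no device context


class TagRegistry:
    """Toy model: unique tags per object, plus a flat stream of events."""

    def __init__(self):
        self._next = itertools.count(1)  # 0 is reserved for untagged messages
        self._objects = {}               # tag -> (name, parent_tag)
        self.events = []                 # everything a consumer would see

    def register(self, name, parent=UNTAGGED):
        """Allocate a tag and emit a 'map' event so it can be resolved later."""
        tag = next(self._next)
        self._objects[tag] = (name, parent)
        self.events.append(("map", tag, name, parent))
        return tag

    def log(self, level, text, tag=UNTAGGED):
        """Emit a 'log' event with the log level and string as payload."""
        self.events.append(("log", tag, level, text))

    def lineage(self, tag):
        """Names from the tagged object up to its topmost parent."""
        chain = []
        while tag != UNTAGGED:
            name, parent = self._objects[tag]
            chain.append(name)
            tag = parent
        return chain

    def logs_under(self, tag):
        """Log events for `tag` and for every object below it."""
        def under(t):
            while t != UNTAGGED:
                if t == tag:
                    return True
                t = self._objects[t][1]
            return False
        return [e for e in self.events if e[0] == "log" and under(e[1])]
```

Because the 'map' events carry the parent tag, a consumer that only sees the event stream can still reconstruct which device a message belongs to long after the fact.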

Suddenly you can monitor a device (and optionally all of its parents or
children) for "interesting kernel events", even if that particular driver
is still doing all of its logging with "primitive" dev_err() printks.

Since it's tagged by device you can just install a modified "klogd" that
cooperates with udev to log events with information about exactly
which device-model node it applies to. You can even have that
program generate dbus messages, so your desktop environment
can complain that the kernel has reported filesystem errors on that
thumbdrive you just plugged in, but that the media itself seems to
be fine (no I/O errors).

A future extension might be to allow trace-events to have a "fallback"
handler of sorts analogous to the way that audit messages are
currently handled. If a process is monitoring events and has a filter
which matches the event then it will be handled by that process;
otherwise it will call the "fallback" handler and resort to a printk().
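
A minimal Python sketch of that dispatch rule, assuming a hypothetical `EventBus` class with predicate-based subscriptions (none of these names come from the kernel's tracing or audit code):

```python
class EventBus:
    """Deliver events to matching listeners, else to a fallback handler."""

    def __init__(self, fallback):
        self._listeners = []       # list of (predicate, handler) pairs
        self._fallback = fallback  # stands in for the printk() path

    def subscribe(self, predicate, handler):
        self._listeners.append((predicate, handler))

    def emit(self, event):
        """Return True if a listener took the event, False if it fell back."""
        delivered = False
        for predicate, handler in self._listeners:
            if predicate(event):
                handler(event)
                delivered = True
        if not delivered:
            # Nobody is monitoring this event, so resort to the plain log.
            self._fallback(event)
        return delivered
```

With a listener filtering on one device, events for that device reach the listener, while events for any other device still end up in the fallback log.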

That would allow a more advanced driver to generate specific
status and error messages for consumption by monitoring software,
but still fall back to dmesg when the system is in single-user-mode
or the monitoring software dies, etc.

Thoughts?

Cheers,
Kyle Moffett

2011-06-19 04:15:04

by James Bottomley

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

On Sat, 2011-06-18 at 21:54 -0400, Kyle Moffett wrote:
> On Fri, Jun 17, 2011 at 10:27, James Bottomley
> <[email protected]> wrote:
> > On Fri, 2011-06-17 at 01:04 +0200, Kay Sievers wrote:
> >> >> We need many names, and we need all of them from the very beginning,
> >> >> and they should not change during device lifetime unless the device
> >> >> state changes.
> >> >
> >> > So that's actually an argument for leaving the links, surely? We can
> >> > have many inbound links, but the kernel can only print one name in
> >> > messages, which would be the preferred name that was currently set.
> >>
> >> I really question any concept of _the_ name. My take on it: It will
> >> never work in reality.
> >
> > OK, so lets take the common example: a desktop with three disks and an
> > enclosure with three slots and labels "fred", "jim", and "betty".
> >
> > The desired outcome is that whenever the user manipulates those devices
> > he uses a name related to the label, so whenever dmesg flags a problem,
> > it says sd betty: device offline or something. Whenever he mounts, he
> > mounts by /dev/disk/by-preferred/betty (or whatever the current udev
> > vernacular is). Whenever smartmon says there's an over temp problem, it
> > says that fred has it; cat /proc/partitions shows how fred, jim and
> > betty are partitioned and so on.
>
> Hm...
>
> So there's already all this work going into an event-tracing framework,
> and most of the interesting device errors are getting converted to use
> functions such as "dev_err()" and the like.
>
> Perhaps the kernel needs a "log" event? You could add a basic unique-id
> allocator (64-bit integer) and give each device or other interesting object a
> unique "tag". A generic printk without a "tag" field would automatically
> get tag 0.
>
> There would be another few special events generated to make it possible
> to uniquely map tags to device-model objects (or filesystems or whatever)
> long after the fact, including enough information to determine the parent
> device or other key attributes.
>
> Then all of the dev_dbg() would automatically generate the necessary
> trace events tagged by device, with the log-level and "string" as the
> payload.
>
> Suddenly you can monitor a device (and optionally all of its parents or
> children) for "interesting kernel events", even if that particular driver
> is still doing all of its logging with "primitive" dev_err() printks.
>
> Since it's tagged by device you can just install a modified "klogd" that
> cooperates with udev to log events with information about exactly
> which device-model node it applies to. You can even have that
> program generate dbus messages, so your desktop environment
> can complain that the kernel has reported filesystem errors on that
> thumbdrive you just plugged in, but that the media itself seems to
> be fine (no I/O errors).
>
> A future extension might be to allow trace-events to have a "fallback"
> handler of sorts analogous to the way that audit messages are
> currently handled. If a process is monitoring events and has a filter
> which matches the event then it will be handled by that process;
> otherwise it will call the "fallback" handler and resort to a printk().
>
> That would allow a more advanced driver to generate specific
> status and error messages for consumption by monitoring software,
> but still fall back to dmesg when the system is in single-user-mode
> or the monitoring software dies, etc.
>
> Thoughts?

It's been tried several times before. No-one who ever began this
project found the commitment to finish it ... however, perhaps you'll be
the first ...

James

2011-06-21 04:51:14

by Nao Nishijima

Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

(2011/06/19 4:55), Kay Sievers wrote:
> On Sat, Jun 18, 2011 at 21:40, James Bottomley
> <[email protected]> wrote:
>> On Fri, 2011-06-17 at 09:22 -0700, Greg KH wrote:
>>> On Fri, Jun 17, 2011 at 06:12:14PM +0200, Kay Sievers wrote:
>>>> On Fri, Jun 17, 2011 at 17:39, Kay Sievers <[email protected]> wrote:
>>>>> On Fri, Jun 17, 2011 at 16:49, James Bottomley
>>>>
>>>>>> So this is subsystem specific. For the case of a SCSI enclosure, I can
>>>>>> answer that it's actually burned into the enclosure firmware. When you
>>>>>> build an enclosure with labels, the label names are stored in a
>>>>>> diagnostic page. We can actually interrogate the enclosure directly or
>>>>>> use the ses driver to get these names mapped to current devices.
>>>>>
>>>>> To me this sounds like a nice name on top of the current bunch of
>>>>> names, not like a 'preferred' name.
>>>>>
>>>>> I still don't like to introduce any new facility to the kernel that
>>>>> can handle only one single name. Reality the last years has taught us
>>>>> a very different story, and we've walked a long way to get where we
>>>>> are. I really don't believe single names will ever work, it's just a
>>>>> nice theory.
>>>>
>>>> I might need to clarify this a bit.
>>>>
>>>> I have no problem in general to add an 'alias' to every disk, and use
>>>> that when stuff is logged. Just the same way the netifs have an alias.
>>>> Sure, it might be useful for some use cases. And if that helps to
>>>> solve any real problem, we should just do it.
>>>>
>>>> I just want to make clear, that I don't think that it is anywhere near
>>>> to a solution for the problems which are described here. And that
>>>> nobody should see this as an excuse not to get their stuff together
>>>> and work on the problem, which is that we don't have machine-readable
>>>> error and debug from the kernel and a smart syslog.
>>>>
>>>> If we had that, I'm very sure nobody would even ask for a 'pretty
>>>> name' in the kernel, and I think that is a good indication that we are
>>>> not on the right track here.
>>>
>>> And I totally agree here, which is why I don't want to accept this
>>> change to the driver core to add this, as it's not the correct solution.
>>
>> OK, fine ... we'll do it as gendisk only then. I suppose that is the
>> 95% use case anyway.
>
> Sounds fine. It's probably easier to have it domain-specific anyway.
> Just like the netif alias already is.
>

Agreed. Our purpose is to print the disk's alias name in kernel messages.

> I would suggest not to call it 'preferred' though, but something
> similar to 'alias'. Having /dev/disk/by-preferred/ doesn't sound too
> convincing to me.
>

OK, I will change 'preferred name' to 'alias name'.

> A 'change' uevent when the alias is set sounds fine. I don't think a
> set-once policy is needed.
>

I intend to implement this as follows:
- Change the name from preferred to alias
- Add alias_name in gendisk
- Notify 'change' uevent when the alias is set
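
The intended behaviour of these three points can be modelled in userspace Python as below. This is a toy sketch, not the patch itself: only `alias_name` and the 'change' uevent come from the plan above, while the `Disk` class, `set_alias`, `log_name` and the `notify` callback are hypothetical names, and the real change would be C code in the block layer.

```python
class Disk:
    """Toy stand-in for a gendisk that carries an optional alias."""

    def __init__(self, disk_name, notify):
        self.disk_name = disk_name  # kernel name, e.g. "sda"
        self.alias_name = None      # unset until userspace assigns one
        self._notify = notify       # stands in for sending a uevent

    def set_alias(self, alias):
        # No set-once policy: the alias may be changed or cleared later.
        self.alias_name = alias or None
        self._notify("change", self.disk_name, self.alias_name)

    def log_name(self):
        """Name to print in messages: the alias if set, else the kernel name."""
        return self.alias_name or self.disk_name
```

Each call to `set_alias` produces one 'change' notification, so udev could react every time the alias is updated.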

Thanks,

--
Nao NISHIJIMA
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., YOKOHAMA Research Laboratory
Email: [email protected]